Performing Performance!
Firstly, I’d like to welcome José to the project as a new external tester. Hope you have fun!
Work this week has again been concentrated on resolving outstanding issues. Many of the issues were quite trivial – if a little hard to diagnose. However, the one that has been burning most of my time this weekend is unformed units.
These dastardly units work exceedingly well on my laptop, but when the same build is installed on my main and more powerful machine, they exhibit big performance issues and other weird artefacts like random height changes!
This issue is unlike any other that I have faced because the symptoms only occur on my main machine(!) despite the version under test being identical on both machines. I was expecting that it would be my laptop having all the problems as it is a truly ancient machine. But in a bizarre twist of fate and possibly design, the laptop has no issues whatsoever - instead it’s my uber-mega-system which is having the problems.
What could be causing my faster machine to perform worse than my very slow machine?
I had a theory…
I suspected that the faster machine had a much higher throughput within the main render loop of the game. I further theorised that if there was a task involving a lot of processing power going on, the faster machine would theoretically call it more often than the slow machine – in fact, much more often! So in this case it pays not to be so fast!
After running Ancient Armies through my debugger it became obvious that there were a few places where the code was getting bogged down.
Whenever a unit is moved in Ancient Armies the software checks every vertex point of that unit against the underlying map to get the height information for that vertex. In addition, the system effectively recreates the unit’s vertex buffer based on the newly gained height information. It is this system that allows units to ‘terrain-follow’ when being moved around and to extend their heights when they are on the edge of a slope.
Two things occurred to me here:
- Checking every vertex of a mesh is highly inefficient. Firstly the mesh has to be locked and unlocked – an expensive operation. Secondly, why bother testing so many points? There is no real need to. Unformed units are especially susceptible to this as they have a much higher polygon count than their formed unit siblings.
- Also, why bother recalculating the vertex buffer every time a unit is moved? After all, we only really need to do it when the unit experiences big changes in ground elevation, such as going up or down a hill. Again, this is an expensive operation as buffers need to be locked and unlocked, and once again our unformed unit fares badly in this respect due to its high polygon count.
So what is my solution?
It would be a multi-pronged solution. The first thing I wanted to do was create a height-finding technique that doesn’t require as many sample points and has no requirement to lock and unlock the unit’s vertex buffer. Basically, I wanted it to work with no dependency on the number of vertices within a unit. Next, I would prevent the unit vertex buffers from being recalculated unless the unit is transitioning between hill elevations. Finally, I would install a governor on the main render loop to limit its maximum throughput.
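To make the governor idea concrete, here is a minimal sketch in Python (not the actual Ancient Armies code). The class name `FrameGovernor`, the `max_fps` parameter and the injectable `clock`/`sleep` hooks are all assumptions made for illustration. The idea is simply to sleep away whatever is left of a minimum frame time, so that a fast machine cannot run the loop body more often than the cap allows.

```python
import time


class FrameGovernor:
    """Hypothetical helper that caps how often a loop body may run."""

    def __init__(self, max_fps, clock=time.perf_counter, sleep=time.sleep):
        # Shortest time allowed between the starts of two frames.
        self.min_frame_time = 1.0 / max_fps
        self._clock = clock
        self._sleep = sleep
        self._last = None  # start time of the previous frame

    def wait(self):
        """Block until at least min_frame_time has passed since the last frame."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_frame_time - (now - self._last)
            if remaining > 0:
                # The machine is running ahead of the cap, so idle.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `governor.wait()` once at the top of each pass through the render loop would hold a fast machine to roughly `max_fps` iterations per second, while a slow machine that already takes longer than the minimum frame time never sleeps at all.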
On to part one. How was I going to make vertex height checking more efficient on my units?
The cunning scheme I hatched was to rely on the fact that all my unit shapes are convex polygons. In a convex polygon, no vertex falls inside the area bounded by the other vertices. In plain English, this means that I only need to check the height at the few critical points that define the shape’s external boundary.
What is more, these points could be stored outside of the vertex buffer, which means that I would no longer have to take the performance hit of locking and unlocking the vertex buffer! Woot!
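One generic way to pick out such boundary points is a convex hull. The sketch below uses Andrew's monotone chain algorithm in Python; it is an illustration of the principle, not necessarily how Ancient Armies calculates its points. The function names and the idea of treating a unit's footprint as a list of 2D `(x, z)` tuples are assumptions.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def boundary_corners(points):
    """Return the hull corners counter-clockwise.

    Interior points and points lying along a straight edge are dropped,
    leaving only the corners that define the outline.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half_hull(seq):
        chain = []
        for p in seq:
            # Pop while the last two points and p fail to make a left turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# A 3 x 3 grid of footprint vertices collapses to its four corners.
grid = [(x, z) for x in range(3) for z in range(3)]
corners = boundary_corners(grid)
```

Because the resulting corner list lives in ordinary memory rather than in the vertex buffer, sampling heights at those corners needs no lock or unlock.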
I set about modifying my code to dynamically calculate these boundary points. The screenshot below shows the points that my new code calculated as the boundary sample points for a standard square formation:
Cool Eh?
With the new algorithm the above unit requires only four height samples, a vast improvement on the original system’s twenty-one. So even for a unit that is not unformed, the performance savings are huge – especially when one considers the number of calculations that go on under the hood to calculate each height precisely. Note that the red sample points are there for debugging purposes only - they will obviously not be around once coding has finished!
So is that it?
Nope.
I took the liberty of making my sample point algorithm intelligent. Not intelligent as in HAL 9000 intelligent, but intelligent as in being able to take a unit’s overall size into account! To show this in action, here is a screenshot of a much larger unit – the Macedonian Hypaspists:
Here one can see that my algorithm has realised it is dealing with a relatively long unit and, as a result, has calculated more sample points along the x-axis of the unit! Of course my cunning algorithm also works in depth:
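A size-aware scheme of this kind could be sketched as follows, in Python. This is a guess at the general approach rather than the real algorithm: the `max_spacing` parameter, the function name and the choice to place points around the perimeter of a rectangle centred on the origin are all assumptions.

```python
import math


def sample_points(width, depth, max_spacing):
    """Perimeter sample points for a width x depth rectangle.

    Each edge is split into the fewest equal segments that are no longer
    than max_spacing, so longer edges automatically get more points.
    """
    nx = max(1, math.ceil(width / max_spacing))  # segments along x
    nz = max(1, math.ceil(depth / max_spacing))  # segments along z
    xs = [-width / 2 + width * i / nx for i in range(nx + 1)]
    zs = [-depth / 2 + depth * j / nz for j in range(nz + 1)]

    points = []
    for x in xs:
        # Front and rear edges, corners included.
        points.append((x, -depth / 2))
        points.append((x, depth / 2))
    for z in zs[1:-1]:
        # Left and right edges, corners excluded to avoid duplicates.
        points.append((-width / 2, z))
        points.append((width / 2, z))
    return points


square = sample_points(10, 10, 10)      # 4 points: just the corners
long_unit = sample_points(40, 10, 10)   # 10 points: extra samples along x
deep_unit = sample_points(10, 30, 10)   # 8 points: extra samples along z
```

The sample count here depends only on the unit's dimensions and the chosen spacing, never on how many vertices the mesh contains.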
Next I moved on to wedge units:
And of course my algorithm takes unit size into account for wedges too, as shown below:
I’ve also started work on rhomboid-shaped units, although these are not yet coded to dynamically allocate points based on size:
I still have more work to do to completely resolve the unformed units issue. Obviously, rhomboids need to be finished off and circular units need to be addressed. In addition to this, I will need to put code in place to only update a unit’s vertex buffer when necessary, i.e. when going up or down a hill. Plus I will need to add a governor to the main render loop. All in all, quite a lot to keep me amused!
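The "only when necessary" vertex buffer update could take a shape like the Python sketch below. It is a sketch under assumptions, not finished game code: the class name `TerrainFollower`, the `rebuild_threshold` parameter and the `rebuild_count` counter (standing in for the real buffer rebuild) are all hypothetical.

```python
class TerrainFollower:
    """Hypothetical helper that decides when a vertex buffer rebuild is due."""

    def __init__(self, rebuild_threshold):
        # Smallest height change at any sample point that counts as "big".
        self.rebuild_threshold = rebuild_threshold
        self._last_heights = None  # heights used for the last rebuild
        self.rebuild_count = 0     # stand-in for the expensive rebuild

    def on_unit_moved(self, sampled_heights):
        """Return True if this move triggered a rebuild."""
        heights = list(sampled_heights)
        needs_rebuild = (
            self._last_heights is None
            or len(heights) != len(self._last_heights)
            or any(
                abs(new - old) > self.rebuild_threshold
                for new, old in zip(heights, self._last_heights)
            )
        )
        if needs_rebuild:
            # Only here would the real code lock, refill and unlock the buffer.
            self._last_heights = heights
            self.rebuild_count += 1
        return needs_rebuild
```

Heights are compared against the values used for the last rebuild rather than the last move, so a long gentle slope still triggers a rebuild once the accumulated change passes the threshold.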
What else have I been up to this week?
Well, I managed to find some time to set up my own email server and integrate it with my issues system. As a result, whenever work is carried out on an issue or a build is released, the relevant people automatically get an email informing them what’s going on!
I have also carried out further work on the testers’ forum in terms of stylistic design and providing enough documentation to get the testers going.
In terms of my fight against the issues raised by my testers, I think I’m winning – *cue cheesy grin*:
As one can see, the number of outstanding issues, as indicated by the blue line, has taken a substantial dive!
That’s it for this week.
Laters
RobP









































