An inquiry into using multiple cores in WWIIOL: "Does the game make use of multiple cores, or can I put the game on one core and FRAPs on another?"

Short Answer: If you're planning to record gameplay with Fraps (and possibly other recording tools, too) - you will likely benefit from setting the affinity of the recording application to a specific CPU core other than Core 0.

If you are running a CPU with Hyper-Threading, such as the i7, some of the CPUs you can see under Windows are "virtual". The i7, for example, is a quad-core CPU: it has 4 CPU cores, but Windows will show 8 CPUs, because each core is capable of running 2 threads at once. CPU 0 and CPU 1 are actually one CPU core. And if your CPU has Intel's Turbo Boost technology, the core that appears as CPU 0 and CPU 1 in Windows is able to run faster, so that's a great place to have the game run.
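For the curious, "set affinity" boils down to a bitmask with one bit per CPU that Windows shows. Here is a minimal sketch of how such a mask could be worked out, assuming logical CPUs are numbered so that each pair (0 and 1, 2 and 3, ...) shares one physical core, as described above. Both function names are made up for illustration; on Windows the resulting value is the kind of mask that SetProcessAffinityMask takes, but the sketch itself does not call any OS API.

```cpp
#include <cstdint>

// Bitmask covering every logical CPU that belongs to one physical core.
// Assumption: logical CPUs are numbered core by core (0,1 = core 0; 2,3 = core 1; ...).
std::uint64_t mask_for_physical_core(unsigned core, unsigned threads_per_core = 2) {
    std::uint64_t mask = 0;
    for (unsigned t = 0; t < threads_per_core; ++t) {
        const unsigned logical = core * threads_per_core + t;
        if (logical < 64) {  // a 64-bit mask can only describe 64 logical CPUs
            mask |= (std::uint64_t{1} << logical);
        }
    }
    return mask;
}

// Bitmask of every logical CPU except those on the given physical core,
// e.g. for a recording tool that should stay off the core the game prefers.
std::uint64_t mask_excluding_physical_core(unsigned core, unsigned logical_cpus,
                                           unsigned threads_per_core = 2) {
    const std::uint64_t all = (logical_cpus >= 64)
                                  ? ~std::uint64_t{0}
                                  : ((std::uint64_t{1} << logical_cpus) - 1);
    return all & ~mask_for_physical_core(core, threads_per_core);
}
```

On the 8-CPU i7 example, mask_for_physical_core(0) covers CPU 0 and CPU 1, and mask_excluding_physical_core(0, 8) covers CPUs 2 through 7. If your machine numbers its logical CPUs differently, the pairing assumption does not hold.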

Long Answer:

Some of you may recall a huge performance gain from a while back when we flipped a switch in our compiler and enabled SSE instructions.

Sadly, multiple CPU cores are a bit of a soft con by the CPU vendors. Having multiple CPU cores is not like having multiple SLI graphics cards, which automatically distribute the workload between themselves. It is more like plonking down additional PCs next to each other and somehow expecting that to make your game run faster :(

The CPU is not just some part of your computer - it is the brain of the machine. But a CPU "core" is, to all intents and purposes, a separate CPU. It just happens to be on the same physical chip as your primary!

In a nutshell, the fundamental exercise of sharing a bit of work with another CPU core is actually akin to launching a separate program. It's expensive. Until it's launched, your work is on hold because it has to be done from your CPU core.

Imagine sitting down at a table with 7 other people and being asked to add 1000 numbers together. That'll take a little while. But you are told you can use the 7 other folks to calculate the result.

In order to do that, you'd have to divide up the work. First of all, work out how many numbers each of you is going to add (1000/8 = 125). Then you have to assign everyone their 125 numbers. Everyone adds their numbers at a slightly different speed so you have to wait for everyone to finish. Collect all 8 results and add them together to get your final number and turn it in.
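In code, the table exercise might look roughly like the sketch below. It uses standard C++ threads; the function name parallel_sum is made up for illustration and is not taken from the game's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Add up `numbers` by handing each of `workers` people a roughly equal share.
// Hypothetical helper for illustration only.
long long parallel_sum(const std::vector<int>& numbers, std::size_t workers) {
    if (workers == 0) workers = 1;

    // Step 1: work out how many numbers each person gets (rounded up).
    const std::size_t chunk = (numbers.size() + workers - 1) / workers;

    std::vector<long long> partial(workers, 0);  // one answer slot per person
    std::vector<std::thread> pool;

    // Step 2: assign everyone their share and let them start adding.
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(w * chunk, numbers.size());
        const std::size_t end = std::min(begin + chunk, numbers.size());
        pool.emplace_back([&numbers, &partial, w, begin, end] {
            partial[w] = std::accumulate(
                numbers.begin() + static_cast<std::ptrdiff_t>(begin),
                numbers.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }

    // Step 3: wait for everyone, including the slowest person, to finish.
    for (auto& t : pool) t.join();

    // Step 4: collect the partial results and add them together.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling parallel_sum on the numbers 1 to 1000 with 8 workers gives 500500, the same answer as adding them yourself. Notice how much of the function is bookkeeping rather than adding.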

For 1000 numbers, that would probably be a big help. But if you only had 10 numbers, it's probably faster just to add them yourself, right? So if you were going to do that again, your first step might be "find out how many people I have to help me, and figure out if it is worth sharing the work". I mean, if you only have to add 2 numbers, you'll spend more time working out if you should share the work than you will adding "1 + 1".
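That "is it worth sharing?" check can be sketched as a tiny function. The cut-off of 100 items per worker is a made-up number purely for illustration (a real value would have to be measured), and workers_worth_using is a hypothetical name.

```cpp
#include <cstddef>

// Made-up cut-off: with fewer items than this per worker, we assume the cost
// of sharing outweighs the gain. A real value would need measuring.
constexpr std::size_t kMinItemsPerWorker = 100;

// How many workers are worth using for `item_count` items when
// `available_cores` helpers are on hand. Returns 1 for "just do it yourself".
std::size_t workers_worth_using(std::size_t item_count, std::size_t available_cores) {
    if (available_cores == 0) available_cores = 1;  // unknown core count: assume one
    const std::size_t useful = item_count / kMinItemsPerWorker;
    if (useful <= 1) return 1;                      // too little work to share
    return useful < available_cores ? useful : available_cores;
}
```

With 8 helpers available, 1000 numbers would use all 8, 300 numbers would use 3, and 10 numbers (or "1 + 1") would stay with you alone. In a real program the second argument could come from std::thread::hardware_concurrency(), which is allowed to return 0 when the count is unknown.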

In a computer, it is less simple. Everything a program does has to be expressed as CPU instructions. But distributing work across cores is a sort of existential problem: There is no CPU instruction available to a program to do it. From the perspective of a program, multi-tasking is handled by the Operating System. But the operating system is just another program, another collection of CPU instructions.

Go back to our adding example above. Instead of being all at one table, you are all in separate rooms. To find out how many others are available to help you, or to send or receive information to one of them, you must speak to the manager (operating system).

Speaking to the manager is itself a complicated task! You have to write down your question (or instruction) on a piece of paper and leave your cubicle. The manager will then step in, read the paper and write out his answer on another piece of paper before leaving and allowing you back in. Remember, the manager is a busy guy so there's no guarantee he will respond immediately, either :(

This all adds up very quickly to a lot of overhead (and there are several overheads I haven't even attempted to describe).
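If you want to see that overhead for yourself, here is a rough sketch that does the same trivial sum twice: once in place, and once by asking the operating system for a fresh thread and waiting for it to report back. All three names are hypothetical, and the timings will vary from machine to machine and run to run, so treat them as a feel for the scale rather than a benchmark.

```cpp
#include <chrono>
#include <thread>

// Add two numbers right here, on the calling thread.
int add_here(int a, int b) { return a + b; }

// Add two numbers on a freshly launched thread and wait for the answer.
int add_elsewhere(int a, int b) {
    int result = 0;
    std::thread helper([&result, a, b] { result = a + b; });
    helper.join();  // our own work is on hold until the helper reports back
    return result;
}

// Time a single call of fn(a, b) in nanoseconds; the answer is stored in `out`.
template <typename Fn>
long long time_ns(Fn fn, int a, int b, int& out) {
    const auto start = std::chrono::steady_clock::now();
    out = fn(a, b);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Comparing time_ns(add_here, 1, 1, out) against time_ns(add_elsewhere, 1, 1, out) typically shows the threaded version taking far longer, because nearly all of its time goes on the trip through the manager rather than on the addition.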

Our game is written in C++ and C. There are tricks we can use that will make portions of code execute across multiple cores. But first we have to identify portions of work that are both (a) independent enough for concurrent execution and (b) expensive enough to warrant the overhead. Or else we have to rewrite them entirely.

Some portions of our game do try to take advantage of any additional CPU cores that are available. Unfortunately, Windows still prefers to try and do everything on one CPU core. It will even "park" some of your CPU cores to save power.

Intel themselves document a day of work spent to get a 1.29x speed-up on a quad-core processor for a single routine :( And that is by Intel's experts at taking advantage of multiple cores...

Battleground Europe does benefit slightly from additional CPU cores, especially if you are using NetCode2. But we have not yet begun the long and scary task of rewriting code to best take advantage of multiple CPU cores.

However: If you use the "set affinity" feature to lock the game to one CPU core, it may reduce game performance by as much as 10-25% under heavy load conditions.
0 #7 Dkamerad 2010-06-29 16:29
Course the rendering is dependent on the physics, but is any actual physics work done in the rendering section? Surely one thread could do the rendering while another does the next physics loop? Ala producer-consumer with a buffer size of one.
0 #6 kfsone 2010-06-28 17:46
@madrebel The render loop is dependent on the physics to tell it what to draw and from where (think about a paratrooper on the chute), so it's not asynchronous.
0 #5 Dkamerad 2010-06-28 17:08
That's where I think there'd be the most improvement, madrebel, though I think that falls into the category of "the long and scary task". That is, the things that may most benefit from parallelization are the most tightly coupled.
0 #4 x15 2010-06-28 14:36
Vibora, do you have an AMD? If so it sounds like you need the AMD Dual-Core Optimizer.
0 #3 vibora 2010-06-27 01:48
If I don't set affinity for one core the game time keeps changing!!!
0 #2 chkicker 2010-06-26 12:24
Wow! That took awhile to absorb but answers a tonne of questions that I've always wondered about. Cheers!
0 #1 madrebel 2010-06-26 05:20
what about things that always run asynchronously to the graphics loop like physics? would that be a good candidate to dedicate to a second core for things like better HE, ballistics, flight model etc?