Windows thread scheduling

Windows has never claimed to be a real-time operating system, but in reality it's not even a timely operating system.

This is a quick post as a follow-up to the previous post on Windows timers. As we saw in that post, the period of the Windows kernel interval interrupt is quite variable, and these interrupts are the engine that steps Windows forward. Within Windows, the kernel interval period affects more than just jitter on the various timer APIs; it changes the rate of thread scheduling, the Sleep API, WM_TIMER messages and waitable timers.

For example, if you execute Sleep(1) to suspend execution for 1ms, you might not wake up for 15.625ms.

#include <windows.h>
#include <stdio.h>

int main()
{
  //QueryPerformanceFrequency reports the counter rate; 10MHz on this machine
  LARGE_INTEGER freq;
  QueryPerformanceFrequency(&freq);

  while (1)
  {
    LARGE_INTEGER start;
    QueryPerformanceCounter(&start);

    Sleep(1); //ask to be suspended for 1ms

    LARGE_INTEGER end;
    QueryPerformanceCounter(&end);

    //print the elapsed time in milliseconds
    printf("%f\n", (double)(end.QuadPart - start.QuadPart) * 1000.0 / (double)freq.QuadPart);
  }
}

Running this loop without setting the timer interval resolution, you'll get results like those shown below. The actual interval bounces around, and the closest we got to the 1ms we asked for is about 3ms. Remember the kernel interval is a system-wide setting, set to the smallest interval requested by any running process.

15.3901
15.0397
15.2044
3.0466
15.5567
15.5170
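
If you want to see what the interval currently is on your machine, ntdll exports NtQueryTimerResolution. It's undocumented, so the prototype below is declared by hand and is an assumption on my part; the values it returns are in 100ns units.

#include <windows.h>
#include <stdio.h>

//assumed prototype for the undocumented ntdll export, resolved at runtime
typedef LONG (NTAPI *NtQueryTimerResolution_t)(PULONG MinimumResolution,
                                               PULONG MaximumResolution,
                                               PULONG CurrentResolution);

int main()
{
  NtQueryTimerResolution_t NtQueryTimerResolution =
    (NtQueryTimerResolution_t)GetProcAddress(GetModuleHandleA("ntdll.dll"),
                                             "NtQueryTimerResolution");
  if (!NtQueryTimerResolution)
    return 1;

  //note the naming is inverted: "minimum" is the coarsest period, "maximum" the finest
  ULONG coarsest = 0, finest = 0, current = 0;
  NtQueryTimerResolution(&coarsest, &finest, &current);
  printf("coarsest %.4fms finest %.4fms current %.4fms\n",
         coarsest / 10000.0, finest / 10000.0, current / 10000.0);
  return 0;
}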

Running the exact same code after calling NtSetTimerResolution to force the interval period to the allowed minimum, which on this machine is 0.5ms, it actually does sleep for the 1ms we asked for, and it does so quite accurately, but only because the sleep period is a multiple of the 500uS timer interval.

1.0230
1.0214
1.0210
1.0213
1.0210
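
For reference, here's a minimal sketch of calling NtSetTimerResolution. It's another undocumented ntdll export, so again the prototype and the 100ns units are assumptions on my part; 5000 requests 0.5ms and the kernel clamps to whatever it actually allows.

#include <windows.h>
#include <stdio.h>

//assumed prototype for the undocumented ntdll export; units are 100ns
typedef LONG (NTAPI *NtSetTimerResolution_t)(ULONG DesiredResolution,
                                             BOOLEAN SetResolution,
                                             PULONG CurrentResolution);

int main()
{
  NtSetTimerResolution_t NtSetTimerResolution =
    (NtSetTimerResolution_t)GetProcAddress(GetModuleHandleA("ntdll.dll"),
                                           "NtSetTimerResolution");
  if (!NtSetTimerResolution)
    return 1;

  ULONG actual = 0;
  NtSetTimerResolution(5000, TRUE, &actual); //request 0.5ms
  printf("interval is now %.4fms\n", actual / 10000.0);

  //...run the Sleep(1) loop from above here...
  return 0;
}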

Remember, all of these tests are on an unloaded system, so there are no scheduling conflicts, and increasing thread priority makes no difference.

Sleep(0) is not affected by the timer period because it simply yields the remainder of the current time slice; it has nothing to do with elapsed time, and it is the way to yield on a Windows machine. The calling thread never leaves the runnable state, and the Sleep(0) call typically returns in 2-3 microseconds on a system that has available processor time. If there isn't a processor available, or there are other threads with higher priority, you might have to wait for the full interval period or longer. If you want to play nice, then calling Sleep(0) in a tight loop is better than just spinning and burning CPU cycles, but remember that if your thread ends up waiting 15ms by chance it can have a drastic effect on game frame timing.
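
If you want to check the cost of Sleep(0) on your own machine, here's a minimal sketch using the same QueryPerformanceCounter approach as above; the numbers will obviously depend on load.

#include <windows.h>
#include <stdio.h>

int main()
{
  LARGE_INTEGER freq, start, end;
  QueryPerformanceFrequency(&freq);

  QueryPerformanceCounter(&start);
  Sleep(0); //yield the remainder of the time slice
  QueryPerformanceCounter(&end);

  printf("Sleep(0) took %.3fus\n",
         (double)(end.QuadPart - start.QuadPart) * 1e6 / (double)freq.QuadPart);
}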


Let's see what OSX and Linux do with a similar test.

OSX

Using OSX Sonoma 14.4.1 (23E224) on an M1 Max, with the following code:

#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
//time_ns is a 64bit utility wrapper for clock_gettime(CLOCK_MONOTONIC)
static uint64_t time_ns(void)
{
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

int main()
{
  while(1)
  {
    uint64_t start = time_ns();
    usleep(1000); //ask to be suspended for 1ms
    uint64_t end = time_ns();
    printf("%f\n", (double)(end - start)/1e6);
  }
}

Sleeping for 1ms yields the following when the thread is locked to a P-Core:

1.259750
1.259791
1.254250
1.259042
1.264125
1.269500

OSX is about 25% over the requested delay, but it is very consistent. It's interesting that Windows is actually more accurate than OSX when sleeping for 1ms with the interval timer set to 0.5ms, but this is only true if the sleep period is a multiple of the interval period.

Sleeping for 300uS on a P-Core yields the following:

0.384875ms
0.380625ms
0.379500ms
0.385500ms
0.386208ms
0.381750ms

Again very consistent, but approximately 25% high. On an E-Core it is not as accurate or consistent, but I haven't looked into it any closer.

Linux

Using Ubuntu 22.04 on kernel 6.2.0-39 with a 16-core Ryzen 7; the code is identical to the OSX test.

int main()
{
  while(1)
  {
    //time_ns is a 64bit wrapper for clock_gettime(CLOCK_MONOTONIC)
    uint64_t start = time_ns();
    usleep(1000);
    uint64_t end = time_ns();
    printf("%f\n",((double)end-start)/1e6);
  }
}

Sleeping for 1ms yields the following:

1.081928
1.078950
1.082176
1.080142
1.080684

and sleeping for 300uS:

0.353397
0.353522
0.353417
0.353463
0.353513

Although technically Windows was more accurate for the 1ms delay when the interval was set to 500uS, Linux is still the clear winner: it's both very accurate and very consistent in all cases tested. Even if the system starts to bog down it's still very consistent, although the accuracy drops. Windows is all over the place when you bog the system down. OSX was pretty decent too, but the E-Cores need some investigation.

The sub-millisecond tests aren't even possible on Windows. In general, if you need microsecond delays on Windows, spin the CPU in a loop using QueryPerformanceCounter, and maybe call Sleep(0) to yield if you want to play nice, as sketched below. If you need really accurate delays and can't handle the odd spike, then don't yield the CPU at all, as there is always a chance you won't be back for 15+ms, even on multi-core systems. Threaded job managers should also think twice about unnecessarily yielding the CPU.
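
As a sketch of that approach, here's what a microsecond-level delay might look like: spin on QueryPerformanceCounter and optionally call Sleep(0) each time around the loop. The function name and structure here are just illustrative, not any particular engine's implementation.

#include <windows.h>
#include <stdint.h>

//spin until the requested number of microseconds has elapsed; optionally
//call Sleep(0) each iteration to give up the rest of the time slice, at the
//risk of not being rescheduled promptly on a busy system
static void delay_us(uint64_t us, bool yield)
{
  LARGE_INTEGER freq, start, now;
  QueryPerformanceFrequency(&freq);
  QueryPerformanceCounter(&start);
  const uint64_t ticks = (us * (uint64_t)freq.QuadPart) / 1000000ull;

  for (;;)
  {
    QueryPerformanceCounter(&now);
    if ((uint64_t)(now.QuadPart - start.QuadPart) >= ticks)
      break;
    if (yield)
      Sleep(0);
  }
}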

I don't know why Windows continues to have a kernel timer that continually changes period. I also don't know why the entire operating system uses the same timer. It probably made sense years ago, but on modern multi-processor machines with multiple hardware timers it makes no sense. It's just another way the Windows kernel is showing its age. This also explains why Windows doesn't have a microsecond sleep API: it would be pointless, because it can't even reliably sleep for milliseconds.
