Repro code is at http://pastebin.com/B68N4AFY if anyone's interested.

On 11/01/16 13:58, Jason Harmening wrote:
> Hi everyone,
>
> I recently upgraded my main amd64 server from 10.3-stable (r302011) to
> 11.0-stable (r308099).  It went smoothly except for one big issue:
> certain applications (but not the system as a whole) respond very
> sluggishly, and video playback of any kind is extremely choppy.
>
> The system is under very light load, and I see no evidence of abnormal
> interrupt latency or interrupt load.  More interestingly, if I place the
> system under full load (~0.0% idle) the problem *disappears* and
> playback/responsiveness are smooth and quick.
>
> Running ktrace on some of the affected apps points me at the problem:
> huge variance in the amount of time spent in the nanosleep system call.
> A sleep of, say, 5ms might take anywhere from 5ms to ~500ms from entry
> to return of the syscall.  OTOH, anything CPU-bound or that waits on
> condvars or I/O interrupts seems to work fine, so this doesn't seem to
> be an issue with overall system latency.
>
> I can repro this with a simple program that just does a 3ms usleep in a
> tight loop (i.e. roughly the amount of time a video player would sleep
> between frames @ 30fps).  At light load ktrace will show the huge
> nanosleep variance; under heavy load every nanosleep will complete in
> almost exactly 3ms.
>
> FWIW, I don't see this on -current, although right now all my -current
> images are VMs on different HW so that might not mean anything.  I'm not
> aware of any recent timer- or scheduler-specific changes, so I'm
> wondering if perhaps the recent IPI or taskqueue changes might be
> somehow to blame.
>
> I'm not especially familiar w/ the relevant parts of the kernel, so any
> guidance on where I should focus my debugging efforts would be much
> appreciated.
>
> Thanks,
> Jason