On 4/19/07, Lee Revell <[EMAIL PROTECTED]> wrote:
> IMHO audio streamers should use a SCHED_FIFO thread for time-critical work. I think it's insane to expect the scheduler to figure out that these processes need low latency when they can just be explicit about it. "Professional" audio software does it already, on Linux as well as on other OSes...
It is certainly true that SCHED_FIFO is currently necessary in the layers of an audio application lying closest to the hardware, if you don't want to throw a monstrous hardware ring buffer at the problem. See the alsa-devel archives for a patch to aplay (sched_setscheduler plus some cleanups) that converts it from "unsafe at any speed" (on a non-RT kernel) to a rock-solid 18ms round trip from PCM in to PCM out; the shape of that change is sketched below. (The hardware and driver aren't terribly exotic for an SoC, and the measurement was done with aplay -C | aplay -P, on a not-particularly-tuned CONFIG_PREEMPT kernel with a 12ms+ peak scheduling latency according to cyclictest. A similar test via /dev/dsp, through a slightly modified OSS emulation layer over the same driver, measures 40ms and is probably tuned too conservatively.)

Note that SCHED_FIFO may be less necessary on an -rt kernel, but I haven't had that option on the embedded hardware I've been working with lately. Ingo, please please pretty please pick a -stable branch one of these days and provide a git repo with -rt integrated against that branch. Then I could port our chip support to it -- all of which will be GPLed after the impending code review -- after which I might have a prayer of strong-arming our chip vendor into porting their WiFi driver onto -rt. It's really a much more interesting scheduler use case than make -j200 under X, because it's a best-effort SCHED_BATCH-ish load that wants to be temporally clustered for power management reasons.

(Believe it or not, a stable -rt branch with a clock-scaling-aware scheduler is the one thing that might lead to this major WiFi vendor's GPLing their driver core. They're starting to see the light on the biz-dev side, and the nature of the devices their chip will go into makes them somewhat less concerned about the regulatory fig-leaf aspect of a closed-source driver; but they would have to port off of the third-party real-time executive embedded within the driver, and mainline's task and timer granularity won't cut it. I can't even get more detail about _why_ it won't cut it unless there's some remotely supportable -rt base they could port to.)

But I think SCHED_FIFO on a chain of tasks is fundamentally not the right way to handle low audio latency. The object with the low-latency requirement isn't the task, it's the device. When it starts to become urgent to deliver more data to the device, the task the device is waiting on should slide up the urgency scale; if that task is in turn waiting on something else, the something else should slide up the scale; and so forth. Similarly, responding to user input is urgent; so when user input is available (by whatever mechanism), the task that's waiting for it should slide up the urgency scale, etc. (A toy sketch of this cascade also appears below.)

In practice, you probably don't want to burden desktop Linux with priority inheritance where you don't have to. Priority queues with algorithmically efficient decrease-key operations (Fibonacci heaps and their ilk) are complicated to implement and have correspondingly high constant factors. (However, a sufficiently clever heuristic for assigning quasi-static task priorities would usually short-circuit the priority cascade; if you can keep N small in the queue of tasks with unpredictable priority, you can probably get away with a simpler flavor whose decrease-key is merely O(log N) -- see the last sketch below. Ask someone who knows more about data structures than I do.)
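Concretely, the heart of the aplay change is just a few lines. This is from memory, so treat the priority value and the mlockall() call as illustrative rather than as exactly what the posted patch does:

#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Switch the calling process to SCHED_FIFO and pin its pages.
 * Requires CAP_SYS_NICE (i.e. root, on most setups today). */
static int go_realtime(void)
{
        struct sched_param sp;

        memset(&sp, 0, sizeof(sp));
        sp.sched_priority = 20; /* mid-range RT priority; tune to taste */

        if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
                perror("sched_setscheduler");
                return -1;
        }

        /* A page fault in the audio loop hurts as much as a
         * scheduling delay, so lock everything down too. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
                perror("mlockall");
                return -1;
        }
        return 0;
}

Call it once before entering the PCM read/write loop; everything after that is ordinary poll-and-copy code.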
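To make the urgency cascade concrete, here's a toy sketch. The structure and field names are invented for illustration and bear no relation to mainline's task struct; the point is only that urgency originates at the *device* and flows backward along the chain of waiters:

struct toy_task {
        int urgency;                 /* effective urgency, not a static priority */
        struct toy_task *waiting_on; /* task whose output we need, or NULL */
};

/*
 * Called when a device buffer approaches underrun (or when user
 * input arrives): the task the device is waiting on inherits the
 * device's urgency, and so, transitively, does anything that task
 * is itself waiting on.
 */
static void propagate_urgency(struct toy_task *t, int urgency)
{
        while (t && t->urgency < urgency) {
                t->urgency = urgency; /* a decrease-key, in min-heap terms */
                t = t->waiting_on;
        }
}

The loop stops as soon as it meets a task that is already urgent enough, which is what keeps the cascade cheap once a good quasi-static priority assignment has done most of the work up front.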
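And as for the "simpler flavor": a plain binary min-heap already gives O(log N) decrease-key with small constant factors, nothing Fibonacci about it. Again an invented illustration (smaller key = more urgent; overflow checks omitted):

#include <stddef.h>

#define HEAP_MAX 256

struct heap {
        int key[HEAP_MAX];      /* key[0] is always the most urgent */
        size_t n;
};

static void heap_swap(struct heap *h, size_t a, size_t b)
{
        int tmp = h->key[a];

        h->key[a] = h->key[b];
        h->key[b] = tmp;
}

/* Restore the heap property by walking from index i toward the root. */
static void sift_up(struct heap *h, size_t i)
{
        while (i > 0 && h->key[(i - 1) / 2] > h->key[i]) {
                heap_swap(h, (i - 1) / 2, i);
                i = (i - 1) / 2;
        }
}

static void heap_insert(struct heap *h, int key)
{
        size_t i = h->n++;

        h->key[i] = key;
        sift_up(h, i);
}

/* O(log N): lower the key at index i, then sift toward the root. */
static void heap_decrease_key(struct heap *h, size_t i, int new_key)
{
        if (new_key < h->key[i]) {
                h->key[i] = new_key;
                sift_up(h, i);
        }
}

In real life each task would also carry a back-pointer to its heap slot, updated inside heap_swap(), so that decrease-key can find the task in O(1); I've left that bookkeeping out for brevity.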
More importantly (and independently of the choice of data structure), non-real-time application coders aren't very smart about grouping data structure accesses on one side or the other of a system call that is likely to release a lock and let something else run, flushing the application's data out of cache. (Kernel coders aren't always smart about this either; see the LKML threads from a few weeks ago about the racy, cache-stall-prone f_pos handling in the VFS.) So switching tasks immediately on lock release is usually the wrong thing to do if letting the releasing task run a little longer would let it reach a point where it has to block anyway.

Anyway, I already described the urgency-driven strategy, to the extent that I've thought it out, elsewhere in this thread. I only held this draft back because I wanted to double-check my latency measurements.

Cheers,
- Michael