Re: [HelenOS-devel] async not fast enough for crazy mouse (was: Framebuffer problems)

Ján Veselý Sat, 20 Apr 2013 07:21:46 -0700

On Sat, Apr 20, 2013 at 1:40 PM, Martin Sucha <[email protected]> wrote:
>> using separate fibrils for each interrupt allows you to handle them in
>> different paths and asynchronously. In this case it might not be
>> necessary (you still have different paths for ps2a and ps2b), but some
>> device drivers might make use of it. Interrupts that arrived later can
>> be handled sooner/faster. Examples that come to mind:
>> Full duplex audio that handles recording and playback separately
>> (different fragment sizes, formats,...).
>> graphics cards that need to handle vblank interrupts ASAP without
>> waiting for output connect/disconnect handlers.
>>
>> ideally there would be more ways to handle interrupts, one fibril,
>> #cpu fibrils, dynamic fibrils, so that each driver can select the most
>> suitable handler
>>
>> in reality I don't think this is a real problem now (after the stack
>> size was reduced to 1 page).
>
> I know that using separate fibrils may be beneficial for some drivers,
> but I still think the handling of notifications is currently broken. The
> problem is not in the size of the stack (although allocating whole page
> when you just need to process a few bytes seems like a waste of
> resources if you ask me), but the real problem is that there is *no
> upper bound* on the number of fibrils spawned. In most cases you just
> need a few fibrils anyway, because you have a limited number of things
> you do in a driver (e.g. for audio you might have as many fibrils as
> there are channels you are handling, but this number is low and doesn't
> change much over time).


I don't see a reason why there should be an upper bound on the number
of fibrils used. The only problem I see is that failure to create a new
fibril is not handled gracefully. In general I don't see a high number of
fibrils as a problem: fibrils are supposed to be lightweight and
created/destroyed as needed. That enables alternatives to using
producer/consumer FIFOs every time. If we drop the lightweight 'easy
creation/destruction' property of fibrils, then IMO they offer no
advantage over threads.
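
To make "handled gracefully" concrete, here is a minimal sketch of a
dispatcher that falls back to running the handler in the caller's context
when a fibril cannot be created, instead of aborting the task. All names
(dispatcher_t, spawn_fn, dispatch, the two spawn stand-ins) are hypothetical
and are not the HelenOS async API. The inline fallback is only one possible
policy; a driver could equally drop or queue the notification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; not part of the HelenOS async framework. */
typedef void (*handler_fn)(void *arg);

/* Assumed spawn hook: returns true if a new fibril was created for fn(arg). */
typedef bool (*spawn_fn)(handler_fn fn, void *arg);

typedef struct {
    spawn_fn spawn;      /* how to create a fibril (may be NULL) */
    size_t spawned;      /* notifications handed to a new fibril */
    size_t inline_runs;  /* notifications handled in the caller's context */
} dispatcher_t;

/* Try to spawn; on failure, degrade to inline handling instead of aborting. */
static void dispatch(dispatcher_t *d, handler_fn fn, void *arg)
{
    if (d->spawn != NULL && d->spawn(fn, arg)) {
        d->spawned++;
        return;
    }
    fn(arg);
    d->inline_runs++;
}

/* Stand-ins for a real fibril-creation call, for illustration only. */
static bool spawn_always_fails(handler_fn fn, void *arg)
{
    (void) fn;
    (void) arg;
    return false;  /* simulates running out of memory for a new fibril */
}

static bool spawn_runs_immediately(handler_fn fn, void *arg)
{
    fn(arg);       /* simulates a fibril that runs the handler right away */
    return true;
}

/* Example handler: counts how many times it was invoked. */
static void count_call(void *arg)
{
    (*(int *) arg)++;
}
```

With spawn_always_fails installed, dispatch() still delivers every
notification to count_call; only the inline_runs counter shows that the
system is degraded.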

> The async framework doesn't know how the notifications are handled and
> shouldn't spawn new fibril for each notification, but (as you also
> pointed out) the driver should choose the most suitable strategy. I
> don't think that we should have different handlers in the async
> framework, as the driver may easily implement whatever policy is best in
> its loop handling notifications (i.e. it may route the notifications to
> different fibrils, enqueue them in a buffer, etc.).

As long as there are more than a few users of per-notification fibrils, it
should be part of some kind of framework. Whether that would be the async
framework or a driver framework built on top of it is open for debate.

>
>> Interrupts that arrived later can be handled sooner/faster.
> Well, just spawning a new fibril does not help by itself. You either
> need to have multiple kernel threads or wait for some condition (so
> there is a fibril switch) to actually handle the interrupts in different
> order. But once you start waiting for some condition in an interrupt
> handler (e.g. a buffer you are writing to becomes full), it is very
> likely that the other fibrils handling the same interrupt type will
> become blocked as well.

Yes, the point is that with multiple fibrils only those that use the same
resource get blocked; the others can continue their work. Adding more
threads to the fibril manager is easy, and we should not base design
decisions on the fact that we currently use only one. Moreover, there are
other opportunities for a fibril switch than locks. If there were none but
locks, we could remove the locks and rely on cooperative fibril scheduling
to avoid race conditions.

>
>> the buggy situation combines extremely
>> slow host (emulated noKVM), with extremely intensive workload, in
>> reality ps2 is limited to max 200 Hz (usually 120Hz), and if the mouse
>> was connected via USB it would require less ipc messages.
> Yes this is what triggers the bugs (the behaviour is not a single bug
> actually), but it does not mean it should not be fixed.
>
> The driver gets killed because it spawns new fibrils at a rate greater
> than at which they are executed. One bug is that it actually spawns so
> many fibrils, the other that the function to spawn a fibril crashes the
> task if the allocation is not possible.

I beg to differ. As I said, using many fibrils is not a bug. I agree with
the latter part: failure to create a fibril should not kill the task.

>
> Note that the spawning of new fibrils is used as a rather inefficient
> unbounded queue here (the i8042 driver runs in a single kernel thread,
> just like all the other drivers do).
> The i8042 driver actually relies on the fact that notification fibrils
> are scheduled sequentially, because the I/O port is read in the kernel
> and the read data is delivered in the notification itself. Spawn a new
> kernel thread in the driver or change the implementation of how the
> fibrils get scheduled and the data might be appended in the wrong order
> depending on which fibril gets executed first (the same bug is present
> in my ns8250 modifications).

i8042 is a good example of a driver that would make good use of a
single-fibril handler.
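
As a rough illustration of why a single handler keeps the bytes in order,
here is a sketch of a fixed-size byte FIFO: the notification path pushes
each byte as it arrives and one consumer drains them. The names
(byte_fifo_t, byte_fifo_push, ...) and the capacity are made up for this
example and are not taken from the i8042 driver. There is no locking, on
the assumption that producer and consumer run as cooperatively scheduled
fibrils in one thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and capacity; not the actual i8042 driver code. */
#define BYTE_FIFO_CAPACITY 64

typedef struct {
    uint8_t data[BYTE_FIFO_CAPACITY];
    size_t head;   /* index of the oldest stored byte */
    size_t count;  /* number of bytes currently stored */
} byte_fifo_t;

static void byte_fifo_init(byte_fifo_t *f)
{
    f->head = 0;
    f->count = 0;
}

/* Producer side: returns false when full so the caller can pick a policy. */
static bool byte_fifo_push(byte_fifo_t *f, uint8_t b)
{
    if (f->count == BYTE_FIFO_CAPACITY)
        return false;
    f->data[(f->head + f->count) % BYTE_FIFO_CAPACITY] = b;
    f->count++;
    return true;
}

/* Consumer side: removes the oldest byte, or returns false when empty. */
static bool byte_fifo_pop(byte_fifo_t *f, uint8_t *out)
{
    if (f->count == 0)
        return false;
    *out = f->data[f->head];
    f->head = (f->head + 1) % BYTE_FIFO_CAPACITY;
    f->count--;
    return true;
}

/* Single consumer: copies up to max bytes into dst in arrival order. */
static size_t byte_fifo_drain(byte_fifo_t *f, uint8_t *dst, size_t max)
{
    size_t n = 0;
    while (n < max && byte_fifo_pop(f, &dst[n]))
        n++;
    return n;
}
```

The queue is bounded, so a flood of notifications shows up as a failed
push that the driver can count or drop, rather than as unbounded memory
growth.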

>
> So, I still think that the proper solution is not to spawn a new fibril
> for each notification. And changing this behaviour is not a mere
> optimization -- it fixes a design flaw in notification handling.

I see no significant flaw. Whether you use a custom data structure or rely
on the fibril queue is a question of efficiency, and thus it *is* a
performance optimization.
Some drivers prefer/require sequential handling of interrupts. In that
case, using a single fibril and a FIFO (whether it is a kernel-side
notification FIFO or a driver-side one does not really matter) is
beneficial and guarantees sequential processing.
Other drivers (like the USB hc) do not care about the order of interrupts
and can happily use multiple fibrils to handle them.

I don't think we currently have a driver that *requires* interrupts to
be handled in separate fibrils. More complex devices like network cards
with offload engines and GPUs may be good candidates, but I don't know
enough about them (yet :)) to make a definitive statement.

Jan

>
> Martin Sucha
>
>
> _______________________________________________
> HelenOS-devel mailing list
> [email protected]
> http://lists.modry.cz/cgi-bin/listinfo/helenos-devel

