On Tue 06-05-14 16:00:37, Will Deacon wrote:
> On Tue, May 06, 2014 at 03:00:32PM +0100, Jan Kara wrote:
> > On Tue 06-05-14 14:12:34, Will Deacon wrote:
> > > On Tue, May 06, 2014 at 01:29:58PM +0100, Jan Kara wrote:
> > > >   Well, with serial console the backlog can get actually pretty big.
> > > > During boot on large machines I've seen CPUs stuck in that very loop in
> > > > console_unlock() for tens of seconds. Obviously that causes problems -
> > > > e.g. watchdog fires, RCU lockup detector fires, when interrupts are
> > > > disabled, some hardware gives up because its interrupts weren't served
> > > > for too long. All in all the machine just dies.
> > >
> > > Right, so there's the usual compromise here between throughput and
> > > latency.
> >   I'd see that compromise if enabling & disabling interrupts would be
> > taking considerable amount of time. I don't think that was your concern,
> > was it? Maybe I just misunderstood you...
>
> Well, that isn't the quickest operation on ARM (since it's
> self-synchronising), but I was actually referring to the ability to drain
> the log buffer (with interrupts disabled) vs the ability to service
> interrupts quickly. The moment we re-enable interrupts, we can start adding
> more messages to the buffer from the IRQ path (I didn't attempt to solve the
> multi-CPU case, as I mentioned before).
  I see. But practically the multi-CPU case is much more common than the
IRQ case, isn't it?

> > > That said, printing one message each time seems to go too far in the
> > > opposite direction for my liking, so the best bet is likely to limit the
> > > work to some fixed number of messages. Do you have any feeling for such a
> > > limit?
> >   If you really are concerned about enabling and disabling of interrupts
> > taking significant time (and it may be, I just don't know), then printing
> > couple of messages without enabling them makes sense. How many is a tricky
> > question since it depends on the console speed. I had a similar problem
> > when I was deciding in my patch when we should ask another CPU to take over
> > printing from the current CPU (to avoid the issues I've described in the
> > previous email). I was experimenting with various stuff but in the end I
> > resorted to a stupid "after X characters are printed".
>
> Yeah, so you also end up with the same problem of tuning your heuristics.
> Peter's suggestion of X == 42 is as good as any arbitrary constant I can
> suggest, hence my snapshotting of log_next_seq originally.
  Yes I can fully understand where you came from :). I just wanted to point
out that your choice isn't a particularly good one either,

> > > > And the backlog builds up because while one cpu is doing the printing in
> > > > console_unlock() all the other cpus are busily adding new messages to
> > > > the buffer faster than they can be printed...
> > >
> > > Understood, but that's also the situation without this patch (and not one
> > > that I think you can fix without hurting latency).
> >   Sure. I have a patch which transitions printing to another CPU once in a
> > while so single CPU isn't hogged for too long and that solves the issues I
> > have observed. But Alan didn't like this solution so the issue is unfixed
> > for now.
>
> Interesting. Do you have a pointer to the thread?
  The patchset posting starts here:
https://lkml.org/lkml/2014/3/25/343
  Patch 5/8 is probably the most interesting for you (patches 1-4 are
already in the mm tree).

								Honza
-- 
Jan Kara <j...@suse.cz>
SUSE Labs, CR
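
For readers following the thread, the "after X characters are printed" heuristic mentioned above can be illustrated with a small userspace sketch. This is not the code from either patch: `struct log_buf`, `log_add()` and `drain_batch()` are hypothetical names invented for illustration, and a real implementation would hold the appropriate locks and disable interrupts around each batch. The sketch only shows the shape of the idea: print until a character budget is used up, then return so the caller can re-enable interrupts (or hand printing to another CPU) before starting the next batch.

```c
/* Userspace sketch only: none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS 64

struct log_buf {
	const char *msg[LOG_SLOTS];	/* pending messages, oldest first */
	size_t head;			/* index of the next message to print */
	size_t tail;			/* one past the newest message */
};

/* Queue a message; returns 0 on success, -1 if the buffer is full. */
static int log_add(struct log_buf *lb, const char *m)
{
	if (lb->tail - lb->head == LOG_SLOTS)
		return -1;
	lb->msg[lb->tail % LOG_SLOTS] = m;
	lb->tail++;
	return 0;
}

/*
 * Print queued messages until at least `budget` characters have been
 * written or the buffer is empty. At least one message is printed per
 * call when any is pending, so repeated calls always make progress.
 * Returns the number of messages printed; the character count is
 * stored in *chars_out when it is non-NULL.
 */
static size_t drain_batch(struct log_buf *lb, size_t budget, size_t *chars_out)
{
	size_t printed = 0, chars = 0;

	assert(budget > 0);
	while (lb->head != lb->tail && chars < budget) {
		const char *m = lb->msg[lb->head % LOG_SLOTS];

		fputs(m, stdout);	/* stand-in for the slow console write */
		chars += strlen(m);
		lb->head++;
		printed++;
	}
	if (chars_out)
		*chars_out = chars;
	return printed;
}
```

A caller would loop on `drain_batch()` until it returns 0, doing whatever latency-sensitive work is needed between batches. As the thread points out, the budget is an arbitrary tuning constant whose right value depends on the console speed, so the sketch does not suggest one.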