On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> On 14.06.2022 10:32, Roger Pau Monné wrote:
> > On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>> Prevent dropping console output from the hardware domain, since 
> >>>>>>>>>>> it's
> >>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>> having to resort to sync_console (which also affects the output 
> >>>>>>>>>>> from
> >>>>>>>>>>> other guests).
> >>>>>>>>>>>
> >>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>
> >>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) 
> >>>>>>>>>> more
> >>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>> aiming at boot output only, wouldn't you want to stop the 
> >>>>>>>>>> overriding
> >>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>
> >>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>> this request is something that come up internally.
> >>>>>>>>>
> >>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>
> >>>>>>>> Which would raise the question of why we have log levels for 
> >>>>>>>> non-guest
> >>>>>>>> messages.
> >>>>>>>
> >>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>> example.
> >>>>>>
> >>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>> we have (console and serial). You're right that your change is 
> >>>>>> unrelated
> >>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>> that's secondary, especially because the "forced" output would still
> >>>>>> be waiting for earlier output to make it out).
> >>>>>
> >>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>> go out first.  In any case we can guarantee that no more output will
> >>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>
> >>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>> the hypercall.  That however could cause issues if guests manage to
> >>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>
> >>>>> Alternatively we could always reserve half of the buffer for the
> >>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>> allocation from the hardware domain).
> >>>>
> >>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>> wouldn't be better off leaving the current logic as is.
> >>>
> >>> Another possible solution (more like a band aid) is to increase the
> >>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> >>> fine with the high throughput of boot messages.
> >>
> >> You mean the buffer whose size is controlled by serial_tx_buffer?
> > 
> > Yes.
> > 
> >> On
> >> large systems one may want to simply make use of the command line
> >> option then; I don't think the built-in default needs changing. Or
> >> if so, then perhaps not statically at build time, but taking into
> >> account system properties (like CPU count).
> > 
> > So how about we use:
> > 
> > min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> 
> That would _reduce_ size on small systems, wouldn't it? Originally
> you were after increasing the default size. But if you had meant
> max(), then I'd fear on very large systems this may grow a little
> too large.

See previous followup about my mistake of using min() instead of
max().

On a system with 512 CPUs that would be 512KB, I don't think that's a
lot of memory, specially taking into account that a system with 512
CPUs should have a matching amount of memory I would expect.

It's true however that I very much doubt we would fill a 512K buffer,
so limiting to 64K might be a sensible starting point?

> > Maybe we should also take CPU frequency into account, but that seems
> > too complex for the purpose.
> 
> Why would frequency matter? Other aspects I could see mattering is
> node count and maybe memory size.

Higher frequency likely means faster boot, and faster buffer fill,
because the baudrate of the console is constant.

Thanks, Roger.

Reply via email to