On Thu, Jun 11, 2015 at 04:55:47PM +0200, Petr Mladek wrote:
> The output from "echo l >/proc/sysrq-trigger" looks reasonable.
> It does not mix output from different CPUs. This is expected
> because of the @lock.
> 
> The other observation is that it prints CPUs in _random_ order:
> 28, 24, 25, 1, 26, 2, 27, 3, ...
> 
> The order is fine when I disable the irq_work.
> 
> It means that the irq_works are usually faster than printk_nmi_flush() =>
> printk_nmi_flush() is not that useful => all the complexity with
> the three atomic variables (head, tail, read) did not bring
> much benefit.

The reason I added printk_nmi_flush() is that the irq_work might
_never_ happen if the CPU in question is stuck with interrupts disabled.

Do something like:

        local_irq_disable();
        for (;;)
                ;
        local_irq_enable(); /* like this'll ever happen */

and you'll never see the stack trace of the offender that turned your
machine into a useless piece of power-sucking junk.
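
That's the whole point of printk_nmi_flush(): the CPU that requested the
backtrace can drain everybody's buffer itself, roughly like so (a sketch
only, the actual call site may look different):

        /* on the CPU that triggered the all-CPU backtrace (e.g. sysrq-l) */
        trigger_all_cpu_backtrace();    /* every CPU dumps into its nmi_seq_buf from NMI */
        printk_nmi_flush();             /* drain the buffers of CPUs whose irq_work never runs */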

> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index c099b082cd02..99bfc1e3f32a 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1821,13 +1821,200 @@ int vprintk_default(const char *fmt, va_list args)
> > +static void __printk_nmi_flush(struct irq_work *work)
> > +{
> > +   static raw_spinlock_t lock = __RAW_SPIN_LOCK_INITIALIZER(lock);
> > +   struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
> > +   int len, head, size, i, last_i;
> > +
> > +again:
> > +   /*
> > +    * vprintk_nmi()        truncate
> > +    *
> > +    * [S] head             [S] head
> > +    *     wmb              mb
> > +    * [S] tail             [S] read
> 
> BTW, this is quite cryptic for me. Coffee did not help :-)

It states that those two sites store head first, then issue a barrier
and then store something else. This means we need to read the lot in the
reverse order and issue a matching barrier.
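
In code, the pairing amounts to roughly this (only a sketch of the
ordering rule; new_head/new_tail stand in for whatever the writer
computes, they're not names from the patch):

        /* writer: vprintk_nmi(); the truncate path is analogous, with @read */
        atomic_set(&s->head, new_head);
        smp_wmb();                      /* order the head store before the tail store */
        atomic_set(&s->tail, new_tail);

        /* reader: __printk_nmi_flush() loads in the reverse order */
        len  = atomic_read(&s->tail);
        smp_rmb();                      /* pairs with the smp_wmb() above */
        head = atomic_read(&s->head);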

> > +    *
> > +    * therefore:
> > +    */
> > +   i = atomic_read(&s->read);
> > +   len = atomic_read(&s->tail); /* up to the tail is stable */
> > +   smp_rmb();
> > +   head = atomic_read(&s->head);
> > +
> > +   /*
> > +    * We cannot truncate tail because it could overwrite a store from
> > +    * vprintk_nmi(), however vprintk_nmi() will always update tail to the
> > +    * correct value.
> > +    *
> > +    * Therefore if head < tail, we missed a truncate and should do so now.
> > +    */
> > +   if (head < len)
> > +           len = 0;
> 
> This is a bit confusing. It is a complicated way of making the next
> test return.

Yes, but it's two separate pieces of logic. The above deals with the
truncate not being able to also reset tail, while the test below is a
'simple' do-I-actually-have-anything-to-do test.

> If I get this correctly, this might happen only inside
> __printk_nmi_flush() called on another CPU (from
> printk_nmi_flush()) when it interferes with the queued
> irq_work. The irq_work is faster and truncates the buffer.
> 
> So, the return is fine after all because the irq_work printed
> everything.

Either that, or another NMI happened while we started the irq_work; that
will (re)enqueue the irq_work, and this one will in fact observe all the
data, print the whole buffer and leave an empty buffer behind.

Then the irq_work happens _again_, sees an empty buffer and figures
that's ok, nothing to do.

> > +   if (len - i <= 0) /* nothing to do */
> > +           return;
> > +   /*
> > +    * 'Consume' this chunk, avoids concurrent callers printing the same
> > +    * stuff.
> > +    */
> > +   if (atomic_cmpxchg(&s->read, i, len) != i)
> > +           goto again;
> 
> I think that this is racy:

Indeed. I'll think a bit on that.

> I think that ordering CPUs is not worth it. I would go back to the
> first solution, add the @lock there, and double check races with
> seq_buf().

You mean, provide printk_nmi_flush() but completely screw concurrency?
And basically reserve it as a last-ditch effort to get crap out?
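
IOW something like the below, I suppose (sketch only; the @buffer field
and the tail reset are my guesses, not your patch):

        static DEFINE_RAW_SPINLOCK(nmi_print_lock);

        static void __printk_nmi_flush(struct nmi_seq_buf *s)
        {
                unsigned long flags;
                int len;

                raw_spin_lock_irqsave(&nmi_print_lock, flags);
                len = atomic_xchg(&s->tail, 0);         /* grab and reset in one go */
                if (len > 0)
                        printk("%.*s", len, s->buffer); /* @buffer assumed to hold the text */
                raw_spin_unlock_irqrestore(&nmi_print_lock, flags);
        }

All the lock buys you is that two flushers don't interleave their output;
an NMI writer can still race with the length we read.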