* Ingo Molnar ([EMAIL PROTECTED]) wrote: > > * Ankita Garg <[EMAIL PROTECTED]> wrote: > > > local_irq_save(flags); > > buf = _stp_chan->buf[smp_processor_id()]; > > if (unlikely(buf->offset + length > _stp_chan->subbuf_size)) > > length = relay_switch_subbuf(buf, length); > > memcpy(buf->data + buf->offset, data, length); > > buf->offset += length; > > local_irq_restore(flags); > > oh, what a fine piece of s^H^H :-/ Who in their right mind calls this > from _tracing_ code: > > smp_mb(); > if (waitqueue_active(&buf->read_wait)) > /* > * Calling wake_up_interruptible() from here > * will deadlock if we happen to be logging > * from the scheduler (trying to re-grab > * rq->lock), so defer it. > */ > __mod_timer(&buf->timer, jiffies + 1); > > and the comment is utter rubbish: __mod_timer() can lock up just as > much. Just use an adaptive-polling method to drive the draining of the > relay buffer, instead of mucking with timers from within the tracing > code. Whoever implemented this has absolutely zero clue i have to say > ... > > the smp_mb() is rubbish too. > > could you try the patch below, does it fix the problem? >
Hi Ingo, I did not have this problem with LTTng, since I only touch atomic variables in tracing code. However, I iterate on a rcu list of active traces in a timer interrupt to see if subbuffers must be read and, if yes, I send a wakeup to my user-space daemon. I had to change the protection around this rcu list read from preempt disable to mutex lock in this timer because try_to_wakeup is called in it, which takes a spinlock. Note: I don't use relay code at my tracing site. I guess they might have to switch to such an asynchronous delivery system if they want to do this properly. Simply put, your polling solution is exactly what I do, but I check a flag set by the writer instead of waking up the readers unconditionally. Mathieu > Ingo > > -------------------------------------> > Subject: relay: fix timer madness > From: Ingo Molnar <[EMAIL PROTECTED]> > > remove timer calls (!!!) from deep within the tracing infrastructure. > This was totally bogus code that can cause lockups and worse. > Poll the buffer every 2 jiffies for now. > > Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> > --- > kernel/relay.c | 14 +++++--------- > 1 file changed, 5 insertions(+), 9 deletions(-) > > Index: linux-rt-rebase.q/kernel/relay.c > =================================================================== > --- linux-rt-rebase.q.orig/kernel/relay.c > +++ linux-rt-rebase.q/kernel/relay.c > @@ -319,6 +319,10 @@ static void wakeup_readers(unsigned long > { > struct rchan_buf *buf = (struct rchan_buf *)data; > wake_up_interruptible(&buf->read_wait); > + /* > + * Stupid polling for now: > + */ > + mod_timer(&buf->timer, jiffies + 1); > } > > /** > @@ -336,6 +340,7 @@ static void __relay_reset(struct rchan_b > init_waitqueue_head(&buf->read_wait); > kref_init(&buf->kref); > setup_timer(&buf->timer, wakeup_readers, (unsigned long)buf); > + mod_timer(&buf->timer, jiffies + 1); > } else > del_timer_sync(&buf->timer); > > @@ -604,15 +609,6 @@ size_t relay_switch_subbuf(struct rchan_ > buf->subbufs_produced++; > buf->dentry->d_inode->i_size += buf->chan->subbuf_size - > buf->padding[old_subbuf]; > - smp_mb(); > - if (waitqueue_active(&buf->read_wait)) > - /* > - * Calling wake_up_interruptible() from here > - * will deadlock if we happen to be logging > - * from the scheduler (trying to re-grab > - * rq->lock), so defer it. > - */ > - __mod_timer(&buf->timer, jiffies + 1); > } > > old = buf->data; -- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/