Re: [Linux-zigbee-devel] [PATCH 3/3] at86rf230: straighten out RX vs. TX synchronization

Werner Almesberger Wed, 22 Jun 2011 08:40:33 -0700

Alexander Smirnov wrote:
> > +       synchronize_irq(lp->spi->irq);
> > +       flush_work(&lp->irqwork);
> >
> 
> Could you please explain why we need these flush and sync?


synchronize_irq is for the case that an RX interrupt occurred before
leaving RX_ON, and the interrupt handler has not reached schedule_work
by the time we flush_work. If that interrupt was allowed to run after
flush_work (or somewhere in the middle of it), it would lead to the
work to be scheduled, thus invalidating the assertion that no work is
pending after flush_work.

This is admittedly an unlikely scenario on "normal" hardware. Not sure
how things like SMT can affect the relative speed of concurrent
execution, though.

flush_work is for the case that we got an RX interrupt before leaving
RX_ON, scheduling the work, but the worker hasn't executed yet. This
could now happen on a number of occasions. If it happens before the
transmission completes, it could cause us to try to enter RX_ON while
still in BUSY_TX, with an uncertain outcome.

> As I understand, now we are unprotected from RX interrupts, but lp->is_tx is
> still '1'. What will happen?

Oh, sorry. I forgot to move the is_tx. I'll send a patch in my next
mail.

I found another bug in my patch: we also need to go through the whole
synchronize_irq and flush_work ritual if
wait_for_completion_interruptible is interrupted, to prevent a pending
TRX_END from being interpreted as signaling an incoming frame.

Atmel could have made all this a lot easier by giving TX and RX
separate interrupt bits :)

> 
> 
> > +       if (!rc)
> > +               rc = rc2;
> >  err:
> > -       pr_err("%s error: %d\n", __func__, rc);
> > -       spin_lock_irqsave(&lp->lock, flags);
> > +       if (rc)
> > +               pr_err("%s error: %d\n", __func__, rc);
> >
> 
> For me such an error handling isn't too clear, we received an error status,
> then we do something, then print message...
> May be it will be better to keep current logic: if something fails - goto
> error_label and finish function instead of continue execution.

The intent here is not to continue as if nothing was wrong, but to
return to a valid state and report the first error that happened.

But before going into details, I have a more fundamental question:
under what circumstances is the wait_for_completion_interruptible
supposed to be interrupted ?

I'm a bit uncertain about the general error model in the driver.
E.g., at86rf230_state will bemoan a failed transition but doesn't
do anything to rectify the problem. Should it ? If not, who does ?

Also, what is the assumption of the nature of errors ? Driver bugs ?
Intermittent SPI problems ? Someone removing the transceiver during
operation ?

- Werner

------------------------------------------------------------------------------
Simplify data backup and recovery for your virtual environment with vRanger.
Installation's a snap, and flexible recovery options mean your data is safe,
secure and there when you need it. Data protection magic?
Nope - It's vRanger. Get your free trial download today.
http://p.sf.net/sfu/quest-sfdev2dev
_______________________________________________
Linux-zigbee-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-zigbee-devel

Re: [Linux-zigbee-devel] [PATCH 3/3] at86rf230: straighten out RX vs. TX synchronization

Reply via email to