Re: [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix

Jeff Garzik Sun, 21 Aug 2005 14:13:09 -0700

Tejun Heo wrote:

 Hello, Jeff.

Jeff Garzik wrote:
Tejun Heo wrote:
 Hello, Jeff and Albert.

 This patchset fixes the following race.
(port A has ATA device and B ATAPI).

port B    : ata_issue_prot() (ATAPI_NODATA)
port B    : packet_task scheduled
port A    : ata_issue_prot() (DMA)
intr    : ata_interrupt()
port A    : ata_host_intr -> this is all good & dandy
port B    : ata_host_intr -> finishes ATAPI cmd w/ error (request sense)

 This is where the race is, we're polling for port B's qc, we must not
mess with it from interrupt.  Now, port B has dangling packet_task
which will race w/ whatever will run on port B.

 The problem is that we don't always protect polled ports from
interrupts with ata_qc_set_polling() and for non-DMA ATA commands
there's no way to discern if actually an IRQ has occurred (this
sucks), so we end up finishing the other port's command.

 This condition occurs quite often if both port A and B are busy.  The
most common result is assertion failure in atapi_packet_task
(assert(qc->flags & ATA_QCFLAG_ACTIVE)), but depending on timing and
on SMP, weirder things could happen, I think.

 Note that for ATAPI_DMA, interrupt from the other port won't mess
with a polled command as we can tell that it's not ours with
bmdma_status, but, if spurious interrupt occurs on the port, the
packet_task will go dangling.  That's why ATAPI_DMA also needs
protection.  The baseline is that all polled qc's need to be protected
with ata_qc_set_polling() until polling task is done with the command.
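The invariant in the last sentence can be sketched as a small user-space simulation. This is only an illustration and not the code from the patches: the sim_* names, the flag values, and the structures are hypothetical stand-ins for the libata ones. It assumes a shared interrupt handler that walks every port on the line and must leave a qc alone while the polling task owns it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libata's qc flags; the values are illustrative. */
enum { QC_ACTIVE = 1 << 0, QC_POLLING = 1 << 1 };

struct sim_qc {
    unsigned flags;
    int completions;            /* how many times this qc was completed */
};

struct sim_port {
    struct sim_qc *active_qc;   /* NULL when the port is idle */
};

/* Mark a qc as owned by the polling task. */
static void sim_qc_set_polling(struct sim_qc *qc)
{
    qc->flags |= QC_POLLING;
}

/* Complete the active qc on a port; it must still be active. */
static void sim_qc_complete(struct sim_port *ap)
{
    struct sim_qc *qc = ap->active_qc;

    assert(qc != NULL && (qc->flags & QC_ACTIVE));
    qc->flags &= ~(unsigned)(QC_ACTIVE | QC_POLLING);
    qc->completions++;
    ap->active_qc = NULL;
}

/*
 * Shared interrupt handler: visits every port on the line and returns
 * the number of qc's it completed. Polled qc's are skipped, so the
 * polling task is the only one that ever finishes them.
 */
static int sim_interrupt(struct sim_port *ports, size_t n)
{
    int handled = 0;

    for (size_t i = 0; i < n; i++) {
        struct sim_qc *qc = ports[i].active_qc;

        if (qc == NULL || !(qc->flags & QC_ACTIVE))
            continue;
        if (qc->flags & QC_POLLING)
            continue;           /* owned by the polling task: hands off */
        sim_qc_complete(&ports[i]);
        handled++;
    }
    return handled;
}
```

With port A running an interrupt-driven command and port B a polled one, a call to sim_interrupt() completes only port A's qc. Port B's qc stays active until the polling side calls sim_qc_complete() itself, so the assertion there cannot fire because of the interrupt path.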


[ Start of patch descriptions ]
While this is something to look into, the supplied patches are definitely not the way we want to go. We need to follow the state diagram for the PACKET command protocol. ATA4 [1] diagrams are a good thing to read, since they include mention of the behavior of certain ATAPI devices that can send an interrupt between taskfile-out and write-cdb steps in the sequence.
In patch #2, you're making ata_irq_on() way too heavy. In patch #1, calling ata_qc_set_polling() for non-polled commands is a hack.
Have you received patches in the reverse order? #1 changes ata_irq_on() and #2 adds ata_qc_set_polling().
Hmmm, only a call to ata_chk_status() is added to ata_irq_on(), which I think is needed regardless of other changes, and ata_wait_idle() is removed. Does that make the function heavy?
The better solution is to track the PACKET command protocol state much more closely, so that the code _knows_ when it should and shouldn't be getting an interrupt.
This is required anyway because, as mentioned in another email, an ATA device might assert INTRQ for certain events, such as non-data commands, where the controller is not required to assert the BMDMA IRQ bit.
I _suspect_ that many host controllers cause the BMDMA IRQ bit to track the ATA INTRQ status precisely, but this theory has not been validated.
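To make the "track the protocol state" idea concrete, here is a minimal sketch for a non-data PACKET command. It is a deliberately reduced model and not the ATA-4 state diagram or any libata code: the pkt_* names, the four states, and the cdb_intr field are hypothetical, and cdb_intr stands for the assumption that a given device interrupts between the taskfile-out and write-cdb steps.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified states for a non-data PACKET command. */
enum pkt_state {
    PKT_IDLE,        /* no command outstanding */
    PKT_TF_WRITTEN,  /* taskfile written, CDB not yet sent */
    PKT_CDB_SENT,    /* CDB written, waiting for completion */
    PKT_DONE         /* command finished */
};

struct pkt_cmd {
    enum pkt_state state;
    bool cdb_intr;   /* assumption: device interrupts before taking the CDB */
};

/* The state alone decides whether an interrupt is legitimate right now. */
static bool pkt_expects_irq(const struct pkt_cmd *c)
{
    switch (c->state) {
    case PKT_TF_WRITTEN:
        return c->cdb_intr;
    case PKT_CDB_SENT:
        return true;
    default:
        return false;
    }
}

/* Host writes the taskfile (PACKET command issued). */
static void pkt_issue(struct pkt_cmd *c)
{
    assert(c->state == PKT_IDLE);
    c->state = PKT_TF_WRITTEN;
}

/* Host writes the CDB, either from the polling path or from pkt_irq(). */
static void pkt_write_cdb(struct pkt_cmd *c)
{
    assert(c->state == PKT_TF_WRITTEN);
    c->state = PKT_CDB_SENT;
}

/*
 * Interrupt entry point. Returns true if the interrupt was expected and
 * consumed, false if it is not ours (shared line, or a hardware or
 * software bug) and the command must be left untouched.
 */
static bool pkt_irq(struct pkt_cmd *c)
{
    if (!pkt_expects_irq(c))
        return false;
    if (c->state == PKT_TF_WRITTEN)
        pkt_write_cdb(c);
    else
        c->state = PKT_DONE;
    return true;
}
```

For a device with cdb_intr clear, an interrupt arriving after pkt_issue() but before pkt_write_cdb() is rejected and the state does not change, which is the case the race description is about.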
    Jeff


[1] http://www.t13.org/project/d1153r18-ATA-ATAPI-4.pdf
Without interrupt pending information from BMDMA bit for non-DMA commands (which I don't think we can use for non-DMA cmds as we'll never be sure if all controllers behave that way), the problem is that for many SATA controllers, more than one ports share single interrupt line. And without interrupt pending bit, shared interrupt means a lot of spurious interrupts making it impossible to know when to expect interrupts.


Incorrect...  this is why I keep harping on the "ATA host state machine."

You rely on proper implementation to know when to expect interrupts. Read the state diagrams, they tell you precisely when an interrupt may be expected. By definition, any other interrupt is probably a PCI shared interrupt, or a hardware or software bug.

IDE driver deals with this by having only one command active per interrupt, but SATA doesn't have such scheme yet. And I don't know if such a scheme is desirable at all. Maybe just continuing to poll non-DMA commands (which isn't much a burden anyway) and keeping DMA commands fast is a better approach.

The IDE driver has a high density, but look closely... it follows the ATA host state machine as well.

So, how should we do here? To follow ATA/ATAPI state machine, we need to implement exclusion among ports sharing an interrupt. Is this the way to go? Arggggggg... Lack of interrupt pending bit is such a pain in the ass. :-(

If the code knows what state it's in, it knows whether or not to expect an interrupt. All that state information should already be synchronized by spinlock(host-lock), too.


        Jeff


