Hello Tejun,

Tejun Heo wrote:
Hello, Daniel.

Daniel Beichl wrote:
Okay, I've been thinking about it for some time.  I think I know what's
going on now.  Those authentication related commands are issued by
invoking ioctls on the device node.  The cdrom driver receives the
ioctls and issues respective SCSI commands for it.  All those DVD auth
commands either transfer or receive data from the device and thus
specifies certain data direction when they're issued.  libata selects
command protocol accordingly.

It's all good and dandy till now but the problem is that those commands
can be issued with data length of zero!  This is allowed by the SCSI MMC
standard and used to probe whether the command succeeds or not before
taking further actions.  When this happens, libata chooses command
protocol according to the data direction, but has to feed 0-length data
to the DMA engine.  It seems sata_sil dma engine barfs if that happens.
 Can you please add the following code snippet at the head of
atapi_xlat()?

    if (!nodata && !qc->nbytes) {
        printk("XXX forcing PIO for 0 length data cdb %02x\n",
            scmd->cmnd[0]);
        dump_stack();
        using_pio = 1;
    }

Also, do you mind cc'ing [EMAIL PROTECTED]

Hi Tejun,


i inserted the code snippet you mentioned to atapi_xlat() in libata-scsi.c
and the dvd authentication succeeds.

Please see the attached dmesg file for the generated stack dumps.

Good to know it works.  :-)

i did some more testing with my patch i sent you on friday and it shows
that it restricts all atapi commands to pio so it is probably bad to use it.
The question is whether this is the right place to fix it, as this seems
to limit all
drivers to pio if a zero data length command is transferred.
It would make more sense to me if the individual driver decides this by
performing a similar check in the check_atapi_dma function implemented
by the
individual driver. I did this for the sata_sil driver in the attached
patch.

The thing is that those commands don't transfer any data at all, so it
doesn't really matter whether the data phase is specified as DMA or PIO.
 ATA_PROT_ATAPI degenerates into ATA_PROT_ATAPI_NODATA if there is no
data to transfer (in ATA spec they share the same protocol state machine
and the data request status bit dictates what actually happens).

I think this actually should be fixed in the cdrom driver.  It shouldn't
issue data command with 0 data length with data protocol.  Can you test
whether the attached patch works?  Also, it seems we'll need to add a
WARN_ON() in sr such that bugs like this can be caught more easily.

Thanks.

I gave the whole thing some thought. You suspected the sil dma engine to
cause the issue, because it might only be usable with a certain command packet size. If these commands result in packets their size is not zero. Not all packet commands transfer
data sectors, but they themselves have a size.

So i investigated libata and learnt from libata-scsi.c that nbytes is set to the size of the scsi command corresponding to the atapi command that should be sent to the device near the end of atapi_xlat().

I added code to dump the packet size and data direction as set in the scsi command and noticed that the
controller freezes the port if a packet with the direction DMA_TO_DEVICE
and a length smaller than ~60 bytes is issued, the command type does not matter at all.

I tried this with the cdrecord -atip option which generate packets of
command cdb[0] = 0x00, length 60 and 16 in said direction.
The first command succeeds, the second one triggers the dma bug.

The question now is, is the dma engine of the sil chip indeed the problem, or did i cover another problem located somewhere else by restricting dma to packets not smaller than 60 bytes. The datasheet of the sil3512 does not specify a lower limit for dma transfers.

It does not appear to be a modulo N issue either as i have seen the transfer of
packets of 12 or 16 bytes fail but then again packets of 60 bytes succeeds.

size       result
==========
001100 fail
010000 fail
111100 succeed

A thing that crossed my mind was a timing issue. Perhaps the dmaing of < 60 bytes is that fast that the interrupt that causes it is not yet handled correctly. But this is just a very wild guess.

I tried your latest patch and the dma bug did show up. I have attached a patch to this mail that decides based on the dma direction and the command size in the sil_check_atapi_dma() routine whether to allow dma or not. This fixed the dvd auth issue and the cdrecord -atip issue for me,
but i am pretty sure there is a better solution.

thanks for your support

Daniel


--- sata_sil.c	2007-05-07 19:18:04.000000000 +0200
+++ sata_sil.c	2007-05-07 19:19:36.000000000 +0200
@@ -43,10 +43,11 @@
 #include <linux/interrupt.h>
 #include <linux/device.h>
 #include <scsi/scsi_host.h>
+#include <scsi/scsi_cmnd.h>
 #include <linux/libata.h>
 
 #define DRV_NAME	"sata_sil"
-#define DRV_VERSION	"2.1"
+#define DRV_VERSION	"2.1a"
 
 enum {
 	SIL_MMIO_BAR		= 5,
@@ -121,6 +122,7 @@
 static irqreturn_t sil_interrupt(int irq, void *dev_instance);
 static void sil_freeze(struct ata_port *ap);
 static void sil_thaw(struct ata_port *ap);
+static int sil_check_atapi_dma(struct ata_queued_cmd *qc);
 
 
 static const struct pci_device_id sil_pci_tbl[] = {
@@ -196,6 +198,7 @@
 	.tf_read		= ata_tf_read,
 	.check_status		= ata_check_status,
 	.exec_command		= ata_exec_command,
+	.check_atapi_dma	= sil_check_atapi_dma,
 	.dev_select		= ata_std_dev_select,
 	.post_set_mode		= sil_post_set_mode,
 	.bmdma_setup            = ata_bmdma_setup,
@@ -289,6 +292,26 @@
 module_param(slow_down, int, 0444);
 MODULE_PARM_DESC(slow_down, "Sledgehammer used to work around random problems, by limiting commands to 15 sectors (0=off, 1=on)");
 
+static int sil_check_atapi_dma(struct ata_queued_cmd *qc)
+{
+	/* limit atapi commands less than 60bytes to pio as certain sil */
+	/* chips seem to have trouble to perform dma for these commands */
+	if ( (qc->scsicmd != NULL) &&
+	     (qc->scsicmd->sc_data_direction == DMA_TO_DEVICE) &&
+	     (qc->scsicmd->request_bufflen < 60) )
+	{
+		ata_dev_printk( qc->dev, KERN_INFO, "reducing cmd %02x (direction=%d, len=%d) to pio\n", 
+			qc->cdb[0], 
+			qc->scsicmd->sc_data_direction, 
+			qc->scsicmd->request_bufflen );
+		/* do not allow dma */
+		return -1;
+	}
+
+	/* allow dma */
+	return 0;
+}
+
 
 static unsigned char sil_get_device_cache_line(struct pci_dev *pdev)
 {

Reply via email to