Re: libata: CD and dvd devices not recognized
Albert Lee wrote:
> Hi Yarema,
>
> Thanks for the detailed log. It looks like the bad INQUIRY command
>
>   CDB (4:0,1,0) 12 01 00 00 fe 00 00 00 00
>
> (INQUIRY, length=254, EVPD=1) is coming from the user space, not the
> SCSI mid-layer.
>
> I guess two problems together caused this bug:
> 1. Ubuntu Linux issues an incorrect INQUIRY command to the drive.
>    (Other distros seem to have the INQUIRY correct.)
> 2. The incorrect INQUIRY happens to cause the AOpen drive frozen.
>    (The HP drive is immune from the incorrect INQUIRY command.
>    check condition is returned for the bad INQUIRY.)
>
> We have two possible solutions here:
> a. Patch Ubuntu, such that the incorrect INQUIRY is fixed.
> b. Patch kernel, such that the AOpen drives are blacklisted.
>    Each INQUIRY is inspected for the blacklisted drives.
>    If the INQUIRY looks wrong, the INQUIRY is rejected.
>
> I guess a. is the preferred solution...

I second Albert's opinion. Please report this to ubuntu people so that
the origin of the problem can be fixed.

Thanks a lot. I admire your ability and patience in tracking these
difficult issues. :-)

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: problem detecting ATAPI device with ahci on 2.6.21-rcx
Kristen Carlson Accardi wrote:
> Hi,
> I upgraded a machine from 2.6.20 to 2.6.21-rc4 and am now having
> problems with my ATAPI device getting detected properly. Back tracking
> to 2.6.21-rc1, I find the problem existed there too, but not in 2.6.20.

Please post boot dmesg of 2.6.20. Is your ATAPI device connected using
80c cable?

--
tejun
Re: libata: CD and dvd devices not recognized
> We have two possible solutions here:
> a. Patch Ubuntu, such that the incorrect INQUIRY is fixed.
> b. Patch kernel, such that the AOpen drives are blacklisted.
>    Each INQUIRY is inspected for the blacklisted drives.
>    If the INQUIRY looks wrong, the INQUIRY is rejected.
>
> I guess a. is the preferred solution...

We have two problems here

#1 Ubuntu got the inquiry command wrong
#2 Until now we considered INQUIRY a safe command for SG_IO passthrough.

We can't really take INQUIRY out of SG_IO, so do we decide it's the
hardware vendor's problem or do something cleverer in the filters?

Alan
sata dvd not detected on jmicron with some kernel-configurations.
When booting a GNU/Linux distribution's install CD I noticed that the
included [1]kernel (2.6.20.3) would not probe/find my JMicron-connected
SATA DVD drive. My own [2]kernel-configuration finds the drive just fine,
but the distributor's kernel, with all sorts of drivers built in, won't.

If I connect the DVD-ROM to my ICH8R controller, both kernels find it
just fine.

The JMicron controller is configured as AHCI in the system's BIOS, and
AHCI is built into both kernels along with scsi-cdrom support.

[1] http://fredrik.obra.se/linux-2.6.20.3.config
[2] http://fredrik.obra.se/2.6.20.3.config

Any suggestions on what breaks it?

Cheers.
--
Fredrik Rinnestam
[PATCH libata-dev#upstream-fixes] libata: IDENTIFY backwards for drive side cable detection
For drive side cable detection to work correctly, drives need to be
identified backwards such that the slave device releases PDIAG- before
the master drive tries to detect cable type. ata_bus_probe() was fixed
by commit f31f0cc2f0b7527072d94d02da332d9bb8d7d94c but the new EH path
wasn't fixed. This patch makes the new EH path do IDENTIFY backwards.

ata_dev_configure() for new devices is still performed master first.
This is to keep the detection messages in forward order.

Signed-off-by: Tejun Heo [EMAIL PROTECTED]
---
Jeff, this one should go into #upstream-fixes. The following
regression is fixed by this.

  http://thread.gmane.org/gmane.linux.ide/17433

Thanks.

diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 361953a..c89664a 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1743,12 +1743,17 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
 {
 	struct ata_eh_context *ehc = &ap->eh_context;
 	struct ata_device *dev;
+	unsigned int new_mask = 0;
 	unsigned long flags;
 	int i, rc = 0;

 	DPRINTK("ENTER\n");

-	for (i = 0; i < ATA_MAX_DEVICES; i++) {
+	/* For PATA drive side cable detection to work, IDENTIFY must
+	 * be done backwards such that PDIAG- is released by the slave
+	 * device before the master device is identified.
+	 */
+	for (i = ATA_MAX_DEVICES - 1; i >= 0; i--) {
 		unsigned int action, readid_flags = 0;

 		dev = &ap->device[i];
@@ -1760,13 +1765,13 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
 		if (action & ATA_EH_REVALIDATE && ata_dev_ready(dev)) {
 			if (ata_port_offline(ap)) {
 				rc = -EIO;
-				break;
+				goto err;
 			}

 			ata_eh_about_to_do(ap, dev, ATA_EH_REVALIDATE);
 			rc = ata_dev_revalidate(dev, readid_flags);
 			if (rc)
-				break;
+				goto err;

 			ata_eh_done(ap, dev, ATA_EH_REVALIDATE);

@@ -1784,40 +1789,53 @@ static int ata_eh_revalidate_and_attach(struct ata_port *ap,
 			rc = ata_dev_read_id(dev, &dev->class, readid_flags,
 					     dev->id);
-			if (rc == 0) {
-				ehc->i.flags |= ATA_EHI_PRINTINFO;
-				rc = ata_dev_configure(dev);
-				ehc->i.flags &= ~ATA_EHI_PRINTINFO;
-			} else if (rc == -ENOENT) {
+			switch (rc) {
+			case 0:
+				new_mask |= 1 << i;
+				break;
+			case -ENOENT:
 				/* IDENTIFY was issued to non-existent
 				 * device. No need to reset. Just
 				 * thaw and kill the device.
 				 */
 				ata_eh_thaw_port(ap);
 				dev->class = ATA_DEV_UNKNOWN;
-				rc = 0;
-			}
-
-			if (rc) {
-				dev->class = ATA_DEV_UNKNOWN;
 				break;
+			default:
+				dev->class = ATA_DEV_UNKNOWN;
+				goto err;
 			}
+		}
+	}

-			if (ata_dev_enabled(dev)) {
-				spin_lock_irqsave(ap->lock, flags);
-				ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
-				spin_unlock_irqrestore(ap->lock, flags);
+	/* Configure new devices forward such that user doesn't see
+	 * device detection messages backwards.
+	 */
+	for (i = 0; i < ATA_MAX_DEVICES; i++) {
+		dev = &ap->device[i];

-				/* new device discovered, configure xfermode */
-				ehc->i.flags |= ATA_EHI_SETMODE;
-			}
-		}
+		if (!(new_mask & (1 << i)))
+			continue;
+
+		ehc->i.flags |= ATA_EHI_PRINTINFO;
+		rc = ata_dev_configure(dev);
+		ehc->i.flags &= ~ATA_EHI_PRINTINFO;
+		if (rc)
+			goto err;
+
+		spin_lock_irqsave(ap->lock, flags);
+		ap->pflags |= ATA_PFLAG_SCSI_HOTPLUG;
+		spin_unlock_irqrestore(ap->lock, flags);
+
+		/* new device discovered, configure xfermode */
+		ehc->i.flags |= ATA_EHI_SETMODE;
 	}

-	if (rc)
-		*r_failed_dev = dev;
+	return 0;

-	DPRINTK("EXIT\n");
+ err:
+
Re: [PATCH libata-dev#upstream-fixes] libata: IDENTIFY backwards for drive side cable detection
On Thu, 22 Mar 2007 22:24:19 +0900 Tejun Heo [EMAIL PROTECTED] wrote:
> For drive side cable detection to work correctly, drives need to be
> identified backwards such that the slave device releases PDIAG- before
> the master drive tries to detect cable type. ata_bus_probe() was fixed
> by commit f31f0cc2f0b7527072d94d02da332d9bb8d7d94c but the new EH path
> wasn't fixed. This patch makes the new EH path do IDENTIFY backwards.
>
> ata_dev_configure() for new devices is still performed master first.
> This is to keep the detection messages in forward order.
>
> Signed-off-by: Tejun Heo [EMAIL PROTECTED]

Acked-by: Alan Cox [EMAIL PROTECTED]

Why do we have two implementations of the same code?
Re: libata: CD and dvd devices not recognized
Hello.

Albert Lee wrote:
> Thanks for the detailed log. It looks like the bad INQUIRY command
>
>   CDB (4:0,1,0) 12 01 00 00 fe 00 00 00 00
>
> (INQUIRY, length=254, EVPD=1) is coming from the user space, not the
> SCSI mid-layer.
>
> I guess two problems together caused this bug:
> 1. Ubuntu Linux issues an incorrect INQUIRY command to the drive.
>    (Other distros seem to have the INQUIRY correct.)

But what is incorrect about sending INQUIRY with EVPD bit?

MBR, Sergei
Re: [PATCH libata-dev#upstream-fixes] libata: IDENTIFY backwards for drive side cable detection
Alan Cox wrote:
> On Thu, 22 Mar 2007 22:24:19 +0900 Tejun Heo [EMAIL PROTECTED] wrote:
> > For drive side cable detection to work correctly, drives need to be
> > identified backwards such that the slave device releases PDIAG- before
> > the master drive tries to detect cable type. ata_bus_probe() was fixed
> > by commit f31f0cc2f0b7527072d94d02da332d9bb8d7d94c but the new EH path
> > wasn't fixed. This patch makes the new EH path do IDENTIFY backwards.
> >
> > ata_dev_configure() for new devices is still performed master first.
> > This is to keep the detection messages in forward order.
> >
> > Signed-off-by: Tejun Heo [EMAIL PROTECTED]
>
> Acked-by: Alan Cox [EMAIL PROTECTED]
>
> Why do we have two implementations of the same code?

ata_bus_probe() is scheduled to be killed once all old-EH code is
removed. That's the old probe path.

--
tejun
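The ordering described in the patch (IDENTIFY from the last device to the first, then configure in forward order using a bitmask of newly found devices) can be sketched outside the kernel roughly as follows. This is only an illustration under stated assumptions: `MAX_DEVICES`, `probe_two_pass()`, the callback types and the `demo_*` helpers are hypothetical names, not libata symbols, and none of the EH bookkeeping from the real function is modelled.

```c
#define MAX_DEVICES 2	/* stand-in for ATA_MAX_DEVICES: 0 = master, 1 = slave */

/* Hypothetical hooks; each returns 0 on success. */
typedef int (*identify_fn)(int devno);
typedef int (*configure_fn)(int devno);

/*
 * Pass 1 walks the devices backwards and only records which ones
 * answered IDENTIFY. Pass 2 configures the recorded devices in
 * forward order, so detection messages come out master first.
 * Returns 0 on success or the first configure error.
 */
static int probe_two_pass(identify_fn identify, configure_fn configure,
			  unsigned int *new_mask_out)
{
	unsigned int new_mask = 0;
	int i, rc;

	for (i = MAX_DEVICES - 1; i >= 0; i--)
		if (identify(i) == 0)
			new_mask |= 1u << i;

	*new_mask_out = new_mask;

	for (i = 0; i < MAX_DEVICES; i++) {
		if (!(new_mask & (1u << i)))
			continue;
		rc = configure(i);
		if (rc)
			return rc;
	}
	return 0;
}

/* Demo hooks that record call order: devno for identify, 100 + devno for configure. */
static int call_log[2 * MAX_DEVICES];
static int call_count;

static int demo_identify(int devno)
{
	call_log[call_count++] = devno;
	return 0;
}

static int demo_configure(int devno)
{
	call_log[call_count++] = 100 + devno;
	return 0;
}
```

With the demo hooks, the recorded order is identify(1), identify(0), configure(0), configure(1), which is the slave-first probing and master-first reporting the patch description asks for.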
Re: Change Libata Error Handling for Drive Testing
Hi Tejun,

JFYI, it turns out that the spurious interrupts were caused by User Scan
before the drive is ready. I wait for 2 seconds after the drive is powered
on, which is not sufficient for some drives. Alt status should be checked
first but there's no good way to check it in user space. Does the User Scan
related code check alt status before the drive is touched?

Thanks,
Fajun

On 3/19/07, Fajun Chen [EMAIL PROTECTED] wrote:
> On 3/19/07, Tejun Heo [EMAIL PROTECTED] wrote:
> > Fajun Chen wrote:
> > > Please ignore the changes to pata_sil680.c. The same failure happened
> > > to standard sil680 driver without my change as well.
> >
> > Does it also happen when the second port is empty?
>
> Yes, it happens even when one of the ports (either one) is powered off. It
> used to happen in the middle of our IO test application, now it happens
> much earlier, in our test spinup process, with a debugging version of the
> ata_host_intr() function. We boot up the target (ARM XScale processor)
> with the hard drive powered off, then power up the drive and do test
> spinup. What test spinup does is to issue a sysfs user scan on the port
> followed by Identify Device.
>
> Below is my debugging version of the ata_host_intr() function with
> ATA_IRQ_TRAP enabled and hacked. What puzzled me is that none of the
> sub-counters (initial value is 1) get incremented in most failures.
> Please see the dmesg log for details. I did see one failure (out of many)
> where idle_irq_hsm_state is incremented and matches the idle_irq counter
> though.
>
> Thanks,
> Fajun
>
> inline unsigned int ata_host_intr (struct ata_port *ap,
> 				   struct ata_queued_cmd *qc)
> {
> 	u8 status, host_stat = 0;
>
> /* 	VPRINTK("ata%u: protocol %d task_state %d\n", */
> /* 		ap->id, qc->tf.protocol, ap->hsm_task_state); */
> /* 	printk(KERN_INFO "ata%u: protocol %d task_state %d\n", */
> /* 		ap->id, qc->tf.protocol, ap->hsm_task_state); */
>
> 	/* Check whether we are expecting interrupt in this state */
> 	switch (ap->hsm_task_state) {
> 	case HSM_ST_FIRST:
> 		/* Some pre-ATAPI-4 devices assert INTRQ
> 		 * at this state when ready to receive CDB.
> 		 */
>
> 		/* Check the ATA_DFLAG_CDB_INTR flag is enough here.
> 		 * The flag was turned on only for atapi devices.
> 		 * No need to check is_atapi_taskfile(&qc->tf) again.
> 		 */
> 		if (!(qc->dev->flags & ATA_DFLAG_CDB_INTR)) {
> /* 			printk(KERN_INFO "ata%u: flags %lu\n", */
> /* 				ap->id, qc->dev->flags); */
> 			ap->stats.idle_irq_non_atapi++;
> 			goto idle_irq;
> 		}
> 		break;
> 	case HSM_ST_LAST:
> 		if (qc->tf.protocol == ATA_PROT_DMA ||
> 		    qc->tf.protocol == ATA_PROT_ATAPI_DMA) {
> 			/* check status of DMA engine */
> 			host_stat = ap->ops->bmdma_status(ap);
> 			VPRINTK("ata%u: host_stat 0x%X\n", ap->id, host_stat);
>
> 			/* if it's not our irq... */
> 			if (!(host_stat & ATA_DMA_INTR))
> /* 				printk(KERN_INFO "ata%u: host_stat %d\n", */
> /* 					ap->id, host_stat); */
> 				ap->stats.idle_irq_host_state++;
> 				goto idle_irq;
>
> 			/* before we do anything else, clear DMA-Start bit */
> 			ap->ops->bmdma_stop(qc);
>
> 			if (unlikely(host_stat & ATA_DMA_ERR)) {
> 				/* error when transfering data to/from memory */
> 				qc->err_mask |= AC_ERR_HOST_BUS;
> 				ap->hsm_task_state = HSM_ST_ERR;
> 			}
> 		}
> 		break;
> 	case HSM_ST:
> 		break;
> 	default:
> /* 		printk(KERN_INFO "ata%u: hsm_state %d\n", */
> /* 			ap->id, ap->hsm_task_state); */
> 		ap->stats.idle_irq_hsm_state++;
> 		goto idle_irq;
> 	}
>
> 	/* check altstatus */
> 	status = ata_altstatus(ap);
> 	if (status & ATA_BUSY) {
> /* 		printk(KERN_INFO "ata%u: altstatus %d\n", */
> /* 			ap->id, status); */
> 		ap->stats.idle_irq_altstatus++;
> 		goto idle_irq;
> 	}
>
> 	/* check main status, clearing INTRQ */
> 	status = ata_chk_status(ap);
> 	if (unlikely(status & ATA_BUSY)) {
> /* 		printk(KERN_INFO "ata%u: status %d\n", */
> /* 			ap->id, status); */
> 		ap->stats.idle_irq_status++;
> 		goto idle_irq;
> 	}
>
> 	/* ack bmdma irq events */
> 	ap->ops->irq_clear(ap);
>
> 	ata_hsm_move(ap, qc, status, 0);
> 	return 1;	/* irq handled */
>
> idle_irq:
> 	ap->stats.idle_irq++;
>
> #ifdef ATA_IRQ_TRAP
> 	if ((ap->stats.idle_irq %
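One detail in the debugging version above may be worth a second look. As the code appears in this archive, the "not our irq" test in the HSM_ST_LAST branch has no braces around the counter increment and the goto. If that matches the code that was actually built (the archive has lost other punctuation, so this is an assumption), the goto would be taken on every DMA completion, bumping idle_irq without touching any sub-counter. The sketch below only demonstrates the C semantics. `struct irq_stats`, `DMA_INTR`, `check_unbraced()` and `check_braced()` are hypothetical stand-ins, not libata code.

```c
/* Hypothetical counters mirroring the debug statistics in the message. */
struct irq_stats {
	unsigned int idle_irq;
	unsigned int idle_irq_host_state;
};

#define DMA_INTR 0x04	/* stand-in for ATA_DMA_INTR (bit 2 of BMDMA status) */

/* Unbraced form: only the increment is governed by the if. */
static int check_unbraced(struct irq_stats *st, unsigned char host_stat)
{
	if (!(host_stat & DMA_INTR))
		st->idle_irq_host_state++;
	goto idle_irq;	/* not part of the if: taken on every call */

	return 1;	/* never reached */
idle_irq:
	st->idle_irq++;
	return 0;
}

/* Braced form: the goto is taken only when the IRQ is not ours. */
static int check_braced(struct irq_stats *st, unsigned char host_stat)
{
	if (!(host_stat & DMA_INTR)) {
		st->idle_irq_host_state++;
		goto idle_irq;
	}
	return 1;	/* irq is ours, continue handling */
idle_irq:
	st->idle_irq++;
	return 0;
}
```

In the unbraced form a genuine DMA interrupt still lands in idle_irq with the sub-counter unchanged, which would be consistent with the symptom described, though it may well not be the only cause.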
Re: libata - 2.6.21-rc4-git5, ata channel still badly configured
On Thu, Mar 22, 2007 at 03:44:58PM +0900, Tejun Heo wrote:
> Lukas Hejtmanek wrote:
> > Subject    : ata_piix: PATA UDMA/100 configured as UDMA/33
> > References : http://lkml.org/lkml/2007/2/20/294
> > Submitter  : Fabio Comolli [EMAIL PROTECTED]
> > Status     : patch exists
>
> Does this fix your problem?

Yes, it does. Thank you.

--
Lukáš Hejtmánek
2.6.20.3 AMD64 oops in CFQ code
This is a uniprocessor AMD64 system running software RAID-5 and RAID-10
over multiple PCIe SiI3132 SATA controllers. The hardware has been very
stable for a long time, but has been acting up of late since I upgraded
to 2.6.20.3. ECC memory should preclude the possibility of bit-flip
errors.

Kernel 2.6.20.3 + linuxpps patches (confined to drivers/serial, and not
actually in use as I stole the serial port for a console).

It takes half a day to reproduce the problem, so bisecting would be
painful.

BackupPC_dump mostly writes to a large (1.7 TB) ext3 RAID5 partition.

Here are two oopses, a few minutes (16:31, to be precise) apart.
Unusually, it oopsed twice *without* locking up the system. Usually, I
see this followed by an error from drivers/input/keyboard/atkbd.c:

	printk(KERN_WARNING "atkbd.c: Spurious %s on %s. Some program might be trying access hardware directly.\n",

emitted at 1 Hz with the keyboard LEDs flashing and the system
unresponsive to keyboard or pings. (I think it was "spurious ACK on
serio/input0", but my memory may be faulty.)

If anyone has any suggestions, they'd be gratefully received.

Unable to handle kernel NULL pointer dereference at 0098 RIP:
 [8031504a] cfq_dispatch_insert+0x18/0x68
PGD 777e9067 PUD 78774067 PMD 0
Oops: [1]
CPU 0
Modules linked in: ecb
Pid: 2837, comm: BackupPC_dump Not tainted 2.6.20.3-g691f5333 #40
RIP: 0010:[8031504a] [8031504a] cfq_dispatch_insert+0x18/0x68
RSP: 0018:8100770bbaf8 EFLAGS: 00010092
RAX: 81007fb36c80 RBX: RCX: 0001
RDX: 00010003e4e7 RSI: RDI:
RBP: 81007fb37a00 R08: R09: 81005d390298
R10: 81007fcb4f80 R11: 81007fcb4f80 R12: 81007facd280
R13: 0004 R14: 0001 R15:
FS: 2b322d120d30() GS:805de000() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 0098 CR3: 7bcf CR4: 06e0
Process BackupPC_dump (pid: 2837, threadinfo 8100770ba000, task 81007fc5d8e0)
Stack: 8100770f39f0 0004 0001 80315253 803b2607 81005da2bc40 81007fac3800 81007facd280 81007facd280 81005d390298
Call Trace:
 [80315253] cfq_dispatch_requests+0x152/0x512
 [803b2607] scsi_done+0x0/0x18
 [8030d9f1] elv_next_request+0x137/0x147
 [803b7ce0] scsi_request_fn+0x6a/0x33a
 [8024d407] generic_unplug_device+0xa/0xe
 [80407ced] unplug_slaves+0x5b/0x94
 [80223d65] sync_page+0x0/0x40
 [80223d9b] sync_page+0x36/0x40
 [80256d45] __wait_on_bit_lock+0x36/0x65
 [80237496] __lock_page+0x5e/0x64
 [8028061d] wake_bit_function+0x0/0x23
 [802074de] find_get_page+0xe/0x2d
 [8020b38e] do_generic_mapping_read+0x1c2/0x40d
 [8020bd80] file_read_actor+0x0/0x118
 [8021422e] generic_file_aio_read+0x15c/0x19e
 [8020bafa] do_sync_read+0xc9/0x10c
 [80210342] may_open+0x5b/0x1c6
 [802805ef] autoremove_wake_function+0x0/0x2e
 [8020a857] vfs_read+0xaa/0x152
 [8020faf3] sys_read+0x45/0x6e
 [8025041e] system_call+0x7e/0x83

Code: 4c 8b ae 98 00 00 00 4c 8b 70 08 e8 63 fe ff ff 8b 43 28 4c
RIP [8031504a] cfq_dispatch_insert+0x18/0x68
 RSP 8100770bbaf8
CR2: 0098

<1>Unable to handle kernel NULL pointer dereference at 0098 RIP:
 [8031504a] cfq_dispatch_insert+0x18/0x68
PGD 79bd2067 PUD 789f9067 PMD 0
Oops: [2]
CPU 0
Modules linked in: ecb
Pid: 2834, comm: BackupPC_dump Not tainted 2.6.20.3-g691f5333 #40
RIP: 0010:[8031504a] [8031504a] cfq_dispatch_insert+0x18/0x68
RSP: 0018:8100789b5af8 EFLAGS: 00010092
RAX: 81007fb36c80 RBX: RCX: 0001
RDX: 00010007ac16 RSI: RDI:
RBP: 81007fb37a00 R08: R09: 810064dd45e0
R10: 81007fcb4f80 R11: 81007fcb4f80 R12: 81007facd280
R13: 0004 R14: 0001 R15:
FS: 2b0a7c680d30() GS:805de000() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 0098 CR3: 79d36000 CR4: 06e0
Process BackupPC_dump (pid: 2834, threadinfo 8100789b4000, task 81007a235140)
Stack: 81007b9ebbd0 0004 0001 80315253 803b2607 81000e67ba00 81007fac3800 81007facd280 81007facd280 810064dd45e0
Call Trace:
 [80315253] cfq_dispatch_requests+0x152/0x512
 [803b2607] scsi_done+0x0/0x18
 [8030d9f1] elv_next_request+0x137/0x147
 [803b7ce0] scsi_request_fn+0x6a/0x33a
 [8024d407]
Re: 2.6.20.3 AMD64 oops in CFQ code
> This is a uniprocessor AMD64 system running software RAID-5 and RAID-10
> over multiple PCIe SiI3132 SATA controllers. The hardware has been very
> stable for a long time, but has been acting up of late since I upgraded
> to 2.6.20.3. ECC memory should preclude the possibility of bit-flip
> errors.

Tried checking the memory with memtest86?
Do you have the k8_edac module loaded? If you don't, I'd recommend using
it to get reports of recoverable/unrecoverable memory errors; check
http://bluesmoke.sf.net/ for the latest version.

--
Aristeu
Re: 2.6.20.3 AMD64 oops in CFQ code
On Thu, Mar 22 2007, [EMAIL PROTECTED] wrote:
> This is a uniprocessor AMD64 system running software RAID-5 and RAID-10
> over multiple PCIe SiI3132 SATA controllers. The hardware has been very
> stable for a long time, but has been acting up of late since I upgraded
> to 2.6.20.3. ECC memory should preclude the possibility of bit-flip
> errors.
>
> Kernel 2.6.20.3 + linuxpps patches (confined to drivers/serial, and not
> actually in use as I stole the serial port for a console).
>
> It takes half a day to reproduce the problem, so bisecting would be
> painful.
>
> BackupPC_dump mostly writes to a large (1.7 TB) ext3 RAID5 partition.
>
> Here are two oopses, a few minutes (16:31, to be precise) apart.
> Unusually, it oopsed twice *without* locking up the system. Usually, I
> see this followed by an error from drivers/input/keyboard/atkbd.c:
>
> 	printk(KERN_WARNING "atkbd.c: Spurious %s on %s. Some program might be trying access hardware directly.\n",
>
> emitted at 1 Hz with the keyboard LEDs flashing and the system
> unresponsive to keyboard or pings. (I think it was "spurious ACK on
> serio/input0", but my memory may be faulty.)
>
> If anyone has any suggestions, they'd be gratefully received.
>
> Unable to handle kernel NULL pointer dereference at 0098 RIP:
>  [8031504a] cfq_dispatch_insert+0x18/0x68
> PGD 777e9067 PUD 78774067 PMD 0
> Oops: [1]
> CPU 0
> Modules linked in: ecb
> Pid: 2837, comm: BackupPC_dump Not tainted 2.6.20.3-g691f5333 #40
> RIP: 0010:[8031504a] [8031504a] cfq_dispatch_insert+0x18/0x68
> RSP: 0018:8100770bbaf8 EFLAGS: 00010092
> RAX: 81007fb36c80 RBX: RCX: 0001
> RDX: 00010003e4e7 RSI: RDI:
> RBP: 81007fb37a00 R08: R09: 81005d390298
> R10: 81007fcb4f80 R11: 81007fcb4f80 R12: 81007facd280
> R13: 0004 R14: 0001 R15:
> FS: 2b322d120d30() GS:805de000() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 0098 CR3: 7bcf CR4: 06e0
> Process BackupPC_dump (pid: 2837, threadinfo 8100770ba000, task 81007fc5d8e0)
> Stack: 8100770f39f0 0004 0001 80315253 803b2607 81005da2bc40 81007fac3800 81007facd280 81007facd280 81005d390298
> Call Trace:
>  [80315253] cfq_dispatch_requests+0x152/0x512
>  [803b2607] scsi_done+0x0/0x18
>  [8030d9f1] elv_next_request+0x137/0x147
>  [803b7ce0] scsi_request_fn+0x6a/0x33a
>  [8024d407] generic_unplug_device+0xa/0xe
>  [80407ced] unplug_slaves+0x5b/0x94
>  [80223d65] sync_page+0x0/0x40
>  [80223d9b] sync_page+0x36/0x40
>  [80256d45] __wait_on_bit_lock+0x36/0x65
>  [80237496] __lock_page+0x5e/0x64
>  [8028061d] wake_bit_function+0x0/0x23
>  [802074de] find_get_page+0xe/0x2d
>  [8020b38e] do_generic_mapping_read+0x1c2/0x40d
>  [8020bd80] file_read_actor+0x0/0x118
>  [8021422e] generic_file_aio_read+0x15c/0x19e
>  [8020bafa] do_sync_read+0xc9/0x10c
>  [80210342] may_open+0x5b/0x1c6
>  [802805ef] autoremove_wake_function+0x0/0x2e
>  [8020a857] vfs_read+0xaa/0x152
>  [8020faf3] sys_read+0x45/0x6e
>  [8025041e] system_call+0x7e/0x83

3 (I think) separate instances of this, each involving raid5. Is your
array degraded or fully operational?

--
Jens Axboe
request_queue_t depends on CONFIG_BLOCK
How can this compile error be fixed properly? request_queue_t is inside
CONFIG_BLOCK, ide_drive_s (and likely others) use it unconditionally.

  CC      arch/powerpc/kernel/setup_64.o
In file included from linux-2.6.21-rc4/arch/powerpc/kernel/setup_64.c:23:
linux-2.6.21-rc4/include/linux/ide.h:556: error: expected specifier-qualifier-list before 'request_queue_t'
linux-2.6.21-rc4/include/linux/ide.h:695: warning: 'struct request' declared inside parameter list
linux-2.6.21-rc4/include/linux/ide.h:695: warning: its scope is only this definition or declaration, which is probably not what you want
linux-2.6.21-rc4/include/linux/ide.h:823: warning: 'struct request' declared inside parameter list
linux-2.6.21-rc4/include/linux/ide.h:856: error: field 'wrq' has incomplete type
linux-2.6.21-rc4/include/linux/ide.h:1199: error: expected ')' before '*' token
make[2]: *** [arch/powerpc/kernel/setup_64.o] Error 1
  CC      arch/powerpc/kernel/setup-common.o
In file included from linux-2.6.21-rc4/arch/powerpc/kernel/setup-common.c:24:
linux-2.6.21-rc4/include/linux/ide.h:556: error: expected specifier-qualifier-list before 'request_queue_t'
linux-2.6.21-rc4/include/linux/ide.h:695: warning: 'struct request' declared inside parameter list
linux-2.6.21-rc4/include/linux/ide.h:695: warning: its scope is only this definition or declaration, which is probably not what you want
linux-2.6.21-rc4/include/linux/ide.h:823: warning: 'struct request' declared inside parameter list
linux-2.6.21-rc4/include/linux/ide.h:856: error: field 'wrq' has incomplete type
linux-2.6.21-rc4/include/linux/ide.h:1199: error: expected ')' before '*' token
make[2]: *** [arch/powerpc/kernel/setup-common.o] Error 1
make[2]: Target `__build' not remade because of errors.
make[1]: *** [arch/powerpc/kernel] Error 2
Re: request_queue_t depends on CONFIG_BLOCK
On Thu, Mar 22, 2007 at 10:52:34PM +0100, Olaf Hering wrote:
> How can this compile error be fixed properly? request_queue_t is inside
> CONFIG_BLOCK, ide_drive_s (and likely others) use it unconditionally.
>
>   CC      arch/powerpc/kernel/setup_64.o
> In file included from linux-2.6.21-rc4/arch/powerpc/kernel/setup_64.c:23:

Start looking for the problem here. Why does your arch code include
ide.h?
Re: [PATCH 2/3] sd: implement START/STOP management
Tejun Heo wrote:
> Hello, Douglas.
>
> Douglas Gilbert wrote:
> > Tejun,
> > I note at this point that the IMMED bit in the START STOP UNIT cdb
> > is clear. [The code might note that as well.] All SCSI disks that I
> > have seen, implement the IMMED bit and according to the SAT
> > standard, so should SAT layers like the one in libata.
> >
> > With the IMMED bit clear:
> >   - on spin up, it will wait until disk is ready. Okay unless there
> >     are a lot of disks, in which case we could ask Matthew Wilcox
> >     for help
> >   - on spin down, will wait until media is stopped. That could be
> >     20 seconds, and if there were multiple disks
> >
> > I guess the question is do we need to wait until a disk is spun
> > down before dropping power to it and suspending.
>
> I think we do. As we're issuing SYNCHRONIZE CACHE prior to spinning
> down disks, it's probably okay to drop power early data-integrity-wise
> but still...
>
> We can definitely use IMMED=1 during resume (needs to be throttled
> somehow tho). This helps even when there is only one disk. We can let
> the disk spin up in the background and proceed with the rest of the
> resuming process. Unfortunately, libata SAT layer doesn't do IMMED and
> even if it does (I've tried and have a patch available) it doesn't
> really work because during host resume each port enters EH and resets
> and revalidates each device. Many if not most ATA harddisks don't
> respond to reset or IDENTIFY till it's fully spun up, meaning libata EH
> has to wait for all drives to spin up. libata EH runs inside the SCSI
> EH thread, meaning SCSI command issue blocks till libata EH finishes
> resetting the port. So, IMMED or not, sd gotta wait for libata disks.
>
> If we want to do parallel spin down, PM core needs to be updated such
> that there are two events - issue and done - somewhat similar to what
> SCSI is doing to probe devices in parallel. If we're gonna do that, we
> maybe can apply the same mechanism to the resume path so that we can do
> things in parallel, IMMED or not.

It seems there is another way of doing a bank spin up / spin down: doing
it in two passes. On the first pass START_STOP will be issued with
IMMED=1 on all devices, then on the second pass START_STOP will be
issued with IMMED=0. So the devices will spin up / spin down in
parallel, but synchronously, hence the needed result will be achieved
with minimal code changes, although it will indeed need upper layer
changes in the callers of struct device_driver's suspend(), resume(),
etc.

Vlad
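The two-pass idea can be sketched as below. This is a userspace illustration only, not sd or libata code: `build_start_stop()`, `bank_start_stop()`, `issue_fn` and the `demo_*` recorder are hypothetical names, and the sketch assumes the usual 6-byte START STOP UNIT layout (opcode 0x1b, IMMED in bit 0 of byte 1, START in bit 0 of byte 4).

```c
#include <stdint.h>
#include <string.h>

#define START_STOP_UNIT 0x1b

/* Fill a 6-byte START STOP UNIT CDB: IMMED is byte 1 bit 0, START is byte 4 bit 0. */
static void build_start_stop(uint8_t cdb[6], int start, int immed)
{
	memset(cdb, 0, 6);
	cdb[0] = START_STOP_UNIT;
	cdb[1] = immed ? 0x01 : 0x00;
	cdb[4] = start ? 0x01 : 0x00;
}

/* Hypothetical transport hook; a real caller would submit the CDB to device devno. */
typedef int (*issue_fn)(int devno, const uint8_t *cdb);

/*
 * Pass 1 sends IMMED=1 to every device so they all begin the transition
 * without the caller waiting. Pass 2 repeats the command with IMMED=0,
 * which is expected to complete only once each device has finished.
 */
static int bank_start_stop(int ndev, int start, issue_fn issue)
{
	uint8_t cdb[6];
	int pass, i, rc;

	for (pass = 0; pass < 2; pass++) {
		build_start_stop(cdb, start, pass == 0);
		for (i = 0; i < ndev; i++) {
			rc = issue(i, cdb);
			if (rc)
				return rc;
		}
	}
	return 0;
}

/* Demo recorder: remembers the IMMED bit of each issued command. */
static int demo_immed[16];
static int demo_count;

static int demo_issue(int devno, const uint8_t *cdb)
{
	(void)devno;
	if (demo_count < 16)
		demo_immed[demo_count++] = cdb[1] & 0x01;
	return 0;
}
```

For two devices the recorder sees IMMED=1, IMMED=1, IMMED=0, IMMED=0. Whether the second pass really returns only after the transition finishes depends on the device and on the SAT layer, as discussed in the quoted text.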
Re: [PATCH 2/3] sd: implement START/STOP management
On Thu, 22 Mar 2007, Vladislav Bolkhovitin wrote:
> Seems, there is another way of doing a bank spin up / spin down: doing
> it in two passes. On the first pass START_STOP will be issued with
> IMMED=1 on all devices, then on the second pass START_STOP will be
> issued with IMMED=0. So the devices will spin up / spin down in the
> parallel, but synchronously, hence the needed result will be achieved

And maybe trip the PSU's overcurrent defenses? There is a reason to
default to sequential spin-up for disks...

Of course, it can be user-selectable. But should it be the default?

--
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh
Re: [PATCH 2/3] sd: implement START/STOP management
Henrique de Moraes Holschuh wrote:
> On Thu, 22 Mar 2007, Vladislav Bolkhovitin wrote:
> > Seems, there is another way of doing a bank spin up / spin down:
> > doing it in two passes. On the first pass START_STOP will be issued
> > with IMMED=1 on all devices, then on the second pass START_STOP will
> > be issued with IMMED=0. So the devices will spin up / spin down in
> > the parallel, but synchronously, hence the needed result will be
> > achieved
>
> And maybe trip the PSU's overcurrent defenses? There is a reason to
> default to sequential spin-up for disks...

But on spin down there is no such problem.

> Of course, it can be user-selectable. But should it be the default?
Re: libata: CD and dvd devices not recognized
Sergei Shtylyov wrote:
> Hello.
>
> Albert Lee wrote:
> > Thanks for the detailed log. It looks like the bad INQUIRY command
> >
> >   CDB (4:0,1,0) 12 01 00 00 fe 00 00 00 00
> >
> > (INQUIRY, length=254, EVPD=1) is coming from the user space, not
> > the SCSI mid-layer.
> >
> > I guess two problems together caused this bug:
> > 1. Ubuntu Linux issues an incorrect INQUIRY command to the drive.
> >    (Other distros seem to have the INQUIRY correct.)
>
> But what is incorrect about sending INQUIRY with EVPD bit?

Nothing wrong from the SCSI point of view. However, in the early ATAPI
spec (sff-8020i), this EVPD bit is reserved. And apparently some
imperfect ATAPI CD-ROM drives don't handle it well when EVPD = 1. :(

Hmm, how about the revised version:
1. Ubuntu Linux issues a correct INQUIRY command to the drive which
   sets EVPD = 1. However, EVPD is reserved per the early ATAPI spec
   and the AOpen 56X/AKH drive times out in this case.
...

--
albert
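For reference, the fields named in the log line (INQUIRY, length=254, EVPD=1) can be read straight out of the CDB bytes. The decoder below is a minimal sketch, not kernel code: `struct inquiry_fields` and `decode_inquiry()` are hypothetical names, and it assumes the older single-byte allocation length in byte 4, which gives the same answer for this CDB because byte 3 is zero.

```c
#include <stdint.h>

#define INQUIRY_OPCODE 0x12

struct inquiry_fields {
	int evpd;		/* byte 1, bit 0 */
	uint8_t page_code;	/* byte 2, meaningful only when evpd is set */
	uint8_t alloc_len;	/* byte 4 (assumed single-byte allocation length) */
};

/* Returns 0 and fills *out if cdb is an INQUIRY, -1 otherwise. */
static int decode_inquiry(const uint8_t *cdb, struct inquiry_fields *out)
{
	if (cdb[0] != INQUIRY_OPCODE)
		return -1;
	out->evpd = cdb[1] & 0x01;
	out->page_code = cdb[2];
	out->alloc_len = cdb[4];
	return 0;
}
```

Applied to the bytes 12 01 00 00 fe from the log, this yields EVPD=1, page code 0 and an allocation length of 254, matching the annotation in the quoted message.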
Re: libata: CD and dvd devices not recognized
Alan Cox wrote:
> > We have two possible solutions here:
> > a. Patch Ubuntu, such that the incorrect INQUIRY is fixed.
> > b. Patch kernel, such that the AOpen drives are blacklisted.
> >    Each INQUIRY is inspected for the blacklisted drives.
> >    If the INQUIRY looks wrong, the INQUIRY is rejected.
> >
> > I guess a. is the preferred solution...
>
> We have two problems here
>
> #1 Ubuntu got the inquiry command wrong
> #2 Until now we considered INQUIRY a safe command for SG_IO passthrough.
>
> We can't really take INQUIRY out of SG_IO so do we decide its the
> hardware vendors problem or do something cleverer in the filters ?

Maybe the SG_IO author has a better idea (ccing Doug)?

BTW, in addition to the AOpen INQUIRY with EVPD problem, we have another
imperfect ATAPI drive (TORiSAN) that freezes when READ >= 128KB.
(http://bugzilla.kernel.org/show_bug.cgi?id=6710)

We can limit dev->max_sectors to work around the TORiSAN problem. But I
don't know whether dev->max_sectors also works for SG_IO. If not, some
user space application, unaware of the problem, might send a correct
READ that locks the drive completely.

--
albert
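If a passthrough filter were to guard against the large-READ case, the check itself would be small. The following is only a sketch of such a predicate, not an existing SG_IO or libata interface: `read10_bytes()` and `read10_at_or_over()` are hypothetical names, only READ(10) is handled, and the block size is passed in by the caller (2048 bytes in the usage note is an assumption for CD-ROM data sectors).

```c
#include <stdint.h>

#define READ_10 0x28

/* READ(10) carries its transfer length as a big-endian block count in bytes 7-8. */
static uint32_t read10_bytes(const uint8_t *cdb, uint32_t block_size)
{
	uint32_t blocks = ((uint32_t)cdb[7] << 8) | cdb[8];

	return blocks * block_size;
}

/*
 * Hypothetical filter predicate: nonzero if the CDB is a READ(10) whose
 * transfer size reaches limit_bytes, the size at which the drive is
 * reported to freeze.
 */
static int read10_at_or_over(const uint8_t *cdb, uint32_t block_size,
			     uint32_t limit_bytes)
{
	return cdb[0] == READ_10 &&
	       read10_bytes(cdb, block_size) >= limit_bytes;
}
```

With 2048-byte blocks, a 64-block READ(10) is exactly 128KB and would be flagged against a 128KB limit, while a 63-block read would pass. Other read opcodes (READ(12), READ CD) would need the same treatment in a real filter.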
Re: request_queue_t depends on CONFIG_BLOCK
On Thu, Mar 22, Christoph Hellwig wrote:
> On Thu, Mar 22, 2007 at 10:52:34PM +0100, Olaf Hering wrote:
> > How can this compile error be fixed properly? request_queue_t is
> > inside CONFIG_BLOCK, ide_drive_s (and likely others) use it
> > unconditionally.
> >
> >   CC      arch/powerpc/kernel/setup_64.o
> > In file included from linux-2.6.21-rc4/arch/powerpc/kernel/setup_64.c:23:
>
> Start looking for the problem here. Why does your arch code include
> ide.h?

Because it is needed in a few places.
Better hide everything in ide.h inside #ifdef CONFIG_IDE