Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)
On Thu, Dec 06 2007 at 2:26 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote:

Hi Boaz,

On Tue, 04 Dec 2007 15:39:12 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:

On Sat, Dec 01 2007 at 1:35 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote:

This patch converts bidi of scsi mid-layer to use blk_end_request().

rq->next_rq represents a pair of bidi requests. (There are no other uses of 'next_rq' of struct request.) For both requests in the pair, end_that_request_chunk() should be called before end_that_request_last() is called for one of them. Since the calls to end_that_request_first()/chunk() and end_that_request_last() are packaged into blk_end_request(), the handling of next_rq completion has to be moved into blk_end_request(), too.

Bidi sets its specific value to rq->data_len before the request is completed so that upper-layer can read it. This setting must be between end_that_request_chunk() and end_that_request_last(), because rq->data_len may be used in end_that_request_chunk() by blk_trace and so on. To satisfy the requirement, use blk_end_request_callback() which is added in PATCH 25 only for the tricky drivers.

If bidi didn't reuse rq->data_len and added new members to request for the specific value, it could set before end_that_request_chunk() and use the standard blk_end_request() like below.

void scsi_end_bidi_request(struct scsi_cmnd *cmd)
{
        struct request *req = cmd->request;

        req->resid = scsi_out(cmd)->resid;
        req->next_rq->resid = scsi_in(cmd)->resid;

        if (blk_end_request(req, 1, req->data_len))
                BUG();

        scsi_release_buffers(cmd);
        scsi_next_command(cmd);
}

... snip ...

rq->data_len = scsi_out(cmd)->resid is not just a problem of bidi; it is a general problem of scsi residual handling, and user code. Even today, before any bidi.
In scsi_lib.c, in scsi_io_completion(), we do

        req->data_len = scsi_get_resid(cmd);

(or: req->data_len = cmd->resid; depending on which version you look at) and then call scsi_end_request(), which calls __end_that_request_first/last. So it is assumed even today that req->data_len is not touched by __end_that_request_first/last, unless __end_that_request_first returned that there is more work to do and the command is resubmitted, in which case the resid information is discarded.

So if the regular resid handling is acceptable - set req->data_len before the call to __end_that_request_first/last, or blk_end_request() in your case - then here goes your second client of the _callback and it can be removed.

But if it is found that req->data_len is touched and the resid information gets lost, then it should be fixed for the common uni-io case, by - for example - passing resid to the blk_end_request() function. (So either way the _callback can go.)

Thank you for the explanation of scsi's rq->data_len usage. I see that scsi usually uses rq->data_len for cmd->resid. I have investigated the possibility of setting data_len before the call to blk_end_request. But no matter whether data_len is touched or not, we need a callback for bidi. So I would like to go with the current patch. I explain the reason and some details below.

As far as I can see, rq->data_len is just referenced by blk_add_trace_rq() in __end_that_request_first(), not modified. And I don't change any logic around there in the block-layer. So there shouldn't be any critical problem for scsi residual handling (although I'm not sure that scsi expects cmd->resid to be traced by blk_trace).

Anyway, I see that it is no critical problem for bidi to set cmd->resid to rq->data_len before the blk_end_request() call. But if I do that, blk_end_request() can't get the next_rq's size to complete in its code below.
+       /* Bidi request must be completed as a whole */
+       if (blk_bidi_rq(rq) &&
+           __end_that_request_first(rq->next_rq, uptodate,
+                                    blk_rq_bytes(rq->next_rq)))
+               return 1;

So I will have to move next_rq completion to bidi and use _callback() anyway like the following.

-
static int dummy_cb(struct request *rq)
{
        return 1;
}

void scsi_end_bidi_request(struct scsi_cmnd *cmd)
{
        struct request *req = cmd->request;
        unsigned int dlen = req->data_len;
        unsigned int next_dlen = req->next_rq->data_len;

        req->data_len = scsi_out(cmd)->resid;
        req->next_rq->data_len = scsi_in(cmd)->resid;

        /* Complete only DATA of next_rq using _callback and dummy function */
        if (!blk_end_request_callback(req->next_rq, 1, next_dlen, dummy_cb))
                BUG();

        if (blk_end_request(req, 1, dlen))
                BUG();

        scsi_release_buffers(cmd);
        scsi_next_command(cmd);
}
-

I prefer the current patch rather than the code like above, since the code calls
Re: [PATCH 14/14] libata: use PIO for misc ATAPI commands
Petr Vandrovec wrote:
Alan Cox wrote:

It eventually has to end up in -rc. If not for 2.6.25-rc1 is too early, we can put it in #testing and put it into #upstream later.

Nobody cares about libata git trees. If you want some initial test coverage put it in -mm.

primarily worried about. Command type dependent quick fallback might help but ancient controllers are more likely to bring the whole machine down when a DMA transaction goes south.

Quite the reverse in my experience - the dumber the controller the more likely that ATAPI DMA and LBA48 and other stuff just works anyway.

Yes. FYI, if you start sending ATAPI commands with a DATA_OUT phase using PIO from a VM under VMware, it will politely ask you to reconfigure the OS in the virtual machine to use DMA, and most probably it won't work until you really do so... Windows sends all these commands using DMA, and I believe it does the same for the majority of DATA_IN commands as well (and Windows also sets the byte count to the correct PIO-like value even for DMA commands).

There are few DATA OUT commands and most of them are sector (2k) aligned. We use DMA for them. The same goes for sector aligned DATA IN commands and READ CD, but what you say is inconsistent with what I've seen from a SATA bus trace. Windows was using PIO for all misc DATA IN commands - such as REQUEST SENSE, GET CONFIGURATION, MODE SENSE, etc. And libata will set the byte count to a PIO-like value for DMA from 2.6.25, if not from 24.

Thanks.

--
tejun

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
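The policy tejun describes (DMA for sector-aligned transfers and READ CD, PIO for the small miscellaneous commands) can be sketched as a tiny classifier. This is only an illustration of the rule as stated in the mail, not libata's actual code: the helper name atapi_prefers_dma() and the exact decision rule are assumptions; the opcodes are the standard SCSI/MMC values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard SCSI/MMC opcodes for the commands named in the thread. */
enum {
    OP_REQUEST_SENSE     = 0x03,
    OP_READ_10           = 0x28,
    OP_WRITE_10          = 0x2A,
    OP_GET_CONFIGURATION = 0x46,
    OP_MODE_SENSE_10     = 0x5A,
    OP_READ_CD           = 0xBE,
};

#define ATAPI_SECTOR_SIZE 2048u   /* "sector (2k) aligned" */

/*
 * Hypothetical policy helper (NOT a libata function): returns true when
 * a data-carrying command should use DMA, false when it should use PIO.
 */
static bool atapi_prefers_dma(uint8_t opcode, size_t nbytes)
{
    assert(nbytes > 0);           /* only commands with a data phase */

    if (opcode == OP_READ_CD)
        return true;              /* READ CD uses DMA regardless of length */

    /* Sector-aligned DATA IN/OUT uses DMA; everything else uses PIO. */
    return nbytes % ATAPI_SECTOR_SIZE == 0;
}
```

Under this sketch an 18-byte REQUEST SENSE goes out by PIO while a 4096-byte READ(10) goes out by DMA, which matches the split described above.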
Re: [PATCH 14/14] libata: use PIO for misc ATAPI commands
Alan Cox wrote:

It eventually has to end up in -rc. If not for 2.6.25-rc1 is too early, we can put it in #testing and put it into #upstream later.

Nobody cares about libata git trees. If you want some initial test coverage put it in -mm.

primarily worried about. Command type dependent quick fallback might help but ancient controllers are more likely to bring the whole machine down when a DMA transaction goes south.

Quite the reverse in my experience - the dumber the controller the more likely that ATAPI DMA and LBA48 and other stuff just works anyway.

Yes. FYI, if you start sending ATAPI commands with a DATA_OUT phase using PIO from a VM under VMware, it will politely ask you to reconfigure the OS in the virtual machine to use DMA, and most probably it won't work until you really do so... Windows sends all these commands using DMA, and I believe it does the same for the majority of DATA_IN commands as well (and Windows also sets the byte count to the correct PIO-like value even for DMA commands).

Given that very few customers reported this problem in the past 8 years, I would guess that your attempt to use PIO only will actually exercise more untested code in the firmware than the DMA code paths.

Petr
PROBLEM: WARNING: at kernel/irq/manage.c:158 enable_irq() during boot
Hi, The full raport in the attached file. Regards Wojciech Zareba [1.] One line summary of the problem: WARNING: at kernel/irq/manage.c:158 enable_irq() during boot [2.] Full description of the problem/report: Dec  6 11:58:23 titanium kernel: WARNING: at kernel/irq/manage.c:158 enable_irq() Dec  6 11:58:23 titanium kernel:  [c014b5c3] enable_irq+0x6e/0xa2 Dec  6 11:58:23 titanium kernel:  [f887b783] probe_hwif+0x6d8/0x7c7 [ide_core] Dec  6 11:58:23 titanium kernel:  [f887c0be] probe_hwif_init_with_fixup+0xc/0x80 [ide_core] Dec  6 11:58:23 titanium kernel:  [c0190007] elf_core_dump+0x627/0xb60 Dec  6 11:58:23 titanium kernel:  [f887df7b] ide_setup_pci_device+0x6f/0x9c [ide_core] Dec  6 11:58:23 titanium kernel:  [f883b1a7] pdc202new_init_one+0xf/0x10 [pdc202xx_new] Dec  6 11:58:23 titanium kernel:  [c01d94de] pci_device_probe+0x36/0x55 Dec  6 11:58:23 titanium kernel:  [c021f8cb] driver_probe_device+0xc8/0x14b Dec  6 11:58:23 titanium kernel:  [c021fa37] __driver_attach+0x52/0x87 Dec  6 11:58:23 titanium kernel:  [c021eecc] bus_for_each_dev+0x35/0x57 Dec  6 11:58:23 titanium kernel:  [c021f748] driver_attach+0x16/0x18 Dec  6 11:58:23 titanium kernel:  [c021f9e5] __driver_attach+0x0/0x87 Dec  6 11:58:23 titanium kernel:  [c021f1a8] bus_add_driver+0x6d/0x153 Dec  6 11:58:23 titanium kernel:  [c01d961d] __pci_register_driver+0x4b/0x77 Dec  6 11:58:23 titanium kernel:  [c0143a39] sys_init_module+0x1525/0x15fb Dec  6 11:58:23 titanium kernel:  [f8879af9] ide_config_drive_speed+0x0/0x314 [ide_core] Dec  6 11:58:23 titanium kernel:  [c0103f9e] syscall_call+0x7/0xb Dec  6 11:58:23 titanium kernel:  [c02b] wireless_nlevent_process+0x15/0x31 Dec  6 11:58:23 titanium kernel:  === My hardware: PC based on the Giga-Byte motherboard 8PE667 Ultra (chipset 845 PE). There are 3 disks: one with Debian (but kernel compiled by me) and 2 disks striped (RAID 0) with Windows 2000. [3.] Keywords (i.e., modules, networking, kernel): kernel IDE IRQ [4.] 
Kernel version (from /proc/version): Linux version 2.6.23.9-titan-1 ([EMAIL PROTECTED]) (gcc version 4.2.3 20071014 (prerelease) (Debian 4.2.2-3)) #5 SMP Thu Nov 29 11:11:30 CET 2007 [5.] Output of Oops.. N/A [6.] A small shell script or example program which triggers the problem (if possible) Just boot on this hardware. [7.] Environment [7.1.] Software (add the output of the ver_linux script here) If some fields are empty or look unusual you may have an old version. Compare to the current minimal requirements in Documentation/Changes. Linux titanium 2.6.23.9-titan-1 #5 SMP Thu Nov 29 11:11:30 CET 2007 i686 GNU/Linux Gnu C 4.2.3 Gnu make 3.81 binutils Binutils util-linux 2.13 mount 2.13 module-init-tools 3.3-pre11 e2fsprogs 1.40.2 Linux C Library2.7 Dynamic linker (ldd) 2.7 Procps 3.2.7 Net-tools 1.60 Console-tools 0.2.3 Sh-utils 5.97 udev 114 wireless-tools 29 Modules Loaded ppdev lp ac ipv6 dm_snapshot dm_mirror dm_mod loop snd_ens1371 snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_seq_device snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer parport_pc parport snd snd_page_alloc button i2c_i801 iTCO_wdt pcspkr rtc i2c_core iTCO_vendor_support intel_agp agpgart evdev ext3 jbd mbcache ide_disk ide_cd cdrom ata_piix ata_generic pata_pdc2027x libata scsi_mod piix floppy pdc202xx_new ohci_hcd generic ide_core ehci_hcd uhci_hcd thermal processor fan [7.2.] Processor information (from /proc/cpuinfo): processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Pentium(R) 4 CPU 2.53GHz stepping: 7 cpu MHz : 2545.587 cache size : 512 KB fdiv_bug: no hlt_bug : no f00f_bug: no coma_bug: no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe up pebs bts sync_rdtsc cid bogomips: 5095.57 clflush size: 64 [7.3.] 
Module information (from /proc/modules): ppdev 8964 0 - Live 0xf89c6000 lp 11244 0 - Live 0xf8b2c000 ac 5636 0 - Live 0xf892d000 ipv6 233084 14 - Live 0xf8b8b000 dm_snapshot 17060 0 - Live 0xf8b26000 dm_mirror 22016 0 - Live 0xf89fd000 dm_mod 52912 2 dm_snapshot,dm_mirror, Live 0xf8b37000 loop 17284 0 - Live 0xf89f7000 snd_ens1371 22628 3 - Live 0xf89d1000 snd_seq_dummy 3972 0 - Live 0xf891a000 snd_seq_oss 29588 0 - Live 0xf8a2d000 snd_seq_midi 8352 0 - Live 0xf897b000 snd_seq_midi_event 7168 2 snd_seq_oss,snd_seq_midi, Live 0xf88cb000 snd_seq 46620 6 snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_seq_midi_event, Live 0xf8a07000 snd_rawmidi 22816 2
Revisiting - 2.6.23.8 - Hang with sata_mv (7042) + Flat 4Gig (no holes) Memory
Well, I thought my problems were solved with the latest set of patches - and it definitely improved the behavior - but I have found out that it just delayed the problem. I still get a soft lockup (no good info in the soft lockup trace) when creating large (300Meg) files using the sata_mv/7042 driver in 2.6.23.8.

I am very embarrassed that I didn't do more testing before declaring victory... humble apologies to all...

To re-state the problem:

Hardware/Configuration:
  - MPC8548E with a 7042 (rev 2 - connected internally via a PEX switch)
  - 2.6.23.8 (using PHYS_64BIT PTE_64BIT - for 36 bit addressing; MSI is NOT compiled in)
  - Flat 4Gig Memory Map (no holes - 0 - 0x0__ defined - special low reserve memory is also used)
  - Local Bus PCI Express IOMem mapped to unique space in 0xC__, with extensions to the ioremap routines to create the appropriate requested physical address... This is (and should be) transparent to the requesting function that calls ioremap.
  - 2 SATA hard drives connected.

To recreate:
  Write a large file (now greater than 310Mbytes) - it hangs and a soft lockup is detected by the kernel - no useful info in the stack trace...

Of interest:
  a) Replace sata_mv.c with the 'old' Marvell reference driver and it works perfectly!!
  b) Also, sata_mv works perfectly in all conditions if we boot with less than ~3750M from the command line (which I note is ~below where its PEX IOmemory space is located).

My thoughts:
  In the old Marvell reference driver we had to modify the EDMA setup to configure the dma_high request/response addresses to point to the proper (0xC__) location. No other modifications were required - so it's a little confusing what is going on here. It is obvious from (b) above that this has something to do with accessing/reading/writing data to/from this chip, and when this happens it scribbles on important internal information and/or gets into a confused state where it just locks up...
Again, sorry for my inadequate testing reports before... and I look forward to anyone's input on this!

Sincerely,

Tom Morrison
Principal S/W Engineer
Empirix, Inc (www.empirix.com)
[EMAIL PROTECTED]
(781) 266 - 3567
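For readers unfamiliar with the dma_high remark: with 36-bit physical addressing, a bus address no longer fits in one 32-bit register, so the upper bits have to be programmed separately. A minimal sketch of that split follows. The struct and function names are hypothetical, this is not sata_mv code, and the address used in the usage note is purely illustrative (the real 0xC__ location is elided in the report and is not reconstructed here).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of 32-bit register values for one bus address. */
struct dma_addr_parts {
    uint32_t lo;   /* bits 31..0  */
    uint32_t hi;   /* bits 35..32 on a 36-bit system */
};

/* Split a physical/bus address into low and high 32-bit halves. */
static struct dma_addr_parts split_dma_addr(uint64_t addr)
{
    struct dma_addr_parts p;

    assert((addr >> 36) == 0);    /* assumption: 36-bit addressing */

    p.lo = (uint32_t)(addr & 0xFFFFFFFFu);
    p.hi = (uint32_t)(addr >> 32);
    return p;
}
```

For an illustrative address of 0x345678000 this yields hi = 0x3 and lo = 0x45678000; for anything below 4 GiB the high half is zero, so a driver that never programs the high half would only misbehave once buffers land above that boundary.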
Re: laptop reboots right after hibernation
to., 06.12.2007 kl. 11.38 +0900, skrev Tejun Heo: Thanks. Almost there. Can you please try the attached two patches and report the boot log? Here we go again. Cheers Kjartan Initializing cgroup subsys cpuset Linux version 2.6.24-rc4 ([EMAIL PROTECTED]) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-35)) #3 SMP Thu Dec 6 13:29:39 CET 2007 BIOS-provided physical RAM map: BIOS-e820: - 0009fc00 (usable) BIOS-e820: 0009fc00 - 000a (reserved) BIOS-e820: 000e - 0010 (reserved) BIOS-e820: 0010 - bf7d (usable) BIOS-e820: bf7d - bf7e5600 (reserved) BIOS-e820: bf7e5600 - bf7f8000 (ACPI NVS) BIOS-e820: bf7f8000 - bf80 (reserved) BIOS-e820: fec0 - fec01000 (reserved) BIOS-e820: fed2 - fed9b000 (reserved) BIOS-e820: feda - fedc (reserved) BIOS-e820: fee0 - fee01000 (reserved) BIOS-e820: ffb0 - ffc0 (reserved) BIOS-e820: fff0 - 0001 (reserved) 2167MB HIGHMEM available. 896MB LOWMEM available. Entering add_active_range(0, 0, 784336) 0 entries of 256 used Zone PFN ranges: DMA 0 - 4096 Normal 4096 - 229376 HighMem229376 - 784336 Movable zone start PFN for each node early_node_map[1] active PFN ranges 0:0 - 784336 On node 0 totalpages: 784336 DMA zone: 56 pages used for memmap DMA zone: 0 pages reserved DMA zone: 4040 pages, LIFO batch:0 Normal zone: 3080 pages used for memmap Normal zone: 00 pages, LIFO batch:31 HighMem zone: 7587 pages used for memmap HighMem zone: 547373 pages, LIFO batch:31 Movable zone: 0 pages used for memmap DMI 2.4 present. 
Using APIC driver default ACPI: RSDP 000F78B0, 0024 (r2 HP) ACPI: XSDT BF7E57C8, 007C (r1 HPQOEM SLIC-MPC1 HP 1) ACPI: FACP BF7E5684, 00F4 (r4 HP 30AD3 HP 1) ACPI: DSDT BF7E5ACC, FE7B (r1 HP nc64001 MSFT 10E) ACPI: FACS BF7F7E80, 0040 ACPI: SLIC BF7E5844, 0176 (r1 HPQOEM SLIC-MPC1 HP 1) ACPI: HPET BF7E59BC, 0038 (r1 HP 30AD1 HP 1) ACPI: APIC BF7E59F4, 0068 (r1 HP 30AD1 HP 1) ACPI: MCFG BF7E5A5C, 003C (r1 HP 30AD1 HP 1) ACPI: TCPA BF7E5A98, 0032 (r2 HP 30AD1 HP 1) ACPI: SSDT BF7F5947, 0059 (r1 HP HPQNLP1 MSFT 10E) ACPI: SSDT BF7F59A0, 032D (r1 HP HPQSAT1 MSFT 10E) ACPI: SSDT BF7F64E0, 025F (r1 HP Cpu0Tst 3000 INTL 20060317) ACPI: SSDT BF7F673F, 00A6 (r1 HP Cpu1Tst 3000 INTL 20060317) ACPI: SSDT BF7F67E5, 04D7 (r1 HPCpuPm 3000 INTL 20060317) ACPI: PM-Timer IO Port: 0x1008 ACPI: Local APIC address 0xfee0 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled) Processor #0 6:15 APIC version 20 ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled) Processor #1 6:15 APIC version 20 ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0]) IOAPIC[0]: apic_id 1, version 32, address 0xfec0, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 1 I/O APICs ACPI: HPET id: 0x8086a201 base: 0xfed0 Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at c000 (gap: bf80:3f40) swsusp: Registered nosave memory region: 0009f000 - 000a swsusp: Registered nosave memory region: 000a - 000e swsusp: Registered nosave memory region: 000e - 0010 Built 1 zonelists in Zone order, mobility grouping on. 
Total pages: 773613 Kernel command line: ro root=LABEL=/1 rhgb quiet pci=assign-busses selinux=off mapped APIC to b000 (fee0) mapped IOAPIC to a000 (fec0) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 CPU 0 irqstacks, hard=c082 soft=c080 PID hash table entries: 4096 (order: 12, 16384 bytes) Detected 1828.814 MHz processor. Console: colour VGA+ 80x25 console [tty0] enabled Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar ... MAX_LOCKDEP_SUBCLASSES:8 ... MAX_LOCK_DEPTH: 30 ... MAX_LOCKDEP_KEYS:2048 ... CLASSHASH_SIZE: 1024 ... MAX_LOCKDEP_ENTRIES: 8192 ... MAX_LOCKDEP_CHAINS: 16384 ... CHAINHASH_SIZE: 8192 memory used by lock dependency info: 1024 kB per task-struct memory footprint: 1680 bytes Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory:
Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)
Hi Boaz, Jens,

On Thu, 06 Dec 2007 11:24:44 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:

Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
===
--- 2.6.24-rc3-mm2.orig/drivers/scsi/scsi_lib.c
+++ 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
@@ -629,28 +629,6 @@ void scsi_run_host_queues(struct Scsi_Ho
                scsi_run_queue(sdev->request_queue);
 }

-static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
-{
-       struct request_queue *q = cmd->device->request_queue;
-       struct request *req = cmd->request;
-       unsigned long flags;
-
-       add_disk_randomness(req->rq_disk);
-
-       spin_lock_irqsave(q->queue_lock, flags);
-       if (blk_rq_tagged(req))
-               blk_queue_end_tag(q, req);
-
-       end_that_request_last(req, uptodate);
-       spin_unlock_irqrestore(q->queue_lock, flags);
-
-       /*
-        * This will goose the queue request function at the end, so we don't
-        * need to worry about launching another command.
-        */
-       scsi_next_command(cmd);
-}
-
 /*
  * Function:    scsi_end_request()
  *
@@ -921,6 +899,20 @@ void scsi_release_buffers(struct scsi_cm
 EXPORT_SYMBOL(scsi_release_buffers);

 /*
+ * Called from blk_end_request_callback() after all DATA in rq and its next_rq
+ * are completed before rq is completed/freed.
+ */
+static int scsi_end_bidi_request_cb(struct request *rq)
+{
+       struct scsi_cmnd *cmd = rq->special;
+
+       rq->data_len = scsi_out(cmd)->resid;
+       rq->next_rq->data_len = scsi_in(cmd)->resid;
+
+       return 0;
+}
+
+/*
  * Bidi commands Must be complete as a whole, both sides at once.
  * If part of the bytes were written and lld returned
  * scsi_in()->resid and/or scsi_out()->resid this information will be left
@@ -931,22 +923,28 @@ void scsi_end_bidi_request(struct scsi_c
 {
        struct request *req = cmd->request;

-       end_that_request_chunk(req, 1, req->data_len);
-       req->data_len = scsi_out(cmd)->resid;
-
-       end_that_request_chunk(req->next_rq, 1, req->next_rq->data_len);
-       req->next_rq->data_len = scsi_in(cmd)->resid;
-
-       scsi_release_buffers(cmd);
-
        /*
         *FIXME: If ll_rw_blk.c is changed to also put_request(req->next_rq)
-        *       in end_that_request_last() then this WARN_ON must be removed.
+        *       in blk_end_request() then this WARN_ON must be removed.
         *       for now, upper-driver must have registered an end_io.
         */
        WARN_ON(!req->end_io);

-       scsi_finalize_request(cmd, 1);
+       /*
+        * blk_end_request() family take care of data completion of next_rq.
+        * blk_end_request() family use next_rq->data_len for
+        * the completion data size of next_rq.
+        * So resid can't be set before the data completion of next_rq
+        * in blk_end_request().
+        * To resolve that, use the callback feature of blk_end_request().
+        */
+       if (blk_end_request_callback(req, 1, req->data_len,
+                                    scsi_end_bidi_request_cb))
+               /* req has not been completed */
+               BUG();
+
+       scsi_release_buffers(cmd);
+       scsi_next_command(cmd);
 }

 /*
Index: 2.6.24-rc3-mm2/block/ll_rw_blk.c
===
--- 2.6.24-rc3-mm2.orig/block/ll_rw_blk.c
+++ 2.6.24-rc3-mm2/block/ll_rw_blk.c
@@ -3817,6 +3817,12 @@ int blk_end_request(struct request *rq,
        if (blk_fs_request(rq) || blk_pc_request(rq)) {
                if (__end_that_request_first(rq, uptodate, nr_bytes))
                        return 1;
+
+               /* Bidi request must be completed as a whole */
+               if (blk_bidi_rq(rq) &&
+                   __end_that_request_first(rq->next_rq, uptodate,
+                                            blk_rq_bytes(rq->next_rq)))
+                       return 1;
        }

        add_disk_randomness(rq->rq_disk);
@@ -3840,6 +3846,12 @@ int __blk_end_request(struct request *rq
        if (blk_fs_request(rq) || blk_pc_request(rq)) {
                if (__end_that_request_first(rq, uptodate, nr_bytes))
                        return 1;
+
+               /* Bidi request must be completed as a whole */
+               if (blk_bidi_rq(rq) &&
+                   __end_that_request_first(rq->next_rq, uptodate,
+                                            blk_rq_bytes(rq->next_rq)))
+                       return 1;
        }

        add_disk_randomness(rq->rq_disk);
@@ -3884,6 +3896,12 @@ int blk_end_request_callback(struct requ
        if (blk_fs_request(rq) || blk_pc_request(rq)) {
                if (__end_that_request_first(rq, uptodate, nr_bytes))
                        return 1;
+
+               /* Bidi request must be completed as a whole */
+               if (blk_bidi_rq(rq) &&
+                   __end_that_request_first(rq->next_rq, uptodate,
+                                            blk_rq_bytes(rq->next_rq)))
+                       return 1;
        }
Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
On Sat, 1 Dec 2007 06:26:08 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: I am putting a new machine together and I have dual raptor raid 1 for the root, which works just fine under all stress tests. Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on sale now adays): I ran the following: dd if=/dev/zero of=/dev/sdc dd if=/dev/zero of=/dev/sdd dd if=/dev/zero of=/dev/sde (as it is always a very good idea to do this with any new disk) And sometime along the way(?) (i had gone to sleep and let it run), this occurred: [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x401 action 0x2 frozen Gee we're seeing a lot of these lately. [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in [42880.680292] res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10 (ATA bus error) [42881.841899] ata3: soft resetting port [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [42915.919042] ata3.00: qc timeout (cmd 0xec) [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5) [42915.919149] ata3.00: revalidation failed (errno=-5) [42915.919206] ata3: failed to recover some devices, retrying in 5 secs [42920.912458] ata3: hard resetting port [42926.411363] ata3: port is slow to respond, please be patient (Status 0x80) [42930.943080] ata3: COMRESET failed (errno=-16) [42930.943130] ata3: hard resetting port [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [42931.413523] ata3.00: configured for UDMA/133 [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4) [42931.413655] ata3: EH complete [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB) [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Usually 
when I see this sort of thing with another box I have full of raptors, it was due to a bad raptor and I never saw it again after I replaced the disk that it happened on, but that was using the Intel P965 chipset.

For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of the drives (2 raptors, 3 750s) connected to the Intel ICH9 Southbridge.

I am going to do some further testing but does this indicate a bad drive? Bad cable? Bad connector?

As you can see above, /dev/sdc stopped responding for a little bit and then the kernel reset the port. Why is this though? What is the likely root cause? Should I replace the drive? Obviously this is not normal and cannot be good at all; the idea is to put these drives in a RAID5, and if one is going to time out, that is going to cause the array to go degraded and thus be worthless in a RAID5 configuration.

Can anyone offer any insight here?

It would be interesting to try 2.6.21 or 2.6.22.
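As an aid to reading the "SStatus 123" values in the log above: libata prints the SATA SStatus register in hex, and the register packs three 4-bit fields (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). A minimal decoding sketch, with hypothetical struct and function names that are not part of libata:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of the SATA SStatus register. */
struct sstatus {
    unsigned det;   /* device detection: 3 = device present, PHY link up */
    unsigned spd;   /* negotiated speed: 1 = Gen1, 2 = Gen2 */
    unsigned ipm;   /* interface power management: 1 = active */
};

/* Split a raw SStatus value into its three fields. */
static struct sstatus decode_sstatus(uint32_t raw)
{
    struct sstatus s;

    s.det = raw & 0xF;
    s.spd = (raw >> 4) & 0xF;
    s.ipm = (raw >> 8) & 0xF;
    return s;
}

/* Link speed in Mbps, or 0 when no speed has been negotiated. */
static unsigned link_speed_mbps(const struct sstatus *s)
{
    assert(s != NULL);

    switch (s->spd) {
    case 1:  return 1500;   /* 1.5 Gbps */
    case 2:  return 3000;   /* 3.0 Gbps */
    default: return 0;
    }
}
```

Decoding 0x123 this way gives DET = 3, SPD = 2, IPM = 1, which agrees with the "SATA link up 3.0 Gbps" text printed next to it; the register alone does not say why the link bounced.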
Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.
Jeff Garzik wrote:
..
The problem is not at the chip or device level, and this is the same problem as any number of other cards with softRAID on it. Not a new problem, not a new solution...
..

What other cards do we support that automatically overwrite user data without confirmation or notice of any kind? Just curious.

This card does it to any connected drive at power-on, without the user taking any action or even being told about it.

Cheers
Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.
Jeff Garzik wrote:
..
OTOH it is quite reasonable to explore auto-loading DM on top of the bare drive, and populating a DM table, if you see that particular BIOS signature or [insert other detection method].
..

I am very interested to hear a more detailed explanation of this, as I don't really see how it addresses the problems. Probably because I don't know much about device mapper.

But my understanding of it is that it re-exports portions of the original device as a second device. It doesn't seem to prevent using the original device afterward, and I don't know if dm devices can have GRUB installed on them or not.

So a more full tutorial might be in order here.

Cheers
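To make the "re-exports portions of the original device" description concrete, here is a simplified model of what a device-mapper linear mapping does to a sector number. It is a sketch under assumptions, not device-mapper code: the struct and function names are hypothetical, and a real DM table can hold several targets of different types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one "linear" table entry. */
struct linear_target {
    uint64_t start;        /* first sector of the mapped device it covers */
    uint64_t len;          /* number of sectors covered */
    uint64_t dev_offset;   /* matching first sector on the underlying device */
};

/*
 * Translate a sector on the mapped device to a sector on the underlying
 * device. Returns 0 on success, -1 if the sector is outside the target.
 */
static int linear_map(const struct linear_target *t, uint64_t sector,
                      uint64_t *out)
{
    assert(t != NULL && out != NULL);

    if (sector < t->start || sector - t->start >= t->len)
        return -1;

    *out = t->dev_offset + (sector - t->start);
    return 0;
}
```

With a target of { start 0, len 1000, dev_offset 2048 }, sector 10 of the mapped device lands on sector 2058 of the underlying drive, and sectors outside the window are simply not reachable through the mapped device. As noted above, nothing in this translation by itself stops someone from opening the underlying drive directly.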
Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)
On Thu, 6 Dec 2007, Andrew Morton wrote: On Sat, 1 Dec 2007 06:26:08 -0500 (EST) Justin Piszcz [EMAIL PROTECTED] wrote: I am putting a new machine together and I have dual raptor raid 1 for the root, which works just fine under all stress tests. Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on sale now adays): I ran the following: dd if=/dev/zero of=/dev/sdc dd if=/dev/zero of=/dev/sdd dd if=/dev/zero of=/dev/sde (as it is always a very good idea to do this with any new disk) And sometime along the way(?) (i had gone to sleep and let it run), this occurred: [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x401 action 0x2 frozen Gee we're seeing a lot of these lately. [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 512 in [42880.680292] res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10 (ATA bus error) [42881.841899] ata3: soft resetting port [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [42915.919042] ata3.00: qc timeout (cmd 0xec) [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5) [42915.919149] ata3.00: revalidation failed (errno=-5) [42915.919206] ata3: failed to recover some devices, retrying in 5 secs [42920.912458] ata3: hard resetting port [42926.411363] ata3: port is slow to respond, please be patient (Status 0x80) [42930.943080] ata3: COMRESET failed (errno=-16) [42930.943130] ata3: hard resetting port [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [42931.413523] ata3.00: configured for UDMA/133 [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4) [42931.413655] ata3: EH complete [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors (750156 MB) [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

Usually when I see this sort of thing with another box I have full of raptors, it was due to a bad raptor and I never saw it again after I replaced the disk that it happened on, but that was using the Intel P965 chipset.

For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of the drives (2 raptors, 3 750s) connected to the Intel ICH9 Southbridge.

I am going to do some further testing but does this indicate a bad drive? Bad cable? Bad connector?

As you can see above, /dev/sdc stopped responding for a little bit and then the kernel reset the port. Why is this though? What is the likely root cause? Should I replace the drive? Obviously this is not normal and cannot be good at all; the idea is to put these drives in a RAID5, and if one is going to time out, that is going to cause the array to go degraded and thus be worthless in a RAID5 configuration.

Can anyone offer any insight here?

It would be interesting to try 2.6.21 or 2.6.22.

This was due to NCQ issues (disabling it fixed the problem).

Justin.
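The mail does not say how NCQ was disabled. One common way with libata is to set the device queue depth to 1 through the sysfs attribute /sys/block/<dev>/device/queue_depth. The sketch below does that from C; the helper name set_queue_depth() is hypothetical, and the path in the usage note simply reuses the sdc device from the log.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write a queue depth to a sysfs-style attribute
 * file. Returns 0 on success, -1 on failure.
 */
static int set_queue_depth(const char *attr_path, int depth)
{
    FILE *f;
    int ok;

    assert(attr_path != NULL);
    assert(depth >= 1);

    f = fopen(attr_path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%d\n", depth) > 0;
    if (fclose(f) != 0)     /* write errors may only show up on close */
        ok = 0;

    return ok ? 0 : -1;
}
```

Called as set_queue_depth("/sys/block/sdc/device/queue_depth", 1) with root privileges, this would be expected to turn off queued commands for that drive until the next reboot or until a larger depth is written back.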
Re: Hard drives only detected when booting from CD on nVidia MCP67 SATA
On 12/05/2007 06:53 PM, Chuck Ebbert wrote:

With kernel 2.6.23 on an Acer 7220 notebook using nVidia MCP67 SATA, hard drives are only detected after first booting from a CD.

Boot from hard drive                                   No drives detected
Boot live CD                                           Detected
Boot CD to GRUB menu, then warm-boot from hard drive   Detected

Non-detect case:

ahci :00:09.0: version 2.3
ACPI: PCI Interrupt Link [LSI0] enabled at IRQ 23
ACPI: PCI Interrupt :00:09.0[A] -> Link [LSI0] -> GSI 23 (level, low) -> IRQ 16
input: ImPS/2 Generic Wheel Mouse as /class/input/input2
ahci :00:09.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:09.0: flags: 64bit sntf led clo pmp pio slum part
PCI: Setting latency timer of device :00:09.0 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 cmd 0xf8854100 ctl 0x bmdma 0x irq 221
ata2: SATA max UDMA/133 cmd 0xf8854180 ctl 0x bmdma 0x irq 221
ata3: SATA max UDMA/133 cmd 0xf8854200 ctl 0x bmdma 0x irq 221
ata4: SATA max UDMA/133 cmd 0xf8854280 ctl 0x bmdma 0x irq 221
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
Waiting for driver initialization

But if a LiveCD is used to boot, or if a LiveCD was used before a hot reboot (without a power off), disks are correctly found:

Loading ahci.ko
ahci :00:09.0: version 2.3
ACPI: PCI Interrupt Link [LSI0] enabled at IRQ 23
ACPI: PCI Interrupt :00:09.0[A] -> Link [LSI0] -> GSI 23 (level, low) -> IRQ 16
input: ImPS/2 Generic Wheel Mouse as /class/input/input2
ahci :00:09.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl IDE mode
ahci :00:09.0: flags: 64bit sntf led clo pmp pio slum part
PCI: Setting latency timer of device :00:09.0 to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
ata1: SATA max UDMA/133 cmd 0xf8854100 ctl 0x bmdma 0x irq 221
ata2: SATA max UDMA/133 cmd 0xf8854180 ctl 0x bmdma 0x irq 221
ata3: SATA max UDMA/133 cmd 0xf8854200 ctl 0x bmdma 0x irq 221
ata4: SATA max UDMA/133 cmd 0xf8854280 ctl 0x bmdma 0x irq 221
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: Hitachi HTS541612J9SA00, SBDOC70P, max UDMA/100
ata1.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)

Possibly fixed by:

Commit: 3cc3eb1148e4b2dfabf7a1dcf36fd8be1331ca95
[libata] AHCI: enable AHCI mode, before using AHCI reset

Plus:

Commit: ab6fc95f609b372a19e18ea689986846ab1ba29c
[libata] AHCI: fix newly introduced host-reset bug

??
-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Failure with SATA DVD-RW
On Thu, 6 Dec 2007 01:33:16 +0000 (UTC) Parag Warudkar [EMAIL PROTECTED] wrote:

Tom Lanyon <tomlanyon at gmail.com> writes:

scsi4: ahci
ata5: SATA link up at 1.5 Gbps (SStatus 113 SControl 300)
ata5.00: ATAPI, max UDMA/66
ata5.00: qc timeout (cmd 0xef)
ata5.00: failed to set xfermode (err_mask=0x104)
ata5.00: limiting speed to UDMA/44
ata5: failed to recover some devices, retrying in 5 secs
ata5: port is slow to respond, please be patient (Status 0x80)
ata5: port failed to respond (30 secs, status 0x80)
ata5: COMRESET failed (device not ready)
ata5: hardreset failed, retrying in 5 secs
ata5: SATA link up at 1.5 Gbps (SStatus 113 SControl 300)
ata5.00: ATAPI, max UDMA/66
ata5.00: qc timeout (cmd 0xef)
ata5.00: failed to set xfermode (err_mask=0x104)
ata5.00: limiting speed to PIO0
ata5: failed to recover some devices, retrying in 5 secs
ata5: port is slow to respond, please be patient (Status 0x80)
ata5: port failed to respond (30 secs, status 0x80)
ata5: COMRESET failed (device not ready)
ata5: hardreset failed, retrying in 5 secs
ata5.00: ATAPI, max UDMA/66
ata5.00: qc timeout (cmd 0xef)
ata5.00: failed to set xfermode (err_mask=0x104)
ata5.00: disabled

Looks like it is trying to set the transfer mode to UDMA/66 and failing. After that it tried UDMA/44 and failed again. Next UDMA/66 again with an unsurprising result: failed. After that PIO0, which seems to cause some kind of trouble, then it tries UDMA/66 again, and I am not stating the result again :) !

Any ideas what to try to get it working under AHCI?

I recall reading somewhere that the Pioneer drive needs UDMA/33, which it did not try in your case. We need to somehow have it try UDMA/33, but I can't find a boot parameter which will do that. So maybe adding a quirk for this device to limit the xfer mode to 33 may work.

What does your dmesg output for the drives look like when you run in IDE compat mode? (Particularly the DMA for this drive?)

Please cc linux-ide on sata, pata and ide-related issues.

If nothing happens within a few days please raise a report at bugzilla.kernel.org so we can ignore this in an organised fashion, thanks.
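The "limit the xfer mode to 33" idea above amounts to masking off every UDMA mode faster than UDMA2 before a mode is picked. The following is only a sketch of that masking under stated assumptions: cap_udma_mask() and best_udma_mode() are hypothetical helpers written for this note, not libata functions, and the one-bit-per-mode mask layout is an assumption chosen for illustration. The nominal speeds (UDMA0 = 16, 1 = 25, 2 = 33, 3 = 44, 4 = 66, 5 = 100, 6 = 133) are the standard ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a libata function.  Bit n of udma_mask set
 * means UDMA mode n is allowed (UDMA0=16, 1=25, 2=33, 3=44, 4=66,
 * 5=100, 6=133). */
static uint8_t cap_udma_mask(uint8_t udma_mask, unsigned int max_mode)
{
	assert(max_mode <= 6);
	/* Keep bits 0..max_mode and clear every faster mode. */
	return udma_mask & (uint8_t)((1u << (max_mode + 1)) - 1u);
}

/* Highest UDMA mode still present in the mask, or -1 if none is left. */
static int best_udma_mode(uint8_t udma_mask)
{
	int mode = -1;

	for (int i = 0; i <= 6; i++)
		if (udma_mask & (1u << i))
			mode = i;
	return mode;
}
```

A drive reporting "max UDMA/66" would correspond to a mask of 0x1f here; capping it at mode 2 leaves 0x07, so the fastest mode attempted would be UDMA/33.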
Re: Why we were seeing so many spurious NCQ completions
Tejun Heo wrote:

Hello, all.

This has been going on for quite some time now but I finally succeeded in reproducing the problem and finding out what has been going on. It wasn't the drive's or the controller's fault. The spurious completion detection logic was wrong, which makes all of this my fault. :-)

The attached patch induces NCQ spurious completions by inserting artificial delays during irq handling. The following is the log with the patch applied.

A [ 1125.478813] ata35: MON issue=0x0 SAct=0x1 sactive=0x3 SDB FIS=004040a1:0002
B [ 1125.480248] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001
C [ 1125.481614] ata35: MON issue=0x0 SAct=0x5 sactive=0x7 SDB FIS=004040a1:0002
D [ 1125.481704] ata35: YYY 0x2 -> 0x4
E [ 1125.481722] ata35: XXX issue=0x0 SAct=0x1 sactive=0x1 SDB FIS=004040a1:0004
F [ 1125.483087] ata35: MON issue=0x0 SAct=0x0 sactive=0x1 SDB FIS=004040a1:0001
G [ 1125.484297] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001

Thanks a lot for tracking this down, and thanks even more for being humble enough to admit mistakes. More kernel hackers should follow your example.

I continue to be a proud mentor, watching you kick ass on the Linux kernel scene :)

	Jeff
Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.
Mark Lord wrote:

Jeff Garzik wrote:
...
If you pop the BIOS chip or plug the card into a non-x86 box (or any of several other alternatives), the problem is likely to go away.
..

Yeah, I was hoping for a removable BIOS chip, but it's soldered in place. And that's not a solution for most users anyway.

That was an example, silly :) I'm not asking users to pop out chips. I'm illustrating that they are separate and distinct pieces, and you cannot assume.

Boot into a non-x86 platform, or use your x86 BIOS to disable all optional ROMs, and the BIOS-stomps-data issue goes away.

I'm not saying the _problem_ goes away; instead I am illustrating why it is incorrect to update sata_mv for this problem. The solution belongs elsewhere, because the problem is not with the chip, but the BIOS.

Continuing with the other emails...

What other cards do we support that automatically overwrite user data without confirmation or notice of any kind?

If you use any vendor RAID (BIOS RAID / fake RAID), and fail to use DM+dmraid, then data corruption occurs due to lack of knowledge about the presence of underlying BIOS-created RAID metadata.

Your case is just another case of problems caused by lack of knowledge of the underlying vendor RAID that the BIOS insists upon using.

I'm pretty sure the most recent Fedora release has full dmraid support for known formats, so AFAICS the task at hand should be simply to figure out how to identify the underlying vendor RAID (on-disk signatures are greatly preferred over PCI ID matching), and update dmraid accordingly.

Welcome to the suck that is BIOS RAID :)

	Jeff
Why we were seeing so many spurious NCQ completions
Hello, all.

This has been going on for quite some time now but I finally succeeded in reproducing the problem and finding out what has been going on. It wasn't the drive's or the controller's fault. The spurious completion detection logic was wrong, which makes all of this my fault. :-)

The attached patch induces NCQ spurious completions by inserting artificial delays during irq handling. The following is the log with the patch applied.

A [ 1125.478813] ata35: MON issue=0x0 SAct=0x1 sactive=0x3 SDB FIS=004040a1:0002
B [ 1125.480248] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001
C [ 1125.481614] ata35: MON issue=0x0 SAct=0x5 sactive=0x7 SDB FIS=004040a1:0002
D [ 1125.481704] ata35: YYY 0x2 -> 0x4
E [ 1125.481722] ata35: XXX issue=0x0 SAct=0x1 sactive=0x1 SDB FIS=004040a1:0004
F [ 1125.483087] ata35: MON issue=0x0 SAct=0x0 sactive=0x1 SDB FIS=004040a1:0001
G [ 1125.484297] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001

MON lines are printed on each SDB FIS, while the YYY line indicates that the SDB FIS RX area has changed during the artificial delay. The XXX line notes the condition which triggers spurious NCQ completion; invoking EH is disabled for debugging.

Here's what happens.

1. On A, NCQ commands 0 and 1 are in flight; command 0 is still being transmitted to the device. The first SDB FIS indicates completion of command 1.

2. Between A and B, the driver issues NCQ commands 1 and 2. Command 0 is still in flight.

3. On B, command 0 completes and the drive sends completion for it.

4. Between B and C, the driver issues NCQ command 0.

5. On C, command 1 completes and the drive sends completion for it.

6. On D, the drive then completes command 2 and sends completion for it. This updates the SDB FIS RX area and, as the driver is still in the IRQ handler, sets the IRQ pending bit again. Note that the YYY line is printed *before* actually completing commands. So, after printing the YYY line, the driver completes both commands 1 and 2.

7. On E, the IRQ handler is invoked again because of the IRQ pending status set in #6. However, the completions contained in the SDB FIS which triggered this IRQ handler invocation are already processed, i.e. the completion for command 2 was already processed in the previous IRQ handler invocation. So this time the IRQ handler has nothing to do, but the SDB FIS RX area shows that this IRQ is for an SDB FIS which includes completion for command 2, which triggers the spurious NCQ completion condition.

8. It goes on.

So, trying to detect spurious completions using IRQ and RX FIS area turns out to be stupid as they aren't interlocked. I'll soon post a patch to remove the spurious completion check and the blacklist entries that resulted from it.

Thanks a lot.

--
tejun

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4688dbf..9f9a658 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1664,6 +1664,10 @@ static void ahci_port_intr(struct ata_port *ap)
 	}
 
 	if (status & PORT_IRQ_SDB_FIS) {
+		const __le32 *f = pp->rx_fis + RX_FIS_SDB;
+		u32 t = le32_to_cpu(f[1]);
+		int i;
+
 		/* If SNotification is available, leave notification
 		 * handling to sata_async_notification().  If not,
 		 * emulate it by snooping SDB FIS RX area.
@@ -1686,6 +1690,32 @@ static void ahci_port_intr(struct ata_port *ap)
 			if (f0 & (1 << 15))
 				sata_async_notification(ap);
 		}
+
+		if (le32_to_cpu(f[1]) & ~pp->active_link->sactive)
+			ata_link_printk(pp->active_link, KERN_INFO,
+					"XXX issue=0x%x SAct=0x%x sactive=0x%x SDB FIS=%08x:%08x\n",
+					readl(port_mmio + PORT_CMD_ISSUE),
+					readl(port_mmio + PORT_SCR_ACT),
+					pp->active_link->sactive,
+					le32_to_cpu(f[0]), le32_to_cpu(f[1]));
+		else
+			ata_link_printk(pp->active_link, KERN_INFO,
+					"MON issue=0x%x SAct=0x%x sactive=0x%x SDB FIS=%08x:%08x\n",
+					readl(port_mmio + PORT_CMD_ISSUE),
+					readl(port_mmio + PORT_SCR_ACT),
+					pp->active_link->sactive,
+					le32_to_cpu(f[0]), le32_to_cpu(f[1]));
+
+		for (i = 0; i < 100; i++) {
+			if (t != le32_to_cpu(f[1])) {
+				ata_link_printk(pp->active_link, KERN_INFO,
+						"YYY 0x%x -> 0x%x\n",
+						t, le32_to_cpu(f[1]));
+				break;
+			}
+			udelay(1);
+			cpu_relax();
+		}
 	}
 
 	/* pp->active_link is valid iff any command is in flight */
@@ -1746,7 +1776,7 @@ static void ahci_port_intr(struct ata_port *ap)
 			 * with HSM violation.  EH will turn off NCQ
 			 * after several such failures.
 			 */
-			ata_ehi_push_desc(ehi,
+			/* ata_ehi_push_desc(ehi,
 				"spurious completions during NCQ "
 				"issue=0x%x SAct=0x%x FIS=%08x:%08x",
 				readl(port_mmio + PORT_CMD_ISSUE),
@@ -1754,7 +1784,7 @@ static void ahci_port_intr(struct ata_port *ap)
 				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
 			ehi->err_mask |= AC_ERR_HSM;
 			ehi->action |= ATA_EH_SOFTRESET;
-			ata_port_freeze(ap);
+			ata_port_freeze(ap);*/
 		} else {
 			if (!pp->ncq_saw_sdb)
 				ata_port_printk(ap, KERN_INFO,
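The check that separates XXX lines from MON lines in the debug patch above can be replayed against the logged values outside the kernel. This is only a sketch: struct sdb_event, stale_tags() and count_flagged() are names made up for this note, and the table assumes that the second word of each logged "SDB FIS=" pair is the completion bitmap f[1]. Line D is left out because the YYY line carries no sactive value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One logged SDB FIS event: the driver's view of in-flight tags
 * (sactive) and the completion bits carried by the FIS (f[1]). */
struct sdb_event {
	char label;
	uint32_t sactive;
	uint32_t fis_done;
};

/* Same expression as the debug patch: completion bits for tags the
 * driver does not consider in flight. */
static uint32_t stale_tags(const struct sdb_event *ev)
{
	return ev->fis_done & ~ev->sactive;
}

/* Values copied from log lines A, B, C, E, F and G. */
static const struct sdb_event log_events[] = {
	{ 'A', 0x3, 0x2 },
	{ 'B', 0x7, 0x1 },
	{ 'C', 0x7, 0x2 },
	{ 'E', 0x1, 0x4 },
	{ 'F', 0x1, 0x1 },
	{ 'G', 0x7, 0x1 },
};

/* Count the events the check would report as spurious. */
static unsigned int count_flagged(const struct sdb_event *ev, size_t n)
{
	unsigned int flagged = 0;

	assert(ev != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (stale_tags(&ev[i]))
			flagged++;
	return flagged;
}
```

With these values only line E is flagged: the FIS still reports tag 2 (0x4) although the driver already completed it during the previous handler invocation, which matches step 7 of the explanation.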
[PATCH #upstream-fixes] libata: kill spurious NCQ completion detection
Spurious NCQ completion detection implemented in ahci was incorrect. On AHCI, receiving and processing FISes and raising interrupts are not interlocked, and spurious interrupts are expected.

For example, if an interrupt occurs while the interrupt handler is running and the running handler handles the event the new IRQ indicated, the handler will be executed again after it finishes, because the IRQ pending bit was set by the new interrupt, but there won't be anything to process.

Please read the following message for more information.

  http://article.gmane.org/gmane.linux.ide/26012

This patch...

* Removes all spurious IRQ whining from ahci. Spurious NCQ completion detection was completely wrong. Spurious D2H Register FIS taught us that some early drives send spurious D2H Register FIS with the I bit set while NCQ commands are in progress, but none of the recent drives does that, and even the ones which show such behavior can do NCQ fine.

* Kills all NCQ blacklist entries which were added because of spurious NCQ completions. I tracked down each commit and verified that all removed entries were actually added because of spurious completions.

WD740ADFD-00NLR1 wasn't deleted but moved upward because the drive not only had spurious NCQ completions but also is slow on sequential data transfers if NCQ is enabled.

Maxtor 7V300F0 was added by 0e3dbc01d53940fe10e5a5cfec15ede3e929c918 from Alan Cox. Searching the mailing list, I could only find evidence that the drive had trouble with spurious completions. This entry needs to be verified and removed if it doesn't have other NCQ related problems.

Signed-off-by: Tejun Heo [EMAIL PROTECTED]
Cc: Alan Cox [EMAIL PROTECTED]
---
Alan, can you please check why 7V300F0 was added? Thanks a lot.
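The "executed again with nothing to process" case described in the commit message can be reproduced with a toy model. This is not AHCI code: struct toy_port and both functions are invented for illustration, and the model assumes only what the message states, namely that every new completion latches a pending flag and that the handler processes whatever is visible when it runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not AHCI code: the device posts completion bits and
 * latches an IRQ-pending flag; the handler drains what is posted. */
struct toy_port {
	unsigned int posted;	/* completion bits not yet processed */
	bool irq_pending;	/* latched by every new completion */
};

static void device_complete(struct toy_port *p, unsigned int tag)
{
	assert(tag < 32);
	p->posted |= 1u << tag;
	p->irq_pending = true;
}

/* One handler invocation; returns the number of completions handled.
 * If late_tag >= 0, that completion arrives after the interrupt was
 * acknowledged but before the posted bits are read. */
static int run_handler(struct toy_port *p, int late_tag)
{
	int handled = 0;

	p->irq_pending = false;			/* acknowledge the IRQ */
	if (late_tag >= 0)
		device_complete(p, (unsigned int)late_tag);

	while (p->posted) {			/* drain everything visible */
		p->posted &= p->posted - 1;	/* clear lowest set bit */
		handled++;
	}
	return handled;
}
```

Completing tag 1, then running the handler while tag 2 completes, processes both completions in one pass but leaves irq_pending set, so the next invocation handles zero completions. In this model that empty invocation is expected behavior rather than a sign of a misbehaving device.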
 drivers/ata/ahci.c        |   74 +-
 drivers/ata/libata-core.c |   18 ---
 2 files changed, 4 insertions(+), 88 deletions(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4688dbf..7ef497a 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1638,7 +1638,7 @@ static void ahci_port_intr(struct ata_port *ap)
 	struct ahci_host_priv *hpriv = ap->host->private_data;
 	int resetting = !!(ap->pflags & ATA_PFLAG_RESETTING);
 	u32 status, qc_active;
-	int rc, known_irq = 0;
+	int rc;
 
 	status = readl(port_mmio + PORT_IRQ_STAT);
 	writel(status, port_mmio + PORT_IRQ_STAT);
@@ -1696,80 +1696,12 @@ static void ahci_port_intr(struct ata_port *ap)
 
 	rc = ata_qc_complete_multiple(ap, qc_active, NULL);
 
-	/* If resetting, spurious or invalid completions are expected,
-	 * return unconditionally.
-	 */
-	if (resetting)
-		return;
-
-	if (rc > 0)
-		return;
-	if (rc < 0) {
+	/* while resetting, invalid completions are expected */
+	if (unlikely(rc < 0 && !resetting)) {
 		ehi->err_mask |= AC_ERR_HSM;
 		ehi->action |= ATA_EH_SOFTRESET;
 		ata_port_freeze(ap);
-		return;
-	}
-
-	/* hmmm... a spurious interrupt */
-
-	/* if !NCQ, ignore.  No modern ATA device has broken HSM
-	 * implementation for non-NCQ commands.
-	 */
-	if (!ap->link.sactive)
-		return;
-
-	if (status & PORT_IRQ_D2H_REG_FIS) {
-		if (!pp->ncq_saw_d2h)
-			ata_port_printk(ap, KERN_INFO,
-				"D2H reg with I during NCQ, "
-				"this message won't be printed again\n");
-		pp->ncq_saw_d2h = 1;
-		known_irq = 1;
-	}
-
-	if (status & PORT_IRQ_DMAS_FIS) {
-		if (!pp->ncq_saw_dmas)
-			ata_port_printk(ap, KERN_INFO,
-				"DMAS FIS during NCQ, "
-				"this message won't be printed again\n");
-		pp->ncq_saw_dmas = 1;
-		known_irq = 1;
 	}
-
-	if (status & PORT_IRQ_SDB_FIS) {
-		const __le32 *f = pp->rx_fis + RX_FIS_SDB;
-
-		if (le32_to_cpu(f[1])) {
-			/* SDB FIS containing spurious completions
-			 * might be dangerous, whine and fail commands
-			 * with HSM violation.  EH will turn off NCQ
-			 * after several such failures.
-			 */
-			ata_ehi_push_desc(ehi,
-				"spurious completions during NCQ "
-				"issue=0x%x SAct=0x%x FIS=%08x:%08x",
-				readl(port_mmio + PORT_CMD_ISSUE),
-				readl(port_mmio + PORT_SCR_ACT),
-				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
-			ehi->err_mask |= AC_ERR_HSM;
-			ehi->action |= ATA_EH_SOFTRESET;
-			ata_port_freeze(ap);
-