Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)

2007-12-06 Thread Boaz Harrosh
On Thu, Dec 06 2007 at 2:26 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote:
 Hi Boaz,
 
 On Tue, 04 Dec 2007 15:39:12 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Sat, Dec 01 2007 at 1:35 +0200, Kiyoshi Ueda [EMAIL PROTECTED] wrote:
 This patch converts bidi of scsi mid-layer to use blk_end_request().

 rq->next_rq represents a pair of bidi requests.
 (There is no other use of 'next_rq' in struct request.)
 For both requests in the pair, end_that_request_chunk() should be
 called before end_that_request_last() is called for one of them.
 Since the calls to end_that_request_first()/chunk() and
 end_that_request_last() are packaged into blk_end_request(),
 the handling of next_rq completion has to be moved into
 blk_end_request(), too.

 Bidi sets its specific value to rq->data_len before the request is
 completed so that the upper layer can read it.
 This setting must be done between end_that_request_chunk() and
 end_that_request_last(), because rq->data_len may be used
 in end_that_request_chunk() by blk_trace and so on.
 To satisfy the requirement, use blk_end_request_callback() which
 is added in PATCH 25 only for the tricky drivers.

 If bidi didn't reuse rq->data_len and instead added new members to request
 for the specific value, it could set them before end_that_request_chunk()
 and use the standard blk_end_request() like below.

 void scsi_end_bidi_request(struct scsi_cmnd *cmd)
 {
     struct request *req = cmd->request;

     req->resid = scsi_out(cmd)->resid;
     req->next_rq->resid = scsi_in(cmd)->resid;

     if (blk_end_request(req, 1, req->data_len))
         BUG();

     scsi_release_buffers(cmd);
     scsi_next_command(cmd);
 }
 ...
 snip
 ...
 rq->data_len = scsi_out(cmd)->resid is not just a problem of bidi;
 it is a general problem of scsi residual handling, and of user code.

 Even today, before any bidi, in scsi_io_completion() in scsi_lib.c
 we do req->data_len = scsi_get_resid(cmd);
 (or: req->data_len = cmd->resid; depending on which version you look at)
 and then call scsi_end_request(), which calls __end_that_request_first/last.
 So it is assumed even today that req->data_len is not touched by
 __end_that_request_first/last unless __end_that_request_first returned
 that there is more work to do and the command is resubmitted, in which
 case the resid information is discarded.

 So if the regular resid handling is acceptable - set req->data_len
 before the call to __end_that_request_first/last, or blk_end_request()
 in your case - then there goes your second client of the _callback and
 it can be removed.
 But if it is found that req->data_len is touched and the resid information
 gets lost, then it should be fixed for the common uni-io case, for example
 by passing resid to the blk_end_request() function.
 (So either way the _callback can go.)
 
 Thank you for the explanation of scsi's rq->data_len usage.
 I see that scsi usually uses rq->data_len for cmd->resid.
 
 I have investigated the possibility of setting data_len before
 the call to blk_end_request.
 But no matter whether data_len is touched or not, we need a callback
 for bidi.  So I would like to go with the current patch.
 
 I explained the reason and some details below.
 
 
 As far as I can see, rq->data_len is just referenced
 by blk_add_trace_rq() in __end_that_request_first(), not modified.
 And I don't change any logic around there in the block-layer.
 So there shouldn't be any critical problem for scsi residual handling.
 (although I'm not sure that scsi expects cmd->resid to be traced
  by blk_trace.)
 
 Anyway, I see that it is no critical problem for bidi to set cmd->resid
 to rq->data_len before the blk_end_request() call.
 But if I do that, blk_end_request() can't get the next_rq's size
 to complete in its code below.
 
 +    /* Bidi request must be completed as a whole */
 +    if (blk_bidi_rq(rq) &&
 +        __end_that_request_first(rq->next_rq, uptodate,
 +                                 blk_rq_bytes(rq->next_rq)))
 +        return 1;
 
 So I will have to move next_rq completion to bidi and use _callback()
 anyway like the following.
 -
 static int dummy_cb(struct request *rq)
 {
   return 1;
 }
 
 void scsi_end_bidi_request(struct scsi_cmnd *cmd)
 {
   struct request *req = cmd->request;
   unsigned int dlen = req->data_len;
   unsigned int next_dlen = req->next_rq->data_len;
  
   req->data_len = scsi_out(cmd)->resid;
   req->next_rq->data_len = scsi_in(cmd)->resid;
  
   /* Complete only DATA of next_rq using _callback and dummy function */
   if (!blk_end_request_callback(req->next_rq, 1, next_dlen, dummy_cb))
     BUG();
  
   if (blk_end_request(req, 1, dlen))
     BUG();
 
   scsi_release_buffers(cmd);
   scsi_next_command(cmd);
 }
 -
 
 I prefer the current patch rather than the code like above,
 since the code calls 

Re: [PATCH 14/14] libata: use PIO for misc ATAPI commands

2007-12-06 Thread Tejun Heo
Petr Vandrovec wrote:
 Alan Cox wrote:
 It eventually has to end up in -rc.  If not for 2.6.25-rc1 is too early,
 we can put it in #testing and put it into #upstream later.

 Nobody cares about libata git trees. If you want some initial test
 coverage put it in -mm.

 primarily worried about.  Command type dependent quick fallback might
 help but ancient controllers are more likely to bring the whole machine
 down when a DMA transaction goes south.

 Quite the reverse in my experience - the dumber the controller the more
 likely that ATAPI DMA and LBA48 and other stuff just works anyway.
 
 Yes.  FYI, if you'll start sending ATAPI commands with DATA_OUT phase
 using PIO from VM under VMware, it will politely ask you to reconfigure
 OS in the virtual machine to use DMA, and most probably it won't work
 until you really do so...  Windows are sending all these commands using
 DMA, and I believe they do same for majority of DATA_IN commands as well
 (and Windows also set byte count to correct PIO-like value even for DMA
 commands)

There are few DATA OUT commands and most of them are sector (2k)
aligned.  We use DMA for them.  The same goes for sector aligned DATA IN
commands and READ CD but what you say is inconsistent with what I've
seen from SATA bus trace.  Windows was using PIO for all misc DATA IN
commands - such as REQUEST SENSE, GET CONFIGURATION, MODE SENSE, etc...

Also, libata will set the byte count to a PIO-like value for DMA commands
starting with 2.6.25, if not 2.6.24.
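
The rule of thumb described above (DMA for sector-aligned transfers and READ CD, PIO for the small misc commands) can be sketched as a predicate. This is only an illustrative model: the function name `toy_atapi_use_dma` and the 2048-byte constant are assumptions made for the example, and libata's real command classification may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed ATAPI sector size ("2k" in the message above). */
enum { TOY_ATAPI_SECTOR_SIZE = 2048 };

/* MMC opcode for READ CD, which the message says is sent with DMA. */
enum { TOY_OP_READ_CD = 0xBE };

/*
 * Hypothetical policy helper, not a libata function:
 * returns true if the command would be issued with DMA under the
 * rule of thumb above, false if it would fall back to PIO.
 */
bool toy_atapi_use_dma(uint8_t opcode, unsigned int nbytes)
{
    /* READ CD is named explicitly as a DMA command. */
    if (opcode == TOY_OP_READ_CD)
        return true;

    /* Sector-aligned, non-empty transfers use DMA; the rest use PIO. */
    return nbytes != 0 && (nbytes % TOY_ATAPI_SECTOR_SIZE) == 0;
}
```

Under this model a 32 KiB READ(10) (opcode 0x28) would be sent with DMA, while an 18-byte REQUEST SENSE (opcode 0x03) or an 8-byte MODE SENSE(10) (opcode 0x5A) would use PIO.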

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 14/14] libata: use PIO for misc ATAPI commands

2007-12-06 Thread Petr Vandrovec

Alan Cox wrote:

It eventually has to end up in -rc.  If not for 2.6.25-rc1 is too early,
we can put it in #testing and put it into #upstream later.


Nobody cares about libata git trees. If you want some initial test
coverage put it in -mm.


primarily worried about.  Command type dependent quick fallback might
help but ancient controllers are more likely to bring the whole machine
down when a DMA transaction goes south.


Quite the reverse in my experience - the dumber the controller the more
likely that ATAPI DMA and LBA48 and other stuff just works anyway.


Yes.  FYI, if you'll start sending ATAPI commands with DATA_OUT phase 
using PIO from VM under VMware, it will politely ask you to reconfigure 
OS in the virtual machine to use DMA, and most probably it won't work 
until you really do so...  Windows are sending all these commands using 
DMA, and I believe they do same for majority of DATA_IN commands as well 
(and Windows also set byte count to correct PIO-like value even for DMA 
commands)


Given that very few customers reported this problem in past 8 years, I 
would guess that your attempt to use PIO only will actually exercise 
more untested code in the firmware than DMA code paths.

Petr



PROBLEM: WARNING: at kernel/irq/manage.c:158 enable_irq() during boot

2007-12-06 Thread wojtekz
Hi,

The full report is in the attached file.

Regards
Wojciech Zareba
[1.] One line summary of the problem: 
WARNING: at kernel/irq/manage.c:158 enable_irq() during boot

[2.] Full description of the problem/report:
Dec  6 11:58:23 titanium kernel: WARNING: at kernel/irq/manage.c:158 enable_irq()
Dec  6 11:58:23 titanium kernel:  [c014b5c3] enable_irq+0x6e/0xa2
Dec  6 11:58:23 titanium kernel:  [f887b783] probe_hwif+0x6d8/0x7c7 [ide_core]
Dec  6 11:58:23 titanium kernel:  [f887c0be] probe_hwif_init_with_fixup+0xc/0x80 [ide_core]
Dec  6 11:58:23 titanium kernel:  [c0190007] elf_core_dump+0x627/0xb60
Dec  6 11:58:23 titanium kernel:  [f887df7b] ide_setup_pci_device+0x6f/0x9c [ide_core]
Dec  6 11:58:23 titanium kernel:  [f883b1a7] pdc202new_init_one+0xf/0x10 [pdc202xx_new]
Dec  6 11:58:23 titanium kernel:  [c01d94de] pci_device_probe+0x36/0x55
Dec  6 11:58:23 titanium kernel:  [c021f8cb] driver_probe_device+0xc8/0x14b
Dec  6 11:58:23 titanium kernel:  [c021fa37] __driver_attach+0x52/0x87
Dec  6 11:58:23 titanium kernel:  [c021eecc] bus_for_each_dev+0x35/0x57
Dec  6 11:58:23 titanium kernel:  [c021f748] driver_attach+0x16/0x18
Dec  6 11:58:23 titanium kernel:  [c021f9e5] __driver_attach+0x0/0x87
Dec  6 11:58:23 titanium kernel:  [c021f1a8] bus_add_driver+0x6d/0x153
Dec  6 11:58:23 titanium kernel:  [c01d961d] __pci_register_driver+0x4b/0x77
Dec  6 11:58:23 titanium kernel:  [c0143a39] sys_init_module+0x1525/0x15fb
Dec  6 11:58:23 titanium kernel:  [f8879af9] ide_config_drive_speed+0x0/0x314 [ide_core]
Dec  6 11:58:23 titanium kernel:  [c0103f9e] syscall_call+0x7/0xb
Dec  6 11:58:23 titanium kernel:  [c02b] wireless_nlevent_process+0x15/0x31
Dec  6 11:58:23 titanium kernel:  ===

My hardware: PC based on the Giga-Byte motherboard 8PE667 Ultra (chipset 845 PE).
There are 3 disks: one with Debian (but kernel compiled by me) and 2 disks
striped (RAID 0) with Windows 2000.

[3.] Keywords (i.e., modules, networking, kernel):
kernel IDE IRQ

[4.] Kernel version (from /proc/version):
Linux version 2.6.23.9-titan-1 ([EMAIL PROTECTED]) (gcc version 4.2.3 20071014 
(prerelease) (Debian 4.2.2-3)) #5 SMP Thu Nov 29 11:11:30 CET 2007

[5.] Output of Oops.. N/A

[6.] A small shell script or example program which triggers the
 problem (if possible)
Just boot on this hardware.

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux titanium 2.6.23.9-titan-1 #5 SMP Thu Nov 29 11:11:30 CET 2007 i686 GNU/Linux

Gnu C                  4.2.3
Gnu make               3.81
binutils               Binutils
util-linux             2.13
mount                  2.13
module-init-tools      3.3-pre11
e2fsprogs              1.40.2
Linux C Library        2.7
Dynamic linker (ldd)   2.7
Procps                 3.2.7
Net-tools              1.60
Console-tools          0.2.3
Sh-utils               5.97
udev                   114
wireless-tools         29
Modules Loaded ppdev lp ac ipv6 dm_snapshot dm_mirror dm_mod loop 
snd_ens1371 snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq 
snd_rawmidi snd_seq_device snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss 
snd_pcm snd_timer parport_pc parport snd snd_page_alloc button i2c_i801 
iTCO_wdt pcspkr rtc i2c_core iTCO_vendor_support intel_agp agpgart evdev ext3 
jbd mbcache ide_disk ide_cd cdrom ata_piix ata_generic pata_pdc2027x libata 
scsi_mod piix floppy pdc202xx_new ohci_hcd generic ide_core ehci_hcd uhci_hcd 
thermal processor fan
[7.2.] Processor information (from /proc/cpuinfo):
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.53GHz
stepping        : 7
cpu MHz         : 2545.587
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe up pebs bts sync_rdtsc cid
bogomips        : 5095.57
clflush size    : 64

[7.3.] Module information (from /proc/modules):
ppdev 8964 0 - Live 0xf89c6000
lp 11244 0 - Live 0xf8b2c000
ac 5636 0 - Live 0xf892d000
ipv6 233084 14 - Live 0xf8b8b000
dm_snapshot 17060 0 - Live 0xf8b26000
dm_mirror 22016 0 - Live 0xf89fd000
dm_mod 52912 2 dm_snapshot,dm_mirror, Live 0xf8b37000
loop 17284 0 - Live 0xf89f7000
snd_ens1371 22628 3 - Live 0xf89d1000
snd_seq_dummy 3972 0 - Live 0xf891a000
snd_seq_oss 29588 0 - Live 0xf8a2d000
snd_seq_midi 8352 0 - Live 0xf897b000
snd_seq_midi_event 7168 2 snd_seq_oss,snd_seq_midi, Live 0xf88cb000
snd_seq 46620 6 snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_seq_midi_event, Live 
0xf8a07000
snd_rawmidi 22816 2 

Revisiting - 2.6.23.8 - Hang with sata_mv (7042) + Flat 4Gig (no holes) Memory

2007-12-06 Thread Morrison, Tom
Well, I thought my problems were solved with the latest set of
patches - and they definitely improved the behavior - but I have
found out that they just delayed the problem - and I still get a
soft lockup (no good info in the soft lockup trace) when creating
large (300Meg) files using the sata_mv/7042 driver in 2.6.23.8.

I am very embarrassed that I didn't do more testing before
declaring victory...humble apologies to all...

To re-state the problem

Hardware/Configuration:
   MPC8548E with a 7042 (rev 2 - connected internal via a PEX switch) 
   2.6.23.8 (using PHYS_64BIT & PTE_64BIT - for 36 bit addressing;
   MSI is NOT compiled in)
   Flat 4Gig Memory Map (no holes - 0 - 0x0__ defined - special
   low reserve memory is also used)

   Local Bus & PCI Express IOMem mapped to unique space in
   0xC__ with extensions to the ioremap routines
   to create the appropriate requested physical address...
   This is (and should be) transparent to the requesting
   function that calls ioremap.

   2 SATA hard drives connected.

To recreate:
   Write a large file (now greater than 310Mbytes) - hangs
   and soft lockup is detected by kernel - no useful info 
   in stack trace...

Of interest:
   a) Replace sata_mv.c - with the 'old' Marvell's reference 
  driver and it works perfectly!!

   b) Also, sata_mv works perfectly in all conditions - if we boot with
      less than the ~3750M from the command line (which I note is ~below
      where its PEX IOmemory space is located).


My thoughts (besides @[EMAIL PROTECTED]@[EMAIL PROTECTED]@#)

In the old Marvell reference driver - we had to modify 
the EDMA setup to configure the dma_high request/response 
addresses to point to the proper (0xC__) location.
No other modifications were required - so it's a little 
confusing what is going on here.

It is obvious from #b above that this has something to 
do with accessing/reading/writing data to/from this chip,
and when this happens - it scribbles on important internal
information and/or gets into a confused state where it 
just locks up...

Again, sorry for my inadequate testing reports before...and I look
forward to anyone's input on this!

Sincerely,


Tom Morrison
Principal S/W Engineer
Empirix, Inc (www.empirix.com)
[EMAIL PROTECTED]
(781) 266 - 3567
 






Re: laptop reboots right after hibernation

2007-12-06 Thread Kjartan Maraas

On Thu, 06.12.2007 at 11:38 +0900, Tejun Heo wrote:
 Thanks.  Almost there.  Can you please try the attached two patches and
 report the boot log?
 
Here we go again.

Cheers
Kjartan

Initializing cgroup subsys cpuset
Linux version 2.6.24-rc4 ([EMAIL PROTECTED]) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-35)) #3 SMP Thu Dec 6 13:29:39 CET 2007
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009fc00 (usable)
 BIOS-e820: 0009fc00 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - bf7d (usable)
 BIOS-e820: bf7d - bf7e5600 (reserved)
 BIOS-e820: bf7e5600 - bf7f8000 (ACPI NVS)
 BIOS-e820: bf7f8000 - bf80 (reserved)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: fed2 - fed9b000 (reserved)
 BIOS-e820: feda - fedc (reserved)
 BIOS-e820: fee0 - fee01000 (reserved)
 BIOS-e820: ffb0 - ffc0 (reserved)
 BIOS-e820: fff0 - 0001 (reserved)
2167MB HIGHMEM available.
896MB LOWMEM available.
Entering add_active_range(0, 0, 784336) 0 entries of 256 used
Zone PFN ranges:
  DMA 0 - 4096
  Normal   4096 -   229376
  HighMem229376 -   784336
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
0:0 -   784336
On node 0 totalpages: 784336
  DMA zone: 56 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4040 pages, LIFO batch:0
  Normal zone: 3080 pages used for memmap
  Normal zone: 00 pages, LIFO batch:31
  HighMem zone: 7587 pages used for memmap
  HighMem zone: 547373 pages, LIFO batch:31
  Movable zone: 0 pages used for memmap
DMI 2.4 present.
Using APIC driver default
ACPI: RSDP 000F78B0, 0024 (r2 HP)
ACPI: XSDT BF7E57C8, 007C (r1 HPQOEM SLIC-MPC1 HP  1)
ACPI: FACP BF7E5684, 00F4 (r4 HP 30AD3 HP  1)
ACPI: DSDT BF7E5ACC, FE7B (r1 HP   nc64001 MSFT  10E)
ACPI: FACS BF7F7E80, 0040
ACPI: SLIC BF7E5844, 0176 (r1 HPQOEM SLIC-MPC1 HP  1)
ACPI: HPET BF7E59BC, 0038 (r1 HP 30AD1 HP  1)
ACPI: APIC BF7E59F4, 0068 (r1 HP 30AD1 HP  1)
ACPI: MCFG BF7E5A5C, 003C (r1 HP 30AD1 HP  1)
ACPI: TCPA BF7E5A98, 0032 (r2 HP 30AD1 HP  1)
ACPI: SSDT BF7F5947, 0059 (r1 HP   HPQNLP1 MSFT  10E)
ACPI: SSDT BF7F59A0, 032D (r1 HP   HPQSAT1 MSFT  10E)
ACPI: SSDT BF7F64E0, 025F (r1 HP  Cpu0Tst 3000 INTL 20060317)
ACPI: SSDT BF7F673F, 00A6 (r1 HP  Cpu1Tst 3000 INTL 20060317)
ACPI: SSDT BF7F67E5, 04D7 (r1 HPCpuPm 3000 INTL 20060317)
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 6:15 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: IOAPIC (id[0x01] address[0xfec0] gsi_base[0])
IOAPIC[0]: apic_id 1, version 32, address 0xfec0, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
ACPI: HPET id: 0x8086a201 base: 0xfed0
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at c000 (gap: bf80:3f40)
swsusp: Registered nosave memory region: 0009f000 - 000a
swsusp: Registered nosave memory region: 000a - 000e
swsusp: Registered nosave memory region: 000e - 0010
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 773613
Kernel command line: ro root=LABEL=/1 rhgb quiet pci=assign-busses selinux=off
mapped APIC to b000 (fee0)
mapped IOAPIC to a000 (fec0)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c082 soft=c080
PID hash table entries: 4096 (order: 12, 16384 bytes)
Detected 1828.814 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:8
... MAX_LOCK_DEPTH:  30
... MAX_LOCKDEP_KEYS:2048
... CLASSHASH_SIZE:   1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS:  16384
... CHAINHASH_SIZE:  8192
 memory used by lock dependency info: 1024 kB
 per task-struct memory footprint: 1680 bytes
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 

Re: [PATCH 27/28] blk_end_request: changing scsi mid-layer for bidi (take 3)

2007-12-06 Thread Kiyoshi Ueda
Hi Boaz, Jens,

On Thu, 06 Dec 2007 11:24:44 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
  Index: 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
  ===
  --- 2.6.24-rc3-mm2.orig/drivers/scsi/scsi_lib.c
  +++ 2.6.24-rc3-mm2/drivers/scsi/scsi_lib.c
  @@ -629,28 +629,6 @@ void scsi_run_host_queues(struct Scsi_Ho
  scsi_run_queue(sdev->request_queue);
   }
   
  -static void scsi_finalize_request(struct scsi_cmnd *cmd, int uptodate)
  -{
  -   struct request_queue *q = cmd->device->request_queue;
  -   struct request *req = cmd->request;
  -   unsigned long flags;
  -
  -   add_disk_randomness(req->rq_disk);
  -
  -   spin_lock_irqsave(q->queue_lock, flags);
  -   if (blk_rq_tagged(req))
  -       blk_queue_end_tag(q, req);
  -
  -   end_that_request_last(req, uptodate);
  -   spin_unlock_irqrestore(q-queue_lock, flags);
  -
  -   /*
  -* This will goose the queue request function at the end, so we don't
  -* need to worry about launching another command.
  -*/
  -   scsi_next_command(cmd);
  -}
  -
   /*
* Function:scsi_end_request()
*
  @@ -921,6 +899,20 @@ void scsi_release_buffers(struct scsi_cm
   EXPORT_SYMBOL(scsi_release_buffers);
   
   /*
  + * Called from blk_end_request_callback() after all DATA in rq and its next_rq
  + * are completed before rq is completed/freed.
  + */
  +static int scsi_end_bidi_request_cb(struct request *rq)
  +{
  +   struct scsi_cmnd *cmd = rq->special;
  +
  +   rq->data_len = scsi_out(cmd)->resid;
  +   rq->next_rq->data_len = scsi_in(cmd)->resid;
  +
  +   return 0;
  +}
  +
  +/*
* Bidi commands Must be complete as a whole, both sides at once.
* If part of the bytes were written and lld returned
* scsi_in()->resid and/or scsi_out()->resid this information will be left
  @@ -931,22 +923,28 @@ void scsi_end_bidi_request(struct scsi_c
   {
  struct request *req = cmd->request;
   
  -   end_that_request_chunk(req, 1, req->data_len);
  -   req->data_len = scsi_out(cmd)->resid;
  -
  -   end_that_request_chunk(req->next_rq, 1, req->next_rq->data_len);
  -   req->next_rq->data_len = scsi_in(cmd)->resid;
  -
  -   scsi_release_buffers(cmd);
  -
  /*
   *FIXME: If ll_rw_blk.c is changed to also put_request(req->next_rq)
  -*   in end_that_request_last() then this WARN_ON must be removed.
  +*   in blk_end_request() then this WARN_ON must be removed.
   *   for now, upper-driver must have registered an end_io.
   */
  WARN_ON(!req->end_io);
   
  -   scsi_finalize_request(cmd, 1);
  +   /*
  +* blk_end_request() family take care of data completion of next_rq.
  +* blk_end_request() family use next_rq->data_len for
  +* the completion data size of next_rq.
  +* So resid can't be set before the data completion of next_rq
  +* in blk_end_request().
  +* To resolve that, use the callback feature of blk_end_request().
  +*/
  +   if (blk_end_request_callback(req, 1, req->data_len,
  +                                scsi_end_bidi_request_cb))
  +   /* req has not been completed */
  +   BUG();
  +
  +   scsi_release_buffers(cmd);
  +   scsi_next_command(cmd);
   }
   
   /*
  Index: 2.6.24-rc3-mm2/block/ll_rw_blk.c
  ===
  --- 2.6.24-rc3-mm2.orig/block/ll_rw_blk.c
  +++ 2.6.24-rc3-mm2/block/ll_rw_blk.c
  @@ -3817,6 +3817,12 @@ int blk_end_request(struct request *rq, 
  if (blk_fs_request(rq) || blk_pc_request(rq)) {
  if (__end_that_request_first(rq, uptodate, nr_bytes))
  return 1;
  +
  +   /* Bidi request must be completed as a whole */
  +   if (blk_bidi_rq(rq) &&
  +       __end_that_request_first(rq->next_rq, uptodate,
  +                                blk_rq_bytes(rq->next_rq)))
  +   return 1;
  }
   
  add_disk_randomness(rq->rq_disk);
  @@ -3840,6 +3846,12 @@ int __blk_end_request(struct request *rq
  if (blk_fs_request(rq) || blk_pc_request(rq)) {
  if (__end_that_request_first(rq, uptodate, nr_bytes))
  return 1;
  +
  +   /* Bidi request must be completed as a whole */
  +   if (blk_bidi_rq(rq) &&
  +       __end_that_request_first(rq->next_rq, uptodate,
  +                                blk_rq_bytes(rq->next_rq)))
  +   return 1;
  }
   
  add_disk_randomness(rq->rq_disk);
  @@ -3884,6 +3896,12 @@ int blk_end_request_callback(struct requ
  if (blk_fs_request(rq) || blk_pc_request(rq)) {
  if (__end_that_request_first(rq, uptodate, nr_bytes))
  return 1;
  +
  +   /* Bidi request must be completed as a whole */
  +   if (blk_bidi_rq(rq) &&
  +       __end_that_request_first(rq->next_rq, uptodate,
  +                                blk_rq_bytes(rq->next_rq)))
  +   return 1;
  }
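
The ordering constraint behind this patch (resid may be written into data_len only after both halves of the bidi pair have had their data completed) can be modelled outside the kernel. What follows is a minimal sketch under stated assumptions, not kernel code: `struct toy_request` and every `toy_*` function are invented for the illustration, `remaining` stands in for the bio chain walked by __end_that_request_first(), and the completion of next_rq reads its size from next_rq->data_len as the patch comment describes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct request; only the fields under discussion. */
struct toy_request {
    unsigned int data_len;       /* transfer size; later overwritten with resid */
    unsigned int resid;          /* residual reported by the low-level driver */
    unsigned int remaining;      /* bytes not yet completed (models the bio chain) */
    struct toy_request *next_rq; /* second half of a bidi pair, or NULL */
};

typedef int (*toy_end_cb)(struct toy_request *rq);

/* Models __end_that_request_first(): returns 1 if data is still pending. */
int toy_end_first(struct toy_request *rq, unsigned int nr_bytes)
{
    if (nr_bytes >= rq->remaining) {
        rq->remaining = 0;
        return 0;
    }
    rq->remaining -= nr_bytes;
    return 1;
}

/* Models blk_end_request_callback(): returns 0 only if fully completed. */
int toy_end_request_callback(struct toy_request *rq, unsigned int nr_bytes,
                             toy_end_cb cb)
{
    if (toy_end_first(rq, nr_bytes))
        return 1;
    /* A bidi pair is completed as a whole; size comes from data_len. */
    if (rq->next_rq && toy_end_first(rq->next_rq, rq->next_rq->data_len))
        return 1;
    if (cb && cb(rq))
        return 1;
    return 0;
}

/* Models scsi_end_bidi_request_cb(): publish resid via data_len. */
int toy_set_resid_cb(struct toy_request *rq)
{
    rq->data_len = rq->resid;
    if (rq->next_rq)
        rq->next_rq->data_len = rq->next_rq->resid;
    return 0;
}

void toy_init_pair(struct toy_request *out, struct toy_request *in,
                   unsigned int out_len, unsigned int in_len,
                   unsigned int out_resid, unsigned int in_resid)
{
    out->data_len = out->remaining = out_len;
    out->resid = out_resid;
    out->next_rq = in;
    in->data_len = in->remaining = in_len;
    in->resid = in_resid;
    in->next_rq = NULL;
}

/* Variant A: overwrite data_len with resid first, then complete. */
int toy_complete_resid_first(struct toy_request *rq)
{
    unsigned int len = rq->data_len;

    toy_set_resid_cb(rq);
    return toy_end_request_callback(rq, len, NULL);
}

/* Variant B: complete the data first, set resid from the callback. */
int toy_complete_resid_in_cb(struct toy_request *rq)
{
    return toy_end_request_callback(rq, rq->data_len, toy_set_resid_cb);
}
```

In this model, variant A leaves the second request partly uncompleted whenever its resid is smaller than its transfer size, because the size used to complete it has already been overwritten. Variant B completes both halves and still leaves resid in data_len for the upper layer.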
   
   

Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)

2007-12-06 Thread Andrew Morton
On Sat, 1 Dec 2007 06:26:08 -0500 (EST)
Justin Piszcz [EMAIL PROTECTED] wrote:

 I am putting a new machine together and I have dual raptor raid 1 for the 
 root, which works just fine under all stress tests.
 
 Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on
 sale nowadays):
 
 I ran the following:
 
 dd if=/dev/zero of=/dev/sdc
 dd if=/dev/zero of=/dev/sdd
 dd if=/dev/zero of=/dev/sde
 
 (as it is always a very good idea to do this with any new disk)
 
 And sometime along the way(?) (i had gone to sleep and let it run), this 
 occurred:
 
 [42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x401 
 action 0x2 frozen

Gee we're seeing a lot of these lately.

 [42880.680231] ata3.00: irq_stat 0x00400040, connection status changed
 [42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 
 0x0 data 512 in
 [42880.680292]  res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10 
 (ATA bus error)
 [42881.841899] ata3: soft resetting port
 [42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
 [42915.919042] ata3.00: qc timeout (cmd 0xec)
 [42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
 [42915.919149] ata3.00: revalidation failed (errno=-5)
 [42915.919206] ata3: failed to recover some devices, retrying in 5 secs
 [42920.912458] ata3: hard resetting port
 [42926.411363] ata3: port is slow to respond, please be patient (Status 
 0x80)
 [42930.943080] ata3: COMRESET failed (errno=-16)
 [42930.943130] ata3: hard resetting port
 [42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
 [42931.413523] ata3.00: configured for UDMA/133
 [42931.413586] ata3: EH pending after completion, repeating EH (cnt=4)
 [42931.413655] ata3: EH complete
 [42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors 
 (750156 MB)
 [42931.413809] sd 2:0:0:0: [sdc] Write Protect is off
 [42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
 [42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache: 
 enabled, doesn't support DPO or FUA
 
 Usually when I see this sort of thing with another box I have full of 
 raptors, it was due to a bad raptor and I never saw it again after I 
 replaced the disk that it happened on, but that was using the Intel P965 
 chipset.
 
 For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of 
 the drives (2 raptors, 3 750s connected to the Intel ICH9 Southbridge).
 
 I am going to do some further testing but does this indicate a bad drive? 
 Bad cable?  Bad connector?
 
 As you can see above, /dev/sdc stopped responding for a little bit and 
 then the kernel reset the port.
 
 Why is this though?  What is the likely root cause?  Should I replace the 
 drive?  Obviously this is not normal and cannot be good at all, the idea 
 is to put these drives in a RAID5 and if one is going to timeout that is 
 going to cause the array to go degraded and thus be worthless in a raid5 
 configuration.
 
 Can anyone offer any insight here?

It would be interesting to try 2.6.21 or 2.6.22.
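
For reading the `SStatus` values in the log quoted above: the number is printed in hex and packs three 4-bit fields of the SATA SStatus register (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11). A minimal decoding sketch follows; the struct and function names are made up for illustration and are not libata symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of the SATA SStatus register. */
struct sstatus_fields {
    unsigned int det; /* device detection: 3 = device present, PHY link established */
    unsigned int spd; /* negotiated speed: 1 = Gen1 (1.5 Gbps), 2 = Gen2 (3.0 Gbps) */
    unsigned int ipm; /* interface power management state: 1 = active */
};

/* Split a raw SStatus value into its three low nibbles. */
struct sstatus_fields decode_sstatus(uint32_t raw)
{
    struct sstatus_fields f;

    f.det = raw & 0xf;
    f.spd = (raw >> 4) & 0xf;
    f.ipm = (raw >> 8) & 0xf;
    return f;
}

/* Line rate in Mbps for the SPD field, or 0 if none/unknown. */
unsigned int sstatus_speed_mbps(unsigned int spd)
{
    switch (spd) {
    case 1:
        return 1500;
    case 2:
        return 3000;
    default:
        return 0;
    }
}

/* A link counts as up when DET reports an established PHY connection. */
int sstatus_link_up(uint32_t raw)
{
    return (raw & 0xf) == 0x3;
}
```

Under this decoding, `SStatus 123` gives DET=3, SPD=2, IPM=1, which is an established, active link at 3.0 Gbps and agrees with the "SATA link up 3.0 Gbps" text on the same log line.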



Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.

2007-12-06 Thread Mark Lord

Jeff Garzik wrote:

..
The problem is not at the chip or device level, and this is the same 
problem as any number of other cards with softRAID on it.  Not a new 
problem, not a new solution...

..

What other cards do we support that automatically overwrite user data
without confirmation or notice of any kind?

Just curious.

This card does it to any connected drive at power-on,
without the user taking any action or even being told about it.

Cheers


Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.

2007-12-06 Thread Mark Lord

Jeff Garzik wrote:
..
OTOH it is quite reasonable to explore auto-loading DM on top of the 
bare drive, and populating a DM table, if you see that particular BIOS 
signature or [insert other detection method].

..

I am very interested to hear a more detailed explanation of this,
as I don't really see how it addresses the problems.

Probably because I don't know much about device mapper.

But my understanding of it is that it re-exports portions of
the original device as a second device.

It doesn't seem to prevent using the original device afterward,
and I don't know if dm devices can have GRUB installed on them or not.

So a more full tutorial might be in order here.

Cheers


Re: Kernel 2.6.23.9 / P35 Chipset + WD 750GB Drives (reset port)

2007-12-06 Thread Justin Piszcz



On Thu, 6 Dec 2007, Andrew Morton wrote:


On Sat, 1 Dec 2007 06:26:08 -0500 (EST)
Justin Piszcz [EMAIL PROTECTED] wrote:


I am putting a new machine together and I have dual raptor raid 1 for the
root, which works just fine under all stress tests.

Then I have the WD 750 GiB drive (not RE2, desktop ones for ~150-160 on
sale nowadays):

I ran the following:

dd if=/dev/zero of=/dev/sdc
dd if=/dev/zero of=/dev/sdd
dd if=/dev/zero of=/dev/sde

(as it is always a very good idea to do this with any new disk)

And sometime along the way(?) (i had gone to sleep and let it run), this
occurred:

[42880.680144] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x401
action 0x2 frozen


Gee we're seeing a lot of these lately.


[42880.680231] ata3.00: irq_stat 0x00400040, connection status changed
[42880.680290] ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb
0x0 data 512 in
[42880.680292]  res 40/00:ac:d8:64:54/00:00:57:00:00/40 Emask 0x10
(ATA bus error)
[42881.841899] ata3: soft resetting port
[42885.966320] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[42915.919042] ata3.00: qc timeout (cmd 0xec)
[42915.919094] ata3.00: failed to IDENTIFY (I/O error, err_mask=0x5)
[42915.919149] ata3.00: revalidation failed (errno=-5)
[42915.919206] ata3: failed to recover some devices, retrying in 5 secs
[42920.912458] ata3: hard resetting port
[42926.411363] ata3: port is slow to respond, please be patient (Status
0x80)
[42930.943080] ata3: COMRESET failed (errno=-16)
[42930.943130] ata3: hard resetting port
[42931.399628] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[42931.413523] ata3.00: configured for UDMA/133
[42931.413586] ata3: EH pending after completion, repeating EH (cnt=4)
[42931.413655] ata3: EH complete
[42931.413719] sd 2:0:0:0: [sdc] 1465149168 512-byte hardware sectors
(750156 MB)
[42931.413809] sd 2:0:0:0: [sdc] Write Protect is off
[42931.413856] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[42931.413867] sd 2:0:0:0: [sdc] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA

Usually when I see this sort of thing with another box I have full of
raptors, it was due to a bad raptor and I never saw it again after I
replaced the disk that it happened on, but that was using the Intel P965
chipset.

For this board, it is a Gigabyte GSP-P35-DS4 (Rev 2.0) and I have all of
the drives (2 raptors, 3 750s connected to the Intel ICH9 Southbridge).

I am going to do some further testing but does this indicate a bad drive?
Bad cable?  Bad connector?

As you can see above, /dev/sdc stopped responding for a little bit and
then the kernel reset the port.

Why is this though?  What is the likely root cause?  Should I replace the
drive?  Obviously this is not normal and cannot be good at all.  The idea
is to put these drives in a RAID5, and if one of them is going to time out,
the array will go degraded, which makes the drive worthless in a RAID5
configuration.

Can anyone offer any insight here?


It would be interesting to try 2.6.21 or 2.6.22.



This was due to NCQ issues (disabling it fixed the problem).

Justin.
-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Hard drives only detected when booting from CD on nVidia MCP67 SATA

2007-12-06 Thread Chuck Ebbert
On 12/05/2007 06:53 PM, Chuck Ebbert wrote:
 With kernel 2.6.23 on an Acer 7220 notebook using nVidia MCP67 SATA,
 hard drives are only detected after first booting from a CD.
 
 Boot from hard drive               No drives detected
 Boot live CD                       Detected
 Boot CD to GRUB menu, then         Detected
 warm-boot from hard drive
 
 
 Non-detect case:
 
 ahci :00:09.0: version 2.3
 ACPI: PCI Interrupt Link [LSI0] enabled at IRQ 23
 ACPI: PCI Interrupt :00:09.0[A] -> Link [LSI0] -> GSI 23 (level, low) -> IRQ 16
 input: ImPS/2 Generic Wheel Mouse as /class/input/input2
 ahci :00:09.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl IDE mode
 ahci :00:09.0: flags: 64bit sntf led clo pmp pio slum part
 PCI: Setting latency timer of device :00:09.0 to 64
 scsi0 : ahci
 scsi1 : ahci
 scsi2 : ahci
 scsi3 : ahci
 ata1: SATA max UDMA/133 cmd 0xf8854100 ctl 0x bmdma 0x irq 221
 ata2: SATA max UDMA/133 cmd 0xf8854180 ctl 0x bmdma 0x irq 221
 ata3: SATA max UDMA/133 cmd 0xf8854200 ctl 0x bmdma 0x irq 221
 ata4: SATA max UDMA/133 cmd 0xf8854280 ctl 0x bmdma 0x irq 221
 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata2: SATA link down (SStatus 0 SControl 300)
 ata3: SATA link down (SStatus 0 SControl 300)
 ata4: SATA link down (SStatus 0 SControl 300)
 Waiting for driver initialization
 
 But if a LiveCD is used to boot, or if a LiveCD was used before a hot reboot 
 (without a power off), the disks are correctly found:
 
 Loading ahci.ko
 ahci :00:09.0: version 2.3
 ACPI: PCI Interrupt Link [LSI0] enabled at IRQ 23
 ACPI: PCI Interrupt :00:09.0[A] -> Link [LSI0] -> GSI 23 (level, low) -> IRQ 16
 input: ImPS/2 Generic Wheel Mouse as /class/input/input2
 ahci :00:09.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl IDE mode
 ahci :00:09.0: flags: 64bit sntf led clo pmp pio slum part
 PCI: Setting latency timer of device :00:09.0 to 64
 scsi0 : ahci
 scsi1 : ahci
 scsi2 : ahci
 scsi3 : ahci
 ata1: SATA max UDMA/133 cmd 0xf8854100 ctl 0x bmdma 0x irq 221
 ata2: SATA max UDMA/133 cmd 0xf8854180 ctl 0x bmdma 0x irq 221
 ata3: SATA max UDMA/133 cmd 0xf8854200 ctl 0x bmdma 0x irq 221
 ata4: SATA max UDMA/133 cmd 0xf8854280 ctl 0x bmdma 0x irq 221
 ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
 ata1.00: ATA-7: Hitachi HTS541612J9SA00, SBDOC70P, max UDMA/100
 ata1.00: 234441648 sectors, multi 16: LBA48 NCQ (depth 0/32)
 ata1.00: configured for UDMA/100
 ata2: SATA link down (SStatus 0 SControl 300)
 ata3: SATA link down (SStatus 0 SControl 300)
 ata4: SATA link down (SStatus 0 SControl 300)
 
 

Possibly fixed by:

Commit: 3cc3eb1148e4b2dfabf7a1dcf36fd8be1331ca95
[libata] AHCI: enable AHCI mode, before using AHCI reset

Plus:

Commit: ab6fc95f609b372a19e18ea689986846ab1ba29c
[libata] AHCI: fix newly introduced host-reset bug

??


Re: Failure with SATA DVD-RW

2007-12-06 Thread Andrew Morton
On Thu, 6 Dec 2007 01:33:16 + (UTC)
Parag Warudkar [EMAIL PROTECTED] wrote:

 Tom Lanyon tomlanyon at gmail.com writes:
 
  scsi4: ahci
  ata5: SATA link up at 1.5 Gbps (SStatus 113 SControl 300)
  ata5.00: ATAPI, max UDMA/66
  ata5.00: qc timeout (cmd 0xef)
  ata5.00: failed to set xfermode (err_mask=0x104)
  ata5.00: limiting speed to UDMA/44
  ata5: failed to recover some devices, retrying in 5 secs
  ata5: port is slow to respond, please be patient (Status 0x80)
  ata5: port failed to respond (30 secs, status 0x80)
  ata5: COMRESET failed (device not ready)
  ata5: hardreset failed, retrying in 5 secs
  ata5: SATA link up at 1.5 Gbps (SStatus 113 SControl 300)
  ata5.00: ATAPI, max UDMA/66
  ata5.00: qc timeout (cmd 0xef)
  ata5.00: failed to set xfermode (err_mask=0x104)
  ata5.00: limiting speed to PIO0
  ata5: failed to recover some devices, retrying in 5 secs
  ata5: port is slow to respond, please be patient (Status 0x80)
  ata5: port failed to respond (30 secs, status 0x80)
  ata5: COMRESET failed (device not ready)
  ata5: hardreset failed, retrying in 5 secs
  ata5.00: ATAPI, max UDMA/66
  ata5.00: qc timeout (cmd 0xef)
  ata5.00: failed to set xfermode (err_mask=0x104)
  ata5.00: disabled
  
 Looks like it is trying to set the transfer mode to UDMA/66 and failing. After 
 that it tried UDMA/44 and failed again. Next, UDMA/66 again with an unsurprising 
 result: failed. After that PIO0, which seems to cause some kind of trouble, 
 then it tries UDMA/66 again, and I am not stating the result again :) ! 
 
  Any ideas what to try to get it working under AHCI?
  
 
 I recall reading somewhere that the Pioneer drive needs UDMA/33, which it did not 
 try in your case. It needs to somehow be made to try UDMA/33, but I can't find a 
 boot parameter which will do that. So maybe adding a quirk for this device to 
 limit the xfer mode to 33 would work. 
 
 What does your dmesg output for the drives look like when you run in IDE 
 compat mode? (Particularly the DMA for this drive?)
 

Please cc linux-ide on sata, pata and ide-related issues.

If nothing happens within a few days please raise a report at
bugzilla.kernel.org so we can ignore this in an organised fashion, thanks.



Re: Failure with SATA DVD-RW

2007-12-06 Thread Andrew Morton

(argh, shit, resent.  Please don't massage the cc list.  Do reply-to-all)




Re: Why we were seeing so many spurious NCQ completions

2007-12-06 Thread Jeff Garzik

Tejun Heo wrote:

Hello, all.

This has been going on for quite some time now but I finally succeeded
to reproduce the problem and find out what has been going on.  It
wasn't drive's or controller's fault.  The spurious completion
detection logic was wrong which makes all of this my fault.  :-)

The attached patch induces NCQ spurious completions by inserting
artificial delays during irq handling.  The following is log with the
patch applied.

A [ 1125.478813] ata35: MON issue=0x0 SAct=0x1 sactive=0x3 SDB FIS=004040a1:0002
B [ 1125.480248] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001
C [ 1125.481614] ata35: MON issue=0x0 SAct=0x5 sactive=0x7 SDB FIS=004040a1:0002
D [ 1125.481704] ata35: YYY 0x2 - 0x4
E [ 1125.481722] ata35: XXX issue=0x0 SAct=0x1 sactive=0x1 SDB FIS=004040a1:0004
F [ 1125.483087] ata35: MON issue=0x0 SAct=0x0 sactive=0x1 SDB FIS=004040a1:0001
G [ 1125.484297] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001


Thanks a lot for tracking this down, and thanks even more for being 
humble enough to admit mistakes.  More kernel hackers should follow your 
example.


I continue to be a proud mentor, watching you kick ass on the Linux 
kernel scene :)


Jeff





Re: [PATCH] sata_mv: Fix broken Marvell 7042 support.

2007-12-06 Thread Jeff Garzik

Mark Lord wrote:

Jeff Garzik wrote:
...
If you pop the BIOS chip or plug the card into a non-x86 box (or any 
of several other alternatives), the problem is likely to go away.

..



Yeah, I was hoping for a removable BIOS chip, but it's soldered in place.
And that's not a solution for most users anyway.


That was an example, silly :)  I'm not asking users to pop out chips. 
I'm illustrating that they are separate and distinct pieces, and you 
cannot assume.


Boot into a non-x86 platform, or use your x86 BIOS to disable all 
optional ROMs, and the BIOS-stomps-data issue goes away.


I'm not saying the _problem_ goes away; instead I am illustrating why it 
is incorrect to update sata_mv for this problem.  The solution belongs 
elsewhere, because the problem is not with the chip, but the BIOS.


Continuing with the other emails...


What other cards do we support that automatically overwrite user data
without confirmation or notice of any kind? 


If you use any vendor RAID (BIOS RAID / fake RAID), and fail to use 
DM+dmraid, then data corruption occurs due to lack of knowledge about 
the presence of underlying BIOS-created RAID metadata.


Your case is just another case of problems caused by lack of knowledge 
of the underlying vendor RAID that the BIOS insists upon using.


I'm pretty sure the most recent Fedora release has full dmraid support 
for known formats, so AFAICS the task at hand should be simply to figure 
out how to identify the underlying vendor RAID (on-disk signatures are 
greatly preferred over PCI ID matching), and update dmraid accordingly.


Welcome to the suck that is BIOS RAID :)

Jeff




Why we were seeing so many spurious NCQ completions

2007-12-06 Thread Tejun Heo
Hello, all.

This has been going on for quite some time now but I finally succeeded
in reproducing the problem and finding out what has been going on.  It
wasn't the drive's or the controller's fault.  The spurious completion
detection logic was wrong, which makes all of this my fault.  :-)

The attached patch induces NCQ spurious completions by inserting
artificial delays during irq handling.  The following is the log with
the patch applied.

A [ 1125.478813] ata35: MON issue=0x0 SAct=0x1 sactive=0x3 SDB FIS=004040a1:0002
B [ 1125.480248] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001
C [ 1125.481614] ata35: MON issue=0x0 SAct=0x5 sactive=0x7 SDB FIS=004040a1:0002
D [ 1125.481704] ata35: YYY 0x2 - 0x4
E [ 1125.481722] ata35: XXX issue=0x0 SAct=0x1 sactive=0x1 SDB FIS=004040a1:0004
F [ 1125.483087] ata35: MON issue=0x0 SAct=0x0 sactive=0x1 SDB FIS=004040a1:0001
G [ 1125.484297] ata35: MON issue=0x4 SAct=0x6 sactive=0x7 SDB FIS=004040a1:0001

A MON line is printed on each SDB FIS, while a YYY line indicates that
the SDB FIS RX area has changed during the artificial delay.  An XXX
line marks the condition which triggers the spurious NCQ completion
check; invoking EH is disabled for debugging.

Here's what happens.

1. On A, NCQ commands 0 and 1 are in flight; command 0 is still being
   transmitted to the device.  The first SDB FIS indicates completion
   of command 1.

2. Between A and B, the driver issues NCQ commands 1 and 2.  Command 0
   is still in flight.

3. On B, command 0 completes and the drive sends a completion for it.

4. Between B and C, the driver issues NCQ command 0.

5. On C, command 1 completes and the drive sends a completion for it.

6. On D, the drive completes command 2 and sends a completion for it.
   This updates the SDB FIS RX area and, as the driver is still in the
   IRQ handler, sets the IRQ pending bit again.  Note that the YYY
   line is printed *before* the commands are actually completed.  So,
   after printing the YYY line, the driver completes both commands 1
   and 2.

7. On E, the IRQ handler is invoked again because of the IRQ pending
   status set in #6.  However, the completions contained in the SDB
   FIS which triggered this invocation have already been processed:
   the completion for command 2 was handled in the previous IRQ
   handler invocation.  So this time the IRQ handler has nothing to
   do, but the SDB FIS RX area shows that this IRQ is for an SDB FIS
   which includes the completion for command 2, and that triggers the
   spurious NCQ completion condition.

8. It goes on.

So, trying to detect spurious completions using the IRQ and the RX FIS
area turns out to be stupid as they aren't interlocked.  I'll soon post
a patch to remove the spurious completion check and the blacklist
entries that resulted from it.
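
The XXX condition can be replayed in user space against the values in
the log above.  What follows is only a rough sketch, not driver code:
sdb_fis_outside_sactive() and first_flagged_event() are hypothetical
helpers, and the table simply copies the sactive and SDB FIS
completion values from lines A to G.  The test itself mirrors the
"f[1] & ~sactive" check used by the debug patch.

```c
#include <stddef.h>
#include <stdint.h>

/* One logged SDB FIS event: the driver's bitmap of in-flight NCQ tags
 * (sactive) and the completion bitmap carried by the FIS (f[1]).
 * The struct is illustrative only; values come from log lines A-G.
 */
struct sdb_event {
	char label;
	uint32_t sactive;
	uint32_t fis_done;
};

#define SDB_LOG_LEN 7

const struct sdb_event sdb_log[SDB_LOG_LEN] = {
	{ 'A', 0x3, 0x2 },
	{ 'B', 0x7, 0x1 },
	{ 'C', 0x7, 0x2 },
	{ 'E', 0x1, 0x4 },	/* D is the YYY line, not an SDB FIS event */
	{ 'F', 0x1, 0x1 },
	{ 'G', 0x7, 0x1 },
	{ 0, 0, 0 },		/* terminator, never flagged */
};

/* Hypothetical helper: nonzero when the FIS reports a tag that the
 * driver no longer considers in flight.
 */
int sdb_fis_outside_sactive(uint32_t fis_done, uint32_t sactive)
{
	return (fis_done & ~sactive) != 0;
}

/* Returns the label of the first event that trips the check, or 0 if
 * none does.
 */
char first_flagged_event(const struct sdb_event *ev, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (sdb_fis_outside_sactive(ev[i].fis_done, ev[i].sactive))
			return ev[i].label;
	return 0;
}
```

Only E trips the check: by then the previous handler invocation has
already completed command 2, so bit 2 of the FIS is no longer set in
sactive even though the drive did nothing wrong.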

Thanks a lot.

-- 
tejun
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4688dbf..9f9a658 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1664,6 +1664,10 @@ static void ahci_port_intr(struct ata_port *ap)
 	}
 
 	if (status & PORT_IRQ_SDB_FIS) {
+		const __le32 *f = pp->rx_fis + RX_FIS_SDB;
+		u32 t = le32_to_cpu(f[1]);
+		int i;
+
 		/* If SNotification is available, leave notification
 		 * handling to sata_async_notification().  If not,
 		 * emulate it by snooping SDB FIS RX area.
@@ -1686,6 +1690,32 @@ static void ahci_port_intr(struct ata_port *ap)
 			if (f0 & (1 << 15))
 				sata_async_notification(ap);
 		}
+
+		if (le32_to_cpu(f[1]) & ~pp->active_link->sactive)
+			ata_link_printk(pp->active_link, KERN_INFO,
+				"XXX issue=0x%x SAct=0x%x sactive=0x%x SDB FIS=%08x:%08x\n",
+				readl(port_mmio + PORT_CMD_ISSUE),
+				readl(port_mmio + PORT_SCR_ACT),
+				pp->active_link->sactive,
+				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
+		else
+			ata_link_printk(pp->active_link, KERN_INFO,
+				"MON issue=0x%x SAct=0x%x sactive=0x%x SDB FIS=%08x:%08x\n",
+				readl(port_mmio + PORT_CMD_ISSUE),
+				readl(port_mmio + PORT_SCR_ACT),
+				pp->active_link->sactive,
+				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
+
+		for (i = 0; i < 100; i++) {
+			if (t != le32_to_cpu(f[1])) {
+				ata_link_printk(pp->active_link, KERN_INFO,
+						"YYY 0x%x - 0x%x\n",
+						t, le32_to_cpu(f[1]));
+				break;
+			}
+			udelay(1);
+			cpu_relax();
+		}
 	}
 
 	/* pp->active_link is valid iff any command is in flight */
@@ -1746,7 +1776,7 @@ static void ahci_port_intr(struct ata_port *ap)
 			 * with HSM violation.  EH will turn off NCQ
 			 * after several such failures.
 			 */
-			ata_ehi_push_desc(ehi,
+			/*			ata_ehi_push_desc(ehi,
 				"spurious completions during NCQ "
 				"issue=0x%x SAct=0x%x FIS=%08x:%08x",
 				readl(port_mmio + PORT_CMD_ISSUE),
@@ -1754,7 +1784,7 @@ static void ahci_port_intr(struct ata_port *ap)
 				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
 			ehi->err_mask |= AC_ERR_HSM;
 			ehi->action |= ATA_EH_SOFTRESET;
-			ata_port_freeze(ap);
+			ata_port_freeze(ap);*/
 		} else {
 			if (!pp->ncq_saw_sdb)
 				ata_port_printk(ap, KERN_INFO,


[PATCH #upstream-fixes] libata: kill spurious NCQ completion detection

2007-12-06 Thread Tejun Heo
Spurious NCQ completion detection implemented in ahci was incorrect.
On AHCI, receiving and processing FISes and raising interrupts are not
interlocked, so spurious interrupts are expected.

For example, if an interrupt occurs while the interrupt handler is
running and the running handler handles the event the new IRQ
indicated, the handler will be executed again after it finishes
because the IRQ pending bit was set by the new interrupt, but there
won't be anything to process.
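
The sequence can be modelled in a few lines of user-space C.  This is
a toy sketch under stated assumptions, not the ahci code: struct
port_model, its fields, and the helper functions are all hypothetical
names, and the handler is split into an "ack" half and a "process"
half only so that a second completion can be placed between them.

```c
#include <stdint.h>

/* Toy model of one port.  Field names are illustrative and do not
 * follow the AHCI register layout.
 */
struct port_model {
	int irq_pending;	/* latched interrupt status */
	uint32_t hw_sact;	/* tags the device still has outstanding */
	uint32_t sw_active;	/* tags the driver believes are in flight */
};

struct overlap_result {
	int reran;	/* 1 if the handler was invoked a second time */
	int first;	/* commands completed by the first invocation */
	int second;	/* commands completed by the second, -1 if none */
};

/* Device side: finish a tag and raise the interrupt. */
void device_complete(struct port_model *p, unsigned int tag)
{
	p->hw_sact &= ~(UINT32_C(1) << tag);
	p->irq_pending = 1;
}

/* Handler, first half: acknowledge the interrupt status. */
void irq_ack(struct port_model *p)
{
	p->irq_pending = 0;
}

/* Handler, second half: complete every tag the device has dropped.
 * Returns how many commands this invocation completed.
 */
int irq_process(struct port_model *p)
{
	uint32_t done = p->sw_active & ~p->hw_sact;
	int n = 0;

	p->sw_active &= ~done;
	for (; done; done &= done - 1)
		n++;
	return n;
}

/* Scenario: a second completion lands after the ack but before the
 * handler reads the outstanding-tag bitmap.
 */
struct overlap_result run_overlap_scenario(void)
{
	struct port_model p = { 0, 0x6, 0x6 };	/* tags 1 and 2 in flight */
	struct overlap_result r = { 0, 0, -1 };

	device_complete(&p, 1);		/* raises the IRQ */
	irq_ack(&p);			/* handler starts, status cleared */
	device_complete(&p, 2);		/* arrives while handler runs */
	r.first = irq_process(&p);	/* picks up both tags */

	if (p.irq_pending) {		/* still latched: handler reruns */
		r.reran = 1;
		irq_ack(&p);
		r.second = irq_process(&p);
	}
	return r;
}
```

In this model the first invocation completes both commands and the
second one runs with nothing left to do, which is the harmless
spurious interrupt described above.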

Please read the following message for more information.

  http://article.gmane.org/gmane.linux.ide/26012

This patch...

* Removes all spurious IRQ whining from ahci.  Spurious NCQ completion
  detection was completely wrong.  Spurious D2H Register FIS detection
  taught us that some early drives send spurious D2H Register FISes
  with the I bit set while NCQ commands are in progress, but none of
  the recent drives does that, and even the ones which show such
  behavior can do NCQ fine.

* Kills all NCQ blacklist entries which were added because of spurious
  NCQ completions.  I tracked down each commit and verified that all
  the removed entries were actually added because of spurious
  completions.

  WD740ADFD-00NLR1 wasn't deleted but moved upward because the drive
  not only had spurious NCQ completions but is also slow on sequential
  data transfers if NCQ is enabled.

  Maxtor 7V300F0 was added by 0e3dbc01d53940fe10e5a5cfec15ede3e929c918
  from Alan Cox.  Searching the mailing list, I could only find
  evidence that the drive had trouble with spurious completions.
  This entry needs to be verified and removed if it doesn't have other
  NCQ related problems.

Signed-off-by: Tejun Heo [EMAIL PROTECTED]
Cc: Alan Cox [EMAIL PROTECTED]
---
Alan, can you please check why 7V300F0 was added?  Thanks a lot.

 drivers/ata/ahci.c|   74 +-
 drivers/ata/libata-core.c |   18 ---
 2 files changed, 4 insertions(+), 88 deletions(-)

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 4688dbf..7ef497a 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -1638,7 +1638,7 @@ static void ahci_port_intr(struct ata_port *ap)
 	struct ahci_host_priv *hpriv = ap->host->private_data;
 	int resetting = !!(ap->pflags & ATA_PFLAG_RESETTING);
 	u32 status, qc_active;
-	int rc, known_irq = 0;
+	int rc;
 
 	status = readl(port_mmio + PORT_IRQ_STAT);
 	writel(status, port_mmio + PORT_IRQ_STAT);
@@ -1696,80 +1696,12 @@ static void ahci_port_intr(struct ata_port *ap)
 
 	rc = ata_qc_complete_multiple(ap, qc_active, NULL);
 
-	/* If resetting, spurious or invalid completions are expected,
-	 * return unconditionally.
-	 */
-	if (resetting)
-		return;
-
-	if (rc > 0)
-		return;
-	if (rc < 0) {
+	/* while resetting, invalid completions are expected */
+	if (unlikely(rc < 0 && !resetting)) {
 		ehi->err_mask |= AC_ERR_HSM;
 		ehi->action |= ATA_EH_SOFTRESET;
 		ata_port_freeze(ap);
-		return;
-	}
-
-	/* hmmm... a spurious interrupt */
-
-	/* if !NCQ, ignore.  No modern ATA device has broken HSM
-	 * implementation for non-NCQ commands.
-	 */
-	if (!ap->link.sactive)
-		return;
-
-	if (status & PORT_IRQ_D2H_REG_FIS) {
-		if (!pp->ncq_saw_d2h)
-			ata_port_printk(ap, KERN_INFO,
-				"D2H reg with I during NCQ, "
-				"this message won't be printed again\n");
-		pp->ncq_saw_d2h = 1;
-		known_irq = 1;
-	}
-
-	if (status & PORT_IRQ_DMAS_FIS) {
-		if (!pp->ncq_saw_dmas)
-			ata_port_printk(ap, KERN_INFO,
-				"DMAS FIS during NCQ, "
-				"this message won't be printed again\n");
-		pp->ncq_saw_dmas = 1;
-		known_irq = 1;
 	}
-
-	if (status & PORT_IRQ_SDB_FIS) {
-		const __le32 *f = pp->rx_fis + RX_FIS_SDB;
-
-		if (le32_to_cpu(f[1])) {
-			/* SDB FIS containing spurious completions
-			 * might be dangerous, whine and fail commands
-			 * with HSM violation.  EH will turn off NCQ
-			 * after several such failures.
-			 */
-			ata_ehi_push_desc(ehi,
-				"spurious completions during NCQ "
-				"issue=0x%x SAct=0x%x FIS=%08x:%08x",
-				readl(port_mmio + PORT_CMD_ISSUE),
-				readl(port_mmio + PORT_SCR_ACT),
-				le32_to_cpu(f[0]), le32_to_cpu(f[1]));
-			ehi->err_mask |= AC_ERR_HSM;
-			ehi->action |= ATA_EH_SOFTRESET;
-			ata_port_freeze(ap);
-