Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-02-20 Thread Guillaume Laurès

Hello all,

A bit of update to this issue.

Replacing the cable of the most problematic drive with a new one didn't fix the issue.
I haven't yet been able to swap the power supply for a more powerful one.
However, I temporarily added a PCIe SATA HBA and another drive; the situation was just as bad, though not worse ;-) That HBA only has 2 ports, so I couldn't swap it completely for the nvidia ports.


As a guess I added sata_nv.adma=0 to the kernel command line, and the problem has completely disappeared for 4 weeks now. Removing it from the cmdline just makes it come back.
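
For anyone wanting to try the same workaround, the option just goes on the dom0 kernel's command line in the bootloader config (this assumes sata_nv is built into the kernel rather than loaded as a module):

    # bootloader: append sata_nv.adma=0 to the line that loads the Linux/dom0 kernel
    # (on a GRUB-legacy Xen host that is the "module" line carrying the kernel, not the xen.gz line)

    # after rebooting, confirm the running kernel saw it:
    grep -o 'sata_nv\.adma=0' /proc/cmdline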


Unfortunately I'm running Xen on this host, so I don't think any ADMA patches, if there are any, would be usable for me, at least as long as nothing newer than 2.6.20 is available in Gentoo for Xen...


Regards,

GoM

On 9 Jan 2008 at 01:00, Robert Hancock wrote:

So it seems to be either a cabling problem or a bug with sata_nv? I'm running Gentoo's 2.6.20-xen, and maybe my problem is similar to the sata_nv/ADMA/Samsung problem reports I can see on the net?


I don't think it's the same problem. In that case the controller  
appears to indicate that it didn't even start processing the  
command. In this case it's indicating that the command was sent to  
the drive and is still waiting for it to indicate completion.


It could be a cabling problem, yes. Also, are you sure your power  
supply is able to handle the power draw of that many drives?


Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-01-08 Thread Robert Hancock



Guillaume Laurès wrote:


On 8 Jan 2008 at 01:29, Robert Hancock wrote:


From your report:

ata5: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 
0x1501000 status 0x400

ata5: CPB 0: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 1: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 2: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 3: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 4: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 5: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 6: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 7: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 8: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 9: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 10: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 11: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 12: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 13: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 14: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 15: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 16: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 17: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 18: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 19: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 20: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 21: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 22: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 23: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 24: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 25: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 26: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 27: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 28: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 29: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 30: ctl_flags 0x1f, resp_flags 0x1
ata5: Resetting port
ata5.00: exception Emask 0x0 SAct 0x1f02 SErr 0x0 action 0x2 frozen
ata5.00: cmd 60/40:08:8f:eb:67/00:00:03:00:00/40 tag 1 cdb 0x0 data 
32768 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:40:17:eb:67/00:00:03:00:00/40 tag 8 cdb 0x0 data 
4096 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/18:48:47:eb:67/00:00:03:00:00/40 tag 9 cdb 0x0 data 
12288 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:50:77:eb:67/00:00:03:00:00/40 tag 10 cdb 0x0 data 
4096 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:58:87:eb:67/00:00:03:00:00/40 tag 11 cdb 0x0 data 
4096 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/48:60:d7:eb:67/00:00:03:00:00/40 tag 12 cdb 0x0 data 
36864 in

 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5: soft resetting port

The CPB resp_flags 0x2 entries are ones where the drive has been sent 
the request and the controller is waiting for a response. The timeout 
is 30 seconds, so that means the drive failed to service those queued 
commands for that length of time.


It may be that your drive has a poor NCQ implementation that can 
starve some of the pending commands for a long time under heavy load?
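
As an aside for anyone reading the log above, the stuck commands can be decoded from the libata taskfile dump (the layout is command/feature:nsect:lbal:lbam:lbah/hob fields/device); taking the first timed-out command as an example:

    # ata5.00: cmd 60/40:08:8f:eb:67/00:00:03:00:00/40 tag 1 ... data 32768 in
    #   command 0x60 -> READ FPDMA QUEUED, i.e. an NCQ read
    #   feature 0x40 -> sector count: 64 x 512 bytes = 32768 bytes ("data 32768 in")
    #   nsect   0x08 -> NCQ tag in bits 7:3: 0x08 >> 3 = 1 ("tag 1")
    #   LBA 8f:eb:67 with hob_lbal 03 -> starting LBA 0x0367eb8f

So these are all queued reads that the drive accepted but never completed within the 30-second timeout.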


Thanks for your answer. That could very well be the problem, as all 4 
drives on the sata_nv HBA are older than the sata_sil ones.
I'm going to swap them to see if the problem is reproducible on the 
sata_sil HBA. (see test #2)


- Test #1
I switched the scheduler to CFQ on all disks and ran the file reorganizer all night. In the morning I ended up with a drive missing from the array, and lots of SATA port resets with plenty of resp_flags 0x2 again; see the attached log.



BTW, around "md2: recovery done" you can see that a second disk failed before the first one was completely rebuilt.



- Test #2

I swapped all the drives with this scheme: sda->sdh, sdb->sdg, sdc->sdf, ..., sdg->sdb, sdh->sda. So now all the newer drives are attached through sata_nv (ata5-ata8) and the oldest through sata_sil (ata1-ata4).

I kept the anticipatory scheduler and ran xfs_fsr. 60 seconds later it hung, still on ata5/ata6, i.e. sata_nv. Drive reconstruction...

Then I switched the scheduler to CFQ. xfs_fsr + 10 seconds: another freeze. No drive was lost from the array this time, though. See the dmesg below.



So it seems to be either a cabling problem or a bug with sata_nv? I'm running Gentoo's 2.6.20-xen, and maybe my problem is similar to the sata_nv/ADMA/Samsung problem reports I can see on the net?



I don't think it's the same problem. In that case the controller appears 
to indicate that it didn't even start processing the command. In this 
case it's indicating that the command was sent to the drive and is still 
waiting for it to indicate completion.


It could be a cabling problem, yes. Also, are you sure your power supply 
is able to handle the power draw of that many drives?
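
Back-of-the-envelope, and very much ballpark (the real numbers depend on the drive models, so check the datasheets):

    # a typical 3.5" 7200 rpm drive draws roughly 2 A on the 12 V rail at spin-up
    # and several watts while seeking
    # 8 drives spinning up together: ~8 x 2 A = ~16 A at 12 V (~190 W), before the
    # CPU, board and fans are counted; a marginal 12 V rail can show up as drives
    # dropping out only under heavy load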



Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-01-08 Thread Chuck Ebbert
On 01/07/2008 03:22 PM, Guillaume Laurès wrote:
> However even with CFQ/NOOP I keep getting soft resets, only on the sata_nv
> SATA ports, when under load. Any thoughts?

Try the adma=0 option for the driver?
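
If sata_nv is built as a module rather than into the kernel, the equivalent is a module option (the file name below is just a common convention, adjust to your distro):

    # /etc/modprobe.d/sata_nv.conf (or the distro's modprobe.conf)
    options sata_nv adma=0

    # after reloading the module, check what the driver actually used (if exported):
    cat /sys/module/sata_nv/parameters/adma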


Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-01-08 Thread J.A. Magallón
On Tue, 8 Jan 2008 15:52:35 +0100, Guillaume Laurès <[EMAIL PROTECTED]> wrote:


> > ata5.00: exception Emask 0x0 SAct 0x1f02 SErr 0x0 action 0x2 frozen
> > ata5.00: cmd 60/40:08:8f:eb:67/00:00:03:00:00/40 tag 1 cdb 0x0 data  
> > 32768 in
> >  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Perhaps it is a generic libata bug, I have similar problems without
any array.
See this message:

http://marc.info/?l=linux-kernel&m=119975354031937&w=2

It's an ICH5 controller, not nv. This was rc7.
I have now gone back to rc6 to see if problems persist.

--
J.A. Magallon jamagallon()ono!com \   Software is like sex:
 \ It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam05 (gcc 4.2.2 20071128 (4.2.2-2mdv2008.1)) SMP PREEMPT


Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-01-07 Thread Robert Hancock

Guillaume Laurès wrote:

Hello,

Dear kernel developers, my dmesg asked me to report this, so here I go ;)
Here is what I found in my dmesg: "anticipatory: forced dispatching is 
broken (nr_sorted=1), please report this".


- First, let's talk about the machine: it's pushed quite hard, so maybe the 
cause is me doing something wrong rather than a bug in the kernel.


I got this alert on a dual-core amd64 Xen host. It has 8 SATA drives making up a RAID 5 array. This array is exported as a virtual block device to one of the virtual machines, an Openfiler appliance. Openfiler then manages logical volumes on this device, including an XFS partition shared via NFS. Two MythTV hosts continuously write MPEG2 TV shows to it (1 to 4 GB each).
Still following? Here is a summary: MPEG2 files -> NFS -> XFS -> LVM -> Xen VBD -> RAID 5 -> 8x SATA disks.
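
For context, handing the array to the Openfiler domU is just a phy: VBD in the xm-style guest config, something along these lines (device and file names are placeholders, not my exact ones):

    # /etc/xen/openfiler (excerpt)
    disk = [ 'phy:/dev/md2,xvda,w' ]    # whole RAID 5 array, writable, seen as xvda in the domU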


- Next, the symptoms.

This setup is only 2 weeks old. Behavior was quite good, except for some unexplained failures of the sata_nv-attached disks. Not always the same disk, and never any of the disks attached through the sata_sil HBA.
Eventually a second disk would go down before the end of the RAID reconstruction (again a sata_nv-attached one).
Since the disks showed nothing wrong in smartmontools, I re-added them each time. So far the RAID array has been resilient enough to be fully recovered; mdadm --force and xfs_check are my friends ;-)
It seems to happen more often now that the XFS partition is quite heavily fragmented, and I can't even run the defragmenter without a quick failure.
I didn't pay much attention to the logs and quickly decided to buy a SATA Sil PCI card to get rid of the Nvidia SATA HBA.
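
Concretely, the recovery each time looks something like this (array, partition and LV names are examples, not my exact layout):

    cat /proc/mdstat                                  # see which member got kicked out
    mdadm /dev/md2 --re-add /dev/sdf3                 # put the "failed" disk back in
    watch -n 5 cat /proc/mdstat                       # follow the rebuild
    # worst case, stop the array and force-assemble from the surviving members:
    mdadm --stop /dev/md2
    mdadm --assemble --force /dev/md2 /dev/sd[a-h]3
    xfs_check /dev/vg/data                            # read-only filesystem check afterwards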


- Now the problem.

Yesterday, however, the MPEG2 streams hung for a few tens of seconds just as usual, but there was no disk failure. The array was still in good shape, although dmesg showed the same "ata[56]: Resetting port", "SCSI errors" etc. fuss.
However, this was new in dmesg: "anticipatory: forced dispatching is broken (nr_sorted=1), please report this". I got 4 identical ones in a row.
Maybe managing 8 block device queues under load with the anticipatory scheduler is too much? I immediately switched to deadline on the 8 disks, and I'll see if it happens again by stressing the whole system more and more.
I have no clue whether anticipatory is a good or a definitely bad choice in my case; can anyone point me to some documentation or good advice?


- How to reproduce.

Here is what I would do (a rough command sketch follows below):
Harness a small CPU with lots of SATA/SCSI drives.
Build a RAID 5 with a big chunk size (1-4 MB) on it.
Make a 50 GB XFS file system with the sunit/swidth options.
Trigger bonnie++ with file sizes between 1 GB and 4 GB and fill the FS to 80-95%, trying to achieve 98%+ fragmentation.

Defrag!
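
A minimal sketch of those steps (all device names, sizes and tuning values are illustrative, not a copy of my setup):

    mdadm --create /dev/md0 --level=5 --raid-devices=8 --chunk=1024 /dev/sd[a-h]1  # 1 MB chunks
    mkfs.xfs -d su=1024k,sw=7 /dev/md0    # stripe unit = chunk size, 7 data disks wide
    mount /dev/md0 /mnt/test
    bonnie++ -d /mnt/test -s 16g -n 64 -u nobody   # repeat until the FS is 80-95% full
    xfs_db -r -c frag /dev/md0            # check the fragmentation level reached
    xfs_fsr -v /mnt/test                  # then defragment while writers keep hammering it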

- Finally the usual bug report stuff is attached.


From your report:

ata5: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 
status 0x400

ata5: CPB 0: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 1: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 2: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 3: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 4: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 5: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 6: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 7: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 8: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 9: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 10: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 11: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 12: ctl_flags 0x1f, resp_flags 0x2
ata5: CPB 13: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 14: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 15: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 16: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 17: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 18: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 19: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 20: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 21: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 22: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 23: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 24: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 25: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 26: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 27: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 28: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 29: ctl_flags 0x1f, resp_flags 0x1
ata5: CPB 30: ctl_flags 0x1f, resp_flags 0x1
ata5: Resetting port
ata5.00: exception Emask 0x0 SAct 0x1f02 SErr 0x0 action 0x2 frozen
ata5.00: cmd 60/40:08:8f:eb:67/00:00:03:00:00/40 tag 1 cdb 0x0 data 32768 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:40:17:eb:67/00:00:03:00:00/40 tag 8 cdb 0x0 data 4096 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/18:48:47:eb:67/00:00:03:00:00/40 tag 9 cdb 0x0 data 12288 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:50:77:eb:67/00:00:03:00:00/40 tag 10 cdb 0x0 data 4096 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata5.00: cmd 60/08:58:87:eb:67/00:00:03:00:00/40 tag 11 cdb 0x0 data 4096 in
 res 

Re: PROBLEM: I/O scheduler problem with an 8 SATA disks raid 5 under heavy load ?

2008-01-07 Thread Guillaume Laurès
Well, I should have tried to answer my own question earlier... but  
there is still a problem.


A quick reading of Wikipedia later, I can tell the following.
First, the anticipatory scheduler seems to be the worst choice when dealing with RAID arrays.

Now, a better choice in my case could be:
- the CFQ scheduler for the array in the dom0;
- and the NOOP scheduler for the VBDs in the domUs (which already seems to be the default there, btw)
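
For completeness, switching schedulers at runtime is just a sysfs write, roughly like this (device names are examples):

    # dom0: set CFQ on every member disk of the array
    for d in /sys/block/sd[a-h]/queue/scheduler; do echo cfq > "$d"; done
    cat /sys/block/sda/queue/scheduler    # the active scheduler is shown in brackets
    # domU: set noop on the virtual block device (the name depends on the frontend)
    echo noop > /sys/block/xvda/queue/scheduler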


However, even with CFQ/NOOP I keep getting soft resets, only on the sata_nv SATA ports, when under load. Any thoughts?



Thanks to all who did not flame me on this one ...;-)

GoM 

