Re: libata interface fatal error

2007-06-16 Thread Florian Effenberger

Hi there,

we tested out two 600W Fortron PSUs, also tried a BIOS update. Didn't 
work out.


We also tried the jumper on the disks labelled SSP (Spread Spectrum 
Clocking), which didn't work either.


What seemed to help at least a little bit is to use the 12V connector on 
the board, which is normally intended for graphics cards.


The best test to reproduce the problem, according to a colleague also 
working on the machine, is a cat /dev/zero > zero.bin
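
The cat command above keeps writing until the filesystem is full. A bounded variant can be sketched as follows. This is only an illustration: the function name write_zeros, the chunk size, and the total size are assumptions of this sketch, not something that was run on the machine in question.

```python
import os


def write_zeros(path, total_bytes, chunk_size=1 << 20):
    """Write total_bytes of zero bytes to path in chunk_size pieces.

    Returns the number of bytes written.
    """
    chunk = bytes(chunk_size)  # reusable buffer filled with zero bytes
    written = 0
    with open(path, "wb") as out:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            out.write(chunk[:n])
            written += n
        out.flush()
        try:
            # Ask the kernel to push the data to the disk, not just the page cache.
            os.fsync(out.fileno())
        except OSError:
            pass  # some targets (e.g. character devices) do not support fsync
    return written
```

For example, write_zeros("zero.bin", 8 * 1024**3) would write 8 GiB of sustained sequential data and then stop, which leaves the disk usable for the next run.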


Do you still think it is a PSU or hardware problem? Do you need more 
details/logs?


Thanks!
Florian
-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Combined mode quirk removal kills performance

2007-06-16 Thread Jeff Garzik

Andrew Morton wrote:

> So I revisited [...]
> http://bugzilla.kernel.org/show_bug.cgi?id=8636
>
> I don't see why it's not-a-bug.  The second guy (Stephen Clark) had to toss
> out the FC6 kernel and build his own kernel to fix this regression.


(I hope you don't mind me copying linux-ide)

The combined mode removal changed the driver load configuration.  Some 
distros updated with that knowledge, but apparently Fedora did not.


Combined mode would explicitly reserve ports for libata in pci/quirks.c, 
a horrible layering violation that created special case module load 
order dependencies.


The change made libata and the old-IDE driver behave like normal Linux 
drivers.  They can both grab an entire PCI device and drive it 
correctly, so -- just like every other situation where two drivers can 
match the same PCI ID -- the one that loads first wins.


Kernel configs need to be tweaked a bit, due to this change.  That's why 
it was communicated in advance (but poorly, as it appears from your 
questions and existing bug reports).  Two common results appear in bug 
reports:


1) One possible result of a complete lack of kconfig tweaking (blindly 
hitting enter during 'make oldconfig') is falling back to the 
most-compatible configuration, the legacy IDE driver (either libata or 
old-IDE), with resultant slow performance.


2) no root! panic! and similar I-can't-find-your-hard-drive results for 
people with hardcoded root= configurations, for the minority where a 
device moved from /dev/hdX -> /dev/sdX, or vice versa.


The damage is hoped to be limited to:

* Intel ICH5/6[/7?] users with combined mode enabled, which is a 
not-small subset of all ICH[567] users.


* Users that did /not/ choose the combined_mode=libata kernel command 
line option, a popular option for restoring performance /broken/ by 
running two drivers in tandem [i.e. the old way, recently removed].


* In the combined mode configuration, one device is /dev/hdX (often the 
CD-ROM) and one device is /dev/sdX, so only one of those devices will 
move.  Standard LVM and mount-by-label/uuid found in standard distro 
installs make this move seamless for many.


* ...in distros that hopefully took stock of their compatibility 
picture, and modified their kernel configs and drivers accordingly. 
Some distros were defaulting to combined_mode=libata (==libata drives 
all applicable Intel ICHx IDE devices), others were not.  That affects 
the decision about kernel config changes.
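
To see whether a given install is exposed to the /dev/hdX <-> /dev/sdX move described above, one can look for mounts that name a raw device instead of a label, UUID, or LVM volume. The following is a minimal sketch under stated assumptions: the function name, the regular expression, and the sample fstab contents are all hypothetical, and it does not look at root= on the kernel command line.

```python
import re

# Matches raw device names such as /dev/hda1 or /dev/sdb. LABEL=, UUID= and
# LVM paths (/dev/mapper/..., /dev/<vg>/<lv>) deliberately do not match.
RAW_DEVICE = re.compile(r"^/dev/(hd|sd)[a-z]+\d*$")


def hardcoded_devices(fstab_text):
    """Return (device, mount_point) pairs that name a raw hdX/sdX device."""
    hits = []
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) >= 2 and RAW_DEVICE.match(fields[0]):
            hits.append((fields[0], fields[1]))
    return hits


# Hypothetical fstab contents, for illustration only.
SAMPLE = """\
# hypothetical /etc/fstab
/dev/hda1                 /             ext3     defaults   1 1
LABEL=/boot               /boot         ext3     defaults   1 2
/dev/VolGroup00/LogVol01  swap          swap     defaults   0 0
/dev/hdc                  /media/cdrom  iso9660  ro,noauto  0 0
"""

if __name__ == "__main__":
    for device, mount_point in hardcoded_devices(SAMPLE):
        print(device, "mounted on", mount_point, "would break if the device is renamed")
```

Entries reported by such a check are the ones that stop working when the owning driver changes; label, UUID, and LVM entries are unaffected.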


Jeff







[PROBLEM]: hdparm strange behaviour for 2.6.21 and later

2007-06-16 Thread Thanos Kyritsis
Hello,

starting with kernel 2.6.21 and up to kernel 2.6.22-rc4, I'm having the 
following problem:

/etc/rc.d/rc.local contains the following:
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hda
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdb
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdc
/usr/sbin/hdparm -q -d1 -q -u1 -q -c1 -q -k1 /dev/hdd

(I'm using Slackware, no Debian-style automated hdparm.conf is running 
during bootup, that's why these are in rc.local)

The above seems to somehow lock up the boot procedure just at the point 
where rc.local gets executed, so the system never reaches the login prompt.
All drivers (kernelspace) and system daemons (userspace) before rc.local 
do normally load, but there are no strange messages in the console or in 
the system logs and because I cannot login, I cannot trace it any further. 
I believe the kernel is in running state because the machine responds to 
ICMP pings from the ethernet, but since the login prompt is not up, the 
already running sshd/telnetd do not provide any help.

The strange thing is that if I remove all the quiet options (-q) from the 
above commands, everything works as it should. Furthermore, if I 
comment them out of rc.local, then boot, log in, and execute them by 
hand (with -q), everything again works as it should. The lockup only happens 
if I run 2 or more hdparm commands; if I leave only one hdparm command 
(it doesn't matter which one) in rc.local (with -q), it works.

This does not happen with kernels up to 2.6.20.14. I have been using the 
same hdparm options for over a year, and the hardware hasn't changed 
at all.

Speaking of hardware:
Pentium 4 HT, ICH5 IDE Controller, running on SMP/HT kernel 
(ticks enabled @ 1000 Hz, PREEMPT/low-latency is on, 
CONFIG_BLK_DEV_IDEDMA=y).
hda and hdb are Hard drives.
hdc and hdd are DVD drives (hdc is a recorder).


Can this be regarded as a kernel bug at all? Can I do something to properly 
debug it and help you out?
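
One low-tech way to find out which command never returns is to log a START line to disk before each hdparm call and an END line after it, syncing the log each time so the record survives a hard reset. The sketch below is an assumption-laden illustration, not a tested fix: run_logged and its arguments are hypothetical names, the timeout value is arbitrary, and a process stuck inside the kernel may ignore the kill.

```python
import os
import subprocess
import sys


def _sync(f):
    """Flush f and ask the kernel to write it out, so the log survives a hang."""
    f.flush()
    try:
        os.fsync(f.fileno())
    except OSError:
        pass  # some special files cannot be fsynced


def run_logged(commands, log_path, timeout=10.0):
    """Run each command in turn, logging START/END lines around it.

    Returns a list of outcome strings, one per command.
    """
    outcomes = []
    with open(log_path, "a") as log:
        for cmd in commands:
            log.write("START " + " ".join(cmd) + "\n")
            _sync(log)
            proc = subprocess.Popen(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
            try:
                outcome = "exit %d" % proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # may not take effect if the process is stuck in the kernel
                outcome = "timeout"
            log.write("END " + outcome + "\n")
            _sync(log)
            outcomes.append(outcome)
    return outcomes


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage: script.py LOGFILE /dev/hda /dev/hdb ...
    flags = ["-q", "-d1", "-q", "-u1", "-q", "-c1", "-q", "-k1"]
    cmds = [["/usr/sbin/hdparm"] + flags + [dev] for dev in sys.argv[2:]]
    print(run_logged(cmds, sys.argv[1]))
```

After a locked-up boot, a START line without a matching END line in the log points at the command that hung.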

I posted it here because I couldn't help noticing the following inside .21's 
Changelog:

commit 8799620400b0b1a4729d8be828b5bfb3d2a8db1a
Author: Bartlomiej Zolnierkiewicz [EMAIL PROTECTED]
Date:   Mon Mar 26 23:03:19 2007 +0200

ide: fix locking for manual DMA enable/disable (hdparm -d)

Since hwif->ide_dma_check and hwif->ide_dma_on never queue any commands
(ide_config_drive_speed() sets transfer mode using polling and has no error
recovery) we are safe with setting hwgroup->busy for the time while DMA
setting for a drive is changed (so it won't race against I/O commands in 
fly).

I audited briefly all ->ide_dma_check/->ide_dma_on/->tuneproc/->speedproc
implementations and they all look OK wrt to this change.

This patch finally allowed me to close kernel bugzilla bug #8169
(once again thanks to Patrick Horn for reporting the issue & testing 
patches).

Cc: Sergei Shtylyov [EMAIL PROTECTED]
Cc: Alan Cox [EMAIL PROTECTED]
Signed-off-by: Bartlomiej Zolnierkiewicz [EMAIL PROTECTED]


-- 
Thanos Kyritsis djart at linux.gr

- What's your ONE purpose in life ?
- To explode, of course! ;-)


Re: Disk errors reported on boot - kernel 2.6.18

2007-06-16 Thread Alan Cox
> I recently installed a new disk on the same computer. This one is a Hitachi
> HDS721616PLAT80.
>
> When this disk is connected to the ICH5 controller, the following error
> lines appear on boot:
>
> ---
> hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> hda: drive_cmd: error=0x04 { DriveStatusError }
> ide: failed opcode was: 0xb0

Your new drive rejected a SMART command. It was asked (by user space I
suspect - i.e. smartd) to do some kind of SMART operation and reported
51/04 - which is basically 'I do not know/support the command you have
asked me to perform' - and opcode 0xB0 is SMART.
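
The 51/04 reading can be reproduced by decoding the two registers bit by bit. This is a minimal sketch: the bit positions follow the classic ATA status register layout, the labels are modelled on the log lines quoted above, and the function names are hypothetical.

```python
# ATA status register bits, highest bit first. The labels imitate the style
# of the old IDE driver's log output and are an assumption of this sketch.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERR = 0x01           # status bit: the error register is valid
ABRT = 0x04          # error bit: command aborted (logged as DriveStatusError)
SMART_OPCODE = 0xB0  # ATA SMART command


def decode_status(status):
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def command_aborted(status, error):
    """True if the drive flagged an error and the error is 'command aborted'."""
    return bool(status & ERR) and bool(error & ABRT)


if __name__ == "__main__":
    print(decode_status(0x51))
    print(command_aborted(0x51, 0x04))
```

An aborted command with no other error bits set is how a drive refuses a request it does not support or has disabled, which is why it does not indicate a failing disk by itself.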

> The drive is working perfectly. I even verified it with a tool provided by
> the manufacturer. SMART also does not report any error.
> The same disk gives no error whatsoever when connected to a ITE8212 IDE
> Controller present on the same computer.

The IT8212 with RAID firmware doesn't support SMART, so any SMART
configuration funnies would be hidden.

Alan


Re: Machine hanging on synchronize cache on shutdown 2.6.22-rc4-git[45678]

2007-06-16 Thread Mikael Pettersson
On Sat, 16 Jun 2007 15:52:33 +0400, Brad Campbell wrote:
> I've got a box here based on current Debian Stable.
> It's got 15 Maxtor SATA drives in it on 4 Promise TX4 controllers.
>
> Using kernel 2.6.21.x it shuts down, but of course with a huge clack as 15
> drives all do emergency head parks simultaneously. I thought I'd upgrade to
> 2.6.22-rc to get around this but the machine just hangs up hard apparently
> trying to sync cache on a drive.
>
> I've run this process manually, so I know it is being performed properly.
>
> Prior to shutdown, all nfsd processes are stopped, filesystems unmounted and
> md arrays stopped.
> /proc/mdstat shows
> [EMAIL PROTECTED]:~# cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> unused devices: <none>
> [EMAIL PROTECTED]:~#
>
> Here is the final hangup.
>
> http://www.fnarfbargle.com/CIMG1029.JPG

Something sent a command to the disk on ata15 after the PHY had been
offlined and the interface had been put in SLUMBER state (SStatus 614).
Consequently the command timed out. Libata tried a soft reset, and then
a hard reset, after which the machine hung.
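
The reading of SStatus 614 above follows from the layout of the SATA SStatus register, which packs three 4-bit fields: DET (device detection), SPD (negotiated speed) and IPM (interface power management). The sketch below assumes the logged value is hexadecimal; the function name and the descriptive strings are choices made for this illustration.

```python
# Field values of the SATA SStatus register; anything not listed is
# reported as "reserved" by this sketch.
DET = {
    0x0: "no device detected, no PHY communication",
    0x1: "device detected, no PHY communication",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {
    0x0: "no negotiated speed",
    0x1: "Gen1 (1.5 Gbps)",
    0x2: "Gen2 (3.0 Gbps)",
}
IPM = {
    0x0: "device not present",
    0x1: "active",
    0x2: "partial",
    0x6: "slumber",
}


def decode_sstatus(value):
    """Split an SStatus value into its DET, SPD and IPM descriptions."""
    det = value & 0xF          # bits 3:0
    spd = (value >> 4) & 0xF   # bits 7:4
    ipm = (value >> 8) & 0xF   # bits 11:8
    return {
        "det": DET.get(det, "reserved"),
        "spd": SPD.get(spd, "reserved"),
        "ipm": IPM.get(ipm, "reserved"),
    }


if __name__ == "__main__":
    print(decode_sstatus(0x614))
```

For 0x614 this gives DET = 4 (PHY offline), SPD = 1 and IPM = 6 (slumber), so any command issued at that point has no link to travel over and can only time out.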

I don't think sata_promise is the guilty party here. Looks like some
layer above sata_promise got confused about the state of the interface.

I did a quick sata_promise test here with kernel 2.6.22-rc4-git8 and FC4
userspace, and there was no problem shutting the machine down.

/Mikael