FYI.....

Nate, if the 3ware guys want to work with us, happy to do so.

Hans

Nate Amsden wrote:

>hi hans.
>
>I'm sure you're a very busy guy, so I will try to keep it brief. I have found
>what appears to me to be a critical bug in the interaction between
>3ware raid10 IDE arrays and reiserfs 3.5.32. I have already contacted
>3ware and their engineers are looking into it (no word yet). I hope
>that you may have some idea of how I can debug this further.
>(I'm not a developer, so I am already near the limit of my abilities.)
>
>Systems Tested & Confirmed affected
>=====================================
>Linux 2.2.19+reiserfs 3.5.32+reiserfs nfsd patch + 3Ware Raid10 arrays
>(I have 2 such systems)
>
>Systems Tested & Confirmed not affected
>========================================
>2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid5 arrays
>(i have 1 system)
>2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid1 arrays
>(i have 1 system)
>2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware raid10 array
>running on an ext2 filesystem rather than a reiserfs filesystem
>2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch running on a non
>3ware raid array (have tested 5 different systems so far) with the reiserfs
>filesystem
>2.2.19 + no reiserfs patches running on ext2 filesystem
>
>all systems are running the same drivers & kernel patches.
>
>Issue
>===========================================
>A symbolic link that points to itself causes either an immediate hard
>lock up, or an immediate system reboot. No errors are logged, system
>just dies instantly.
>
>How to reproduce
>============================================
>(running bash, all tests were run as user 'root')
>cd /path/to/reiserfs/filesystem
>ln -s link1 link1
>ls -l lin<HIT TAB FOR FILENAME COMPLETION>
>(system lockup or reboot at this point)
>
>Or, use scp to attempt to copy the symlink (or a directory tree that
>contains it); the system locks up or reboots when it attempts to access it.
>
>Exceptions
>=============================================
>Using 'tar' to archive the looping symlink does not cause a crash
>Using 'rsync' to copy the looping symlink does not cause a crash
>
>Full list of kernel patches:
>=============================
>Reiserfs 3.5.32
>Reiserfs 3.5.32 nfsd 
>Openwall v4 (www.openwall.com/linux)
>Intel EEPRO 100+ 1.17B Update from www.scyld.com
>Latest 3ware raid driver from www.3ware.com
>IDE patch from www.linux-ide.org
>
>Hardware configuration:
>(1) Intel P3-800Mhz
>768MB Memory
>Intel L440GX+ Motherboard
>3Ware 6800 Series 8-port IDE Raid card
>6 x 80GB Maxtor drives connected to 3Ware card in raid 10
>1 x 30GB Maxtor drive connected to Intel IDE Controller
>1 x IDE CDROM connected to Intel IDE Controller
>1 x Floppy
>2 x Quantum DLT8000 Tape drives (external) hooked to different SCSI
>channels on the L440GX+
>Big 4U Rack case with half dozen high RPM fans
>Climate controlled environment (68 degrees F 24/7), hooked to
>an APC SmartUPS 1400XLNET + battery pack (2 hours battery backup)
>
>Software Configuration:
>Debian GNU/Linux 2.2r4
>30GB Maxtor drive is partitioned up for / /usr /var /home /boot all running EXT2
>3Ware raid array has a 223GB reiserfs filesystem mounted on /raid
>3Ware raid array has a 1GB swap partition
>
>How i found it
>============================
>We use rsync a lot here to back up data to these big 3ware systems, mostly
>for off-site backups. There is a remote server at a colo that is being backed
>up via rsync; it has about a dozen of these looping symlinks. The
>remote system is running 100% EXT2. I don't know why it has them or who
>made them, but they are there. About a week ago i wanted to replicate the 
>data to another server and used scp from one of the 3ware systems to copy the 
>data over. The 3ware system rebooted. I ran more tests locally and caused
>another 6-7 crashes over a period of a few hours, 100% reproducible. I
>eventually narrowed it down to the 2 systems that run raid10. The systems
>are relatively solid otherwise, but doing the above causes an instant crash.
>I narrowed it down to reiserfs when I attempted to duplicate the bug on a
>non-raid array running EXT2. I saw the system spit out "Too many levels of
>symbolic links" when it tried to access the file. Once I saw that,
>I was able to narrow it down to reiserfs & looping symlinks on the raid
>array.
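
A minimal sketch for locating such links before copying a tree, assuming a
POSIX shell with `find` and `readlink` available. The function name
`find_self_links` is made up for illustration. It only catches direct
self-references of the kind created by `ln -s link1 link1`, not longer
cycles, and it has not been tried on the crashing configuration, so it is
probably best run on an unaffected machine such as the ext2 source server.

```shell
#!/bin/sh
# find_self_links DIR (hypothetical helper name)
# Print every symlink under DIR whose target is its own file name.
# find without -L does not follow symlinks, and readlink only reads the
# link's stored target, so the loop itself is never resolved here.
# Assumption: file names do not contain newlines.
find_self_links() {
    find "$1" -type l | while IFS= read -r link; do
        target=$(readlink "$link")
        # Direct self-loop, as in "ln -s link1 link1".
        if [ "$target" = "$(basename "$link")" ]; then
            printf '%s\n' "$link"
        fi
    done
}
```

Usage would look like `find_self_links /path/to/tree`, with the output
reviewed by hand before anything is deleted or excluded from a copy.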
>
>All of my Linux systems run 2.2 kernels; based on early tests, I decided
>to push back any initial deployments of 2.4 until sometime in late
>2002.
>
>I tried to include as much useful info as i could, but if you need more
>info i'd be happy to provide it.
>
>happy holidays
>
>nate
>
>
>


