Hans Reiser wrote:

> FYI.....
>
> Nate, if the 3ware guys want to work with us, happy to do so.
>
> Hans
>
> Nate Amsden wrote:
>
> >hi hans.
> >
> >I'm sure you're a very busy guy so I will try to keep it brief. I have found
> >what appears to me to be a critical bug in the interaction between
> >3ware raid10 IDE arrays and reiserfs 3.5.32. I have already contacted
> >3ware and their engineers are looking into it (no word yet). I hope
> >that you may have some idea on how I can debug this further
> >(I'm not a developer, so I am already near the limit of my abilities).
> >
> >Systems Tested & Confirmed affected
> >=====================================
> >Linux 2.2.19+reiserfs 3.5.32+reiserfs nfsd patch + 3Ware Raid10 arrays
> >(I have 2 such systems)

Hello !

There are some fixes for reiserfs-3.5.32 :
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/fixes-for-3.5.32/

One of them could probably help in this case :
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/fixes-for-3.5.32/symlink.c.dif

All of them are included in the latest reiserfs-3.5.34 as well :
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/linux-2.2.19-reiserfs-3.5.34-patch.bz2

Nate, could you please try these fixes or the latest reiserfs-3.5.34 ?

Thanks,
Yura


>
> >
> >Systems Tested & Confirmed not affected
> >========================================
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid5 arrays
> >(i have 1 system)
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid1 arrays
> >(i have 1 system)
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware raid10 array
> >running on an ext2 filesystem rather than a reiserfs filesystem
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch running on a non
> >3ware raid array (have tested 5 different systems so far) with the reiserfs
> >filesystem
> >2.2.19 + no reiserfs patches running on ext2 filesystem
> >
> >all systems are running the same drivers & kernel patches.
> >
> >Issue
> >===========================================
> >A symbolic link that points to itself causes either an immediate hard
> >lock up, or an immediate system reboot. No errors are logged, system
> >just dies instantly.
> >
> >How to reproduce
> >============================================
> >(running bash, all tests were run as user 'root')
> >cd /path/to/reiserfs/filesystem
> >ln -s link1 link1
> >ls -l lin<HIT TAB FOR FILENAME COMPLETION>
> >(system lockup or reboot at this point)
> >
> >Or, use scp to attempt to copy the symlink (or a directory tree that
> >contains it); the system locks up or reboots when it attempts to access it.
> >
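
For comparison, a filesystem that handles the loop correctly should just
refuse to resolve the link (ELOOP, "Too many levels of symbolic links", as
reported for ext2 further down). Below is a minimal sketch of that check.
The function name check_self_symlink and the scratch directory created with
mktemp are assumptions for illustration, not part of the original report.
On an affected 3ware raid10 + reiserfs system the check itself is expected
to trigger the lockup or reboot, so it should only be pointed at such a
filesystem deliberately.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): create a
# self-referencing symlink in a scratch directory under $1 (default /tmp)
# and report whether the kernel refuses to resolve it.
check_self_symlink() {
    base=${1:-/tmp}
    dir=$(mktemp -d "$base/symloop.XXXXXX") || return 2
    # Same link as in the report: link1 -> link1
    ln -s link1 "$dir/link1"
    # "[ -e ]" follows the link via stat(); on a healthy filesystem the
    # lookup fails with ELOOP and the test is false.
    if [ -e "$dir/link1" ]; then
        result="unexpected: link resolved"
    else
        result="ok: link did not resolve"
    fi
    # rm removes the link itself without following it.
    rm -f "$dir/link1"
    rmdir "$dir"
    echo "$result"
}
```

Calling check_self_symlink /tmp exercises an unaffected filesystem;
check_self_symlink /raid would exercise the reiserfs array from the report.
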
> >Exceptions
> >=============================================
> >Using 'tar' to archive the looping symlink does not cause a crash
> >Using 'rsync' to copy the looping symlink does not cause a crash
> >
> >Full list of kernel patches:
> >=============================
> >Reiserfs 3.5.32
> >Reiserfs 3.5.32 nfsd
> >Openwall v4 (www.openwall.com/linux)
> >Intel EEPRO 100+ 1.17B Update from www.scyld.com
> >Latest 3ware raid driver from www.3ware.com
> >IDE patch from www.linux-ide.org
> >
> >Hardware configuration:
> >(1) Intel P3-800Mhz
> >768MB Memory
> >Intel L440GX+ Motherboard
> >3Ware 6800 Series 8-port IDE Raid card
> >6 x 80GB Maxtor drives connected to 3Ware card in raid 10
> >1 x 30GB Maxtor drive connected to Intel IDE Controller
> >1 x IDE CDROM connected to Intel IDE Controller
> >1 x Floppy
> >2 x Quantum DLT8000 Tape drives(external) hooked to different SCSI
> >channels on the L440GX+
> >Big 4U Rack case with half dozen high RPM fans
> >Climate controlled environment (68 degrees F 24/7), hooked to
> >an APC SmartUPS 1400XLNET+Battery pack (2 hours battery backup)
> >
> >Software Configuration:
> >Debian GNU/Linux 2.2r4
> >30GB Maxtor drive is partitioned up for / /usr /var /home /boot all running EXT2
> >3Ware raid array has a 223GB reiserfs filesystem mounted on /raid
> >3Ware raid array has a 1GB swap partition
> >
> >How i found it
> >============================
> >We use rsync a lot here to back up data to these big 3ware systems mostly
> >for off-site backups. There is a remote server at a colo that is being backed
> >up via rsync; it has about a dozen of these looping symlinks. The
> >remote system is running 100% EXT2. I don't know why it has them or who
> >made them, but they are there. About a week ago i wanted to replicate the
> >data to another server and used scp from one of the 3ware systems to copy the
> >data over. the 3ware system rebooted. i ran more tests locally and caused
> >another 6-7 crashes over a period of a few hours, 100% reproducible. I
> >eventually narrowed it down to the 2 systems that run raid10. The systems
> >are relatively solid otherwise, but doing the above causes an instant
> >crash. I narrowed
> >it down to reiserfs when i attempted to duplicate the bug on a non
> >raid array running EXT2. i saw the system spit out "Too many levels of
> >symbolic links" when it tried to access the file. once i saw that
> >i was able to narrow it down to reiserfs & looping symlinks on the raid
> >array.
> >
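
Since the looping symlinks live on the remote ext2 server, one way to see
exactly which links are involved is to list them there before copying
anything. A rough sketch follows; the function name
list_unresolvable_symlinks is an assumption for illustration. It relies only
on the standard find and test utilities and reports dangling links as well
as looping ones, because both fail to resolve. It follows each link, so it
is meant for the unaffected source machine, not for the raid10 + reiserfs
systems.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print every symlink
# under directory $1 whose target cannot be resolved.
list_unresolvable_symlinks() {
    # -type l selects symlinks without following them; "test -e" follows
    # each link and fails for dangling or looping ones, so the negated
    # -exec lets only those reach -print.
    find "$1" -type l ! -exec test -e {} \; -print
}
```

For example, list_unresolvable_symlinks /path/to/data prints one path per
line, which can then be inspected with ls -l on the ext2 side.
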
> >all of my linux systems run 2.2 kernels, from early tests i decided
> >to push back any initial deployments of 2.4 until sometime in late
> >2002.
> >
> >I tried to include as much useful info as i could, but if you need more
> >info i'd be happy to provide it.
> >
> >happy holidays
> >
> >nate
> >
> >
> >
