Hans Reiser wrote:
> FYI.....
>
> Nate, if the 3ware guys want to work with us, happy to do so.
>
> Hans
>
> Nate Amsden wrote:
>
> >hi hans.
> >
> >I'm sure you're a very busy guy so i will try to keep it brief. I have found
> >what appears to me to be a critical bug in the interaction between
> >3ware raid10 IDE arrays and reiserfs 3.5.32. I have already contacted
> >3ware and their engineers are looking into it (no word yet). I hope
> >that you may have some idea on how i can debug this further
> >(I'm not a developer so i am already near the limit of my abilities)
> >
> >Systems Tested & Confirmed affected
> >=====================================
> >Linux 2.2.19+reiserfs 3.5.32+reiserfs nfsd patch + 3Ware Raid10 arrays
> >(I have 2 such systems)
Hello!

There are some fixes for reiserfs-3.5.32:
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/fixes-for-3.5.32/
One of them could probably help in this case:
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/fixes-for-3.5.32/symlink.c.dif
All of them are included in the latest reiserfs-3.5.34 as well:
ftp://ftp.namesys.com/pub/reiserfs-for-2.2/linux-2.2.19-reiserfs-3.5.34-patch.bz2

Nate, could you please try these fixes or the latest reiserfs-3.5.34?

Thanks,
Yura

> >
> >Systems Tested & Confirmed not affected
> >========================================
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid5 arrays
> >(i have 1 system)
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware Raid1 arrays
> >(i have 1 system)
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch + 3Ware raid10 array
> >running on an ext2 filesystem rather than a reiserfs filesystem
> >2.2.19+ reiserfs 3.5.32 + reiserfs 3.5.32 nfsd patch running on a non
> >3ware raid array (have tested 5 different systems so far) with the reiserfs
> >filesystem
> >2.2.19 + no reiserfs patches running on ext2 filesystem
> >
> >all systems are running the same drivers & kernel patches.
> >
> >Issue
> >===========================================
> >A symbolic link that points to itself causes either an immediate hard
> >lock up, or an immediate system reboot. No errors are logged, system
> >just dies instantly.
> >
> >How to reproduce
> >============================================
> >(running bash, all tests were run as user 'root')
> >cd /path/to/reiserfs/filesystem
> >ln -s link1 link1
> >ls -l lin<HIT TAB FOR FILENAME COMPLETION>
> >(system lockup or reboot at this point)
> >
> >Or, use scp to attempt to copy the symlink (or a directory tree that
> >contains it), system locks up or reboots when it attempts to access it.
> >
> >Exceptions
> >=============================================
> >Using 'tar' to archive the looping symlink does not cause a crash
> >Using 'rsync' to copy the looping symlink does not cause a crash
> >
> >Full list of kernel patches:
> >=============================
> >Reiserfs 3.5.32
> >Reiserfs 3.5.32 nfsd
> >Openwall v4 (www.openwall.com/linux)
> >Intel EEPRO 100+ 1.17B Update from www.scyld.com
> >Latest 3ware raid driver from www.3ware.com
> >IDE patch from www.linux-ide.org
> >
> >Hardware configuration:
> >(1) Intel P3-800Mhz
> >768MB Memory
> >Intel L440GX+ Motherboard
> >3Ware 6800 Series 8-port IDE Raid card
> >6 x 80GB Maxtor drives connected to 3Ware card in raid 10
> >1 x 30GB Maxtor drive connected to Intel IDE Controller
> >1 x IDE CDROM connected to Intel IDE Controller
> >1 x Floppy
> >2 x Quantum DLT8000 Tape drives (external) hooked to different SCSI
> >channels on the L440GX+
> >Big 4U Rack case with half dozen high RPM fans
> >Climate controlled environment (68 degrees F 24/7), Hooked to
> >an APC SmartUPS 1400XLNET+Battery pack (2 hours battery backup)
> >
> >Software Configuration:
> >Debian GNU/Linux 2.2r4
> >30GB Maxtor drive is partitioned up for / /usr /var /home /boot all running EXT2
> >3Ware raid array has a 223GB reiserfs filesystem mounted on /raid
> >3Ware raid array has a 1GB swap partition
> >
> >How i found it
> >============================
> >We use rsync a lot here to back up data to these big 3ware systems, mostly
> >for off-site backups. there is a remote server at a colo that is being backed
> >up via rsync; it has about a dozen of these looping symlinks. the
> >remote system is running 100% EXT2. I don't know why it has them or who
> >made them, but they are there. About a week ago i wanted to replicate the
> >data to another server and used scp from one of the 3ware systems to copy the
> >data over. the 3ware system rebooted.
> >i ran more tests locally and caused
> >another 6-7 crashes over a period of a few hours, 100% reproducible. I
> >eventually narrowed it down to the 2 systems that run raid10. The systems
> >are relatively solid otherwise, but doing the above causes an instant crash.
> >I narrowed it down to reiserfs when i attempted to duplicate the bug on a non
> >raid array running EXT2. i saw the system spit out "Too many levels of
> >symbolic links" when it tried to access the file. once i saw that
> >i was able to narrow it down to reiserfs & looping symlinks on the raid
> >array.
> >
> >all of my linux systems run 2.2 kernels; from early tests i decided
> >to push back any initial deployments of 2.4 until sometime in late
> >2002.
> >
> >I tried to include as much useful info as i could, but if you need more
> >info i'd be happy to provide it.
> >
> >happy holidays
> >
> >nate
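
For anyone retesting after applying the fixes or moving to 3.5.34, the reproduction steps quoted above can be wrapped in a small script. This is only a sketch under stated assumptions: the function name check_symlink_loop and the scratch directory name looptest.XXXXXX are made up for illustration, and it probes the link with the shell's -e test rather than with ls and tab completion as in the report. On a healthy filesystem the kernel refuses to resolve the link (the "Too many levels of symbolic links" case Nate saw on ext2). On an affected system, resolving the link may still lock up or reboot the machine, so run it only on a test box.

```shell
#!/bin/sh
# Sketch only: check_symlink_loop is a hypothetical helper, not part of
# reiserfs or any existing tool.
# WARNING: per the report, on an affected system resolving the link may
# hang or reboot the machine.

# Usage: check_symlink_loop DIR
# Creates a self-referencing symlink in a scratch directory under DIR and
# reports whether the kernel refuses to resolve it.
check_symlink_loop() {
    dir=$1
    scratch=$(mktemp -d "$dir/looptest.XXXXXX") || return 2

    # Same construction as in the report: a link that points to itself.
    ln -s link1 "$scratch/link1"

    # -L tests the link itself; -e follows it and fails if the target
    # cannot be resolved (as with a symlink loop).
    if [ -L "$scratch/link1" ] && ! [ -e "$scratch/link1" ]; then
        result="ok: loop rejected"
    else
        result="unexpected: link resolved"
    fi

    rm -rf "$scratch"
    echo "$result"
}
```

On the setup described in the report this would be invoked as `check_symlink_loop /raid`; a run that prints "ok: loop rejected" and leaves the machine up would suggest the crash no longer occurs for this access path, though it does not exercise scp or filename completion.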