Christoph Hellwig wrote:
> On Sat, Mar 09, 2002 at 12:14:57PM +0100, Per Jessen wrote:
> 
>>I've just received the most recent c't magazine, which has an
>>interesting article/comparison of the journaled file systems in
>>the linux world - xfs, jfs, ext3, reiser ....
>>
>>JFS doesn't get a particularly honourable report - and the 
>>description and experiences as described in the article certainly don't 
>>match my own.
>>
>>Anyone else read this article ?
>>
> 
> Sure, the c't appeared in my inbox this morning, as it does in
> almost every German-speaking hacker's :)
> 
> But in my regular stress testing using fsstress and cerberus I'm
> absolutely unable to reproduce this using a variety of test environments.
> On the other hand JFS <= 1.0.14 had a very similar bug when built
> modular (I really wonder why the QA of the distributors shipping JFS
> didn't find it - once I built a modular JFS on my test machines I was
> able to trigger it within a few minutes).
> 
> I've Cc'ed Oliver so he can comment on the exact test environment
> (Kernel, Hardware, JFS-Version, Config-Options).
> 
>       Christoph
> 
> 

Hi,

here are some details regarding my test environment.

First of all, I have to say that I was quite surprised by the unexpected 
problems with JFS. (Btw, I reported everything in detail to Steve Best 
and discussed the results with him.)

I used a Pentium 4, 1400 MHz, Intel chip set, 256 MByte RAM. The system 
survived one weekend of continuous kernel compiling and a complete 
Cerberus run without problems and did not exhibit any irregularities 
with the other file systems in any of my tests and benchmarks, so I 
concluded that the cause of the JFS failures is unlikely to be a general 
hardware problem. If it really is hardware related, it must be a very 
special problem that only shows up with JFS.

I used a standard Red Hat Linux 7.2, installed on one ext3 partition, 
with a regular kernel 2.4.17 from www.kernel.org, patched with the 
jfs-2.4-common-1.0.15 and the jfs-2.4.17-1.0.15 patches, dating from 
February 15th. No rejects with the patches, no error messages when 
building the kernel. The relevant .config settings (the complete .config 
is available on request):

CONFIG_JFS_FS=y
# CONFIG_JFS_DEBUG is not set

Booting the JFS kernel did not yield any errors. All stability and 
performance tests were run in single user mode after rebooting the system.

My Cerberus and LTP settings were similar to those used by Red Hat 
(people.redhat.com/bmatthews) but restricted to the fs-related tests. 
Cerberus/LTP yielded a problem after 10 or 15 minutes: I got a kernel 
error stating "invalid operand 0000" from JFS code (I don't have the 
register, stack, and call trace information available). The error 
seemed to occur while running the iogen LTP program.

An "rm -rf" of the files left from Cerberus and LTP hung, and the JFS 
device could not be unmounted (umount gave an lseek error). After 
rebooting, mounting was impossible ("wrong fs type, bad option, bad 
superblock"). "fsck.jfs -n" detected errors in the Fileset 
File/Directory Allocation Map control information, in the Fileset 
File/Directory Allocation Map, and incorrect data in disk allocation 
structures and disk allocation control structures. When doing a 
"fsck.jfs -a", it replayed the Log and said the fs was clean. The JFS 
device could be mounted then.

After a reboot, I started a second Cerberus/LTP run with exactly the 
same settings, and it succeeded.

My second test was a Perl script that repeatedly started several file 
system intensive tasks in parallel. After some time, all processes 
accessing the JFS hung. The kernel itself did not crash (remote login 
etc. still worked). A "strace -p" on these processes gave no response -- 
the processes did not exhibit any activity (I guess they hung somewhere 
in their read(), open(), readdir() or lseek() calls) -- much the same 
behavior as the "rm -rf" after the Cerberus/LTP crash.

The tasks (the JFS partition was mounted on /jfs and contained a subdir 
/jfs/test with three complete kernel trees [about 35,000 files/500 MByte]):

* "ls -lR /jfs"
* "find /jfs -type f -exec grep XXX {} \;"
* "cp -a /jfs/test/* /jfs/tmp3"
* "dbench 10; sleep 180"
* a program that creates, writes, and removes 32,000 files in /jfs/tmp1
* a program that recursively creates and removes a directory tree of 
depth 12 in /jfs/tmp2
* a program that creates a 500 MByte sparse file /jfs/large_file and 
does lseek(), read() and write() calls within it.
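
For anyone who wants to try something similar, the two file-churning 
tasks from the list above could be approximated roughly as follows. This 
is only a minimal sketch, not the original test programs: the function 
names (churn_files, build_tree, churn_tree, run_parallel), the fan-out of 
2 per directory level, the 1 KByte payload, and the use of threads instead 
of separate processes are all assumptions made for illustration.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def churn_files(directory, count=32000, payload=b"x" * 1024):
    """Create, write, and remove `count` files; return how many were removed."""
    os.makedirs(directory, exist_ok=True)
    names = [os.path.join(directory, "f%06d" % i) for i in range(count)]
    for name in names:
        with open(name, "wb") as fh:
            fh.write(payload)
    removed = 0
    for name in names:
        os.remove(name)
        removed += 1
    return removed


def build_tree(path, depth, fanout):
    """Recursively create `fanout` subdirectories per level; return dirs created."""
    if depth == 0:
        return 0
    created = 0
    for i in range(fanout):
        child = os.path.join(path, "d%d" % i)
        os.mkdir(child)
        created += 1 + build_tree(child, depth - 1, fanout)
    return created


def churn_tree(root, depth=12, fanout=2):
    """Build a directory tree of the given depth below `root`, then remove it."""
    os.makedirs(root, exist_ok=True)
    created = build_tree(root, depth, fanout)
    shutil.rmtree(root)  # removes `root` itself as well
    return created


def run_parallel(base, file_count=32000, depth=12):
    """Run both tasks at the same time (threads here, as a simplification)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        files = pool.submit(churn_files, os.path.join(base, "tmp1"), file_count)
        tree = pool.submit(churn_tree, os.path.join(base, "tmp2"), depth)
        return files.result(), tree.result()
```

Calling run_parallel("/jfs") with the defaults would approximate the 
tmp1/tmp2 part of the workload; with the assumed fan-out of 2, a depth of 
12 means 8190 directories per pass. For a quick sanity check, point it at 
a scratch directory with much smaller values first.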

The last program reported a read() error, getting fewer bytes than 
requested, before the "crash".
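
The kind of check that catches such a short read could look roughly like 
the sketch below. Again, this is not the original program: the names 
make_sparse and exercise, the 4 KByte block size, the number of rounds, 
and the random offsets are assumptions for illustration only.

```python
import os
import random


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data blocks."""
    with open(path, "wb") as fh:
        fh.truncate(size)


def exercise(path, size, rounds=1000, block=4096, seed=0):
    """Write a block at a random offset, seek back, and read it again.

    Returns (short_reads, mismatches): how often read() delivered fewer
    bytes than requested, and how often a full read returned wrong data.
    """
    rng = random.Random(seed)
    short_reads = 0
    mismatches = 0
    fd = os.open(path, os.O_RDWR)
    try:
        for _ in range(rounds):
            offset = rng.randrange(0, size - block + 1)
            data = bytes([rng.randrange(256)]) * block
            os.lseek(fd, offset, os.SEEK_SET)
            if os.write(fd, data) != block:
                raise OSError("short write at offset %d" % offset)
            os.lseek(fd, offset, os.SEEK_SET)
            got = os.read(fd, block)
            if len(got) < block:
                short_reads += 1      # fewer bytes than requested
            elif got != data:
                mismatches += 1       # full read, but wrong content
    finally:
        os.close(fd)
    return short_reads, mismatches
```

On a healthy file system both counters should stay at zero. For the setup 
described above one would use make_sparse("/jfs/large_file", 500 * 1024 * 1024) 
and then call exercise() with the same path and size.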

I can post my test script, the programs, and the exact settings if 
anyone is interested.

Oliver

-- 
Dr. Oliver Diedrich                      Helstorfer Strasse 7
c't Magazin für Computertechnik     D-30625 Hannover, Germany
e-mail: [EMAIL PROTECTED]              Tel: +49 (0)511 5352 300

_______________________________________________
Jfs-discussion mailing list
[EMAIL PROTECTED]
http://www-124.ibm.com/developerworks/oss/mailman/listinfo/jfs-discussion
