Christoph Hellwig wrote:
> On Sat, Mar 09, 2002 at 12:14:57PM +0100, Per Jessen wrote:
>
>>I've just received the most recent c't magazine, which has an
>>interesting article/comparison of the journaled file systems in
>>the linux world - xfs, jfs, ext3, reiser ....
>>
>>JFS doesn't get a particularly honourable report - and the
>>description and experiences as described in the article certainly don't
>>match my own.
>>
>>Anyone else read this article ?
>>
>
> Sure, the c't appeared in my inbox this morning, as it does in
> almost every German-speaking hacker's :)
>
> But in my regular stress testing using fsstress and cerberus I'm
> absolutely unable to reproduce it using a variety of test environments.
> On the other hand JFS <= 1.0.14 had a very similar bug when built
> modular (I really wonder why the QA of the distributors shipping JFS
> didn't find it - once I built a modular JFS on my test machines I was
> able to trigger it within a few minutes).
>
> I've Cc'ed Oliver so he can comment on the exact test environment
> (Kernel, Hardware, JFS-Version, Config-Options).
>
> Christoph
>
>
Hi,
here are some details regarding my test environment.
First of all, I have to say that I was quite surprised by the unexpected
problems with JFS. (Btw, I reported everything in detail to Steve Best
and discussed the results with him.)
I used a Pentium 4, 1400 MHz, Intel chip set, 256 MByte RAM. The system
survived one weekend with continuous kernel compiling and a complete
Cerberus run without problems and did not exhibit any irregularities
with the other fs in any of my tests and benchmarks, so I concluded that
the cause of the JFS failures should not be a general hardware problem.
If it really was hardware related, it must be a very special problem
that only shows up with JFS.
I used a standard Red Hat Linux 7.2, installed on one ext3 partition,
with a regular kernel 2.4.17 from www.kernel.org, patched with the
jfs-2.4-common-1.0.15 and the jfs-2.4.17-1.0.15 patches, dating from
February 15th. No rejects with the patches, no error messages when
building the kernel. The relevant .config settings (the complete .config
is available on request):
CONFIG_JFS_FS=y
# CONFIG_JFS_DEBUG is not set
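Since a similar bug was reported above for modular builds only, it may
be worth checking mechanically whether a given .config builds JFS in
("y") or as a module ("m"). A minimal sketch in Python, assuming the
usual .config line format; the function names are hypothetical and this
is not part of the original test setup:

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' lines map to 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Strip the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def jfs_build_mode(options):
    """Return 'built-in', 'module' or 'disabled' for CONFIG_JFS_FS."""
    value = options.get("CONFIG_JFS_FS", "n")
    return {"y": "built-in", "m": "module"}.get(value, "disabled")


# The two settings quoted in this mail.
SETTINGS = """CONFIG_JFS_FS=y
# CONFIG_JFS_DEBUG is not set
"""
```

For the settings above, jfs_build_mode(parse_kconfig(SETTINGS)) reports
a built-in JFS, i.e. not the modular case.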
Booting the JFS kernel did not yield any errors. All stability and
performance tests were run in single user mode after rebooting the system.
My Cerberus and LTP settings were similar to the ones used by Red Hat
(people.redhat.com/bmatthews) but restricted to the fs-related tests.
Cerberus/LTP yielded a problem after 10 or 15 minutes: I got a kernel
error stating "invalid operand 0000" from JFS code (I don't have the
register, stack, and call trace information available). The error
seemed to occur while running the iogen LTP program.
An "rm -rf" of the files left from Cerberus and LTP hung, and the JFS
device could not be unmounted (umount gave an lseek error). After
rebooting, mounting was impossible ("wrong fs type, bad option, bad
superblock"). "fsck.jfs -n" detected errors in the Fileset
File/Directory Allocation Map control information, in the Fileset
File/Directory Allocation Map, and incorrect data in disk allocation
structures and disk allocation control structures. When I ran
"fsck.jfs -a", it replayed the log and said the fs was clean. The JFS
device could then be mounted.
After a reboot, I started a second Cerberus/LTP run with exactly the
same settings, and it succeeded.
My second test was a Perl script that repeatedly started several file
system intensive tasks in parallel. After some time, all processes
accessing the JFS hung. The kernel itself did not crash (remote login
etc. still worked). A "strace -p" on these processes gave no response --
the processes did not exhibit any activity (I guess they hung somewhere
in their read(), open(), readdir() or lseek() calls) -- quite the same
behavior as the "rm -rf" after the Cerberus/LTP crash.
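A driver of this kind can be sketched roughly as follows. This is a
minimal Python sketch, not the original Perl script (which is not shown
here); run_round, python_task and the deadline are hypothetical names
and values chosen for illustration. Tasks that are still running at the
deadline are only reported, not waited for, because a process stuck in
a file system call may never return.

```python
import subprocess
import sys
import time


def python_task(code):
    """Hypothetical helper: a stand-in task that runs a Python one-liner."""
    return [sys.executable, "-c", code]


def run_round(commands, deadline_s):
    """Start all commands at once and watch them until the deadline.

    Returns (results, procs): results maps the command index to its exit
    code, or to None if the process was still running at the deadline.
    """
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    end = time.monotonic() + deadline_s
    results = {}
    # Poll instead of wait(): wait() would block forever on a hung task.
    while len(results) < len(procs) and time.monotonic() < end:
        for i, proc in enumerate(procs):
            if i not in results and proc.poll() is not None:
                results[i] = proc.returncode
        time.sleep(0.05)
    for i, proc in enumerate(procs):
        if i not in results:
            results[i] = proc.poll()  # None means "still running"
    return results, procs
```

To repeat the load, call run_round() in a loop with the real task list
and stop as soon as any entry in results is None.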
The tasks (the JFS partition was mounted on /jfs and contained a subdir
/jfs/test with three complete kernel trees [about 35,000 files/500 MByte]):
* "ls -lR /jfs"
* "find /jfs -type f -exec grep XXX {} \;"
* "cp -a /jfs/test/* /jfs/tmp3"
* "dbench 10; sleep 180"
* a program that creates, writes, and removes 32,000 files in /jfs/tmp1
* a program that recursively creates and removes a directory tree of
depth 12 in /jfs/tmp2
* a program that creates a 500 MByte sparse file /jfs/large_file and
does lseek(), read() and write() calls within.
The last program reported a read() error, getting fewer bytes than
requested, before the "crash".
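For illustration, a sparse-file test with short-read detection could
look like the following. It is a minimal Python sketch under stated
assumptions, not the original program: the block size, the random
access pattern and the names sparse_file_test and demo_sparse are all
hypothetical. Every read lies completely inside the file, so on a
healthy local file system a read that returns fewer bytes than
requested is unexpected and gets recorded.

```python
import os
import random
import tempfile


def sparse_file_test(path, size, ops, block=4096, seed=0):
    """Create a sparse file of `size` bytes and do `ops` random accesses.

    Returns a list of (offset, requested, got) tuples, one per short read.
    """
    rng = random.Random(seed)
    short_reads = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Extending the file without writing data leaves a hole
        # (sparse file) on file systems that support it.
        os.ftruncate(fd, size)
        for _ in range(ops):
            # Keep offset + block <= size so no read crosses end of file.
            offset = rng.randrange(0, size - block + 1)
            os.lseek(fd, offset, os.SEEK_SET)
            if rng.random() < 0.5:
                os.write(fd, os.urandom(block))  # short writes are not checked here
            else:
                data = os.read(fd, block)
                if len(data) < block:
                    short_reads.append((offset, block, len(data)))
    finally:
        os.close(fd)
    return short_reads


def demo_sparse(size=1 << 20, ops=200):
    """Run the test on a small file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "large_file")
        short_reads = sparse_file_test(path, size, ops)
        return short_reads, os.path.getsize(path)
```

The demo uses a 1 MByte file to stay quick; passing
size=500 * 1024 * 1024 and a path on the file system under test comes
closer to the setup described above.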
I can post my test script, the programs, and the exact settings if
anyone is interested.
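Until then, the file-churn and directory-tree tasks from the list above
can be approximated as below. Again a minimal Python sketch, not the
original programs: the file payload, the fan-out of the tree and all
function names are assumptions made for illustration (the mail only
gives the file count, 32,000, and the tree depth, 12).

```python
import os
import tempfile


def churn_files(directory, count, payload=b"x" * 1024):
    """Create `count` files, write to each, then remove them all.

    Returns (files seen after creation, files left after removal).
    """
    os.makedirs(directory, exist_ok=True)
    names = [os.path.join(directory, f"f{i:05d}") for i in range(count)]
    for name in names:
        with open(name, "wb") as f:
            f.write(payload)
    created = len(os.listdir(directory))
    for name in names:
        os.remove(name)
    return created, len(os.listdir(directory))


def build_tree(root, depth, fanout=2):
    """Recursively create a directory tree; return the number of dirs made."""
    os.mkdir(root)
    made = 1
    if depth > 1:
        for i in range(fanout):
            made += build_tree(os.path.join(root, f"d{i}"), depth - 1, fanout)
    return made


def remove_tree(root):
    """Remove a tree that contains only directories, children first."""
    removed = 0
    for entry in os.listdir(root):
        removed += remove_tree(os.path.join(root, entry))
    os.rmdir(root)
    return removed + 1


def demo_churn(files=100, depth=4):
    """Run both tasks at small scale in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        created, left = churn_files(os.path.join(tmp, "tmp1"), files)
        made = build_tree(os.path.join(tmp, "tmp2"), depth)
        removed = remove_tree(os.path.join(tmp, "tmp2"))
        return created, left, made, removed
```

With the figures from the mail, churn_files("/jfs/tmp1", 32000) and
build_tree("/jfs/tmp2", 12) followed by remove_tree("/jfs/tmp2") would
be the corresponding calls; with the assumed fan-out of 2 the tree has
4095 directories.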
Oliver
--
Dr. Oliver Diedrich Helstorfer Strasse 7
c't Magazin für Computertechnik D-30625 Hannover, Germany
e-mail: [EMAIL PROTECTED] Tel: +49 (0)511 5352 300
_______________________________________________
Jfs-discussion mailing list
[EMAIL PROTECTED]
http://www-124.ibm.com/developerworks/oss/mailman/listinfo/jfs-discussion