I haven't hit nikita-2967 again, but I got several other interesting results.

The first panic didn't cause corruption:

reiser4 panicked cowardly: reiser4[pdflush(16048)]: scan_by_coord (fs/reiser4/flush.c:3431)[nikita-3435]:
Kernel panic - not syncing: reiser4[pdflush(16048)]: scan_by_coord (fs/reiser4/flush.c:3431)[nikita-3435]:

The second affected my root partition, not the one I was stress testing:

reiser4 panicked cowardly: reiser4[ent:hda3!(841)]: capture_anonymous_pages (fs/reiser4/plugin/file/file.c:1007)[vs-49]:
Kernel panic - not syncing: reiser4[ent:hda3!(841)]: capture_anonymous_pages (fs/reiser4/plugin/file/file.c:1007)[vs-49]:

I booted from a live CD to document the corruption (which seemed to have
been completely fixed).

http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck2_--check_hda3.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck2_--fix_hda3.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck2_--check_after_--fix_hda3.txt.gz

When I rebooted, I got another panic when my system tried to mount /
read-write:

reiser4 panicked cowardly: reiser4[mount(3614)]: check_blocks_bitmap (fs/reiser4/plugin/space/bitmap.c:1268)[zam-623]:
Kernel panic - not syncing: reiser4[mount(3614)]: check_blocks_bitmap (fs/reiser4/plugin/space/bitmap.c:1268)[zam-623]:

On the second reboot, it worked again.

The third panic was one I've seen before
(http://marc.theaimsgroup.com/?l=reiserfs&m=115259665831650&w=2):

reiser4 panicked cowardly: reiser4[rm(25870)]: sibling_list_remove (fs/reiser4/tree_walk.c:813)[zam-32245]:
Kernel panic - not syncing: reiser4[rm(25870)]: sibling_list_remove (fs/reiser4/tree_walk.c:813)[zam-32245]:

http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck3_--check.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck3_--fix.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck3_--check_after_--fix.txt.gz

The fourth was another repeat:

reiser4 panicked cowardly: reiser4[pdflush(198)]: capture_anonymous_pages (fs/reiser4/plugin/file/file.c:1007)[vs-49]:
Kernel panic - not syncing: reiser4[pdflush(198)]: capture_anonymous_pages (fs/reiser4/plugin/file/file.c:1007)[vs-49]:

http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck4_--check.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck4_--fix.txt.gz
http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060720/fsck4_--check_after_fix.txt.gz

Where the fsck logs from tests 3 and 4 say entries were removed, they
mean it. Those files were GONE. I would expect this to happen to
temporary files being written during the panic, but header files should
only have been open for reading, if at all.

I have metadata dumps from before and after one of the fsck --fix runs.
Should I make them available?

On Wed, 2006-07-19 at 18:07 +0400, Vladimir V. Saveliev wrote:
> Hello
>
> On Wed, 2006-07-19 at 07:28 -0600, Jake Maciejewski wrote:
> > Thanks. Now with debug enabled I've gotten:
> >
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/panic1.txt.gz
>
> the attached patch fixes a problem nikita-2967 reports about. Would you
> please check whether it helps.
>
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck1_--check.txt.gz
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck1_--fix.txt.gz
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck1_--check_after_--fix.txt.gz
> >
> > and
> >
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/messages2.txt.gz
> > followed by
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/messages2b.txt.gz
> >
> > and without debug:
> >
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/messages3.txt.gz
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck3_--check.txt.gz
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck3_--fix.txt.gz
> > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060719/fsck3_--check_after_--fix.txt.gz
> >
> > On Tue, 2006-07-18 at 18:18 +0400, Vladimir V. Saveliev wrote:
> > > Hello
> > >
> > > On Tue, 2006-07-18 at 00:52 -0600, Jake Maciejewski wrote:
> > > > Thanks for the patch, but I can still reproduce the problem. I've been
> > > > running the attached program to try to speed up the testing process a
> > > > bit. Interrupting and restarting the compilation loop also seems to
> > > > help.
> > > >
> > >
> > > ok
> > >
> > > > If I had hours to wait, it would probably crash eventually without
> > > > additional encouragement, but I'm doing everything as an unprivileged
> > > > user, so I don't think my tests are unreasonable.
> > > >
> > > > Anyway, I'm still getting a panic with debug enabled:
> > > >
> > > > reiser4 panicked cowardly: reiser4[find(16411)]: reiser4_dirty_inode (fs/reiser4/super_ops.c:173)[]:
> > > > Kernel panic - not syncing: reiser4[find(16411)]: reiser4_dirty_inode (fs/reiser4/super_ops.c:173)[]:
> > > >
> > >
> > > The attached patch should fix the above.
> > >
> > > > Without debug enabled I've seen:
> > > >
> > > > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060718/messages1.txt.gz
> > > > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060718/fsck1_--check.txt.gz
> > > > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060718/fsck1_--fix.txt.gz
> > > > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060718/fsck1_--check_after_--fix.txt.gz
> > > >
> > > > but usually I get:
> > > >
> > > > http://people.msoe.edu/~maciejej/patches/AMD64_reiser4_debug/20060718/messages3.txt.gz
> > > >
> > > > with no corruption (although I've been rebooting before complete
> > > > failure).
> > > >
> > > > On Mon, 2006-07-17 at 21:38 +0400, Vladimir V. Saveliev wrote:
> > > > > Hello
> > > > >
> > > > > On Mon, 2006-07-17 at 18:10 +0400, Vladimir V. Saveliev wrote:
> > > > > > Hello
> > > > > >
> > > > > > On Sun, 2006-07-16 at 12:44 -0500, [EMAIL PROTECTED] wrote:
> > > > > > > Has my previous post
> > > > > > > (http://marc.theaimsgroup.com/?l=reiserfs&m=115259665831650&w=2)
> > > > > > > been overlooked, or have I not provided enough information?
> > > > > > > Do I need to reproduce these issues on 2.6.18-rc1-mm2?
> > > > > > > Should I be trying any patches?
> > > > > > >
> > > > > >
> > > > >
> > > > > please try the attached patch.
> > > > >
> > > > > > your test crashes reiser4 on my test box. I hope to get a patch
> > > > > > ready later today. Not sure that I got the same problem as you,
> > > > > > though. We will see.
> > > > > >
> > > > > > > The bottom line is with 2.6.17-mm6, I've always been able to
> > > > > > > OOPs or panic reiser4 on my amd64 machine (haven't tried x86
> > > > > > > yet) by using all available physical memory.
> > > > > > >

-- 
Jake Maciejewski <[EMAIL PROTECTED]>