[BUG] Reiserfs panic while running fsstress due to multiple truncate safe links for a file.
Resending, since there were no responses to the earlier post.

Hi,

I was working on a reiserfs panic with 2.6.17-rc3 while running fs stress tests. The panic message looked like:

REISERFS: panic (device Null superblock): reiserfs[4248]: assertion !(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask) ) failed at fs/reiserfs/super.c:328:add_save_link: saved link already re exists for truncated inode 13b5a

--- Summary of the problem ---

Reiserfs uses safe links (directory entries with a special key value) to keep track of truncated or unlinked files, to ensure integrity across crashes. Whenever there is a truncate/unlink on a file, Reiserfs creates a safe link for it and deletes the link once the operation is complete. If the machine crashes before the operation is committed, the next time the fs is mounted it looks for the saved links (easy to find, since they have a special key) and completes the unfinished operation.

The problem here occurs as follows:

Whenever there is an extending DIO write operation, the fs creates a safe link to keep the file size consistent if there is a crash in the middle of the DIO. The link is deleted once the write operation finishes.

If the DIO write happens to go through a HOLE region in the file, it falls back to a normal buffered write, which is done through the address space operations prepare_write() and commit_write().

Now, prepare_write() might allocate blocks for the file (if needed). So if there is an error at a later point in prepare_write() (say ENOSPC), we need to discard the allocated blocks. This is done by calling vmtruncate() on the file.

This call leads to the reiserfs-specific truncate, which tries to add a save link for the file. This addition causes a reiserfs_panic, since there is already a save link stored for the file.

I have a simple testcase that reproduces the problem by doing exactly what is described above. I will attach it if required.

Any thoughts on how to fix this?

thanks,

Suzuki K P
Linux Technology Centre,
IBM Software Labs.
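The sequence described in the report can be sketched as a small user-space model. This is only an illustration under stated assumptions: the toy_* names, the flag value and the boolean return (standing in for the panic) are hypothetical and are not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for i_link_saved_truncate_mask. */
#define TOY_LINK_SAVED_TRUNCATE 0x1u

/* Models only the per-inode flag word, REISERFS_I(inode)->i_flags. */
struct toy_inode {
    unsigned int flags;
};

/* Returns false where the real add_save_link() assertion would fire. */
bool toy_add_save_link(struct toy_inode *inode)
{
    if (inode->flags & TOY_LINK_SAVED_TRUNCATE)
        return false;                 /* a truncate save link already exists */
    inode->flags |= TOY_LINK_SAVED_TRUNCATE;
    return true;
}

void toy_remove_save_link(struct toy_inode *inode)
{
    inode->flags &= ~TOY_LINK_SAVED_TRUNCATE;
}

/* Extending DIO write that falls back to a buffered write over a hole.
 * prepare_fails simulates an error such as ENOSPC in prepare_write(). */
bool toy_extending_dio_write(struct toy_inode *inode, bool prepare_fails)
{
    bool ok = true;

    if (!toy_add_save_link(inode))    /* link taken for the extending DIO */
        return false;
    if (prepare_fails) {
        /* error path: vmtruncate() -> fs truncate -> second add_save_link */
        ok = toy_add_save_link(inode);
    }
    toy_remove_save_link(inode);      /* link dropped when the write ends */
    return ok;
}

/* Runs the scenario on a fresh inode: 0 on success, -1 on the double add. */
int toy_run(bool prepare_fails)
{
    struct toy_inode inode = { 0 };
    bool ok = toy_extending_dio_write(&inode, prepare_fails);

    assert(inode.flags == 0);         /* no link is left behind in the model */
    return ok ? 0 : -1;
}
```

In this model toy_run(false) succeeds, while toy_run(true) reaches the second add with the flag still set, which is the point where the real code panics.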
Re: Reiserfs bug in 2.6.17-rc3-mm1
Hello

On Sun, 2006-05-07 at 12:35 -0700, Joe Feise wrote:
> rc3-mm1 plus the reiser4-radix-tree-direct-data-fix.patch

The first version of this patch is insufficient. Please unapply it and try the new version.

> -Joe
>
> dmesg output:
>
> kernel BUG at fs/reiser4/flush.c:1038!
> invalid opcode: [#1]
> PREEMPT
> last sysfs file: /class/net/eth2/ifindex
> Modules linked in: pl2303 usbserial softdog cisco_ipsec snd_pcm_oss snd_mixer_oss snd_cs46xx gameport snd_rawmidi snd_seq_device snd_ac97_codec snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc zoran i2c_algo_bit videodev saa7111 i2c_core pegasus arc4 ppp_mppe ppp_deflate ppp_generic slhc usblp
> CPU:    0
> EIP:    0060:[<c01cb0d4>]    Tainted: P      VLI
> EFLAGS: 00010287   (2.6.17-rc3-mm1 #4)
> EIP is at flush_current_atom+0x1cf/0x247
> eax: e0efc080   ebx: f4b45e00   ecx: f4b45e00   edx: f5c2e000
> esi: f5dabe4c   edi: f5daa000   ebp: 0001   esp: f5dabe0c
> ds: 007b   es: 007b   ss: 0068
> Process ent:sdb2! (pid: 1832, threadinfo=f5daa000 task=f5d89ab0)
> Stack: <0>f5dabe18 0001 f5dabe90 c5db8a40 e0efc080 f5daa000 f5c21cd8 c01c84b8
>        f5dabe4c f5dabeec f5dabea8 f7e57dcc f5dabe90 e0efc080 f7e57d80 f7e57dcc
>        f58d7c00 c01d6b60 0001 d75ba3ac
> Call Trace:
>  [<c01c84b8>] flush_some_atom+0x245/0x367
>  [<c01d6b60>] writeout+0xc8/0x1e7
>  [<c0175f27>] generic_sync_sb_inodes+0x211/0x2a8
>  [<c01d857d>] entd_flush+0x9f/0xbc
>  [<c01d8257>] entd+0xd5/0x2a3
>  [<c0128bdc>] autoremove_wake_function+0x0/0x43
>  [<c0128bdc>] autoremove_wake_function+0x0/0x43
>  [<c01d8182>] entd+0x0/0x2a3
>  [<c01286f4>] kthread+0x9c/0xa1
>  [<c0128658>] kthread+0x0/0xa1
>  [<c0100f1d>] kernel_thread_helper+0x5/0xb
> Code: 87 c7 04 24 20 72 47 c0 e8 b8 b4 f4 ff e8 ae 84 f3 ff e9 5c ff ff ff e8 59 c2 27 00 e9 28 ff ff ff e8 4f c2 27 00 e9 0f ff ff ff 0f 0b 0e 04 e5 bd 47 c0 e9 f3 fe ff ff 8b 54 24 0c 85 d2 74 09
> EIP: [<c01cb0d4>] flush_current_atom+0x1cf/0x247 SS:ESP 0068:f5dabe0c
> <6>note: ent:sdb2![1832] exited with preempt_count 2

Reiser4 used to check radix tree emptiness by comparing the tree height against 0. With radix-tree-direct-data.patch a non-empty tree can have zero height. This patch makes reiser4 check tree emptiness using the tree root.

Signed-off-by: Vladimir V. Saveliev [EMAIL PROTECTED]

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c

 fs/reiser4/jnode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c
--- linux-2.6.17-rc3-mm1/fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix	2006-05-08 12:54:12.0 +0400
+++ linux-2.6.17-rc3-mm1-vs/fs/reiser4/jnode.c	2006-05-08 12:54:57.0 +0400
@@ -432,7 +432,7 @@ static void inode_attach_jnode(jnode * n
 	inode = node->key.j.mapping->host;
 	info = reiser4_inode_data(inode);
 	rtree = jnode_tree_by_reiser4_inode(info);
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* prevent inode from being pruned when it has jnodes attached
 		   to it */
 		write_lock_irq(&inode->i_data.tree_lock);
@@ -464,7 +464,7 @@ static void inode_detach_jnode(jnode * n
 
 	/* delete jnode from inode's radix tree of jnodes */
 	check_me("zam-1046", radix_tree_delete(rtree, node->key.j.index));
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* inode can be pruned now */
 		write_lock_irq(&inode->i_data.tree_lock);
 		inode->i_data.nrpages--;
_
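Why the height test stops working can be shown with a toy model. This is a sketch, not the kernel's radix tree: the toy_* names are hypothetical, and the code assumes the direct-data change hangs a lone item at index 0 straight off the root pointer without allocating an interior node, which is how a non-empty tree ends up with zero height.

```c
#include <assert.h>
#include <stddef.h>

/* Toy root; the field names follow the patch, the layout is illustrative. */
struct toy_radix_root {
    unsigned int height;   /* number of interior levels */
    void *rnode;           /* NULL when nothing is stored */
};

/* Assumed direct-data behaviour: the single item at index 0 is stored in
 * the root pointer itself, so height stays 0. */
void toy_insert_at_zero(struct toy_radix_root *root, void *item)
{
    root->rnode = item;
}

/* Old reiser4 check: treats height 0 as "empty". */
int toy_empty_by_height(const struct toy_radix_root *root)
{
    return root->height == 0;
}

/* Patched check: empty only when there is no root at all. */
int toy_empty_by_root(const struct toy_radix_root *root)
{
    return root->rnode == NULL;
}

/* Inserts one item and reports both checks:
 * bit 1 = old check says "empty", bit 0 = new check says "empty". */
int toy_checks_after_one_insert(void)
{
    static int payload;
    struct toy_radix_root root = { 0, NULL };

    toy_insert_at_zero(&root, &payload);
    return (toy_empty_by_height(&root) << 1) | toy_empty_by_root(&root);
}
```

After one insert the old check still reports an empty tree while the new one does not. In the patched functions that check gates the nrpages adjustment, so a wrong answer plausibly leaves the count unbalanced; that is a reading of the patch context, not something the thread states outright.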
Re: Reiser4 2.6.16.2 / 2.6.17-rc3-mm1 WARNING: out of memory?
Hello

On Sun, 2006-05-07 at 17:05 -0500, Yien Zheng wrote:
> I thought the patch might have fixed it for me, but it happened again.

Sorry, the first version of the patch is not correct. Please unapply it and try the attached one.

> I think I'm getting the same error you were too, but let me paste mine in case it adds any additional info:
>
> kernel BUG at fs/inode.c:251!
> invalid opcode: [#1]
> PREEMPT
> last sysfs file: /block/sdj/size
> Modules linked in: smbfs usbcore dm_mod
> CPU:    0
> EIP:    0060:[<c015c277>]    Not tainted VLI
> EFLAGS: 00010286   (2.6.17-rc3-mm1 #8)
> EIP is at clear_inode+0x16/0xa5
> eax: c08fb8ac   ebx: c08fb8ac   ecx: c08fb8ac   edx: c08fb8ac
> esi: c9cf9c80   edi: c08c941c   ebp: c08fb8ac   esp: c9862f1c
> ds: 007b   es: 007b   ss: 0068
> Process emerge (pid: 8062, threadinfo=c9862000 task=cba5c590)
> Stack: <0>c08fb8ac c9cf9c80 c018dc4b c08fb8ac c08fb8ac c018dbba c015cf0d c08fb8ac
>        c86a7000 c0154c22 c08fb8ac c19c53c0 cc2e3a20 390b39cc 000d c86a7041
>        0010 0296 ca037d20 ca3ab2a0 0001
> Call Trace:
>  [<c018dc4b>] reiser4_delete_inode+0x91/0x9d
>  [<c018dbba>] reiser4_delete_inode+0x0/0x9d
>  [<c015cf0d>] generic_delete_inode+0x6c/0xea
>  [<c0154c22>] do_unlinkat+0xb7/0xfc
>  [<c0155510>] sys_renameat+0x58/0x60
>  [<c0154ca2>] sys_unlink+0xb/0xe
>  [<c02be48f>] syscall_call+0x7/0xb
> Code: c7 42 04 a8 03 31 c0 89 15 a8 03 31 c0 ff 0d 48 d8 39 c0 5b c3 56 53 8b 5c 24 0c 53 e8 ae cb fe ff 83 bb c4 00 00 00 00 58 74 08 0f 0b fb 00 f5 2a 2d c0 8b 83 1c 01 00 00 a8 10 75 08 0f 0b fc
> EIP: [<c015c277>] clear_inode+0x16/0xa5 SS:ESP 0068:c9862f1c
> <4><4>reiser4[emerge(8062)]: release_unix_file (fs/reiser4/plugin/file/file.c:2670)[vs-44]: WARNING: out of memory?
> <4>reiser4[emerge(8062)]: release_unix_file (fs/reiser4/plugin/file/file.c:2670)[vs-44]: WARNING: out of memory?
>
> On 5/6/06, Joseph Landers [EMAIL PROTECTED] wrote:
> > Thanks for the patch. I still get the same error message as before (first post), and only when booting into the reiser4 partition, although the system now stays up a bit longer before dying: I am able to execute commands and run programs for a few minutes before the system halts or becomes unresponsive.
> >
> > Booting on ext3 and mounting the reiser4 partition seems to be fine, which is peculiar. Or maybe I am not using the reiser4 partition enough to make the memory problem significant?
> >
> > I have tried 2.6.16 to check whether any bugs were introduced since then, and it has the same problem, so I will just have to wait for a patch to fix this.
> >
> > I think the problem is in /fs/inode.c, which is the kernel inode file, not reiser4's inode file. Maybe reiser4 is sending the wrong delete inode command to it? It only seems to segfault once, which is peculiar too.

Reiser4 used to check radix tree emptiness by comparing the tree height against 0. With radix-tree-direct-data.patch a non-empty tree can have zero height. This patch makes reiser4 check tree emptiness using the tree root.

Signed-off-by: Vladimir V. Saveliev [EMAIL PROTECTED]

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c

 fs/reiser4/jnode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c
--- linux-2.6.17-rc3-mm1/fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix	2006-05-08 12:54:12.0 +0400
+++ linux-2.6.17-rc3-mm1-vs/fs/reiser4/jnode.c	2006-05-08 12:54:57.0 +0400
@@ -432,7 +432,7 @@ static void inode_attach_jnode(jnode * n
 	inode = node->key.j.mapping->host;
 	info = reiser4_inode_data(inode);
 	rtree = jnode_tree_by_reiser4_inode(info);
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* prevent inode from being pruned when it has jnodes attached
 		   to it */
 		write_lock_irq(&inode->i_data.tree_lock);
@@ -464,7 +464,7 @@ static void inode_detach_jnode(jnode * n
 
 	/* delete jnode from inode's radix tree of jnodes */
 	check_me("zam-1046", radix_tree_delete(rtree, node->key.j.index));
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* inode can be pruned now */
 		write_lock_irq(&inode->i_data.tree_lock);
 		inode->i_data.nrpages--;
_
Re: [BUG] Reiserfs panic while running fsstress due to multiple truncate safe links for a file.
Hello

On Mon, 2006-05-08 at 17:03 +0530, Suzuki wrote:
> Resending, since there were no responses to the earlier post.
>
> Hi,
>
> I was working on a reiserfs panic with 2.6.17-rc3 while running fs stress tests. The panic message looked like:
>
> REISERFS: panic (device Null superblock): reiserfs[4248]: assertion !(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask) ) failed at fs/reiserfs/super.c:328:add_save_link: saved link already re exists for truncated inode 13b5a
>
> --- Summary of the problem ---
>
> Reiserfs uses safe links (directory entries with a special key value) to keep track of truncated or unlinked files, to ensure integrity across crashes. Whenever there is a truncate/unlink on a file, Reiserfs creates a safe link for it and deletes the link once the operation is complete. If the machine crashes before the operation is committed, the next time the fs is mounted it looks for the saved links (easy to find, since they have a special key) and completes the unfinished operation.
>
> The problem here occurs as follows:
>
> Whenever there is an extending DIO write operation, the fs creates a safe link to keep the file size consistent if there is a crash in the middle of the DIO. The link is deleted once the write operation finishes.
>
> If the DIO write happens to go through a HOLE region in the file, it falls back to a normal buffered write, which is done through the address space operations prepare_write() and commit_write().
>
> Now, prepare_write() might allocate blocks for the file (if needed). So if there is an error at a later point in prepare_write() (say ENOSPC), we need to discard the allocated blocks. This is done by calling vmtruncate() on the file.
>
> This call leads to the reiserfs-specific truncate, which tries to add a save link for the file. This addition causes a reiserfs_panic, since there is already a save link stored for the file.
>
> I have a simple testcase that reproduces the problem by doing exactly what is described above. I will attach it if required.
>
> Any thoughts on how to fix this?

Thanks for the report. We will discuss how this should be fixed when the May holidays are over here.

> thanks,
>
> Suzuki K P
> Linux Technology Centre,
> IBM Software Labs.
Re: bad bread
On Sun, 07 May 2006 10:35:44 +0200, PFC said:
> In the event of physical HD failure, the procedure goes like this:
>
> Get mail saying a HDD is dead.
> Replace harddisk, resynchronize RAID.
>
> Use Linux software RAID. Harddrives are cheaper than the time you'll lose trying to recover your data.

Remember to take backups *anyhow*. That way, if the RAID controller dumps cow manure on all the sectors, you won't be saying "Oh, SH*T".

Also, note that there exist buggy RAID controllers where, if you are mirroring to 2 disks and they develop bad blocks at different locations, you can trash the mirror by resynchronizing. Basically, you swap out one of the bad disks and re-sync; the re-sync progresses as far as the bad block on the disk that is now the source for the mirror, and dies.
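The resync failure described above can be illustrated with a few lines of C. This is a toy sketch only: the function names, the 8-block disk and the bad-block position are made up for illustration, and no real controller firmware is being modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Copies blocks from the surviving disk in order and stops at the first
 * unreadable source block, as the buggy controllers are said to do.
 * src_bad[i] != 0 marks block i as unreadable.
 * Returns the number of blocks copied before giving up. */
size_t toy_resync(const int *src_bad, size_t nblocks)
{
    size_t i;

    for (i = 0; i < nblocks; i++) {
        if (src_bad[i])
            break;            /* resync dies here; the rest is never copied */
    }
    return i;
}

/* Hypothetical scenario: disk A is bad at block 5 and disk B was bad at a
 * different block. B is swapped out, so A is the only source left, and the
 * one good copy of block 5 left with the old disk B. */
size_t toy_replace_b_then_resync(void)
{
    const int disk_a_bad[8] = { 0, 0, 0, 0, 0, 1, 0, 0 };

    return toy_resync(disk_a_bad, 8);
}
```

In this scenario only the first five blocks reach the new disk, which is why a backup is still needed even with a mirror.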
Re: Comparing LFS and reiserfs4
On 5/7/06, Kristian Koehntopp [EMAIL PROTECTED] wrote:
> I am looking for a paper that contrasts the write strategies and organisation of reiser4 vs. the old Sprite and BSD Log Based File System (LFS). Does such a thing exist?

no, but it should, it would be very interesting. there are a couple of out-of-tree LFS implementations for linux floating around, too, that could be compared in a set of benchmarks.

Reiser4's write strategy could be said to be a combination of LFS and WAFL, since the original LFS used inodes and indirect blocks, where WAFL and Reiser4 use tree structures. Reiser4 also does some in-place overwrite updates, where the data is written to journal blocks for atomicity, but is then copied over the old data to optimize read performance. So although Reiser4 always flushes data to a continuous stream like LFS, it sometimes does extra work as well.

The comments at the top of the Reiser4 source files are probably the most detailed and up-to-date descriptions of the flush strategy. it's not much, i admit, but it's better than most other linux code...

NATE
Re: bad bread
[EMAIL PROTECTED] wrote (ao):
> On Sun, 07 May 2006 10:35:44 +0200, PFC said:
> > In the event of physical HD failure, the procedure goes like this:
> >
> > Get mail saying a HDD is dead.
> > Replace harddisk, resynchronize RAID.
> >
> > Use Linux software RAID. Harddrives are cheaper than the time you'll lose trying to recover your data.
>
> Remember to take backups *anyhow*. That way, if the RAID controller dumps cow manure on all the sectors, you won't be saying "Oh, SH*T".

Or user error (rm -rf, fdisk, dd, mkswap) or bad memory or fire or broken new kernel or script kiddies or worms/viruses or ..

With kind regards, Sander

--
Humilis IT Services and Solutions
http://www.humilis.net
Re: reiser4 bug [was Re: 2.6.17-rc3-mm1]
Nope, did not work...

regards
Alex

On Tuesday, 9 May 2006 01:21, Joe Feise wrote:
> Try the patch from here:
> http://marc.theaimsgroup.com/?l=reiserfs&m=114709188305181&w=2
> That helped me get past the bootup phase (currently 8 hours uptime).
>
> -Joe
>
> Alexander Gran writes:
> > Hi all,
> >
> > 2.6.17-rc3-mm1 doesn't get up and running here; it hits a BUG while init runs. I cannot log in afterwards, and syslog did not capture the bug either. So here are some poor screenshots from my Treo650 (digicam is broken, sorry..;)
> >
> > EIP is in clear_inode.
> > Trace:
> > reiser4_delete_inode+0x6c/0xd0
> > d_delete+0xf0/0x10f
> > reiser4_delete_inode+0x0/0xd0
> > generic_delete_inode+0x6b/0xfb
> > input+0x5c/0x68
> > do_unlinkat+0xd7/0x12c
> > sysenter_past_esp+0x54/0x75
> > __hidp_send_ctrl_message+0xb4/0xfa
> >
> > details:
> > http://zodiac.dnsalias.org/images/1.jpg
> > http://zodiac.dnsalias.org/images/2.jpg
> > http://zodiac.dnsalias.org/images/3.jpg
> > http://zodiac.dnsalias.org/images/4.jpg
> > Kernel config:
> > http://zodiac.dnsalias.org/images/config
> >
> > System is my T40p, as usual, running an up-to-date debian unstable.
> >
> > regards
> > Alex

--
Encrypted Mails welcome.
PGP-Key at http://zodiac.dnsalias.org/misc/pgpkey.asc | Key-ID: 0x6D7DD291