[BUG] Reiserfs panic while running fsstress due to multiple truncate safe links for a file.

2006-05-08 Thread Suzuki

Resending, since there were no responses to the earlier post.

Hi,


I was working on a reiserfs panic with 2.6.17-rc3, while running fs
stress tests.

The panic message looked like :

 REISERFS: panic (device Null superblock): reiserfs[4248]: assertion
!(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask)
) failed at fs/reiserfs/super.c:328:add_save_link: saved link already
exists for truncated inode 13b5a

-- Summary of the problem ---

Reiserfs uses safe links (directory entries with a special key
value) to keep track of truncated or unlinked files to ensure
integrity across crashes.

Whenever there is a truncate/unlink on a file, Reiserfs creates a safe
link for it and deletes the link once the operation is complete.
If the machine crashes before the operation is committed, then the next
time the fs is mounted it will look for the saved links (easy to
find, since they have a special key) and complete the operation that was
unfinished.
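
The lifecycle described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct toy_fs, begin_truncate(), finish_pending() and replay_save_links() are hypothetical names invented for illustration and are not reiserfs symbols. The save_link array stands in for the on-disk entries with the special key.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in reiserfs. */
enum op_kind { OP_NONE, OP_TRUNCATE, OP_UNLINK };

#define MAX_FILES 8

struct toy_fs {
    enum op_kind save_link[MAX_FILES]; /* persistent: survives a crash */
    long size[MAX_FILES];              /* current on-disk file size */
    long target[MAX_FILES];            /* size a pending truncate wants */
    bool exists[MAX_FILES];
};

/* Record the intent before the truncate itself starts. */
static void begin_truncate(struct toy_fs *fs, int ino, long new_size)
{
    assert(ino >= 0 && ino < MAX_FILES);
    fs->save_link[ino] = OP_TRUNCATE;
    fs->target[ino] = new_size;
}

/* Complete whatever the save link describes, then drop the link. */
static void finish_pending(struct toy_fs *fs, int ino)
{
    if (fs->save_link[ino] == OP_TRUNCATE)
        fs->size[ino] = fs->target[ino];
    else if (fs->save_link[ino] == OP_UNLINK)
        fs->exists[ino] = false;
    fs->save_link[ino] = OP_NONE;
}

/* Mount-time scan: finish every operation that still has a save link.
 * Returns the number of operations that had to be completed. */
static int replay_save_links(struct toy_fs *fs)
{
    int replayed = 0;

    for (int ino = 0; ino < MAX_FILES; ino++) {
        if (fs->save_link[ino] != OP_NONE) {
            finish_pending(fs, ino);
            replayed++;
        }
    }
    return replayed;
}
```

A crash is modelled by calling begin_truncate() and never calling finish_pending(); the next replay_save_links() then finishes the truncate and clears the link.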


The problem here occurs as follows:

 Whenever there is an extending DIO write operation, the fs creates
a safe link to keep the file size consistent if there is a
crash in the middle of the DIO. This link is deleted once the write
operation finishes.

 If the DIO write happens to go through a HOLE region in the file, it
falls back to a normal buffered write, which is done through the
address space operations prepare_write() and commit_write(). Now,
prepare_write() might allocate blocks for the file (if needed). So if
there is an error at a later point (say ENOSPC) in prepare_write(), we
need to discard the allocated blocks. This is done by calling
vmtruncate() on the file. This call leads to the reiserfs-specific
truncate, which tries to add a save link for the file.

This addition causes a reiserfs_panic, since there is already a save
link stored for the file.
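
The sequence can be sketched as a small userspace model. Everything here is an assumption made for illustration: struct toy_inode, TOY_LINK_SAVED_TRUNCATE and the toy_* functions are hypothetical and only mimic the flag check quoted in the panic message. Where the kernel assertion would fire, the model returns -EEXIST instead.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bit standing in for i_link_saved_truncate_mask. */
#define TOY_LINK_SAVED_TRUNCATE 0x1u

struct toy_inode {
    unsigned int flags;
    long size;
};

/* Returns 0 on success, -EEXIST where the real assertion would fire. */
static int toy_add_save_link(struct toy_inode *inode)
{
    if (inode->flags & TOY_LINK_SAVED_TRUNCATE)
        return -EEXIST;
    inode->flags |= TOY_LINK_SAVED_TRUNCATE;
    return 0;
}

static void toy_remove_save_link(struct toy_inode *inode)
{
    assert(inode->flags & TOY_LINK_SAVED_TRUNCATE);
    inode->flags &= ~TOY_LINK_SAVED_TRUNCATE;
}

/* Truncate path: always wants its own save link. */
static int toy_truncate(struct toy_inode *inode, long new_size)
{
    int err = toy_add_save_link(inode);

    if (err)
        return err;
    inode->size = new_size;
    toy_remove_save_link(inode);
    return 0;
}

/*
 * Extending direct-I/O write.
 * hits_hole:     the write falls back to the buffered path.
 * prepare_fails: the buffered path fails (say ENOSPC) and has to
 *                truncate away the blocks it already allocated.
 */
static int toy_extending_dio_write(struct toy_inode *inode, long new_size,
                                   int hits_hole, int prepare_fails)
{
    int err = toy_add_save_link(inode); /* link #1, for the DIO itself */

    if (err)
        return err;
    if (hits_hole && prepare_fails) {
        /* Cleanup truncate runs while link #1 is still in place. */
        err = toy_truncate(inode, inode->size);
    } else {
        inode->size = new_size;
    }
    toy_remove_save_link(inode);
    return err;
}
```

In this model a plain extending write succeeds, while the hole plus failed-prepare case returns -EEXIST from the nested truncate, which corresponds to the point where the panic is reported.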


I have a simple testcase to reproduce the problem, which does the same 
as described above. I will attach it if required.


Any thoughts on how to fix this ?

thanks,

Suzuki K P
Linux Technology Centre,
IBM Software Labs.





Re: Reiserfs bug in 2.6.17-rc3-mm1

2006-05-08 Thread Vladimir V. Saveliev
Hello

On Sun, 2006-05-07 at 12:35 -0700, Joe Feise wrote:
 rc3-mm1 plus the reiser4-radix-tree-direct-data-fix.patch
 

The first version of this patch is insufficient. Please unapply it and try
the new version.

 -Joe
 
 dmesg output:
 kernel BUG at fs/reiser4/flush.c:1038!
 invalid opcode:  [#1]
 PREEMPT
 last sysfs file: /class/net/eth2/ifindex
 Modules linked in: pl2303 usbserial softdog cisco_ipsec snd_pcm_oss
 snd_mixer_oss snd_cs46xx gameport snd_rawmidi snd_seq_device snd_ac97_codec
 snd_ac97_bus snd_pcm snd_timer snd soundcore snd_page_alloc zoran i2c_algo_bit
 videodev saa7111 i2c_core pegasus arc4 ppp_mppe ppp_deflate ppp_generic slhc 
 usblp
 CPU:0
 EIP:0060:[c01cb0d4]Tainted: P  VLI
 EFLAGS: 00010287   (2.6.17-rc3-mm1 #4)
 EIP is at flush_current_atom+0x1cf/0x247
 eax: e0efc080   ebx: f4b45e00   ecx: f4b45e00   edx: f5c2e000
 esi: f5dabe4c   edi: f5daa000   ebp: 0001   esp: f5dabe0c
 ds: 007b   es: 007b   ss: 0068
 Process ent:sdb2! (pid: 1832, threadinfo=f5daa000 task=f5d89ab0)
 Stack: 0f5dabe18 0001 f5dabe90  c5db8a40 e0efc080 f5daa000 
 f5c21cd8
 c01c84b8 f5dabe4c  f5dabeec f5dabea8 f7e57dcc f5dabe90
e0efc080 f7e57d80  f7e57dcc f58d7c00 c01d6b60 0001 d75ba3ac
 Call Trace:
  c01c84b8 flush_some_atom+0x245/0x367   c01d6b60 writeout+0xc8/0x1e7
  c0175f27 generic_sync_sb_inodes+0x211/0x2a8   c01d857d 
 entd_flush+0x9f/0xbc
  c01d8257 entd+0xd5/0x2a3   c0128bdc autoremove_wake_function+0x0/0x43
  c0128bdc autoremove_wake_function+0x0/0x43   c01d8182 entd+0x0/0x2a3
  c01286f4 kthread+0x9c/0xa1   c0128658 kthread+0x0/0xa1
  c0100f1d kernel_thread_helper+0x5/0xb
 Code: 87 c7 04 24 20 72 47 c0 e8 b8 b4 f4 ff e8 ae 84 f3 ff e9 5c ff ff ff e8 
 59
 c2 27 00 e9 28 ff ff ff e8 4f c2 27 00 e9 0f ff ff ff 0f 0b 0e 04 e5 bd 47 
 c0
 e9 f3 fe ff ff 8b 54 24 0c 85 d2 74 09
 EIP: [c01cb0d4] flush_current_atom+0x1cf/0x247 SS:ESP 0068:f5dabe0c
  6note: ent:sdb2![1832] exited with preempt_count 2
 
 


Reiser4 used to check radix tree emptiness by comparing the tree height against 0.
With radix-tree-direct-data.patch a non-empty tree can have zero height.
This patch makes reiser4 check tree emptiness using the tree root.

Signed-off-by: Vladimir V. Saveliev [EMAIL PROTECTED]

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c


 fs/reiser4/jnode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c
--- linux-2.6.17-rc3-mm1/fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix	2006-05-08 12:54:12.0 +0400
+++ linux-2.6.17-rc3-mm1-vs/fs/reiser4/jnode.c	2006-05-08 12:54:57.0 +0400
@@ -432,7 +432,7 @@ static void inode_attach_jnode(jnode * n
 	inode = node->key.j.mapping->host;
 	info = reiser4_inode_data(inode);
 	rtree = jnode_tree_by_reiser4_inode(info);
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* prevent inode from being pruned when it has jnodes attached
 		   to it */
 		write_lock_irq(&inode->i_data.tree_lock);
@@ -464,7 +464,7 @@ static void inode_detach_jnode(jnode * n
 
 	/* delete jnode from inode's radix tree of jnodes */
 	check_me("zam-1046", radix_tree_delete(rtree, node->key.j.index));
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* inode can be pruned now */
 		write_lock_irq(&inode->i_data.tree_lock);
 		inode->i_data.nrpages--;

_
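
Why the height test stops working can be shown with a toy model. This is a sketch, not kernel code: struct toy_root only copies the two field names the patch touches, and toy_insert_index0() assumes (as the patch description implies) that with the direct-data change a lone item at index 0 is kept in the root pointer itself, so the height stays 0. All toy_* and empty_by_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the radix tree root; only height and rnode are modelled. */
struct toy_root {
    unsigned int height;
    void *rnode;
};

/* Assumed direct-data behaviour: a single item at index 0 is stored
 * in rnode directly and the height is left at 0. */
static void toy_insert_index0(struct toy_root *root, void *item)
{
    assert(item != NULL);
    root->rnode = item;
}

static void toy_delete_index0(struct toy_root *root)
{
    root->rnode = NULL;
}

/* Old check: wrong once a non-empty tree can have zero height. */
static bool empty_by_height(const struct toy_root *root)
{
    return root->height == 0;
}

/* New check from the patch: look at the root pointer instead. */
static bool empty_by_root(const struct toy_root *root)
{
    return root->rnode == NULL;
}
```

After toy_insert_index0() the two checks disagree: empty_by_height() still reports an empty tree, while empty_by_root() does not.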


Re: Reiser4 2.6.16.2 / 2.6.17-rc3-mm1 WARNING: out of memory?

2006-05-08 Thread Vladimir V. Saveliev
Hello

On Sun, 2006-05-07 at 17:05 -0500, Yien Zheng wrote:
 I thought the patch might have fixed it for me, but it happened again.

Sorry, the first version of the patch is not correct. Please unapply it and try
the attached one.

  I think I'm getting the same error you were too but let me paste mine
 in case it adds any additional info:
 
 kernel BUG at fs/inode.c:251!
 invalid opcode:  [#1]
 PREEMPT
 last sysfs file: /block/sdj/size
 Modules linked in: smbfs usbcore dm_mod
 CPU:0
 EIP:0060:[c015c277]Not tainted VLI
 EFLAGS: 00010286   (2.6.17-rc3-mm1 #8)
 EIP is at clear_inode+0x16/0xa5
 eax: c08fb8ac   ebx: c08fb8ac   ecx: c08fb8ac   edx: c08fb8ac
 esi: c9cf9c80   edi: c08c941c   ebp: c08fb8ac   esp: c9862f1c
 ds: 007b   es: 007b   ss: 0068
 Process emerge (pid: 8062, threadinfo=c9862000 task=cba5c590)
 Stack: 0c08fb8ac c9cf9c80 c018dc4b c08fb8ac c08fb8ac c018dbba
 c015cf0d c08fb8ac
 c86a7000 c0154c22 c08fb8ac c19c53c0 cc2e3a20 390b39cc 000d
c86a7041 0010   0296 ca037d20 ca3ab2a0 0001
 Call Trace:
  c018dc4b reiser4_delete_inode+0x91/0x9d   c018dbba
 reiser4_delete_inode+0x0/0x9d
  c015cf0d generic_delete_inode+0x6c/0xea   c0154c22 do_unlinkat+0xb7/0xfc
  c0155510 sys_renameat+0x58/0x60   c0154ca2 sys_unlink+0xb/0xe
  c02be48f syscall_call+0x7/0xb
 Code: c7 42 04 a8 03 31 c0 89 15 a8 03 31 c0 ff 0d 48 d8 39 c0 5b c3
 56 53 8b 5c 24 0c 53 e8 ae cb fe ff 83 bb c4 00 00 00 00 58 74 08 0f
 0b fb 00 f5 2a 2d c0 8b 83 1c 01 00 00 a8 10 75 08 0f 0b fc
 EIP: [c015c277] clear_inode+0x16/0xa5 SS:ESP 0068:c9862f1c
  44reiser4[emerge(8062)]: release_unix_file
 (fs/reiser4/plugin/file/file.c:2670)[vs-44]:
 WARNING: out of memory?
 4reiser4[emerge(8062)]: release_unix_file
 (fs/reiser4/plugin/file/file.c:2670)[vs-44]:
 WARNING: out of memory?
 
 On 5/6/06, Joseph Landers [EMAIL PROTECTED] wrote:
  Thanks for the patch, I still get the same (first post) error message, as
  before, only when  booting into the reiser4 partition, although now the
  system stays up a bit longer before dying, I am able to execute commands and
  run programs for a few minutes before the system halting/becoming
  unresponsive
 
  booting on ext3 and mounting the reiser4 seems to be fine, it's just
  peculiar, or maybe I am not using the reiser4 partition enough to make the
  memory problem significant?
 
  I have tried 2.6.16 to check if any bugs were introduced since then and that
  has the same problem, so I will just have to wait for a patch to fix this
  now
 
  I think the problem is in /fs/inode.c which is the kernel inode file, not
  reiser4s inode file, maybe reiser4 is sending the wrong delete inode command
  to it?
 
  It only seems to segfault once which is peculiar too?
 
 
 
 


Reiser4 used to check radix tree emptiness by comparing the tree height against 0.
With radix-tree-direct-data.patch a non-empty tree can have zero height.
This patch makes reiser4 check tree emptiness using the tree root.

Signed-off-by: Vladimir V. Saveliev [EMAIL PROTECTED]

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c


 fs/reiser4/jnode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff -puN fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix fs/reiser4/jnode.c
--- linux-2.6.17-rc3-mm1/fs/reiser4/jnode.c~reiser4-radix-tree-direct-data-fix	2006-05-08 12:54:12.0 +0400
+++ linux-2.6.17-rc3-mm1-vs/fs/reiser4/jnode.c	2006-05-08 12:54:57.0 +0400
@@ -432,7 +432,7 @@ static void inode_attach_jnode(jnode * n
 	inode = node->key.j.mapping->host;
 	info = reiser4_inode_data(inode);
 	rtree = jnode_tree_by_reiser4_inode(info);
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* prevent inode from being pruned when it has jnodes attached
 		   to it */
 		write_lock_irq(&inode->i_data.tree_lock);
@@ -464,7 +464,7 @@ static void inode_detach_jnode(jnode * n
 
 	/* delete jnode from inode's radix tree of jnodes */
 	check_me("zam-1046", radix_tree_delete(rtree, node->key.j.index));
-	if (rtree->height == 0) {
+	if (rtree->rnode == NULL) {
 		/* inode can be pruned now */
 		write_lock_irq(&inode->i_data.tree_lock);
 		inode->i_data.nrpages--;

_


Re: [BUG] Reiserfs panic while running fsstress due to multiple truncate safe links for a file.

2006-05-08 Thread Vladimir V. Saveliev
Hello

On Mon, 2006-05-08 at 17:03 +0530, Suzuki wrote:
 Resending, since there were no responses to the earlier post.
 
 Hi,
 
 
 I was working on a reiserfs panic with 2.6.17-rc3, while running fs
 stress tests.
 
 The panic message looked like :
 
  REISERFS: panic (device Null superblock): reiserfs[4248]: assertion
 !(truncate && (REISERFS_I(inode)->i_flags & i_link_saved_truncate_mask)
 ) failed at fs/reiserfs/super.c:328:add_save_link: saved link already
 exists for truncated inode 13b5a
 
 -- Summary of the problem ---
 
 Reiserfs uses safe links (directory entries with a special key
 value) to keep track of truncated or unlinked files to ensure
 integrity across crashes.
 
 Whenever there is a truncate/unlink on a file, Reiserfs creates a safe
 link for it and deletes the link once the operation is complete.
 If the machine crashes before the operation is committed, then the next
 time the fs is mounted it will look for the saved links (easy to
 find, since they have a special key) and complete the operation that was
 unfinished.
 
 
 The problem here occurs as follows:
 
   Whenever there is an extending DIO write operation, the fs creates
 a safe link to keep the file size consistent if there is a
 crash in the middle of the DIO. This link is deleted once the write
 operation finishes.
 
   If the DIO write happens to go through a HOLE region in the file, it
 falls back to a normal buffered write, which is done through the
 address space operations prepare_write() and commit_write(). Now,
 prepare_write() might allocate blocks for the file (if needed). So if
 there is an error at a later point (say ENOSPC) in prepare_write(), we
 need to discard the allocated blocks. This is done by calling
 vmtruncate() on the file. This call leads to the reiserfs-specific
 truncate, which tries to add a save link for the file.
 
 This addition causes a reiserfs_panic, since there is already a save
 link stored for the file.
 
 
 I have a simple testcase to reproduce the problem, which does the same 
 as described above. I will attach it if required.
 
 Any thoughts on how to fix this ?
 

Thanks for the report. We will discuss how this should be fixed when the May
holidays are over here.

 thanks,
 
 Suzuki K P
 Linux Technology Centre,
 IBM Software Labs.
 
 
 
 



Re: bad bread

2006-05-08 Thread Valdis . Kletnieks
On Sun, 07 May 2006 10:35:44 +0200, PFC said:
 
  In the event of physical HD failure, the procedure goes like this:
 
   Get mail saying a HDD is dead. Replace harddisk, resynchronize RAID.
   Use Linux software RAID. Harddrives are cheaper than the time you'll lose
   trying to recover your data.

Remember to take backups *anyhow*.  That way, if the RAID controller dumps
cow manure on all the sectors, you won't be saying "Oh, SH*T."

Also, note that there exist buggy RAID controllers, where if you are doing
mirroring to 2 disks, and they develop bad blocks at different locations,
you can trash the mirror by resynchronizing (basically, you swap out one of
the bad disks, re-sync, it progresses as far as the bad block on the source
for the mirror, and dies).





Re: Comparing LFS and reiserfs4

2006-05-08 Thread Nate Diller

On 5/7/06, Kristian Koehntopp [EMAIL PROTECTED] wrote:


I am looking for a paper that contrasts the write strategies and organisation
of reiser4 vs. the old Sprite and BSD Log Based File System (LFS).

Does such a thing exist?


no, but it should, it would be very interesting.  there are a couple
out-of-tree LFS implementations for linux floating around, too, that
could be compared in a set of benchmarks.

Reiser4's write strategy could be said to be a combination of LFS and
WAFL, since the original LFS used inodes and indirect blocks, where
WAFL and Reiser4 use tree structures.  Reiser4 also does some in-place
overwrite updates, where the data is written to journal blocks for
atomicity, but is then copied over the old data to optimize read
performance.  So although Reiser4 always flushes data to a continuous
stream like LFS, it sometimes does extra work also.

The comments at the top of the Reiser4 source files are probably the
most detailed and up-to-date descriptions of the flush strategy.  it's
not much, i admit, but it's better than most other linux code...

NATE


Re: bad bread

2006-05-08 Thread Sander
[EMAIL PROTECTED] wrote (ao):
 On Sun, 07 May 2006 10:35:44 +0200, PFC said:
   In the event of physical HD failure, the procedure goes like this:
  
  Get mail saying a HDD is dead. Replace harddisk, resynchronize RAID.
  Use Linux software RAID. Harddrives are cheaper than the time you'll lose
  trying to recover your data.
 
 Remember to take backups *anyhow*. That way, if the RAID controller dumps
 cow manure on all the sectors, you won't be saying "Oh, SH*T."

Or user error (rm -rf, fdisk, dd, mkswap) or bad memory or fire or
broken new kernel or script kiddies or worms/viruses or ..

With kind regards, Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


Re: reiser4 bug [was Re: 2.6.17-rc3-mm1]

2006-05-08 Thread Alexander Gran
Nope, did not work...
regards
Alex

Am Dienstag, 9. Mai 2006 01:21 schrieb Joe Feise:
 Try the patch from here:
 http://marc.theaimsgroup.com/?l=reiserfs&m=114709188305181&w=2
 That helped me get past the bootup phase (currently 8 hours uptime).

  -Joe

 Alexander Gran writes:
  Hi all,
 
  2.6.17-rc3-mm1 doesn't get up running  here, it bugs around while init
  runs: I cannot login afterwards, and syslog did not get the bug too. So
  here are some poor screenshots from my Treo650 (digicam is broken,
  sorry..;) EIP is in clear_inode.
  Trace:
  reiser4_delete_inode+0x6c/0xd0
  d_delete+0xf0/0x10f
  reiser4_delete_inode+0x0/0xd0
  generic_delete_inode+0x6b/0xfb
  input+0x5c/0x68
  do_unlinkat+0xd7/0x12c
  sysenter_past_esp+0x54/0x75
  __hidp_send_ctrl_message+0xb4/0xfa
  details:
  http://zodiac.dnsalias.org/images/1.jpg
  http://zodiac.dnsalias.org/images/2.jpg
  http://zodiac.dnsalias.org/images/3.jpg
  http://zodiac.dnsalias.org/images/4.jpg
  Kernel config:
  http://zodiac.dnsalias.org/images/config
  System is my T40p, as usual. running an up2date debian unstable.
 
  regards
  Alex

-- 
Encrypted Mails welcome.
PGP-Key at http://zodiac.dnsalias.org/misc/pgpkey.asc | Key-ID: 0x6D7DD291

