[2.6.19-rc2-mm1] error: too few arguments to function ‘crypto_alloc_hash’

2006-10-17 Thread Andrew James Wade
Hello,

The latest -mm introduced a new error:

  CC  fs/reiser4/plugin/crypto/digest.o
fs/reiser4/plugin/crypto/digest.c: In function ‘alloc_sha256’:
fs/reiser4/plugin/crypto/digest.c:17: error: too few arguments to function 
‘crypto_alloc_hash’
make[2]: *** [fs/reiser4/plugin/crypto/digest.o] Error 1
make[1]: *** [fs/reiser4] Error 2
make: *** [fs] Error 2

Andrew Wade


[patch] Fix use after free in jrelse_tail

2006-09-01 Thread Andrew James Wade
Hello Alexander,

[nikita-1936] assertion failed: reiser4_no_counters_are_held()
turned out to be a bug in the debugging code. I've applied the patch
below and haven't had a recurrence.

Cheers,
Andrew Wade

signed-off-by [EMAIL PROTECTED]

diff -rupN a/fs/reiser4/jnode.c b/fs/reiser4/jnode.c
--- a/fs/reiser4/jnode.c  2006-09-01 16:44:51.0 -0400
+++ b/fs/reiser4/jnode.c  2006-09-01 16:58:06.0 -0400
@@ -999,10 +999,10 @@ void jrelse_tail(jnode * node /* jnode t
 {
 	assert("nikita-489", atomic_read(&node->d_count) > 0);
 	atomic_dec(&node->d_count);
-	/* release reference acquired in jload_gfp() or jinit_new() */
-	jput(node);
 	if (jnode_is_unformatted(node) || jnode_is_znode(node))
 		LOCK_CNT_DEC(d_refs);
+	/* release reference acquired in jload_gfp() or jinit_new() */
+	jput(node);
 }
 
 /* drop reference to node data. When last reference is dropped, data are
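
The ordering issue the patch addresses can be illustrated with a toy
reference-counted object. This is only a sketch in Python, not reiser4 code:
the class Node and the functions release_buggy and release_fixed are
hypothetical stand-ins, and "freed" is modelled with a flag because Python
has no real use-after-free.

```python
class Node:
    """Hypothetical stand-in for a reference-counted jnode."""

    def __init__(self, kind):
        self.kind = kind
        self.refs = 1
        self.freed = False

    def put(self):
        # Drop one reference; the object counts as freed at zero.
        self.refs -= 1
        if self.refs == 0:
            self.freed = True

    def is_znode(self):
        # Any inspection after the last put() is a use after free.
        if self.freed:
            raise RuntimeError("use after free")
        return self.kind == "znode"


def release_buggy(node):
    # Old order: drop the reference first, then inspect the node.
    node.put()
    return node.is_znode()


def release_fixed(node):
    # New order: inspect the node while the reference is still held.
    result = node.is_znode()
    node.put()
    return result
```

With a single reference held, release_fixed() inspects the node safely, while
release_buggy() touches it after the final put().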


Re: assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

2006-08-30 Thread Andrew James Wade
On Wednesday 30 August 2006 06:26, Alexander Zarochentsev wrote:
> On 30 August 2006 01:38, Andrew James Wade wrote:
> > I now have a stack trace for this assertion:
>
> there is a race between znode_make_dirty and flushing dirty node to
> disk.  I guess (but not sure by 100%) it has no bad effect so the
> assertion is wrong.

Okay, I'll change that to a WARN_ON in my tree and see what falls out.

Thanks,
Andrew Wade


[patch] Re: assertion failed: can_hit_entd(ctx, s)

2006-08-29 Thread Andrew James Wade
Hello Alexander,

In addition to your patch, I've also applied the patch below. With
these two patches the fs is much more stable for me.

However, something is holding a d_ref across the calls to
reiser4_writepage. It's not clear to me that this is allowed so my
patch may not be a full fix.

Andrew Wade

signed-off-by: [EMAIL PROTECTED]

diff -rupN a/fs/reiser4/plugin/item/extent_file_ops.c b/fs/reiser4/plugin/item/extent_file_ops.c
--- a/fs/reiser4/plugin/item/extent_file_ops.c  2006-08-28 11:30:33.0 -0400
+++ b/fs/reiser4/plugin/item/extent_file_ops.c  2006-08-29 13:06:20.0 -0400
@@ -1320,20 +1320,22 @@ static int extent_readpage_filler(void *
 					TWIG_LEVEL, CBK_UNIQUE, NULL);
 		if (result != CBK_COORD_FOUND) {
 			reiser4_unset_hint(hint);
-			return result;
+			goto out;
 		}
 		ext_coord->valid = 0;
 	}
 
 	if (zload(ext_coord->coord.node)) {
 		reiser4_unset_hint(hint);
-		return RETERR(-EIO);
+		result = RETERR(-EIO);
+		goto out;
 	}
 	if (!item_is_extent(&ext_coord->coord)) {
 		/* tail conversion is running in parallel */
 		zrelse(ext_coord->coord.node);
 		reiser4_unset_hint(hint);
-		return RETERR(-EIO);
+		result = RETERR(-EIO);
+		goto out;
 	}
 
 	if (ext_coord->valid == 0)
@@ -1358,6 +1360,10 @@ static int extent_readpage_filler(void *
 	} else
 		reiser4_unset_hint(hint);
 	zrelse(ext_coord->coord.node);
+
+out:
+	/* Calls to this function may be intermingled with VM writeback. */
+	reiser4_txn_restart_current();
 	return result;
 }
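
The shape of this change, routing every early return through one exit point
so that a cleanup step always runs, can be sketched outside the kernel. The
Python below is only an illustration under assumed names: Context, restart
and read_page are hypothetical, and try/finally stands in for the "out:"
label.

```python
class Context:
    """Hypothetical stand-in for per-call transaction state."""

    def __init__(self):
        self.restarts = 0

    def restart(self):
        self.restarts += 1


def read_page(ctx, found=True, loadable=True):
    """Return 0 on success or a negative error code; always restart ctx."""
    try:
        if not found:
            return -2  # lookup failed
        if not loadable:
            return -5  # I/O error
        return 0
    finally:
        # Runs on every return path, like the single exit label in the patch.
        ctx.restart()
```

Whichever branch returns, the restart count goes up by exactly one per call.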
 


Re: assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

2006-08-29 Thread Andrew James Wade
I now have a stack trace for this assertion:


reiser4 panicked cowardly: reiser4[tar(5412)]: reiser4_set_page_dirty_internal 
(fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)
 [c0103870] dump_trace+0x64/0x1ad
 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0caf] reiser4_do_panic+0x4e/0x84
 [c01e66e9] reiser4_set_page_dirty_internal+0xe5/0xee
 [c01cb2a4] znode_make_dirty+0x271/0x452
 [c022108f] cut_node40+0x191/0x1a6
 [c0221410] shift_node40+0x36c/0x91d
 [c01b11ba] carry_shift_data+0xaa/0x139
 [c01b2e1b] carry_insert_flow+0x1de/0x837
 [c01b008f] reiser4_carry+0x185/0x49a
 [c01b8d77] reiser4_insert_flow+0x16b/0x17e
 [c022deec] reiser4_write_tail+0x5cd/0x685
 [c020445f] batch_write_unix_file+0x26e/0x467
 [c0133347] generic_file_buffered_write+0xd2/0x1fb
 [c0135035] __generic_file_aio_write_nolock+0x3a8/0x3e5
 [c01350ca] generic_file_aio_write+0x58/0xab
 [c014eda6] do_sync_write+0xb4/0xf2
 [c014f38e] vfs_write+0x8a/0x136
 [c014faca] sys_write+0x3b/0x60
 [c01028cd] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d

Leftover inexact backtrace:

 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0caf] reiser4_do_panic+0x4e/0x84
 [c01e66e9] reiser4_set_page_dirty_internal+0xe5/0xee
 [c01cb2a4] znode_make_dirty+0x271/0x452
 [c022108f] cut_node40+0x191/0x1a6
 [c0221410] shift_node40+0x36c/0x91d
 [c01b11ba] carry_shift_data+0xaa/0x139
 [c01b2e1b] carry_insert_flow+0x1de/0x837
 [c01b008f] reiser4_carry+0x185/0x49a
 [c01b8d77] reiser4_insert_flow+0x16b/0x17e
 [c022deec] reiser4_write_tail+0x5cd/0x685
 [c020445f] batch_write_unix_file+0x26e/0x467
 [c0133347] generic_file_buffered_write+0xd2/0x1fb
 [c0135035] __generic_file_aio_write_nolock+0x3a8/0x3e5
 [c01350ca] generic_file_aio_write+0x58/0xab
 [c014eda6] do_sync_write+0xb4/0xf2
 [c014f38e] vfs_write+0x8a/0x136
 [c014faca] sys_write+0x3b/0x60
 [c01028cd] sysenter_past_esp+0x56/0x8d
 ===
: jnode: 0, tree: 0 (r:0,w:0), dk: 0 (r:0,w:0)
jload: 0, txnh: 0, atom: 0, stack: 0, txnmgr: 0, ktxnmgrd: 0, fq: 0
inode: 0, cbk_cache: 0 (r:0,w0), eflush: 0, zlock: 0,
spin: 0, long: 3 inode_sem: (r:0,w:1)
d: 3, x: 6, t: 0
Kernel panic - not syncing: reiser4[tar(5412)]: reiser4_set_page_dirty_internal 
(fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

Andrew Wade


Re: assertion failed: can_hit_entd(ctx, s)

2006-08-25 Thread Andrew James Wade
> btw, is [EMAIL PROTECTED]
> your mail address (it is from Reply-To:) ?

Reply-To fixed; thanks. The above address is an ephemeral one that I've
subscribed to the mailing list with, and it could go away at any time.

> can you please try the following patch:

Will do.

Andrew Wade



Re: Reiser4 stress test.

2006-08-22 Thread Andrew James Wade
On Tuesday 22 August 2006 01:23, Hans Reiser wrote:
> Thanks Andrew, please be patient and persistent with us at this time, as
> one programmer is on vacation, and the other is only able to work a few
> hours a day due to an illness.

No problem. I'll post what I find to the list; the posts will still be
there when you have the time to devote to solving bugs. The delay will
do me no harm whatsoever and I may even get to the bottom of one or
two bugs in the meantime. (I happen to have time to spare at the
moment).

Andrew Wade


Re: Reiser4 stress test.

2006-08-22 Thread Andrew James Wade
On Tuesday 22 August 2006 01:23, Hans Reiser wrote:
> Thanks Andrew, please be patient and persistent with us at this time, as
> one programmer is on vacation, and the other is only able to work a few
> hours a day due to an illness.

No problem. I'll post what I find to the list; the posts will still be
there when you have time to devote to chasing bugs. They're not urgent
problems for me; I just happen to have the time and interest to devote
myself to solving them right now, and it appears I'll be able to muddle
through the code okay.

Andrew


Reiser4 stress test.

2006-08-21 Thread Andrew James Wade
Hello,

 I've been having problems with Reiser 4 panicking for a few
months, and I've recently had time to investigate the matter. I've
created a program that can crash my system in a few minutes. It's
based on kmail's disk activity and consists of small, separated writes
to a file that is also mmapped.

=== scatteredwrites ===
#!/usr/bin/python

import os
import mmap
import optparse

parser = optparse.OptionParser(description=
"Creates a file in $CWD and performs a pattern of reads and writes to it in an "
"attempt to trigger fs bugs. The file is broken up into regions: for each "
"region the entire region is read, then some portion of it is written to."
"\nDistilled from kmail workload.")

parser.add_option("--region-size", dest="regionsize", default=65536,
                  type="int", help="Set region size to BYTES", metavar="BYTES")
parser.add_option("--region-count", dest="regioncount", default=2048,
                  type="int", help="Set number of regions to COUNT", metavar="COUNT")
parser.add_option("--write-offset", dest="writeoffset", default=0,
                  type="int", help="Offset write by BYTES in each region", metavar="BYTES")
parser.add_option("--write-size", dest="writesize", default=256,
                  type="int", help="Size of write in each region.", metavar="BYTES")

options, args = parser.parse_args()


f = open("scatteredwrites.%d.tmp" % (os.getpid()), "w+b")

try:
    writestr = "A" * options.regionsize
    for i in xrange(options.regioncount):
        f.write(writestr)
    f.close()

    f = open("scatteredwrites.%d.tmp" % (os.getpid()), "r+b")

    writestr = "B" * options.writesize

    dummy = mmap.mmap(f.fileno(), options.regionsize * options.regioncount,
                      mmap.MAP_SHARED)

    while True:
        for i in xrange(options.regioncount):
            f.seek(i * options.regionsize, 0)
            f.read(options.regionsize)
            f.seek(- options.regionsize + options.writeoffset,1)
            f.write(writestr)

except KeyboardInterrupt:
    os.unlink("scatteredwrites.%d.tmp" % (os.getpid()))

==
Without fs load this stress test rarely causes problems. But with five
instances running in parallel with five instances of a large grep (or
patch, or tar), my computer crashes on a timescale of 10 minutes.
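
For anyone who wants to try the same access pattern on a current Python
without an unbounded loop, here is a minimal Python 3 sketch. It is not the
script above: the function name scattered_writes and the passes parameter are
assumptions added for illustration, and the default sizes are deliberately
small, so it is unlikely to stress a filesystem the way the original does.

```python
import mmap
import os
import tempfile


def scattered_writes(path, region_size=4096, region_count=8,
                     write_offset=0, write_size=256, passes=2):
    """Read each region in full, then overwrite part of it, with the file mmapped."""
    # Fill the file with region_count regions of b"A".
    with open(path, "w+b") as f:
        for _ in range(region_count):
            f.write(b"A" * region_size)

    payload = b"B" * write_size
    with open(path, "r+b") as f:
        # Keep a mapping of the whole file open while the writes happen.
        mapping = mmap.mmap(f.fileno(), region_size * region_count)
        try:
            for _ in range(passes):
                for i in range(region_count):
                    f.seek(i * region_size)
                    f.read(region_size)
                    f.seek(i * region_size + write_offset)
                    f.write(payload)
            f.flush()
        finally:
            mapping.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scattered_writes(os.path.join(tmp, "scatteredwrites.tmp"))
```

The bounded passes argument replaces the original "while True" so the sketch
terminates on its own; raising the sizes and running several copies alongside
other disk activity would be closer to the workload described here.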


I've also added a few patches to my kernel to help me debug the
problems I've been having:

diff -rupN a/fs/reiser4/page_cache.c b/fs/reiser4/page_cache.c
--- a/fs/reiser4/page_cache.c   2006-08-19 19:45:57.0 -0400
+++ b/fs/reiser4/page_cache.c   2006-08-19 20:23:43.0 -0400
@@ -489,12 +489,9 @@ static int can_hit_entd(reiser4_context 
 		return 1;
 	if (ctx->super != s)
 		return 1;
-	if (get_super_private(s)->entd.tsk == current)
-		return 0;
-	if (!lock_stack_isclean(&ctx->stack))
-		return 0;
-	if (ctx->trans->atom != NULL)
-		return 0;
+	assert("ajw-1", get_super_private(s)->entd.tsk != current);
+	assert("ajw-2", lock_stack_isclean(&ctx->stack));
+	assert("ajw-3", ctx->trans->atom == NULL);
 	return 1;
 }
 
diff -rupN 2.6.18-rc4-mm1/fs/reiser4/debug.c linux/fs/reiser4/debug.c
--- 2.6.18-rc4-mm1/fs/reiser4/debug.c   2006-08-18 19:21:13.0 -0400
+++ linux/fs/reiser4/debug.c   2006-08-18 19:24:35.0 -0400
@@ -56,6 +56,9 @@ static char panic_buf[REISER4_PANIC_MSG_
  */
 static DEFINE_SPINLOCK(panic_guard);
 
+static void print_lock_counters(const char *prefix,
+				const reiser4_lock_counters_info * info);
+
 /* Your best friend. Call it on each occasion.  This is called by
    fs/reiser4/debug.h:reiser4_panic(). */
 void reiser4_do_panic(const char *format /* format string */ , ... /* rest */ )
@@ -74,6 +77,8 @@ void reiser4_do_panic(const char *format
 	vsnprintf(panic_buf, sizeof(panic_buf), format, args);
 	va_end(args);
 	printk(KERN_EMERG "reiser4 panicked cowardly: %s", panic_buf);
+	dump_stack();
+	print_lock_counters("",reiser4_lock_counters());
 	spin_unlock(&panic_guard);
 
 	/*
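
The idea behind the page_cache.c change above is to replace three silent
"return 0" paths with three labelled assertions, so that a panic names the
condition that failed. A rough Python sketch of that idea follows; the
function names and boolean parameters are hypothetical, and only the labels
ajw-1 to ajw-3 come from the patch.

```python
def can_hit_entd_checks(is_entd_task, lock_stack_clean, has_atom):
    """Pair each condition with the label used in the patch."""
    return [
        ("ajw-1", not is_entd_task),
        ("ajw-2", lock_stack_clean),
        ("ajw-3", not has_atom),
    ]


def first_failed_check(checks):
    """Return the label of the first check that does not hold, or None."""
    for label, ok in checks:
        if not ok:
            return label
    return None
```

For example, a context that still has an atom attached is reported as ajw-3
instead of as an anonymous failure of the combined predicate.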

I've also added this bugfix by Alexander Zarochentsev [EMAIL PROTECTED]:

Index: linux-2.6-git/fs/reiser4/as_ops.c
===
--- linux-2.6-git.orig/fs/reiser4/as_ops.c
+++ linux-2.6-git/fs/reiser4/as_ops.c
@@ -350,6 +350,11 @@ int reiser4_releasepage(struct page *pag
 	if (PageDirty(page))
 		return 0;
 
+	/* extra page reference is used by reiser4 to protect
+	 * jnode<->page link from this ->releasepage(). */
+	if (page_count(page) > 3)
+		return 0;
+
 	/* releasable() needs jnode lock, because it looks at the jnode fields
 	 * and we need jload_lock here to avoid races with jload(). */
 	spin_lock_jnode(node);


Andrew Wade


assertion failed: can_hit_entd(ctx, s)

2006-08-21 Thread Andrew James Wade
This is the most common panic I've been getting. It looks like this:
(2.6.18-rc4-mm1)

reiser4 panicked cowardly: reiser4[scatteredwrites(4506)]: reiser4_writepage 
(fs/reiser4/page_cache.c:522)[]:
assertion failed: can_hit_entd(ctx, s)
Kernel panic - not syncing: reiser4[scatteredwrites(4506)]: reiser4_writepage 
(fs/reiser4/page_cache.c:522)[]:
assertion failed: can_hit_entd(ctx, s)

With the extra patches it looks like this:
(2.6.18-rc4-mm2)

reiser4 panicked cowardly: reiser4[grep(4918)]: can_hit_entd 
(fs/reiser4/page_cache.c:494)[ajw-3]:
assertion failed: ctx->trans->atom == NULL
 [c0103870] dump_trace+0x64/0x1ad
 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0ccf] reiser4_do_panic+0x4e/0x7b
 [c01e67b1] reiser4_writepage+0xab/0x1a8
 [c013b973] shrink_inactive_list+0x37d/0x6f0
 [c013bd94] shrink_zone+0xae/0xcc
 [c013c265] try_to_free_pages+0x139/0x20d
 [c0136f12] __alloc_pages+0x189/0x27d
 [c014c2ce] cache_alloc_refill+0x2d2/0x5a0
 [c014bfc7] kmem_cache_alloc+0x70/0xa5
 [c01eb68c] reiser4_alloc_inode+0x51/0xfa
 [c0163adc] alloc_inode+0x14/0x122
 [c0164ad5] iget5_locked+0x3f/0x132
 [c01f4091] reiser4_iget+0x8b/0x361
 [c01fadd8] reiser4_lookup_common+0xef/0x151
 [c015aef7] do_lookup+0xa0/0x13d
 [c015b72f] __link_path_walk+0x79b/0xbd4
 [c015bbb6] link_path_walk+0x4e/0xc6
 [c015c0e3] do_path_lookup+0x203/0x21d
 [c015c544] __path_lookup_intent_open+0x44/0x76
 [c015c5d2] path_lookup_open+0x10/0x12
 [c015c7c7] open_namei+0x61/0x570
 [c014e72d] do_filp_open+0x1f/0x35
 [c014e83e] do_sys_open+0x3f/0xba
 [c014e8e5] sys_open+0x16/0x18
 [c01028cd] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d

Leftover inexact backtrace:

 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0ccf] reiser4_do_panic+0x4e/0x7b
 [c01e67b1] reiser4_writepage+0xab/0x1a8
 [c013b973] shrink_inactive_list+0x37d/0x6f0
 [c013bd94] shrink_zone+0xae/0xcc
 [c013c265] try_to_free_pages+0x139/0x20d
 [c0136f12] __alloc_pages+0x189/0x27d
 [c014c2ce] cache_alloc_refill+0x2d2/0x5a0
 [c014bfc7] kmem_cache_alloc+0x70/0xa5
 [c01eb68c] reiser4_alloc_inode+0x51/0xfa
 [c0163adc] alloc_inode+0x14/0x122
 [c0164ad5] iget5_locked+0x3f/0x132
 [c01f4091] reiser4_iget+0x8b/0x361
 [c01fadd8] reiser4_lookup_common+0xef/0x151
 [c015aef7] do_lookup+0xa0/0x13d
 [c015b72f] __link_path_walk+0x79b/0xbd4
 [c015bbb6] link_path_walk+0x4e/0xc6
 [c015c0e3] do_path_lookup+0x203/0x21d
 [c015c544] __path_lookup_intent_open+0x44/0x76
 [c015c5d2] path_lookup_open+0x10/0x12
 [c015c7c7] open_namei+0x61/0x570
 [c014e72d] do_filp_open+0x1f/0x35
 [c014e83e] do_sys_open+0x3f/0xba
 [c014e8e5] sys_open+0x16/0x18
 [c01028cd] sysenter_past_esp+0x56/0x8d
 ===
: jnode: 0, tree: 0 (r:0,w:0), dk: 0 (r:0,w:0)
jload: 0, txnh: 0, atom: 0, stack: 0, txnmgr: 0, ktxnmgrd: 0, fq: 0
inode: 0, cbk_cache: 0 (r:0,w0), eflush: 0, zlock: 0,
spin: 0, long: 0 inode_sem: (r:0,w:0)
d: 0, x: 0, t: 0
Kernel panic - not syncing: reiser4[grep(4918)]: can_hit_entd 
(fs/reiser4/page_cache.c:494)[ajw-3]:
assertion failed: ctx->trans->atom == NULL




reiser4 panicked cowardly: reiser4[scatteredwrites(4245)]: can_hit_entd 
(fs/reiser4/page_cache.c:494)[ajw-3]:
assertion failed: ctx->trans->atom == NULL
 [c0103870] dump_trace+0x64/0x1ad
 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0ccf] reiser4_do_panic+0x4e/0x7b
 [c01e67b1] reiser4_writepage+0xab/0x1a8
 [c013b973] shrink_inactive_list+0x37d/0x6f0
 [c013bd94] shrink_zone+0xae/0xcc
 [c013c265] try_to_free_pages+0x139/0x20d
 [c0136f12] __alloc_pages+0x189/0x27d
 [c01388a7] __do_page_cache_readahead+0xcc/0x1d2
 [c0138f07] blockable_page_cache_readahead+0x51/0xd9
 [c0139010] make_ahead_window+0x81/0xa4
 [c013918a] page_cache_readahead+0x157/0x176
 [c023aa82] reiser4_read_extent+0x374/0x6ab
 [c020511f] read_unix_file+0x5c7/0x762
 [c014f1e2] vfs_read+0x88/0x134
 [c014fa4e] sys_read+0x3b/0x60
 [c01028cd] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d

Leftover inexact backtrace:

 [c01039cb] show_trace_log_lvl+0x12/0x25
 [c0103cc1] show_trace+0xd/0x10
 [c0103cdb] dump_stack+0x17/0x19
 [c01a0ccf] reiser4_do_panic+0x4e/0x7b
 [c01e67b1] reiser4_writepage+0xab/0x1a8
 [c013b973] shrink_inactive_list+0x37d/0x6f0
 [c013bd94] shrink_zone+0xae/0xcc
 [c013c265] try_to_free_pages+0x139/0x20d
 [c0136f12] __alloc_pages+0x189/0x27d
 [c01388a7] __do_page_cache_readahead+0xcc/0x1d2
 [c0138f07] blockable_page_cache_readahead+0x51/0xd9
 [c0139010] make_ahead_window+0x81/0xa4
 [c013918a] page_cache_readahead+0x157/0x176
 [c023aa82] reiser4_read_extent+0x374/0x6ab
 [c020511f] read_unix_file+0x5c7/0x762
 [c014f1e2] vfs_read+0x88/0x134
 [c014fa4e] sys_read+0x3b/0x60
 [c01028cd] sysenter_past_esp+0x56/0x8d
 ===
: jnode: 0, tree: 0 (r:0,w:0), dk: 0 (r:0,w:0)
jload: 0, txnh: 0, atom: 0, stack: 0, 

assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

2006-08-21 Thread Andrew James Wade

This one hasn't recurred, so I don't have a stack trace. I haven't
looked into it.
(2.6.18-rc4-mm1)

reiser4 panicked cowardly: reiser4[patch(9302)]: reiser4_set_page_dirty_internal
(fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)
Kernel panic - not syncing: reiser4[patch(9302)]: reiser4_set_page_dirty_internal
(fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

Andrew Wade


assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))

2006-08-21 Thread Andrew James Wade

I looked at this one for a bit; I couldn't make any headway. I don't
fully understand what the debugging code for the delimiting keys is
doing.

(2.6.18-rc4-mm1)

reiser4 panicked cowardly: reiser4[ent:hdb1!(2167)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:814)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))
Kernel panic - not syncing: reiser4[ent:hdb1!(2167)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:814)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))



(2.6.18-rc4-mm1)

reiser4 panicked cowardly: reiser4[ent:hdb1!(2175)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:814)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))
 [c0103754] dump_trace+0x64/0x181
 [c0103883] show_trace_log_lvl+0x12/0x25
 [c0103b79] show_trace+0xd/0x10
 [c0103b93] dump_stack+0x17/0x19
 [c01a0663] reiser4_do_panic+0x4e/0x7b
 [c01ee6bd] sibling_list_remove+0x85/0x52e
 [c01ba97d] forget_znode+0x22b/0x33b
 [c01b76e0] longterm_unlock_znode+0x268/0x723
 [c01da260] handle_pos_on_formatted+0x35c/0x45f
 [c01da3fc] handle_pos_on_leaf+0x4d/0x61
 [c01d6a84] squalloc+0x16/0x52
 [c01d89f7] jnode_flush+0x80e/0x99d
 [c01d8fee] flush_current_atom+0x468/0x722
 [c01cf073] flush_some_atom+0x9c3/0xb13
 [c01f4216] reiser4_writeout+0x1a6/0x30c
 [c01f554b] entd+0x1e2/0x3d5
 [c0124545] kthread+0xaf/0xde
 [c03eb415] kernel_thread_helper+0x5/0xb
DWARF2 unwinder stuck at kernel_thread_helper+0x5/0xb
Leftover inexact backtrace:
 [c0103883] show_trace_log_lvl+0x12/0x25
 [c0103b79] show_trace+0xd/0x10
 [c0103b93] dump_stack+0x17/0x19
 [c01a0663] reiser4_do_panic+0x4e/0x7b
 [c01ee6bd] sibling_list_remove+0x85/0x52e
 [c01ba97d] forget_znode+0x22b/0x33b
 [c01b76e0] longterm_unlock_znode+0x268/0x723
 [c01da260] handle_pos_on_formatted+0x35c/0x45f
 [c01da3fc] handle_pos_on_leaf+0x4d/0x61
 [c01d6a84] squalloc+0x16/0x52
 [c01d89f7] jnode_flush+0x80e/0x99d
 [c01d8fee] flush_current_atom+0x468/0x722
 [c01cf073] flush_some_atom+0x9c3/0xb13
 [c01f4216] reiser4_writeout+0x1a6/0x30c
 [c01f554b] entd+0x1e2/0x3d5
 [c0124545] kthread+0xaf/0xde
 [c03eb415] kernel_thread_helper+0x5/0xb
 ===
: jnode: 0, tree: 1 (r:0,w:1), dk: 1 (r:0,w:1)
jload: 0, txnh: 0, atom: 0, stack: 0, txnmgr: 0, ktxnmgrd: 0, fq: 0
inode: 0, cbk_cache: 0 (r:0,w0), eflush: 0, zlock: 1,
spin: 3, long: 1 inode_sem: (r:0,w:0)
d: 1, x: 4, t: -1
Kernel panic - not syncing: reiser4[ent:hdb1!(2175)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:814)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))

Andrew Wade


[nikita-1936] assertion failed: reiser4_no_counters_are_held()

2006-08-21 Thread Andrew James Wade

This one has only occurred once. I looked fairly carefully at the code for
partially converted files under the assumption that the rest was
unlikely to be buggy, but nothing stood out to me.

reiser4 panicked cowardly: reiser4[fixdep(19237)]: reiser4_done_context 
(fs/reiser4/context.c:181)[nikita-1936]:
assertion failed: reiser4_no_counters_are_held()
 [c0103754] dump_trace+0x64/0x181
 [c0103883] show_trace_log_lvl+0x12/0x25
 [c0103b79] show_trace+0xd/0x10
 [c0103b93] dump_stack+0x17/0x19
 [c01a0663] reiser4_do_panic+0x4e/0x7b
 [c01bdbc0] reiser4_exit_context+0xa1/0x575
 [c0202bc9] release_unix_file+0x1b7/0x1c2
 [c014f90b] __fput+0xbe/0x16c
 [c014f9e7] fput+0x2e/0x33
 [c014d3ec] filp_close+0x51/0x5b
 [c014ddd2] sys_close+0x70/0x93
 [c01028a5] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
 [c0103883] show_trace_log_lvl+0x12/0x25
 [c0103b79] show_trace+0xd/0x10
 [c0103b93] dump_stack+0x17/0x19
 [c01a0663] reiser4_do_panic+0x4e/0x7b
 [c01bdbc0] reiser4_exit_context+0xa1/0x575
 [c0202bc9] release_unix_file+0x1b7/0x1c2
 [c014f90b] __fput+0xbe/0x16c
 [c014f9e7] fput+0x2e/0x33
 [c014d3ec] filp_close+0x51/0x5b
 [c014ddd2] sys_close+0x70/0x93
 [c01028a5] sysenter_past_esp+0x56/0x8d
 ===
: jnode: 0, tree: 0 (r:0,w:0), dk: 0 (r:0,w:0)
jload: 0, txnh: 0, atom: 0, stack: 0, txnmgr: 0, ktxnmgrd: 0, fq: 0
inode: 0, cbk_cache: 0 (r:0,w0), eflush: 0, zlock: 0,
spin: 0, long: 0 inode_sem: (r:0,w:0)
d: 1, x: -2, t: -2
Kernel panic - not syncing: reiser4[fixdep(19237)]: reiser4_done_context 
(fs/reiser4/context.c:181)[nikita-1936]:
assertion failed: reiser4_no_counters_are_held()

I should be looking for an un-zrelse'd znode for this bug, correct?

Andrew Wade


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-17 Thread Andrew James Wade
On Wednesday 16 August 2006 09:32, Benjamin Vander Jagt wrote:

> I am having the exact same problems but with one difference.  After
> a while, the drive starts thrashing, and the system becomes totally
> unresponsive.

I've been getting occasional short freezes of a couple of minutes.
But that's probably unrelated: as I have debugging turned on and am
deliberately stressing the fs, poor performance is not unexpected.

...
> Andrew, may I ask for the contents of your /proc/meminfo file?

Sure:

MemTotal:       512648 kB
MemFree:         70612 kB
Buffers:          2800 kB
Cached:         105236 kB
SwapCached:      33812 kB
Active:         335988 kB
Inactive:        63028 kB
SwapTotal:     9791608 kB
SwapFree:      9757768 kB
Dirty:              84 kB
Writeback:           0 kB
AnonPages:      267960 kB
Mapped:          52792 kB
Slab:            22760 kB
PageTables:       3148 kB
NFS Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  10047932 kB
Committed_AS:   686084 kB
VmallocTotal:   515796 kB
VmallocUsed:     25572 kB
VmallocChunk:   489680 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     4096 kB

I am currently trying to distill a test-case for crashing the fs.
It is going slowly, but I have managed to provoke a few panics,
including some new ones:

reiser4 panicked cowardly: reiser4[scatteredwrites(4506)]: reiser4_writepage 
(fs/reiser4/page_cache.c:522)[]:
assertion failed: can_hit_entd(ctx, s)
Kernel panic - not syncing: reiser4[scatteredwrites(4506)]: reiser4_writepage 
(fs/reiser4/page_cache.c:522)[]:
assertion failed: can_hit_entd(ctx, s)

reiser4 panicked cowardly: reiser4[tar(4238)]: reiser4_update_extent 
(fs/reiser4/plugin/item/extent_file_ops.c:807)[]:
assertion failed: reiser4_lock_counters()->d_refs == 0
Kernel panic - not syncing: reiser4[tar(4238)]: reiser4_update_extent 
(fs/reiser4/plugin/item/extent_file_ops.c:807)[]:
assertion failed: reiser4_lock_counters()->d_refs == 0

reiser4 panicked cowardly: reiser4[patch(9302)]: 
reiser4_set_page_dirty_internal (fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)
Kernel panic - not syncing: reiser4[patch(9302)]: 
reiser4_set_page_dirty_internal (fs/reiser4/page_cache.c:475)[]:
assertion failed: JF_ISSET(jprivate(page), JNODE_DIRTY)

These are all for 2.6.18-rc4-mm1 + the small patch upthread.

Andrew Wade


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-11 Thread Andrew James Wade
On Friday 11 August 2006 05:15, Vladimir V. Saveliev wrote:
> Hello
>
> On Thursday 10 August 2006 21:55, Andrew James Wade wrote:
> > Hello,
> >
> > I've had another panic on a fscked filesystem:
> >
> > reiser4 panicked cowardly: reiser4[updatedb(3302)]: reiser4_writepage
> > (fs/reiser4/page_cache.c:521)[]: assertion failed: can_hit_entd(ctx, s)
> > Kernel panic - not syncing: reiser4[updatedb(3302)]: reiser4_writepage
> > (fs/reiser4/page_cache.c:521)[]: assertion failed: can_hit_entd(ctx, s)
> >
>
> What kernel do you use? Recently we had few fixes of such problem.

2.6.18-rc3-mm2 + the patch below.

I've been unable to observe any corruption in over 300 GB of file data
written to the hd, so I don't think I have a hardware issue.

I will continue poking away at the problem.

Andrew Wade

--

re-add to reiser4_releasepage mistakenly removed page_count check.
extra page reference is used to protect page from detaching from the 
jnode.

Signed-off-by: Alexander Zarochentsev [EMAIL PROTECTED]
---
 fs/reiser4/as_ops.c |5 +
 1 file changed, 5 insertions(+)

Index: linux-2.6-git/fs/reiser4/as_ops.c
===
--- linux-2.6-git.orig/fs/reiser4/as_ops.c
+++ linux-2.6-git/fs/reiser4/as_ops.c
@@ -350,6 +350,11 @@ int reiser4_releasepage(struct page *pag
 	if (PageDirty(page))
 		return 0;
 
+	/* extra page reference is used by reiser4 to protect
+	 * jnode<->page link from this ->releasepage(). */
+	if (page_count(page) > 3)
+		return 0;
+
 	/* releasable() needs jnode lock, because it looks at the jnode fields
 	 * and we need jload_lock here to avoid races with jload(). */
 	spin_lock_jnode(node);


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-10 Thread Andrew James Wade
Hello,

I've had another panic on a fscked filesystem:

reiser4 panicked cowardly: reiser4[updatedb(3302)]: reiser4_writepage 
(fs/reiser4/page_cache.c:521)[]:
assertion failed: can_hit_entd(ctx, s)
Kernel panic - not syncing: reiser4[updatedb(3302)]: reiser4_writepage 
(fs/reiser4/page_cache.c:521)[]:
assertion failed: can_hit_entd(ctx, s)

It's getting pretty obvious that there must be something unusual/unique
in my setup that's giving me grief. My guess would be that data is
getting corrupted going between the drive and memory. I do have my
pci bus underclocked to 30 MHz so maybe that's a factor. I have had
problems with memory corruption in the past (hence the underclocking),
but I haven't had any of the symptoms of memory corruption
re-appearing. (Note that /dev/hdb is my /home filesystem only, so
it's plausible that problems there would mostly tickle reiser4 code).

If that's what is going on, I would expect file contents to be
corrupted as well. I'm going to whip up some scripts that read and
write large amounts of data on the disk and see if I can find
corruption of the data. (I hope to be able to use O_DIRECT to
avoid thrashing).
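
A check of this kind could look roughly like the Python 3 sketch below. It is
an assumption about the approach, not the scripts actually used: the function
names are made up for illustration, and it goes through the page cache rather
than O_DIRECT, so a read-back may be served from memory unless the caches are
dropped or the file is much larger than RAM.

```python
import hashlib
import os
import tempfile


def block_for(index, size):
    """Deterministic content for block `index`, so it can be regenerated on read."""
    seed = hashlib.sha256(str(index).encode()).digest()
    return (seed * (size // len(seed) + 1))[:size]


def write_blocks(path, count, size):
    """Write `count` blocks of `size` bytes and push them out to the device."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(block_for(i, size))
        f.flush()
        os.fsync(f.fileno())


def corrupted_blocks(path, count, size):
    """Return the indices of blocks whose contents differ from what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(count):
            if f.read(size) != block_for(i, size):
                bad.append(i)
    return bad


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "blocks.bin")
        write_blocks(target, count=16, size=65536)
        print("corrupted blocks:", corrupted_blocks(target, count=16, size=65536))
```

Because each block is derived from its index, nothing needs to be kept in
memory between the write pass and the verify pass, which keeps the test usable
for files far larger than RAM.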

I suppose another possibility is that there is something strange in
my filesystem that survives fsck, but causes problems. Given the
variety of symptoms (and the lack of other reports) I would tend to
discount that though. For the record this is what fsck keeps telling
me:

FSCK: Node (33160105), item (0), [29:1(SD):0:2a:0]: the slot (9) contains the 
invalid opset member (compress mode), id (2).
FSCK: Node (33160105), item (0), [29:1(SD):0:2a:0]: removing broken slots.
FSCK: Node (33160105), item (0), [29:1(SD):0:2a:0]: item has the wrong length 
(94). Should be (90). Fixed.

I'm going to run fsck twice in a row to verify that fsck fixes the
problems, but I'm working under the assumption that what fsck is
finding is unrelated.

I think the ball is in my court: fortunately I now have time to devote
to investigation. I'll let you know what I find.

Comments?

Andrew Wade


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-06 Thread Andrew James Wade
Hello,

I have had another assertion fail. This one is with 2.6.18-rc2-mm1 +
the fix in reiser4_releasepage. This was on a filesystem that had not
been unmounted cleanly. (2.6.18-rc3-mm1 crashed on me). 

reiser4 panicked cowardly: reiser4[ktxnmgrd:hdb1:r(1977)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:813)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))
Kernel panic - not syncing: reiser4[ktxnmgrd:hdb1:r(1977)]: sibling_list_remove 
(fs/reiser4/tree_walk.c:813)[zam-32245]:
assertion failed: keyeq(znode_get_rd_key(node), znode_get_ld_key(node->right))

The next boot had this diagnostic:
reiser4[kde-config(3707)]: present_lw_sd 
(fs/reiser4/plugin/item/static_stat.c:276)[]:
WARNING: partially converted file is encountered
and is continuing to work fine. I have not yet fscked the filesystem.

I hope this helps,
Andrew Wade


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-03 Thread Andrew James Wade
I've just had some warnings show up in my kernel log. I don't know if
they're related to the troubles I've been having (I fscked after the
last panic).

reiser4[updatedb(32445)]: key_warning 
(fs/reiser4/plugin/file_plugin_common.c:513)[nikita-717]:
WARNING: Error for inode 401698 (-2)
for key: (6211c:1:656e646f727365:0:62122:0)[stat data]
reiser4[updatedb(32445)]: key_warning 
(fs/reiser4/plugin/file_plugin_common.c:513)[nikita-717]:
WARNING: Error for inode 401697 (-2)
for key: (6211c:1:6576656e74732e:0:62121:0)[stat data]
reiser4[updatedb(32445)]: key_warning 
(fs/reiser4/plugin/file_plugin_common.c:513)[nikita-717]:
WARNING: Error for inode 401694 (-2)
for key: (6211c:1:736e617073686f:0:6211e:0)[stat data]
reiser4[updatedb(32445)]: key_warning 
(fs/reiser4/plugin/file_plugin_common.c:513)[nikita-717]:
WARNING: Error for inode 401699 (-2)
for key: (6211c:1:776f6d656e5f6c:0:62123:0)[stat data]

Hope this helps,
Andrew Wade


Re: [nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-03 Thread Andrew James Wade
Thanks.

I've applied the patch, and I'll let you know if any errors recur.

Andrew Wade


[nikita-3002]: assertion failed: carry_level_invariant(doing, CARRY_DOING)

2006-08-02 Thread Andrew James Wade
Hello,

 Every few weeks reiser4 panics on me, generally while kmail is
receiving emails. Until recently, the panic was invalid opcode: 
[#1] (previously reported), but I have some new errors:

The first is:
reiser4 panicked cowardly: reiser4[less(7234)]: set_file_state 
(fs/reiser4/plugin/file/file.c:200)[vs-1162]:
assertion failed: ergo(level == LEAF_LEVEL && cbk_result == CBK_COORD_FOUND,
uf_info->container == UF_CONTAINER_TAILS)
Kernel panic - not syncing: reiser4[less(7234)]: set_file_state 
(fs/reiser4/plugin/file/file.c:200)[vs-1162]:
assertion failed: ergo(level == LEAF_LEVEL && cbk_result == CBK_COORD_FOUND,
uf_info->container == UF_CONTAINER_TAILS)

and the second is:
reiser4[patch(25956)]: carry_level_invariant (fs/reiser4/carry.c:1250)[]:
WARNING: wrong key order
reiser4 panicked cowardly: reiser4[patch(25956)]: carry_on_level 
(fs/reiser4/carry.c:356)[nikita-3002]:
assertion failed: carry_level_invariant(doing, CARRY_DOING)
Kernel panic - not syncing: reiser4[patch(25956)]: carry_on_level 
(fs/reiser4/carry.c:356)[nikita-3002]:
assertion failed: carry_level_invariant(doing, CARRY_DOING)

Both were for 2.6.18-rc2-mm1 [1]. The second error occurred on a recently 
fscked filesystem.

[1] with one patch reverted for unrelated reasons.

I hope this helps get to the root of the problem. Unfortunately, I do
not yet have a reproducible test case.

Andrew Wade


Possible circular locking dependency detected in Reiser4

2006-06-29 Thread Andrew James Wade
Hello,

I got the following warning when I ran klive:

Andrew Wade
  
 ===
 [ INFO: possible circular locking dependency detected ]
 ---
 twistd/3816 is trying to acquire lock:
  (&txnh->hlock){--..}, at: [txn_end+1011/1139] txn_end+0x3f3/0x473
 
 but task is already holding lock:
  (&atom->alock){--..}, at: [txnh_get_atom+28/120] txnh_get_atom+0x1c/0x78
 
 which lock already depends on the new lock.
 
 
 the existing dependency chain (in reverse order) is:
 
 -> #1 (&atom->alock){--..}:
[lock_acquire+94/129] lock_acquire+0x5e/0x81
[_spin_lock+35/50] _spin_lock+0x23/0x32
[try_capture+733/2499] try_capture+0x2dd/0x9c3
[longterm_lock_znode+755/1026] longterm_lock_znode+0x2f3/0x402
[seal_validate+82/288] seal_validate+0x52/0x120
[write_sd_by_inode_common+659/1328] write_sd_by_inode_common+0x293/0x530
[reiser4_update_sd+37/44] reiser4_update_sd+0x25/0x2c
[reiser4_dirty_inode+23/112] reiser4_dirty_inode+0x17/0x70
[__mark_inode_dirty+41/353] __mark_inode_dirty+0x29/0x161
[inode_setattr+345/355] inode_setattr+0x159/0x163
[setattr_common+86/131] setattr_common+0x56/0x83
[setattr_unix_file+493/507] setattr_unix_file+0x1ed/0x1fb
[notify_change+260/533] notify_change+0x104/0x215
[sys_fchmodat+151/190] sys_fchmodat+0x97/0xbe
[sys_chmod+18/20] sys_chmod+0x12/0x14
[sysenter_past_esp+86/141] sysenter_past_esp+0x56/0x8d
 
 -> #0 (&txnh->hlock){--..}:
[lock_acquire+94/129] lock_acquire+0x5e/0x81
[_spin_lock+35/50] _spin_lock+0x23/0x32
[txn_end+1011/1139] txn_end+0x3f3/0x473
[reiser4_exit_context+172/287] reiser4_exit_context+0xac/0x11f
[setattr_common+123/131] setattr_common+0x7b/0x83
[setattr_unix_file+493/507] setattr_unix_file+0x1ed/0x1fb
[notify_change+260/533] notify_change+0x104/0x215
[sys_fchmodat+151/190] sys_fchmodat+0x97/0xbe
[sys_chmod+18/20] sys_chmod+0x12/0x14
[sysenter_past_esp+86/141] sysenter_past_esp+0x56/0x8d
 
 other info that might help us debug this:
 
 2 locks held by twistd/3816:
  #0:  (&inode->i_mutex){--..}, at: [mutex_lock+8/10] mutex_lock+0x8/0xa
  #1:  (&atom->alock){--..}, at: [txnh_get_atom+28/120] txnh_get_atom+0x1c/0x78
 
 stack backtrace:
  [show_trace_log_lvl+84/253] show_trace_log_lvl+0x54/0xfd
  [show_trace+13/16] show_trace+0xd/0x10
  [dump_stack+23/25] dump_stack+0x17/0x19
  [print_circular_bug_tail+89/100] print_circular_bug_tail+0x59/0x64
  [__lock_acquire+2084/2524] __lock_acquire+0x824/0x9dc
  [lock_acquire+94/129] lock_acquire+0x5e/0x81
  [_spin_lock+35/50] _spin_lock+0x23/0x32
  [txn_end+1011/1139] txn_end+0x3f3/0x473
  [reiser4_exit_context+172/287] reiser4_exit_context+0xac/0x11f
  [setattr_common+123/131] setattr_common+0x7b/0x83
  [setattr_unix_file+493/507] setattr_unix_file+0x1ed/0x1fb
  [notify_change+260/533] notify_change+0x104/0x215
  [sys_fchmodat+151/190] sys_fchmodat+0x97/0xbe
  [sys_chmod+18/20] sys_chmod+0x12/0x14
  [sysenter_past_esp+86/141] sysenter_past_esp+0x56/0x8d


Re: reiser4 bug in 2.6.16-rc2-mm1

2006-02-16 Thread Andrew James Wade
On Friday 10 February 2006 09:22, Maarten Deprez wrote:
 Hello,
 
 reiser4 on linux 2.6.16-rc2-mm1 bugs for me in
 plugins/file/tail_conversion.c line 29, locking up a process
 sometimes, when it is reading a file.
 
 Greetings,
 Maarten Deprez
 
 

Still present in 2.6.16-rc3-mm1:

[ cut here ]
kernel BUG at fs/reiser4/plugin/file/tail_conversion.c:81!
invalid opcode:  [#1]
PREEMPT 
last sysfs file: /devices/pci:00/:00:01.0/:01:00.0/i2c-0/name
CPU:0
EIP:0060:[get_nonexclusive_access+30/49]Not tainted VLI
EFLAGS: 00010286   (2.6.16-rc3-mm1 #2) 
EIP is at get_nonexclusive_access+0x1e/0x31
eax: cc2ec288   ebx:    ecx: cb87d4e8   edx: 
esi: cb87d4e8   edi:    ebp: d30ba574   esp: d3934e00
ds: 007b   es: 007b   ss: 0068
Process kmail (pid: 21299, threadinfo=d3934000 task=d8577570)
Stack: 0c01ca9ae d94d157c d3934ed8 cb87d540 000f  00320af1 
 
   df136ef4 f000  d20b2494 1000 d94d158c   
     0127  43f543c8 2ac0f373  d94d158c 
Call Trace:
 c01ca9ae write_extent+0x68d/0xbc3   c01cd0e2 item_length_by_coord+0xb/0xf
 c01c8125 nr_units_extent+0x5/0xd   c01c94ef 
init_coord_extension_extent+0x60/0xdf
 c01b5031 set_file_state+0x26/0x5b   c01b5128 find_file_item+0xc2/0xd4
 c01ca321 write_extent+0x0/0xbc3   c01b6c46 write_flow+0x248/0x2df
 c01b74bc write_unix_file+0x343/0x4cc   c01345f6 
lru_cache_add_active+0x47/0x5d
 c01b7179 write_unix_file+0x0/0x4cc   c01488f5 vfs_write+0x83/0x122
 c014910e sys_write+0x3c/0x63   c0102ac7 sysenter_past_esp+0x54/0x75
Code: 81 c4 b0 00 00 00 89 e8 5b 5e 5f 5d c3 85 d2 89 c1 75 20 b8 00 f0 ff ff 
21 e0 8b 00 8b 80 c4 04 00 00 8b 40 40 83 78 08 00 74 08 0f 0b 51 00 b8 b8 38 
c0 89 c8 ff 00 0f 88 e1 06 00 00 c3 55 ba 
reiser4[kmail(21299)]: release_unix_file 
(fs/reiser4/plugin/file/file.c:2674)[vs-44]:
WARNING: out of memory?
reiser4[kmail(21299)]: release_unix_file 
(fs/reiser4/plugin/file/file.c:2674)[vs-44]:
WARNING: out of memory?
...



Re: Unexpected reset corrupted Reiser4 filesystem

2005-05-25 Thread Andrew James Wade
John Dong wrote:
 If these were IDE drives, the IDE writeback cache is probably the bad boy -- 
 on FreeBSD 5.x, Soft Updates is virtually broken on IDE drives because they 
 simply haven't written all the data they promised the kernel that they had.
I do indeed have an IDE drive (Seagate Barracuda) with a writeback cache.
But I thought that write barriers were now working (by flushing the writeback
cache if the drive doesn't support anything fancier). However, I couldn't find
any updates on the write barrier work since March of last year
(http://lwn.net/Articles/77074/). So the writeback cache may indeed be the
bad boy.

On May 25, 2005 12:49 am, David Masover wrote:
 That's what it all comes down to -- make backups.  The fact that you
 have journalling/transactions/fsck/batteries/RAID is all just to make it
 a little less catastrophic when stuff does fail.
Yup, time for me to make backups. Thanks for the nudge. Especially as I'm
running a bleeding-edge kernel. (I did have one eat some of my data).

Andrew

P.S. My internet connection's been flaky lately, so apologies for any
bounces. I check the mailing list archives for missed messages.


Unexpected reset corrupted Reiser4 filesystem

2005-05-24 Thread Andrew James Wade
Hello,

One of my Reiser4 filesystems was corrupted by a power glitch.
fsck fixed the corruption, but my understanding is that an
unexpected reset should not have corrupted the filesystem. I
have an image of the corrupted filesystem; is it of any use to
anyone?

Details:
kernel: 2.6.12-rc4-mm2
fsck.reiser4: 1.0.4
I was installing Oracle when the power flickered. Afterwards I was
unable to delete Oracle's directory due to what was reported
as I/O errors. fsck revealed a corrupted filesystem (FSCK:
Node (13142228), item (77), unit (0): Points to the block
(12981542) which is in the tree already. The whole subtree
is skipped.) I have an image of the partition at this point,
and dd reported no errors while copying. The image is
unfortunately a bit large to upload (70 GB), but I am happy
to run diagnostic tools against it.

Andrew Wade