Re: kernel 2.4.27 reiserfs quotas patch

2004-08-23 Thread Jan Kara
  Hello,

> I have tried using these patches and I cannot build my kernel with them.  I
> get the following error when compiling the kernel:
> 
> In file included from journal.c:42:
> /usr/src/linux-2.4.27/include/linux/module.h:22:34: linux/modversions.h: No
> such file or directory
  That seems quite strange and is probably not related to the patch - could
you try the following?
backup .config
make mrproper
copy back .config
make oldconfig
make bzImage

Honza


[PATCH] Compile fix for reiserfs quota debug

2004-11-18 Thread Jan Kara
  Hello!

  The attached patch fixes the debugging messages of the quota code in
reiserfs so that they compile. Could one of the reiserfs developers
please have a look at it so that it can be merged into mainline?

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Fix debugging quota messages in reiserfs code so that they compile.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/bitmap.c linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c
--- linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/bitmap.c  2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c 2004-11-16 16:57:33.0 +0100
@@ -956,14 +956,14 @@ static inline int blocknrs_and_prealloc_
 if (!hint->formatted_node) {
 int quota_ret;
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: allocating %d blocks id=%u", amount_needed, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: allocating %d blocks id=%u", amount_needed, hint->inode->i_uid);
 #endif
quota_ret = DQUOT_ALLOC_BLOCK_NODIRTY(hint->inode, amount_needed);
if (quota_ret)/* Quota exceeded? */
return QUOTA_EXCEEDED;
if (hint->preallocate && hint->prealloc_size ) {
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: allocating (prealloc) %d blocks id=%u", hint->prealloc_size, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: allocating (prealloc) %d blocks id=%u", hint->prealloc_size, hint->inode->i_uid);
 #endif
quota_ret = DQUOT_PREALLOC_BLOCK_NODIRTY(hint->inode, hint->prealloc_size);
if (quota_ret)
@@ -1009,7 +1009,7 @@ static inline int blocknrs_and_prealloc_
/* Free the blocks */
if (!hint->formatted_node) {
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: freeing (nospace) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: freeing (nospace) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated, hint->inode->i_uid);
 #endif
DQUOT_FREE_BLOCK_NODIRTY(hint->inode, amount_needed + hint->prealloc_size - nr_allocated); /* Free not allocated blocks */
}
@@ -1029,7 +1029,7 @@ static inline int blocknrs_and_prealloc_
 nr_allocated + REISERFS_I(hint->inode)->i_prealloc_count) {
 /* Some of preallocation blocks were not allocated */
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: freeing (failed prealloc) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated - INODE_INFO(hint->inode)->i_prealloc_count, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: freeing (failed prealloc) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated - REISERFS_I(hint->inode)->i_prealloc_count, hint->inode->i_uid);
 #endif
DQUOT_FREE_BLOCK_NODIRTY(hint->inode, amount_needed +
 hint->prealloc_size - nr_allocated -
diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/stree.c linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/stree.c
--- linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/stree.c   2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/stree.c  2004-11-16 16:57:33.0 +0100
@@ -1388,7 +1388,7 @@ int reiserfs_delete_item (struct reiserf
 do_balance(&s_del_balance, NULL, NULL, M_DELETE);
 
 #ifdef REISERQUOTA_DEBUG
-reiserfs_debug (p_s_sb, "reiserquota delete_item(): freeing %u, id=%u type=%c", quota_cut_bytes, p_s_inode->i_uid, head2type(&s_ih));
+reiserfs_debug (p_s_sb, REISERFS_DEBUG_CODE, "reiserquota delete_item(): freeing %u, id=%u type=%c", quota_cut_bytes, p_s_inode->i_uid, head2type(&s_ih));
 #endif
 DQUOT_FREE_SPACE_NODIRTY(p_s_inode, quota_cut_bytes);
 
@@ -1465,7 +1465,7 @@ void reiserfs_delete_solid_item (struct 
do_balance (&tb, NULL, NULL, M_DELETE);
if (inode) {/* Should we count quota for item? (we don't count quotas for save-links) */
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (th->t_super, "reiserquota delete_solid_item(): freeing %u id=%u type=%c", quota_cut_bytes, inode->i_uid, key2type(key));
+   reiserfs_debug (th->t_super, REISERFS_DEBUG_CODE, "reiserquota 
delete_solid_item(): freeing %u id=%u type=%c", quota_cut_bytes, inode->i_uid, 
key2ty
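
The compile fix above comes down to passing a debug class as the second argument of every debug call. A minimal userspace sketch of such a class-gated helper follows; `demo_debug`, `demo_debug_mask` and the `DEMO_DEBUG_*` constants are hypothetical names chosen for illustration and are not the reiserfs definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical debug classes; the real reiserfs constants differ. */
enum { DEMO_DEBUG_CODE = 1 << 0, DEMO_DEBUG_QUOTA = 1 << 1 };

/* Classes that are currently enabled (assumption: a simple bit mask). */
unsigned demo_debug_mask = DEMO_DEBUG_CODE;

/* Print the message only if its class is enabled.
 * Returns 1 if the message was printed, 0 if it was filtered out. */
int demo_debug(const char *fs_name, unsigned level, const char *fmt, ...)
{
    va_list args;

    assert(fs_name != NULL && fmt != NULL);
    if (!(demo_debug_mask & level))
        return 0;
    fprintf(stderr, "%s: ", fs_name);
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
    fputc('\n', stderr);
    return 1;
}
```

A call that omits the class argument no longer matches the signature, which is the kind of build failure the patch addresses.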

[PATCH] Fix reiserfs oops on small fs

2004-11-18 Thread Jan Kara
  Hello!

  The attached patch fixes an oops in reiserfs on a filesystem with just one
bitmap block - the current code always tries to return the second bitmap even
if there is none. Could someone please review it so that it can be merged
into mainline?

Honza
Fix block allocation code of reiserfs to give correct bitmap numbers
for small filesystems (with just one bitmap block).

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c linux-2.6.10-rc2-mm1-4-reisersmall/fs/reiserfs/bitmap.c
--- linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c 2004-11-16 16:57:33.0 +0100
+++ linux-2.6.10-rc2-mm1-4-reisersmall/fs/reiserfs/bitmap.c 2004-11-16 17:01:47.0 +0100
@@ -229,6 +229,9 @@ static int bmap_hash_id(struct super_blo
 unsigned long hash;
 unsigned bm;
 
+/* If there is only one bitmap, we have no choice... */
+if (SB_BMAP_NR(s) == 1)
+   return 0;
 if (id <= 2) {
bm = 1;
 } else {
@@ -613,7 +616,7 @@ dirid_groups (reiserfs_blocknr_hint_t *h
/* give a portion of the block group to metadata */
if (hint->inode)
hash += sb->s_blocksize/2;
-   hint->search_start = hash;
+   hint->search_start = hash >= SB_BLOCK_COUNT(sb) ? SB_BLOCK_COUNT(sb)-1 : hash;
 }
 }
 
@@ -642,7 +645,7 @@ oid_groups (reiserfs_blocknr_hint_t *hin
bm = bmap_hash_id(hint->inode->i_sb, oid);
hash = bm * (hint->inode->i_sb->s_blocksize << 3);
}
-   hint->search_start = hash;
+   hint->search_start = hash >= SB_BLOCK_COUNT(hint->inode->i_sb) ? SB_BLOCK_COUNT(hint->inode->i_sb)-1 : hash;
 }
 }
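
The two ideas in this patch (never pick a bitmap that does not exist, and never start a search beyond the last block) can be sketched in plain C. This is a rough illustration only: the function names are made up and the hash is a placeholder, not the one reiserfs uses.

```c
#include <assert.h>

/* Pick a bitmap index for an object id. With a single bitmap there is
 * nothing to choose from, so index 0 is the only valid answer.
 * The hash below is a placeholder, not the reiserfs hash. */
unsigned demo_bmap_for_id(unsigned bitmap_count, unsigned id)
{
    assert(bitmap_count >= 1);
    if (bitmap_count == 1)
        return 0;
    if (id <= 2)
        return 1;
    /* Spread other ids over bitmaps 1 .. bitmap_count-1. */
    return 1 + (id * 2654435761u) % (bitmap_count - 1);
}

/* Clamp a search start so it always names an existing block. */
unsigned long demo_clamp_start(unsigned long hash, unsigned long block_count)
{
    assert(block_count >= 1);
    return hash >= block_count ? block_count - 1 : hash;
}
```

With `bitmap_count == 1` the first function returns 0 for every id, which is the case the unpatched code got wrong.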
 


[PATCH] Expose sync_fs()

2004-11-18 Thread Jan Kara
  Hello!

  The attached patch makes reiserfs provide a sync_fs() function. It is
necessary for the new quota code to work correctly and to expose quota data
to user space after quotaoff. Currently the functionality is hidden
behind the write_super() call, which also seems a bit non-intuitive to me.
Do you think the patch is acceptable?

Thanks for any comments
Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Make reiserfs provide the sync_fs() function so that the quota code
has a way to reliably force a transaction to disk.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-4-reisersmall/fs/reiserfs/super.c linux-2.6.10-rc2-mm1-5-reisersync/fs/reiserfs/super.c
--- linux-2.6.10-rc2-mm1-4-reisersmall/fs/reiserfs/super.c  2004-11-16 16:40:53.0 +0100
+++ linux-2.6.10-rc2-mm1-5-reisersync/fs/reiserfs/super.c   2004-11-16 17:07:32.0 +0100
@@ -62,7 +62,7 @@ static int is_any_reiserfs_magic_string 
 static int reiserfs_remount (struct super_block * s, int * flags, char * data);
 static int reiserfs_statfs (struct super_block * s, struct kstatfs * buf);
 
-static void reiserfs_sync_fs (struct super_block * s)
+static int reiserfs_sync_fs (struct super_block * s, int wait)
 {
 if (!(s->s_flags & MS_RDONLY)) {
 struct reiserfs_transaction_handle th;
@@ -76,11 +76,12 @@ static void reiserfs_sync_fs (struct sup
 } else {
 s->s_dirt = 0;
 }
+return 0;
 }
 
 static void reiserfs_write_super(struct super_block *s)
 {
-reiserfs_sync_fs(s);
+reiserfs_sync_fs(s, 1);
 }
 
 static void reiserfs_write_super_lockfs (struct super_block * s)
@@ -526,6 +527,7 @@ struct super_operations reiserfs_sops = 
   .clear_inode  = reiserfs_clear_inode,
   .put_super = reiserfs_put_super,
   .write_super = reiserfs_write_super,
+  .sync_fs = reiserfs_sync_fs,
   .write_super_lockfs = reiserfs_write_super_lockfs,
   .unlockfs = reiserfs_unlockfs,
   .statfs = reiserfs_statfs,
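
As a rough userspace model of what this patch changes: the sync routine gains a return value and a `wait` argument, is registered in the operations table, and `write_super` becomes a thin wrapper around it. All `demo_*` names below are hypothetical; the structures are simplified stand-ins, not the kernel's `super_block` or `super_operations`.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sb;

/* Hypothetical operations table; only the callbacks used here. */
struct demo_sb_ops {
    void (*write_super)(struct demo_sb *sb);
    int (*sync_fs)(struct demo_sb *sb, int wait);   /* may be NULL */
};

struct demo_sb {
    const struct demo_sb_ops *ops;
    int dirty;          /* 1 while changes are pending */
    int sync_calls;     /* how often sync_fs ran, for illustration */
};

/* The filesystem's sync routine: flush pending changes, report success. */
int demo_sync_fs(struct demo_sb *sb, int wait)
{
    (void)wait;         /* this sketch always completes synchronously */
    sb->dirty = 0;
    sb->sync_calls++;
    return 0;
}

/* write_super becomes a thin wrapper, as in the patch. */
void demo_write_super(struct demo_sb *sb)
{
    demo_sync_fs(sb, 1);
}

const struct demo_sb_ops demo_ops = {
    .write_super = demo_write_super,
    .sync_fs = demo_sync_fs,
};

/* A caller such as quota-off can now ask for a sync directly, and must
 * tolerate filesystems that do not provide the callback. */
int demo_force_sync(struct demo_sb *sb)
{
    assert(sb != NULL && sb->ops != NULL);
    if (!sb->ops->sync_fs)
        return 0;
    return sb->ops->sync_fs(sb, 1);
}
```

The NULL check in `demo_force_sync` mirrors the fact that the callback is optional for filesystems that do not implement it.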


Re: [PATCH] Fix reiserfs oops on small fs

2004-11-19 Thread Jan Kara
> On Thu, 2004-11-18 at 12:49 +0100, Jan Kara wrote:
> >   Hello!
> > 
> >   Attached patch fixes oops of reiserfs on a filesystem with just one
> > bitmap block - current code always tries to return second bitmap even if
> > there's not any. Could someone review it please so that it can be merged
> > in mainline?
> 
> A slightly different form of this patch is in already.  Look for the
> checks in bmap_hash_id.  Are you still able to reproduce this bug on
> kernels newer than October 18?
  Sorry, I missed that the bug had already been fixed elsewhere.
It looks OK now.
        Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


[PATCH] Expose reiserfs_sync_fs()

2004-11-19 Thread Jan Kara
  Hello!

  The attached patch exposes reiserfs_sync_fs(). This call is needed by the
new quota code to write data to disk on quotaoff so that userspace can
see it afterwards. Chris Mason agrees with the patch, so I hope you
can merge it.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Make reiserfs provide the sync_fs() function so that the quota code
has a way to reliably force a transaction to disk.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/super.c linux-2.6.10-rc2-mm1-4-reisersync/fs/reiserfs/super.c
--- linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/super.c  2004-11-16 16:40:53.0 +0100
+++ linux-2.6.10-rc2-mm1-4-reisersync/fs/reiserfs/super.c   2004-11-18 20:27:03.0 +0100
@@ -62,7 +62,7 @@ static int is_any_reiserfs_magic_string 
 static int reiserfs_remount (struct super_block * s, int * flags, char * data);
 static int reiserfs_statfs (struct super_block * s, struct kstatfs * buf);
 
-static void reiserfs_sync_fs (struct super_block * s)
+static int reiserfs_sync_fs (struct super_block * s, int wait)
 {
 if (!(s->s_flags & MS_RDONLY)) {
 struct reiserfs_transaction_handle th;
@@ -76,11 +76,12 @@ static void reiserfs_sync_fs (struct sup
 } else {
 s->s_dirt = 0;
 }
+return 0;
 }
 
 static void reiserfs_write_super(struct super_block *s)
 {
-reiserfs_sync_fs(s);
+reiserfs_sync_fs(s, 1);
 }
 
 static void reiserfs_write_super_lockfs (struct super_block * s)
@@ -526,6 +527,7 @@ struct super_operations reiserfs_sops = 
   .clear_inode  = reiserfs_clear_inode,
   .put_super = reiserfs_put_super,
   .write_super = reiserfs_write_super,
+  .sync_fs = reiserfs_sync_fs,
   .write_super_lockfs = reiserfs_write_super_lockfs,
   .unlockfs = reiserfs_unlockfs,
   .statfs = reiserfs_statfs,


[PATCH] Fix reiserfs quota debug messages

2004-11-19 Thread Jan Kara
  Hello!

  The attached patch fixes the debug messages of the quota code in reiserfs so
that they compile. Chris Mason agreed with the patch, so I hope you can merge it.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Fix debugging quota messages in reiserfs code so that they compile.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/bitmap.c linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c
--- linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/bitmap.c  2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/bitmap.c 2004-11-18 20:26:07.0 +0100
@@ -956,14 +956,14 @@ static inline int blocknrs_and_prealloc_
 if (!hint->formatted_node) {
 int quota_ret;
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: allocating %d blocks id=%u", amount_needed, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: allocating %d blocks id=%u", amount_needed, hint->inode->i_uid);
 #endif
quota_ret = DQUOT_ALLOC_BLOCK_NODIRTY(hint->inode, amount_needed);
if (quota_ret)/* Quota exceeded? */
return QUOTA_EXCEEDED;
if (hint->preallocate && hint->prealloc_size ) {
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: allocating (prealloc) %d blocks id=%u", hint->prealloc_size, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: allocating (prealloc) %d blocks id=%u", hint->prealloc_size, hint->inode->i_uid);
 #endif
quota_ret = DQUOT_PREALLOC_BLOCK_NODIRTY(hint->inode, hint->prealloc_size);
if (quota_ret)
@@ -1009,7 +1009,7 @@ static inline int blocknrs_and_prealloc_
/* Free the blocks */
if (!hint->formatted_node) {
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: freeing (nospace) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: freeing (nospace) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated, hint->inode->i_uid);
 #endif
DQUOT_FREE_BLOCK_NODIRTY(hint->inode, amount_needed + hint->prealloc_size - nr_allocated); /* Free not allocated blocks */
}
@@ -1029,7 +1029,7 @@ static inline int blocknrs_and_prealloc_
 nr_allocated + REISERFS_I(hint->inode)->i_prealloc_count) {
 /* Some of preallocation blocks were not allocated */
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (s, "reiserquota: freeing (failed prealloc) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated - INODE_INFO(hint->inode)->i_prealloc_count, hint->inode->i_uid);
+   reiserfs_debug (s, REISERFS_DEBUG_CODE, "reiserquota: freeing (failed prealloc) %d blocks id=%u", amount_needed + hint->prealloc_size - nr_allocated - REISERFS_I(hint->inode)->i_prealloc_count, hint->inode->i_uid);
 #endif
DQUOT_FREE_BLOCK_NODIRTY(hint->inode, amount_needed +
 hint->prealloc_size - nr_allocated -
diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/stree.c linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/stree.c
--- linux-2.6.10-rc2-mm1-2-offadd/fs/reiserfs/stree.c   2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-3-reiserdebug/fs/reiserfs/stree.c  2004-11-18 20:26:07.0 +0100
@@ -1388,7 +1388,7 @@ int reiserfs_delete_item (struct reiserf
 do_balance(&s_del_balance, NULL, NULL, M_DELETE);
 
 #ifdef REISERQUOTA_DEBUG
-reiserfs_debug (p_s_sb, "reiserquota delete_item(): freeing %u, id=%u type=%c", quota_cut_bytes, p_s_inode->i_uid, head2type(&s_ih));
+reiserfs_debug (p_s_sb, REISERFS_DEBUG_CODE, "reiserquota delete_item(): freeing %u, id=%u type=%c", quota_cut_bytes, p_s_inode->i_uid, head2type(&s_ih));
 #endif
 DQUOT_FREE_SPACE_NODIRTY(p_s_inode, quota_cut_bytes);
 
@@ -1465,7 +1465,7 @@ void reiserfs_delete_solid_item (struct 
do_balance (&tb, NULL, NULL, M_DELETE);
if (inode) {/* Should we count quota for item? (we don't count quotas for save-links) */
 #ifdef REISERQUOTA_DEBUG
-   reiserfs_debug (th->t_super, "reiserquota delete_solid_item(): freeing %u id=%u type=%c", quota_cut_bytes, inode->i_uid, key2type(key));
+   reiserfs_debug (th->t_super, REISERFS_DEBUG_CODE, "reiserquota delete_solid_item(): freeing %u id=%u type=%c", quota_cut_bytes, inode->i_uid, key2type(key));
 #endif
DQUOT_FREE_SPACE_NODIRT

[PATCH] Fix of quota deadlock on pagelock

2004-11-19 Thread Jan Kara
  Hello!

  The four patches in the following mails fix deadlocks between quotas and
PageLock. The problem was a lock inversion between PageLock and transaction
start: the quota code first had to start a transaction and then write the
data, which in turn required acquiring PageLock, while the standard ordering
(PageLock first, transaction start later) was used e.g. by pdflush. The
patches implement a new way for quota to access the disk: every filesystem
that wants to support quotas now has to provide quota_read() and
quota_write() functions. These functions must obey the quota lock ordering
(in particular, they should not take PageLock inside a transaction).
  The first patch implements the changes in the quota core; the other
three patches implement the needed functions in ext2, ext3 and reiserfs.
The patch for reiserfs also fixes several other lock inversion problems
(similar to those ext3 had) and implements the journaled quota functionality
(which comes almost for free after the locking fixes...).
  The quota core patch makes quota support in other filesystems (except
XFS, which implements everything on its own ;)) non-functional (quotaon()
will refuse to turn on quotas on them). Once the patches have had reasonably
wide testing and it seems that no major changes will be needed, I
can also make fixes for the other filesystems (JFS, UDF, UFS).
  Andrew, could you please merge the patches into the -mm tree for testing?
The patch for reiserfs has not yet been checked by any reiserfs maintainer, so
I'm not sure about the policy here...
  Any comments are welcome.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-19 Thread Jan Kara
  Hello,

  the attached patch implements the new way of quota I/O in the quota
core. Every filesystem wanting to support quotas has to provide
quota_read() and quota_write() functions that obey the quota locking rules.
As the writes and reads bypass the pagecache, there is some ugly stuff
ensuring that userspace can see all the data after quotaoff() (or the Q_SYNC
quotactl). In the future I plan to make quota files inaccessible from
userspace (with the exception of quotacheck(8), which will take care of the
cache flushing and such stuff itself) so that this synchronization stuff can
be removed... Please apply.

Honza

The rewrite of the quota core. Quota no longer uses the filesystem read() and
write() functions, to avoid possible deadlocks on PageLock. From now on every
filesystem supporting quotas must provide quota_read() and quota_write()
functions which obey the quota locking rules (e.g. they cannot acquire the
PageLock).

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-4-reisersync/fs/dquot.c linux-2.6.10-rc2-mm1-5-quotacore/fs/dquot.c
--- linux-2.6.10-rc2-mm1-4-reisersync/fs/dquot.c 2004-11-18 20:20:41.0 +0100
+++ linux-2.6.10-rc2-mm1-5-quotacore/fs/dquot.c 2004-11-18 20:28:03.0 +0100
@@ -49,7 +49,7 @@
  * New SMP locking.
  *         Jan Kara, <[EMAIL PROTECTED]>, 10/2002
  *
- * Added journalled quota support
+ * Added journalled quota support, fix lock inversion problems
  *     Jan Kara, <[EMAIL PROTECTED]>, 2003,2004
  *
  * (C) Copyright 1994 - 1997 Marco van Wieringen 
@@ -75,7 +75,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include 
 
@@ -114,7 +115,7 @@
  * operations on dquots don't hold dq_lock as they copy data under dq_data_lock
  * spinlock to internal buffers before writing.
  *
- * Lock ordering (including related VFS locks) is following:
+ * Lock ordering (including related VFS locks) is the following:
  *   i_sem > dqonoff_sem > iprune_sem > journal_lock > dqptr_sem >
  *   > dquot->dq_lock > dqio_sem
  * i_sem on quota files is special (it's below dqio_sem)
@@ -183,8 +184,7 @@ static void put_quota_format(struct quot
  * on all three lists, depending on its current state.
  *
  * All dquots are placed to the end of inuse_list when first created, and this
- * list is used for the sync and invalidate operations, which must look
- * at every dquot.
+ * list is used for invalidate operation, which must look at every dquot.
  *
  * Unused dquots (dq_count == 0) are added to the free_dquots list when freed,
  * and this list is searched whenever we need an available dquot.  Dquots are
@@ -1314,10 +1314,12 @@ int vfs_quota_off(struct super_block *sb
 {
int cnt;
struct quota_info *dqopt = sb_dqopt(sb);
+   struct inode *toput[MAXQUOTAS];
 
/* We need to serialize quota_off() for device */
down(&dqopt->dqonoff_sem);
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+   toput[cnt] = NULL;
if (type != -1 && cnt != type)
continue;
if (!sb_has_quota_enabled(sb, cnt))
@@ -1337,7 +1339,7 @@ int vfs_quota_off(struct super_block *sb
dqopt->ops[cnt]->free_file_info(sb, cnt);
put_quota_format(dqopt->info[cnt].dqi_format);
 
-   fput(dqopt->files[cnt]);
+   toput[cnt] = dqopt->files[cnt];
dqopt->files[cnt] = NULL;
dqopt->info[cnt].dqi_flags = 0;
dqopt->info[cnt].dqi_igrace = 0;
@@ -1345,6 +1347,26 @@ int vfs_quota_off(struct super_block *sb
dqopt->ops[cnt] = NULL;
}
up(&dqopt->dqonoff_sem);
+   /* Sync the superblock so that buffers with quota data are written to
+ * disk (and so userspace sees correct data afterwards) */
+   if (sb->s_op->sync_fs)
+   sb->s_op->sync_fs(sb, 1);
+   sync_blockdev(sb->s_bdev);
+   /* Now the quota files are just ordinary files and we can set the
+* inode flags back. Moreover we discard the pagecache so that
+* userspace sees the writes we did bypassing the pagecache. We
+* must also discard the blockdev buffers so that we see the
+* changes done by userspace on the next quotaon() */
+   for (cnt = 0; cnt < MAXQUOTAS; cnt++)
+   if (toput[cnt]) {
+   down(&toput[cnt]->i_sem);
+   toput[cnt]->i_flags &= ~(S_IMMUTABLE | S_NOATIME | S_NOQUOTA);
+   truncate_inode_pages(&toput[cnt]->i_data, 0);
+   up(&toput[cnt]->i_sem);
+   mark_inode_dirty(toput[cnt]);
+   iput(toput[cn
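
The vfs_quota_off() change above follows a common pattern: remember what has to be released while the lock is held, and do the release only after the lock has been dropped. A simplified userspace sketch of that pattern follows; the `demo_*` types and the integer `locked` flag are stand-ins invented for this example, not the kernel's semaphore or inode handling.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXQUOTAS 2

struct demo_file { int released; };

struct demo_quota_info {
    int locked;                                  /* stand-in for dqonoff_sem */
    struct demo_file *files[DEMO_MAXQUOTAS];
};

/* Release step that must not run under the lock in this sketch. */
void demo_put(const struct demo_quota_info *q, struct demo_file *f)
{
    assert(!q->locked);
    f->released = 1;
}

/* Turn quotas off for one type, or for all types when type is -1.
 * Returns the number of files released. */
int demo_quota_off(struct demo_quota_info *q, int type)
{
    struct demo_file *toput[DEMO_MAXQUOTAS];
    int cnt, n = 0;

    q->locked = 1;
    for (cnt = 0; cnt < DEMO_MAXQUOTAS; cnt++) {
        toput[cnt] = NULL;
        if (type != -1 && cnt != type)
            continue;
        if (!q->files[cnt])
            continue;
        toput[cnt] = q->files[cnt];   /* remember, release later */
        q->files[cnt] = NULL;
    }
    q->locked = 0;

    /* The lock is dropped; now the deferred releases are safe. */
    for (cnt = 0; cnt < DEMO_MAXQUOTAS; cnt++)
        if (toput[cnt]) {
            demo_put(q, toput[cnt]);
            n++;
        }
    return n;
}
```

Initialising every `toput[]` slot inside the first loop, before any `continue`, matters: the second loop inspects all slots, including those skipped for other quota types.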

Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-19 Thread Jan Kara
  Hello!

  The attached patch implements the quota_read() and quota_write() functions
for ext2. Please apply.
Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Implementation of quota reading and writing functions for ext2.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/ext2.h linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/ext2.h
--- linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/ext2.h 2004-11-16 16:39:07.0 +0100
+++ linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/ext2.h  2004-11-18 20:28:58.0 +0100
@@ -119,6 +119,7 @@ extern int ext2_write_inode (struct inod
 extern void ext2_delete_inode (struct inode *);
 extern int ext2_sync_inode (struct inode *);
 extern void ext2_discard_prealloc (struct inode *);
+extern int ext2_get_block(struct inode *, sector_t, struct buffer_head *, int);
 extern void ext2_truncate (struct inode *);
 extern int ext2_setattr (struct dentry *, struct iattr *);
 extern void ext2_set_inode_flags(struct inode *inode);
diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/inode.c linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/inode.c
--- linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/inode.c 2004-11-16 16:39:07.0 +0100
+++ linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/inode.c 2004-11-18 20:28:58.0 +0100
@@ -524,7 +524,7 @@ changed:
  * reachable from inode.
  */
 
-static int ext2_get_block(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create)
+int ext2_get_block(struct inode *inode, sector_t iblock, struct buffer_head *bh_result, int create)
 {
int err = -EIO;
int offsets[4];
diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/super.c linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/super.c
--- linux-2.6.10-rc2-mm1-5-quotacore/fs/ext2/super.c 2004-11-16 16:39:07.0 +0100
+++ linux-2.6.10-rc2-mm1-6-ext2/fs/ext2/super.c 2004-11-18 20:28:58.0 +0100
@@ -200,6 +200,11 @@ static void ext2_clear_inode(struct inod
 }
 
 
+#ifdef CONFIG_QUOTA
+static ssize_t ext2_quota_read(struct super_block *sb, int type, char *data, size_t len, loff_t off);
+static ssize_t ext2_quota_write(struct super_block *sb, int type, const char *data, size_t len, loff_t off);
+#endif
+
 static struct super_operations ext2_sops = {
.alloc_inode= ext2_alloc_inode,
.destroy_inode  = ext2_destroy_inode,
@@ -211,6 +216,10 @@ static struct super_operations ext2_sops
.statfs = ext2_statfs,
.remount_fs = ext2_remount,
.clear_inode= ext2_clear_inode,
+#ifdef CONFIG_QUOTA
+   .quota_read = ext2_quota_read,
+   .quota_write= ext2_quota_write,
+#endif
 };
 
 /* Yes, most of these are left as NULL!!
@@ -1001,6 +1010,102 @@ static struct super_block *ext2_get_sb(s
return get_sb_bdev(fs_type, flags, dev_name, data, ext2_fill_super);
 }
 
+#ifdef CONFIG_QUOTA
+
+/* Read data from quotafile - avoid pagecache and such because we cannot afford
+ * acquiring the locks... As quota files are never truncated and quota code
+ * itself serializes the operations (and noone else should touch the files)
+ * we don't have to be afraid of races */
+static ssize_t ext2_quota_read(struct super_block *sb, int type, char *data,
+  size_t len, loff_t off)
+{
+   struct inode *inode = sb_dqopt(sb)->files[type];
+   unsigned long blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
+   int err = 0, offset = off & (sb->s_blocksize - 1), tocopy;
+   size_t toread;
+   struct buffer_head tmp_bh, *bh;
+   loff_t i_size = i_size_read(inode);
+
+   if (off > i_size)
+   return 0;
+   if (off+len > i_size)
+   len = i_size-off;
+   toread = len;
+   while (toread > 0) {
+   tocopy = sb->s_blocksize - offset < toread ?
+   sb->s_blocksize - offset : toread;
+
+   tmp_bh.b_state = 0;
+   err = ext2_get_block(inode, blk, &tmp_bh, 0);
+   if (err)
+   return err;
+   if (!buffer_mapped(&tmp_bh))/* A hole? */
+   memset(data, 0, tocopy);
+   else {
+   bh = sb_bread(sb, tmp_bh.b_blocknr);
+   if (!bh)
+   return -EIO;
+   memcpy(data, bh->b_data+offset, tocopy);
+   brelse(bh);
+   }
+   offset = 0;
+   toread -= tocopy;
+   data += tocopy;
+   blk++;
+   }
+   return len;
+}
+
+/* Write to quotafile */
+static ssize_t ext2_quota_write(struct super_block *sb, int type,
+   const char *data, size_t len, loff_t off)
+{
+   struct inode *inod
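
The read loop in ext2_quota_read() above can be modelled in userspace to show its three cases: shortening a read at end of file, splitting a request at block boundaries, and returning zeros for holes. The sketch below assumes a toy in-memory file; `demo_file`, `DEMO_BLOCK_SIZE` and `demo_quota_read` are illustrative names and no buffer-cache calls are involved.

```c
#include <assert.h>
#include <string.h>

#define DEMO_BLOCK_SIZE 8u
#define DEMO_NBLOCKS 4u

/* A toy file: each block is either mapped to data or is a hole (NULL). */
struct demo_file {
    const char *blocks[DEMO_NBLOCKS];
    size_t size;                       /* file size in bytes */
};

/* Copy up to len bytes starting at off into data, one block at a time.
 * Reads past the end are shortened; holes read back as zeros.
 * Returns the number of bytes copied. */
size_t demo_quota_read(const struct demo_file *f, char *data,
                       size_t len, size_t off)
{
    size_t blk, offset, toread, tocopy;

    assert(f != NULL && data != NULL);
    assert(f->size <= DEMO_BLOCK_SIZE * DEMO_NBLOCKS);
    if (off >= f->size)
        return 0;
    if (len > f->size - off)
        len = f->size - off;

    blk = off / DEMO_BLOCK_SIZE;
    offset = off % DEMO_BLOCK_SIZE;
    toread = len;
    while (toread > 0) {
        tocopy = DEMO_BLOCK_SIZE - offset < toread ?
                 DEMO_BLOCK_SIZE - offset : toread;
        if (!f->blocks[blk])
            memset(data, 0, tocopy);            /* a hole */
        else
            memcpy(data, f->blocks[blk] + offset, tocopy);
        offset = 0;                             /* later blocks start at 0 */
        toread -= tocopy;
        data += tocopy;
        blk++;
    }
    return len;
}
```

Only the first block of a request can start at a non-zero offset, which is why `offset` is reset after the first iteration, just as in the patch.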

Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-19 Thread Jan Kara
  Hi!

  The last patch in the series ;). It implements quota_read() and
quota_write() for reiserfs, fixes several lock inversion problems between
various locks and journal_begin, and implements journaled quota. Hopefully
one of the reiserfs maintainers will have a look at it and either
accept it or tell me what's wrong. In the meantime, accept it if you
like it (I'm not sure how much Hans insists on seeing the patches that
are going in...).

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Implement quota journaling and quota reading and writing functions for
reiserfs. Also solves several other deadlocks possible in reiserfs
due to the lock inversion on journal_begin and quota locks.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-7-ext3/fs/reiserfs/file.c linux-2.6.10-rc2-mm1-8-reiser/fs/reiserfs/file.c
--- linux-2.6.10-rc2-mm1-7-ext3/fs/reiserfs/file.c  2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-8-reiser/fs/reiserfs/file.c 2004-11-18 20:30:51.0 +0100
@@ -54,7 +54,7 @@ static int reiserfs_file_release (struct
 /* freeing preallocation only involves relogging blocks that
  * are already in the current transaction.  preallocation gets
  * freed at the end of each transaction, so it is impossible for
- * us to log any additional blocks
+ * us to log any additional blocks (including quota blocks)
  */
 err = journal_begin(&th, inode->i_sb, 1);
 if (err) {
@@ -201,7 +201,7 @@ int reiserfs_allocate_blocks_for_region(
 /* If we came here, it means we absolutely need to open a transaction,
since we need to allocate some blocks */
 reiserfs_write_lock(inode->i_sb); // Journaling stuff and we need that.
-res = journal_begin(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1); // Wish I know if this number enough
+res = journal_begin(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS); // Wish I know if this number enough
 if (res)
 goto error_exit;
 reiserfs_update_inode_transaction(inode) ;
@@ -576,7 +576,7 @@ error_exit:
 int err;
 // update any changes we made to blk count
 reiserfs_update_sd(th, inode);
-err = journal_end(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1);
+err = journal_end(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS);
 if (err)
 res = err;
 }
diff -rupNX /home/jack/.kerndiffexclude linux-2.6.10-rc2-mm1-7-ext3/fs/reiserfs/inode.c linux-2.6.10-rc2-mm1-8-reiser/fs/reiserfs/inode.c
--- linux-2.6.10-rc2-mm1-7-ext3/fs/reiserfs/inode.c 2004-11-16 16:39:09.0 +0100
+++ linux-2.6.10-rc2-mm1-8-reiser/fs/reiserfs/inode.c   2004-11-18 20:30:51.0 +0100
@@ -20,27 +20,17 @@
 
 extern int reiserfs_default_io_size; /* default io size devuned in super.c */
 
-/* args for the create parameter of reiserfs_get_block */
-#define GET_BLOCK_NO_CREATE 0 /* don't create new blocks or convert tails */
-#define GET_BLOCK_CREATE 1/* add anything you need to find block */
-#define GET_BLOCK_NO_HOLE 2   /* return -ENOENT for file holes */
-#define GET_BLOCK_READ_DIRECT 4  /* read the tail if indirect item not found */
-#define GET_BLOCK_NO_ISEM 8 /* i_sem is not held, don't preallocate */
-#define GET_BLOCK_NO_DANGLE   16 /* don't leave any transactions running */
-
-static int reiserfs_get_block (struct inode * inode, sector_t block,
-  struct buffer_head * bh_result, int create);
 static int reiserfs_commit_write(struct file *f, struct page *page,
  unsigned from, unsigned to);
 
 void reiserfs_delete_inode (struct inode * inode)
 {
-int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2; 
+/* We need blocks for transaction + (user+group) quota update (possibly delete) */
+int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 + 2 * REISERFS_QUOTA_INIT_BLOCKS; 
 struct reiserfs_transaction_handle th ;
   
 reiserfs_write_lock(inode->i_sb);
 
-DQUOT_FREE_INODE(inode);
 /* The = 0 happens when we abort creating a new inode for some reason like lack of space.. */
 if (!(inode->i_state & I_NEW) && INODE_PKEY(inode)->k_objectid != 0) { /* also handles bad_inode case */
down (&inode->i_sem); 
@@ -58,6 +48,11 @@ void reiserfs_delete_inode (struct inode
goto out;
}
 
+   /* Do quota update inside a transaction for journaled quotas. We must do that
+* after delete_object so that quota updates go into the same transaction as
+* stat data deletion */
+   DQUOT_FREE_INODE(inode);
+
if (journal_end(&th, inode->i_sb, jbegin_count)) {
up (&inode->i_sem);
goto out;
@@ -592,8 +587

Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-22 Thread Jan Kara
> 
> What prevents vfs_quota_off() from racing with unmount?
> 
> If you look at, say, sync_filesystems() you'll see that we take ->s_umount
> and then test ->s_root to check that we didn't race with an unmount
> attempt.
  I thought that sync_fs and the rest are guarded by the fact that the quota
code is holding references to inodes on the filesystem. And invalidate_bdev()
should not care about the filesystem. Or am I missing something? Anyway,
adding a comment about this is really a good idea...
  I've found a subtle bug though - vfs_quota_off() can race
against vfs_quota_on(), resulting in the flags on the inode being set
wrongly... I'll fix that one.

Thanks for checking
        Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-22 Thread Jan Kara
> Jan Kara <[EMAIL PROTECTED]> wrote:
> >
> > +static ssize_t ext2_quota_write(struct super_block *sb, int type,
> > +   const char *data, size_t len, loff_t off)
> > +{
> > +   struct inode *inode = sb_dqopt(sb)->files[type];
> > +   unsigned long blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
> > +   int err = 0, offset = off & (sb->s_blocksize - 1), tocopy;
> > +   size_t towrite = len;
> > +   struct buffer_head tmp_bh, *bh;
> > +
> > +   down(&inode->i_sem);
> > +   while (towrite > 0) {
> > +   tocopy = sb->s_blocksize - offset < towrite ?
> > +   sb->s_blocksize - offset : towrite;
> > +
> > +   tmp_bh.b_state = 0;
> > +   err = ext2_get_block(inode, blk, &tmp_bh, 1);
> > +   if (err)
> > +   goto out;
> > +   if (offset || tocopy != EXT2_BLOCK_SIZE(sb))
> > +   bh = sb_bread(sb, tmp_bh.b_blocknr);
> > +   else
> > +   bh = sb_getblk(sb, tmp_bh.b_blocknr);
> > +   if (!bh) {
> > +   err = -EIO;
> > +   goto out;
> > +   }
> > +   memcpy(bh->b_data+offset, data, tocopy);
> 
> It is possible to mmap block devices, so we should do a flush_dcache_page()
> after the memset here.
  Thanks, I didn't know about this.

> Also, we should lock the buffer.  Because a concurrent read of the blockdev
> would cause the disk DMA transfer to stomp all over the data which you just
> wrote.  Say, someone is doing `cp /dev/hda1 /dev/null' at the same time.
  Yes, obviously the buffer should be locked. Thanks for spotting that.

> So my accumulated patch against this one is:
> 
> diff -puN fs/ext2/super.c~fix-of-quota-deadlock-on-pagelock-ext2-tweaks 
> fs/ext2/super.c
> --- 25/fs/ext2/super.c~fix-of-quota-deadlock-on-pagelock-ext2-tweaks  Fri Nov 
> 19 14:54:47 2004
> +++ 25-akpm/fs/ext2/super.c   Fri Nov 19 15:00:59 2004
> @@ -1020,10 +1020,13 @@ static ssize_t ext2_quota_read(struct su
>  size_t len, loff_t off)
>  {
>   struct inode *inode = sb_dqopt(sb)->files[type];
> - unsigned long blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
> - int err = 0, offset = off & (sb->s_blocksize - 1), tocopy;
> + sector_t blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
> + int err = 0;
> + int offset = off & (sb->s_blocksize - 1);
> + int tocopy;
>   size_t toread;
> - struct buffer_head tmp_bh, *bh;
> + struct buffer_head tmp_bh;
> + struct buffer_head *bh;
>   loff_t i_size = i_size_read(inode);
>  
>   if (off > i_size)
> @@ -1061,10 +1064,13 @@ static ssize_t ext2_quota_write(struct s
>   const char *data, size_t len, loff_t off)
>  {
>   struct inode *inode = sb_dqopt(sb)->files[type];
> - unsigned long blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
> - int err = 0, offset = off & (sb->s_blocksize - 1), tocopy;
> + sector_t blk = off >> EXT2_BLOCK_SIZE_BITS(sb);
> + int err = 0;
> + int offset = off & (sb->s_blocksize - 1);
> + int tocopy;
>   size_t towrite = len;
> - struct buffer_head tmp_bh, *bh;
> + struct buffer_head tmp_bh;
> + struct buffer_head *bh;
>  
>   down(&inode->i_sem);
>   while (towrite > 0) {
> @@ -1083,9 +1089,12 @@ static ssize_t ext2_quota_write(struct s
>   err = -EIO;
>   goto out;
>   }
> + lock_buffer(bh);
>   memcpy(bh->b_data+offset, data, tocopy);
> + flush_dcache_page(&bh->b_page);
>   set_buffer_uptodate(bh);
>   mark_buffer_dirty(bh);
> + unlock_buffer(bh);
>   brelse(bh);
>   offset = 0;
>   towrite -= tocopy;
> _
> 
  Thanks, the patch looks OK (only Jens spotted a minor bug there).

> I suspect things in there are still a bit racy against a concurrent
> `blockdev --flushbufs' though.
  Thanks for notifying. I'll check that.

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
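
The quoted ext2_quota_write() walks the write one block at a time: the first
chunk may start in the middle of a block, the following ones start at offset 0,
and only partial chunks need the old block contents read in first. Below is a
minimal userspace sketch of just that arithmetic, not the kernel code. The
names split_write(), chunk_needs_read() and struct chunk are made up for
illustration, and blocksize is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* One piece of a write that fits inside a single block (hypothetical type). */
struct chunk {
	unsigned long blk;	/* block index within the file */
	size_t offset;		/* byte offset inside that block */
	size_t len;		/* bytes to copy into that block */
};

/*
 * Split a write of len bytes at file offset off into per-block chunks.
 * Returns the number of chunks stored in out (at most max).
 */
static size_t split_write(unsigned long long off, size_t len,
			  size_t blocksize, struct chunk *out, size_t max)
{
	size_t n = 0;
	unsigned long blk;
	size_t offset;

	/* blocksize must be a non-zero power of two for the mask below */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	blk = (unsigned long)(off / blocksize);
	offset = (size_t)(off & (blocksize - 1));

	while (len > 0 && n < max) {
		/* copy up to the end of this block, or less if the data ends */
		size_t tocopy = blocksize - offset < len ?
				blocksize - offset : len;

		out[n].blk = blk;
		out[n].offset = offset;
		out[n].len = tocopy;
		n++;

		offset = 0;	/* every later chunk starts at a block boundary */
		len -= tocopy;
		blk++;
	}
	return n;
}

/* A chunk that does not cover the whole block needs read-modify-write. */
static int chunk_needs_read(const struct chunk *c, size_t blocksize)
{
	return c->offset != 0 || c->len != blocksize;
}
```

In the patch this distinction is what picks sb_bread() for partial blocks and
sb_getblk() for fully overwritten ones.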


Re: [PATCH] Fix of quota deadlock on pagelock

2004-11-22 Thread Jan Kara
> Jan Kara <[EMAIL PROTECTED]> wrote:
> >
> > > 
> > > What prevents vfs_quota_off() from racing with unmount?
> > > 
> > > If you look at, say, sync_filesystems() you'll see that we take ->s_umount
> > > and then test ->s_root to check that we didn't race with an unmount
> > > attempt.
> >   I thought that the sync_fs and stuff is guarded by the fact that quota
> > code is holding references to inodes on the filesystem. And 
> > invalidate_bdev()
> > should not care about the filesystem. Or do I miss something? Anyway
> > adding a comment about this is really a good idea...
> 
> Holding a ref on an inode will not cause umount to block.  It'll just cause
> nasty "Busy inodes after unmount" warnings.
  I see, thanks. I'll fix that also.

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: Status of quota support for reiserfs on 2.6.x kernels

2004-11-23 Thread Jan Kara
  Hello,

> On Tue, 23 Nov 2004 14:06:43 +0100
> Christian Mayrhuber <[EMAIL PROTECTED]> wrote:
> 
>   | 
>   | Jan Kara from SuSE labs recently discovered a deadlock in the linux
>   | quota system, see:
>   | http://marc.theaimsgroup.com/?t=11008581692&r=1&w=2
> 
> yes saw that.
  Actually it was discovered quite some time ago, but it took me quite a
while to fix as it needed bigger changes...

>   | I guess it would be wise to wait with the migration till these patches
>   | make it into some official 2.6 kernel release.
> 
> True, as mentioned in latest -mm release (2.6.10-rc2-mm3) : 
> "There's a pretty big revamp of the filesystem quota code in here.  If you
>   use quotas, please test"
> 
> I'll test in the meantime and wait for 2.6.10.
  Actually, similar bugs exist in the 2.4 kernels, so if you don't happen to
hit them there you are probably safe to try a 2.6 kernel (my quota
fixes will need some time to be tested well enough to get into Linus's
tree).

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: Status of quota support for reiserfs on 2.6.x kernels

2004-12-15 Thread Jan Kara
  Hello!

> Christian Mayrhuber  gmx.net> writes:
> 
> > Quotas are in kernel 2.6 and supported by ext2,ext3,reiserfs,xfs.
> > Reiserfs doesn't need an extra patch for quotas, data journalling
> > or extended attributes. Everything is right there for
> > kernels > 2.6.5 if I remember right.
> 
> I tried to use quota with 2.6.7 and got two serious problems:
  I suggest trying a newer kernel - either 2.6.10-rc3-mm1 or
plain 2.6.10-rc3 with the attached patch applied (the patch fixes the
deadlock you seem to be hitting).

> 1. soft quotas were treated like hard quotas
> (grace time 8 days but behaved like 0 days)
  That is rather strange - I have never heard of such a problem. If it
persists after changing the kernel, please mail me and we will try
to sort it out.

> 2. after a few days no access to the file system was possible,
> fs-options usrquota and grpquota were doubled
> when listed by the mount command.
> (all 14 clients, samba and nfs paralyzed,
> a whole radio station silent, yes I had better days)
> After reboot the problem was gone but I turned quota off.
  That looks like a deadlock which should be fixed by now.

Honza
diff -rupX /home/jack/.kerndiffexclude linux-2.6.10-rc2/fs/dquot.c 
linux-2.6.10-rc2-quotafix-all/fs/dquot.c
--- linux-2.6.10-rc2/fs/dquot.c 2004-11-16 16:39:07.0 +0100
+++ linux-2.6.10-rc2-quotafix-all/fs/dquot.c2004-11-23 18:10:29.470754136 
+0100
@@ -49,7 +49,7 @@
  * New SMP locking.
  * Jan Kara, <[EMAIL PROTECTED]>, 10/2002
  *
- * Added journalled quota support
+ * Added journalled quota support, fix lock inversion problems
  * Jan Kara, <[EMAIL PROTECTED]>, 2003,2004
  *
  * (C) Copyright 1994 - 1997 Marco van Wieringen 
@@ -75,7 +75,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include 
 
@@ -114,7 +115,7 @@
  * operations on dquots don't hold dq_lock as they copy data under dq_data_lock
  * spinlock to internal buffers before writing.
  *
- * Lock ordering (including related VFS locks) is following:
+ * Lock ordering (including related VFS locks) is the following:
  *   i_sem > dqonoff_sem > iprune_sem > journal_lock > dqptr_sem >
  *   > dquot->dq_lock > dqio_sem
  * i_sem on quota files is special (it's below dqio_sem)
@@ -183,8 +184,7 @@ static void put_quota_format(struct quot
  * on all three lists, depending on its current state.
  *
  * All dquots are placed to the end of inuse_list when first created, and this
- * list is used for the sync and invalidate operations, which must look
- * at every dquot.
+ * list is used for invalidate operation, which must look at every dquot.
  *
  * Unused dquots (dq_count == 0) are added to the free_dquots list when freed,
  * and this list is searched whenever we need an available dquot.  Dquots are
@@ -1314,10 +1314,14 @@ int vfs_quota_off(struct super_block *sb
 {
int cnt;
struct quota_info *dqopt = sb_dqopt(sb);
+   struct inode *toputinode[MAXQUOTAS];
+   struct vfsmount *toputmnt[MAXQUOTAS];
 
/* We need to serialize quota_off() for device */
down(&dqopt->dqonoff_sem);
for (cnt = 0; cnt < MAXQUOTAS; cnt++) {
+   toputinode[cnt] = NULL;
+   toputmnt[cnt] = NULL;
if (type != -1 && cnt != type)
continue;
if (!sb_has_quota_enabled(sb, cnt))
@@ -1337,14 +1341,50 @@ int vfs_quota_off(struct super_block *sb
dqopt->ops[cnt]->free_file_info(sb, cnt);
put_quota_format(dqopt->info[cnt].dqi_format);
 
-   fput(dqopt->files[cnt]);
+   toputinode[cnt] = dqopt->files[cnt];
+   toputmnt[cnt] = dqopt->mnt[cnt];
dqopt->files[cnt] = NULL;
+   dqopt->mnt[cnt] = NULL;
dqopt->info[cnt].dqi_flags = 0;
dqopt->info[cnt].dqi_igrace = 0;
dqopt->info[cnt].dqi_bgrace = 0;
dqopt->ops[cnt] = NULL;
}
up(&dqopt->dqonoff_sem);
+   /* Sync the superblock so that buffers with quota data are written to
+* disk (and so userspace sees correct data afterwards).
+* The reference to vfsmnt we are still holding protects us from
+* umount (we don't have it only when quotas are turned on/off for
+* journal replay but in that case we are guarded by the fs anyway). */
+   if (sb->s_op->sync_fs)
+   sb->s_op->sync_fs(sb, 1);
+   sync_blockdev(sb->s_bdev);
+   /* Now the quota files are just ordinary files and we can set the
+* inode flags back. Moreover we discard the pagecache so that
+* userspace sees the writes we did byp
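
The visible part of this hunk replaces the fput() done under dqonoff_sem with
stores into the local toputinode[]/toputmnt[] arrays, presumably so that the
final release can happen after the semaphore is dropped (the rest of the hunk
is cut off above). Here is a minimal userspace sketch of that "collect under
the lock, release after unlocking" pattern. Everything in it is a hypothetical
stand-in: struct fake_lock only tracks whether it is held, and release_file()
asserts that the lock is not held when it runs.

```c
#include <assert.h>
#include <stddef.h>

#define MAXQ 2			/* stand-in for MAXQUOTAS */

/* Toy lock that only remembers whether it is held. */
struct fake_lock { int held; };

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = 0;
}

struct quota_state {
	struct fake_lock onoff;	/* stand-in for dqonoff_sem */
	int *files[MAXQ];	/* stand-in for dqopt->files[] */
	int released;		/* how many files were released */
};

/* Releasing may block or take other locks, so onoff must not be held. */
static void release_file(struct quota_state *q, int *file)
{
	assert(!q->onoff.held);
	*file = 0;
	q->released++;
}

/* Turn off one quota type, or all of them when type == -1. */
static void quota_off(struct quota_state *q, int type)
{
	int *toput[MAXQ];
	int cnt;

	fake_lock_acquire(&q->onoff);
	for (cnt = 0; cnt < MAXQ; cnt++) {
		toput[cnt] = NULL;
		if (type != -1 && cnt != type)
			continue;
		if (!q->files[cnt])
			continue;
		/* only detach under the lock; remember what to release */
		toput[cnt] = q->files[cnt];
		q->files[cnt] = NULL;
	}
	fake_lock_release(&q->onoff);

	/* the actual release happens with the lock dropped */
	for (cnt = 0; cnt < MAXQ; cnt++)
		if (toput[cnt])
			release_file(q, toput[cnt]);
}
```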

Re: Status of quota support for reiserfs on 2.6.x kernels

2005-01-06 Thread Jan Kara
  Sorry for the slightly late reply, but I was on holiday.

> On Wed, Dec 15, 2004 at 02:38:06PM +0100, Jan Kara wrote:
> > > I tried to use quota with 2.6.7 and got two serious problems
> >   I suggest trying some newer kernel - either 2.6.10-rc3-mm1 or
> > just plain 2.6.10-rc3 with attached patch applied (the patch fixes the
> > deadlock you seem to hit).
> 
> I tried the latter on another server and the filesystem was accessible
> at all times but other problems occurred (maybe not quota-related) that made
> my co-admins downgrade to 2.6.9:
> 
> 1. suddenly, after running the patched kernel for days,
> su, sudo and authentication through courier webmailer was not possible,
> ssh-root-login went fine. After reboot problem was gone but pam_rootok.so
> was missing (file system corruption?)
  Are there any error messages in the log? If that was filesystem
corruption there should be some. Otherwise it's hard to guess what
was going on.

> 2. after a few hours stress-testing the quota-partition:
> kernel crash: code ff 21 e2 8b 0a ...
> ... 76 60 e0 ff 6a 00
> kernel BUG at lib/kernel_lock:120!
> invalid operand: 
  This does not look nice. Do you have the complete error message? From this
part alone it's hard to guess what happened.

> Do you think they are quota-related or related to your quota-fixall patch?
  They might be - you can try to stress-test the filesystem without
quotas enabled and see if any error occurs.

> Is your fix applicable to vanilla 2.6.10 or is it already integrated
> or is there a newer patch?
  The patch I sent you should be applicable to the 2.6.10 kernel. There's also
one other bugfix to my quota-fixall patch, which I have attached (but you don't
seem to actually hit that problem).

Honza

When CONFIG_QUOTA is defined, reiserfs's finish_unfinished() sets and clears
the MS_ACTIVE bit in the s_flags field of the super block. If that bit was
already set, it should be neither set again nor cleared afterwards.


 fs/reiserfs/super.c |   13 ++---
 1 files changed, 10 insertions(+), 3 deletions(-)

diff -puN fs/reiserfs/super.c~reiserfs-do-not-clear-MS_ACTIVE 
fs/reiserfs/super.c
--- linux-2.6.10-rc3-mm1/fs/reiserfs/super.c~reiserfs-do-not-clear-MS_ACTIVE
2004-12-23 18:22:06.568755520 +0300
+++ linux-2.6.10-rc3-mm1-vs/fs/reiserfs/super.c 2004-12-23 18:22:06.576756006 
+0300
@@ -158,6 +158,7 @@ static int finish_unfinished (struct sup
 int truncate;
 #ifdef CONFIG_QUOTA
 int i;
+int ms_active_set;
 #endif
  
  
@@ -168,7 +169,12 @@ static int finish_unfinished (struct sup
 
 #ifdef CONFIG_QUOTA
 /* Needed for iput() to work correctly and not trash data */
-s->s_flags |= MS_ACTIVE;
+if (s->s_flags & MS_ACTIVE) {
+   ms_active_set = 0;
+} else {
+   ms_active_set = 1;
+   s->s_flags |= MS_ACTIVE;
+}
 /* Turn on quotas so that they are updated correctly */
 for (i = 0; i < MAXQUOTAS; i++) {
if (REISERFS_SB(s)->s_qf_names[i]) {
@@ -276,8 +282,9 @@ static int finish_unfinished (struct sup
 if (sb_dqopt(s)->files[i])
 vfs_quota_off_mount(s, i);
 }
-/* Restore the flag back */
-s->s_flags &= ~MS_ACTIVE;
+if (ms_active_set)
+   /* Restore the flag back */
+   s->s_flags &= ~MS_ACTIVE;
 #endif
 pathrelse (&path);
 if (done)

_
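
The fix above boils down to "remember whether we were the ones who set the
flag, and only clear it in that case". A minimal userspace sketch of that
pattern follows. FLAG_ACTIVE, active_begin() and active_end() are hypothetical
names used for illustration, and the flag value is made up rather than the
kernel's MS_ACTIVE.

```c
#include <assert.h>

#define FLAG_ACTIVE 0x1UL	/* hypothetical stand-in for MS_ACTIVE */

/* Set FLAG_ACTIVE if it is not set yet; return 1 only if this call set it. */
static int active_begin(unsigned long *flags)
{
	if (*flags & FLAG_ACTIVE)
		return 0;	/* already set by someone else: leave it alone */
	*flags |= FLAG_ACTIVE;
	return 1;
}

/* Clear FLAG_ACTIVE only if the matching active_begin() call set it. */
static void active_end(unsigned long *flags, int we_set_it)
{
	assert(we_set_it == 0 || we_set_it == 1);
	if (we_set_it)
		*flags &= ~FLAG_ACTIVE;
}
```

The caller keeps the return value of active_begin() around, much as the patch
keeps ms_active_set, and passes it to active_end() when it is done.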


Re: [2.6.11-rc2] kernel BUG at fs/reiserfs/prints.c:362

2005-01-27 Thread Jan Kara
  Hello,

> On Thu, 2005-01-27 at 10:24, Sergey S. Kostyliov wrote:
> > Hello all,
> > 
> > Here is a BUG() I've just hit on a quota-enabled reiserfs disk.
> > 
> > [EMAIL PROTECTED] rathamahata $ mount | grep /dev/sdb2
> > /dev/sdb2 on /var/www type reiserfs 
> > (rw,noatime,nodiratime,data=writeback,grpquota,usrquota)
> > [EMAIL PROTECTED] rathamahata $
> > 
> > REISERFS: panic (device sdb2): journal_begin called without kernel lock held
> 
> Would you check whether this patch helps, please?
  BTW: What are the exact rules for when lock_kernel() must be held in
reiserfs? Is there a doc somewhere? I suspect we might also need the lock
for reiserfs_quota_read() (reiserfs_quota_write() should already be
protected by your fix).
  Hmm, I should also check whether ext2/ext3 need the lock...

        Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: ReiserFS quota and 2.6.10

2005-02-23 Thread Jan Kara
> Because I like to keep people happy, especially Valdis
> Kletnieks, I'll repost my query from another email
> account without a disclaimer..here we go.
  :) The disclaimer was quite a bit longer than the message...

> ==
> From the subject, you have probably already guessed
> what I am about to ask.. 
> Is quota working ok with vanilla 2.6.10 with ReiserFS?
  Yes, quota should be working fine - I suggest waiting a bit for
2.6.11, as it fixes some deadlocks in the quota code that appear under
high load.

Honza


Re: ReiserFS quota and 2.6.10

2005-02-23 Thread Jan Kara
> Am Mittwoch, 23. Februar 2005 11:28 schrieb Jan Kara:
> > > Because I like to keep people happy, especially Valdis
> > > Kletnieks, I'll repost my query from another email
> > > account without a disclaimer..here we go.
> > >
> >   :) The disclaimer was quite longer than the message...
> > >
> > > ==
> > >
> > > >From the subject, you have probably already guessed
> > >
> > > what I am about to ask..
> > > Is quota working ok with vanilla 2.6.10 with ReiserFS?
> >
> >   Yes, the quota should be working fine - I suggest to wait a bit for
> > 2.6.11 as it has fixed some deadlocks in the quota code appearing under
> > high load.
> 
> Is this "fixed" in SLES kernels already?
  These problems are fixed in the current SUSE kernel and hence will be
fixed in 9.3 and newer "products". But it's not in anything older...

Honza



Re: reiser4 not supporting quota

2005-04-05 Thread Jan Kara
> Sorry for this ``not so smart'' question, is there quota support in
> reiser4? Or quota is not in the main idea of reiser4?
  There is no quota support for Reiser4. I'm looking into it, but first
I have to come up with reasonable semantics and an idea for an efficient
implementation :) The framework of plugins and the fact that Reiser4 first
overestimates the needed space and then releases it (and eventually
also uses some of the "reserved" space) make the quota implementation harder
than in other filesystems...

Honza


Re: reiser4 not supporting quota

2005-04-05 Thread Jan Kara
  Hello,

> On Tue, 2005-04-05 at 14:37, Jan Kara wrote:
> > > Sorry for this ``not so smart'' question, is there quota support in
> > > reiser4? Or quota is not in the main idea of reiser4?
> >   There is not quota support for Reiser4. I'm looking into it but first
> > I have to come up with a reasonable semantics + idea of efficient
> > implementation :) The framework of plugins and the fact that Reiser4 first
> > overestimates the needed space and then releases it (and eventually
> > uses also some of the "reserved" space) makes the quota implementation 
> > harder
> > than in other filesystems...
> 
> Wouldn't be simpler if quota limited a user in terms of number of
> files/total size of files and not in term of space used by filesystem
> internally to store those files?
> 
> I think that anyway it is not possible to calculate exactly number of
> bytes of disk space reiserX filesystem allocates to store a file in
> filesystem because of rebalancing.
  Yes, I agree that accounting metadata is almost impossible. But I still
think it would be nice if a compressed file, in particular, were charged
only for its compressed size and not for the uncompressed one. This is
basically the only reason why I haven't yet decided to use just the plain
amount of the file's data.

> For quota in reiser4 : we probably need to make it "transactional".
> Do you think we can make quota counter update in the same transaction
> with the quoted operation?
  Yes, that should be possible and not very complicated - ext3 and
reiser3 already support this. If the quota callbacks are set properly
then each call of DQUOT_ALLOC/FREE_... will end up as a call to
foofs_quota_write() to update data in the quota file, and the filesystem can
then attach this update to the running transaction - at least that is how
it works for ext3 and reiser3.

Honza


[PATCH] Fix rewriting on a full reiserfs filesystem

2005-04-14 Thread Jan Kara
  Hello,

  the attached patch fixes rewriting of a file on a full reiserfs
filesystem. Previously it was impossible to write to a file even if we
needed no new block. The patch allows rewriting of a file and also
extension of a file up to the end of the last allocated block. The patch
also applies fine against the 2.6.12-rc2 kernel.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Allow rewriting of a file and extending a file up to the end of the allocated
block on a full filesystem.

From: Chris Mason <[EMAIL PROTECTED]>
Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.5-SLES9_SP2/fs/reiserfs/file.c 
linux-2.6.5-SLES9_SP2-reiserrewrite/fs/reiserfs/file.c
--- linux-2.6.5-SLES9_SP2/fs/reiserfs/file.c2005-04-13 13:56:08.0 
+0200
+++ linux-2.6.5-SLES9_SP2-reiserrewrite/fs/reiserfs/file.c  2005-04-14 
12:50:33.0 +0200
@@ -1303,10 +1303,11 @@ ssize_t reiserfs_file_write( struct file
reiserfs_claim_blocks_to_be_allocated(inode->i_sb, num_pages << 
(PAGE_CACHE_SHIFT - inode->i_blkbits));
reiserfs_write_unlock(inode->i_sb);
 
-   if ( !num_pages ) { /* If we do not have enough space even for */
-   res = -ENOSPC;  /* single page, return -ENOSPC */
-   if ( pos > (inode->i_size & (inode->i_sb->s_blocksize-1)))
-   break; // In case we are writing past the file end, break.
+   if ( !num_pages ) { /* If we do not have enough space even for a single 
page... */
+   if ( pos > inode->i_size+inode->i_sb->s_blocksize-(pos & 
(inode->i_sb->s_blocksize-1))) {
+   res = -ENOSPC;
+   break; // In case we are writing past the end of the last file 
block, break.
+   }
// Otherwise we are possibly overwriting the file, so
// let's set write size to be equal or less than blocksize.
// This way we get it correctly for file holes.


Re: quota, reiserfs v3, 2.6.11.7

2005-05-03 Thread Jan Kara
  Hello,

> To add something to previous quota-related discussion:
> I have a server with 180GB reiserfs v3 filesystem on a software RAID-1, 
> 30GB of which is used, exported via nfs.
> Yesterday I turned quotas on (in order to calm down a new user eating
> approx. 1GB/day). Today in the morning the server was hung, I rebooted,
> worked for a few minutes before it hung again (with no message on the
> console). I rebooted again, turned quotas off, run reiserfsck 
> and the server is happily running for about 3 hours now, with heavy 
> nfs usage.
> 
> Kernel is vanilla 2.6.11.7 + grsecurity patches (could this be
> related?). Usually, the usage is rather heavy (/home nfs export to >15
> workstations).
  Hmm, grsecurity does not seem to modify anything that could influence
your problem. I rather suspect that the quota code triggers some rare
error path with a bug in it.

> I found these messages in syslog:
> May  2 07:36:30 palo kernel: ReiserFS: md0: warning: PAP-5660: 
> reiserfs_do_truncate: wrong result -1 of search for [186288 279757 
> 0xfff DIRECT]
  It would be useful if you could apply the attached patch to the kernel
so that when the above happens (which seems to be the first bug in a chain)
more debug info is dumped - at least we will know from where
reiserfs_do_truncate was called.

> May  2 07:36:30 palo kernel: ReiserFS: md0: warning: clm-2100: nesting info a 
> different FS
> May  2 07:36:42 palo last message repeated 5 times
> 
> May  2 08:59:40 palo kernel: ReiserFS: md0: warning: PAP-5660: 
> reiserfs_do_truncate: wrong result -1 of search for [221141 125166 
> 0xfff DIRECT]
> May  2 08:59:40 palo kernel: ReiserFS: md0: warning: clm-2100: nesting info a 
> different FS
> May  2 08:59:47 palo last message repeated 274 times
> 
> these two mesasges seem to correspond to the two crashes 
> 
> Unfortunately, since this is a production server, I cannot do many
> experiments, but I might try to set up reiserfs+quotas on some test
> computers if necessary.

Thanks for the report
Honza
--- linux/fs/reiserfs/stree.c   2005-03-03 18:58:30.0 +0100
+++ linux/fs/reiserfs/stree.c   2005-05-03 14:47:12.0 +0200
@@ -1790,7 +1790,7 @@ int reiserfs_do_truncate (struct reiserf
 if (retval == POSITION_FOUND || retval == FILE_NOT_FOUND) {
reiserfs_warning (p_s_inode->i_sb, "PAP-5660: reiserfs_do_truncate: "
  "wrong result %d of search for %K", retval, 
&s_item_key);
-
+   dump_stack();
 err = -EIO;
 goto out;
 }


[PATCH] Make reiserfs BUG on too big transaction

2005-05-19 Thread Jan Kara
  Hello!

  The attached patch makes reiserfs BUG() when somebody tries to start a
larger transaction than is allowed (currently the code just silently
deadlocks). I think this is better behaviour. Can you please apply the
patch?

Honza
Make kernel BUG when someone tries to start a transaction which is too
large.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rup linux-2.6.12-rc4/fs/reiserfs/journal.c 
linux-2.6.12-rc4-reiserassert/fs/reiserfs/journal.c
--- linux-2.6.12-rc4/fs/reiserfs/journal.c  Sat May 14 13:02:01 2005
+++ linux-2.6.12-rc4-reiserassert/fs/reiserfs/journal.c Sat May 14 13:21:43 2005
@@ -2631,6 +2631,8 @@ static int do_journal_begin_r(struct rei
   int retval;
 
   reiserfs_check_lock_depth(p_s_sb, "journal_begin") ;
+  if (nblocks > journal->j_trans_max)
+   BUG();
 
   PROC_INFO_INC( p_s_sb, journal.journal_being );
   /* set here for journal_join */


[PATCH] Fix quota transaction size

2005-05-19 Thread Jan Kara
  Hello,

  attached patch improves the estimates on the number of credits needed
for a quota operation. This is needed as currently quota overflows the
maximum size of a transaction if 1KB blocksize is used. Please apply.

Honza
Improve the estimates on the number of needed credits for quota transaction.
We now distinguish blocks which might need to be allocated and blocks that
only need to be rewritten. Also we distinguish deleting of a quota structure
and creating of a new one.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rup linux-2.6.12-rc4-reiserassert/fs/ext3/inode.c 
linux-2.6.12-rc4-quotareserve/fs/ext3/inode.c
--- linux-2.6.12-rc4-reiserassert/fs/ext3/inode.c   Sat May 14 13:02:00 2005
+++ linux-2.6.12-rc4-quotareserve/fs/ext3/inode.c   Sat May 14 13:50:23 2005
@@ -2763,7 +2763,8 @@ int ext3_setattr(struct dentry *dentry, 
 
/* (user+group)*(old+new) structure, inode write (sb,
 * inode block, ? - but truncate inode update has it) */
-   handle = ext3_journal_start(inode, 4*EXT3_QUOTA_INIT_BLOCKS+3);
+   handle = ext3_journal_start(inode, 2*(EXT3_QUOTA_INIT_BLOCKS+
+   EXT3_QUOTA_DEL_BLOCKS)+3);
if (IS_ERR(handle)) {
error = PTR_ERR(handle);
goto err_out;
diff -rup linux-2.6.12-rc4-reiserassert/fs/ext3/super.c 
linux-2.6.12-rc4-quotareserve/fs/ext3/super.c
--- linux-2.6.12-rc4-reiserassert/fs/ext3/super.c   Sat May 14 13:02:00 2005
+++ linux-2.6.12-rc4-quotareserve/fs/ext3/super.c   Sat May 14 13:55:23 2005
@@ -2246,7 +2246,7 @@ static int ext3_dquot_drop(struct inode 
int ret, err;
 
/* We may delete quota structure so we need to reserve enough blocks */
-   handle = ext3_journal_start(inode, 2*EXT3_QUOTA_INIT_BLOCKS);
+   handle = ext3_journal_start(inode, 2*EXT3_QUOTA_DEL_BLOCKS);
if (IS_ERR(handle))
return PTR_ERR(handle);
ret = dquot_drop(inode);
@@ -2296,7 +2296,7 @@ static int ext3_release_dquot(struct dqu
handle_t *handle;
 
handle = ext3_journal_start(dquot_to_inode(dquot),
-   EXT3_QUOTA_INIT_BLOCKS);
+   EXT3_QUOTA_DEL_BLOCKS);
if (IS_ERR(handle))
return PTR_ERR(handle);
ret = dquot_release(dquot);
diff -rup linux-2.6.12-rc4-reiserassert/fs/reiserfs/inode.c 
linux-2.6.12-rc4-quotareserve/fs/reiserfs/inode.c
--- linux-2.6.12-rc4-reiserassert/fs/reiserfs/inode.c   Sat May 14 16:53:47 2005
+++ linux-2.6.12-rc4-quotareserve/fs/reiserfs/inode.c   Sat May 14 16:53:01 2005
@@ -2798,10 +2798,10 @@ int reiserfs_setattr(struct dentry *dent
struct reiserfs_transaction_handle th;
 
/* (user+group)*(old+new) structure - we count quota info 
and , inode write (sb, inode) */
-   journal_begin(&th, inode->i_sb, 
4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_begin(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
 error = DQUOT_TRANSFER(inode, attr) ? -EDQUOT : 0;
if (error) {
-   journal_end(&th, inode->i_sb, 
4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_end(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
goto out;
}
/* Update corresponding info in inode so that everything is 
in
@@ -2811,7 +2811,7 @@ int reiserfs_setattr(struct dentry *dent
if (attr->ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
mark_inode_dirty(inode);
-   journal_end(&th, inode->i_sb, 
4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_end(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
}
 }
 if (!error)
diff -rup linux-2.6.12-rc4-reiserassert/fs/reiserfs/namei.c 
linux-2.6.12-rc4-quotareserve/fs/reiserfs/namei.c
--- linux-2.6.12-rc4-reiserassert/fs/reiserfs/namei.c   Sat May 14 16:52:17 2005
+++ linux-2.6.12-rc4-quotareserve/fs/reiserfs/namei.c   Sat May 14 17:00:58 2005
@@ -829,8 +829,10 @@ static int reiserfs_rmdir (struct inode 
 
 
 /* we will be doing 2 balancings and update 2 stat data, we change quotas
- * of the owner of the directory and of the owner of the parent directory 
*/
-jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 + 2 + 2 * 
(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_TRANS_BLOCKS);
+ * of the owner of the directory and of the owner of the parent directory.
+ * The quota structure is possibly deleted only on last iput => outside
+ * of this transaction */
+jbegin_count = JO

[PATCH] Check journal_begin() return codes

2005-05-19 Thread Jan Kara
  Hello,

  attached patch makes quota code in reiserfs properly check the return
code of journal_begin() and journal_end() functions. The patch is to be
applied after the previous patch fixing the quota transaction size.
Please apply.

Honza
Check return values of journal_begin() and journal_end() in the quota code
for reiserfs.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rup linux-2.6.12-rc4-quotareserve/fs/reiserfs/inode.c 
linux-2.6.12-rc4-transstart/fs/reiserfs/inode.c
--- linux-2.6.12-rc4-quotareserve/fs/reiserfs/inode.c   Sat May 14 16:53:01 2005
+++ linux-2.6.12-rc4-transstart/fs/reiserfs/inode.c Sat May 14 17:08:02 2005
@@ -2798,7 +2798,9 @@ int reiserfs_setattr(struct dentry *dent
struct reiserfs_transaction_handle th;
 
/* (user+group)*(old+new) structure - we count quota info 
and , inode write (sb, inode) */
-   journal_begin(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
+   error = journal_begin(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
+   if (error)
+   goto out;
 error = DQUOT_TRANSFER(inode, attr) ? -EDQUOT : 0;
if (error) {
journal_end(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
@@ -2811,7 +2813,7 @@ int reiserfs_setattr(struct dentry *dent
if (attr->ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
mark_inode_dirty(inode);
-   journal_end(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
+   error = journal_end(&th, inode->i_sb, 
2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
}
 }
 if (!error)
diff -rup linux-2.6.12-rc4-quotareserve/fs/reiserfs/super.c 
linux-2.6.12-rc4-transstart/fs/reiserfs/super.c
--- linux-2.6.12-rc4-quotareserve/fs/reiserfs/super.c   Sat May 14 17:01:56 2005
+++ linux-2.6.12-rc4-transstart/fs/reiserfs/super.c Sat May 14 17:16:15 2005
@@ -1839,13 +1839,18 @@ static int reiserfs_statfs (struct super
 static int reiserfs_dquot_initialize(struct inode *inode, int type)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 /* We may create quota structure so we need to reserve enough blocks */
 reiserfs_write_lock(inode->i_sb);
-journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+ret = journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_initialize(inode, type);
-journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+err = journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(inode->i_sb);
 return ret;
 }
@@ -1853,13 +1858,18 @@ static int reiserfs_dquot_initialize(str
 static int reiserfs_dquot_drop(struct inode *inode)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 /* We may delete quota structure so we need to reserve enough blocks */
 reiserfs_write_lock(inode->i_sb);
-journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_DEL_BLOCKS);
+ret = journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_DEL_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_drop(inode);
-journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_DEL_BLOCKS);
+err = journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_DEL_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(inode->i_sb);
 return ret;
 }
@@ -1867,12 +1877,17 @@ static int reiserfs_dquot_drop(struct in
 static int reiserfs_write_dquot(struct dquot *dquot)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 reiserfs_write_lock(dquot->dq_sb);
-journal_begin(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+ret = journal_begin(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_commit(dquot);
-journal_end(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+err = journal_end(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(dquot->dq_sb);
 return ret;
 }
@@ -1880,12 +1895,17 @@ static int reiserfs_write_dquot(struct d
 static int reiserfs_acquire_dquot(struct dquot *dquot)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 reiserfs_write_lock(dquot->dq_sb);
-journal_begin(&th, dquot->dq_sb, REISERFS_QUOTA_INIT_BLOCKS);
+ret =
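
Every function touched by this patch follows the same shape: bail out if the
transaction cannot be started, run the quota operation, always end the
transaction, and report the first error seen. Below is a minimal userspace
sketch of that control flow. The stubs txn_begin(), txn_end() and
do_quota_op() and the globals that steer their return values are hypothetical
stand-ins, not reiserfs functions.

```c
#include <assert.h>

/* Hypothetical transaction handle; only tracks whether it is open. */
struct txn { int open; };

/* Values the stubs below will return, so the sketch can be exercised. */
static int begin_rc, op_rc, end_rc;

static int txn_begin(struct txn *t)
{
	if (begin_rc)
		return begin_rc;	/* transaction was never started */
	t->open = 1;
	return 0;
}

static int txn_end(struct txn *t)
{
	assert(t->open);	/* must only be called after a successful begin */
	t->open = 0;
	return end_rc;
}

static int do_quota_op(void)
{
	return op_rc;
}

/* Same shape as the patched functions: the first error wins. */
static int quota_op_in_txn(void)
{
	struct txn t = { 0 };
	int ret, err;

	ret = txn_begin(&t);
	if (ret)
		goto out;	/* nothing to end if begin failed */
	ret = do_quota_op();
	err = txn_end(&t);	/* always end a started transaction */
	if (!ret && err)
		ret = err;	/* report the end error only if the op succeeded */
out:
	return ret;
}
```

Keeping the operation's own error in preference to the one from txn_end()
matches the `if (!ret && err) ret = err;` lines in the diff.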

Re: [PATCH] Fix quota transaction size

2005-05-20 Thread Jan Kara
  Hello,

> On Thursday 19 May 2005 05:40, Jan Kara wrote:
> >   Hello,
> >
> >   attached patch improves the estimates on the number of credits needed
> > for a quota operation. This is needed as currently quota overflows the
> > maximum size of a transaction if 1KB blocksize is used. Please apply.
> 
> Thanks Jan,
> 
> It would make more sense to only allocate for the quota if quotas are
> in use.  When you have 10 or more concurrent procs unlinking things,
> they end up waiting for each other because they are trying to reserve
> so many blocks in the transaction.  So, a smaller reservation allows
> for better concurrency when quotas are off.
  That's a good point. Checking whether quota is enabled at transaction
start is probably not a good option, as quotas can be turned on while
some transaction is in progress. But we may check whether the filesystem
was mounted with some quota option (better to set a superblock flag when
it is) and reserve space in the transaction for quotas in that case.
Mount options can be changed only on remount, which nicely synchronizes
everything anyway. And running a filesystem with quota options but
without quota turned on is probably a case which can suffer some
penalty. I'll write the patch for this.

        Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH] Fix quota transaction size

2005-05-20 Thread Jan Kara
>   Hello,
> 
>   attached patch improves the estimates on the number of credits needed
> for a quota operation. This is needed as currently quota overflows the
> maximum size of a transaction if 1KB blocksize is used. Please apply.
> 
  Sorry for replying to myself, but I just found out that this patch did
not compile without CONFIG_QUOTA. The attached patch has been tested to
compile without it as well.

Honza

Improve estimates on the number of needed credits for quota transaction.
Now we distinguish blocks that might need to be allocated and blocks that
only need to be rewritten. Also we distinguish deleting of a quota structure
and creating of a new one.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/ext3/inode.c linux-2.6.12-rc4-2-credits/fs/ext3/inode.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/ext3/inode.c   2005-05-18 15:10:55.0 +0200
+++ linux-2.6.12-rc4-2-credits/fs/ext3/inode.c  2005-05-18 15:18:12.0 +0200
@@ -2763,7 +2763,8 @@ int ext3_setattr(struct dentry *dentry, 
 
/* (user+group)*(old+new) structure, inode write (sb,
 * inode block, ? - but truncate inode update has it) */
-   handle = ext3_journal_start(inode, 4*EXT3_QUOTA_INIT_BLOCKS+3);
+   handle = ext3_journal_start(inode, 2*(EXT3_QUOTA_INIT_BLOCKS+
+   EXT3_QUOTA_DEL_BLOCKS)+3);
if (IS_ERR(handle)) {
error = PTR_ERR(handle);
goto err_out;
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/ext3/super.c linux-2.6.12-rc4-2-credits/fs/ext3/super.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/ext3/super.c   2005-05-18 15:10:55.0 +0200
+++ linux-2.6.12-rc4-2-credits/fs/ext3/super.c  2005-05-18 15:18:12.0 +0200
@@ -2246,7 +2246,7 @@ static int ext3_dquot_drop(struct inode 
int ret, err;
 
/* We may delete quota structure so we need to reserve enough blocks */
-   handle = ext3_journal_start(inode, 2*EXT3_QUOTA_INIT_BLOCKS);
+   handle = ext3_journal_start(inode, 2*EXT3_QUOTA_DEL_BLOCKS);
if (IS_ERR(handle))
return PTR_ERR(handle);
ret = dquot_drop(inode);
@@ -2296,7 +2296,7 @@ static int ext3_release_dquot(struct dqu
handle_t *handle;
 
handle = ext3_journal_start(dquot_to_inode(dquot),
-   EXT3_QUOTA_INIT_BLOCKS);
+   EXT3_QUOTA_DEL_BLOCKS);
if (IS_ERR(handle))
return PTR_ERR(handle);
ret = dquot_release(dquot);
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/inode.c linux-2.6.12-rc4-2-credits/fs/reiserfs/inode.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/inode.c   2005-05-18 15:10:59.0 +0200
+++ linux-2.6.12-rc4-2-credits/fs/reiserfs/inode.c  2005-05-18 15:18:12.0 +0200
@@ -2798,10 +2798,10 @@ int reiserfs_setattr(struct dentry *dent
struct reiserfs_transaction_handle th;
 
/* (user+group)*(old+new) structure - we count quota info and , inode write (sb, inode) */
-   journal_begin(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_begin(&th, inode->i_sb, 2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
 error = DQUOT_TRANSFER(inode, attr) ? -EDQUOT : 0;
if (error) {
-   journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_end(&th, inode->i_sb, 2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
goto out;
}
/* Update corresponding info in inode so that everything is in
@@ -2811,7 +2811,7 @@ int reiserfs_setattr(struct dentry *dent
if (attr->ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
mark_inode_dirty(inode);
-   journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_end(&th, inode->i_sb, 2*(REISERFS_QUOTA_INIT_BLOCKS+REISERFS_QUOTA_DEL_BLOCKS)+2);
}
 }
 if (!error)
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/namei.c linux-2.6.12-rc4-2-credits/fs/reiserfs/namei.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/namei.c   2005-05-18 15:10:59.0 +0200
+++ linux-2.6.12-rc4-2-credits/fs/reiserfs/namei.c  2005-05-18 15:18:12.0 +0200
@@ -829,8 +829,10 @@ static int reiserfs_rmdir (struct inode 
 
 
 /* we will be doing 2 balancings and update 2 stat data, we change quotas

[PATCH] Add checking of journal_begin() return value

2005-05-24 Thread Jan Kara
  Hello,

  attached patch adds proper checking of the return values of
journal_begin() and journal_end() to the quota code in reiserfs.
I already sent a similar patch but it got rejected due to the dependency
on another rejected patch ;) Now the patch is rediffed to be
independent. Please apply.
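
  For readers skimming the diff, the error-handling pattern it applies to
each quota callback can be sketched in plain C. The three stubs below are
hypothetical stand-ins for journal_begin(), the quota operation and
journal_end(); they just return preset codes so the control flow can be
exercised outside the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Preset return codes for the three steps (hypothetical test harness). */
struct outcome {
    int begin;  /* what the journal_begin() stand-in returns */
    int op;     /* what the quota operation stand-in returns */
    int end;    /* what the journal_end() stand-in returns */
};

static int begin_stub(const struct outcome *o) { return o->begin; }
static int op_stub(const struct outcome *o)    { return o->op; }
static int end_stub(const struct outcome *o)   { return o->end; }

/* Mirrors the ret/err handling of the patch; *ran_op records whether
 * the quota operation was attempted at all. */
static int quota_op_checked(const struct outcome *o, int *ran_op)
{
    int ret, err;

    assert(o != NULL && ran_op != NULL);
    *ran_op = 0;
    ret = begin_stub(o);
    if (ret)
        goto out;       /* no transaction: skip the operation and the end */
    *ran_op = 1;
    ret = op_stub(o);
    err = end_stub(o);
    if (!ret && err)    /* an operation error wins over an end error */
        ret = err;
out:
    return ret;
}
```

The point of keeping two variables is that a failure of the quota
operation itself is never overwritten by the result of closing the
transaction.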

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Check return values of journal_begin() and journal_end() in the quota code
for reiserfs.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/inode.c linux-2.6.12-rc4-2-transstart/fs/reiserfs/inode.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/inode.c   2005-05-18 15:10:59.0 +0200
+++ linux-2.6.12-rc4-2-transstart/fs/reiserfs/inode.c   2005-05-23 17:05:01.0 +0200
@@ -2798,7 +2798,9 @@ int reiserfs_setattr(struct dentry *dent
struct reiserfs_transaction_handle th;
 
/* (user+group)*(old+new) structure - we count quota info and , inode write (sb, inode) */
-   journal_begin(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   error = journal_begin(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   if (error)
+   goto out;
 error = DQUOT_TRANSFER(inode, attr) ? -EDQUOT : 0;
if (error) {
journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
@@ -2811,7 +2813,7 @@ int reiserfs_setattr(struct dentry *dent
if (attr->ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
mark_inode_dirty(inode);
-   journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   error = journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
}
 }
 if (!error)
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/super.c linux-2.6.12-rc4-2-transstart/fs/reiserfs/super.c
--- linux-2.6.12-rc4-1-reiserbigtrans/fs/reiserfs/super.c   2005-05-18 15:10:59.0 +0200
+++ linux-2.6.12-rc4-2-transstart/fs/reiserfs/super.c   2005-05-23 17:08:32.0 +0200
@@ -1839,13 +1839,18 @@ static int reiserfs_statfs (struct super
 static int reiserfs_dquot_initialize(struct inode *inode, int type)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 /* We may create quota structure so we need to reserve enough blocks */
 reiserfs_write_lock(inode->i_sb);
-journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+ret = journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_initialize(inode, type);
-journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+err = journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(inode->i_sb);
 return ret;
 }
@@ -1853,13 +1858,18 @@ static int reiserfs_dquot_initialize(str
 static int reiserfs_dquot_drop(struct inode *inode)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 /* We may delete quota structure so we need to reserve enough blocks */
 reiserfs_write_lock(inode->i_sb);
-journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+ret = journal_begin(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_drop(inode);
-journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+err = journal_end(&th, inode->i_sb, 2*REISERFS_QUOTA_INIT_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(inode->i_sb);
 return ret;
 }
@@ -1867,12 +1877,17 @@ static int reiserfs_dquot_drop(struct in
 static int reiserfs_write_dquot(struct dquot *dquot)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 reiserfs_write_lock(dquot->dq_sb);
-journal_begin(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+ret = journal_begin(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+if (ret)
+   goto out;
 ret = dquot_commit(dquot);
-journal_end(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+err = journal_end(&th, dquot->dq_sb, REISERFS_QUOTA_TRANS_BLOCKS);
+if (!ret && err)
+   ret = err;
+out:
 reiserfs_write_unlock(dquot->dq_sb);
 return ret;
 }
@@ -1880,12 +1895,17 @@ static int reiserfs_write_dquot(struct d
 static int reiserfs_acquire_dquot(struct dquot *dquot)
 {
 struct reiserfs_transaction_handle th;
-int ret;
+int ret, err;
 
 r

[PATCH] Improve credits estimates for quota

2005-05-24 Thread Jan Kara
  Hello,

  attached patch is the first of three patches improving the estimates
of the number of credits needed to perform a quota operation. We now
distinguish whether a block needs to be allocated or just rewritten. This
change also fixes a problem with reiserfs and 1KB blocksize where we
exceeded the maximum transaction size.
  This patch implements the changes in the generic quota code. Please apply.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Improve estimates on the number of needed credits for quota transaction.
Now we distinguish blocks that might need to be allocated and blocks that
only need to be rewritten. Also we distinguish deleting of a quota structure
and creating of a new one.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-2-transstart/include/linux/dqblk_v1.h linux-2.6.12-rc4-3-credits/include/linux/dqblk_v1.h
--- linux-2.6.12-rc4-2-transstart/include/linux/dqblk_v1.h  2004-10-18 23:54:07.0 +0200
+++ linux-2.6.12-rc4-3-credits/include/linux/dqblk_v1.h 2005-05-24 11:13:45.0 +0200
@@ -11,6 +11,12 @@
 /* Root squash turned on */
 #define V1_DQF_RSQUASH 1
 
+/* Numbers of blocks needed for updates */
+#define V1_INIT_ALLOC 1
+#define V1_INIT_REWRITE 1
+#define V1_DEL_ALLOC 0
+#define V1_DEL_REWRITE 2
+
 /* Special information about quotafile */
 struct v1_mem_dqinfo {
 };
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-2-transstart/include/linux/dqblk_v2.h linux-2.6.12-rc4-3-credits/include/linux/dqblk_v2.h
--- linux-2.6.12-rc4-2-transstart/include/linux/dqblk_v2.h  2004-10-18 23:53:46.0 +0200
+++ linux-2.6.12-rc4-3-credits/include/linux/dqblk_v2.h 2005-05-24 11:13:45.0 +0200
@@ -10,6 +10,12 @@
 /* id numbers of quota format */
 #define QFMT_VFS_V0 2
 
+/* Numbers of blocks needed for updates */
+#define V2_INIT_ALLOC 4
+#define V2_INIT_REWRITE 2
+#define V2_DEL_ALLOC 0
+#define V2_DEL_REWRITE 6
+
 /* Inmemory copy of version specific information */
 struct v2_mem_dqinfo {
unsigned int dqi_blocks;
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-2-transstart/include/linux/quota.h linux-2.6.12-rc4-3-credits/include/linux/quota.h
--- linux-2.6.12-rc4-2-transstart/include/linux/quota.h 2005-03-03 18:58:41.0 +0100
+++ linux-2.6.12-rc4-3-credits/include/linux/quota.h 2005-05-24 11:13:45.0 +0200
@@ -138,8 +138,11 @@ struct if_dqinfo {
 #include 
 
 /* Maximal numbers of writes for quota operation (insert/delete/update)
- * (over all formats) - info block, 4 pointer blocks, data block */
-#define DQUOT_MAX_WRITES   6
+ * (over VFS all formats) */
+#define DQUOT_INIT_ALLOC max(V1_INIT_ALLOC, V2_INIT_ALLOC)
+#define DQUOT_INIT_REWRITE max(V1_INIT_REWRITE, V2_INIT_REWRITE)
+#define DQUOT_DEL_ALLOC max(V1_DEL_ALLOC, V2_DEL_ALLOC)
+#define DQUOT_DEL_REWRITE max(V1_DEL_REWRITE, V2_DEL_REWRITE)
 
 /*
  * Data for one user/group kept in memory
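
As a worked check of the numbers in the hunks above, the maxima over both
quota formats can be computed in a small user-space program. The V1_* and
V2_* values are copied from the diff; MAX_SKETCH and the two helper
functions are hypothetical names added only for this illustration, and
simply adding the "alloc" and "rewrite" counts is an assumption made here,
not something the patch itself does.

```c
#include <assert.h>

/* Values copied from the dqblk_v1.h and dqblk_v2.h hunks. */
#define V1_INIT_ALLOC   1
#define V1_INIT_REWRITE 1
#define V1_DEL_ALLOC    0
#define V1_DEL_REWRITE  2
#define V2_INIT_ALLOC   4
#define V2_INIT_REWRITE 2
#define V2_DEL_ALLOC    0
#define V2_DEL_REWRITE  6

/* Plain user-space replacement for the kernel's max() helper. */
#define MAX_SKETCH(a, b) ((a) > (b) ? (a) : (b))

enum {
    DQUOT_INIT_ALLOC   = MAX_SKETCH(V1_INIT_ALLOC, V2_INIT_ALLOC),
    DQUOT_INIT_REWRITE = MAX_SKETCH(V1_INIT_REWRITE, V2_INIT_REWRITE),
    DQUOT_DEL_ALLOC    = MAX_SKETCH(V1_DEL_ALLOC, V2_DEL_ALLOC),
    DQUOT_DEL_REWRITE  = MAX_SKETCH(V1_DEL_REWRITE, V2_DEL_REWRITE)
};

/* Quota-file blocks touched when creating a quota structure. */
static int init_blocks_touched(void)
{
    return DQUOT_INIT_ALLOC + DQUOT_INIT_REWRITE;
}

/* Quota-file blocks touched when deleting a quota structure. */
static int del_blocks_touched(void)
{
    return DQUOT_DEL_ALLOC + DQUOT_DEL_REWRITE;
}
```

With these values both totals come out to 6, the same as the old
DQUOT_MAX_WRITES. What the split adds is the information that a delete
never allocates, while a create may allocate up to 4 blocks.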


[PATCH] Improve quota credit estimates for reiserfs

2005-05-24 Thread Jan Kara
  Hello,

  attached patch makes reiserfs use the improved quota credit estimates. It
also teaches reiserfs to reserve space in a transaction only if some
quota mount options were specified. Please apply.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Use improved credits estimates for quota operations. Also reserve space
for a quota operation in a transaction only if filesystem was mounted with
some quota option.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-4-credits-ext3/fs/reiserfs/file.c linux-2.6.12-rc4-5-credits-reiser/fs/reiserfs/file.c
--- linux-2.6.12-rc4-4-credits-ext3/fs/reiserfs/file.c  2005-05-18 15:10:59.0 +0200
+++ linux-2.6.12-rc4-5-credits-reiser/fs/reiserfs/file.c 2005-05-24 11:25:59.0 +0200
@@ -201,7 +201,7 @@ static int reiserfs_allocate_blocks_for_
 /* If we came here, it means we absolutely need to open a transaction,
since we need to allocate some blocks */
 reiserfs_write_lock(inode->i_sb); // Journaling stuff and we need that.
-res = journal_begin(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS); // Wish I know if this number enough
+res = journal_begin(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS(inode->i_sb)); // Wish I know if this number enough
 if (res)
 goto error_exit;
 reiserfs_update_inode_transaction(inode) ;
@@ -576,7 +576,7 @@ error_exit:
 int err;
 // update any changes we made to blk count
 reiserfs_update_sd(th, inode);
-err = journal_end(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS);
+err = journal_end(th, inode->i_sb, JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS(inode->i_sb));
 if (err)
 res = err;
 }
diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc4-4-credits-ext3/fs/reiserfs/inode.c linux-2.6.12-rc4-5-credits-reiser/fs/reiserfs/inode.c
--- linux-2.6.12-rc4-4-credits-ext3/fs/reiserfs/inode.c 2005-05-23 17:05:01.0 +0200
+++ linux-2.6.12-rc4-5-credits-reiser/fs/reiserfs/inode.c   2005-05-24 11:27:23.0 +0200
@@ -28,7 +28,7 @@ static int reiserfs_prepare_write(struct
 void reiserfs_delete_inode (struct inode * inode)
 {
 /* We need blocks for transaction + (user+group) quota update (possibly delete) */
-int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 + 2 * REISERFS_QUOTA_INIT_BLOCKS;
+int jbegin_count = JOURNAL_PER_BALANCE_CNT * 2 + 2 * REISERFS_QUOTA_INIT_BLOCKS(inode->i_sb);
 struct reiserfs_transaction_handle th ;
   
 reiserfs_write_lock(inode->i_sb);
@@ -591,7 +591,7 @@ int reiserfs_get_block (struct inode * i
XXX in practically impossible worst case direct2indirect()
can incur (much) more than 3 balancings.
quota update for user, group */
-int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS;
+int jbegin_count = JOURNAL_PER_BALANCE_CNT * 3 + 1 + 2 * REISERFS_QUOTA_TRANS_BLOCKS(inode->i_sb);
 int version;
 int dangle = 1;
 loff_t new_offset = (((loff_t)block) << inode->i_sb->s_blocksize_bits) + 1 ;
@@ -2796,14 +2796,15 @@ int reiserfs_setattr(struct dentry *dent
 
 if (!error) {
struct reiserfs_transaction_handle th;
+   int jbegin_count = 2*(REISERFS_QUOTA_INIT_BLOCKS(inode->i_sb)+REISERFS_QUOTA_DEL_BLOCKS(inode->i_sb))+2;
 
/* (user+group)*(old+new) structure - we count quota info and , inode write (sb, inode) */
-   error = journal_begin(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   error = journal_begin(&th, inode->i_sb, jbegin_count);
if (error)
goto out;
 error = DQUOT_TRANSFER(inode, attr) ? -EDQUOT : 0;
if (error) {
-   journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   journal_end(&th, inode->i_sb, jbegin_count);
goto out;
}
/* Update corresponding info in inode so that everything is in
@@ -2813,7 +2814,7 @@ int reiserfs_setattr(struct dentry *dent
if (attr->ia_valid & ATTR_GID)
inode->i_gid = attr->ia_gid;
mark_inode_dirty(inode);
-   error = journal_end(&th, inode->i_sb, 4*REISERFS_QUOTA_INIT_BLOCKS+2);
+   error = journal_end(&th, inode->i_sb, jbegin_count);
}
 }
 if (!error)
diff -rupX /home/jack/.kerndiffexclude 
linux-2.6.12-rc4-4-credits-ext3/fs/reiserfs/namei.c 
linux-2.6.12

Re: BUG: reiserfs+acl+quota deadlock

2005-08-10 Thread Jan Kara
  Hello,

> I've already reported a similar bug to the one I found now
> and that was fixed by:
> "[PATCH] reiserfs: fix deadlock in inode creation failure path w/
> default ACL"
> 
> This bug is similar in effect but has some differences in how
> to trigger it. The end effect will be just like with the other
> bug that the affected directory will be inaccessible to any user
> or process.
> 
> So here's the way to reproduce it, as minimal as I could get it:
> 
> You need reiserfs, quota and acl support in kernel.
> you also need quota tools (edquota, quotaon, quotacheck), I used
> linuxquota 3.12.
> 
> # cd /mnt
> # dd if=/dev/zero of=test bs=1M count=50
> 50+0 records in
> 50+0 records out
> # mkreiserfs -f test >/dev/null
> mkreiserfs 3.6.19 (2003 www.namesys.com)
> 
> test is not a block special device
> Continue (y/n):y
> # mkdir mpoint
> # mount test mpoint -o loop,acl,usrquota
> # mkdir mpoint/user1
> # useradd -d /mnt/mpoint/user1 user1 # may also use existing user
> # chown user1 mpoint/user1
> # quotacheck -v mpoint   # initializes quota file
> # edquota user1
>  set soft block limit to 1000, hard limit to 4000 
> # edquota -t
>  set the grace periods to something small: 1minutes ---
> # quotaon mpoint
> # ## at this point "repquota -a" should show the quota for user1
> # su user1
> # cd
> # ## now we are in user1 home dir as user1
> # cat /dev/zero > file1
> loop2: warning, user block quota exceeded.
> loop2: write failed, user block limit reached.
> cat: write error: No space left on device
> --- now we wait till the grace period expires (repquota -a) 
> # cat "" > otherfile
> loop2: write failed, user block quota exceeded too long.
>  and it will hang forever 
> # ## /mnt/mpoint can still be accessed, but /mnt/mpoint/user1 can't
> 
> 
> I tested this on an -mm patchset kernel (2.6.13-rc5-mm1), but I
> discovered the bug in my server which runs plain 2.6.12 with the
> patch from Jeff Mahoney for the first reiserfs+acl bug.
> 
> The main difference between the two bugs is that the first one requires
> the existence of a default acl, this one does not, but it does require
> acl to be enabled.
  This seems to be the same problem as bug #4771 that I've just fixed. Can
you try the attached patch, please?
  Andrew, can you include the patch into -mm if ReiserFS guys won't object?

        Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
Initialize key object ID in inode so that we don't try to remove the inode
when we fail on some checks even before we manage to allocate something.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.13-rc6/fs/reiserfs/namei.c linux-2.6.13-rc6-reiser_create_fix/fs/reiserfs/namei.c
--- linux-2.6.13-rc6/fs/reiserfs/namei.c 2005-08-12 10:39:25.0 +0200
+++ linux-2.6.13-rc6-reiser_create_fix/fs/reiserfs/namei.c  2005-08-12 10:39:07.0 +0200
@@ -593,6 +593,9 @@ static int new_inode_init(struct inode *
 */
inode->i_uid = current->fsuid;
inode->i_mode = mode;
+   /* Make inode invalid - just in case we are going to drop it before
+* the initialization happens */
+   INODE_PKEY(inode)->k_objectid = 0;
 
if (dir->i_mode & S_ISGID) {
inode->i_gid = dir->i_gid;


Re: BUG: reiserfs+acl+quota deadlock

2005-08-10 Thread Jan Kara
> Tried the attached patch but it changed nothing: trying to create
> a new file as a user whose quota grace time has run out will still
> cause everything accessing the users homedir (the one with the quota)
> to hang in D state.
> 
> Also note that the bug I reported only exists when acl is also
> enabled (does not have to be used). And although my kernel is not
> built with debug (or reiserfs debug) support, I don't get any
> oopses or reiserfs errors.. it just hangs.
  Oops, sorry. I forgot to mount the fs with the ACL mount option, so I
was not able to reproduce the hang. My fault, your bug is a different
problem. Now it hangs for me too, so I can debug it :)

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


[PATCH] Fix error handling in reiserfs

2005-08-13 Thread Jan Kara
  Hi,

  the patch below fixes an oops triggered when a user exceeds his inode
quota on reiserfs (reiserfs incorrectly thought the inode had already
been allocated and tried to free it). Please apply.
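
  The mechanism can be illustrated with a small user-space sketch. It
assumes, as the patch description suggests, that the drop path decides
whether to free on-disk data by looking at the object ID; the struct and
function names below are hypothetical, not the reiserfs ones.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inode: objectid 0 means "nothing allocated on disk yet". */
struct inode_sketch {
    unsigned int objectid;
};

/* Early initialization: mark the inode invalid before any check can fail. */
static void new_inode_init_sketch(struct inode_sketch *inode)
{
    assert(inode != NULL);
    inode->objectid = 0;
}

/* Would the drop path try to remove an on-disk object for this inode? */
static bool drop_would_delete(const struct inode_sketch *inode)
{
    return inode->objectid != 0;
}

/* Creation that can fail a quota check before anything is allocated. */
static int create_sketch(struct inode_sketch *inode, bool quota_ok)
{
    new_inode_init_sketch(inode);
    if (!quota_ok)
        return -EDQUOT;     /* fail early: objectid is still 0 */
    inode->objectid = 42;   /* pretend an object ID was allocated */
    return 0;
}
```

Without the early reset, a recycled in-memory inode could still carry a
stale non-zero ID when the quota check fails, and the cleanup would then
try to delete an object that was never created.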

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Initialize key object ID in inode so that we don't try to remove the inode
when we fail on some checks even before we manage to allocate something.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.13-rc6/fs/reiserfs/namei.c linux-2.6.13-rc6-reiser_create_fix/fs/reiserfs/namei.c
--- linux-2.6.13-rc6/fs/reiserfs/namei.c 2005-08-12 10:39:25.0 +0200
+++ linux-2.6.13-rc6-reiser_create_fix/fs/reiserfs/namei.c  2005-08-12 10:39:07.0 +0200
@@ -593,6 +593,9 @@ static int new_inode_init(struct inode *
 */
inode->i_uid = current->fsuid;
inode->i_mode = mode;
+   /* Make inode invalid - just in case we are going to drop it before
+* the initialization happens */
+   INODE_PKEY(inode)->k_objectid = 0;
 
if (dir->i_mode & S_ISGID) {
inode->i_gid = dir->i_gid;


Re: BUG: reiserfs+acl+quota deadlock

2005-08-13 Thread Jan Kara
> Tried the attached patch but it changed nothing: trying to create
> a new file as a user whose quota grace time has run out will still
> cause everything accessing the users homedir (the one with the quota)
> to hang in D state.
> 
> Also note that the bug I reported only exists when acl is also
> enabled (does not have to be used). And although my kernel is not
> built with debug (or reiserfs debug) support, I don't get any
> oopses or reiserfs errors.. it just hangs.
  OK, I've debugged the hang (I think the bug was actually introduced by
Jeff's fix). Attached patch should fix it.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
When i_acl_default is set to some error we do not hold the lock (hence we are
not allowed to drop it and reacquire later).

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.13-rc6-1-reiser_create_fix/fs/reiserfs/inode.c linux-2.6.13-rc6-2-reiser_xattr_fix/fs/reiserfs/inode.c
--- linux-2.6.13-rc6-1-reiser_create_fix/fs/reiserfs/inode.c 2005-08-14 17:10:21.0 +0200
+++ linux-2.6.13-rc6-2-reiser_xattr_fix/fs/reiserfs/inode.c 2005-08-14 17:11:35.0 +0200
@@ -1985,7 +1985,7 @@ int reiserfs_new_inode(struct reiserfs_t
 * iput doesn't deadlock in reiserfs_delete_xattrs. The locking
 * code really needs to be reworked, but this will take care of it
 * for now. -jeffm */
-   if (REISERFS_I(dir)->i_acl_default) {
+   if (REISERFS_I(dir)->i_acl_default && !IS_ERR(REISERFS_I(dir)->i_acl_default)) {
reiserfs_write_unlock_xattrs(dir->i_sb);
iput(inode);
reiserfs_write_lock_xattrs(dir->i_sb);
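
Why a plain non-NULL test is not enough here can be shown with a
user-space imitation of the kernel's error-pointer convention. The helpers
below are simplified stand-ins written for this sketch (the 4095 limit and
all names are assumptions, not the kernel's definitions); they only
demonstrate that an error-encoded pointer is non-NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded into a pointer. */
#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value as a pointer (imitates ERR_PTR). */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO_SKETCH);
    return (void *)(intptr_t)error;
}

/* True if the pointer carries an encoded error (imitates IS_ERR). */
static bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO_SKETCH);
}

/* Old condition: any non-NULL value is taken to mean "lock is held". */
static bool must_cycle_lock_old(const void *acl_default)
{
    return acl_default != NULL;
}

/* Patched condition: only a real ACL pointer means the lock is held. */
static bool must_cycle_lock_new(const void *acl_default)
{
    return acl_default != NULL && !is_err(acl_default);
}
```

With an error stored in the field, the old test says "drop and retake the
lock" although the lock was never taken, which matches the hang described
in this thread; the patched test does not.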


Re: BUG: reiserfs+acl+quota deadlock

2005-08-18 Thread Jan Kara
  Hello,

> Jan Kara wrote:
> >>Tried the attached patch but it changed nothing: trying to create
> >>a new file as a user whose quota grace time has run out will still
> >>cause everything accessing the users homedir (the one with the quota)
> >>to hang in D state.
> >>
> >>Also note that the bug I reported only exists when acl is also
> >>enabled (does not have to be used). And although my kernel is not
> >>built with debug (or reiserfs debug) support, I don't get any
> >>oopses or reiserfs errors.. it just hangs.
> >
> 
> It looks like the problem is that reiserfs_new_inode can be called either 
> with xattrs locked or not.
> It unlocks and relocks xattrs on the error handling path, but has no idea 
> whether xattrs are locked or not.
> The attached patch seems to fix the problem.
> I am not sure whether it is the correct way to fix this problem, though.
  I've already fixed this problem and Andrew accepted the patch into
-mm. I took a slightly different approach but yours might be better in the
long run (mine is just a one-liner). The patch is attached if you're
interested.

Honza
--
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs
When i_acl_default is set to some error we do not hold the lock (hence we are
not allowed to drop it and reacquire later).

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.13-rc6-1-reiser_create_fix/fs/reiserfs/inode.c linux-2.6.13-rc6-2-reiser_xattr_fix/fs/reiserfs/inode.c
--- linux-2.6.13-rc6-1-reiser_create_fix/fs/reiserfs/inode.c 2005-08-14 17:10:21.0 +0200
+++ linux-2.6.13-rc6-2-reiser_xattr_fix/fs/reiserfs/inode.c 2005-08-14 17:11:35.0 +0200
@@ -1985,7 +1985,7 @@ int reiserfs_new_inode(struct reiserfs_t
 * iput doesn't deadlock in reiserfs_delete_xattrs. The locking
 * code really needs to be reworked, but this will take care of it
 * for now. -jeffm */
-   if (REISERFS_I(dir)->i_acl_default) {
+   if (REISERFS_I(dir)->i_acl_default && !IS_ERR(REISERFS_I(dir)->i_acl_default)) {
reiserfs_write_unlock_xattrs(dir->i_sb);
iput(inode);
reiserfs_write_lock_xattrs(dir->i_sb);


Re: is quota functional in 2.6.12.3 ?

2005-08-23 Thread Jan Kara
  Hello,

> I tried to use quota with ReiserFS and a stock kernel 2.6.12.3.
> 
> Everything seems to work fine, except that quotas are not enforced: the
> user can use as much space as he wants! However, the repquota output seems OK.
> 
> Any idea ?
  You probably forgot to turn quotas on using the quotaon(8) command.

> [EMAIL PROTECTED] linux-2.6.12.3]# repquota /home
> *** Report for user quotas on device /dev/sda7
> Block grace time: 00:00; Inode grace time: 7days
>                         Block limits                File limits
> User            used    soft    hard  grace    used  soft  hard  grace
> ----------------------------------------------------------------------
> root      --  127242       0       0           4155     0     0
> chris     +-   83201   10240   20480   none     320     0     0



Honza


[PATCH] Fix return value in reiserfs allocator

2005-11-07 Thread Jan Kara
  Hello,

  the attached patch makes reiserfs correctly return EDQUOT when the
allocation failed due to quotas (so far we just returned ENOSPC).
Please apply.
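
  In isolation, the mapping the patch introduces looks roughly like the
sketch below. The numeric values of the allocator result codes are
assumptions chosen for this example (only the names CARRY_ON and
QUOTA_EXCEEDED appear in the patch), and the *_SKETCH identifiers and the
helper function are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical allocator result codes; the real values may differ. */
enum alloc_result {
    CARRY_ON_SKETCH       = 0,   /* allocation succeeded */
    NO_DISK_SPACE_SKETCH  = -3,  /* filesystem is full */
    QUOTA_EXCEEDED_SKETCH = -6   /* quota refused the allocation */
};

/* Translate an allocator result into the errno returned to the caller. */
static int alloc_result_to_errno(enum alloc_result res)
{
    assert(res <= 0);  /* result codes are zero or negative in this sketch */
    if (res == CARRY_ON_SKETCH)
        return 0;
    return res == QUOTA_EXCEEDED_SKETCH ? -EDQUOT : -ENOSPC;
}
```

The benefit for the user is that a write refused by quota reports a quota
error rather than a misleading "No space left on device".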

Honza
Return correct error code if allocation failed.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.14-rc5/fs/reiserfs/file.c linux-2.6.14-rc5-1-reiser_error_return/fs/reiserfs/file.c
--- linux-2.6.14-rc5/fs/reiserfs/file.c 2005-10-25 08:53:25.0 +0200
+++ linux-2.6.14-rc5-1-reiser_error_return/fs/reiserfs/file.c   2005-11-07 08:29:38.0 +0100
@@ -251,12 +251,12 @@ static int reiserfs_allocate_blocks_for_
   blocks_to_allocate,
   blocks_to_allocate);
if (res != CARRY_ON) {
-   res = -ENOSPC;
+   res = res == QUOTA_EXCEEDED ? -EDQUOT : -ENOSPC;
pathrelse(&path);
goto error_exit;
}
} else {
-   res = -ENOSPC;
+   res = res == QUOTA_EXCEEDED ? -EDQUOT : -ENOSPC;
pathrelse(&path);
goto error_exit;
}


another BUG with journaled quotas (fwd)

2005-12-20 Thread Jan Kara
  Hello,

  recently I got the following backtrace from one user:

BUG at fs/reiserfs/journal.c: 3097!
invalid operand:  [#1]
EIP is at journal_begin
Process nfsd...

Call trace:
reiserfs_write_dquot
_reiserfs_free_block
dquot_free_space
reiserfs_free_prealloc_block
__discard_prealloc
reiserfs_discard_all_prealloc
do_journal_end
reiserfs_dirty_inode
__mark_inode_dirty
journal_end
reiserfs_submit_file_region_for_write
reiserfs_file_write


  What happened there is that we were in do_journal_end() and we wanted
to discard some preallocation. We proceeded up to freeing the block, but
then reiserfs_write_dquot() wanted to start a transaction for the quota
operation and we got a BUG(), because the transaction handle that is
passed to the functions for discarding preallocation already has
refcount == 0.
  If I understood the preallocation code properly, the preallocation
happens only "inside one transaction" - i.e., we allocate more space at
the beginning of the transaction and when we are dropping the last
reference to the handle we discard all the space left. Am I right?
  If so, the problem in the Oops is mostly cosmetic, as all the blocks
needed for the quota operation are already attached to the transaction
from the allocation phase. So just raising the refcount of the handle
while discarding the preallocation, to make the checks happy, should be
enough. But I'm not sure this is the Right Fix... Does somebody have any
better ideas/solutions to the problem?

Bye
Honza


[PATCH] Fix assertion failure in reiserfs+journaled quotas

2006-01-09 Thread Jan Kara
  Hello,

  attached patch fixes an assertion failure caused by journaled quotas
run on reiserfs. The problem occurs when there are some files with
preallocated blocks and we decide to close the transaction. At that point
all preallocation is discarded. Quota code needs to record the amount of
freed blocks in the quota file, but the handle of the transaction in
current->journal_info already has refcount 0, so the reiserfs check fails
because quota code tries to nest into a transaction with refcount < 1.
  Attached patch just temporarily raises the refcount for the time of
discarding the preallocation - that should be pretty safe as we are
in fact still holding the reference. As the blocks needed for the quota
operation are already attached to the transaction from the previous
allocation (that must have taken place in the same transaction), we don't
have to worry about growing the transaction...
  The patch is against 2.6.14 but should apply fine against newer
kernels too. Andrew, could you please put the fix into -mm if there are
no objections? Thanks.

Honza

Sometimes we call do_journal_end() with t_refcount == 0. If quota is turned on
and we happen to have some inode with preallocation, bad things happen as we try
to use the current handle for quota operations. Checks for t_refcount in
journal_begin() fail and we Oops. We raise t_refcount to make those checks
happy. This should not cause any harm, as all the needed quota blocks should
already be attached to the transaction (they were attached to the transaction
when we allocated those preallocation blocks).

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.14/fs/reiserfs/journal.c linux-2.6.14-1-reiser_prealloc_oops/fs/reiserfs/journal.c
--- linux-2.6.14/fs/reiserfs/journal.c  2005-11-24 16:27:30.0 +0100
+++ linux-2.6.14-1-reiser_prealloc_oops/fs/reiserfs/journal.c   2006-01-08 01:48:28.0 +0100
@@ -3906,10 +3906,13 @@ static int do_journal_end(struct reiserf
flush = 1;
}
 #ifdef REISERFS_PREALLOCATE
-   /* quota ops might need to nest, setup the journal_info pointer for them */
+   /* quota ops might need to nest, setup the journal_info pointer for them
+* and raise the refcount so that it is > 0. */
current->journal_info = th;
+   th->t_refcount++;
reiserfs_discard_all_prealloc(th);  /* it should not involve new blocks into
 * the transaction */
+   th->t_refcount--;
current->journal_info = th->t_handle_save;
 #endif
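
The effect of the temporary refcount bump can be modelled in a few lines of userspace C. This is a toy sketch, not the kernel code: struct toy_handle and all toy_* functions are hypothetical stand-ins, and the BUG() from journal_begin() is modelled as a -1 return value.

```c
#include <assert.h>

/* Toy stand-in for the transaction handle; only the field needed here. */
struct toy_handle {
    int t_refcount;
};

/* Models the check in journal_begin(): nesting into a handle whose
 * refcount is below 1 is treated as a bug. Returns 0 on success and -1
 * where the real code would BUG(). */
static int toy_nest(struct toy_handle *th)
{
    if (th->t_refcount < 1)
        return -1;
    th->t_refcount++;   /* nested begin */
    th->t_refcount--;   /* matching nested end */
    return 0;
}

/* Models discarding preallocation: freeing the blocks makes the quota
 * code nest into the current handle. */
static int toy_discard_prealloc(struct toy_handle *th)
{
    return toy_nest(th);
}

/* Models the tail of do_journal_end(), optionally with the patch's
 * temporary refcount bump around the discard. */
static int toy_journal_end(struct toy_handle *th, int bump)
{
    int ret;

    if (bump)
        th->t_refcount++;
    ret = toy_discard_prealloc(th);
    if (bump)
        th->t_refcount--;
    return ret;
}
```

Calling toy_journal_end() on a handle with t_refcount == 0 fails without the bump and succeeds with it, and in both cases the refcount is left at 0 afterwards.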
 


Re: reiserfsck --rebuild-tree aborts at same block

2006-01-15 Thread Jan Kara
> I have a situation where if I run "reiserfsck --rebuild-tree" multiple 
> times, it always aborts at the same block.  The output includes
> "Send us the bug report only if the second run dies at the same place 
> with the same block number."
> 
> Before sending a bunch of info to the wrong place though, could someone 
> please confirm if I should submit details here as a bug report, or would 
> this be something to go through the support channel with?
  First check that you have the latest version of reiserfsck. If so, then
this is the appropriate list for the report.

Honza


Re: Data being corrupted on reiserfs 3.6

2006-01-15 Thread Jan Kara
  Hello,

> I'm experiencing data corruption when creating or copying data to my 
> reiserfs 3.6 partition mounted under /home. The following extract gives 
> a pretty clear indication that it's getting corrupted somewhere.
> 
> [EMAIL PROTECTED]:/tmp$ mount
> /dev/md0 on / type ext3 (rw,errors=remount-ro)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> tmpfs on /dev/shm type tmpfs (rw)
> usbfs on /proc/bus/usb type usbfs (rw)
> tmpfs on /dev type tmpfs (rw,size=10M,mode=0755)
> /dev/md2 on /home type reiserfs (rw)
> 
> [EMAIL PROTECTED]:/tmp$ dd bs=1024 count=1000k if=/dev/urandom of=./1GB.tst
> 1024000+0 records in
> 1024000+0 records out
> 1048576000 bytes transferred in 231.749782 seconds (4524604 bytes/sec)
> 
> [EMAIL PROTECTED]:/tmp$ md5sum 1GB.tst
> 48f46744c7e50c42c061a00d11541a85  1GB.tst
> 
> [EMAIL PROTECTED]:/tmp$ cp 1GB.tst /home/michael/
> 
> [EMAIL PROTECTED]:/tmp$ md5sum /home/michael/1GB.tst
> 042d8c462882f848412679e3cea03fe2  /home/michael/1GB.tst
  Hmm, that is really strange. Do the files have the same size? Do you
also get an error if you just create a file full of zeros? If so, what do
the differences look like (e.g. any signs of flipped bits or so)?

> I'm running Debian Sarge on an Athlon XP 2200+, /dev/md2 is made up of 
> four 400GB SATA hard disks on a Silicon Image 3114 controller in RAID 5. 
> Dmesg is showing no errors what so ever, the RAID array has been stable 
> since I installed it a couple of weeks ago and the drive was formatted 
> with mkfs.reiserfs with no special options.
> 
> [EMAIL PROTECTED]:/tmp$ uname -a
> Linux biggs 2.6.8-2-k7 #1 Tue Aug 16 14:00:15 UTC 2005 i686 GNU/Linux
  Any chance of trying some newer kernel? 2.6.8 is really old...

Honza


Re: Data being corrupted on reiserfs 3.6

2006-01-16 Thread Jan Kara
  Hello,

> Jan Kara wrote:
> 
> 
> >  Hmm, that is really strange. Do the files have the same size? Do you
> >also get an error if you just create a file full of zeros? If so, what do
> >the differences look like (e.g. any signs of flipped bits or so)?
> >
> 
> [EMAIL PROTECTED]:/tmp$ dd bs=1024 count=1000k if=/dev/zero of=./1GB.tst
> 1024000+0 records in
> 1024000+0 records out
> 1048576000 bytes transferred in 61.578769 seconds (17028207 bytes/sec)
> [EMAIL PROTECTED]:/tmp$ ls -l 1GB.tst
> -rw-r--r--  1 michael michael 1048576000 2006-01-15 20:51 1GB.tst
> [EMAIL PROTECTED]:/tmp$ md5sum 1GB.tst
> e5c834fbdaa6bfd8eac5eb9404eefdd4  1GB.tst
> [EMAIL PROTECTED]:/tmp$ ls -l /home/michael/1GB.tst
> -rw-r--r--  1 michael michael 1048576000 2006-01-15 20:54 
> /home/michael/1GB.tst
> [EMAIL PROTECTED]:/tmp$ md5sum /home/michael/1GB.tst
> 92c51557041ebd6424b4467a878c9f44  /home/michael/1GB.tst
> 
> I looked at the file in /home/michael/1GB.tst with xdd for about 5 
> minutes but couldn't see anything but zeros - I'm not sure how to search 
> through a binary file for non-zero bytes.
  You can run 'od -t x1' on the file - it squeezes runs of identical
lines into a single '*', so you should see the non-zero bytes easily...
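
If od is not handy, the same search can be done with a few lines of C. The following is a minimal sketch; the function name first_nonzero_offset() and its return conventions are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the offset of the first non-zero byte in
 * the file at `path`, -1 if every byte is zero, or -2 if the file
 * cannot be opened. */
static long first_nonzero_offset(const char *path)
{
    FILE *f = fopen(path, "rb");
    long offset = 0;
    int c;

    if (!f)
        return -2;
    while ((c = fgetc(f)) != EOF) {
        if (c != 0) {
            fclose(f);
            return offset;
        }
        offset++;
    }
    fclose(f);
    return -1;
}
```

Run against the copy on /home, a result other than -1 would give the position of the first corrupted byte, which can then be inspected more closely with od.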
  As Hans said, such problems are usually hardware problems (memory,
overheating processor, flaky disk controller etc.).
  BTW: I generated the same file as you and the md5sum of the one on
reiserfs is the same as mine. So the file is stored correctly and something
wrong really happens during the copy from /tmp to /home/michael. I
looked at the differences and they don't seem to be random. It's always
a chunk of 3-16 bytes that gets corrupted. The numbers written there
also do not seem to be random (lots of characters with code 16, 54,
128,...). I'll investigate more later...
  So this could be some memory corruption - to check this it
would be useful if you could try to reproduce the problem with a 2.6.15
kernel. The problem might well be fixed there.

Honza


Re: Data being corrupted on reiserfs 3.6

2006-01-24 Thread Jan Kara
  Hello,

> Jan Kara wrote:
> 
> >>
> >  You can run 'od -t x1' on the file - it squeezes runs of identical
> >lines into a single '*', so you should see the non-zero bytes easily...
> >  As Hans said, such problems are usually hardware problems (memory,
> >overheating processor, flaky disk controller etc.).
> >  BTW: I generated the same file as you and md5sum of the one on
> >reiserfs is same as mine. So the file is stored correctly and something
> >wrong really happens during the copy from /tmp to /home/michael. I
> >looked at the differences and they don't seem to be random. It's always
> >a chunk of 3-16 bytes that gets corrupted. Then numbers written there
> >also do not seem to be random (lots of characters with code 16, 54,
> >128,...). I'll investigate more later...
> >  So this could be some memory corruption - for checking out this it
> >would be useful if you could try to reproduce the problem with 2.6.15
> >kernel. The problem might well be fixed there.
> 
> I finally upgraded to 2.6.15-1 and I'm still seeing the same problem
> there. It's possibly a memory issue or a flaky disk controller: it's
> a Silicon Image 3114 PCI card that I've not used before these hard
> disks, so it's more likely than the memory, which has been going fine for a
> couple of years without any problems, but I will run memtest86 when I get
> the chance.
  Ok. Have you tried to reproduce the problem on some other hardware
(with some other controller)?

> Oh and I don't know if I mentioned this before but the corruption only 
> ever occurs on writing not reading.
> 
> Can anyone suggest a test to tell if it is the disk controller?
  I have no other idea than trying different hardware...

Honza


[reiserfs-list] mark_inode_dirty()

2001-06-23 Thread Jan Kara

  Hello,

  I came across a bug in my quota patches for reiserfs (the problem was
that mark_inode_dirty() was called on an inode which didn't have its stat
item written yet => reiserfs_update_sd() called from dirty_inode() complained)
and the following question popped into my head: in which places in reiserfs
is it legal/illegal to call mark_inode_dirty()? I see that it probably
wouldn't be a wise idea during do_balance(), and it's not wise when the stat
item isn't written yet, but are there some other cases?

Thanks for answer

Honza



Re: [reiserfs-list] quota-patches

2001-08-22 Thread Jan Kara

  Hello,

> I would like to know if anybody has ported the new quota-system and the
> reiserfs-quota-patch to kernel 2.4.9?
> 
> The only ftp-site I know where I can find patches is:
> 
> ftp://atrey.karlin.mff.cuni.cz/pub/local/jack/quota/
> 
> but, there are only patches against 2.4.8 which don't apply so cleanly.
  I suppose not but I'm going to port it soon... :)

Honza



Re: [reiserfs-list] quota utilities 1.7 with kernel 2.4.4 and reiserfs 3.6.25 (was large quotas)

2001-09-04 Thread Jan Kara

  Hello,

> some days ago I posted a question concerning quotas > 4GB on reiserfs. I
> found that almost everyone uses the newer releases of the quota utilities (those
> from sourceforge), while I still have an old 1.70 (patched for reiserfs)
> doing its work here.
> 
> Might the problem (quotas limited to 4 GB) be caused by these old quota
> utilities? So should I "upgrade" to the newer ones from sourceforge? Once again,
> my problem is that if I set the quota to 500 (in edquota), I get about
> 80, that's something like
> -quotasize = 4GB - 500
> 
> Did anyone else experience this? I just don't want to do the update on the
> server (the 1.70 is working in most cases) if I am not sure that THIS is the
> problem. If there is anybody using the 1.70 quota-package on reiserfs, too -
> could you please try to set a high quota (>4GB) and report if it works?
  edquota in the 1.70 version of the utils has a bug: it doesn't allow the
admin to set a limit larger than 4GB (at one place the number was converted
to bytes and back to quota blocks, with obvious results). BTW: you can try
to use setquota to set the limit. I'm not sure whether it will work better
but it might be the case...
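
The kind of overflow described above can be reproduced with a short C sketch. It only models the arithmetic (a limit in 1 KB quota blocks converted to a 32-bit byte count and back); it is not the actual edquota 1.70 code, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one quota block in bytes (limits are counted in 1 KB blocks). */
#define QUOTA_BLOCK_SIZE 1024u

/* Round trip through a 32-bit byte count, as a model of the bug: the
 * multiplication wraps modulo 2^32 once the limit reaches 4 GB. */
static uint32_t roundtrip_32bit(uint32_t blocks)
{
    uint32_t bytes = blocks * QUOTA_BLOCK_SIZE;   /* wraps for >= 4 GB */

    return bytes / QUOTA_BLOCK_SIZE;
}

/* Same round trip with a 64-bit intermediate, which preserves the limit. */
static uint32_t roundtrip_64bit(uint32_t blocks)
{
    uint64_t bytes = (uint64_t)blocks * QUOTA_BLOCK_SIZE;

    return (uint32_t)(bytes / QUOTA_BLOCK_SIZE);
}
```

For an illustrative limit of 5000000 blocks (about 4.8 GB), roundtrip_32bit() returns 805696, i.e. the requested limit minus 4 GB, while roundtrip_64bit() returns the original 5000000. Limits below 4 GB survive both versions unchanged.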
  I think it should be safe for you to upgrade to the newer utils. Some people
are already running them on their servers... And anyway, if you don't like
the new utils you can always go back to 1.70.

Honza



Re: [reiserfs-list] Quotas working damn well :o)

2001-10-25 Thread Jan Kara

> On Thursday, October 25, 2001 05:44:48 PM +0400 Hans Reiser
> <[EMAIL PROTECTED]> wrote:
> 
> > Forgive me for losing track of patches Chris, but are you saying here
> > that a complete quota solution got merged into 2.4.13, and we no longer
> > have to maintain anything separate from the kernel?
> 
> That would be nice ;-)  No, Jan had two patches on his ftp site, named
> quota-patch-2.4.x and quota-fix-2.4.x.  The quota-fix patch was included,
> but quota-patch-2.4.x was not.
> 
> Of course, the reiserfs quota patches were not included either, and can't
> be until the quota-patch goes in.  The interesting part is that alan's
> kernels did include the v2 quota patch, I'm not sure what will happen when
> he takes over 2.4.
  I don't think that quota-patch will ever go into 2.4. It changes userland
interfaces so I think it's not a 2.4 thing... I think in 2.5 there
will be no problem.

Honza

--
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs



Re: [reiserfs-list] Quota on 2.4.12

2001-10-26 Thread Jan Kara

  Hello,

> Vladimir V. Saveliev [[EMAIL PROTECTED]] wrote:
> > Hi
> > 
> > Markus Hof wrote:
> > 
> > > Vladimir V. Saveliev [[EMAIL PROTECTED]] wrote:
> > > > Hi
> > > >
> > > > Markus Hof wrote:
> > > >
> > > > > Hi!
> > > > >
> > > > > I used a 2.4.9 but read that it's bad to use it with reiser. So I changed to
> > > > > 2.4.12. I also want to have quota support. It's a commercial webserver so I'm
> > > > > not able to try so much ;-)
> > > > >
> > > > > Does anyone have experience with Quota on 2.4.12?
> > > > >
> > > > > Without patch Quota works but you can overshoot the hardlimit ... ;-)
> > > > >
> > > > > Which patch should I apply on a vanilla 2.4.12?
> > > > >
> > > >
> > > > Would you try this:
> > > > ftp.suse.com/pub/people/mason/patches/reiserfs/quota-2.4
> > >
> > > Thanks, but is it possible to apply this to a slackware box? SuSE was my old server...
> > > I think so (kernel should be distribution independent)!
> > >
> > > But I want to be sure before doing something that cause system crash?
> > >
> 
> I'm not able to apply the SuSE quota patch... seems to be against 2.4.13pre??
> Seems that there are not many patches available for 2.4.12 ;-)
> More for .13pre?? 
  Now 2.4.13 is out :).

> > 
> > Yes, you should be able to vary kernel in any distribution.
> > You might also want to make sure that quota user land tools are fresh enough. Something
> > like ones you can find at http://www.sf.net/projects/linuxquota.
> 
> Thx, I download & compiled it now.. quotastats gives:
> 
> root@k7:~# quotastats
> quotastats: Error while getting quota statistics from kernel: Bad address
  Known bug. It's fixed in CVS and will be in the next release.

Thanks for the report

Honza



Re: [reiserfs-list] quota problems on 2.4.8 + quota patches

2001-10-26 Thread Jan Kara

  Hello,

> Vladimir V. Saveliev [[EMAIL PROTECTED]] wrote:
> > Hi
> > 
> > Markus Hof wrote:
> > 
> > > Olivier Sessink [[EMAIL PROTECTED]] wrote:
> > > > Hi,
> > > >
> > > > on one box I run linux 2.4.8 with lvm + reiserfs + quota, and recently I
> > > > discovered the quota system doesn't work properly, when running quotacheck
> > > > it finds huge differences between the filesystem quota info and reality.
> > 
> > Currently in reiserfs quotacheck has to be run after unclean shutdown. Could it
> > be that you had unclean shutdowns and did not run quotacheck after that?
> > 
> > > One
> > > > user even passed the hard limit!!!
> > > >
> > 
> > > > this is a part from repquota:
> > > >
> > > > *** Report for user quotas on device /dev/lvm0/home
> > > > Block grace time: 14days; Inode grace time: 14days
> > > >                         Block limits                File limits
> > > > User            used    soft    hard  grace    used  soft  hard  grace
> > > > ----------------------------------------------------------------------
> > > > thisuser  +-  601824      50      60 13days   10513 15000 25000
> > > >
> > 
> > Could it be that filesystem was used without quota enabled?
> 
> i dont think so, otherwise grace would be none!!!
> 
> I think its the same problem... quotastats is unable to get stats. from kernel.
> Maybe kernel has NO Quota Support!
> 
> btw. it was NOT cleanly unmounted... I run quotacheck also manually!!!
  Can you please try 2.4.13+patches from
ftp://ftp.suse.com/pub/people/mason/patches/reiserfs/quota-2.4/
 50_quota-patch-2.4.13-pre2.gz, nesting-8.diff.gz, reiserfs-quota-12.diff.gz

  These patches have a few bugs fixed...

Honza



Re: [reiserfs-list] quota does not do anything on 2.4.13

2001-10-30 Thread Jan Kara

  Hello,

> This is a Server and I always use ssh .. now I was down and had a look at the
> Monitor (tty1):
> After reboot called by init script /etc/rc.d/rc.M
> 
> root@k7:/etc/rc.d# /sbin/quotacheck -avug
> quotacheck: Cannot remount filesystem mounted on / read-only so counted
> values might not be right.
> Please stop all programs writing to filesystem or use -m flag to force
> checking.
> 
> root@k7:/etc/rc.d# /sbin/quotaon -avug
> quotaon: using //aquota.user on /dev/md0 [/]: No such file or directory
  Umm.. And do you have an aquota.user file? If not, you need to create
it either with convertquota or with quotacheck -c...

> I don't know why... in 2.4.9 and 2.4.12 this made no problem, maybe this is
> since I installed the new Quota Utils (this is what I think)
  So you say quota worked for you with new quota format in 2.4.9 and 2.4.12?


> Isn't it possible to use Quota on the / partition ???
  It's of course possible to have quotas on /.

Honza

> > > root@k7:/usr/src/linux# quota -V
> > > Quota utilities version 3.01.
> > > Compiled with RPC and EXT2_DIRECT
> > > Bugs to [EMAIL PROTECTED], [EMAIL PROTECTED]
> > >
> > >
> > > Do I have to compile the quota utils with any --with switch???
> > >
> > > Users are still able to exceed even their hard limit!!!
> > >
> > > thanks
> > > markus
> >
> >
> > That seems to be different from what I experience. Using the ancient 1.7
> > tools it works for me, however with the 3.01-final quotas just do not want
> > to be turned on (for some strange reason), i.e.
> >
> > ./quota-tools/quotacheck -V
> > Quota utilities version 3.01.
> > Compiled with RPC
> >
> > quotacheck -auvg
> > quotacheck: Can't find filesystem to check or filesystem not mounted with
> > quota option.
> >
> > (the same works with the 1.7 tools)
> >
> > Hmmhh ?!
> >
> > Soeren.
> >



Re: [reiserfs-list] quota does not do anything on 2.4.13

2001-10-31 Thread Jan Kara

  Hello,

> > > This is a Server and I always use ssh .. now I was down and had a look at the
> > > Monitor (tty1):
> > > After reboot called by init script /etc/rc.d/rc.M
> > > 
> > > root@k7:/etc/rc.d# /sbin/quotacheck -avug
> > > quotacheck: Cannot remount filesystem mounted on / read-only so counted
> > > values might not be right.
> > > Please stop all programs writing to filesystem or use -m flag to force
> > > checking.
> > > 
> > > root@k7:/etc/rc.d# /sbin/quotaon -avug
> > > quotaon: using //aquota.user on /dev/md0 [/]: No such file or directory
> >   Umm.. And do you have aquota.user file? If not you need to create
> > it either by convertquota or by quotacheck -c...
> 
> Thanks a lot... now Quota works on 2.4.13 with the newest Quota Utils ;-)
> Why convertquota?? caused by new quotautils?
> What does this convert?
> Do I still need the /quota.user file?
  convertquota is needed because the quotafile format has changed and so you
need to convert quotafiles from the old format to the new one. quota.user is
no longer needed by the quota code (but maybe it's not a bad idea to keep it
for a while in case you decide to go back to the standard Linus's kernel,
which doesn't support the new format...).

Honza




Re: [reiserfs-list] Re: CRC error with gzip

2001-10-09 Thread Jan Kara

> --On Tuesday, October 09, 2001 03:47:43 PM +0400 Nikita Danilov 
> <[EMAIL PROTECTED]> wrote:
> 
> 
> >I've just added link to Jan Kara's quota patches to our download
> >page. You can try them.
> 
> I can't find them anywhere, could you please post a link to these patches?
  Just a note: Currently I have quota patches only for -ac versions of
kernel due to transaction nesting stuff (latest I have is 2.4.10-ac3) -
you need to apply 'nesting' and 'reiserquota' patch for 2.4.10-ac3.

Honza



Re: [reiserfs-list] Quota patch for 2.4.10

2001-10-09 Thread Jan Kara

  Hello,

> Where can I find a Quota patch for linux kernel version 2.4.10 ? I couldn't
> find it at the NAMESYS site. All I found there was the patch for 2.4.7.
  I have patches for -ac versions of kernel. My latest patch is for 2.4.10-ac3
at:
ftp://atrey.karlin.mff.cuni.cz/pub/local/jack/quota/reiserfs/
You need 'nesting' and 'reiserquota' patch from there.

Honza



[reiserfs-list] Re: [Fwd: Quick Question]

2001-12-27 Thread Jan Kara

  Hello,

  The question of when Reiserfs started supporting quotas is a bit difficult :).
I'm the original author of the current Reiserfs quota support (but Chris
Mason and Vladimir V. Saveliev also did a lot of work resolving various
problems and bugs). This quota support is based on the new quota format (allows
32-bit uids, quota allocation with byte precision). Looking into my ftp
directory, the oldest patches are for the 2.4.5 kernel. Sometime around the 2.4.9
kernel the patch became usable for normal users (i.e. the most serious bugs were
resolved).
  To make it more complicated, there was another patch for quota support for
Reiserfs. It was based on the old quota format and the author is Vladimir V.
Saveliev <[EMAIL PROTECTED]> (at least I think so), so he's probably better
placed than me to answer details about his patches.
  Reiserfs uses the default Linux quota implementation - or better said, the quota
format which is going to be the default quota format. In current 2.4 kernels only
the old quota format is still supported, for compatibility reasons (and so patches
for Reiserfs quota also include an implementation of the new quota format).

Bye
Honza

> I'm in the process of writing an article regarding quota systems.  Can you
> let me know starting at which version ReiserFS began supporting quotas?
> Is it the default Linux Quota implementation or a(n) variation/adaptation?
> 
> Thanks for the help.
> 
> Michael C. Montero
> Chief Technology Officer
> Community Connect Inc. Co-founder
> [EMAIL PROTECTED]
> 

--
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs



Re: [reiserfs-list] about i-nodes and blocks

2002-03-21 Thread Jan Kara

  Hello,

> And another thing. If Im right reiserfs uses 4k blocks, but quota 
> documentation says that when u edit block limit, a block is 1k.
> 
> When I edit user disk quotas a block is 1k or 4k?
  The quota blocks have nothing in common with filesystem blocks, so
when you edit quotas you really set limits in 1 KB blocks.
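
To relate the two units, here is a small sketch that converts space occupied in filesystem blocks into 1 KB quota blocks. The helper name and the round-up behaviour are assumptions made for the illustration, not a description of how the kernel accounts usage.

```c
#include <assert.h>

/* Quota limits are expressed in blocks of this many bytes. */
#define QUOTA_BLOCK_BYTES 1024UL

/* Hypothetical helper: how many 1 KB quota blocks correspond to
 * `fs_blocks` filesystem blocks of `fs_block_bytes` bytes each
 * (rounded up to a whole quota block). */
static unsigned long fs_blocks_to_quota_blocks(unsigned long fs_blocks,
                                               unsigned long fs_block_bytes)
{
    unsigned long bytes = fs_blocks * fs_block_bytes;

    return (bytes + QUOTA_BLOCK_BYTES - 1) / QUOTA_BLOCK_BYTES;
}
```

So a file occupying three 4 KB filesystem blocks counts as 12 quota blocks against a limit entered in edquota.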

Honza



Re: [reiserfs-list] quota support for 2.4.20-pre11

2002-11-01 Thread Jan Kara
> Are there quota patches for 2.4.20-pre11 kernel?
> Patches for 2.4.19 don't apply correctly.
  I'll make them soon.

Honza



Re: [reiserfs-list] Quota patches and kinoded versions

2002-11-01 Thread Jan Kara
  Hello,
  
> On 30 Oct 2002 07:44:33 -0500
> Chris Mason <[EMAIL PROTECTED]> wrote:
> 
>   |  The kinoded-8 patch was updated to 2.4.19-pre7, but no changes were
>   |   made other than a simple merge.
> 
> Ok.
> 
>   |  
>   |   > 
>   |   > Also, do you plan to sync quota patches with 2.4.20pre or with 2.4.20 final once it's out?
>   |  
>   |   These won't fix the oops you posted yesterday.  I've got a patch in
>   |   testing here, I was hoping to reproduce the bug before sending it out
>   |   (instead of just guessing ;-)  I'll cleanup the debugging statements and
>   |   get you something worth using.
> 
> Unfortunately, we got another oops on another filer yesterday which seems to be because of the same
> bug (ksymoops at the end)
> 
> The above statement leads me to think that this bug could still be hit even if i
> had 2.4.20-rc1 , right ?
> 
> On the same filer, i also found this in the logs: 
> 
> VFS: find_free_dqentry(): Data block full but it shouldn't.
> VFS: Error -5 occured while creating quota.
> VFS: find_free_dqentry(): Data block full but it shouldn't.
> VFS: Error -5 occured while creating quota.
> etc..
  This looks like a corrupted quota file. Can you try running quotacheck?
There was a race in the quota code which caused exactly these messages,
but it should have been fixed a long time ago..

Honza



Re: quota problem

2002-11-07 Thread Jan Kara
  Hello,
  
> VFS: Quota for id 1114 referenced but not present.
> VFS: Can't read quota structure for id 1114.
> VFS: Quota for id 1116 referenced but not present.
> VFS: Can't read quota structure for id 1116.
> VFS: Quota for id 1115 referenced but not present.
> VFS: Can't read quota structure for id 1115.
> VFS: Inserting already present quota entry (block 12).
> VFS: Error -5 occured while creating quota.
  This looks like a corrupted quota file. quotacheck(8) should be able
to fix it (but you should probably check that the limits are set properly -
some of them might have been lost). The 2.4.18 kernel has a small race which
can occur under heavy load and cause such corruption, so I suggest
updating to a newer kernel...

Honza



Re: 32bit UID/GID's Quota in 2.4!

2003-07-11 Thread Jan Kara
  Hello,

> On Fri, Jul 11, 2003 at 12:08:07AM +0200, Philippe Gramoullé wrote:
> 
> > I haven't had time to look how "new" this quota code is and if it differs
> > from what is actually available through external patches though.
> 
> I see that the Config.in stuff is a bit different. Also comment says
> it is 2.5 backport, not quotav2 for 2.4 patches that were floating
> around.  I plan to look at it closer soon and port reiserfs quota to
> new interface if necessary.
  Quota code in the 2.5 kernel is almost the same as the 2.4 code I have on my
ftp site. The code that got merged has some minor differences which I'm
now discussing with Marcelo.

Honza


Re: quota support

2003-12-12 Thread Jan Kara
> Chris and Vladimir, what is the status of quotas in the official kernel?
  The generic VFS support needed for ReiserFS has been all there since 2.4.21.
The modifications needed in the ReiserFS code are only in Chris's patches AFAIK.

Honza


Re: [PATCH 00/11] reiserfs: xattr rework

2006-03-01 Thread Jan Kara
  Hello,

> Following this message is a series of 11 patches that rework the reiserfs
> xattr code. The current implementation open codes a number of functions that
> are well tested and stable elsewhere, but with slight modifications. It also
> does a number of things suboptimally, such as operations that affect all of
> the xattrs associated with an inode, as well as not handling journalling as
> well as can be.
> 
> I've run them through a weekend of stress testing successfully, but I'd like
> some additional testing before considering them safe.
> 
> Here's the run down:
> 
> * 01 - Make internal xattr operations use vfs ops
>   This eliminates the open coding of the read/write loops in favor of using
>   vfs_read and vfs_write. Performance-wise, it's very little difference
>   from the open coding, and implementation-wise, it's more tested. Yes, this
>   violates the no-file-io-in-the-kernel rules, but this rule has been violated
>   since the day the original patches were accepted.
  I'm not completely sure, but from briefly looking at the patches I
think there might be the following problem: you first start a transaction
and then call a VFS write operation. That will lock the page it wants to
write. But if bdflush works on the xattr file and sends some dirty data
to disk, it will first take the page lock and then start the transaction.
This essentially introduces a lock inversion (as starting a transaction
behaves like taking a lock) and hence deadlocks... I solved
these for the quota code some time ago (quota has similar needs to
your xattr code) - I really had some deadlock reports for heavily loaded
machines. There the only reasonable solution was extra write
functions bypassing the page cache. Maybe in your case you can solve the
problems differently, as you don't need a working solution for all the
filesystems but just for Reiserfs.
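
The lock inversion described above can be illustrated with a tiny order checker. This is a toy model only: the two "locks" are abstract identifiers, nothing is actually locked, and the names lock_id, order_graph and record_order() are invented for the sketch.

```c
#include <assert.h>

/* The two resources from the scenario, treated as abstract locks. */
enum lock_id { LOCK_TRANSACTION, LOCK_PAGE, LOCK_COUNT };

/* seen[a][b] != 0 means some code path acquired b while holding a. */
struct order_graph {
    int seen[LOCK_COUNT][LOCK_COUNT];
};

/* Record that a code path takes `first` and then `second`. Returns 1 if
 * the opposite order has already been recorded (an inversion, and so a
 * potential deadlock), otherwise 0. */
static int record_order(struct order_graph *g, enum lock_id first,
                        enum lock_id second)
{
    g->seen[first][second] = 1;
    return g->seen[second][first];
}
```

Recording the xattr write path (transaction, then page lock) followed by the writeback path (page lock, then transaction) makes the second call report an inversion, which is exactly the pattern that can deadlock under load.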

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH 00/11] reiserfs: xattr rework

2006-03-02 Thread Jan Kara
> Jan Kara wrote:
> >   Hello,
> > 
> >> Following this message is a series of 11 patches that rework the reiserfs
> >> xattr code. The current implementation open codes a number of functions that
> >> are well tested and stable elsewhere, but with slight modifications. It also
> >> does a number of things suboptimally, such as operations that affect all of
> >> the xattrs associated with an inode, as well as not handling journalling as
> >> well as can be.
> >>
> >> I've run them through a weekend of stress testing successfully, but I'd like
> >> some additional testing before considering them safe.
> >>
> >> Here's the run down:
> >>
> >> * 01 - Make internal xattr operations use vfs ops
> >>   This eliminates the open coding of the read/write loops in favor of using
> >>   vfs_read and vfs_write. Performance-wise, it's very little difference
> >>   from the open coding, and implementation-wise, it's more tested. Yes, this
> >>   violates the no-file-io-in-the-kernel rules, but this rule has been violated
> >>   since the day the original patches were accepted.
> >   I'm not completely sure but from briefly looking at the patches I
> > think there might be the following problem: you first start a transaction
> > and then call a VFS write operation. That will lock a page it wants to
> > write. But if bdflush works on the xattr file and sends some dirty data
> > to disk, it will first take page lock and then start the transaction.
> > This introduces essentially a lock inversion (as starting a transaction
> > behaves like taking a lock) and hence deadlocks... I've been solving
> > these for the quota code some time ago (quota also has similar needs as
> > your xattr code) - I really had some deadlock reports for heavily loaded
> > machines. There the only reasonable solution was extra write
> > functions bypassing the page cache. Maybe in your case you can solve the
> > problems differently as you don't need the working solution for all the
> > filesystems but just for Reiserfs. 
> 
> Sigh. Ok. The way you describe it definitely makes it a lock inversion
> issue. I haven't run into it yet, but as you say, it occurs on heavily
> loaded machines. I've done some load testing, but apparently not enough,
> since your analysis is sound.
> 
> But, I think there is a silver lining after all. It sounds like you've
> already worked around these issues for the journaled quota code. What do
> you think about turning the quota read/write functions into something
> more generic and using that for xattrs as well as quotas?
> 
> Ultimately, I think that quota files and xattrs are the same things -
> "normal" files read from and written to during the I/O path. The changes
> to journaled quotas would be minimal - just turn
> reiserfs_quota_{read,write} into small wrappers that make the
> type->inode mapping and then call the remainder of the existing code as
> reiserfs_internal_{read,write} with the appropriate inode. The xattr
> code could use the reiserfs_internal_{read,write} similarly and get all
> the deadlock avoidance work you've already done for free.
  You are right that the quota code and xattrs need to do the same thing.
We only need to make slight interface changes (currently the functions take a
superblock and a type and pick the appropriate quota inode themselves) and
rename the functions. I would vote for renaming the s_op->quota_{read,write}
to s_op->internal_{read,write} and passing the appropriate inode directly from
the quota code. The only thing I'm not sure about is how to deal with the
journaling mode - quota code either uses data journaling or just ordered
mode depending on mount options (journaled / non-journaled quota). So we
probably also need to pass the journaling mode to the write function.
  BTW: note that using these functions bypassing the page cache means that
userspace really should not touch these files. It is asking for data
corruption. During quotaon, the quota code syncs the quota inode and
sets it as immutable to prevent accidents. Also, during quotaoff it flushes
the page cache of the inode so that userspace is able to see the changes
made by the kernel. I guess something similar will be needed for xattrs too.
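
  A rough sketch of the interface change discussed above, written as a
userspace mock and not as kernel code. Everything named here (mock_sb,
mock_inode, enum jmode, internal_write, quota_write) is made up for
illustration; the only thing taken from the discussion is the idea that the
quota wrapper does the type->inode mapping and hands the journaling mode
down to a generic write function.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define MOCK_MAXQUOTAS 2   /* user and group; hypothetical constant */
#define MOCK_FILE_SIZE 64  /* fixed-size in-memory "file" for the mock */

/* Hypothetical journaling modes the write function would honor. */
enum jmode { JMODE_ORDERED, JMODE_DATA };

struct mock_inode {
	char data[MOCK_FILE_SIZE];
	enum jmode last_mode;      /* mode used by the most recent write */
};

struct mock_sb {
	struct mock_inode *quota_files[MOCK_MAXQUOTAS];
	int journaled_quota;       /* stands in for the mount option */
};

/* Generic internal write: takes the inode directly plus the journaling mode. */
ssize_t internal_write(struct mock_inode *inode, const char *data,
		       size_t len, off_t off, enum jmode mode)
{
	if (off < 0 || (size_t)off >= MOCK_FILE_SIZE)
		return -1;
	if (len > MOCK_FILE_SIZE - (size_t)off)
		len = MOCK_FILE_SIZE - (size_t)off;  /* short write at the end */
	memcpy(inode->data + off, data, len);
	inode->last_mode = mode;
	return (ssize_t)len;
}

/* Quota wrapper: only maps (sb, type) -> inode and picks the mode. */
ssize_t quota_write(struct mock_sb *sb, int type, const char *data,
		    size_t len, off_t off)
{
	enum jmode mode = sb->journaled_quota ? JMODE_DATA : JMODE_ORDERED;

	if (type < 0 || type >= MOCK_MAXQUOTAS || !sb->quota_files[type])
		return -1;
	return internal_write(sb->quota_files[type], data, len, off, mode);
}
```

  An xattr caller would skip the wrapper and call internal_write() with its
own inode and mode, which is the whole point of the renaming.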

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH 00/11] reiserfs: xattr rework

2006-03-03 Thread Jan Kara
> Jan Kara wrote:
> >   You are right that the quota code and xattrs need to do the same thing.
> > We only need to make slight interface changes (currently the functions take a
> > superblock and a type and pick the appropriate quota inode themselves) and
> > rename the functions. I would vote for renaming the s_op->quota_{read,write}
> > to s_op->internal_{read,write} and passing the appropriate inode directly from
> > the quota code. The only thing I'm not sure about is how to deal with the
> > journaling mode - quota code either uses data journaling or just ordered
> > mode depending on mount options (journaled / non-journaled quota). So we
> > probably also need to pass the journaling mode to the write function.
> >   BTW: note that using these functions bypassing the page cache means that
> > userspace really should not touch these files. It is asking for data
> > corruption. During quotaon, the quota code syncs the quota inode and
> > sets it as immutable to prevent accidents. Also, during quotaoff it flushes
> > the page cache of the inode so that userspace is able to see the changes
> > made by the kernel. I guess something similar will be needed for xattrs too.
> 
> If you feel that changing the entire quota system to reflect the change
> is a good plan, that's your call. Personally, I'd like to keep the
> patches as small as possible, but if you think there is a need for
> internal_{read,write} elsewhere, I wouldn't object.
  OK, now that I think about it in the morning, you're probably right.
And I can do the bigger change if I see that more filesystems would use
it too.

> The data journaling mode can be set as a flag associated with the inode.
>  Currently, i_data_log is set in REISERFS_I(inode)->i_flags. I add
> i_data_ordered in one of my later patches. They can be tested easily
> with reiserfs_file_data_{log,ordered}. There's no reason that one
> couldn't be moved up and made a prerequisite for the first patch.
  Fine. So we can just set proper journaling flags in reiserfs_quota_on
and then honor them in the internal writing functions.
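
  Roughly what I have in mind, as a userspace mock only. The flag values and
every mock_* name below are invented for this sketch; the real flags live in
REISERFS_I(inode)->i_flags and are tested with
reiserfs_file_data_{log,ordered} as you describe.

```c
/* Hypothetical flag bits standing in for i_data_log / i_data_ordered. */
#define MOCK_DATA_LOG     0x1u
#define MOCK_DATA_ORDERED 0x2u

/* Which path the internal write function would take for an inode. */
enum write_path { PATH_DEFAULT, PATH_ORDERED, PATH_JOURNALED };

struct mock_inode {
	unsigned int i_flags;
};

int mock_file_data_log(const struct mock_inode *inode)
{
	return (inode->i_flags & MOCK_DATA_LOG) != 0;
}

int mock_file_data_ordered(const struct mock_inode *inode)
{
	return (inode->i_flags & MOCK_DATA_ORDERED) != 0;
}

/* Done once when quota is turned on: record the mode on the inode. */
void mock_quota_on(struct mock_inode *inode, int journaled_quota)
{
	inode->i_flags &= ~(MOCK_DATA_LOG | MOCK_DATA_ORDERED);
	inode->i_flags |= journaled_quota ? MOCK_DATA_LOG : MOCK_DATA_ORDERED;
}

/* The write function no longer needs a mode argument; it reads the flags. */
enum write_path mock_pick_write_path(const struct mock_inode *inode)
{
	if (mock_file_data_log(inode))
		return PATH_JOURNALED;
	if (mock_file_data_ordered(inode))
		return PATH_ORDERED;
	return PATH_DEFAULT;
}
```

  With this the write function keeps its (inode, data, len, off) signature
and the caller never has to know about the mount options.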

Honza
  
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH 00/11] reiserfs: xattr rework

2006-03-06 Thread Jan Kara
> Jan Kara wrote:
> >> The data journaling mode can be set as a flag associated with the inode.
> >>  Currently, i_data_log is set in REISERFS_I(inode)->i_flags. I add
> >> i_data_ordered in one of my later patches. They can be tested easily
> >> with reiserfs_file_data_{log,ordered}. There's no reason that one
> >> couldn't be moved up and made a prerequisite for the first patch.
> >   Fine. So we can just set proper journaling flags in reiserfs_quota_on
> > and then honor them in the internal writing functions.
> 
> Ok, how do the attached patches look to you? The internal I/O changes
> need to be applied after the journaled xattr patch or we get an Oops
> trying to start a transaction without calling reiserfs_write_lock()
> first. I've modified the first patch in the xattr series to abstract out
> the fp->f_op->{read,write} calls to an xattr_{read,write} pair of
> functions. This makes it easier to move to the internal i/o code later.
> I've included it for clarity, but that is the only change.
  The patch looks fine. Just two minor comments:


> diff -ruNpX ../dontdiff linux-2.6.15-staging1/fs/reiserfs/super.c linux-2.6.15-staging2/fs/reiserfs/super.c
> --- linux-2.6.15-staging1/fs/reiserfs/super.c 2006-03-03 17:09:01.0 -0500
> +++ linux-2.6.15-staging2/fs/reiserfs/super.c 2006-03-03 17:09:04.0 -0500
> @@ -1949,6 +1949,109 @@ static int reiserfs_statfs(struct super_
>   return 0;
>  }
>  
> +#if defined(CONFIG_QUOTA) || defined(CONFIG_REISERFS_FS_XATTR)
> +/* Read data from quotafile - avoid pagecache and such because we cannot afford
> + * acquiring the locks... As quota files are never truncated and quota code
> + * itself serializes the operations (and noone else should touch the files)
> + * we don't have to be afraid of races */
  Please update the comment here to reflect that we now use this function
for xattrs as well - I suppose those files cannot be truncated either and
that the xattr code serializes the operations there.

> +ssize_t reiserfs_internal_read(struct inode *inode, char *data, size_t len,
> +   loff_t off)

> +/* Write to quotafile (we know the transaction is already started and has
> + * enough credits) */
  Please update the comment here as well...

> +ssize_t reiserfs_internal_write(struct inode *inode, const char *data,
> +size_t len, loff_t off)

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: [PATCH 00/11] reiserfs: xattr rework

2006-03-08 Thread Jan Kara
> Jan Kara wrote:
> >> Jan Kara wrote:
> >>>> The data journaling mode can be set as a flag associated with the inode.
> >>>>  Currently, i_data_log is set in REISERFS_I(inode)->i_flags. I add
> >>>> i_data_ordered in one of my later patches. They can be tested easily
> >>>> with reiserfs_file_data_{log,ordered}. There's no reason that one
> >>>> couldn't be moved up and made a prerequisite for the first patch.
> >>>   Fine. So we can just set proper journaling flags in reiserfs_quota_on
> >>> and then honor them in the internal writing functions.
> >> Ok, how do the attached patches look to you? The internal I/O changes
> >> need to be applied after the journaled xattr patch or we get an Oops
> >> trying to start a transaction without calling reiserfs_write_lock()
> >> first. I've modified the first patch in the xattr series to abstract out
> >> the fp->f_op->{read,write} calls to an xattr_{read,write} pair of
> >> functions. This makes it easier to move to the internal i/o code later.
> >> I've included it for clarity, but that is the only change.
> >   The patch looks fine. Just two minor comments:
> > 
> > 
> >> diff -ruNpX ../dontdiff linux-2.6.15-staging1/fs/reiserfs/super.c linux-2.6.15-staging2/fs/reiserfs/super.c
> >> --- linux-2.6.15-staging1/fs/reiserfs/super.c  2006-03-03 17:09:01.0 -0500
> >> +++ linux-2.6.15-staging2/fs/reiserfs/super.c  2006-03-03 17:09:04.0 -0500
> >> @@ -1949,6 +1949,109 @@ static int reiserfs_statfs(struct super_
> >>return 0;
> >>  }
> >>  
> >> +#if defined(CONFIG_QUOTA) || defined(CONFIG_REISERFS_FS_XATTR)
> >> +/* Read data from quotafile - avoid pagecache and such because we cannot afford
> >> + * acquiring the locks... As quota files are never truncated and quota code
> >> + * itself serializes the operations (and noone else should touch the files)
> >> + * we don't have to be afraid of races */
> >   Please update the comment here to reflect that we now use this function
> > for xattrs as well - I suppose those files cannot be truncated either and
> > that the xattr code serializes the operations there.
> > 
> >> +ssize_t reiserfs_internal_read(struct inode *inode, char *data, size_t len,
> >> +   loff_t off)
> > 
> >> +/* Write to quotafile (we know the transaction is already started and has
> >> + * enough credits) */
> >   Please update the comment here as well...
> > 
> >> +ssize_t reiserfs_internal_write(struct inode *inode, const char *data,
> >> +size_t len, loff_t off)
> > 
> > Honza
> > 
> 
> I've updated the patches with the comment changes, though I did run into
> a much bigger snag.
> 
> The internal i/o patches don't support tails, and that's a showstopper
> for using this with xattrs. Most xattrs, such as ACLs, are likely
> to be only a few tens of bytes long and allocating an entire block is
> extremely wasteful.
  Umm, that is really nasty. Ext3 solves this by sharing a block among
several inodes but that's far too much work to fix this bug...

> I've managed to alter internal read to handle tails by allocating an
> anonymous page and using it with the temporary buffer head to get the
> tail data from reiserfs_get_block back. But the rest of the tail packing
> code very much needs the page cache. Is there going to be any way this
> can be managed without reintroducing deadlocks?
  I tried to find some other way when solving these problems for
quotas but found none. If you want xattr changes to be journaled with
other data changes, you have to first start a transaction and then issue
a write that will consequently need PageLock. So do you really need
a transaction started before a write starts? For journaled quota this was
a must but for xattrs it might not be necessary. Then we would still need
to sort out the problems with the xattr lock but that might be easier to
deal with.
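
  To make the ordering problem concrete, here is a tiny single-threaded
model of it. It is not kernel code and does not use real locks: the lock and
task names and the model_* helpers are all invented for this sketch, and
"starting a transaction" is simply treated as taking a lock, as discussed
earlier in this thread. One task takes transaction -> page, the other takes
page -> transaction, and the model checks for a cycle of waiters.

```c
#define NOBODY (-1)

enum lock_id { LOCK_TRANSACTION, LOCK_PAGE, NUM_LOCKS };
enum task_id { TASK_XATTR_WRITE, TASK_FLUSH, NUM_TASKS };

struct lock_model {
	int owner[NUM_LOCKS];        /* task holding each lock, or NOBODY */
	int waiting_for[NUM_TASKS];  /* lock each task is blocked on, or NOBODY */
};

void model_init(struct lock_model *m)
{
	for (int i = 0; i < NUM_LOCKS; i++)
		m->owner[i] = NOBODY;
	for (int i = 0; i < NUM_TASKS; i++)
		m->waiting_for[i] = NOBODY;
}

/* Returns 1 if the lock was taken, 0 if the task would have to block. */
int model_acquire(struct lock_model *m, int task, int lock)
{
	if (m->owner[lock] == NOBODY) {
		m->owner[lock] = task;
		m->waiting_for[task] = NOBODY;
		return 1;
	}
	m->waiting_for[task] = lock;
	return 0;
}

/* Follow task -> wanted lock -> its owner; arriving back at 'task' is a cycle. */
int model_deadlocked(const struct lock_model *m, int task)
{
	int cur = task;

	for (int step = 0; step < NUM_TASKS; step++) {
		int lock = m->waiting_for[cur];

		if (lock == NOBODY)
			return 0;
		cur = m->owner[lock];
		if (cur == NOBODY)
			return 0;
		if (cur == task)
			return 1;
	}
	return 0;
}
```

  If TASK_XATTR_WRITE holds LOCK_TRANSACTION and asks for LOCK_PAGE while
TASK_FLUSH holds LOCK_PAGE and asks for LOCK_TRANSACTION,
model_deadlocked() reports a cycle for both. If both tasks take the locks in
the same order, one of them just waits and no cycle appears.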

Bye
Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


[PATCH] Fix reiserfs deadlock

2006-04-21 Thread Jan Kara
  Hello!

  The patch below fixes a potential deadlock in the reiserfs code. The
problem is that we can sometimes return 1 even if we did not manage to
find the xattr. Later we take the xattr lock because the function returned 1.
But then the code in the error path of reiserfs_new_inode() checks
i_default_acl and, because it is not set, it assumes we have not taken
the lock and tries to retake it -> deadlock.
  Jeff has a larger rewrite of the xattr locking, but it will take some
time before it is accepted, so this could be used as a temporary
fix. Andrew, please apply.

Honza

-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs

reiserfs_cache_default_acl() should return whether we successfully found the
acl or not. We have to return the correct value even if reiserfs_get_acl()
returns an error code and not just 0. Otherwise callers such as
reiserfs_mkdir() can unnecessarily lock the xattrs and later functions such as
reiserfs_new_inode() fail to notice that we have already taken the lock and
try to take it again, which deadlocks.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.5-SLES9_SP3_BRANCH/fs/reiserfs/xattr_acl.c linux-2.6.5-SLES9_SP3_BRANCH-1-reiser_xattr_fix/fs/reiserfs/xattr_acl.c
--- linux-2.6.5-SLES9_SP3_BRANCH/fs/reiserfs/xattr_acl.c 2006-01-21 03:02:06.0 +0100
+++ linux-2.6.5-SLES9_SP3_BRANCH-1-reiser_xattr_fix/fs/reiserfs/xattr_acl.c 2006-01-21 09:09:04.0 +0100
@@ -410,8 +410,10 @@ reiserfs_cache_default_acl (struct inode
         acl = reiserfs_get_acl (inode, ACL_TYPE_DEFAULT);
         reiserfs_read_unlock_xattrs (inode->i_sb);
         reiserfs_read_unlock_xattr_i (inode);
-        ret = acl ? 1 : 0;
+        if (!acl || IS_ERR(acl))
+            return 0;
         posix_acl_release (acl);
+        ret = 1;
     }
 
     return ret;
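
A small userspace illustration of why the old "acl ? 1 : 0" test is wrong.
It mocks the kernel convention of encoding error codes as pointers in the
top 4095 addresses; the mock_* helpers and found_old()/found_new() are
invented names for this sketch and are not the kernel's own functions.

```c
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer, as ERR_PTR() does. */
void *mock_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer is really an encoded error, as IS_ERR() tests. */
int mock_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MOCK_MAX_ERRNO);
}

/* Old logic: any non-NULL pointer counts as "found", including errors. */
int found_old(const void *acl)
{
	return acl ? 1 : 0;
}

/* New logic: NULL and encoded errors both count as "not found". */
int found_new(const void *acl)
{
	if (!acl || mock_is_err(acl))
		return 0;
	return 1;
}
```

With an encoded error such as mock_err_ptr(-5), found_old() returns 1 and
found_new() returns 0, which is exactly the case the patch addresses.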


Re: ReiserFS v3 choking when free space falls below 10%?

2006-07-07 Thread Jan Kara
  Hi,

  just one note: I've looked at the code in scan_bitmap() in bitmap.c. There is:
/* When the bitmap is more than 10% free, anyone can allocate.
 * When it's less than 10% free, only files that already use the
 * bitmap are allowed. Once we pass 80% full, this restriction
 * is lifted.
 *
 * We do this so that files that grow later still have space close to
 * their original allocation. This improves locality, and presumably
 * performance as a result.
 *
 * This is only an allocation policy and does not make up for getting a
 * bad hint. Decent hinting must be implemented for this to work well.
 */
if (TEST_OPTION(skip_busy, s)
&& SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {

  So the comment suggests we should lift the restriction when we are 80%
full, but if you look at the condition, it checks whether we are 95% full! I
guess that is really asking for trouble and could explain the
behaviour...
  Mike, could you try changing that 20 in the test to 5? IMHO that could
fix your problem.
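
  The arithmetic, spelled out as a standalone snippet. The function names are
made up and the divisor is a parameter only so that both values can be
compared; the comparison itself has the same shape as the test quoted above.

```c
/* Same shape as "SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / divisor". */
int restriction_applies(unsigned long free_blocks, unsigned long total_blocks,
			unsigned long divisor)
{
	return free_blocks > total_blocks / divisor;
}

/* Percent-full level at which the test above stops being true. */
unsigned long lifted_at_percent_full(unsigned long divisor)
{
	return 100 - 100 / divisor;
}
```

  With a divisor of 20 the test stays true until the filesystem is 95% full;
with 5 it stops being true at 80% full, which is what the comment promises.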

    Honza



-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: ReiserFS v3 choking when free space falls below 10%?

2006-07-07 Thread Jan Kara
> Jan Kara wrote:
> >   Hi,
> > 
> >   just one note: I've looked at the code in scan_bitmap() in bitmap.c. There is:
> > /* When the bitmap is more than 10% free, anyone can allocate.
> >  * When it's less than 10% free, only files that already use the
> >  * bitmap are allowed. Once we pass 80% full, this restriction
> >  * is lifted.
> >  *
> >  * We do this so that files that grow later still have space
> >  * close to
> >  * their original allocation. This improves locality, and
> >  * presumably
> >  * performance as a result.
> >  *
> >  * This is only an allocation policy and does not make up for
> >  * getting a
> >  * bad hint. Decent hinting must be implemented for this to work
> >  * well.
> >  */
> > if (TEST_OPTION(skip_busy, s)
> > && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {
> > 
> >   So the comment suggests we should lift the restriction when we are 80%
> > full, but if you look at the condition, it checks whether we are 95% full! I
> > guess that is really asking for trouble and could explain the
> > behaviour...
> >   Mike, could you try changing that 20 in the test to 5? IMHO that could
> > fix your problem.
> 
> Shoot. I guess I never sent that mail out last night. I had discovered
> the same thing. The thing is, I don't think it will cause the kind of
> performance problem we're seeing here. Once it sees the 90% check it
> will bail out. Minor slowdown, not anything like we're seeing.
  Hmm, right. You'll only scan that one bitmap the file is in, won't
you? That can still take some time so maybe it's worth trying this fix
anyway.

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: ReiserFS v3 choking when free space falls below 10%?

2006-07-07 Thread Jan Kara
> Jan Kara wrote:
> 
> >>Jan Kara wrote:
> >>
> >>
> >>>  Hi,
> >>>
> >>>  just one note: I've looked at the code in scan_bitmap() in bitmap.c. There is:
> >>>/* When the bitmap is more than 10% free, anyone can allocate.
> >>> * When it's less than 10% free, only files that already use the
> >>> * bitmap are allowed. Once we pass 80% full, this restriction
> >>> * is lifted.
> >>> *
> >>> * We do this so that files that grow later still have space
> >>> * close to
> >>> * their original allocation. This improves locality, and
> >>> * presumably
> >>> * performance as a result.
> >>> *
> >>> * This is only an allocation policy and does not make up for
> >>> * getting a
> >>> * bad hint. Decent hinting must be implemented for this to work
> >>> * well.
> >>> */
> >>>if (TEST_OPTION(skip_busy, s)
> >>>&& SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {
> >>>  
> >>>
> How about eliminating this feature entirely.   It seems rather dubious.
  Yes, but it may help reduce fragmentation, as it leaves some free
space in bitmaps for the files already ending in those bitmaps. I'm not
sure if it really helps though...

> >>>   So the comment suggests we should lift the restriction when we are 80%
> >>>full, but if you look at the condition, it checks whether we are 95% full! I
> >>>guess that is really asking for trouble and could explain the
> >>>behaviour...
> >>>  Mike, could you try changing that 20 in the test to 5? IMHO that could
> >>>fix your problem.
> >>>  
> >>>
> >>Shoot. I guess I never sent that mail out last night. I had discovered
> >>the same thing. The thing is, I don't think it will cause the kind of
> >>performance problem we're seeing here. Once it sees the 90% check it
> >>will bail out. Minor slowdown, not anything like we're seeing.
> >>
> >>
> >  Hmm, right. You'll only scan that one bitmap the file is in, won't
> >  
> >
> I don't understand your remark.  These files are in many many
> bitmaps.  Can you quote more of the code?
  The condition really is:
  if ((off && (!unfm || (file_block != 0))) ||
      SB_AP_BITMAP(s)[bm].free_count > (s->s_blocksize << 3) / 10)

  and we reset 'off' after the first test so the first part of || can be
true only once (when we are scanning the bitmap containing the last file
block).
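
  A standalone model of that loop, in case it helps. This is not the code
from bitmap.c: the function names and the flat free_counts array are made
up, and I take the condition being true to mean "this bitmap gets scanned".
Only the condition and the reset of 'off' follow what I described above.

```c
/* The quoted condition, with the bitmap's free count passed in directly. */
int bitmap_passes(int off, int unfm, unsigned long file_block,
		  unsigned long free_count, unsigned long blocksize)
{
	return (off && (!unfm || file_block != 0)) ||
	       free_count > (blocksize << 3) / 10;
}

/* Count how many bitmaps pass when 'off' is cleared after the first test. */
int count_passing_bitmaps(const unsigned long *free_counts, int nr_bitmaps,
			  int off, int unfm, unsigned long file_block,
			  unsigned long blocksize)
{
	int passing = 0;

	for (int bm = 0; bm < nr_bitmaps; bm++) {
		if (bitmap_passes(off, unfm, file_block, free_counts[bm],
				  blocksize))
			passing++;
		off = 0;  /* the first half of the || can be true only once */
	}
	return passing;
}
```

  With 4096-byte blocks a bitmap covers 32768 blocks, so after the first
bitmap only those with more than 3276 free blocks pass the test.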

Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs


Re: is quotacheck slow with reiserfs

2006-10-09 Thread Jan Kara
> On Fri, Oct 06, 2006 at 02:09:11AM +0400, Vladimir V. Saveliev wrote:
> > 
> > On Friday 06 October 2006 00:34, Louis-David Mitterrand wrote:
> > > 
> > > On a 200-user mail server with a 500gb reiser3 fs, quotacheck takes an 
> > > hour at boot time. This is a mail server with 200 users. 
> > > 
> > which linux version is in use on the server?
> 
> Debian unstable with latest kernel 2.6.17.
  Actually, if you are running such a recent system, you can consider using
journaled quota. Then you don't have to run quotacheck at all (only
after running fsck). You can turn on journaled quota with the mount options
usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0 (provided you
use both user and group quotas and that you use the new quota format).
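
  If you set up mounts from a program, the option string can be put
together like this. build_jquota_opts() is a made-up helper, and dropping
the usrjquota= or grpjquota= part when only one quota type is in use is my
assumption about how you would adapt the line above.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the journaled quota mount options into buf.
 * Returns 0 on success, -1 if no quota type is selected or buf is too small.
 */
int build_jquota_opts(char *buf, size_t len, int user, int group)
{
	const char *opts;
	int n;

	if (user && group)
		opts = "usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0";
	else if (user)
		opts = "usrjquota=aquota.user,jqfmt=vfsv0";
	else if (group)
		opts = "grpjquota=aquota.group,jqfmt=vfsv0";
	else
		return -1;

	n = snprintf(buf, len, "%s", opts);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

  The resulting string is what goes into the options field of /etc/fstab or
after -o on the mount command line.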

> > > Is that normal? Is there a way to speed it up?
> > > 
> > 
> > How do users store their mails? If they store one mail in a separate 
> > file, then quotacheck is to iterate over
> > a lot files which can be very time consuming.
> 
> We use maildir, so, yes, it's a lot of files.
> 
> Isn't there a way to run quotacheck in the background while daemons 
> start serving users? 
> 
> Or must it absolutely be run at mount time to be effective?
  Someone already answered this in this thread ;).

Honza