Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks

2007-04-20 Thread Eric Sandeen

Mingming Cao wrote:

On Fri, 2007-04-20 at 18:14 -0500, Eric Sandeen wrote:
It's a bug, today.   


They are fixed in mm tree, as part of the patches which backports ext3
block reservation code to ext2. filesystem block numbers are all
ext2_fsblk_t type(i.e. unsigned long)(see ext2_new_blocks()). Maybe need
a round of thorough review to see if anything left, but I think what in
mm tree looks good.


Oh... oops.  I didn't think to check mm, didn't expect to find those 
changes on ext2.  Ok, I will double-check that against what I did.



And those patches in mm tree also backports the ext3 best-effort
allocates multiple blocks code (allocate multiple blocks within the
block reservation window as much as possible), FYI.


Ok, thanks Mingming!

-Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks

2007-04-20 Thread Mingming Cao
On Fri, 2007-04-20 at 18:14 -0500, Eric Sandeen wrote:
> Andreas Dilger wrote:
> > On Apr 20, 2007  12:10 -0500, Eric Sandeen wrote:
> >> If ext3 can do 16T, ext2 probably should be able to as well.
> >> There are still "int" block containers in the block allocation path
> >> that need to be fixed up.
> > 
> > Yeah, but who wants to do 16TB e2fsck on every boot?  I think there
> > needs to be some limits imposed for the sake of usability.
> 
> I figure this is in the fine tradition of "enough rope to hang oneself"
> 
> If you have 16T of filesystem you probably know enough to not hang 
> yourself this way.
> 
> *shrug*
> 
> It's a bug, today.   

They are fixed in mm tree, as part of the patches which backports ext3
block reservation code to ext2. filesystem block numbers are all
ext2_fsblk_t type(i.e. unsigned long)(see ext2_new_blocks()). Maybe need
a round of thorough review to see if anything left, but I think what in
mm tree looks good.

And those patches in mm tree also backports the ext3 best-effort
allocates multiple blocks code (allocate multiple blocks within the
block reservation window as much as possible), FYI.

Mingming

> If we need another change to limit ext2 to 500G or 
> something, fine by me.  :)
> 
> -Eric
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks

2007-04-20 Thread Eric Sandeen

Andreas Dilger wrote:

On Apr 20, 2007  12:10 -0500, Eric Sandeen wrote:

If ext3 can do 16T, ext2 probably should be able to as well.
There are still "int" block containers in the block allocation path
that need to be fixed up.


Yeah, but who wants to do 16TB e2fsck on every boot?  I think there
needs to be some limits imposed for the sake of usability.


I figure this is in the fine tradition of "enough rope to hang oneself"

If you have 16T of filesystem you probably know enough to not hang 
yourself this way.


*shrug*

It's a bug, today.   If we need another change to limit ext2 to 500G or 
something, fine by me.  :)


-Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix ext2 allocator overflows above 31 bit blocks

2007-04-20 Thread Andreas Dilger
On Apr 20, 2007  12:10 -0500, Eric Sandeen wrote:
> If ext3 can do 16T, ext2 probably should be able to as well.
> There are still "int" block containers in the block allocation path
> that need to be fixed up.

Yeah, but who wants to do 16TB e2fsck on every boot?  I think there
needs to be some limits imposed for the sake of usability.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


+ jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread.patch added to -mm tree

2007-04-20 Thread akpm

The patch titled
 jbd: check for error returned by kthread_create on creating journal thread
has been added to the -mm tree.  Its filename is
 
jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread.patch

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

--
Subject: jbd: check for error returned by kthread_create on creating journal 
thread
From: Pavel Emelianov <[EMAIL PROTECTED]>

If the thread failed to create the subsequent wait_event will hang forever.

This is likely to happen if kernel hits max_threads limit.

Will be critical for virtualization systems that limit the number of tasks
and kernel memory usage within the container.

(akpm: JBD should be converted fully to the kthread API: kthread_should_stop()
and kthread_stop()).

Cc: 
Cc: Christoph Hellwig <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/jbd/journal.c  |   13 +
 fs/jbd2/journal.c |   13 +
 2 files changed, 18 insertions(+), 8 deletions(-)

diff -puN 
fs/jbd/journal.c~jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread
 fs/jbd/journal.c
--- 
a/fs/jbd/journal.c~jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread
+++ a/fs/jbd/journal.c
@@ -210,10 +210,16 @@ end_loop:
return 0;
 }
 
-static void journal_start_thread(journal_t *journal)
+static int journal_start_thread(journal_t *journal)
 {
-   kthread_run(kjournald, journal, "kjournald");
+   struct task_struct *t;
+
+   t = kthread_run(kjournald, journal, "kjournald");
+   if (IS_ERR(t))
+   return PTR_ERR(t);
+
wait_event(journal->j_wait_done_commit, journal->j_task != 0);
+   return 0;
 }
 
 static void journal_kill_thread(journal_t *journal)
@@ -839,8 +845,7 @@ static int journal_reset(journal_t *jour
 
/* Add the dynamic fields and write it to disk. */
journal_update_superblock(journal, 1);
-   journal_start_thread(journal);
-   return 0;
+   return journal_start_thread(journal);
 }
 
 /**
diff -puN 
fs/jbd2/journal.c~jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread
 fs/jbd2/journal.c
--- 
a/fs/jbd2/journal.c~jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread
+++ a/fs/jbd2/journal.c
@@ -210,10 +210,16 @@ end_loop:
return 0;
 }
 
-static void jbd2_journal_start_thread(journal_t *journal)
+static int jbd2_journal_start_thread(journal_t *journal)
 {
-   kthread_run(kjournald2, journal, "kjournald2");
+   struct task_struct *t;
+
+   t = kthread_run(kjournald2, journal, "kjournald2");
+   if (IS_ERR(t))
+   return PTR_ERR(t);
+
wait_event(journal->j_wait_done_commit, journal->j_task != 0);
+   return 0;
 }
 
 static void journal_kill_thread(journal_t *journal)
@@ -839,8 +845,7 @@ static int journal_reset(journal_t *jour
 
/* Add the dynamic fields and write it to disk. */
jbd2_journal_update_superblock(journal, 1);
-   jbd2_journal_start_thread(journal);
-   return 0;
+   return jbd2_journal_start_thread(journal);
 }
 
 /**
_

Patches currently in -mm which might be from [EMAIL PROTECTED] are

origin.patch
lockdep-treats-down_write_trylock-like-regular-down_write.patch
introduce-a-handy-list_first_entry-macro-v2.patch
report-that-kernel-is-tainted-if-there-were-an-oops-before.patch
jbd-check-for-error-returned-by-kthread_create-on-creating-journal-thread.patch

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Check for error returned by kthread_create on creating journal thread

2007-04-20 Thread Andrew Morton
On Mon, 16 Apr 2007 11:41:14 +0400
Pavel Emelianov <[EMAIL PROTECTED]> wrote:

> If the thread failed to create the subsequent wait_event
> will hang forever.
> 
> This is likely to happen if kernel hits max_threads limit.
> 
> Will be critical for virtualization systems that limit the
> number of tasks and kernel memory usage within the container.
> 
> 
> [diff-jbd-check-start-journal-thread-return-value  text/plain (1.7KB)]
> --- ./fs/jbd/journal.c.jbdthreads 2007-04-16 11:17:36.0 +0400
> +++ ./fs/jbd/journal.c2007-04-16 11:30:09.0 +0400
> @@ -211,10 +211,16 @@ end_loop:
>   return 0;
>  }
>  
> -static void journal_start_thread(journal_t *journal)
> +static int journal_start_thread(journal_t *journal)
>  {
> - kthread_run(kjournald, journal, "kjournald");
> + struct task_struct *t;
> +
> + t = kthread_run(kjournald, journal, "kjournald");
> + if (IS_ERR(t))
> + return PTR_ERR(t);
> +
>   wait_event(journal->j_wait_done_commit, journal->j_task != 0);
> + return 0;
>  }

Thanks.   Please don't forget those Signed-off-by:s

I assume that you runtime tested this and that the mount failed in
an appropriate fashion?
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix up lazy_bg bitmap initialization at mkfs time

2007-04-20 Thread Andreas Dilger
On Apr 20, 2007  09:57 -0500, Eric Sandeen wrote:
> Anyway, how about something like this for calculating journal size in 
> the face of lazy_bg.  I know the last group may be smaller... but I figure
> this is just a heuristic anyway.
> 
> Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>
> 
> Index: e2fsprogs-1.39_ext4_hg/misc/util.c
> ===
> --- e2fsprogs-1.39_ext4_hg.orig/misc/util.c
> +++ e2fsprogs-1.39_ext4_hg/misc/util.c
> @@ -252,8 +252,16 @@ void parse_journal_opts(const char *opts
>  int figure_journal_size(int size, ext2_filsys fs)
>  {
>   blk_t j_blocks;
> + blk_t fs_size;
>  
> - if (fs->super->s_blocks_count < 2048) {
> + if (EXT2_HAS_COMPAT_FEATURE(fs->super, 
> + EXT2_FEATURE_COMPAT_LAZY_BG)) {
> + fs_size = fs->super->s_blocks_per_group * 2; 
> + } else {
> + fs_size = fs->super->s_blocks_count;
> + }

This should also check for !RO_COMPAT_UNINIT_GROUPS, since LAZY_BG
can be used in conjunction with UNINIT_GROUPS to mean "don't zero
uninitialized groups" but doesn't set bg_free_blocks_count == 0.

I was going to suggest using s_free_blocks_count, but that might
lead to confusion if e.g. e2fsck deletes a journal on a nearly-full
fs and then tune2fs recreates it much smaller than before.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e2fsprogs: Offsets of EAs in inode need not be sorted

2007-04-20 Thread Kalpak Shah
On Fri, 2007-04-20 at 10:10 -0400, Theodore Tso wrote:
> On Fri, Apr 20, 2007 at 06:35:41PM +0530, Kalpak Shah wrote:
> > 
> > I saw this problem when I was running a script which created a random
> > number of EAs for a file of random sizes. If you mount the image I have
> > given, all the EAs are displayed and they can be used/modified/deleted
> > without any problems so its not a bug in ext3 code.
> 
> Not a bug in the ext3 code, but if a RHEL5 or SLES 10 or Debian etch
> user uses EA-in-Inode, with SELinux enabled, and then uses the e2fsck
> shipped with their distro, some of the unsorted EA's would get
> deleted right?  

Yes thats what I meant, it is certainly not a bug in ext3. And actually
the effects are worse, _all_ the EAs in the inode get deleted. I think
not many people have been using > 128 byte inodes and hence this problem
might not have cropped up until now.

Thanks,
Kalpak Shah.
<[EMAIL PROTECTED]>

> 
> If that's correct, then that's a pretty nasty data corruption problem
> (that with SELinux enabled could cause the system to become completely
> non-functional, further contributing to SELinux's bad reputation...)
> and we'll need to push your patch to the various distro's as a
> relatively high priority bug fix.
> 
>   - Ted

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] fix ext2 allocator overflows above 31 bit blocks

2007-04-20 Thread Eric Sandeen
If ext3 can do 16T, ext2 probably should be able to as well.
There are still "int" block containers in the block allocation path
that need to be fixed up.

Perhaps ext2 should get the ext2_fsblk_t/ext2_grpblk_t treatment
as ext3 did, for clarity...

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>

Index: linux-2.6.21-rc5/fs/ext2/balloc.c
===
--- linux-2.6.21-rc5.orig/fs/ext2/balloc.c
+++ linux-2.6.21-rc5/fs/ext2/balloc.c
@@ -323,7 +323,7 @@ got_it:
  * bitmap, and then for any free bit if that fails.
  * This function also updates quota and i_blocks field.
  */
-int ext2_new_block(struct inode *inode, unsigned long goal,
+unsigned long ext2_new_block(struct inode *inode, unsigned long goal,
u32 *prealloc_count, u32 *prealloc_block, int *err)
 {
struct buffer_head *bitmap_bh = NULL;
@@ -332,8 +332,8 @@ int ext2_new_block(struct inode *inode, 
int group_no;   /* i */
int ret_block;  /* j */
int group_idx;  /* k */
-   int target_block;   /* tmp */
-   int block = 0;
+   unsigned long target_block; /* tmp */
+   unsigned long block = 0;
struct super_block *sb = inode->i_sb;
struct ext2_sb_info *sbi = EXT2_SB(sb);
struct ext2_super_block *es = sbi->s_es;
@@ -460,7 +460,7 @@ got_block:
  sbi->s_itb_per_group))
ext2_error (sb, "ext2_new_block",
"Allocating block in system zone - "
-   "block = %u", target_block);
+   "block = %lu", target_block);
 
if (target_block >= le32_to_cpu(es->s_blocks_count)) {
ext2_error (sb, "ext2_new_block",
Index: linux-2.6.21-rc5/fs/ext2/ext2.h
===
--- linux-2.6.21-rc5.orig/fs/ext2/ext2.h
+++ linux-2.6.21-rc5/fs/ext2/ext2.h
@@ -91,7 +91,7 @@ static inline struct ext2_inode_info *EX
 /* balloc.c */
 extern int ext2_bg_has_super(struct super_block *sb, int group);
 extern unsigned long ext2_bg_num_gdb(struct super_block *sb, int group);
-extern int ext2_new_block (struct inode *, unsigned long,
+extern unsigned long ext2_new_block (struct inode *, unsigned long,
   __u32 *, __u32 *, int *);
 extern void ext2_free_blocks (struct inode *, unsigned long,
  unsigned long);
Index: linux-2.6.21-rc5/fs/ext2/inode.c
===
--- linux-2.6.21-rc5.orig/fs/ext2/inode.c
+++ linux-2.6.21-rc5/fs/ext2/inode.c
@@ -107,7 +107,7 @@ void ext2_discard_prealloc (struct inode
 #endif
 }
 
-static int ext2_alloc_block (struct inode * inode, unsigned long goal, int 
*err)
+static unsigned long ext2_alloc_block (struct inode * inode, unsigned long 
goal, int *err)
 {
 #ifdef EXT2FS_DEBUG
static unsigned long alloc_hits, alloc_attempts;
@@ -425,13 +425,13 @@ static int ext2_alloc_branch(struct inod
int n = 0;
int err;
int i;
-   int parent = ext2_alloc_block(inode, goal, &err);
+   unsigned long parent = ext2_alloc_block(inode, goal, &err);
 
branch[0].key = cpu_to_le32(parent);
if (parent) for (n = 1; n < num; n++) {
struct buffer_head *bh;
/* Allocate the next block */
-   int nr = ext2_alloc_block(inode, parent, &err);
+   unsigned long nr = ext2_alloc_block(inode, parent, &err);
if (!nr)
break;
branch[n].key = cpu_to_le32(nr);


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Interface for the new fallocate() system call

2007-04-20 Thread Jakub Jelinek
On Fri, Apr 20, 2007 at 07:21:46PM +0530, Amit K. Arora wrote:
> Ok.
> In this case we may have to consider following things:
> 
> 1) Obviously, for this glibc will have to call fallocate() syscall with
> different arguments on s390, than other archs. I think this should be
> doable and should not be an issue with glibc folks (right?).

glibc can cope with this easily, will just add
sysdeps/unix/sysv/linux/s390/fallocate.c or something similar to override
the generic Linux implementation.

> 2) we also need to see how strace behaves in this case. With little
> knowledge that I have of strace, I don't think it should depend on
> argument ordering of a system call on different archs (since it uses
> ptrace internally and that should take care of it). But, it will be
> nice if someone can confirm this.

strace would solve this with #ifdef mess, it already does that in many
places so guess another few lines don't make it significantly worse.

Jakub
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix up lazy_bg bitmap initialization at mkfs time

2007-04-20 Thread Eric Sandeen
Andreas Dilger wrote:

> Just as an FYI - it probably makes little sense to have a 128MB journal
> for a filesystem you want to use for testing, since it would still write
> 128MB to your loopback device when zeroing the journal.  It still makes
> sense for mke2fs and libext2fs to be fixed for correctness, but I think
> it also makes sense for lazy_bg to influence the journal size.
>   
yup, it probably should change that heuristic.  With lazy_bg, it should 
probably 
look at something like blocks per group * 2 instead of total filesystem 
blocks...
see below.

>> Unfortunately it also increases mkfs time a bit, as it must search
>> a huge string of unavailable blocks if it has to allocate in the 
>> last bg.  Ah well...
>> 
>
> It also probably makes sense for libext2fs to check group descriptors
> while allocating...
>   
yeah... I was going with the "lazy" theme.  ;-)  (and the 
small-functional-change-at-a-time theme...) But for large filesystems, that 
would certainly be a bonus. 

/*
 * Stupid algorithm --- we now just search forward starting from the
 * goal.  Should put in a smarter one someday
 */
errcode_t ext2fs_new_block(...

I get the impression that allocation in libext2 is not too glorious in 
the first place... :)

-Eric

Anyway, how about something like this for calculating journal size in 
the face of lazy_bg.  I know the last group may be smaller... but I figure
this is just a heuristic anyway.

Signed-off-by: Eric Sandeen <[EMAIL PROTECTED]>

Index: e2fsprogs-1.39_ext4_hg/misc/util.c
===
--- e2fsprogs-1.39_ext4_hg.orig/misc/util.c
+++ e2fsprogs-1.39_ext4_hg/misc/util.c
@@ -252,8 +252,16 @@ void parse_journal_opts(const char *opts
 int figure_journal_size(int size, ext2_filsys fs)
 {
blk_t j_blocks;
+   blk_t fs_size;
 
-   if (fs->super->s_blocks_count < 2048) {
+   if (EXT2_HAS_COMPAT_FEATURE(fs->super, 
+   EXT2_FEATURE_COMPAT_LAZY_BG)) {
+   fs_size = fs->super->s_blocks_per_group * 2; 
+   } else {
+   fs_size = fs->super->s_blocks_count;
+   }
+
+   if (fs_size < 2048) {
fputs(_("\nFilesystem too small for a journal\n"), stderr);
return 0;
}
@@ -276,13 +284,13 @@ int figure_journal_size(int size, ext2_f
return j_blocks;
}
 
-   if (fs->super->s_blocks_count < 32768)
+   if (fs_size < 32768)
j_blocks = 1400;
-   else if (fs->super->s_blocks_count < 256*1024)
+   else if (fs_size < 256*1024)
j_blocks = 4096;
-   else if (fs->super->s_blocks_count < 512*1024)
+   else if (fs_size < 512*1024)
j_blocks = 8192;
-   else if (fs->super->s_blocks_count < 1024*1024)
+   else if (fs_size < 1024*1024)
j_blocks = 16384;
else
j_blocks = 32768;


-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e2fsprogs: Offsets of EAs in inode need not be sorted

2007-04-20 Thread Theodore Tso
On Fri, Apr 20, 2007 at 06:35:41PM +0530, Kalpak Shah wrote:
> 
> I saw this problem when I was running a script which created a random
> number of EAs for a file of random sizes. If you mount the image I have
> given, all the EAs are displayed and they can be used/modified/deleted
> without any problems so its not a bug in ext3 code.

Not a bug in the ext3 code, but if a RHEL5 or SLES 10 or Debian etch
user uses EA-in-Inode, with SELinux enabled, and then uses the e2fsck
shipped with their distro, some of the unsorted EA's would get
deleted right?  

If that's correct, then that's a pretty nasty data corruption problem
(that with SELinux enabled could cause the system to become completely
non-functional, further contributing to SELinux's bad reputation...)
and we'll need to push your patch to the various distro's as a
relatively high priority bug fix.

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Interface for the new fallocate() system call

2007-04-20 Thread Amit K. Arora
On Wed, Apr 18, 2007 at 07:06:00AM -0600, Andreas Dilger wrote:
> On Apr 17, 2007  18:25 +0530, Amit K. Arora wrote:
> > On Fri, Mar 30, 2007 at 02:14:17AM -0500, Jakub Jelinek wrote:
> > > Wouldn't
> > > int fallocate(loff_t offset, loff_t len, int fd, int mode)
> > > work on both s390 and ppc/arm?  glibc will certainly wrap it and
> > > reorder the arguments as needed, so there is no need to keep fd first.
> > 
> > I think more people are comfirtable with this approach.
> 
> Really?  I thought from the last postings that "fd first, wrap on s390"
> was better.
> 
> > Since glibc
> > will wrap the system call and export the "conventional" interface
> > (with fd first) to applications, we may not worry about keeping fd first
> > in kernel code. I am personally fine with this approach.
> 
> It would seem to make more sense to wrap the syscall on those architectures
> that can't handle the "conventional" interface (fd first).

Ok.
In this case we may have to consider following things:

1) Obviously, for this glibc will have to call fallocate() syscall with
different arguments on s390, than other archs. I think this should be
doable and should not be an issue with glibc folks (right?).

2) we also need to see how strace behaves in this case. With little
knowledge that I have of strace, I don't think it should depend on
argument ordering of a system call on different archs (since it uses
ptrace internally and that should take care of it). But, it will be
nice if someone can confirm this.

Thanks!
--
Regards,
Amit Arora
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e2fsprogs: Offsets of EAs in inode need not be sorted

2007-04-20 Thread Kalpak Shah
On Fri, 2007-04-20 at 08:38 -0400, Theodore Tso wrote:
> On Thu, Apr 19, 2007 at 05:35:36PM +0530, Kalpak Shah wrote:
> > Hi,
> > 
> > This patch removes a code snippet from check_ea_in_inode() in pass1
> > which checks if the EA values in the inode are sorted or not. The
> > comments in fs/ext*/xattr.c state that the EA values in the external
> > EA block are sorted but those in the inode need not be sorted. I
> > have also attached a test image which has unsorted EAs in the
> > inodes. The current e2fsck wrongly clears the EAs in the inode.
> 
> Hmm, have you been able to create test images that have unsorted EA's
> in inodes using a standard ext3 kernel implementat?  If so, then this
> is a patch which we should push to the distro's since it could cause
> data loss, and cause serious malfunctions, especially for people who
> have SELinux enabled and are using large inodes for the EA-in-inode
> feature

Hi,

I saw this problem when I was running a script which created a random
number of EAs for a file of random sizes. If you mount the image I have
given, all the EAs are displayed and they can be used/modified/deleted
without any problems so its not a bug in ext3 code.

Thanks,
Kalpak.

> 
>   - Ted
> -
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e2fsprogs: Offsets of EAs in inode need not be sorted

2007-04-20 Thread Theodore Tso
On Thu, Apr 19, 2007 at 05:35:36PM +0530, Kalpak Shah wrote:
> Hi,
> 
> This patch removes a code snippet from check_ea_in_inode() in pass1
> which checks if the EA values in the inode are sorted or not. The
> comments in fs/ext*/xattr.c state that the EA values in the external
> EA block are sorted but those in the inode need not be sorted. I
> have also attached a test image which has unsorted EAs in the
> inodes. The current e2fsck wrongly clears the EAs in the inode.

Hmm, have you been able to create test images that have unsorted EA's
in inodes using a standard ext3 kernel implementat?  If so, then this
is a patch which we should push to the distro's since it could cause
data loss, and cause serious malfunctions, especially for people who
have SELinux enabled and are using large inodes for the EA-in-inode
feature

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] e2fsprogs - e2fsck pass1c does extra work if root dir has shared blocks

2007-04-20 Thread Theodore Tso
On Tue, Apr 10, 2007 at 03:39:56PM -0700, Jim Garlick wrote:
> 
> Another small bug I think: if the root directory contains shared 
> blocks, e2fsck pass1c search_dirent_proc() will be looking for 
> one more containing directory than it will ever find, and thus 
> loses an opportunity to terminate early.
> 
> Signed-off-by: Jim Garlick <[EMAIL PROTECTED]>

Thanks, applied.

- Ted
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fix up lazy_bg bitmap initialization at mkfs time

2007-04-20 Thread Andreas Dilger
On Apr 19, 2007  16:07 -0500, Eric Sandeen wrote:
> While trying out the -O lazy_bg option, I ran into some trouble on my
> big filesystem.  The journal size was > free blocks in the first block group,
> so it spilled into the next bg with available blocks.  Since we are using
> lazy_bg here, that -should- have been the last block group.

Just as an FYI - it probably makes little sense to have a 128MB journal
for a filesystem you want to use for testing, since it would still write
128MB to your loopback device when zeroing the journal.  It still makes
sense for mke2fs and libext2fs to be fixed for correctness, but I think
it also makes sense for lazy_bg to influence the journal size.

> Unfortunately it also increases mkfs time a bit, as it must search
> a huge string of unavailable blocks if it has to allocate in the 
> last bg.  Ah well...

It also probably makes sense for libext2fs to check group descriptors
while allocating...

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Copy i_flags to ext3 inode flags on write (version 2)

2007-04-20 Thread Andrew Morton
On Tue, 17 Apr 2007 12:38:55 +0200 Jan Kara <[EMAIL PROTECTED]> wrote:

>   attached is a second version of a patch that stores inode flags such as
> S_IMMUTABLE, S_APPEND, etc. from i_flags to EXT3_I(inode)->i_flags when
> inode is written to disk. The same thing is done on GETFLAGS ioctl.
>   Quota code changes these flags on quota files (to make it harder for
> sysadmin to screw himself) and these changes were not correctly
> propagated into the filesystem (especially, lsattr did not show them and
> users were wondering...). Andrew, could you please put the patch into your
> queue? Thanks.

umm, OK.

What about ext2 and all the zillions of others?
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


+ ext3-copy-i_flags-to-inode-flags-on-write.patch added to -mm tree

2007-04-20 Thread akpm

The patch titled
 ext3: copy i_flags to inode flags on write
has been added to the -mm tree.  Its filename is
 ext3-copy-i_flags-to-inode-flags-on-write.patch

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

--
Subject: ext3: copy i_flags to inode flags on write
From: Jan Kara <[EMAIL PROTECTED]>

A patch that stores inode flags such as S_IMMUTABLE, S_APPEND, etc.  from
i_flags to EXT3_I(inode)->i_flags when inode is written to disk.  The same
thing is done on GETFLAGS ioctl.

Quota code changes these flags on quota files (to make it harder for
sysadmin to screw himself) and these changes were not correctly propagated
into the filesystem (especially, lsattr did not show them and users were
wondering...).


Propagate flags such as S_APPEND, S_IMMUTABLE, etc.  from i_flags into
ext3-specific i_flags.  Hence, when someone sets these flags via a
different interface than ioctl, they are stored correctly.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>
Cc: 
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/ext3/inode.c |   20 
 fs/ext3/ioctl.c |1 +
 include/linux/ext3_fs.h |1 +
 3 files changed, 22 insertions(+)

diff -puN fs/ext3/inode.c~ext3-copy-i_flags-to-inode-flags-on-write 
fs/ext3/inode.c
--- a/fs/ext3/inode.c~ext3-copy-i_flags-to-inode-flags-on-write
+++ a/fs/ext3/inode.c
@@ -2580,6 +2580,25 @@ void ext3_set_inode_flags(struct inode *
inode->i_flags |= S_DIRSYNC;
 }
 
+/* Propagate flags from i_flags to EXT3_I(inode)->i_flags */
+void ext3_get_inode_flags(struct ext3_inode_info *ei)
+{
+   unsigned int flags = ei->vfs_inode.i_flags;
+
+   ei->i_flags &= ~(EXT3_SYNC_FL|EXT3_APPEND_FL|
+   EXT3_IMMUTABLE_FL|EXT3_NOATIME_FL|EXT3_DIRSYNC_FL);
+   if (flags & S_SYNC)
+   ei->i_flags |= EXT3_SYNC_FL;
+   if (flags & S_APPEND)
+   ei->i_flags |= EXT3_APPEND_FL;
+   if (flags & S_IMMUTABLE)
+   ei->i_flags |= EXT3_IMMUTABLE_FL;
+   if (flags & S_NOATIME)
+   ei->i_flags |= EXT3_NOATIME_FL;
+   if (flags & S_DIRSYNC)
+   ei->i_flags |= EXT3_DIRSYNC_FL;
+}
+
 void ext3_read_inode(struct inode * inode)
 {
struct ext3_iloc iloc;
@@ -2735,6 +2754,7 @@ static int ext3_do_update_inode(handle_t
if (ei->i_state & EXT3_STATE_NEW)
memset(raw_inode, 0, EXT3_SB(inode->i_sb)->s_inode_size);
 
+   ext3_get_inode_flags(ei);
raw_inode->i_mode = cpu_to_le16(inode->i_mode);
if(!(test_opt(inode->i_sb, NO_UID32))) {
raw_inode->i_uid_low = cpu_to_le16(low_16_bits(inode->i_uid));
diff -puN fs/ext3/ioctl.c~ext3-copy-i_flags-to-inode-flags-on-write 
fs/ext3/ioctl.c
--- a/fs/ext3/ioctl.c~ext3-copy-i_flags-to-inode-flags-on-write
+++ a/fs/ext3/ioctl.c
@@ -28,6 +28,7 @@ int ext3_ioctl (struct inode * inode, st
 
switch (cmd) {
case EXT3_IOC_GETFLAGS:
+   ext3_get_inode_flags(ei);
flags = ei->i_flags & EXT3_FL_USER_VISIBLE;
return put_user(flags, (int __user *) arg);
case EXT3_IOC_SETFLAGS: {
diff -puN include/linux/ext3_fs.h~ext3-copy-i_flags-to-inode-flags-on-write 
include/linux/ext3_fs.h
--- a/include/linux/ext3_fs.h~ext3-copy-i_flags-to-inode-flags-on-write
+++ a/include/linux/ext3_fs.h
@@ -824,6 +824,7 @@ extern int ext3_change_inode_journal_fla
 extern int ext3_get_inode_loc(struct inode *, struct ext3_iloc *);
 extern void ext3_truncate (struct inode *);
 extern void ext3_set_inode_flags(struct inode *);
+extern void ext3_get_inode_flags(struct ext3_inode_info *);
 extern void ext3_set_aops(struct inode *inode);
 
 /* ioctl.c */
_

Patches currently in -mm which might be from [EMAIL PROTECTED] are

readahead-improve-heuristic-detecting-sequential-reads.patch
readahead-code-cleanup.patch
readahead-code-cleanup-fix.patch
make-remove_inode_dquot_ref-static.patch
ext3-copy-i_flags-to-inode-flags-on-write.patch
ext4-copy-i_flags-to-inode-flags-on-write.patch
udf-use-sector_t-and-loff_t-for-file-offsets.patch
udf-introduce-struct-extent_position.patch
udf-use-get_bh.patch
udf-add-assertions.patch
udf-support-files-larger-than-1g.patch
udf-fix-link-counts.patch

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


+ ext4-copy-i_flags-to-inode-flags-on-write.patch added to -mm tree

2007-04-20 Thread akpm

The patch titled
 ext4: copy i_flags to inode flags on write
has been added to the -mm tree.  Its filename is
 ext4-copy-i_flags-to-inode-flags-on-write.patch

*** Remember to use Documentation/SubmitChecklist when testing your code ***

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this

--
Subject: ext4: copy i_flags to inode flags on write
From: Jan Kara <[EMAIL PROTECTED]>

Propagate flags such as S_APPEND, S_IMMUTABLE, etc.  from i_flags into
ext4-specific i_flags.  Hence, when someone sets these flags via a
different interface than ioctl, they are stored correctly.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>
Cc: 
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/ext4/inode.c |   20 
 fs/ext4/ioctl.c |1 +
 include/linux/ext4_fs.h |1 +
 3 files changed, 22 insertions(+)

diff -puN fs/ext4/inode.c~ext4-copy-i_flags-to-inode-flags-on-write 
fs/ext4/inode.c
--- a/fs/ext4/inode.c~ext4-copy-i_flags-to-inode-flags-on-write
+++ a/fs/ext4/inode.c
@@ -2613,6 +2613,25 @@ void ext4_set_inode_flags(struct inode *
inode->i_flags |= S_DIRSYNC;
 }
 
+/* Propagate flags from i_flags to EXT4_I(inode)->i_flags */
+void ext4_get_inode_flags(struct ext4_inode_info *ei)
+{
+   unsigned int flags = ei->vfs_inode.i_flags;
+
+   ei->i_flags &= ~(EXT4_SYNC_FL|EXT4_APPEND_FL|
+   EXT4_IMMUTABLE_FL|EXT4_NOATIME_FL|EXT4_DIRSYNC_FL);
+   if (flags & S_SYNC)
+   ei->i_flags |= EXT4_SYNC_FL;
+   if (flags & S_APPEND)
+   ei->i_flags |= EXT4_APPEND_FL;
+   if (flags & S_IMMUTABLE)
+   ei->i_flags |= EXT4_IMMUTABLE_FL;
+   if (flags & S_NOATIME)
+   ei->i_flags |= EXT4_NOATIME_FL;
+   if (flags & S_DIRSYNC)
+   ei->i_flags |= EXT4_DIRSYNC_FL;
+}
+
 void ext4_read_inode(struct inode * inode)
 {
struct ext4_iloc iloc;
@@ -2773,6 +2792,7 @@ static int ext4_do_update_inode(handle_t
if (ei->i_state & EXT4_STATE_NEW)
memset(raw_inode, 0, EXT4_SB(inode->i_sb)->s_inode_size);
 
+   ext4_get_inode_flags(ei);
raw_inode->i_mode = cpu_to_le16(inode->i_mode);
if(!(test_opt(inode->i_sb, NO_UID32))) {
raw_inode->i_uid_low = cpu_to_le16(low_16_bits(inode->i_uid));
diff -puN fs/ext4/ioctl.c~ext4-copy-i_flags-to-inode-flags-on-write 
fs/ext4/ioctl.c
--- a/fs/ext4/ioctl.c~ext4-copy-i_flags-to-inode-flags-on-write
+++ a/fs/ext4/ioctl.c
@@ -29,6 +29,7 @@ int ext4_ioctl (struct inode * inode, st
 
switch (cmd) {
case EXT4_IOC_GETFLAGS:
+   ext4_get_inode_flags(ei);
flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
return put_user(flags, (int __user *) arg);
case EXT4_IOC_SETFLAGS: {
diff -puN include/linux/ext4_fs.h~ext4-copy-i_flags-to-inode-flags-on-write 
include/linux/ext4_fs.h
--- a/include/linux/ext4_fs.h~ext4-copy-i_flags-to-inode-flags-on-write
+++ a/include/linux/ext4_fs.h
@@ -949,6 +949,7 @@ extern int ext4_change_inode_journal_fla
 extern int ext4_get_inode_loc(struct inode *, struct ext4_iloc *);
 extern void ext4_truncate (struct inode *);
 extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_get_inode_flags(struct ext4_inode_info *);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_block_truncate_page(handle_t *handle, struct page *page,
_

Patches currently in -mm which might be from [EMAIL PROTECTED] are

readahead-improve-heuristic-detecting-sequential-reads.patch
readahead-code-cleanup.patch
readahead-code-cleanup-fix.patch
make-remove_inode_dquot_ref-static.patch
ext4-copy-i_flags-to-inode-flags-on-write.patch
udf-use-sector_t-and-loff_t-for-file-offsets.patch
udf-introduce-struct-extent_position.patch
udf-use-get_bh.patch
udf-add-assertions.patch
udf-support-files-larger-than-1g.patch
udf-fix-link-counts.patch

-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html