Re: [PATCH 05/25] Unionfs: cast page-index loff_t before shifting

2007-09-26 Thread Christoph Hellwig
On Tue, Sep 25, 2007 at 11:09:44PM -0400, Erez Zadok wrote:
 Fixes bugs in number promotion/demotion computation, as per
 http://lkml.org/lkml/2007/9/20/17

It's better to use te page_offset helper as that avoids any confusion
on where to cast.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/25] Unionfs: display informational messages only if debug is on

2007-09-26 Thread Jan Engelhardt

On Sep 25 2007 23:09, Erez Zadok wrote:
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -394,8 +394,8 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
   if (willwrite  IS_WRITE_FLAG(file-f_flags) 
   !IS_WRITE_FLAG(unionfs_lower_file(file)-f_flags) 
   is_robranch(dentry)) {
-  printk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
- dentry-d_name.name);
+  dprintk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
+  dentry-d_name.name);

Don't we have pr_debug() for that?

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-26 Thread Erez Zadok
In message [EMAIL PROTECTED], Kok, Auke writes:
 Erez Zadok wrote:
  Signed-off-by: Erez Zadok [EMAIL PROTECTED]
  ---
   fs/unionfs/copyup.c |  102 
  +-
   1 files changed, 51 insertions(+), 51 deletions(-)
  
  diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
  index 23ac4c8..e3c5f15 100644
  --- a/fs/unionfs/copyup.c
  +++ b/fs/unionfs/copyup.c
  @@ -36,14 +36,14 @@ static int copyup_xattrs(struct dentry 
  *old_lower_dentry,
   
  /* query the actual size of the xattr list */
  list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
  -   if (list_size = 0) {
  +   if (unlikely(list_size = 0)) {
 
 
 I've been told several times that adding these is almost always bogus - 
 either it
 messes up the CPU branch prediction or the compiler/CPU just does a lot 
 better at
 finding the right way without these hints.

 Adding them as a blanket seems rather strange. Have you got any numbers that 
 this
 really improves performance?
 
 Auke

Auke, that's a good question, but I found it hard to find any info about it.
There's no discussion on it in Documentation/, and very little I could find
elsewhere.  I did see one url explaining what un/likely does precisely, but
no guidelines.  My understanding is that it can improve performance, as long
as it's used carefully (otherwise it may hurt performance).

Background: we used un/likely in Unionfs only sparingly until now.  We
looked at what other filesystems and kernel components do, and it seems that
it varies a lot: some subsystems use it religiously wherever they can, and
some use it just a little here and there.  We looked at what other
filesystems do in particular and tried to follow a similar model under what
cases other filesystems use un/likely.

Recently we've done a full audit of the entire code, and added un/likely
where we felt that the chance of succeeding is 95% or better (e.g., error
conditions that should rarely happen, and such).  So while it looks like
we've added many of those in this series of patches, I can assure you we
didn't just wrap every conditional in an un/likely just for the sake of
using it. :-) There are plenty of conditionals (over 250) left untouched b/c
it didn't make sense to force the branch prediction on them one way or
another.

So my questions are, for everyone, what's the policy on using un/likely?
Any common conventions/wisdom?  I think we need something added to
Documentation/.

Also, Auke, if indeed compilers are [sic] likely to do better than
programmers adding un/likely wrappers, then why do we still support that in
the kernel?  (Working for a company tat produces high-quality compilers, you
may know the answer better.)

Personally I'm not too fond of what those wrappers do the code: they make it
a bit harder to read the code (yet another extra set of parentheses); and
they cause the code to be indented further to the right, which you sometimes
have to split to multiple lines to avoid going over 80 chars.

Cheers,
Erez.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/25] Unionfs: cast page-index loff_t before shifting

2007-09-26 Thread Erez Zadok
In message [EMAIL PROTECTED], Christoph Hellwig writes:
 On Tue, Sep 25, 2007 at 11:09:44PM -0400, Erez Zadok wrote:
  Fixes bugs in number promotion/demotion computation, as per
  http://lkml.org/lkml/2007/9/20/17
 
 It's better to use te page_offset helper as that avoids any confusion
 on where to cast.

Excellent.  Will do.

Erez.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/25] Unionfs: display informational messages only if debug is on

2007-09-26 Thread Erez Zadok
In message [EMAIL PROTECTED], Jan Engelhardt writes:
 
 On Sep 25 2007 23:09, Erez Zadok wrote:
 --- a/fs/unionfs/commonfops.c
 +++ b/fs/unionfs/commonfops.c
 @@ -394,8 +394,8 @@ int unionfs_file_revalidate(struct file *file, bool 
 willwrite)
  if (willwrite  IS_WRITE_FLAG(file-f_flags) 
  !IS_WRITE_FLAG(unionfs_lower_file(file)-f_flags) 
  is_robranch(dentry)) {
 -printk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
 -   dentry-d_name.name);
 +dprintk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
 +dentry-d_name.name);
 
 Don't we have pr_debug() for that?

Jan, what's the policy on people writing their own debugging printk systems.
I've looked at what other file systems do, and found out that it varies a
lot: some use a simple mapping wrapper from printk-dprintk, while others
define complex ways to have debugging levels/bitmaps where users can
selectively decide (say, by a mount option), what debugging output they'd
like to see or not.  Here's a small sampling of what some file systems use:

9p/fid.c:   P9_DPRINTK(P9_DEBUG_VFS, fid %d dentry %s\n,
autofs/inode.c: DPRINTK((autofs: shutting down\n));
lockd/svcproc.c:dprintk(lockd: GRANTED_RES   called\n);
ncpfs/dir.c:DDPRINTK(ncp_lookup_validate: result=%d\n, val);
nfs/client.c:   dprintk(-- nfs_free_client()\n);
nfsd/export.c:  dprintk(found domain %s\n, buf);
partitions/efi.c:   Dprintk(GPT my_lba incorrect: %lld != %lld\n,

Not much standardization there :-) but dprintk does appear to be the more
common use.

I wanted something simple to allow me to not printk something that's just
for informational/debugging purposes, but w/o having to #ifdef the printk in
question, or define a complex debugging-level printk system.

Now, looking at pr_debug (in linux/kernel.h), and indeed some filesystems
(e.g., affs and configfs) use it.  But pr_debug is only active when #define
DEBUG is on (not CONFIG_DEBUG).  I didn't see a config option that enable
DEBUG: is there one?

From other file systems, it appears that the more common convention is for
filesystem XYZ to offer a config option CONFIG_DEBUG_XYZ, which makes its
debugging more self-contained and keeps the config option closer to where
the f/s itself is enabled.

Thanks,
Erez.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/25] Unionfs: display informational messages only if debug is on

2007-09-26 Thread Jan Engelhardt

On Sep 26 2007 10:01, Erez Zadok wrote:
 
 On Sep 25 2007 23:09, Erez Zadok wrote:
 --- a/fs/unionfs/commonfops.c
 +++ b/fs/unionfs/commonfops.c
 @@ -394,8 +394,8 @@ int unionfs_file_revalidate(struct file *file, bool 
 willwrite)
 if (willwrite  IS_WRITE_FLAG(file-f_flags) 
 !IS_WRITE_FLAG(unionfs_lower_file(file)-f_flags) 
 is_robranch(dentry)) {
 -   printk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
 -  dentry-d_name.name);
 +   dprintk(KERN_DEBUG unionfs: do delay copyup of \%s\\n,
 +   dentry-d_name.name);
 
 Don't we have pr_debug() for that?

Jan, what's the policy on people writing their own debugging printk systems.

Generally, the rule is don't (re)invent another debug system

I've looked at what other file systems do, and found out that it varies a lot:

Oh that's either old code or code that has not been reviewed properly.
Fact is that I have been made aware of pr_debug() often enough
on netfilter-devel, and it looks like a good idea.

I wanted something simple to allow me to not printk something that's just
for informational/debugging purposes, but w/o having to #ifdef the printk in
question, or define a complex debugging-level printk system.

Surprise, pr_debug() is just that all nicely wrapped up.
Want debug? Do it like this.

#ifdef CONFIG_UNIONFS_DEBUG
#   define DEBUG 1
#endif

and pr_debug() works magic.

Now, looking at pr_debug (in linux/kernel.h), and indeed some filesystems
(e.g., affs and configfs) use it.  But pr_debug is only active when #define
DEBUG is on (not CONFIG_DEBUG).  I didn't see a config option that enable
DEBUG: is there one?

No I do not think so. But when you grep for dprintk, you should also
grep for DEBUG, to be fair :p

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-26 Thread Kyle Moffett

On Sep 26, 2007, at 09:40:20, Erez Zadok wrote:

In message [EMAIL PROTECTED], Kok, Auke writes:
I've been told several times that adding these is almost always  
bogus - either it messes up the CPU branch prediction or the  
compiler/CPU just does a lot better at finding the right way  
without these hints.


Adding them as a blanket seems rather strange. Have you got any  
numbers that this really improves performance?


Auke, that's a good question, but I found it hard to find any info  
about it.  There's no discussion on it in Documentation/, and very  
little I could find elsewhere.  I did see one url explaining what  
un/likely does precisely, but no guidelines.  My understanding is  
that it can improve performance, as long as it's used carefully  
(otherwise it may hurt performance).


Hmm, even still I agree with Auke, you probably use it too much.


Recently we've done a full audit of the entire code, and added un/ 
likely where we felt that the chance of succeeding is 95% or better  
(e.g., error conditions that should rarely happen, and such).


Actually due to the performance penalty on some systems I think you  
only want to use it if the chance of succeeding is 99% or better, as  
the benefit if predicted is a cycle or two and the harm if  
mispredicted can be more than 50 cycles, depending on the CPU.  You  
should also remember than in filesystems many failures are  
triggered by things like the ld.so library searches, where it  
literally calls access() 20 different times on various possible paths  
for library files, failing the first 19.  It does this once for each  
necessary library.


Typically you only want to add unlikely() or likely() for about 2  
reasons:
  (A)  It's a hot path and the unlikely case is just going to burn a  
bunch of CPU anyways
  (B)  It really is extremely unlikely that it fails (Think physical  
hardware failure)


Anything else is just bogus.

Cheers,
Kyle Moffett

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Upgrading datastructures between different filesystem versions

2007-09-26 Thread Andreas Dilger
On Sep 25, 2007  23:40 -0600, Jim Cromie wrote:
 kernel learner wrote:
 ext3 filesystem has 32-bit block address and ext4 filesystem has 
 48-bit block address. If a user installs ext4, how will the file 
 system handle already existing block with 32 bit values? 

 Why should it ? thats what ext3 is for.

Bzzt. Wrong answer.  The ext4 code will be able to read existing ext3
(and ext2) filesystems just fine.  Otherwise there wouldn't be much
of an upgrade path.

 Id expect ext4 drivers handling ext3 filesystems is a distant, secondary 
 goal to getting a fast, reliable, clean 48bit filesystem working.

Far from the truth.  One of the main goals of ext4 is that it is a drop-in
replacement for ext3.  The code is mostly incremental improvements over
ext3, and that IS one of the reasons that it is reliable.  We didn't throw
away 10 years of bug fixes in the ext2/ext3 code when adding the ext4
features.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-26 Thread Jan Engelhardt

On Sep 26 2007 11:43, Erez Zadok wrote:

*That's* the information I was looking for, Kyle: what's the estimated
probability I should be using as my guideline.  I used 95% (20/1 ratio), and

;-)

19:1 = 95:5 = 95% = ratio=0.95  != 20.0 (=20/1)

you're telling me I should use 99% (100/1 ratio).  The difference between

99:1 = 99% = ratio=0.99  != 100.0 (=100/1)

the number of cycles saved/added is very compelling.  Given that I certainly
agree with you that I'm using un/likely too much.  I'll re-evaluate and
update my patch series then.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-26 Thread Erez Zadok
In message [EMAIL PROTECTED], Jan Engelhardt writes:
 
 On Sep 26 2007 11:43, Erez Zadok wrote:
 
 *That's* the information I was looking for, Kyle: what's the estimated
 probability I should be using as my guideline.  I used 95% (20/1 ratio), and
 
 ;-)
 
 19:1 = 95:5 = 95% = ratio=0.95  != 20.0 (=20/1)
 
 you're telling me I should use 99% (100/1 ratio).  The difference between
 
 99:1 = 99% = ratio=0.99  != 100.0 (=100/1)
 
 the number of cycles saved/added is very compelling.  Given that I certainly
 agree with you that I'm using un/likely too much.  I'll re-evaluate and
 update my patch series then.

Yeah, close enough. :-)

The important issue is that I'm probably using about five times too many
un/likely wrappers.

Erez.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 0/4] 64k pagesize/blocksize fixes

2007-09-26 Thread Mingming Cao
On Tue, 2007-09-25 at 16:30 -0700, Christoph Lameter wrote:
 Attached the fixes necessary to support 64k pagesize/blocksize. I think these 
 are useful
 independent of the large blocksize patchset since there are architectures 
 that support
 64k page size and that could use these large buffer sizes without the large 
 buffersize
 patchset.
 
 Are these patches in the right shape to be merged? I rediffed these against 
 2.6.32-rc8-mm1.
 
 I had to fix some things in the second patch (ext2) that may need some review 
 since the
 way that commits work changed.
 

Thanks,  As Andreas mentioned in another email there is a better way to
deal with rec_len issue on 64k block size. I will get those patches
merged in ext4 patch queue. I think the updated patches could get
merged.


Mingming

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Threaded e2fsck proof of concept

2007-09-26 Thread Valerie Henson
The below patch is a proof of concept that e2fsck can get a
performance improvement on file systems with more than one disk
underneath.  On my test case, a 500GB file system with 150GB in use
and 10+1 RAID underneath, elapsed time is reduced by 40-50%.  I see no
performance improvement in the single disk case.  Only the reading of
inode tables and indirect blocks in pass 1 is multi-threaded; most
likely multithreading passes 2 and 5 will help too.  The actual data
processing is still all single-threaded, which is convenient.

Designing multithreaded readahead for the long term is another
question.  The Lustre folks are working on a sys_readahead() based
patch.  True aio() is the obvious solution, but won't work for older
kernels.  Pthreads works for all kernels but is clumsy.  Coming up
with a design for readahead that allows these different
implementations is probably a good idea.

Finally, if you are planning on testing these patches:

* Use -n!  You are crazy to let this write to your file system.
* Use about 2 * number_disks threads. (Doesn't work without -t n of
  some sort.)
* The striping logic is probably bogus.  Try something like -A 10.

Thanks to EMC for making this patch possible.  Share and enjoy!

-VAL

--- e2fsprogs-1.40.2.orig/e2fsck/Makefile.in
+++ e2fsprogs-1.40.2/e2fsck/Makefile.in
@@ -119,16 +119,16 @@ e2fsck: [EMAIL PROTECTED]@
 e2fsck.static: $(OBJS)  $(STATIC_DEPLIBS)
@echo  LD $@
@$(LD) $(ALL_LDFLAGS) $(LDFLAG_STATIC) -o e2fsck.static $(OBJS) \
-   $(STATIC_LIBS) 
+   $(STATIC_LIBS) -lpthread
 
 e2fsck.shared: $(OBJS)  $(DEPLIBS)
@echo  LD $@
-   @$(LD) $(ALL_LDFLAGS) -o e2fsck.shared $(OBJS) $(LIBS) 
+   @$(LD) $(ALL_LDFLAGS) -o e2fsck.shared $(OBJS) $(LIBS)  -lpthread
 
 e2fsck.profiled: $(PROFILED_OBJS)  $(PROFILED_DEPLIBS)
@echo  LD $@
@$(LD) $(ALL_LDFLAGS) -g -pg -o e2fsck.profiled $(PROFILED_OBJS) \
-   $(PROFILED_LIBS) 
+   $(PROFILED_LIBS)  -lpthread
 
 tst_refcount: ea_refcount.c
@echo  LD $@
--- e2fsprogs-1.40.2.orig/e2fsck/e2fsck.h
+++ e2fsprogs-1.40.2/e2fsck/e2fsck.h
@@ -25,6 +25,7 @@
 #ifdef HAVE_SETJMP_H
 #include setjmp.h
 #endif
+#include pthread.h
 
 #if EXT2_FLAT_INCLUDES
 #include ext2_fs.h
@@ -334,6 +335,17 @@ struct e2fsck_struct {
profile_t   profile;
 
/*
+* Multithreaded readahead variables
+*/
+   unsigned int read_threads;
+   unsigned int stripe_size;
+   struct readahead_state *readahead;
+   /* Used to signal the main thread when a bg is ready */
+   pthread_mutex_t mutex_ready;
+   pthread_cond_t buffer_ready;
+   /* Have to count groups left at the ctx level, not scan level */
+   dgrp_t groups_left;
+   /*
 * For the use of callers of the e2fsck functions; not used by
 * e2fsck functions themselves.
 */
--- e2fsprogs-1.40.2.orig/e2fsck/pass1.c
+++ e2fsprogs-1.40.2/e2fsck/pass1.c
@@ -96,9 +96,64 @@ struct process_inode_block {
struct ext2_inode inode;
 };
 
-struct scan_callback_struct {
+/*
+ * XXX Complete and total interface violation
+ *
+ * We need to skip around between block groups based on when they're
+ * done with readahead, rather than processing them sequentially.
+ * Probably just using the fs-get_blocks hook or something similar
+ * will work and will cut a few hundred lines of code.  For now, mess
+ * around with libext2fs's private structures.
+ *
+ */
+
+struct ext2_struct_inode_scan {
+   errcode_t   magic;
+   ext2_filsys fs;
+   ext2_ino_t  current_inode;
+   blk_t   current_block;
+   dgrp_t  current_group;
+   ext2_ino_t  inodes_left;
+   blk_t   blocks_left;
+   dgrp_t  groups_left;
+   blk_t   inode_buffer_blocks;
+   char *  inode_buffer;
+   int inode_size;
+   char *  ptr;
+   int bytes_left;
+   char*temp_buffer;
+   errcode_t   (*done_group)(ext2_filsys fs,
+ ext2_inode_scan scan,
+ dgrp_t group,
+ void * priv_data);
+   void *  done_group_data;
+   int bad_block_ptr;
+   int scan_flags;
+   int reserved[6];
+};
+
+/*
+ * Per thread readahead state.
+ */
+
+struct readahead_state {
+   ext2_filsys fs;
+   ext2_inode_scan scan;
e2fsck_tctx;
-   char*block_buf;
+   pthread_t   pthread;
+   unsigned intthread;
+   int bg_readahead_done;
+   pthread_mutex_t mutex;
+   pthread_cond_t  pause;
+   blk_t   *ind_blks_queue;

Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-26 Thread Adrian Bunk
On Wed, Sep 26, 2007 at 09:40:20AM -0400, Erez Zadok wrote:
...
 Also, Auke, if indeed compilers are [sic] likely to do better than
 programmers adding un/likely wrappers, then why do we still support that in
 the kernel?  (Working for a company tat produces high-quality compilers, you
 may know the answer better.)
 
 Personally I'm not too fond of what those wrappers do the code: they make it
 a bit harder to read the code (yet another extra set of parentheses); and
 they cause the code to be indented further to the right, which you sometimes
 have to split to multiple lines to avoid going over 80 chars.

There are some places in generic code where it makes sense, e.g.:
  #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while(0)

If you run into a BUG() it's anyway game over.

And there are some rare hotpaths in the kernel where it might make 
sense, and many other places where the likely/unlikely usage that might 
be present doesn't make sense.

Unless you know you need it you simply shouldn't use likely/unlikely.

 Cheers,
 Erez.

cu
Adrian

-- 

   Is there not promise of rain? Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   Only a promise, Lao Er said.
   Pearl S. Buck - Dragon Seed

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 13/25] Unionfs: add un/likely conditionals on dir ops

2007-09-26 Thread roel
Erez Zadok wrote:

 @@ -194,7 +194,7 @@ int check_empty(struct dentry *dentry, struct 
 unionfs_dir_state **namelist)
  
   BUG_ON(!S_ISDIR(dentry-d_inode-i_mode));
  
 - if ((err = unionfs_partial_lookup(dentry)))
 + if (unlikely((err = unionfs_partial_lookup(dentry
   goto out;
  
   bstart = dbstart(dentry);

Is it bad to leave this assignment within the unlikely()?
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/25] Unionfs: add un/likely conditionals on debug ops

2007-09-26 Thread roel
Erez Zadok wrote:
 Signed-off-by: Erez Zadok [EMAIL PROTECTED]
 ---
  fs/unionfs/debug.c |  108 +++
  1 files changed, 57 insertions(+), 51 deletions(-)
 
 diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
 index 9546a41..09b52ce 100644
 --- a/fs/unionfs/debug.c
 +++ b/fs/unionfs/debug.c
 @@ -56,19 +56,19 @@ void __unionfs_check_inode(const struct inode *inode,
   sb = inode-i_sb;
   istart = ibstart(inode);
   iend = ibend(inode);
 - if (istart  iend) {
 + if (unlikely(istart  iend)) {
   PRINT_CALLER(fname, fxn, line);
   printk( Ci0: inode=%p istart/end=%d:%d\n,
  inode, istart, iend);
   }
 - if ((istart == -1  iend != -1) ||
 - (istart != -1  iend == -1)) {
 + if (unlikely((istart == -1  iend != -1) ||
 +  (istart != -1  iend == -1))) {

you could also do
if (unlikely((istart == -1 || iend == -1)  istart != iend)) {

[...]

 @@ -138,8 +139,8 @@ void __unionfs_check_dentry(const struct dentry *dentry,
   dend = dbend(dentry);
   BUG_ON(dstart  dend);
  
 - if ((dstart == -1  dend != -1) ||
 - (dstart != -1  dend == -1)) {
 + if (unlikely((dstart == -1  dend != -1) ||
 +  (dstart != -1  dend == -1))) {

[...]

the same here

 @@ -223,30 +224,30 @@ void __unionfs_check_dentry(const struct dentry *dentry,
   istart = ibstart(inode);
   iend = ibend(inode);
   BUG_ON(istart  iend);
 - if ((istart == -1  iend != -1) ||
 - (istart != -1  iend == -1)) {
 + if (unlikely((istart == -1  iend != -1) ||
 +  (istart != -1  iend == -1))) {

and here

[...]

 @@ -350,30 +354,30 @@ void __unionfs_check_file(const struct file *file,
   fend = fbend(file);
   BUG_ON(fstart  fend);
  
 - if ((fstart == -1  fend != -1) ||
 - (fstart != -1  fend == -1)) {
 + if (unlikely((fstart == -1  fend != -1) ||
 +  (fstart != -1  fend == -1))) {

and here

[...]
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Upgrading datastructures between different filesystem versions

2007-09-26 Thread Sachin Gaikwad
On 9/26/07, Andreas Dilger [EMAIL PROTECTED] wrote:
 On Sep 25, 2007  23:40 -0600, Jim Cromie wrote:
  kernel learner wrote:
  ext3 filesystem has 32-bit block address and ext4 filesystem has
  48-bit block address. If a user installs ext4, how will the file
  system handle already existing block with 32 bit values?
 
  Why should it ? thats what ext3 is for.

 Bzzt. Wrong answer.  The ext4 code will be able to read existing ext3
 (and ext2) filesystems just fine.  Otherwise there wouldn't be much
 of an upgrade path.

  Id expect ext4 drivers handling ext3 filesystems is a distant, secondary
  goal to getting a fast, reliable, clean 48bit filesystem working.

 Far from the truth.  One of the main goals of ext4 is that it is a drop-in
 replacement for ext3.  The code is mostly incremental improvements over
 ext3, and that IS one of the reasons that it is reliable.  We didn't throw
 away 10 years of bug fixes in the ext2/ext3 code when adding the ext4
 features.

 Cheers, Andreas
 --
 Andreas Dilger
 Principal Software Engineer
 Cluster File Systems, Inc.


 --
 To unsubscribe from this list: send an email with
 unsubscribe kernelnewbies to [EMAIL PROTECTED]
 Please read the FAQ at http://kernelnewbies.org/FAQ



Is it not the case that VFS takes care of all filesystems available ?
VFS will see if a particular file belongs to ext3 or ext4 and call
that FS's drivers to access information ??

Correct me if I am wrong. I am a newbie!

Sachin
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Upgrading datastructures between different filesystem versions

2007-09-26 Thread Theodore Tso
On Wed, Sep 26, 2007 at 06:29:19PM -0500, Sachin Gaikwad wrote:
 Is it not the case that VFS takes care of all filesystems available ?
 VFS will see if a particular file belongs to ext3 or ext4 and call
 that FS's drivers to access information ??

No, it doesn't quite work that way.  You have to mount a particular
partition using a specific filesystem (i.e., ntfs, vfat, ext2, ext3,
ext4, etc.).  A partition formatted using ext2 can be mounted using
the ext2, ext3, or ext4 filesystem driver.  You can explicitly specify
what filesystem should be used to mount a particuar partition using
the -t option to the mount program, or by specifying a particular
filesystem type in the /etc/fstab file.

- Ted
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch]anon_inodes.c: fix error check in anon_inode_getfd

2007-09-26 Thread Yan Zheng
Hello,

igrab return NULL on error.

Signed-off-by: Yan Zheng[EMAIL PROTECTED]
---
diff -ur linux-2.6.23-rc8/fs/anon_inodes.c linux/fs/anon_inodes.c
--- linux-2.6.23-rc8/fs/anon_inodes.c   2007-09-27 10:05:07.0 +0800
+++ linux/fs/anon_inodes.c  2007-09-27 10:18:26.0 +0800
@@ -87,8 +87,8 @@
return -ENFILE;

inode = igrab(anon_inode_inode);
-   if (IS_ERR(inode)) {
-   error = PTR_ERR(inode);
+   if (!inode) {
+   error = -ENOENT;
goto err_put_filp;
}
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch]anon_inodes.c: fix error check in anon_inode_getfd

2007-09-26 Thread Andrew Morton
On Thu, 27 Sep 2007 10:30:50 +0800 Yan Zheng [EMAIL PROTECTED] wrote:

 Hello,
 
 igrab return NULL on error.
 
 Signed-off-by: Yan Zheng[EMAIL PROTECTED]
 ---
 diff -ur linux-2.6.23-rc8/fs/anon_inodes.c linux/fs/anon_inodes.c
 --- linux-2.6.23-rc8/fs/anon_inodes.c 2007-09-27 10:05:07.0 +0800
 +++ linux/fs/anon_inodes.c2007-09-27 10:18:26.0 +0800
 @@ -87,8 +87,8 @@
   return -ENFILE;
 
   inode = igrab(anon_inode_inode);
 - if (IS_ERR(inode)) {
 - error = PTR_ERR(inode);
 + if (!inode) {
 + error = -ENOENT;
   goto err_put_filp;
   }

hm, does that code actually need to exist?  igrab() is pretty expensive and
that fs is permanently mounted anyway...

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html