On Jan 23, 2008  14:07 -0800, Andrew Morton wrote:
> > +#define mb_correct_addr_and_bit(bit, addr)         \
> > +{                                                  \
> > +   bit += ((unsigned long) addr & 3UL) << 3;       \
> > +   addr = (void *) ((unsigned long) addr & ~3UL);  \
> > +}
> 
> Why do these exist?

They seem to be a holdover from when mballoc stored the buddy bitmaps
on disk.  That no longer happens (to avoid bitmap vs. buddy consistency
problems), so I suspect they can be removed.

I can't comment on many of the other issues because Alex wrote most
of the code.

> Gosh what a lot of code.  Is it faster?

Yes, and also importantly it uses a lot less CPU to do a given amount
of allocation, which is critical in our environments where there is
very high disk bandwidth on a single node and CPU becomes the limiting
factor of the IO speed.  This of course also helps any write-intensive
environment where the CPU is doing something "useful".

Some older test results include:
https://ols2006.108.redhat.com/2007/Reprints/mathur-Reprint.pdf (Section 7)

In the linux-ext4 thread "compilebench numbers for ext4":
http://www.mail-archive.com/[EMAIL PROTECTED]/msg03834.html

http://oss.oracle.com/~mason/compilebench/ext4/ext-create-compare.png
http://oss.oracle.com/~mason/compilebench/ext4/ext-compile-compare.png
http://oss.oracle.com/~mason/compilebench/ext4/ext-read-compare.png
http://oss.oracle.com/~mason/compilebench/ext4/ext-rm-compare.png

note the ext-read-compare.png graph shows lower read performance, but
a couple of bugs in mballoc were since fixed to have ext4 allocate more
contiguous extents.

In the old linux-ext4 thread "[RFC] delayed allocation testing on node zefir"
http://www.mail-archive.com/[EMAIL PROTECTED]/msg00587.html

        : dd2048rw                             
        : REAL   UTIME  STIME  READ    WRITTEN DETAILS
EXT3    : 58.46  23     1491   2572    2097292 17 extents
EXT4    : 44.56  19     1018   12      2097244 19 extents
REISERFS: 56.80  26     1370   2952    2097336 457 extents
JFS     : 45.77  22     984    0       2097216 1 extents
XFS     : 50.97  20     1394   0       2100825 7 extents

        : kernuntar                            
        : REAL   UTIME  STIME  READ    WRITTEN DETAILS
EXT3    : 56.99  5037   651    68      252016  
EXT4    : 55.03  5034   553    36      249884  
REISERFS: 52.55  4996   854    64      238068  
JFS     : 70.15  5057   630    496     288116  
XFS     : 72.84  5052   953    132     316798  

        : kernstat                             
        : REAL   UTIME  STIME  READ    WRITTEN DETAILS
EXT3    : 2.83   8      15     5892    0       
EXT4    : 0.51   9      10     5892    0       
REISERFS: 0.81   7      49     2696    0       
JFS     : 6.19   11     49     12552   0       
XFS     : 2.09   9      61     6504    0       

        : kerncat                              
        : REAL   UTIME  STIME  READ    WRITTEN DETAILS
EXT3    : 9.48   25     213    241624  0       
EXT4    : 6.29   27     197    238560  0       
REISERFS: 14.69  33     230    234744  0       
JFS     : 23.51  23     231    244596  0       
XFS     : 18.24  36     254    238548  0       

        : kernrm                               
        : REAL   UTIME  STIME  READ    WRITTEN DETAILS
EXT3    : 4.82   4      108    9628    4672    
EXT4    : 1.61   5      110    6536    4632    
REISERFS: 3.15   8      276    2768    236     
JFS     : 33.90  7      168    14400   33048   
XFS     : 20.03  8      296    6632    86160   


Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to