Re: How git affects kernel.org performance

2007-01-08 Thread Jeff Garzik

Theodore Tso wrote:

The fastest and probably most important thing to add is some readahead
smarts to directories --- both to the htree and non-htree cases.  If
you're using some kind of b-tree structure, such as XFS does for
directories, preallocation doesn't help you much.  Delayed allocation
can save you if your delayed allocator knows how to structure disk
blocks so that a btree-traversal is efficient, but I'm guessing the
biggest reason why we are losing is because we don't have sufficient
readahead.  This also has the advantage that it will help without
needing to do a backup/restore to improve layout.



Something I just thought of:  ATA and SCSI hard disks do their own 
read-ahead.  Seeking all over the place to pick up bits of directory 
will hurt even more with the disk reading and throwing away data (albeit 
in its internal elevator and cache).
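
To put a rough number on the seek cost, here is a minimal sketch in
Python.  It is a toy model, not a measurement: head movement is
approximated as the absolute difference between block numbers, and the
block numbers in dir_blocks are made up for illustration.  It compares
fetching a directory's blocks one at a time in logical order against
issuing them as one batch in ascending order, which is roughly what a
readahead pass followed by an elevator sort would allow.

```python
def head_travel(blocks, start=0):
    """Total distance the head moves when visiting blocks in the given order.

    Assumption: distance is modelled as the difference in block numbers,
    which ignores rotational latency and real disk geometry.
    """
    travel, pos = 0, start
    for b in blocks:
        travel += abs(b - pos)
        pos = b
    return travel


# Hypothetical physical block numbers of one directory, in logical order.
dir_blocks = [9000, 120, 45000, 130, 8990, 44990]

# Read each block only when the previous one has been parsed.
one_at_a_time = head_travel(dir_blocks)

# Issue all reads up front so they can be serviced in ascending order.
batched = head_travel(sorted(dir_blocks))

print(one_at_a_time, batched)
```

In this model the batched order can never travel further than the
last block number, while the one-at-a-time order pays for every jump
back and forth.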


Jeff


-
To unsubscribe from this list: send the line unsubscribe linux-ext4 in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: How git affects kernel.org performance

2007-01-08 Thread Theodore Tso
On Mon, Jan 08, 2007 at 02:41:47PM +0100, Johannes Stezenbach wrote:
 
 Would e2fsck -D help? What kind of optimization
 does it perform?

It will help a little; e2fsck -D compresses the logical view of the
directory, but it doesn't optimize the physical layout on disk at all,
and of course, it won't help with the lack of readahead logic.  It's
possible to improve how e2fsck -D works; at the moment, it's not
trying to make the directory contiguous on disk.  What it should
probably do is to pull a list of all of the blocks used by the
directory, sort them, and then try to see if it can improve on the
list by allocating some new blocks that would make the directory more
contiguous on disk.  I suspect any improvements that would be seen by
doing this would be second order effects at most, though.
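
The idea above can be sketched in a few lines of Python.  This is only
an illustration under stated assumptions, not how e2fsck works: the
function names (extents, find_free_run, propose_layout) are
hypothetical, the free-block set stands in for the real block bitmaps,
and the sketch only accepts a fully contiguous run of free blocks,
which is stricter than "more contiguous".  It never touches blocks
that belong to other files.

```python
def extents(blocks):
    """Group physical block numbers into sorted (start, length) runs."""
    runs = []
    for b in sorted(blocks):
        if runs and b == runs[-1][0] + runs[-1][1]:
            # Block directly follows the previous run, so extend it.
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def find_free_run(free, length):
    """Return the start of the first run of `length` consecutive free
    blocks, or None if no such run exists."""
    run_start, run_len, prev = None, 0, None
    for b in sorted(free):
        if prev is not None and b == prev + 1:
            run_len += 1
        else:
            run_start, run_len = b, 1
        if run_len >= length:
            return run_start
        prev = b
    return None


def propose_layout(dir_blocks, free):
    """Suggest a layout for the directory as a list of (start, length) runs.

    Keeps the current layout if it is already one run or if no
    contiguous free run is large enough.
    """
    current = extents(dir_blocks)
    if len(current) <= 1:
        return current
    start = find_free_run(free, len(dir_blocks))
    if start is None:
        return current
    return [(start, len(dir_blocks))]


# Hypothetical directory scattered over three runs.
dir_blocks = [500, 12, 13, 800, 14]
print(extents(dir_blocks))
print(propose_layout(dir_blocks, set(range(2000, 2010))))
```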

- Ted


Re: How git affects kernel.org performance

2007-01-08 Thread Pavel Machek
Hi!

  Would e2fsck -D help? What kind of optimization
  does it perform?
 
 It will help a little; e2fsck -D compresses the logical view of the
 directory, but it doesn't optimize the physical layout on disk at all,
 and of course, it won't help with the lack of readahead logic.  It's
 possible to improve how e2fsck -D works; at the moment, it's not
 trying to make the directory contiguous on disk.  What it should
 probably do is to pull a list of all of the blocks used by the
 directory, sort them, and then try to see if it can improve on the
 list by allocating some new blocks that would make the directory more
 contiguous on disk.  I suspect any improvements that would be seen by
 doing this would be second order effects at most, though.

...sounds like a job for e2defrag, not e2fsck...
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: How git affects kernel.org performance

2007-01-08 Thread Johannes Stezenbach
On Mon, Jan 08, 2007 at 07:58:19AM -0500, Theodore Tso wrote:
 
 The fastest and probably most important thing to add is some readahead
 smarts to directories --- both to the htree and non-htree cases.  If
 you're using some kind of b-tree structure, such as XFS does for
 directories, preallocation doesn't help you much.  Delayed allocation
 can save you if your delayed allocator knows how to structure disk
 blocks so that a btree-traversal is efficient, but I'm guessing the
 biggest reason why we are losing is because we don't have sufficient
 readahead.  This also has the advantage that it will help without
 needing to do a backup/restore to improve layout.

Would e2fsck -D help? What kind of optimization
does it perform?


Thanks,
Johannes


Re: How git affects kernel.org performance

2007-01-08 Thread Theodore Tso
On Mon, Jan 08, 2007 at 02:59:52PM +0100, Pavel Machek wrote:
 Hi!
 
   Would e2fsck -D help? What kind of optimization
   does it perform?
  
  It will help a little; e2fsck -D compresses the logical view of the
  directory, but it doesn't optimize the physical layout on disk at all,
  and of course, it won't help with the lack of readahead logic.  It's
  possible to improve how e2fsck -D works; at the moment, it's not
  trying to make the directory contiguous on disk.  What it should
  probably do is to pull a list of all of the blocks used by the
  directory, sort them, and then try to see if it can improve on the
  list by allocating some new blocks that would make the directory more
  contiguous on disk.  I suspect any improvements that would be seen by
  doing this would be second order effects at most, though.
 
 ...sounds like a job for e2defrag, not e2fsck...

I wasn't proposing to move other data blocks around in order to make
the directory contiguous, just a quick and dirty attempt to make
things better.  But yes, in order to really fix layout issues you
would have to do a full defrag, and it's probably more important that
we try to fix things so that defragmentation runs aren't necessary in
the first place.

- Ted



Re: How git affects kernel.org performance

2007-01-08 Thread Jeremy Higdon
On Mon, Jan 08, 2007 at 05:09:34PM -0800, Paul Jackson wrote:
 Jeff wrote:
  Something I just thought of:  ATA and SCSI hard disks do their own
  read-ahead.
 
 Probably this is wishful thinking on my part, but I would have hoped
 that most of the read-ahead they did was for stuff that happened to be
 on the cylinder they were reading anyway.  So long as their read-ahead
 doesn't cause much extra or delayed disk head motion, what does it
 matter?


And they usually won't read ahead if there is another command to
process, though they can be set up to read unrequested data in
spite of outstanding commands.

When they are reading ahead, they'll only fetch LBAs beyond the last
request until a buffer fills or the readahead gets interrupted.
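
A toy model of that behaviour, as a hedged Python sketch.  The
assumptions are mine, not a description of any real firmware: every
request is a single block, the drive is idle long enough after each
request to fill its whole buffer, the buffer holds buffer_blocks LBAs
(8 is an arbitrary choice), and a new request replaces whatever was
buffered before.  The function name readahead_hits is hypothetical.

```python
def readahead_hits(requests, buffer_blocks=8):
    """Count requests served from a toy drive's readahead buffer.

    After a single-block read at LBA x, the toy drive prefetches
    LBAs x+1 .. x+buffer_blocks and forgets its previous buffer.
    """
    hits = 0
    buf_lo = buf_hi = None  # inclusive LBA range currently buffered
    for lba in requests:
        if buf_lo is not None and buf_lo <= lba <= buf_hi:
            hits += 1
        # Readahead only covers LBAs beyond the request just served.
        buf_lo, buf_hi = lba + 1, lba + buffer_blocks
    return hits


sequential = [100, 101, 102, 103, 104]
backward = [104, 103, 102, 101, 100]
scattered = [9000, 120, 45000, 130, 8990]  # made-up directory blocks

for name, reqs in [("sequential", sequential),
                   ("backward", backward),
                   ("scattered", scattered)]:
    print(name, readahead_hits(reqs))
```

In this model only the forward sequential pattern benefits; a backward
or scattered walk through directory blocks gets nothing from the
drive's buffer.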

jeremy