Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-12 Thread Abhishek Rai
Thanks for the great feedback, Daniel! Following this email, I'll be sending out two separate emails with the actual patches, one against the latest stable kernel and one against the latest mm patch, using the format you suggested. Sorry about the tabs and spaces thing, I've fixed my email

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Daniel Phillips
On Friday 11 January 2008 16:04, Andrew Morton wrote:
> It needs to be reviewed. In exhaustive detail. Few people can do
> that and fewer are inclined to do so.
Agreed, there just have to be a few bugs in this many lines of code. I spent a couple of hours going through it, not really looking at

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Andrew Morton
On Fri, 11 Jan 2008 09:05:17 -0800 Daniel Phillips <[EMAIL PROTECTED]> wrote:
> On Thursday 10 January 2008 13:17, Abhishek Rai wrote:
> > Benchmark 5: fsck
> > Description: Prepare a newly formatted 400GB disk as follows: create
> > 200 files of 0.5GB each, 100 files of 1GB each, 40 files of

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Daniel Phillips
On Thursday 10 January 2008 13:17, Abhishek Rai wrote:
> Benchmark 5: fsck
> Description: Prepare a newly formatted 400GB disk as follows: create
> 200 files of 0.5GB each, 100 files of 1GB each, 40 files of 2.5GB
> each, and 10 files of 10GB each. fsck command line: fsck -f -n
> 1. vanilla:

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Abhishek Rai
That will surely help sequential read performance for large unfragmented files and we have considered it before. There are two main reasons why we want the data blocks and the corresponding indirect blocks to share the same block group. 1. When a block group runs out of a certain type of blocks

Re: [PATCH] Clustering indirect blocks in Ext3

2008-01-11 Thread Bodo Eggert
Abhishek Rai <[EMAIL PROTECTED]> wrote:
> Putting metacluster at the end of the block group gives slightly
> inferior sequential read throughput compared to putting it in the
> beginning or the middle, but the difference is very tiny and exists
> only for large files that span multiple block

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-20 Thread John Stoffel
Abhishek> It took me some time to get compilebench working due to the
Abhishek> known issue with drop_caches due to circular lock dependency
Abhishek> between j_list_lock and inode_lock (compilebench triggers
Abhishek> drop_caches quite frequently). Here are the results for
Abhishek> compilebench

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-19 Thread Kyungmin Park
Hi,
> > Setup: 4 cpu, 8GB RAM, 400GB disk.
> >
> > Average vanilla results
> > ==
> > intial create total runs 30 avg 46.49 MB/s (user 1.12s sys 2.25s)
> > create total runs 5 avg 12.90 MB/s (user 1.08s sys 1.97s)

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-18 Thread Matt Mackall
On Sun, Nov 18, 2007 at 07:52:36AM -0800, Abhishek Rai wrote:
> Thanks for the suggestion Matt.
>
> It took me some time to get compilebench working due to the known
> issue with drop_caches due to circular lock dependency between
> j_list_lock and inode_lock (compilebench triggers drop_caches

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-18 Thread Abhishek Rai
Thanks for the suggestion Matt. It took me some time to get compilebench working due to the known issue with drop_caches due to circular lock dependency between j_list_lock and inode_lock (compilebench triggers drop_caches quite frequently). Here are the results for compilebench run with options

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-17 Thread Abhishek Rai
Thanks for the comments.
On Nov 16, 2007 6:58 PM, Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Fri, Nov 16, 2007 at 04:25:38PM -0800, Abhishek Rai wrote:
> > Ideally, this is how things should be done, but I feel in practice, it
> > will make little difference. To summarize, the difference

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-16 Thread Theodore Tso
On Fri, Nov 16, 2007 at 04:25:38PM -0800, Abhishek Rai wrote:
> Ideally, this is how things should be done, but I feel in practice, it
> will make little difference. To summarize, the difference between
> my approach and above approach is that when out of free blocks in a
> block group while

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-16 Thread Abhishek Rai
Thanks for the great feedback.
On Nov 16, 2007 1:11 PM, Theodore Tso <[EMAIL PROTECTED]> wrote:
> On Thu, Nov 15, 2007 at 11:02:19PM -0800, Andrew Morton wrote:
> > What happens when it fills up but we still have room for more data blocks
> > in that blockgroup?
>
> It does fall back, but it does

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-16 Thread Abhishek Rai
On Nov 15, 2007 11:02 PM, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
> On Thu, 15 Nov 2007 21:02:46 -0800 "Abhishek Rai" <[EMAIL PROTECTED]> wrote:
> > One solution to this problem implemented in this patch is to cluster
> > indirect blocks together on a per group basis, similar to how inodes

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-16 Thread Theodore Tso
On Thu, Nov 15, 2007 at 11:02:19PM -0800, Andrew Morton wrote:
>
> Presumably it starts around 50% of the way into the blockgroup?
Yes.
> How do you decide its size?
It's fixed at 1/128th (0.78%) of the blockgroup.
> What happens when it fills up but we still have room for more data blocks
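
To make the two figures in this reply concrete (a start about 50% of the way into the block group and a fixed size of 1/128th of the group), here is a rough arithmetic sketch. It is not taken from the patch: the function name `metacluster_extent` is hypothetical, the exact start offset is assumed to be the midpoint, and the group size of 32768 blocks assumes 4 KiB blocks with one bitmap block per group.

```python
from typing import Tuple


def metacluster_extent(blocks_per_group: int) -> Tuple[int, int]:
    """Return (first_block, length) of a hypothetical metacluster region,
    expressed as block offsets relative to the start of the block group."""
    length = blocks_per_group // 128  # "fixed at 1/128th" of the group
    first = blocks_per_group // 2     # assumption: region begins at the midpoint
    return first, length


# Assumption: 4 KiB blocks, so one block bitmap covers 8 * 4096 blocks.
BLOCKS_PER_GROUP = 8 * 4096

first, length = metacluster_extent(BLOCKS_PER_GROUP)
share = 100 * length / BLOCKS_PER_GROUP
print(f"metacluster: blocks {first}..{first + length - 1} "
      f"({length} blocks, {share:.2f}% of the group)")
```

Under these assumptions the region is only a few hundred blocks per group, which is why the question of what happens when it fills up matters.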

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-16 Thread Andreas Dilger
On Nov 15, 2007 23:02 -0800, Andrew Morton wrote:
> So we have a section of blocks around the middle of the blockgroup which
> are used for indirect blocks.
>
> Presumably it starts around 50% of the way into the blockgroup?
>
> An important question is: how does it stand up over time? Simply

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-15 Thread Matt Mackall
On Thu, Nov 15, 2007 at 11:02:19PM -0800, Andrew Morton wrote:
> On Thu, 15 Nov 2007 21:02:46 -0800 "Abhishek Rai" <[EMAIL PROTECTED]> wrote:
...
> > 3. e2fsck speedup with metaclustering varies from disk
> > to disk with most benefit coming from disks which have a large number
> > of indirect
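
For a sense of how many indirect blocks large files carry, the sketch below counts them under the classic ext2/ext3 block map: 12 direct pointers in the inode, then single, double and triple indirect trees. It assumes 4 KiB blocks and 4-byte block pointers (1024 pointers per indirect block). The helper name `indirect_block_count` is made up for illustration, and the file sizes are borrowed from the fsck benchmark description elsewhere in this thread, treated here as GiB.

```python
def indirect_block_count(file_blocks: int, ptrs_per_block: int = 1024) -> int:
    """Count the indirect blocks needed to map a file of `file_blocks` data blocks."""
    def ceil_div(a: int, b: int) -> int:
        return (a + b - 1) // b

    remaining = file_blocks - 12  # the first 12 blocks are mapped from the inode
    if remaining <= 0:
        return 0

    count = 1  # the single indirect block
    remaining -= ptrs_per_block
    if remaining <= 0:
        return count

    # Double indirect tree: one top block plus one leaf per ptrs_per_block data blocks.
    covered = min(remaining, ptrs_per_block ** 2)
    count += 1 + ceil_div(covered, ptrs_per_block)
    remaining -= covered
    if remaining <= 0:
        return count

    # Triple indirect tree: one top block, mid-level blocks, then leaves.
    covered = min(remaining, ptrs_per_block ** 3)
    count += (1 + ceil_div(covered, ptrs_per_block ** 2)
              + ceil_div(covered, ptrs_per_block))
    return count


BLOCK_SIZE = 4096  # assumption: 4 KiB filesystem blocks
for size_gib in (0.5, 1, 2.5, 10):
    blocks = int(size_gib * 2**30) // BLOCK_SIZE
    print(size_gib, "GiB ->", indirect_block_count(blocks), "indirect blocks")
```

Roughly one indirect block is needed per 1024 data blocks, so a disk dominated by large files has many indirect blocks for e2fsck to read.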

Re: [PATCH] Clustering indirect blocks in Ext3

2007-11-15 Thread Andrew Morton
On Thu, 15 Nov 2007 21:02:46 -0800 "Abhishek Rai" <[EMAIL PROTECTED]> wrote:
> (This patch was previously posted on linux-ext4 where Andreas Dilger
> offered some valuable comments on it).
>
> This patch modifies the block allocation strategy in ext3 in order to
> improve fsck performance. This

[PATCH] Clustering indirect blocks in Ext3

2007-11-15 Thread Abhishek Rai
(This patch was previously posted on linux-ext4 where Andreas Dilger offered some valuable comments on it). This patch modifies the block allocation strategy in ext3 in order to improve fsck performance. This was initially sent out as a patch for ext2, but given the lack of ongoing development on
