On Thu, Mar 15, 2012 at 11:42:24AM +0100, Jacek Luczak wrote:
>
> That was not an SVN server. It was a build host having checkouts of SVN
> projects.
>
> The many files/dirs case is common for VCS and SVN is not the only
> one that would be affected here.
Well, with SVN it's 2x or 3x the number
2012/3/11 Ted Ts'o :
>
> Well, my goal in proposing this optimization is that it helps for the
> "medium size" directories in the cold cache case. The ext4 user who
> first kicked off this thread was using his file system for an SVN
> server, as I recall. I could easily believe that he has thousand
2012/3/10 Ted Ts'o :
> Hey Jacek,
>
> I'm curious about the parameters of the set of directories on your production
> server. On an ext4 file system, assuming you've copied the
> directories over, what are the results of this command pipeline when
> you are cd'ed into the top of the directory hierarchy of in
On Wed, Mar 14, 2012 at 08:50:02AM -0400, Ted Ts'o wrote:
> On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
> > I kind of like the idea about having the separate btree with inode
> > numbers for the directory reading, just because it does not affect
> > allocation policy nor the writ
On 03/14/2012 12:48 PM, Ted Ts'o wrote:
On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote:
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination in
On Wed, Mar 14, 2012 at 03:34:13PM +0100, Lukas Czerner wrote:
> >
> > You can make it be a RO_COMPAT change instead of an INCOMPAT change,
> > yes.
>
> Does it have to be RO_COMPAT change though ? Since this would be both
> forward and backward compatible.
The challenge is how do you notice if
On Wed, Mar 14, 2012 at 10:28:20AM -0400, Phillip Susi wrote:
>
> Do you really think it is that much easier? Even if it is easier,
> it is still an ugly kludge. It would be much better to fix the
> underlying problem rather than try to paper over it.
I don't think the choice is obvious. A sol
On Wed, Mar 14, 2012 at 10:17:37AM -0400, Zach Brown wrote:
>
> >We could do this if we have two b-trees, one indexed by filename and
> >one indexed by inode number, which is what JFS (and I believe btrfs)
> >does.
>
> Typically the inode number of the destination inode isn't used to index
> entr
On Wed, 14 Mar 2012, Ted Ts'o wrote:
> On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
> > I kind of like the idea about having the separate btree with inode
> > numbers for the directory reading, just because it does not affect
> > allocation policy nor the write performance which
On 3/13/2012 5:33 PM, Ted Ts'o wrote:
Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...
Do you really think it is that much easier? Even if it is easier, it is
still an ugly kludge
We could do this if we have two b-trees, one indexed by filename and
one indexed by inode number, which is what JFS (and I believe btrfs)
does.
Typically the inode number of the destination inode isn't used to index
entries for a readdir tree because of (wait for it) hard links. You end
up ri
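
The hard-link point can be seen with a small Python sketch (not from the thread; the helper name names_by_inode is hypothetical). It only shows that two names in one directory can refer to the same inode number, so the inode number alone is not a unique key for that directory's entries:

```python
import os
import tempfile


def names_by_inode(path):
    """Group the names in one directory by the inode number they refer to."""
    groups = {}
    with os.scandir(path) as it:
        for entry in it:
            # entry.inode() is the inode number recorded in the directory entry
            groups.setdefault(entry.inode(), []).append(entry.name)
    return {ino: sorted(names) for ino, names in groups.items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        with open(original, "w") as f:
            f.write("data")
        # A hard link: a second name in the same directory for the same inode
        os.link(original, os.path.join(d, "alias"))
        for ino, names in names_by_inode(d).items():
            print(ino, names)
```

Both names come back under one inode number, which is the case an inode-keyed index would have to handle.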
On Wed, Mar 14, 2012 at 09:12:02AM +0100, Lukas Czerner wrote:
> I kind of like the idea about having the separate btree with inode
> numbers for the directory reading, just because it does not affect
> allocation policy nor the write performance which is a good thing. Also
> it has been done befor
On Wed, 14 Mar 2012, Yongqiang Yang wrote:
> On Wed, Mar 14, 2012 at 4:12 PM, Lukas Czerner wrote:
> > On Tue, 13 Mar 2012, Ted Ts'o wrote:
> >
> >> On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
> >> >
> >> > I think a format change would be preferable to runtime sorting.
> >>
> >
On Wed, Mar 14, 2012 at 4:12 PM, Lukas Czerner wrote:
> On Tue, 13 Mar 2012, Ted Ts'o wrote:
>
>> On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
>> >
>> > I think a format change would be preferable to runtime sorting.
>>
>> Are you volunteering to spearhead the design and coding of
On Tue, 13 Mar 2012, Ted Ts'o wrote:
> On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
> >
> > I think a format change would be preferable to runtime sorting.
>
> Are you volunteering to spearhead the design and coding of such a
> thing? Run-time sorting is backwards compatible, a
On Wed, Mar 14, 2012 at 10:48:17AM +0800, Yongqiang Yang wrote:
> What if we use inode number as the hash value? Does it work?
The whole point of using the tree structure is to accelerate filename
-> inode number lookups. So the namei lookup doesn't have the inode
number; the whole point is to u
What if we use inode number as the hash value? Does it work?
Yongqiang.
On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
>
> I think a format change would be preferable to runtime sorting.
Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...
The
On 3/13/2012 3:53 PM, Ted Ts'o wrote:
Because that would be a format change.
I think a format change would be preferable to runtime sorting.
What we have today is not a hash table; it's a hashed tree, where we
use a fixed-length key for the tree based on the hash of the file
name. Currently
On Tue, Mar 13, 2012 at 03:05:59PM -0400, Phillip Susi wrote:
> Why not just separate the hash table from the conventional, mostly
> in inode order directory entries? For instance, the first 200k of
> the directory could be the normal entries that would tend to be in
> inode order ( and e2fsck -D
On 3/9/2012 11:48 PM, Ted Ts'o wrote:
I suspect the best optimization for now is probably something like
this:
1) Since the vast majority of directories are less than (say) 256k
(this would be a tunable value), for directories which are less than
this threshold size, the entire directory is suck
On Sun, Mar 11, 2012 at 04:30:37AM -0600, Andreas Dilger wrote:
> > if the userspace process could
> > feed us the exact set of filenames that will be used in the directory,
> > plus the exact file sizes for each of the file names...
>
> Except POSIX doesn't allow anything close to this at all. S
On 2012-03-09, at 9:48 PM, Ted Ts'o wrote:
> On Fri, Mar 09, 2012 at 04:09:43PM -0800, Andreas Dilger wrote:
>>
>> Just reading this on the plane, so I can't find the exact reference
>> that I want, but a solution to this problem with htree was discussed
>> a few years ago between myself and Coly
On Fri, Mar 09, 2012 at 04:09:43PM -0800, Andreas Dilger wrote:
> > I have also run the correlation.py from Phillip Susi on directory with
> > 10 4k files and indeed the name to block correlation in ext4 is pretty
> > much random :)
>
> Just reading this on the plane, so I can't find the exact
Hey Jacek,
I'm curious about the parameters of the set of directories on your production
server. On an ext4 file system, assuming you've copied the
directories over, what are the results of this command pipeline when
you are cd'ed into the top of the directory hierarchy of interest
(your svn tree, as I reca
On 2012-03-09, at 3:29, Lukas Czerner wrote:
>
> I have created a simple script which creates a bunch of files with
> random names in the directory and then performs operations like list,
> tar, find, copy and remove. I have run it for ext4, xfs and btrfs with
> the 4k size files. And the result i
On Fri, Mar 09, 2012 at 12:29:29PM +0100, Lukas Czerner wrote:
> Hi,
>
> I have created a simple script which creates a bunch of files with
> random names in the directory and then performs operations like list,
> tar, find, copy and remove. I have run it for ext4, xfs and btrfs with
> the 4k size
On Wed, 29 Feb 2012, Jacek Luczak wrote:
> Hi All,
>
> /*Sorry for sending incomplete email, hit wrong button :) I guess I
> can't use Gmail */
>
> Long story short: We've found that operations on a directory structure
> holding many dirs take ages on ext4.
>
> The Question: Why there's that h
On 2/29/2012 11:44 PM, Theodore Tso wrote:
You might try sorting the entries returned by readdir by inode number
before you stat them. This is a long-standing weakness in
ext3/ext4, and it has to do with how we added hashed tree indexes to
directories in (a) a backwards compatible way, that (b
On Mon, Mar 05, 2012 at 12:32:45PM +0100, Jacek Luczak wrote:
> 2012/3/4 Jacek Luczak :
> > 2012/3/3 Jacek Luczak :
> >> 2012/3/2 Chris Mason :
> >>> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
> 2012/3/2 Chris Mason :
> > On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek
On Fri 02-03-12 14:32:15, Ted Tso wrote:
> On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
> It would be interesting to have a project where someone added
> fallocate() support into libelf, and then added some heuristics into
> ext4 so that if a file is fallocated to a precise size, or
2012/3/4 Jacek Luczak :
> 2012/3/3 Jacek Luczak :
>> 2012/3/2 Chris Mason :
>>> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
2012/3/2 Chris Mason :
> On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>>
>> I've tested both. The subject is acp
2012/3/3 Jacek Luczak :
> 2012/3/2 Chris Mason :
>> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
>>> 2012/3/2 Chris Mason :
>>> > On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>>> >>
>>> >> I've tested both. The subject is acp and spd_readdir used with
>>> >>
2012/3/2 Chris Mason :
> On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
>> 2012/3/2 Chris Mason :
>> > On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>> >>
>> >> I've tested both. The subject is acp and spd_readdir used with
>> >> tar, all on ext4:
>> >> 1) acp:
On Fri, Mar 02, 2012 at 02:32:15PM -0500, Ted Ts'o wrote:
> On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
> >
> > filefrag will tell you how many extents each file has, any file with
> > more than one extent is interesting. (The ext4 crowd may have better
> > suggestions on measuri
On Fri, Mar 02, 2012 at 09:26:51AM -0500, Chris Mason wrote:
>
> filefrag will tell you how many extents each file has, any file with
> more than one extent is interesting. (The ext4 crowd may have better
> suggestions on measuring fragmentation).
You can get a *huge* amount of information (prob
On Fri, Mar 02, 2012 at 03:16:12PM +0100, Jacek Luczak wrote:
> 2012/3/2 Chris Mason :
> > On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
> >>
> >> I've tested both. The subject is acp and spd_readdir used with
> >> tar, all on ext4:
> >> 1) acp: http://91.234.146.107/~difrost
2012/3/2 Chris Mason :
> On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>>
>> I've tested both. The subject is acp and spd_readdir used with
>> tar, all on ext4:
>> 1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
>> 2) spd_readdir:
>> http://91.234.146.107/~di
On Fri, Mar 02, 2012 at 11:05:56AM +0100, Jacek Luczak wrote:
>
> I've tested both. The subject is acp and spd_readdir used with
> tar, all on ext4:
> 1) acp: http://91.234.146.107/~difrost/seekwatcher/acp_ext4.png
> 2) spd_readdir: http://91.234.146.107/~difrost/seekwatcher/tar_ext4_readir
2012/3/1 Chris Mason :
> On Wed, Feb 29, 2012 at 11:44:31PM -0500, Theodore Tso wrote:
>> You might try sorting the entries returned by readdir by inode number before
>> you stat them. This is a long-standing weakness in ext3/ext4, and it has
>> to do with how we added hashed tree indexes to d
2012/3/1 Ted Ts'o :
> On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
>>
>> Yep, ext4 is close to my wife's closet.
>>
>
> Were all of the file systems freshly laid down, or was this an aged
> ext4 file system?
Always fresh, recreated for each test - that's why it takes quite
much t
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
>
> Yep, ext4 is close to my wife's closet.
>
Were all of the file systems freshly laid down, or was this an aged
ext4 file system?
Also you should beware that if you have a workload which is heavy
parallel I/O, with lots of random,
2012/3/1 Chris Mason :
> On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
>> 2012/3/1 Chris Mason :
>> > XFS will probably beat btrfs in this test. Their directory indexes
>> > reflect on disk layout very well.
>>
>> True, but not that fast on small files.
>>
>> Except the question I'
On Thu, Mar 01, 2012 at 03:43:41PM +0100, Jacek Luczak wrote:
> 2012/3/1 Chris Mason :
> > XFS will probably beat btrfs in this test. Their directory indexes
> > reflect on disk layout very well.
>
> True, but not that fast on small files.
>
> Except the question I've raised in first mail there'
2012/3/1 Chris Mason :
> On Thu, Mar 01, 2012 at 03:03:53PM +0100, Jacek Luczak wrote:
>> 2012/3/1 Hillf Danton :
>> > On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak
>> > wrote:
>> >>
>> >> While I was about to grab acp I've noticed seekwatcher which made my day :)
>> >>
>> >> seekwatcher run of tar
On Wed, Feb 29, 2012 at 11:44:31PM -0500, Theodore Tso wrote:
> You might try sorting the entries returned by readdir by inode number before
> you stat them. This is a long-standing weakness in ext3/ext4, and it has
> to do with how we added hashed tree indexes to directories in (a) a backward
On Thu, Mar 01, 2012 at 03:03:53PM +0100, Jacek Luczak wrote:
> 2012/3/1 Hillf Danton :
> > On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak
> > wrote:
> >>
> >> While I was about to grab acp I've noticed seekwatcher which made my day :)
> >>
> >> seekwatcher run of tar cf to eliminate writes (all don
2012/3/1 Hillf Danton :
> On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak wrote:
>>
>> While I was about to grab acp I've noticed seekwatcher which made my day :)
>>
>> seekwatcher run of tar cf to eliminate writes (all done on 3.2.7):
>> 1) btrfs: http://dozzie.jarowit.net/~dozzie/luczajac/tar_btrfs.
On Thu, Mar 1, 2012 at 9:35 PM, Jacek Luczak wrote:
>
> While I was about to grab acp I've noticed seekwatcher which made my day :)
>
> seekwatcher run of tar cf to eliminate writes (all done on 3.2.7):
> 1) btrfs: http://dozzie.jarowit.net/~dozzie/luczajac/tar_btrfs.png
> 2) ext4: http://dozzie.ja
2012/2/29 Jacek Luczak :
> 2012/2/29 Chris Mason :
>> On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
>>
>> [ btrfs faster than ext for find and cp -a ]
>>
>>> 2012/2/29 Jacek Luczak :
>>>
>>> I will try to answer the question from the broken email I've sent.
>>>
>>> @Lukas, it was al
You might try sorting the entries returned by readdir by inode number before
you stat them. This is a long-standing weakness in ext3/ext4, and it has to
do with how we added hashed tree indexes to directories in (a) a backwards
compatible way, that (b) was POSIX compliant with respect to addi
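
A minimal Python sketch of the suggestion above (not from the thread; the function name stat_in_inode_order is hypothetical): read the whole directory first, sort the entries by the inode number reported in each directory entry, and only then call stat on them.

```python
import os
import tempfile


def stat_in_inode_order(path):
    """Return (name, size) pairs, calling stat() in ascending inode order."""
    # Read every entry first; they arrive in whatever order the file system returns
    with os.scandir(path) as it:
        entries = list(it)
    # Sort by the inode number taken from each directory entry
    entries.sort(key=lambda e: e.inode())
    # Only now stat each entry, walking the list in inode order
    return [(e.name, e.stat(follow_symlinks=False).st_size) for e in entries]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("zeta", "alpha", "mid"):
            with open(os.path.join(d, name), "w") as f:
                f.write(name)
        for name, size in stat_in_inode_order(d):
            print(name, size)
```

Whether this ordering helps on a given file system has to be measured; the sketch only shows the reordering itself.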
2012/2/29 Chris Mason :
> On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
>
> [ btrfs faster than ext for find and cp -a ]
>
>> 2012/2/29 Jacek Luczak :
>>
>> I will try to answer the question from the broken email I've sent.
>>
>> @Lukas, it was always a fresh FS on top of LVM logica
On Wed, Feb 29, 2012 at 03:07:45PM +0100, Jacek Luczak wrote:
[ btrfs faster than ext for find and cp -a ]
> 2012/2/29 Jacek Luczak :
>
> I will try to answer the question from the broken email I've sent.
>
> @Lukas, it was always a fresh FS on top of LVM logical volume. I've
> been cleaning ca
On Wed, Feb 29, 2012 at 08:51:58AM -0500, Chris Mason wrote:
> On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
> > Ext4 results:
> > | Type     | 2.6.39.4-3 | 3.2.7
> > | Dir cnt  | 17m 40sec  | 11m 20sec
> > | File cnt | 17m 36sec  | 11m 22sec
> > | Copy     | 1h 28m     | 1h 27m
2012/2/29 Jacek Luczak :
> 2012/2/29 Jacek Luczak :
>> Hi Chris,
>>
>> the last one was borked :) Please check this one.
>>
>> -jacek
>>
>> 2012/2/29 Jacek Luczak :
>>> Hi All,
>>>
>>> /*Sorry for sending incomplete email, hit wrong button :) I guess I
>>> can't use Gmail */
>>>
>>> Long story shor
2012/2/29 Jacek Luczak :
> Hi Chris,
>
> the last one was borked :) Please check this one.
>
> -jacek
>
> 2012/2/29 Jacek Luczak :
>> Hi All,
>>
>> /*Sorry for sending incomplete email, hit wrong button :) I guess I
>> can't use Gmail */
>>
>> Long story short: We've found that operations on a dire
On Wed, 29 Feb 2012, Chris Mason wrote:
> On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
> > Hi All,
> >
> > Long story short: We've found that operations on a directory structure
> > holding many dirs take ages on ext4.
> >
> > The Question: Why there's that huge difference in e
Hi Chris,
the last one was borked :) Please check this one.
-jacek
2012/2/29 Jacek Luczak :
> Hi All,
>
> /*Sorry for sending incomplete email, hit wrong button :) I guess I
> can't use Gmail */
>
> Long story short: We've found that operations on a directory structure
> holding many dirs takes
Hi All,
/*Sorry for sending incomplete email, hit wrong button :) I guess I
can't use Gmail */
Long story short: We've found that operations on a directory structure
holding many dirs take ages on ext4.
The Question: Why is there such a huge difference between ext4 and btrfs? See
below test results for
On Wed, Feb 29, 2012 at 02:31:03PM +0100, Jacek Luczak wrote:
> Hi All,
>
> Long story short: We've found that operations on a directory structure
> holding many dirs take ages on ext4.
>
> The Question: Why is there such a huge difference between ext4 and btrfs? See
> below test results for real values
Hi All,
Long story short: We've found that operations on a directory structure
holding many dirs take ages on ext4.
The Question: Why is there such a huge difference between ext4 and btrfs? See
below test results for real values.
Background: I had to back up a Jenkins directory holding workspace for
few