cp: performance improvement with small files

2008-05-23 Thread Hauke Laging
Hello,

I just read an interesting hint in the German shell Usenet group 
(<[EMAIL PROTECTED]>). As I could not find anything about 
that point in your mailing list archive I would like to mention it here.

The author claims that he achieved a huge performance increase (more than a
factor of 10) when copying a large number of small files (1-10 KiB) by sorting
them by inode number first. This probably reduces the disk access time, which
becomes the dominating factor for small files.
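
To make the idea concrete, here is a minimal sketch in Python of copying the
regular files of one directory in ascending inode order. It is not what cp
does, and it is not the Usenet author's script; the function name
copy_in_inode_order is made up for illustration, and the sketch ignores
subdirectories, symlinks and error handling.

```python
import os
import shutil


def copy_in_inode_order(src_dir, dst_dir):
    """Copy regular files from src_dir to dst_dir in ascending inode order.

    Hypothetical helper for illustration only; returns the file names in
    the order they were copied.
    """
    os.makedirs(dst_dir, exist_ok=True)
    with os.scandir(src_dir) as it:
        # On POSIX systems DirEntry.inode() is taken from the directory
        # entry itself, so collecting inode numbers needs no extra stat().
        entries = [(e.inode(), e.name)
                   for e in it
                   if e.is_file(follow_symlinks=False)]
    entries.sort()  # sorts by inode number first, then by name
    copied = []
    for _inode, name in entries:
        shutil.copy2(os.path.join(src_dir, name),
                     os.path.join(dst_dir, name))
        copied.append(name)
    return copied
```

Whether this helps at all depends on the file system placing inodes on disk
roughly in numeric order, which is an assumption and not something the sketch
checks.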

Of course, this kind of sorting could be (transparently) done in cp, too. 
When reading the directory contents you might count the number (and share) 
of small files and determine whether such sorting makes sense for the 
respective data. And certainly this decision is based on the assumption 
that the respective file system places the inodes on disk according to 
their number. I don't know if that is correct for all file systems. If 
not, cp might check that first.
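
The "count the number and share of small files" check could look roughly
like the following Python sketch. All three thresholds are invented for
illustration (only the 10 KiB figure echoes the range mentioned above), and
should_sort_by_inode and file_sizes are hypothetical names, not anything that
exists in coreutils.

```python
import os

# Hypothetical thresholds, chosen only for illustration.
SMALL_FILE_BYTES = 10 * 1024   # upper end of the 1-10 KiB range
MIN_FILES = 1000               # below this, sorting is unlikely to matter
MIN_SMALL_SHARE = 0.8          # required share of small files


def should_sort_by_inode(sizes,
                         small_bytes=SMALL_FILE_BYTES,
                         min_files=MIN_FILES,
                         min_share=MIN_SMALL_SHARE):
    """Return True if the file sizes suggest inode sorting may pay off."""
    sizes = list(sizes)
    if len(sizes) < min_files:
        return False
    small = sum(1 for s in sizes if s <= small_bytes)
    return small / len(sizes) >= min_share


def file_sizes(directory):
    """Sizes in bytes of the regular files directly inside directory."""
    with os.scandir(directory) as it:
        return [e.stat(follow_symlinks=False).st_size
                for e in it
                if e.is_file(follow_symlinks=False)]
```

For example, should_sort_by_inode(file_sizes("/some/dir")) would give the
decision for one directory. Note that collecting the sizes requires a stat()
per file, which itself touches the inodes, so a real implementation would
have to weigh that cost as well.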


The same goes for mv, of course, when moving between volumes (and maybe 
other programs that access inodes of many files in certain situations).


Best regards,

Hauke


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: cp: performance improvement with small files

2008-05-23 Thread Pádraig Brady
Hauke Laging wrote:
> Hello,
> 
> I just read an interesting hint in the German shell Usenet group 
> (<[EMAIL PROTECTED]>). As I could not find anything about 
> that point in your mailing list archive I would like to mention it here.
> 
> The author claims that he achieved a huge performance increase (more than a
> factor of 10) when copying a large number of small files (1-10 KiB) by sorting
> them by inode number first. This probably reduces the disk access time, which
> becomes the dominating factor for small files.

Disk seek times have mostly missed out on the performance increases seen
elsewhere in computers, which is why they're increasingly being noticed as
the bottleneck. However, I think mechanical disks will quickly become a thing
of the past with the onset of solid state disks.

I've noticed myself a large performance gain in the few filesystems I've
tested by sorting by inode, so that the disk head seeks in 1 direction only.
One can see this in the findup component of fslint here for example:
http://code.google.com/p/fslint/source/browse/tags/2.26/fslint/findup#158
I've also found though that sorting by path is only a little worse.
Sorting by md5sum is very bad though :)
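
A rough way to repeat this kind of comparison is sketched below in Python.
This is not the findup code; orderings and time_reads are hypothetical names,
and the "hashed" order uses a SHA-256 of the path purely as a stand-in for an
order unrelated to disk layout (such as sorting by md5sum).

```python
import hashlib
import os
import time


def orderings(directory):
    """Return the regular files of directory in three different orders."""
    with os.scandir(directory) as it:
        files = [(e.inode(), e.path)
                 for e in it
                 if e.is_file(follow_symlinks=False)]
    paths = [p for _ino, p in files]
    return {
        "inode": [p for _ino, p in sorted(files)],
        "path": sorted(paths),
        # Stand-in for an order with no relation to on-disk position.
        "hashed": sorted(
            paths,
            key=lambda p: hashlib.sha256(os.fsencode(p)).hexdigest()),
    }


def time_reads(paths):
    """Read every file fully; return (elapsed seconds, total bytes read)."""
    start = time.perf_counter()
    total = 0
    for p in paths:
        with open(p, "rb") as f:
            total += len(f.read())
    return time.perf_counter() - start, total
```

The timings only mean something with a cold cache on a mechanical disk; once
the files are in the page cache, all three orders will look about the same.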

Pádraig.




Re: cp: performance improvement with small files

2008-05-23 Thread Phillip Susi

Hauke Laging wrote:
> Hello,
>
> I just read an interesting hint in the German shell Usenet group
> (<[EMAIL PROTECTED]>). As I could not find anything about
> that point in your mailing list archive I would like to mention it here.
>
> The author claims that he achieved a huge performance increase (more than a
> factor of 10) when copying a large number of small files (1-10 KiB) by sorting
> them by inode number first. This probably reduces the disk access time, which
> becomes the dominating factor for small files.


It depends on what filesystem you are using.  In ext3 this would help, 
but not on reiserfs, where there is no relationship between inode number 
and disk position.


In any case, this would significantly increase the complexity of cp for, 
at best, dubious gains, so it isn't likely to happen.  Rather than sort 
by inode, it would be better if filesystems that would benefit from that 
would keep the directory list sorted that way so that the list would 
already be sorted when passed to cp.  IIRC, the defrag package sorts 
directories this way on ext2/3.






[OT] recommended reading: usenix malicious hardware paper

2008-05-23 Thread Jim Meyering
Here's an awe-inspiring paper.  A colleague's summary was apt:
"we're all doomed" ;-)

  http://www.usenix.org/event/leet08/tech/full_papers/king/king_html/

You will view older hardware with more respect.

