Re: [zfs-discuss] Odp: Is ZFS efficient for large collections of small files?

2007-08-23 Thread Robert Milkowski
Hello Roch,

Wednesday, August 22, 2007, 10:13:10 AM, you wrote:

RP Łukasz K writes:
   Is ZFS efficient at handling huge populations of tiny-to-small files -
   for example, 20 million TIFF images in a collection, each between 5
   and 500k in size?
   
   I am asking because I could have sworn that I read somewhere that it
   isn't, but I can't find the reference.
  
  It depends on what type of I/O you will do. If you only read, there is no
  problem. Writing small files (and removing them) will fragment the pool
  and it will be a huge problem.
  You can set recordsize to 32k (or 16k) and it will help for some time.
  

RP Comparing recordsize of 16K with 128K.

RP Files in the range of [0,16K] : no difference.
RP Files in the range of [16K,128K]  : more efficient to use 128K
RP Files in the range of [128K,500K] : more efficient to use 16K

RP In the [16K,128K] range the actual filesize is rounded up to
RP the next 16K multiple with a 16K recordsize and to the nearest 512B
RP boundary with a 128K recordsize. This will be fairly catastrophic for
RP files slightly above 16K (rounded up to 32K vs 16K+512B).

RP In the [128K, 500K] range we're hurt by this

RP 5003563 use smaller tail block for last block of object

RP Until it is fixed, then yes, files stored using 16K
RP records are rounded up more tightly. Metadata probably
RP eats part of the gains.


Roch, I guess Lukasz was talking about some problems we're seeing here,
which are partly caused by using full 128KB slabs, so forcing the file
system down to a 16KB recordsize helps here (with CPU usage) as a
workaround. Sure, we're talking about lots and lots of really small
files. Perhaps someone could work with Lukasz and investigate it more
closely. Lukasz posted some more detailed info not so long ago -
unfortunately there was no feedback.
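
For reference, the workaround boils down to something like the commands
below (the dataset name is made up for illustration; note that a changed
recordsize only applies to files written after the change, existing
files keep their old block size):

  zfs set recordsize=16k tank/images    # hypothetical dataset
  zfs get recordsize tank/images        # verify the property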

-- 
Best regards,
 Robert Milkowski  mailto:[EMAIL PROTECTED]
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Odp: Is ZFS efficient for large collections of small files?

2007-08-22 Thread Roch - PAE
Łukasz K writes:
   Is ZFS efficient at handling huge populations of tiny-to-small files -
   for example, 20 million TIFF images in a collection, each between 5
   and 500k in size?
   
   I am asking because I could have sworn that I read somewhere that it
   isn't, but I can't find the reference.
  
  It depends on what type of I/O you will do. If you only read, there is no
  problem. Writing small files (and removing them) will fragment the pool
  and it will be a huge problem.
  You can set recordsize to 32k (or 16k) and it will help for some time.
  

Comparing recordsize of 16K with 128K.

Files in the range of [0,16K] : no difference.
Files in the range of [16K,128K]  : more efficient to use 128K
Files in the range of [128K,500K] : more efficient to use 16K

In the [16K,128K] range the actual filesize is rounded up to
the next 16K multiple with a 16K recordsize and to the nearest 512B
boundary with a 128K recordsize. This will be fairly catastrophic for
files slightly above 16K (rounded up to 32K vs 16K+512B).

In the [128K, 500K] range we're hurt by this

5003563 use smaller tail block for last block of object

Until it is fixed, then yes, files stored using 16K
records are rounded up more tightly. Metadata probably
eats part of the gains.
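
As a rough back-of-the-envelope sketch of the rounding described above
(a ksh/bash illustration only, assuming no compression and ignoring
metadata and redundancy overhead; the file sizes are arbitrary examples):

  # A file <= recordsize is stored in a single block rounded up to the
  # next 512-byte sector; a larger file uses full recordsize blocks,
  # tail block included (until 5003563 is fixed).
  alloc() {    # alloc <filesize-in-bytes> <recordsize-in-bytes>
        size=$1 rs=$2
        if [ "$size" -le "$rs" ]; then
              echo $(( (size + 511) / 512 * 512 ))
        else
              echo $(( (size + rs - 1) / rs * rs ))
        fi
  }
  alloc 20480  16384     # 20K file,  16K records ->  32768 (32K)
  alloc 20480  131072    # 20K file, 128K records ->  20480 (20K)
  alloc 204800 16384     # 200K file, 16K records -> 212992 (208K)
  alloc 204800 131072    # 200K file, 128K records -> 262144 (256K)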

-r


  Lukas
  
  
  
  

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

