Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread Les Mikesell
JohnS wrote:
> On Mon, 2009-07-13 at 05:49 +,  o wrote:
> 
>>> It is 1024 chars long, which still won't help.
>> I'm using MyISAM and according to: 
>> http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
>> "The maximum key length is 1000 bytes. This can also be changed by changing 
>> the source and recompiling. For the case of a key longer than 250 bytes, a 
>> larger key block size than the default of 1024 bytes is used. "
>>
>>> I would not store images in either one
>>> as your SELECT LIKE and Random will kill it. 
>>
>> Well, I think that this can be avoided; using just searches in the key 
>> fields should not give these issues. Does somebody have experience storing a 
>> large amount of medium (1KB-150KB) blob objects in mysql?
> 
> True
> 
> An option would be to encode them to Base64 on INSERT, but if you index
> all of your BLOBs on INSERT there should really be no problem. Besides,
> 150KB is not big for a BLOB. Consider 20MB to 100MB with multiple
> joins on MSSQL (64-bit, though). Apparently size is based on the maximum
> amount of memory the client has. VARBLOB apparently has no limit per
> the docs. I can't speak to doing this on MySQL; I can on DB2 and
> MSSQL. I can say you can rival the 32-bit MSSQL performance by at least
> 15 percent. I can only say that I have experience with raw DB
> predictions in graphing, and edge and adjacency modeling on MySQL.
> 
> What I see slowing you down is the T-SQL and SPROCs. The dll for the md5
> I posted earlier will scale to 1000s of inserts at a time. If speed is
> really of the essence then use raw partitions for the DB and RAM. Use the
> MySQL Connector or the ODBC driver or you will hit size limits on INSERT and
> SELECT.
> 
>>> However I have not a clue that this is even doable in MySQL.
>> In mysql there is already an MD5 function: 
>> http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_md5
> 
> Yes, I was informed that a call from a SPROC to "md5()" would do the
> trick and take the load off the client. At least that was my intent of
> the idea to balance the load. That is, if this is client/server.
> 
> I do wonder about your memory allocation and disk. It is all about the
> DB design. Think about a genealogy DB. Where do you end the design? You
> don't. Where do predictions end? They don't.

I think you are making this way too complicated.  You are going to end 
up filling a large disk with small bits of data and your speed is going 
to be limited by how fast the disk head can get to the right place for 
anything that isn't already in a buffer.  Other than the special case of 
too many entries in a single directory, the software overhead isn't 
going to make much difference unless you can effectively predict what 
you are likely to want next or keep the most popular things in your 
buffers.  Hardware-wise, adding RAM is likely to help even if it is just 
for the filesystem inode/directory cache - and if you are lucky, the LRU 
data buffering.  Also, spreading your data over several disks would help 
by reducing the head contention.

-- 
   Les Mikesell
lesmikes...@gmail.com


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread JohnS

On Mon, 2009-07-13 at 05:49 +,  o wrote:

> >It is 1024 chars long, which still won't help.
> I'm using MyISAM and according to: 
> http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
> "The maximum key length is 1000 bytes. This can also be changed by changing 
> the source and recompiling. For the case of a key longer than 250 bytes, a 
> larger key block size than the default of 1024 bytes is used. "
> 
> >I would not store images in either one
> >as your SELECT LIKE and Random will kill it. 
> 
> Well, I think that this can be avoided; using just searches in the key fields 
> should not give these issues. Does somebody have experience storing a large 
> amount of medium (1KB-150KB) blob objects in mysql?

True

An option would be to encode them to Base64 on INSERT, but if you index
all of your BLOBs on INSERT there should really be no problem. Besides,
150KB is not big for a BLOB. Consider 20MB to 100MB with multiple
joins on MSSQL (64-bit, though). Apparently size is based on the maximum
amount of memory the client has. VARBLOB apparently has no limit per
the docs. I can't speak to doing this on MySQL; I can on DB2 and
MSSQL. I can say you can rival the 32-bit MSSQL performance by at least
15 percent. I can only say that I have experience with raw DB
predictions in graphing, and edge and adjacency modeling on MySQL.

What I see slowing you down is the T-SQL and SPROCs. The dll for the md5
I posted earlier will scale to 1000s of inserts at a time. If speed is
really of the essence then use raw partitions for the DB and RAM. Use the
MySQL Connector or the ODBC driver or you will hit size limits on INSERT and
SELECT.

> >However I have not a clue that this is even doable in MySQL.
> 
> In mysql there is already an MD5 function: 
> http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_md5

Yes, I was informed that a call from a SPROC to "md5()" would do the
trick and take the load off the client. At least that was my intent of
the idea to balance the load. That is, if this is client/server.

I do wonder about your memory allocation and disk. It is all about the
DB design. Think about a genealogy DB. Where do you end the design? You
don't. Where do predictions end? They don't.

John



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-12 Thread oooooooooooo ooooooooooooo

>How many files per directory do you have?

I have 4 directory levels, 65536 leaf directories and around 200 files per 
dir (15M in total).
 
>Something is wrong. Got to figure this out.  Where did this RAM go?

Thanks. I reduced the memory usage of mysql and my app, and I got around a 15% 
performance increase. Now my atop looks like this (currently reading only 
cached files from disk).

PRC | sys   0.51s | user   9.29s | #proc    114 | #zombie    0 | #exit      0 |
CPU | sys      4% | user     93% | irq       1% | idle    208% | wait     94% |
cpu | sys      2% | user     48% | irq       1% | idle     21% | cpu001 w 28% |
cpu | sys      1% | user     17% | irq       0% | idle     41% | cpu000 w 40% |
cpu | sys      1% | user     14% | irq       0% | idle     74% | cpu003 w 12% |
cpu | sys      1% | user     13% | irq       0% | idle     72% | cpu002 w 14% |
CPL | avg1   3.45 | avg5    7.42 | avg15  10.76 | csw    15891 | intr   11695 |
MEM | tot    2.0G | free   51.2M | cache 587.8M | buff    1.0M | slab  281.2M |
SWP | tot    1.9G | free    1.9G |              | vmcom   1.6G | vmlim   2.9G |
PAG | scan   3072 | stall      0 |              | swin       0 | swout      0 |
DSK |         sdb | busy     89% | read    1451 | write      0 | avio    6 ms |
DSK |         sda | busy      6% | read     178 | write     54 | avio    2 ms |
NET | transport   | tcpi    3631 | tcpo    3629 | udpi       0 | udpo       0 |
NET | network     | ipi     3632 | ipo     3630 | ipfrw      0 | deliv   3632 |
NET | eth0     0% | pcki       5 | pcko       3 | si    0 Kbps | so    1 Kbps |
NET | lo          | pcki    3627 | pcko    3627 | si  775 Kbps | so  775 Kbps |

>It is 1024 chars long, which still won't help.
I'm using MyISAM and according to: 
http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
"The maximum key length is 1000 bytes. This can also be changed by changing the 
source and recompiling. For the case of a key longer than 250 bytes, a larger 
key block size than the default of 1024 bytes is used. "
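
For reference, MySQL also supports prefix indexes, which sidestep the
1000-byte key limit by indexing only the first N characters of a column.
A minimal sketch (the table and column names are hypothetical, not from
this thread):

mysql -e "
  CREATE TABLE IF NOT EXISTS files (name VARCHAR(1024) NOT NULL) ENGINE=MyISAM;
  -- index only the first 255 chars, keeping the key under the MyISAM limit
  ALTER TABLE files ADD INDEX name_prefix (name(255));
" test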

>I would not store images in either one
>as your SELECT LIKE and Random will kill it. 

Well, I think that this can be avoided; using just searches in the key fields 
should not give these issues. Does somebody have experience storing a large 
amount of medium (1KB-150KB) blob objects in mysql?

>However I have not a clue that this is even doable in MySQL.

In mysql there is already an MD5 function: 
http://dev.mysql.com/doc/refman/5.1/en/encryption-functions.html#function_md5

Thanks for the help.



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS

On Sat, 2009-07-11 at 11:48 -0400, JohnS wrote:
> On Sat, 2009-07-11 at 00:01 +,  o wrote:
> > > You mentioned that the data can be retrieved from somewhere else. Is
> > > some part of this filename a unique key? 
> > 
> > The real key is up to 1023 characters long and it's unique, but I have to 
> > trim it to 256 characters, so it is not unique unless I add the hash.
> > 
> > >Do you have to track this
> > > relationship anyway - or age/expire content? 
> > 
> > I have to track the long filename -> short filename relationship. Age is 
> > not relevant here.
> > 
> > > I'd try to arrange things
> > > so the most likely scenario would take the fewest operations. Perhaps a
> > > mix of hash+filename would give direct access 99+% of the time and you
> > > could move all copies of collisions to a different area. 
> > 
> > yes, it's a good idea, but at this point I don't want to add more complexity 
> > to my app, and having a separate area for collisions would make it more 
> > complex.
> > 
> > >Then you could
> > > keep the database mapping the full name to the hashed path but you'd
> > > only have to consult it when the open() attempt fails.
> > 
> > As the long filename is up to 1023 chars long I can't index it with mysql 
> > (it has a lower max limit); that's why I use the hash, which is indexed. 
> > What I do is keep a list of just the md5 of the cached files in memory 
> > in my app; before going to mysql, I first check if it's in the list (really 
> > a RB-Tree).
> ---
> It is 1024 chars long, which still won't help. MSSQL 2005 and up is
> longer, if you're interested:
> http://msdn.microsoft.com/en-us/library/ms143432.aspx
> But that greatly depends on your data size: 900 bytes is the limit, but it
> can be exceeded.
> 
> You can use either one if you do a unique key id name for the index.
> File name to Unique short name. I would not store images in either one
> as your SELECT LIKE and Random will kill it. As much as I like DBs I
> have to say the flat file system is for those.
> 
> John
---
Just a random thought on hashes via the DB that hardly anyone gives any
thought to.

Using Extended Stored Procedures (like MSSQL's), you can make your own
hashes on the file insert.

USE master;
EXEC sp_addextendedproc 'your_md5', 'your_md5.dll'

Of course you will have to create your own .DLL to do the hashing. 

Then create your own functions:
SELECT dbo.your_md5('YourHash');

Direct:
EXEC master.dbo.your_md5 'YourHash'

However I have not a clue that this is even doable in MySQL.
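
For what it's worth, MySQL ships MD5() as a built-in, so no custom DLL is
needed on that side. A minimal sketch of hashing server-side on INSERT
(table and column names are hypothetical):

mysql -e "
  CREATE TABLE IF NOT EXISTS cache_map (
    hash CHAR(32) NOT NULL PRIMARY KEY,
    long_name TEXT NOT NULL
  ) ENGINE=MyISAM;
  -- the server computes the hash, taking the load off the client
  INSERT INTO cache_map (hash, long_name)
    VALUES (MD5('some/very/long/original/name'), 'some/very/long/original/name');
" test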

John



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS

On Sat, 2009-07-11 at 00:01 +,  o wrote:
> > You mentioned that the data can be retrieved from somewhere else. Is
> > some part of this filename a unique key? 
> 
> The real key is up to 1023 characters long and it's unique, but I have to trim 
> it to 256 characters, so it is not unique unless I add the hash.
> 
> >Do you have to track this
> > relationship anyway - or age/expire content? 
> 
> I have to track the long filename -> short filename relationship. Age is 
> not relevant here.
> 
> > I'd try to arrange things
> > so the most likely scenario would take the fewest operations. Perhaps a
> > mix of hash+filename would give direct access 99+% of the time and you
> > could move all copies of collisions to a different area. 
> 
> yes, it's a good idea, but at this point I don't want to add more complexity 
> to my app, and having a separate area for collisions would make it more 
> complex.
> 
> >Then you could
> > keep the database mapping the full name to the hashed path but you'd
> > only have to consult it when the open() attempt fails.
> 
> As the long filename is up to 1023 chars long I can't index it with mysql (it 
> has a lower max limit); that's why I use the hash, which is indexed. What I 
> do is keep a list of just the md5 of the cached files in memory in my app; 
> before going to mysql, I first check if it's in the list (really a RB-Tree).
---
It is 1024 chars long, which still won't help. MSSQL 2005 and up is
longer, if you're interested:
http://msdn.microsoft.com/en-us/library/ms143432.aspx
But that greatly depends on your data size: 900 bytes is the limit, but it
can be exceeded.

You can use either one if you do a unique key id name for the index.
File name to Unique short name. I would not store images in either one
as your SELECT LIKE and Random will kill it. As much as I like DBs I
have to say the flat file system is for those.

John



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
>
> Thanks, using directories as file names is a great idea; anyway I'm not sure 
> if that would solve my performance issue, as the bottleneck is the disk and 
> not mysql.

The situation you described initially suffers from only one issue -
too many files in one single directory. You are not the first to fight
this - see qmail's maildir, see squid, etc.  The remedy is always one and
the same - split the files into a tree folder structure. For a sample
implementation, check out squid, backuppc, etc.

>I just implemented the directory names based on the hash of the file and the 
>performance is a bit slower than before. This is the output of atop (15 secs. 
>avg.):
>
> PRC | sys   0.53s | user   5.43s | #proc    112 | #zombie    0 | #exit      0 |
> CPU | sys      4% | user     54% | irq       2% | idle    208% | wait    131% |
> cpu | sys      1% | user     24% | irq       1% | idle     54% | cpu001 w 20% |
> cpu | sys      2% | user     15% | irq       1% | idle     31% | cpu002 w 52% |
> cpu | sys      1% | user      8% | irq       0% | idle     52% | cpu003 w 38% |
> cpu | sys      1% | user      7% | irq       0% | idle     71% | cpu000 w 21% |
> CPL | avg1  10.58 | avg5    6.92 | avg15   4.66 | csw    19112 | intr   19135 |
> MEM | tot    2.0G | free   49.8M | cache 157.4M | buff  116.8M | slab  122.7M |
> SWP | tot    1.9G | free    1.2G |              | vmcom   2.2G | vmlim   2.9G |

I am under the impression that you are swapping. Out of 2GB of RAM,
you have just 157MB cache and 116MB buffers. What is eating the RAM?
Why do you have 0.8GB swap used? You need more memory for file system
cache.

> PAG | scan   1536 | stall      0 |              | swin       9 | swout      0 |
> DSK |         sdb | busy     91% | read     884 | write    524 | avio    6 ms |
> DSK |         sda | busy     12% | read     201 | write    340 | avio    2 ms |
> NET | transport   | tcpi    8551 | tcpo    8204 | udpi     702 | udpo     718 |
> NET | network     | ipi     9264 | ipo     8946 | ipfrw      0 | deliv   9264 |
> NET | eth0     5% | pcki    6859 | pcko    6541 | si 5526 Kbps | so  466 Kbps |
> NET | lo          | pcki    2405 | pcko    2405 | si  397 Kbps | so  397 Kbps |
>
>
> in sdb is the cache and in sda is all other stuff, including the mysql db 
> files. Note that I have a lot of disk reads in sdb, but I'm really getting 
> one file from disk for each 10 written, so my guess is that all other reads 
> are directory listings. As I'm using the hash as directory names, (I think) 
> this makes the linux cache slower, as the files are distributed in a more 
> homogeneous and random way among the directories.
>
I think that linux file system cache is smart enough for this type of load.
How many files per directory do you have?

> The app is running a bit slower than using the file name for directory name, 
> although I expect (not really sure) that it will be better as the number of 
> files on disk grows (currently there are only 600k files from 15M). My 
> current performance is around 50 file i/o per second.
>

Something is wrong. Got to figure this out.  Where did this RAM go?


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread oooooooooooo ooooooooooooo

Thanks, using directories as file names is a great idea; anyway I'm not sure if 
that would solve my performance issue, as the bottleneck is the disk and not 
mysql. I just implemented the directory names based on the hash of the file 
and the performance is a bit slower than before. This is the output of atop (15 
secs. avg.):

PRC | sys   0.53s | user   5.43s | #proc    112 | #zombie    0 | #exit      0 |
CPU | sys      4% | user     54% | irq       2% | idle    208% | wait    131% |
cpu | sys      1% | user     24% | irq       1% | idle     54% | cpu001 w 20% |
cpu | sys      2% | user     15% | irq       1% | idle     31% | cpu002 w 52% |
cpu | sys      1% | user      8% | irq       0% | idle     52% | cpu003 w 38% |
cpu | sys      1% | user      7% | irq       0% | idle     71% | cpu000 w 21% |
CPL | avg1  10.58 | avg5    6.92 | avg15   4.66 | csw    19112 | intr   19135 |
MEM | tot    2.0G | free   49.8M | cache 157.4M | buff  116.8M | slab  122.7M |
SWP | tot    1.9G | free    1.2G |              | vmcom   2.2G | vmlim   2.9G |
PAG | scan   1536 | stall      0 |              | swin       9 | swout      0 |
DSK |         sdb | busy     91% | read     884 | write    524 | avio    6 ms |
DSK |         sda | busy     12% | read     201 | write    340 | avio    2 ms |
NET | transport   | tcpi    8551 | tcpo    8204 | udpi     702 | udpo     718 |
NET | network     | ipi     9264 | ipo     8946 | ipfrw      0 | deliv   9264 |
NET | eth0     5% | pcki    6859 | pcko    6541 | si 5526 Kbps | so  466 Kbps |
NET | lo          | pcki    2405 | pcko    2405 | si  397 Kbps | so  397 Kbps |


in sdb is the cache and in sda is all other stuff, including the mysql db 
files. Note that I have a lot of disk reads in sdb, but I'm really getting one 
file from disk for each 10 written, so my guess is that all other reads are 
directory listings. As I'm using the hash as directory names, (I think) this 
makes the linux cache slower, as the files are distributed in a more 
homogeneous and random way among the directories. 

The app is running a bit slower than when using the file name for the directory 
name, although I expect (not really sure) that it will get better as the number 
of files on disk grows (currently there are only 600k files of the 15M). My 
current performance is around 50 file I/Os per second.
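
One way to confirm where those reads are going is to watch per-device I/O
while the app runs; a minimal sketch, assuming the sysstat package is
installed and the cache really is on sdb:

# extended per-device stats every 5 seconds; r/s vs w/s (and await)
# should show whether reads dominate on the cache disk
iostat -x sda sdb 5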






Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
2009/7/11  o :
>
>> You mentioned that the data can be retrieved from somewhere else. Is
>> some part of this filename a unique key?
>
> The real key is up to 1023 chracters long and it's unique, but I have to trim 
> to 256 charactes, by this way is not unique unless I add the hash.
>

The fact that this 1023-character file name is unique is very nice. And no
trimming is needed!
I think you have 2 issues to deal with:

1)  you have files with unique file names, unfortunately with length <=
1023 characters.
Regarding filenames and paths in linux and ext3 you have:

  file name length limit = 255 bytes
  path length limit = 4096

If you try to store such a file directly, you will break the file name
limit. But if you decompose the name into N chunks each of 250
characters, you will be able to preserve the file as a sequence of

 N - 1 nested folders plus a file with a name equal to the Nth
chunk residing in the (N-1)th folder.

Via this decomposition you will translate the unique 1023 character
'file name' into a unique 1023 character 'file path' with length lower
than the path length limit
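
A minimal bash sketch of this decomposition (chunk size of 250 as above;
names are illustrative):

long_to_path() {
    # split a possibly >255-char name into 250-char path components
    local name=$1 path=""
    while [ "${#name}" -gt 250 ]; do
        path="$path${name:0:250}/"
        name=${name:250}
    done
    printf '%s%s\n' "$path" "$name"
}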

 2) You suffer performance degradation when the number of files in a
folder goes beyond 1000.

Filipe Brandenburger has suggested a slick scheme to overcome this
problem, that will work perfectly without a database:

quote start
$ echo -n example.txt | md5sum
e76faa0543e007be095bb52982802abe  -

Then say you take the first 4 digits of it to build the hash: e/7/6/f

Then you store file example.txt at: e/7/6/f/example.txt
quote end

of course, "example.txt" might be a long filename: "exa . 1000
chars here .txt" so after the "hash tree" e/7/6/f you will store
the file path structure described in 1).

As was suggested by Les Mikesell, squid and other products have
already implemented similar strategies, and you might be able to use
either the algorithm or directly the code that implements it. I would
spend some time investigating squid's code. I think squid has to deal
with exactly the same problem - cache the contents of resources whose URLs
might be > 254 characters.


If you use this approach - no need for a database to store hashes!

I did some tests on a Centos 3 system with the following script:

=script start
#! /bin/bash
for a in a b c d e f g j; do
    f=""
    for i in `seq 1 250`; do
        f=$a$f
    done
    mkdir "$f"
    cd "$f"
done
pwd > some_file.txt
=script end

which creates a nested directory structure with a file in it.
Total file path length is > 8 * 250. I had no problems accessing this
file by its full path:

$ find ./ -name some\* -exec cat {} \; | wc -c
   2026


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo

> You mentioned that the data can be retrieved from somewhere else. Is
> some part of this filename a unique key? 

The real key is up to 1023 characters long and it's unique, but I have to trim 
it to 256 characters, so it is not unique unless I add the hash.

>Do you have to track this
> relationship anyway - or age/expire content? 

I have to track the long filename -> short filename relationship. Age is not 
relevant here.

> I'd try to arrange things
> so the most likely scenario would take the fewest operations. Perhaps a
> mix of hash+filename would give direct access 99+% of the time and you
> could move all copies of collisions to a different area. 

yes, it's a good idea, but at this point I don't want to add more complexity to 
my app, and having a separate area for collisions would make it more complex.

>Then you could
> keep the database mapping the full name to the hashed path but you'd
> only have to consult it when the open() attempt fails.

As the long filename is up to 1023 chars long I can't index it with mysql (it 
has a lower max limit); that's why I use the hash, which is indexed. What I do 
is keep a list of just the md5 of the cached files in memory in my app; 
before going to mysql, I first check if it's in the list (really a RB-Tree).





Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
 o wrote:
>> I don't think you've explained the constraint that would make you use
>> mysql or not.
> 
> My original idea was using just the hash as the filename; this way I could 
> have direct access. But the customer rejected this and requested to have 
> part of the long file name (from 11 to 1023 characters). As linux only allows 
> 256 characters in the path and I could get duplicates with the first 256 
> chars, I trim the real filename to around 200 characters and I add the hash 
> at the end (plus a couple of small metadata fields). 
> 
> Yes, these requirements do not make too much sense, but I've tried to 
> convince the customer to use just the hash with no luck (it seems he does not 
> understand well what a hash is although I've tried to explain it several 
> times).

You mentioned that the data can be retrieved from somewhere else.  Is 
some part of this filename a unique key?  Do you have to track this 
relationship anyway - or age/expire content?  I'd try to arrange things 
so the most likely scenario would take the fewest operations.  Perhaps a 
mix of hash+filename would give direct access 99+% of the time and you 
could move all copies of collisions to a different area.  Then you could 
keep the database mapping the full name to the hashed path but you'd 
only have to consult it when the open() attempt fails.

> That's why I need to either a) use mysql or b) do a directory listing.
> 
>> 00/AA/FF/filename
> That would make up to 256^3 directory leaves, which is more than 16 million; 
> since I have around 15M files, I think that this is an excessive number 
> of directories.

I guess that's why squid only uses 16 x 256...

-- 
   Les Mikesell
 lesmikes...@gmail.com



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo

According to my tests the average size per file is around 15KB (although there 
are files from 1KB to 150KB).




Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10, Filipe Brandenburger :
> On Fri, Jul 10, 2009 at 16:21, Alexander
> Georgiev wrote:
>> I would use either only a database, or only the file system. To me -
>> using them both is a violation of KISS.
>
> I disagree with your general statement.
>
> Storing content that is appropriate for files (e.g., pictures) as
> BLOBs in an SQL database only makes it more complex.
>

Please, explain why. I was under the impression that storing large
binary streams is the BLOB's reason to exist.

> Creating "clever" file formats to store relationships between objects
> in a filesystem instead of using a SQL database only makes it more
> complex (and harder to extend!).

Indeed.

> Just because you are using fewer technologies doesn't necessarily make
> it simpler.

Of course, but if one of those technologies can provide both
functionalities without hacks, twists and abuse, I would stay with
that single technology.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Filipe Brandenburger
On Fri, Jul 10, 2009 at 16:21, Alexander
Georgiev wrote:
> I would use either only a database, or only the file system. To me -
> using them both is a violation of KISS.

I disagree with your general statement.

Storing content that is appropriate for files (e.g., pictures) as
BLOBs in an SQL database only makes it more complex.

Creating "clever" file formats to store relationships between objects
in a filesystem instead of using a SQL database only makes it more
complex (and harder to extend!).

Think of a website that stores users' pictures and has social networking
features (maybe like Flickr?). The natural place to store the JPEG
images is the filesystem. The natural place to store user info,
favorites, relations between users, etc. is the SQL database. If you
try to do it differently, it starts looking like you are trying to fit a
square peg in a round hole. It may be possible to do it, but it is
certainly not elegant.

Just because you are using fewer technologies doesn't necessarily make
it simpler.

Filipe


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10,  o :
>
> Ok, I could use mysql, but think we have around 15M entries and I would have
> to add to each a file from 1KB to 150KB; in total the files' size can be
> around 200GB. How will this perform in mysql?
>

in the worst case - 150KB for each of the 15M files - I get:

15000000 * 150 / (1024 * 1024) = 2145.77 (GB)

or roughly 2TB


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo

Ok, I could use mysql, but think we have around 15M entries and I would have to 
add to each a file from 1KB to 150KB; in total the files' size can be around 
200GB. How will this perform in mysql?



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
>
> My original idea was using just the hash as the filename; this way I
> could have direct access. But the customer rejected this and requested to
> have part of the long file name (from 11 to 1023 characters). As linux only
> allows 256 characters in the path and I could get duplicates with the first
> 256 chars, I trim the real filename to around 200 characters and I add the
> hash at the end (plus a couple of small metadata fields).
>
> Yes, these requirements do not make too much sense, but I've tried to
> convince the customer to use just the hash with no luck (it seems he does not
> understand well what a hash is although I've tried to explain it several
> times).
>
> That's why I need to either a) use mysql or b) do a directory listing.

I would use either only a database, or only the file system. To me,
using them both is a violation of KISS.

If you were able to convince them to change the directory layout, and
if you are more comfortable with a database, try to convince them to
use a database.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo

>I don't think you've explained the constraint that would make you use
> mysql or not.

My original idea was using just the hash as the filename; this way I could 
have direct access. But the customer rejected this and requested to have part 
of the long file name (from 11 to 1023 characters). As linux only allows 256 
characters in the path and I could get duplicates with the first 256 chars, I 
trim the real filename to around 200 characters and I add the hash at the end 
(plus a couple of small metadata fields). 

Yes, these requirements do not make too much sense, but I've tried to 
convince the customer to use just the hash with no luck (it seems he does not 
understand well what a hash is although I've tried to explain it several times).

That's why I need to either a) use mysql or b) do a directory listing.

>00/AA/FF/filename
That would make up to 256^3 directory leaves, which is more than 16 million; 
since I have around 15M files, I think that this is an excessive number of 
directories.




Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
 o wrote:
> Hi, after talking with the customer, I finally managed to convince him to use 
> the first characters of the hash as directory names.
> 
> Now I'm in doubt about the following options:
> 
> a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mysql 
> with a hash->filename table, so I can get the file name from the hash and 
> then I can directly access it (I first query mysql for the hash of the file, 
> and then I read the file).
> 
> b) Using 5 levels without mysql, and making a dir listing (due to technical 
> issues, I can only know an approximate file name, so I can't make a direct 
> access here), matching the file name and then reading it. The issue here is 
> that I would have 16^5 leaf directories (more than a million).
> 
> I could also make more combinations of mysql/not mysql and number of levels.
> 
> What do you think it would give the best performance in ext3?

I don't think you've explained the constraint that would make you use 
mysql or not.  I'd avoid it if everything involved can compute the hash 
or is passed the whole path, since it is bound to be slower than doing the 
math, and just on general principles I'd use a tree like 
00/AA/FF/filename (three levels of 2 hex characters) as the first cut, 
although squid uses just two levels with a default of 16 first-level and 
256 2nd-level directories and probably has some good reason for it.
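
A minimal sketch of pre-creating such a squid-style tree (16 first-level,
256 second-level directories; the /cache path is hypothetical):

cd /cache || exit 1
for a in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do
    for b in `seq 0 255`; do
        sub=`printf '%02X' $b`
        mkdir -p "$a/$sub"
    done
done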

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo

Hi, after talking with the customer, I finally managed to convince him to use 
the first characters of the hash as directory names.

Now I'm in doubt about the following options:

a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mysql with 
a hash->filename table, so I can get the file name from the hash and then I can 
directly access it (I first query mysql for the hash of the file, and then I 
read the file).

b) Using 5 levels without mysql, and making a dir listing (due to technical 
issues, I can only know an approximate file name, so I can't make a direct 
access here), matching the file name and then reading it. The issue here is 
that I would have 16^5 leaf directories (more than a million).

I could also make more combinations of mysql/not mysql and number of levels.

What do you think would give the best performance on ext3?

Thanks.




Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On a side note, perhaps this is something that Hadoop would be good with.

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
   http://blogs.sfu.ca/people/jpeltier
MSN : subatomic_s...@hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS

On Thu, 2009-07-09 at 10:09 -0700, James A. Peltier wrote:
> On Thu, 9 Jul 2009,  o wrote:
> 
> >
> > It's possible that I will be able to name the directory tree based on the 
> > hash of the file, so I would get the structure described in one of my 
> > previous posts (4 directory levels, each directory name would be a single 
> > character from 0-9 and A-F, and 65536 (16^4) leaves, each leaf containing 
> > 200 files). Do you think that this would really improve performance? Could 
> > this structure be improved?
> >
> 
> If you don't plan on modifying the file after creation I could see it 
> working.  You could consider the use of a Berkeley DB style database for 
> quick and easy lookups on large amounts of data, but depending on your 
> exact needs maintenance might be a chore and not really feasible.

MUMPS DB will go at it even faster.

> It's an interesting suggestion but I don't know if it would actually work 
> like you describe based on having to always compute the hash first.
> 
Indeed interesting. Actually it would be the same as taking the file to
base64 on final storage. My thoughts are it would work. Even faster
would be to implement this with the table in RAM.

john



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On Thu, 9 Jul 2009,  o wrote:

>
> It's possible that I will be able to name the directory tree based on the 
> hash of the file, so I would get the structure described in one of my previous 
> posts (4 directory levels, each directory name would be a single character 
> from 0-9 and A-F, and 65536 (16^4) leaves, each leaf containing 200 files). 
> Do you think that this would really improve performance? Could this structure 
> be improved?
>

If you don't plan on modifying the file after creation I could see it 
working.  You could consider the use of a Berkeley DB style database for 
quick and easy lookups on large amounts of data, but depending on your 
exact needs maintenance might be a chore and not really feasible.

It's an interesting suggestion but I don't know if it would actually work 
like you describe based on having to always compute the hash first.

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
   http://blogs.sfu.ca/people/jpeltier
MSN : subatomic_s...@hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS

On Wed, 2009-07-08 at 16:14 -0600, Frank Cox wrote:
> On Wed, 08 Jul 2009 18:09:28 -0400
> Filipe Brandenburger wrote:
> 
> > You can hash it and still keep the original filename, and you don't
> > even need a MySQL database to do lookups.
> 
> Now that is slick as all get-out.  I'm really impressed by your scheme, though I
> don't actually have any use for it right at this moment.
> 
> It's really clever. 
---
Yes it is, but think about a SAN server with terabytes of data in
directories dispersed over multiple controllers. I'm kinda curious
how that would scale. That's my problem.

John



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo

>There's C code to do this in squid, and backuppc does it in perl (for a 
>pool directory where all identical files are hardlinked).

Unfortunately I have to write the file with some predefined format, so these 
would not provide the flexibility I need.

>Rethink how you're writing files or you'll be in a world of hurt.

It's possible that I will be able to name the directory tree based on the hash 
of the file, so I would get the structure described in one of my previous posts 
(4 directory levels, each directory name would be a single character from 0-9 
and A-F, and 65536 (16^4) leaves, each leaf containing 200 files). Do you 
think that this would really improve performance? Could this structure be 
improved?

>BTW, you can pretty much say goodbye to any backup solution for this type 
>of project as well.  They'll all die dealing with a file system structure 
>like this.

We don't plan to use backups (if the data gets corrupted, we can retrieve it 
again), but thanks for the advice.

>I think entry level list pricing starts at about $80-100k for
>1 NAS gateway (no disks).

That's far above the budget... 

>depending on the total size of these cache files, as was suggested
>by nate - throw some hardware at it.

Same as above; it seems they don't want to spend more on HW (so I have to deal 
with all the performance issues...). Anyway, if I can get all the directories 
to have around 200 files, I think I will be able to manage this with the 
current hardware.

Thanks for the advice.



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Alexander Georgiev
2009/7/9,  o :
>
> After a quick calculation, that could put around 3200 files per directory (I
> have around 15 million files); I think that above 1000 files the
> performance will start to degrade significantly, but it would be a matter
> of doing some benchmarks.

depending on the total size of these cache files, as was suggested
by nate - throw some hardware at it.

perhaps a hardware RAM device will provide adequate performance:

http://www.tomshardware.com/reviews/hyperos-dram-hard-drive-block,1186.html


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread nate
James A. Peltier wrote:

> There isn't a good file system for this type of thing.  filesystems with
> many very small files are always slow.  Ext3, XFS, JFS are all terrible
> for this type of thing.

I can think of one...though you'll pay out the ass for it: the
Silicon file system from BlueArc (NFS); the file system runs on
FPGAs. Our BlueArcs never had more than 50-100,000 files in any
particular directory (millions in any particular tree), though
they are supposed to be able to handle this sort of thing quite
well.

I think entry level list pricing starts at about $80-100k for
1 NAS gateway (no disks).

Our BlueArcs went end of life earlier this year and we migrated
to an Exanet cluster (runs on top of CentOS 4.4 though uses its
own file system, clustering and NFS services) which is still
very fast though not as fast as BlueArc.

And with block based replication it doesn't matter how many
files there are, performance is excellent for backup, send
data to another rack in your data center or to another
continent over the WAN. In BlueArc's case transparently
send data to a dedupe device or tape drive based on
dynamic access patterns(and move it back automatically
when needed).

http://www.bluearc.com/html/products/file_system.shtml
http://www.exanet.com/default.asp?contentID=231

Both systems scale to gigabytes/second of throughput linearly,
and petabytes of storage without downtime. The only downside
to BlueArc is their back-end storage: they only offer tier
2 storage and only have HDS for tier 1. You can make an HDS
perform but it'll cost you even more... The tier 2 stuff is
too unreliable (LSI Logic). Exanet at least supports
almost any storage out there (we went with 3PAR).

Don't even try to get a netapp to do such a thing.

nate



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009,  o wrote:

>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15 
> million files), and a node can have up to 400,000 files (and I don't have 
> any way to split this amount into smaller ones). As the number of files grows, 
> my application gets slower and slower (the app works something like a 
> cache for another app and I can't redesign the way it distributes files onto 
> disk due to the other app's requirements).
>
> The filesystem I use is ext3 with the following options enabled:
>
> Filesystem features:  has_journal resize_inode dir_index filetype 
> needs_recovery sparse_super large_file
>
> Is there any way to improve performance in ext3? Would you suggest another FS 
> for this situation (this is a production server, so I need a stable one)?
>
> Thanks in advance (and please excuse my bad english).


BTW, you can pretty much say goodbye to any backup solution for this type 
of project as well.  They'll all die dealing with a file system structure 
like this.

  -- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
   http://blogs.sfu.ca/people/jpeltier
MSN : subatomic_s...@hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009,  o wrote:

>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15 
> million files), and a node can have up to 400,000 files (and I don't have 
> any way to split this amount into smaller ones). As the number of files grows, 
> my application gets slower and slower (the app works something like a 
> cache for another app and I can't redesign the way it distributes files onto 
> disk due to the other app's requirements).
>
> The filesystem I use is ext3 with the following options enabled:
>
> Filesystem features:  has_journal resize_inode dir_index filetype 
> needs_recovery sparse_super large_file
>
> Is there any way to improve performance in ext3? Would you suggest another FS 
> for this situation (this is a production server, so I need a stable one)?
>
> Thanks in advance (and please excuse my bad english).

There isn't a good file system for this type of thing.  Filesystems with 
many very small files are always slow.  Ext3, XFS, JFS are all terrible 
for this type of thing.

Rethink how you're writing files or you'll be in a world of hurt.

-- 
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax : 778-782-3045
E-Mail  : jpelt...@sfu.ca
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
   http://blogs.sfu.ca/people/jpeltier
MSN : subatomic_s...@hotmail.com

The point of the HPC scheduler is to
keep everyone equally unhappy.


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
 o wrote:
>> You can hash it and still keep the original filename, and you don't
>> even need a MySQL database to do lookups.
> 
> There is an issue I forgot to mention: the original file name can be up to 
> 1023 characters long. As linux only allows 256 characters in the file path, I 
> could have a (very small) number of collisions; that's why my original idea 
> was using a hash->filename table. So I'm not sure if I could implement that 
> idea in my scenario.
> 
>> For instance: example.txt ->
>> e7/6f/example.txt. That might (or might not) give you a better
>> performance.
> 
> After a quick calculation, that could put around 3200 files per directory (I 
> have around 15 million files), I think that above 1000 files the 
> performance will start to degrade significantly; anyway it would be a matter 
> of doing some benchmarks.

There's C code to do this in squid, and backuppc does it in perl (for a 
pool directory where all identical files are hardlinked).  Source for 
both is available and might be worth a look at their choices for the 
depth of the trees and collision handling (backuppc actually hashes the 
file content, not the name, though).

-- 
   Les Mikesell
lesmikes...@gmail.com



Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo

> You can hash it and still keep the original filename, and you don't
> even need a MySQL database to do lookups.

There is an issue I forgot to mention: the original file name can be up to 
1023 characters long. As linux only allows 256 characters in the file path, I 
could have a (very small) number of collisions; that's why my original idea was 
using a hash->filename table. So I'm not sure if I could implement that idea in 
my scenario.

>For instance: example.txt ->
> e7/6f/example.txt. That might (or might not) give you a better
> performance.

After a quick calculation, that could put around 3200 files per directory (I 
have around 15 million files), I think that above 1000 files the performance 
will start to degrade significantly; anyway it would be a matter of doing some 
benchmarks.
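
If it helps, here is a crude benchmark sketch for the "N files per
directory" question (N and paths are illustrative; caches will be warm
after the creation step, so treat the numbers as optimistic):

N=3200
mkdir -p bench && cd bench || exit 1
seq 1 $N | xargs touch
time for i in `seq 1 1000`; do
    cat "$(($RANDOM % $N + 1))" > /dev/null
done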

Thanks for the advice.




Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Frank Cox
On Wed, 08 Jul 2009 18:09:28 -0400
Filipe Brandenburger wrote:

> You can hash it and still keep the original filename, and you don't
> even need a MySQL database to do lookups.

Now that is slick as all get-out.  I'm really impressed by your scheme, though I
don't actually have any use for it right at this moment.

It's really clever. 

-- 
MELVILLE THEATRE ~ Melville Sask ~ http://www.melvilletheatre.com


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Filipe Brandenburger
Hi,

On Wed, Jul 8, 2009 at 17:59, 
o wrote:
> My original idea was storing the file with a hash of its name, and then store 
> a hash->real filename mapping in mysql. This way I have direct access to the 
> file and I can make a directory hierarchy with the first characters of the 
> hash /c/0/2/a, so I would have 16^4 = 65536 leaves in the directory tree, and 
> the files would be uniformly distributed, with around 200 files per dir 
> (which should not give any performance issues). But the requirements are to 
> use the real file name for the directory tree, which gives the issue.

You can hash it and still keep the original filename, and you don't
even need a MySQL database to do lookups.

For instance, let's take "example.txt" as the file name.

Then let's hash it, say using MD5 (just for the sake of example, a
simpler hash could give you good enough results and be quicker to
calculate):
$ echo -n example.txt | md5sum
e76faa0543e007be095bb52982802abe  -

Then say you take the first 4 digits of it to build the hash: e/7/6/f

Then you store file example.txt at: e/7/6/f/example.txt

The file still has its original name (example.txt), and if you want to
find it, you can just calculate the hash for the name again, in which
case you will find the e/7/6/f, and prepend that to the original name.

I would also suggest that you keep fewer directory levels with more
branches on them; the optimal performance will be achieved by getting
a balance of them. For example, in this case (4 hex digits) you would
have 4 levels with 16 entries each. If you group the hex digits two by
two, you would have (up to) 256 entries on each level, but only two
levels of subdirectories. For instance: example.txt ->
e7/6f/example.txt. That might (or might not) give you a better
performance. A benchmark should tell you which one is better, but in
any case, both of these setups will be many times faster than the one
where you have 400,000 files in a single directory.
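
A minimal bash sketch of the two-level (e7/6f) variant described above
(paths are illustrative):

store() {
    # file goes under a 2x2-hex-digit prefix of the MD5 of the *name*
    local name=$1 src=$2
    local h=`echo -n "$name" | md5sum | cut -c1-4`
    mkdir -p "${h:0:2}/${h:2:2}"
    cp "$src" "${h:0:2}/${h:2:2}/$name"
}
fetch() {
    local name=$1
    local h=`echo -n "$name" | md5sum | cut -c1-4`
    cat "${h:0:2}/${h:2:2}/$name"
}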

Would that help solve your issue?

HTH,
Filipe


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo

(I resent this message as the previous one seems badly formatted; sorry for the mess.)


>Perhaps think about running tune2fs, maybe also consider adding noatime 
 
Yes, I added it and I got a performance increase; anyway, as the number of files 
grows the speed keeps going below an acceptable level.
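
For reference, the usual ways to apply noatime (device and mount point
are hypothetical):

mount -o remount,noatime /dev/sdb1 /cache
# or persistently, in /etc/fstab:
# /dev/sdb1  /cache  ext3  defaults,noatime  1 2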
 


>I saw this article some time back.
>http://www.linux.com/archive/feature/127055


Good idea; I already use mysql for indexing the files, so every time I need to 
make a lookup I don't need to read the entire dir to get the file. Anyway, my 
requirements are keeping the files on disk.


 
>The only way to deal with it (especially if the
>application adds and removes these files regularly) is to every once in a
>while copy the files to another directory, nuke the directory and restore
>from the copy.


Thanks, but there will not be too many file updates once the cache is done, so 
recreating directories would not be very helpful here. The issue is that as the 
number of files grows, both reads from existing files and new insertions get 
slower and slower.


 
>I haven't done, or even seen, any recent benchmarks but I'd expect
>reiserfs to still be the best at that sort of thing.

I've been looking at some benchmarks and reiser seems a bit faster in my 
scenario; however, my problem happens when I have a large number of files, and 
from what I have seen, I'm not sure if reiser would be a fix.
>However even if
>you can improve things slightly, do not let whoever is responsible for
>that application ignore the fact that it is a horrible design that
>ignores a very well known problem that has easy solutions.

My original idea was storing the file with a hash of its name, and then store a 
hash->real filename mapping in mysql. This way I have direct access to the file 
and I can make a directory hierarchy with the first characters of the hash 
/c/0/2/a, so I would have 16^4 = 65536 leaves in the directory tree, and the 
files would be uniformly distributed, with around 200 files per dir (which 
should not give any performance issues). But the requirements are to use the 
real file name for the directory tree, which gives the issue.
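
A one-line sketch of that layout (variable names are hypothetical):

# derive the /c/0/2/a-style path from the md5 of the real name
h=`echo -n "$real_name" | md5sum`
dir="${h:0:1}/${h:1:1}/${h:2:1}/${h:3:1}"   # e.g. c/0/2/a
mkdir -p "$dir" && cp "$src_file" "$dir/"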

 
 
>Did that program also write your address header ?
:)


 
Thanks for the help.
 
 

> From: hhh...@hotmail.com
> To: centos@centos.org
> Date: Wed, 8 Jul 2009 06:27:40 +
> Subject: [CentOS] Question about optimal filesystem with many small files.
>
>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15 
> million files), and a node can have up to 400,000 files (and I don't have 
> any way to split this amount into smaller ones). As the number of files grows, 
> my application gets slower and slower (the app works something like a 
> cache for another app and I can't redesign the way it distributes files onto 
> disk due to the other app's requirements).
>
> The filesystem I use is ext3 with the following options enabled:
>
> Filesystem features: has_journal resize_inode dir_index filetype 
> needs_recovery sparse_super large_file
>
> Is there any way to improve performance in ext3? Would you suggest another FS 
> for this situation (this is a production server, so I need a stable one)?
>
> Thanks in advance (and please excuse my bad english).
>
>
 
_
News, entertainment and everything you care about at Live.com. Get it now!
http://www.live.com/getstarted.aspx
_
Connect to the next generation of MSN Messenger 
http://imagine-msn.com/messenger/launch80/default.aspx?locale=en-us&source=wlmailtagline
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo


>Perhaps think about running tune2fs maybe also consider adding noatime 

Yes, I added it and I got a perfomance increase, anyway as the number of fields 
grows the speed keeps going below an acceptable level.

>I saw this article some time back.

http://www.linux.com/archive/feature/127055
Good idea, I already use mysql for indexing the files, so everytime I need to 
make a lookup I don't need the entire dir and then get the file, anyway my 
requirements are keeping the files on disk.

>The only way to deal with it (especially if the
>application adds and removes these files regularly) is to every once in a
>while copy the files to another directory, nuke the directory and restore
>from the copy.

Thanks, but there will not be too many file updates once the cache is done, so
recreating directories would not be very helpful here. The issue is that as
the number of files grows, both reads from existing files and new insertions
get slower and slower.

>I haven't done, or even seen, any recent benchmarks but I'd expect
>reiserfs to still be the best at that sort of thing.

I've been looking at some benchmarks and reiser seems a bit faster in my
scenario; however, my problem happens when I have a large number of files, so
from what I have seen I'm not sure reiser would be a fix.

>However even if
>you can improve things slightly, do not let whoever is responsible for
>that application ignore the fact that it is a horrible design that
>ignores a very well known problem that has easy solutions.

My original idea was storing each file under a hash of its name, and then
storing a hash->real filename mapping in mysql. This way I have direct access
to the file, and I can make a directory hierarchy with the first characters of
the hash (/c/0/2/a), so I would have 16^4 = 65536 leaves in the directory
tree, with the files distributed almost uniformly, around 200 files per dir
(which should not cause any performance issues). But the requirement is to use
the real file name for the directory tree, which is what causes the issue.
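
As a quick bash sketch of that idea (untested; /data/cache and the file name
are only examples):

  name="some_real_file_name.dat"   # example input
  h=$(printf '%s' "$name" | md5sum | awk '{print $1}')
  # first four hex chars of the md5 give a four-level prefix like /c/0/2/a
  dir="/data/cache/${h:0:1}/${h:1:1}/${h:2:1}/${h:3:1}"
  mkdir -p "$dir"
  cp "$name" "$dir/$h"   # store under the hash...
  # ...and record hash -> real name in mysql for lookups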


>Did that program also write your address header?
:)

Thanks for the help.



> From: hhh...@hotmail.com
> To: centos@centos.org
> Date: Wed, 8 Jul 2009 06:27:40 +
> Subject: [CentOS] Question about optimal filesystem with many small files.
>
>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount into smaller ones). As the number of files
> grows, my application gets slower and slower (the app works something like a
> cache for another app and I can't redesign the way it distributes files onto
> disk due to the other app requirements).
>
> The filesystem I use is ext3 with the following options enabled:
>
> Filesystem features: has_journal resize_inode dir_index filetype
> needs_recovery sparse_super large_file
>
> Is there any way to improve performance in ext3? Would you suggest another FS
> for this situation (this is a production server, so I need a stable one)?
>
> Thanks in advance (and please excuse my bad English).
>
> ___
> CentOS mailing list
> CentOS@centos.org
> http://lists.centos.org/mailman/listinfo/centos

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Gary Greene
On 7/8/09 8:56 AM, "Les Mikesell"  wrote:
>  o wrote:
>> Hi,
>> 
>> I have a program that writes lots of files to a directory tree (around 15
>> million files), and a node can have up to 40 files (and I don't have
>> any way to split this amount into smaller ones). As the number of files
>> grows, my application gets slower and slower (the app works something like a
>> cache for another app and I can't redesign the way it distributes files onto
>> disk due to the other app requirements).
>> 
>> The filesystem I use is ext3 with the following options enabled:
>> 
>> Filesystem features:  has_journal resize_inode dir_index filetype
>> needs_recovery sparse_super large_file
>> 
>> Is there any way to improve performance in ext3? Would you suggest another FS
>> for this situation (this is a production server, so I need a stable one)?
>> 
>> Thanks in advance (and please excuse my bad English).
> 
> I haven't done, or even seen, any recent benchmarks but I'd expect
> reiserfs to still be the best at that sort of thing.   However even if
> you can improve things slightly, do not let whoever is responsible for
> that application ignore the fact that it is a horrible design that
> ignores a very well known problem that has easy solutions.  And don't
> ever do business with someone who would write a program like that again.
>   Any way you approach it, when you want to write a file the system must
> check to see if the name already exists, and if not, create it in an
> empty space that it must also find - and this must be done atomically so
> the directory must be locked against other concurrent operations until
> the update is complete.  If you don't index the contents the lookup is a
> slow linear scan - if you do, you then have to rewrite the index on
> every change so you can't win.  Sensible programs that expect to access
> a lot of files will build a tree structure to break up the number that
> land in any single directory (see squid for an example).  Even more
> sensible programs would re-use some existing caching mechanism like
> squid or memcached instead of writing a new one badly.

In many ways this is similar to issues you'll see in a very active mail or
news server that uses maildir, where the directory entries get too large to be
traversed quickly. The only way to deal with it (especially if the
application adds and removes these files regularly) is to every once in a
while copy the files to another directory, nuke the directory and restore
from the copy. This is why databases are better for this kind of intensive
data caching.
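
A rough sketch of that rotation (paths are examples; quiesce the application
while it runs):

  cp -a /var/cache/app /var/cache/app.new   # fresh copy gets compact dentries
  mv /var/cache/app /var/cache/app.old      # swap the directories...
  mv /var/cache/app.new /var/cache/app
  rm -rf /var/cache/app.old                 # ...then nuke the bloated one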

-- 
Gary L. Greene, Jr.
IT Operations
Minerva Networks, Inc.
Cell:  (650) 704-6633
Phone: (408) 240-1239

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Kwan Lowe
On Wed, Jul 8, 2009 at 2:27 AM,  o <hhh...@hotmail.com> wrote:

>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount into smaller ones). As the number of files
> grows, my application gets slower and slower (the app works something
> like a cache for another app and I can't redesign the way it distributes
> files onto disk due to the other app requirements).
>
> The filesystem I use is ext3 with the following options enabled:
>
> Filesystem features:  has_journal resize_inode dir_index filetype
> needs_recovery sparse_super large_file
>
> Is there any way to improve performance in ext3? Would you suggest another
> FS for this situation (this is a production server, so I need a stable one)?
>

I saw this article some time back.

http://www.linux.com/archive/feature/127055

I've not implemented it, but from past experience you may lose some
performance initially, while the database-backed fs performance might be more
consistent as the number of files grows.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
 o wrote:
> Hi,
> 
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount into smaller ones). As the number of files
> grows, my application gets slower and slower (the app works something like a
> cache for another app and I can't redesign the way it distributes files onto
> disk due to the other app requirements).
> 
> The filesystem I use is ext3 with the following options enabled:
> 
> Filesystem features:  has_journal resize_inode dir_index filetype
> needs_recovery sparse_super large_file
> 
> Is there any way to improve performance in ext3? Would you suggest another FS
> for this situation (this is a production server, so I need a stable one)?
> 
> Thanks in advance (and please excuse my bad English).

I haven't done, or even seen, any recent benchmarks but I'd expect 
reiserfs to still be the best at that sort of thing.   However even if 
you can improve things slightly, do not let whoever is responsible for 
that application ignore the fact that it is a horrible design that 
ignores a very well known problem that has easy solutions.  And don't 
ever do business with someone who would write a program like that again. 
  Any way you approach it, when you want to write a file the system must 
check to see if the name already exists, and if not, create it in an 
empty space that it must also find - and this must be done atomically so 
the directory must be locked against other concurrent operations until 
the update is complete.  If you don't index the contents the lookup is a 
slow linear scan - if you do, you then have to rewrite the index on 
every change so you can't win.  Sensible programs that expect to access 
a lot of files will build a tree structure to break up the number that 
land in any single directory (see squid for an example).  Even more 
sensible programs would re-use some existing caching mechanism like 
squid or memcached instead of writing a new one badly.
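
To sketch that squid-style spread in shell (16 and 256 are squid's default
first- and second-level bucket counts; any stable hash of the name would do):

  name="whatever.dat"   # example input
  h=$(printf '%s' "$name" | cksum | awk '{print $1}')
  l1=$(( h % 16 ))          # first-level bucket
  l2=$(( (h / 16) % 256 ))  # second-level bucket
  printf 'store under %02X/%02X/%s\n' "$l1" "$l2" "$name"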

-- 
   Les Mikesell
lesmikes...@gmail.com

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-07 Thread Per Qvindesland
Perhaps think about running tune2fs; maybe also consider adding noatime.
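
Something along these lines, perhaps (untested; /dev/sdb1 is only an example
device, and the e2fsck step needs the filesystem unmounted):

  tune2fs -l /dev/sdb1 | grep features   # see what is enabled now
  tune2fs -O dir_index /dev/sdb1         # enable hashed directory indexes
  e2fsck -fD /dev/sdb1                   # rebuild existing directories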

Regards
Per
E-mail: p...@norhex.com [1]
http://www.linkedin.com/in/perqvindesland [2]
--- Original message follows ---
SUBJECT: Re: [CentOS] Question about optimal filesystem with many
small files.
FROM:  Niki Kovacs
TO: "CentOS mailing list"
DATE: 08-07-2009 8:41

 o wrote:
> Hi,
> 
> I have a program that writes lots of files to a directory tree

Did that program also write your address header?

:o)
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos

Links:
--
[1] http://webmail.norhex.com/#
[2] http://www.linkedin.com/in/perqvindesland

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-07 Thread Niki Kovacs
 o wrote:
> Hi,
> 
> I have a program that writes lots of files to a directory tree 

Did that program also write your address header?

:o)
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos