Re: How many files can I put in one directory?

2000-06-27 Thread David Scheidt

On 26 Jun 2000, Chris Shenton wrote:

:I was considering this for a project I developed: web up/download of
:lots of large files. I was using MySQL and some of the folks on that
:list recommended not storing large files in the DB: even though the
:disk consumption is the same, if it's in a DB you can't spread it
:across partitions as space requirements grow.

That's a failing of your DBMS, and not of a database in general.  I add
space to existing databases under Sybase fairly often.  


David



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



Re: How many files can I put in one directory?

2000-06-26 Thread Chris Shenton

On 23-Jun-00 Nicole Harrington. wrote:
> > Yeah.. This is why databases were invented :)
> Hey I agree... However even if the html was databased.. (working on that
> now) the custom graphics cannot be. (yet)

On Fri, 23 Jun 2000 13:12:48 +0930 (CST), "Daniel O'Connor" [EMAIL PROTECTED] said:

Daniel> Hmm.. can't you do binary blobs in a DB and change the image
Daniel> URL's to be cgi requests?

I was considering this for a project I developed: web up/download of
lots of large files. I was using MySQL and some of the folks on that
list recommended not storing large files in the DB: even though the
disk consumption is the same, if it's in a DB you can't spread it
across partitions as space requirements grow.

So I store the file path in the DB and the actual file on the UNIX
filesystem. To reduce search time I use a two-level directory
hierarchy with 256 subdirectories at each level. To distribute files
evenly, I store each file under a name which is the MD5 hash of the
filename, time, etc, etc. This gives me a 32-char name of [0-9a-f].

So if file foo.tar.gz hashes to name

cafebabedeadbeef0123456789abcdef

it is stored under

/filestore/ca/fe/cafebabedeadbeef0123456789abcdef

This gives me 256 * 256 = 65536 directories. My requirement was to
store at least 10 million files, which works out to about 150 files
per directory (10,000,000 / 65536 is roughly 153) -- easy for UNIX to
get to quickly. It's been working very well for me.
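
A minimal sketch of the layout described above, in Python. The post only says the hash covers "filename, time, etc", so the exact hash input, the function names, and the `root` default here are assumptions for illustration.

```python
import hashlib
import posixpath
import time

def store_path(filename, root="/filestore", when=None):
    """Return a two-level hashed path for filename under root."""
    # Assumption: hash the filename plus a timestamp. The original post does
    # not spell out exactly which inputs go into the MD5.
    when = time.time() if when is None else when
    digest = hashlib.md5(f"{filename}:{when}".encode("utf-8")).hexdigest()
    # The first two pairs of hex digits select one of 256 * 256 directories.
    return posixpath.join(root, digest[0:2], digest[2:4], digest)

def files_per_directory(total_files, levels=2, fanout=256):
    """Average files per leaf directory, assuming the hash spreads names evenly."""
    return total_files / (fanout ** levels)
```

With these assumptions, `files_per_directory(10_000_000)` comes out near 153, in line with the "about 150" figure above.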








Re: How many files can I put in one directory?

2000-06-25 Thread Doug Barton

"Nicole Harrington." wrote:
 
> On 22-Jun-00 Luigi Rizzo wrote:
> > > Hello
> > > I have a user who needs to store a large amount of small html files. Like
> > > around 2 million...
> >
> > that sounds insane! Because a name is a name, why don't they call
> > those files xx/yy/zz/tt.html and the like, to get down to a more
> > reasonable # of files per directory.
>
> Well.. Yea that's the idea.. But what is a reasonable number? 10K 100K etc.

I heard 10k a while back from several sources I considered reliable.
I've always stuck to that limit and never had a problem on FreeBSD or
Sun. I've also had very good luck with a hashed directory structure,
such as: 

/a/b/c/abcfile

The level of hashing, and the number of characters per level can be
determined by your expected number of files, naming schemes, etc.
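
One way to sketch the /a/b/c/abcfile layout in Python. The function name and its parameters are hypothetical, and it assumes names are plain filenames (no slashes) at least as long as the prefix being used.

```python
import posixpath

def prefix_path(name, levels=3, chars_per_level=1, root="/"):
    """Build a path like /a/b/c/abcfile from the leading characters of name."""
    needed = levels * chars_per_level
    if len(name) < needed:
        raise ValueError(f"name needs at least {needed} characters")
    # Slice the leading characters into one directory component per level.
    parts = [name[i:i + chars_per_level] for i in range(0, needed, chars_per_level)]
    return posixpath.join(root, *parts, name)
```

Calling `prefix_path("abcfile")` gives `/a/b/c/abcfile`; raising `chars_per_level` or `levels` trades deeper trees for fewer files per directory.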

Good luck,

Doug
-- 
"Live free or die"
- State motto of my ancestral homeland, New Hampshire




Re: How many files can I put in one directory?

2000-06-23 Thread Luigi Rizzo

> > > I have a user who needs to store a large amount of small html files. Like
> > > around 2 million...
> >
> > that sounds insane! Because a name is a name, why don't they call
> > those files xx/yy/zz/tt.html and the like, to get down to a more
> > reasonable # of files per directory.
>
> Well.. Yea that's the idea.. But what is a reasonable number? 10K 100K etc.

I would not go above 1K, and probably even lower, so that a directory fits in
1-2 pages.
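
As a rough worked example of that limit, here is a small Python sketch that computes how many directory levels are needed to stay under a per-directory cap. It assumes files spread evenly across leaf directories; the function name and the fan-out values are illustrative, not from the thread.

```python
def levels_needed(total_files, max_per_dir, fanout):
    """Smallest number of directory levels keeping leaf directories <= max_per_dir."""
    if fanout < 2 or max_per_dir < 1:
        raise ValueError("fanout must be >= 2 and max_per_dir >= 1")
    levels = 0
    leaves = 1  # number of leaf directories at the current depth
    while total_files > leaves * max_per_dir:
        leaves *= fanout
        levels += 1
    return levels
```

For 2 million files with a 1K cap and 256 subdirectories per level, this gives two levels; with a 10K cap, one level of 256 directories is enough.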

cheers
luigi
---+-
  Luigi RIZZO, [EMAIL PROTECTED]  . Dip. di Ing. dell'Informazione
  http://www.iet.unipi.it/~luigi/  . Universita` di Pisa
  TEL/FAX: +39-050-568.533/522 . via Diotisalvi 2, 56126 PISA (Italy)
  Mobile   +39-347-0373137
---+-





Re: How many files can I put in one directory?

2000-06-23 Thread Murray Stokely

On Fri, 23 Jun 2000, Daniel O'Connor wrote:
% A chunk of binary data you can put in a DB.
% 
% Like an image, or an mpeg, or a sound file..
% 
% AFAIK postgres supports BLOBS.

  So does MySQL.  You can display a BLOB using a Perl/DBI cgi script
with about 5 lines of code.  Just print the correct Content-type
header and then the contents of the BLOB. 
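
The same idea, sketched in Python with the standard-library sqlite3 module standing in for Perl/DBI and MySQL. The table and column names are hypothetical, and a real CGI script would write these bytes to standard output.

```python
import sqlite3

def blob_response(conn, image_id):
    """Return the bytes of a CGI-style response for one stored image."""
    row = conn.execute(
        "SELECT content_type, data FROM images WHERE id = ?", (image_id,)
    ).fetchone()
    if row is None:
        return b"Status: 404 Not Found\r\nContent-Type: text/plain\r\n\r\nnot found\n"
    content_type, data = row
    # Content-type header, a blank line, then the BLOB contents unchanged.
    header = f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
    return header + bytes(data)

# Demo database; the schema is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, content_type TEXT, data BLOB)")
conn.execute("INSERT INTO images VALUES (1, 'image/gif', ?)", (b"GIF89a...",))
```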

- Murray






Re: How many files can I put in one directory?

2000-06-23 Thread Jeroen C. van Gelderen

"Nicole Harrington." wrote:
 
> On 23-Jun-00 Daniel O'Connor wrote:
> >
> > On 23-Jun-00 Nicole Harrington. wrote:
> > > > Yeah.. This is why databases were invented :)
> > > Hey I agree... However even if the html was databased.. (working on that
> > > now)
> > > the custom graphics cannot be. (yet)
> >
> > Hmm.. can't you do binary blobs in a DB and change the image URL's to be cgi
> > requests?
>
> I dunno.. whats a "binary Blob"?

Pleonasm? :-) BLOB = Binary Large OBject.

From the TransBase SQL Reference Manual:
"TransBase does not interpret the contents of a BLOB. Each field of type 
BLOB either contains the NULL value or a BLOB object. The only operations
on BLOBs are creation, insertion, update of a BLOB, testing a BLOB on 
being the NULL value, extracting a BLOB via the field name in the SELECT 
clause, extracting a subrange of a BLOB (i.e. an adjacent byte range of 
a BLOB), and extracting the size of a BLOB."
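
A few of those operations (NULL test, size, byte subrange) can be tried out with Python's sqlite3 module. This is SQLite rather than TransBase, so the function names below are SQLite's, and the `docs` table is made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body BLOB)")
db.execute("INSERT INTO docs VALUES (1, ?)", (b"0123456789",))
db.execute("INSERT INTO docs VALUES (2, NULL)")

def blob_info(db, doc_id, start, count):
    """Return (is_null, size, subrange) for one BLOB; start is 1-based as in SQL."""
    return db.execute(
        "SELECT body IS NULL, length(body), substr(body, ?, ?) FROM docs WHERE id = ?",
        (start, count, doc_id),
    ).fetchone()
```

The database never looks inside the bytes: it only reports whether the value is NULL, how long it is, and which bytes sit in a given range.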

Cheers,
Jeroen
-- 
Jeroen C. van Gelderen  o  _ _ _
[EMAIL PROTECTED]  _o /\_   _ \\o  (_)\__/o  (_)
  _ \_   _(_) (_)/_\_| \   _|/' \/
 (_)(_) (_)(_)   (_)(_)'  _\o_





How many files can I put in one directory?

2000-06-22 Thread Nicole Harrington.


 Hello
 I have a user who needs to store a large amount of small html files. Like
around 2 million...

 Assuming FreeBSD 4.0-Stable with Soft Updates, what is a sane number that can
be handled per directory?


  Thanks!!

   Nicole



 
 [EMAIL PROTECTED] |\ __ /|   (`\   http://www.unixgirl.com/
 [EMAIL PROTECTED] | o_o  |__  ) )  http://www.dangermouse.org/
//  \\
---(((---(((-
 
 --  Powered by Coka-Cola and FreeBSD  --
-- Strong enough for a man - But made for a Woman --
--   OWNED?  MS: Who's Been In/Virused Your Computer Today? --

 ---
 





Re: How many files can I put in one directory?

2000-06-22 Thread Luigi Rizzo

 
> Hello
> I have a user who needs to store a large amount of small html files. Like
> around 2 million...

that sounds insane! Because a name is a name, why don't they call
those files xx/yy/zz/tt.html and the like, to get down to a more
reasonable # of files per directory.

Or use a single file and a cgi which extracts things from the right place.
In such a context, I assume that the best place to do the name lookup
is in the app, not in the kernel.
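
A minimal sketch of the single-file idea in Python: pages are concatenated into one blob and the application keeps its own name-to-offset index. The function names and the in-memory index are assumptions; a real CGI would open the data file from disk and persist the index somewhere.

```python
import io

def pack(pages):
    """Concatenate pages into one blob; return (blob, index).

    index maps each name to its (offset, length) inside the blob.
    """
    blob = bytearray()
    index = {}
    for name, content in pages.items():
        index[name] = (len(blob), len(content))
        blob += content
    return bytes(blob), index

def extract(store, index, name):
    """Read one page back from a seekable binary file object."""
    offset, length = index[name]
    store.seek(offset)
    return store.read(length)

# Usage: BytesIO stands in for a data file opened in binary mode.
pages = {"a.html": b"<p>a</p>", "b.html": b"<p>bb</p>"}
blob, index = pack(pages)
store = io.BytesIO(blob)
print(extract(store, index, "b.html"))
```

The lookup is a dictionary access plus one seek, so the kernel never has to search a large directory.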

cheers
luigi

> Assuming FreeBSD 4.0-Stable with Soft Updates, what is a sane number that can
> be handled per directory?
>
> Thanks!!
>
> Nicole
 
 
 
  



Re: How many files can I put in one directory?

2000-06-22 Thread Kris Kirby

On Thu, 22 Jun 2000, Don Lewis wrote:
> other ways of quickly finding the desired directory entry.  Even so,
> you probably still would want to avoid doing an "ls" or an "echo *" ;-)

Heh. I once wrote a program that made 1K files until it ran out of disk
space. It took the 386DX-40 about two days to run out of inodes. The
purpose was to find some rather elusive IDE bad sectors. I soon tired of
such attempts, as I spent two days writing and another two rm'ing the
mess. newfs helped, but I had other bad sectors to deal with. I soon
removed that hard drive. I think I smashed it somewhere. I was once given a
whole pile of 40 MB and 80 MB SCSI drives (3.5"). I broke a few but the
novelty wore off. It's tiring work destroying hard drives.

-
Kris Kirby, KE4AHR  | TGIFreeBSD... 'Nuff said.
[EMAIL PROTECTED]|
---
"Fate, it seems, is not without a sense of irony."






Re: How many files can I put in one directory?

2000-06-22 Thread Arun Sharma

On Wed, 21 Jun 2000 23:42:37 -0700 (PDT), Nicole Harrington. [EMAIL PROTECTED] wrote:

> Hello
> I have a user who needs to store a large amount of small html files. Like
> around 2 million...
>
> Assuming FreeBSD 4.0-Stable with Soft Updates, what is a sane number that can
> be handled per directory?

I investigated this for about 25k files and it seemed to be fine. Note that
if you keep the in-memory directory cache (which is hashed) large enough,
you might be able to get away with a one-time linear search cost in the
directory. So your worst case is scanning two million filenames in a directory.
The average case can be made O(1).

Picking names intelligently is also a good idea -

fbar123456789

is a bad idea, because the string comparison routine has to skip over
the first 50 characters before it finds a mismatch. I think netscape
commits this sin.
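
To illustrate the cost of a long shared prefix, here is a small Python sketch that counts how many characters a naive left-to-right comparison examines. It models the comparison only; it is not the kernel's actual lookup code, and the function name is made up.

```python
def chars_compared(a, b):
    """Count characters a left-to-right comparison looks at before deciding."""
    examined = 0
    for x, y in zip(a, b):
        examined += 1
        if x != y:
            break  # first mismatch: the comparison can stop here
    return examined

shared = "x" * 50
# Distinguishing character last: all 50 shared characters are scanned first.
slow = chars_compared(shared + "1", shared + "2")
# Distinguishing character first: the mismatch is found immediately.
fast = chars_compared("1" + shared, "2" + shared)
```

Putting the part of the name that varies at the front keeps each failed comparison short.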

-Arun





Re: How many files can I put in one directory?

2000-06-22 Thread Robert Watson

On Thu, 22 Jun 2000, Daniel O'Connor wrote:

> On 22-Jun-00 Luigi Rizzo wrote:
> > that sounds insane! Because a name is a name, why don't they call
> > those files xx/yy/zz/tt.html and the like, to get down to a more
> > reasonable # of files per directory.
> >
> > Or use a single file and a cgi which extracts things from the right place.
> > In such a context, i assume that the best place to do the name lookup
> > is in the app, not in the kernel.
>
> Yeah.. This is why databases were invented :)
>
> FYI 4 in a directory really makes directory listings slow.. 2 million would
> suck :)

Actually, I'd choose a higher starting suck number -- if you're thinking
of ls, remember that ls attempts to read all of the entries into memory
and sort them.  The directory listing becomes much faster if you use
``-f'', which prevents sorting of output.  I have a cyrus server with
easily 50,000 entries in many directories and that has not been a serious
impediment to correct functioning, although no doubt there is a high
performance impact.
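
The difference between sorted and unsorted listings can be sketched in Python. Here `os.scandir` plays the role of `ls -f` by yielding entries in whatever order the OS returns them; unlike `ls -f`, it does not report the `.` and `..` entries. The function names are illustrative.

```python
import os

def iter_names(path):
    """Yield entry names one at a time, in OS order, without sorting."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name

def sorted_names(path):
    """Read every name into memory and sort, roughly what plain ls does."""
    return sorted(os.listdir(path))
```

A caller that only needs to visit each entry can loop over `iter_names(path)` and never hold the whole directory in memory at once.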

One possibility here, if the names of the files don't matter, is to make
use of Adrian Chadd's IFS, which avoids the issue by providing direct
inode # access to an FFS disk layout.  When opening a file, the inode
number is returned so that you can handle meta-data in your own database
(possibly on the same drive), which permits custom name mechanisms
optimized for seeks, etc.  This would be great, for example, for AFS and
Coda client caches and server storage, where the distributed file systems
provide their own storage for meta-data in internal databases (and in the
case of Coda, in a transactional database).  Name lookup against the IFS
space is O(1).

This code is not yet committed, but is definitely of interest.

  Robert N M Watson 

[EMAIL PROTECTED]  http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services






Re: How many files can I put in one directory?

2000-06-22 Thread Nicole Harrington.


On 22-Jun-00 Luigi Rizzo wrote:
> > Hello
> > I have a user who needs to store a large amount of small html files. Like
> > around 2 million...
>
> that sounds insane! Because a name is a name, why don't they call
> those files xx/yy/zz/tt.html and the like, to get down to a more
> reasonable # of files per directory.

 Well.. Yea that's the idea.. But what is a reasonable number? 10K 100K etc.

   Nicole

> Or use a single file and a cgi which extracts things from the right place.
> In such a context, i assume that the best place to do the name lookup
> is in the app, not in the kernel.
>
>   cheers
>   luigi
>
> > Assuming FreeBSD 4.0-Stable with Soft Updates, what is a sane number that
> > can be handled per directory?
> >
> > Thanks!!
> >
> > Nicole
 
 
 
  



Re: How many files can I put in one directory?

2000-06-22 Thread Nicole Harrington.


On 22-Jun-00 Daniel O'Connor wrote:
>
> On 22-Jun-00 Luigi Rizzo wrote:
> > that sounds insane! Because a name is a name, why don't they call
> > those files xx/yy/zz/tt.html and the like, to get down to a more
> > reasonable # of files per directory.
> >
> > Or use a single file and a cgi which extracts things from the right place.
> > In such a context, i assume that the best place to do the name lookup
> > is in the app, not in the kernel.
>
> Yeah.. This is why databases were invented :)

 Hey I agree... However even if the html was databased.. (working on that now)
the custom graphics cannot be. (yet)

> FYI 4 in a directory really makes directory listings slow.. 2 million
> would suck :)

 Well.. Yea. But assuming you are using Apache and requesting the page and
graphics via a fully formed URL it should be pretty high.. I would assume.


   Nicole





Re: How many files can I put in one directory?

2000-06-22 Thread Daniel O'Connor


On 23-Jun-00 Nicole Harrington. wrote:
> > Yeah.. This is why databases were invented :)
> Hey I agree... However even if the html was databased.. (working on that
> now)
> the custom graphics cannot be. (yet)

Hmm.. can't you do binary blobs in a DB and change the image URL's to be cgi
requests?

---
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum





Re: How many files can I put in one directory?

2000-06-22 Thread Nicole Harrington.


On 23-Jun-00 Daniel O'Connor wrote:
>
> On 23-Jun-00 Nicole Harrington. wrote:
> > > Yeah.. This is why databases were invented :)
> > Hey I agree... However even if the html was databased.. (working on that
> > now)
> > the custom graphics cannot be. (yet)
>
> Hmm.. can't you do binary blobs in a DB and change the image URL's to be cgi
> requests?

 I dunno.. what's a "binary Blob"?

 Also wouldn't this make the DB HUGE

   Nicole






Re: How many files can I put in one directory?

2000-06-22 Thread Daniel O'Connor


On 23-Jun-00 Nicole Harrington. wrote:
> I dunno.. what's a "binary Blob"?

A chunk of binary data you can put in a DB.

Like an image, or an mpeg, or a sound file..

AFAIK postgres supports BLOBS.

> Also wouldn't this make the DB HUGE

Yep :)
