Note that mod-11 will give you 11 items, 0..10! Squid (and Apache's mod_proxy) use a system something like:
- hash the URL.
- hex-encode or base64-encode the hash.
- peel off 2 or 3 characters at a time for as many directories deep as
  you want.
- the remaining characters are the filename in that directory.

Later,
scott

On Nov 13, 2007 12:08 PM, Andreas Volz <[EMAIL PROTECTED]> wrote:
> On Tue, 13 Nov 2007 07:18:19 -0600, John Stanton wrote:
>
> > In a cache situation I would expect that keeping the binary data in
> > files would be preferable because you can use far more efficient
> > mechanisms for loading them into your cache and in particular in
> > transmitting them downstream. Your DB only needs to store a pathname.
> >
> > Just be wary of directory size, and do not put them all in the one
> > directory.
>
> I noticed that problem in my current situation. I don't know the file
> number and size limit in Linux or Windows, but I'm sure there is a
> limit.
>
> My main problem is to find a good algorithm to name the cached files
> and split them into directories. My current idea is:
>
> 1) Put the URL into DB
> 2) Use a hash function to create a unique name for the cache file
> 3) Insert the hash name into the same row as the URL
>
> The problem with many files in a directory:
>
> 4) Use e.g. 'modulo 11' on the URL hash value to get one of ten
> directory names where to find a file.
>
> But this has the drawback of having a static number of cache
> directories. The algorithm isn't scalable with a growing number of
> files.
>
> Do you think this is a good way? Or do you have another idea?
>
> regards
> Andreas
>
> -----------------------------------------------------------------------------
> To unsubscribe, send email to [EMAIL PROTECTED]
> -----------------------------------------------------------------------------
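
For illustration, the hash-and-peel scheme listed at the top of this message could be sketched in Python roughly as follows. The function name `cache_path`, the choice of SHA-1, and the `depth` and `width` parameters are assumptions made for this example; this is not a description of how Squid or mod_proxy implement it internally.

```python
import hashlib
from pathlib import PurePosixPath


def cache_path(url: str, depth: int = 2, width: int = 2) -> PurePosixPath:
    """Map a URL to a nested relative path derived from its hash.

    Hypothetical helper: `depth` is the number of directory levels and
    `width` is how many hex characters are peeled off per level.
    """
    # Hash the URL and hex-encode the digest (40 hex chars for SHA-1).
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    if depth * width >= len(digest):
        raise ValueError("depth * width must leave characters for the filename")
    # Peel off `width` characters at a time, one chunk per directory level.
    dirs = [digest[i * width:(i + 1) * width] for i in range(depth)]
    # The remaining characters become the filename in the last directory.
    filename = digest[depth * width:]
    return PurePosixPath(*dirs, filename)


# Side note on the modulo idea: x % 11 takes 11 distinct values (0..10).
buckets = sorted({n % 11 for n in range(1000)})

path = cache_path("http://example.com/images/logo.png")
print(path)          # two 2-character directories, then a 36-character name
print(len(buckets))  # number of distinct 'modulo 11' buckets
```

The sketch uses hex rather than base64 because the standard base64 alphabet contains "/" and "+", which are awkward in file names. With two hex characters per level, each directory has at most 256 subdirectories, and adding a level multiplies the number of leaf directories by 256 without changing the hash itself.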