Re: [OT] Data store options

2001-11-08 Thread Joachim Zobel
At 08:59 08.11.01 -0800, you wrote: My specifics are that I have a need to permanently store tens of thousands of smallish (5K) items. I'm currently using a simple file system store, one file per record, all in the same directory. Clearly, I need to move into a directory tree for better

[OT] Data store options

2001-11-08 Thread Bill Moseley
Hi, <verbose> I'm looking for a little discussion on selecting a data storage method, and I'm posting here because Cache::Cache often is discussed here (along with Apache::Session). And people here are smart, of course ;). Basically, I'm trying to understand when to use Cache::Cache, vs.

Re: [OT] Data store options

2001-11-08 Thread Perrin Harkins
Basically, I'm trying to understand when to use Cache::Cache, vs. Berkeley DB, and locking issues. (Perrin, I've been curious why at etoys you used Berkeley DB over other caching options, such as Cache::Cache). Cache::Cache didn't exist at the time. BerkeleyDB seemed easier than rolling our

Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread Barrie Slaymaker
On Thu, Nov 08, 2001 at 08:59:55AM -0800, Bill Moseley wrote: Hi, <verbose> I'm looking for a little discussion on selecting a data storage method, and I'm posting here because Cache::Cache often is discussed here (along with Apache::Session). And people here are smart, of course ;).

Re: Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread Jeffrey W. Baker
On Thu, 2001-11-08 at 10:11, Barrie Slaymaker wrote: On Thu, Nov 08, 2001 at 08:59:55AM -0800, Bill Moseley wrote: Hi, <verbose> I'm looking for a little discussion on selecting a data storage method, and I'm posting here because Cache::Cache often is discussed here (along with

Re: Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread DeWitt Clinton
On Thu, Nov 08, 2001 at 01:11:21PM -0500, Barrie Slaymaker wrote: Even a bit more OT: one thing to watch out for, especially if you plan on caching a *lot* of data, is that the Cache::* modules did not do collision detection on MD5 collisions the last time I looked. Forgive me if that's
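A minimal sketch of the collision detection described above, written in Python rather than Perl. It is not how the Cache::* modules are implemented. The idea it illustrates is to store the original key next to the value and compare it on read, so that two keys with the same MD5 digest cannot silently return each other's data. The class name `DigestFileCache` and its methods are hypothetical.

```python
import hashlib
import os
import pickle
import tempfile


class DigestFileCache:
    """Hypothetical file cache: one file per MD5 digest, original key stored with the value."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        # The file name is the hex MD5 digest of the key.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return os.path.join(self.root, digest)

    def set(self, key, value):
        # Persist the key together with the value so reads can verify it.
        with open(self._path(key), "wb") as fh:
            pickle.dump((key, value), fh)

    def get(self, key, default=None):
        try:
            with open(self._path(key), "rb") as fh:
                stored_key, value = pickle.load(fh)
        except FileNotFoundError:
            return default
        # Collision check: a different key hashed to the same file.
        # This sketch treats that case as a cache miss.
        if stored_key != key:
            return default
        return value


if __name__ == "__main__":
    cache = DigestFileCache(tempfile.mkdtemp())
    cache.set("some key", {"size": "5K"})
    print(cache.get("some key"))
```

Treating a mismatch as a miss is acceptable for a cache, where the data can be regenerated. A permanent store would need a different policy, such as keeping both records.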

Re: Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread Andrew Ho
Hello, DC> For example, file system caches fill their directories roughly equally when their paths are created from MD5 hashed keys. Doing something simple and unique like URL-encoding the key to make a legal identifier (legal in the sense that it is a valid filename) wouldn't distribute as

Re: [OT] Data store options

2001-11-08 Thread Andrew Ho
Hello, PH> If you do use BerkeleyDB, I suggest you just use the simple database-level lock. Otherwise, you have to think about deadlocks and I found the deadlock daemon that comes with it kind of difficult to use. Later versions of BerkeleyDB have a row-level lock available which works pretty

Re: Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread Bill Moseley
At 10:54 AM 11/08/01 -0800, Andrew Ho wrote: For example, say your keys are e-mail addresses and you just want to use an MD5 hash to spread your data files over directories so that no one directory has too many files in it. Say your original key is [EMAIL PROTECTED] (hex encoded MD5 hash of this
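The directory-spreading idea quoted above can be sketched roughly as follows, in Python rather than Perl. The MD5 digest of the key is used only to pick the subdirectories. Using the URL-encoded key as the file name is an assumption of this sketch, chosen so that two keys sharing a digest prefix still get distinct files. The function name `spread_path`, its parameters, and the sample address are all hypothetical.

```python
import hashlib
import os
from urllib.parse import quote


def spread_path(root, key, levels=2, width=2):
    """Return a path for `key` under `root`, spread over `levels` directory levels.

    Each directory name is `width` hex characters taken from the key's MD5 digest.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # Assumption: the URL-encoded key serves as the file name. Empty keys,
    # "." and "..", and very long keys are not handled in this sketch.
    filename = quote(key, safe="")
    return os.path.join(root, *parts, filename)


if __name__ == "__main__":
    print(spread_path("/var/data", "user@example.com"))
```

With two levels of two hex characters each there are 256 * 256 = 65536 leaf directories, so tens of thousands of records would leave most directories holding at most a few files.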

Re: Cache::* and MD5 collisions [was: [OT] Data store options]

2001-11-08 Thread Barrie Slaymaker
On Thu, Nov 08, 2001 at 10:54:11AM -0800, Andrew Ho wrote: Let me point out that if you are using MD5 hashes for directory spreading (i.e. to spread a large number of files across a tree of directories so that no one directory is filled with too many files for efficient filesystem access),

Re: [OT] Data store options

2001-11-08 Thread Joshua Chamas
Bill Moseley wrote: Hi, <verbose> I'm looking for a little discussion on selecting a data storage method, and I'm posting here because Cache::Cache often is discussed here (along with Apache::Session). And people here are smart, of course ;). Basically, I'm trying to understand when to