[ANNOUNCE] MLDBM::Sync v.07

2001-03-19 Thread Joshua Chamas

Hey,

The latest MLDBM::Sync v.07 is in your local CPAN and also
  http://www.perl.com/CPAN-local/modules/by-module/MLDBM/

It provides a wrapper around MLDBM databases, like SDBM_File
and DB_File, providing safe concurrent access, using a flock()
strategy and per access dbm i/o flushing.  
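To make the flock() plus per-access flush idea concrete, here is a minimal sketch using only core modules (Fcntl, SDBM_File, File::Temp). The helper name with_locked_dbm and the ".lock" file suffix are assumptions for illustration; this is not MLDBM::Sync's actual implementation, only the general pattern of lock, tie, access, untie, unlock.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical helper: run $code against the dbm while holding a lock.
# The dbm is tied after the lock is taken and untied before it is
# released, so every access starts fresh and ends with flushed i/o.
sub with_locked_dbm {
    my ($file, $lock_type, $code) = @_;

    # Lock file opened read/write, even for shared locks.
    open(my $lock_fh, '+>>', "$file.lock") or die "open lock: $!";
    flock($lock_fh, $lock_type) or die "flock: $!";

    my %dbm;
    tie(%dbm, 'SDBM_File', $file, O_CREAT|O_RDWR, 0640) or die "tie: $!";
    my @result = $code->(\%dbm);
    untie %dbm;    # closes the dbm files, flushing pending writes

    flock($lock_fh, LOCK_UN) or die "unlock: $!";
    close $lock_fh;
    return wantarray ? @result : $result[0];
}

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/syncdbm";

# Exclusive lock for the write, shared lock for the read.
with_locked_dbm($file, LOCK_EX, sub { $_[0]{KEY} = 'VALUE' });
my $value = with_locked_dbm($file, LOCK_SH, sub { $_[0]{KEY} });
print "read back: $value\n";
```

Re-tying on every access is what makes this pattern slow for bursts of requests, which is presumably why bundling several accesses under one lock helps.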

A recent API addition allows for a secondary cache layer with
Tie::Cache to be automatically used, like:

  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
  $sync_dbm_obj->SyncCacheSize('100K');

On my dual PIII 450 linux box, I might get 1500 or so reads per sec 
to a SDBM_File based MLDBM::Sync database, and the Tie::Cache layer
runs at about 15000 reads/sec, so for a high cache hit usage, the 
speedup can be considerable.
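As a rough worked example of what "considerable" could mean, the sketch below mixes the two read rates quoted above by cache hit ratio. The hit ratios are illustrative assumptions, and the model ignores the extra cost of populating the cache on a miss, so treat the output as an upper bound rather than a measurement.

```perl
use strict;
use warnings;

# Read rates quoted in the announcement.
my $dbm_reads_per_sec   = 1500;
my $cache_reads_per_sec = 15000;

# Average time per read is the hit/miss weighted mix of the two costs.
sub effective_reads_per_sec {
    my ($hit_ratio) = @_;
    my $secs_per_read = $hit_ratio / $cache_reads_per_sec
                      + (1 - $hit_ratio) / $dbm_reads_per_sec;
    return 1 / $secs_per_read;
}

# Hit ratios here are assumed values, chosen only for illustration.
for my $hit_ratio (0, 0.5, 0.9, 0.99) {
    printf "hit ratio %.2f: about %.0f reads/sec\n",
        $hit_ratio, effective_reads_per_sec($hit_ratio);
}
```

Because misses dominate the average, the gain stays modest until the hit ratio gets close to 1.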

MLDBM::Sync also comes with MLDBM::Sync::SDBM_File, a wrapper around 
SDBM_File that overcomes its 1024 byte limit for values, which 
can be fast for caching data up to 10000 bytes or so in length.

-- Josh

CHANGES

$MODULE = "MLDBM::Sync"; $VERSION = .07; $DATE = 'TBA';

+ $dbm->SyncCacheSize() API activates 2nd layer RAM cache
  via Tie::Cache with MaxBytes set.

+ CACHE documentation, cache.t test, sample benchmarks
  with ./bench/bench_sync.pl -c

$MODULE = "MLDBM::Sync"; $VERSION = .05; $DATE = '2001/03/13';

+ Simpler use of locking.

- Read locking works on Solaris, had to open lock file in
  read/write mode.  Linux/NT didn't care.

NAME
  MLDBM::Sync (BETA) - safe concurrent access to MLDBM databases

SYNOPSIS
  use MLDBM::Sync;   # this gets the default, SDBM_File
  use MLDBM qw(DB_File Storable);# use Storable for serializing
  use MLDBM qw(MLDBM::Sync::SDBM_File);  # use extended SDBM_File, handles values > 1024 bytes

  # NORMAL PROTECTED read/write with implicit locks per i/o request
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync' [..other DBM args..] or die $!;
  $cache{""} = "";
  my $value = $cache{""};

  # SERIALIZED PROTECTED read/write with explicit lock for both i/o requests
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
  $sync_dbm_obj->Lock;
  $cache{""} = "";
  my $value = $cache{""};
  $sync_dbm_obj->UnLock;

  # SERIALIZED PROTECTED READ access with explicit read lock for both reads
  $sync_dbm_obj->ReadLock;
  my @keys = keys %cache;
  my $value = $cache{''};
  $sync_dbm_obj->UnLock;

  # MEMORY CACHE LAYER with Tie::Cache
  $sync_dbm_obj->SyncCacheSize('100K');

  # KEY CHECKSUMS, for lookups on MD5 checksums on large keys
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
  $sync_dbm_obj->SyncKeysChecksum(1);
  my $large_key = "KEY" x 10000;
  $cache{$large_key} = "LARGE";
  my $value = $cache{$large_key};

DESCRIPTION
This module wraps around the MLDBM interface, handling concurrent
access to MLDBM databases with file locking, and flushes i/o explicitly
per lock/unlock. The new [Read]Lock()/UnLock() API can be used to
serialize requests logically and improve performance for bundled reads &
writes.

  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;

  # Write locked critical section
  $sync_dbm_obj->Lock;
    # ... all accesses to DBM LOCK_EX protected, and go to same tied file handles
    $cache{'KEY'} = 'VALUE';
  $sync_dbm_obj->UnLock;

  # Read locked critical section
  $sync_dbm_obj->ReadLock;
    # ... all read accesses to DBM LOCK_SH protected, and go to same tied files
    # ... WARNING, cannot write to DBM in ReadLock() section, will die()
    my $value = $cache{'KEY'};
  $sync_dbm_obj->UnLock;

  # Normal access OK too, without explicitly locking
  $cache{'KEY'} = 'VALUE';
  my $value = $cache{'KEY'};

MLDBM continues to serve as the underlying OO layer that serializes
complex data structures to be stored in the databases. See the BUGS
section of the MLDBM manpage for important limitations.

MLDBM::Sync also provides built in RAM caching with Tie::Cache, and
MD5 key checksum functionality.
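The key checksum idea can be sketched with the core Digest::MD5 module: the record is stored and looked up under a fixed-size digest of the key rather than the key itself. The helper names and the plain hash standing in for the tied dbm are assumptions for illustration, as is the choice of a hex digest; the module may encode the checksum differently.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my %store;    # stand-in for the tied dbm hash

# Hypothetical helpers: store and fetch records under the MD5 of the key.
sub checksum_store {
    my ($key, $value) = @_;
    $store{ md5_hex($key) } = $value;
}

sub checksum_fetch {
    my ($key) = @_;
    return $store{ md5_hex($key) };
}

my $large_key = "KEY" x 10000;
checksum_store($large_key, "LARGE");

# The stored key is the 32 character hex digest, not the original key.
my ($stored_key) = keys %store;
printf "original key: %d bytes, stored key: %d bytes\n",
    length($large_key), length($stored_key);
print checksum_fetch($large_key), "\n";
```

One consequence of this approach is that the original keys cannot be recovered by iterating over the stored hash, since only their digests are kept.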



Re: [ANNOUNCE] MLDBM::Sync v.07

2001-03-19 Thread Perrin Harkins

On Mon, 19 Mar 2001, Joshua Chamas wrote:
> A recent API addition allows for a secondary cache layer with
> Tie::Cache to be automatically used

When one process writes a change to the dbm, will the others all see it,
even if they use this?
- Perrin




Re: [ANNOUNCE] MLDBM::Sync v.07

2001-03-19 Thread Joshua Chamas

Perrin Harkins wrote:
 
> On Mon, 19 Mar 2001, Joshua Chamas wrote:
> > A recent API addition allows for a secondary cache layer with
> > Tie::Cache to be automatically used
>
> When one process writes a change to the dbm, will the others all see it,
> even if they use this?

No, a process with the secondary cache layer activated will not
see updates from other processes.  This is best used for static 
data being cached.
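A small simulation of why that happens, assuming each process keeps its own private cache in front of the shared dbm. Plain hashes and closures stand in for the dbm and the processes here; none of this is MLDBM::Sync code.

```perl
use strict;
use warnings;

my %shared_dbm = (color => 'red');    # stand-in for the dbm on disk

# Hypothetical reader: returns a closure with its own private cache,
# the way each process would hold its own RAM cache.
sub make_cached_reader {
    my %cache;
    return sub {
        my ($key) = @_;
        $cache{$key} = $shared_dbm{$key} unless exists $cache{$key};
        return $cache{$key};
    };
}

my $process_a = make_cached_reader();
my $process_b = make_cached_reader();

$process_a->('color');            # A reads and caches the current value
$shared_dbm{color} = 'blue';      # some other process updates the dbm

print "A sees: ", $process_a->('color'), "\n";   # served from A's cache
print "B sees: ", $process_b->('color'), "\n";   # B's first read hits the dbm
```

A keeps returning the value it cached before the update, which is why the cache layer suits data that does not change.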

I can see a request coming down that "expires" this 
cached data.  I'll build it when someone asks for it.

-- Josh

_______________________________________________
Joshua Chamas                         Chamas Enterprises Inc.
NodeWorks  free web link monitoring   Huntington Beach, CA  USA
http://www.nodeworks.com              1-714-625-4051



[ANNOUNCE] MLDBM::Sync

2001-02-28 Thread Joshua Chamas

Hey there,

MLDBM::Sync is finally available in CPAN, also at:
  http://www.perl.com/CPAN-local/modules/by-module/MLDBM/

Below is a bit of the README...

It's a locking wrapper around MLDBM I developed for the purpose 
of creating safe fast DBM storage for multi-process environments.
DBMs like DB_File & SDBM_File can become corrupt if not properly
locked and have their i/o flushed, though DB_File is much more 
sensitive to this.

Further, there's a special wrapper around SDBM_File which 
gets around its 1024 byte limit, called MLDBM::Sync::SDBM_File,
see benchmarks below.
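For readers wondering how a wrapper can get around a per-value size limit at all, one general technique is to split a long value into segments stored under derived keys. The sketch below shows that technique only; the segment size, the "key*index" naming and the helper names are assumptions, a plain hash stands in for the tied SDBM_File, and none of it is confirmed to match what MLDBM::Sync::SDBM_File does internally.

```perl
use strict;
use warnings;

my %dbm;                       # stand-in for a tied SDBM_File hash
my $segment_length = 512;      # assumed size, chosen to stay under the limit

# Hypothetical helper: split a long value across numbered keys.
sub store_long {
    my ($key, $value) = @_;
    my @segments = unpack("(a$segment_length)*", $value);
    $dbm{$key} = scalar @segments;            # head record holds the count
    $dbm{"$key*$_"} = $segments[$_] for 0 .. $#segments;
    return;
}

# Hypothetical helper: read the count, then rejoin the segments in order.
sub fetch_long {
    my ($key) = @_;
    my $count = $dbm{$key};
    return undef unless defined $count;
    return join '', map { $dbm{"$key*$_"} } 0 .. $count - 1;
}

my $value = "x" x 5000;
store_long('big', $value);
printf "stored %d bytes in %d records\n", length($value), scalar keys %dbm;
print "round trip ok\n" if fetch_long('big') eq $value;
```

Spreading one value over many records is consistent with the benchmark pattern of extra disk space and slowdown as values grow, though the figures there come from the real module, not this sketch.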

--Josh


NAME
  MLDBM::Sync (BETA) - safe concurrent access to MLDBM databases

SYNOPSIS
  use MLDBM::Sync;   # this gets the default, SDBM_File
  use MLDBM qw(DB_File Storable);# use Storable for serializing
  use MLDBM qw(MLDBM::Sync::SDBM_File);  # use extended SDBM_File, handles values > 1024 bytes

  # NORMAL PROTECTED read/write with implicit locks per i/o request
  tie %cache, 'MLDBM::Sync' [..other DBM args..] or die $!;
  $cache{""} = "";
  my $value = $cache{""};

  # SERIALIZED PROTECTED read/write with explicit lock for both i/o requests
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
  $sync_dbm_obj->Lock;
  $cache{""} = "";
  my $value = $cache{""};
  $sync_dbm_obj->UnLock;

DESCRIPTION
This module wraps around the MLDBM interface, handling concurrent
access to MLDBM databases with file locking, and flushes i/o explicitly
per lock/unlock. The new Lock()/UnLock() API can be used to serialize
requests logically and improve performance for bundled reads & writes.

  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/syncdbm', O_CREAT|O_RDWR, 0640;
  $sync_dbm_obj->Lock;
    # ... all accesses to DBM LOCK_EX protected, and go to same file handles ...
  $sync_dbm_obj->UnLock;

MLDBM continues to serve as the underlying OO layer that serializes
complex data structures to be stored in the databases. See the BUGS
section of the MLDBM manpage for important limitations.

BENCHMARKS
In the distribution ./bench directory is a bench_sync.pl script that can
benchmark using the various DBMs with MLDBM::Sync.

The MLDBM::Sync::SDBM_File DBM is special because it uses SDBM_File for
fast small inserts, but slows down linearly with the size of the data
being inserted and read, with the speed matching that of GDBM_File &
DB_File somewhere around 20,000 bytes.

So for DBM key/value pairs up to 10000 bytes, you are likely better off
with MLDBM::Sync::SDBM_File if you can afford the extra space it uses.
At 20,000 bytes, time is a wash, and disk space is greater, so you might
as well use DB_File or GDBM_File.

Note that MLDBM::Sync::SDBM_File is ALPHA as of 2/27/2001.

The results for a dual 450 linux 2.2.14, with a ext2 file system
blocksize 4096 mounted async on a SCSI disk were as follows:

 === INSERT OF 50 BYTE RECORDS ===
  Time for 100 write/read's for  SDBM_File                0.12 seconds     12288 bytes
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File   0.14 seconds     12288 bytes
  Time for 100 write/read's for  GDBM_File                2.07 seconds     18066 bytes
  Time for 100 write/read's for  DB_File                  2.48 seconds     20480 bytes

 === INSERT OF 500 BYTE RECORDS ===
  Time for 100 write/read's for  SDBM_File                0.21 seconds    658432 bytes
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File   0.51 seconds    135168 bytes
  Time for 100 write/read's for  GDBM_File                2.29 seconds     63472 bytes
  Time for 100 write/read's for  DB_File                  2.44 seconds    114688 bytes

 === INSERT OF 5000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File   1.30 seconds   2101248 bytes
  Time for 100 write/read's for  GDBM_File                2.55 seconds    832400 bytes
  Time for 100 write/read's for  DB_File                  3.27 seconds    839680 bytes

 === INSERT OF 20000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)
  Time for 100 write/read's for  MLDBM::Sync::SDBM_File   4.54 seconds  13162496 bytes
  Time for 100 write/read's for  GDBM_File                5.39 seconds   2063912 bytes
  Time for 100 write/read's for  DB_File                  4.79 seconds   2068480 bytes

 === INSERT OF 50000 BYTE RECORDS ===
 (skipping test for SDBM_File 1024 byte limit)