Re: Problem with DBM concurrent access

2002-04-05 Thread Dan Wilga

I would also suggest using BerkeleyDB.pm, but with the 
DB_INIT_MPOOL|DB_INIT_CDB flags. In this mode, only one writer is 
allowed at a time, and Berkeley automatically handles all the locking 
and flushing. Just don't forget to use db_close() to close the file 
before untie'ing it.
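Roughly, something like this (an untested sketch; the environment directory
and file name are just placeholders):

  use BerkeleyDB;

  # the environment holds the shared memory pool and the CDB locking
  my $env = new BerkeleyDB::Env(
      -Home  => '/tmp/bdb_env',
      -Flags => DB_CREATE | DB_INIT_MPOOL | DB_INIT_CDB,
  ) or die "can't open environment: $BerkeleyDB::Error";

  my %hash;
  my $db = tie %hash, 'BerkeleyDB::Hash',
      -Filename => 'data.db',
      -Flags    => DB_CREATE,
      -Env      => $env
    or die "can't tie: $BerkeleyDB::Error";

  $hash{foo} = 'bar';   # Berkeley serializes writers for us in CDB mode

  $db->db_close();      # close before untie'ing
  undef $db;
  untie %hash;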


Dan Wilga [EMAIL PROTECTED]
Web Technology Specialist http://www.mtholyoke.edu
Mount Holyoke College     Tel: 413-538-3027
South Hadley, MA  01075   Who left the cake out in the rain?




Re: Problem with DBM concurrent access

2002-04-05 Thread Andrew Ho

Hello,

DW> I would also suggest using BerkeleyDB.pm, but with the
DW> DB_INIT_MPOOL|DB_INIT_CDB flags. In this mode, only one writer is
DW> allowed at a time, and Berkeley automatically handles all the locking
DW> and flushing. Just don't forget to use db_close() to close the file
DW> before untie'ing it.

One caveat on this: BerkeleyDB maintains its locks and other environment
information in a local memory segment, so this won't work if multiple
machines share the same BerkeleyDB file (e.g., if you are accessing the
BerkeleyDB file over NFS).

Humbly,

Andrew

--
Andrew Ho   http://www.tellme.com/   [EMAIL PROTECTED]
Engineer   [EMAIL PROTECTED]  Voice 650-930-9062
Tellme Networks, Inc.   1-800-555-TELL     Fax 650-930-9101
--




Re: Problem with DBM concurrent access

2002-04-04 Thread Perrin Harkins

Franck PORCHER wrote:
> So my question narrows down to :
> How to flush on disk the cache of a tied DBM (DB_File) structure
> in a way that any concurrent process accessing it in *read only* mode
> would automatically get the new values as soon as they
> are published (synchronisation)

You have to tie and untie on each request.  There's some discussion of 
this in the Guide.  As an alternative, you could look at using 
BerkeleyDB, or MLDBM::Sync (which does the tie/untie for you).
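With DB_File the reader side is basically just this (untested sketch; the
file name is made up):

  use DB_File;
  use Fcntl qw(:DEFAULT);

  sub lookup {
      my ($key) = @_;
      # tie fresh on every request so we don't serve stale cached data
      my %db;
      tie %db, 'DB_File', '/tmp/data.db', O_RDONLY, 0644, $DB_HASH
          or die "can't tie: $!";
      my $value = $db{$key};
      untie %db;
      return $value;
  }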

- Perrin




RE: Problem with DBM concurrent access

2002-04-04 Thread Rob Bloodgood

> So my question narrows down to :
> How to flush on disk the cache of a tied DBM (DB_File) structure
> in a way that any concurrent process accessing it in *read only* mode
> would automatically get the new values as soon as they
> are published (synchronisation)

Isn't that just as simple as

tied(%dbm_array)->sync();

?

HTH!

L8r,
Rob



Re: Problem with DBM concurrent access

2002-04-04 Thread Stas Bekman

Rob Bloodgood wrote:
>> So my question narrows down to :
>> How to flush on disk the cache of a tied DBM (DB_File) structure
>> in a way that any concurrent process accessing it in *read only* mode
>> would automatically get the new values as soon as they
>> are published (synchronisation)
>
> Isn't that just as simple as
>
> tied(%dbm_array)->sync();

I believe that's not enough, because the reader may read data in the
middle of a write and end up with corrupted data. You have to add
locking; see the DBM chapter in the guide.
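Roughly like this on the writer side (untested sketch; the lock file and
db file names are made up):

  use DB_File;
  use Fcntl qw(:DEFAULT :flock);

  # take an external lock *before* tie'ing, as the guide recommends
  open my $lock, '>', '/tmp/data.db.lock' or die "can't open lock file: $!";
  flock $lock, LOCK_EX or die "can't lock: $!";

  my %db;
  tie %db, 'DB_File', '/tmp/data.db', O_CREAT|O_RDWR, 0644, $DB_HASH
      or die "can't tie: $!";
  $db{counter}++;

  untie %db;             # flushes the write before we give up the lock
  flock $lock, LOCK_UN;
  close $lock;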

-- 


_
Stas Bekman JAm_pH  --   Just Another mod_perl Hacker
http://stason.org/  mod_perl Guide   http://perl.apache.org/guide
mailto:[EMAIL PROTECTED]  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/





Re: Problem with DBM concurrent access

2002-04-04 Thread Joshua Chamas

Stas Bekman wrote:
>
>> tied(%dbm_array)->sync();
>
> I believe that's not enough, because the reader may read data in the
> middle of a write and end up with corrupted data. You have to add
> locking; see the DBM chapter in the guide.
>

You might add MLDBM::Sync to the docs, which easily adds locking
to MLDBM.  MLDBM is a front end for storing complex data structures:

  http://www.perl.com/CPAN-local/modules/by-module/MLDBM/CHAMAS/MLDBM-Sync-0.25.readme

What's nice about MLDBM is you can easily swap in & out various dbms
like SDBM_File, DB_File, GDBM_File, etc.  More recently it even
supports Tie::TextDir too, which provides key-per-file storage and
is good when you have a fast file system & big data you want to store.

SYNOPSIS
  use MLDBM::Sync;                       # this gets the default, SDBM_File
  use MLDBM qw(DB_File Storable);        # use Storable for serializing
  use MLDBM qw(MLDBM::Sync::SDBM_File);  # use extended SDBM_File, handles values > 1024 bytes
  use Fcntl qw(:DEFAULT);                # import symbols O_CREAT & O_RDWR for use with DBMs

  # NORMAL PROTECTED read/write with implicit locks per i/o request
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync' [..other DBM args..] or die $!;
  $cache{'key'} = 'value';
  my $value = $cache{'key'};

...

DESCRIPTION
This module wraps around the MLDBM interface, by handling concurrent
access to MLDBM databases with file locking, and flushes i/o explicitly
per lock/unlock. The new [Read]Lock()/UnLock() API can be used to
serialize requests logically and improve performance for bundled reads &
writes.
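The explicit lock API looks roughly like this (my own untested sketch, not
from the readme; keys and file name are made up):

  use MLDBM::Sync;
  use Fcntl qw(:DEFAULT);

  my %cache;
  my $sync_dbm_obj = tie %cache, 'MLDBM::Sync', '/tmp/cache.dbm', O_CREAT|O_RDWR, 0640
      or die $!;

  $sync_dbm_obj->Lock;       # one exclusive lock & flush for the whole batch
  $cache{hits}++;
  $cache{last_hit} = time();
  $sync_dbm_obj->UnLock;

  $sync_dbm_obj->ReadLock;   # shared lock for bundled reads
  my @vals = @cache{'hits', 'last_hit'};
  $sync_dbm_obj->UnLock;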

Here's some benchmarks on my 2.4.x linux box, dual PIII 450 with a couple
of 7200 RPM IDE drives & raid-1 ext3 fs mounted default async.

MLDBM-Sync-0.25]# perl bench/bench_sync.pl

NUMBER OF PROCESSES IN TEST: 4

=== INSERT OF 50 BYTE RECORDS ===
  Time for 100 writes + 100 reads for  SDBM_File                 0.17 seconds    12288 bytes
  Time for 100 writes + 100 reads for  MLDBM::Sync::SDBM_File    0.20 seconds    12288 bytes
  Time for 100 writes + 100 reads for  GDBM_File                 1.06 seconds    18066 bytes
  Time for 100 writes + 100 reads for  DB_File                   0.63 seconds    12288 bytes
  Time for 100 writes + 100 reads for  Tie::TextDir .04          0.38 seconds    13192 bytes

=== INSERT OF 500 BYTE RECORDS ===
 (skipping test for SDBM_File 100 byte limit)
  Time for 100 writes + 100 reads for  MLDBM::Sync::SDBM_File    0.58 seconds   261120 bytes
  Time for 100 writes + 100 reads for  GDBM_File                 1.09 seconds    63472 bytes
  Time for 100 writes + 100 reads for  DB_File                   0.64 seconds    98304 bytes
  Time for 100 writes + 100 reads for  Tie::TextDir .04          0.33 seconds    58192 bytes

=== INSERT OF 5000 BYTE RECORDS ===
 (skipping test for SDBM_File 100 byte limit)
  Time for 100 writes + 100 reads for  MLDBM::Sync::SDBM_File    1.37 seconds  4128768 bytes
  Time for 100 writes + 100 reads for  GDBM_File                 1.13 seconds   832400 bytes
  Time for 100 writes + 100 reads for  DB_File                   1.08 seconds   831488 bytes
  Time for 100 writes + 100 reads for  Tie::TextDir .04          0.52 seconds   508192 bytes

=== INSERT OF 20000 BYTE RECORDS ===
 (skipping test for SDBM_File 100 byte limit)
 (skipping test for MLDBM::Sync db size > 1M)
  Time for 100 writes + 100 reads for  GDBM_File                 1.76 seconds  2063912 bytes
  Time for 100 writes + 100 reads for  DB_File                   1.78 seconds  2060288 bytes
  Time for 100 writes + 100 reads for  Tie::TextDir .04          1.27 seconds  2008192 bytes

=== INSERT OF 50000 BYTE RECORDS ===
 (skipping test for SDBM_File 100 byte limit)
 (skipping test for MLDBM::Sync db size > 1M)
  Time for 100 writes + 100 reads for  GDBM_File                 3.52 seconds  5337944 bytes
  Time for 100 writes + 100 reads for  DB_File                   3.37 seconds  5337088 bytes
  Time for 100 writes + 100 reads for  Tie::TextDir .04          2.80 seconds  5008192 bytes

--Josh

_
Joshua Chamas   Chamas Enterprises Inc.
NodeWorks Founder   Huntington Beach, CA  USA 
http://www.nodeworks.com1-714-625-4051



Re: Problem with DBM concurrent access

2002-04-04 Thread Perrin Harkins

>> Isn't that just as simple as
>>
>> tied(%dbm_array)->sync();
>
> I believe that's not enough, because the reader may read data in the
> middle of a write and end up with corrupted data.

Not only that, there's also the issue that at least some dbm
implementations cache part of the file in memory and will not pick up
changed data unless you untie and re-tie.  I remember a good discussion
about this on the list a year or two back.

- Perrin