Re: [ZODB-Dev] API question

2013-01-15 Thread Jim Fulton
On Mon, Jan 14, 2013 at 1:32 PM, Tres Seaver tsea...@palladion.com wrote:

 While working on preparation for a Py3k port, I've stumbled across a
 fundamental issue with how ZODB structures its API.  Do we intend that
 client code do the following::

   from ZODB import DB, FileStorage
   db = DB(FileStorage('/path/to/Data.fs'))

As Marius points out, this doesn't work.


 or use the module as a facade ::

   import ZODB
   db = ZODB.DB(ZODB.FileStorage.FileStorage('/path/to/Data.fs'))

This doesn't work either. You haven't imported FileStorage.

WRT ZODB.DB, ZODB.DB is an age-old convenience. It's unfortunate that
ZODB.DB (the class) shadows the module, ZODB.DB, just like the class
ZODB.FileStorage.FileStorage shadows the module
ZODB.FileStorage.FileStorage.FileStorage. (Of course, it's also
unfortunate that there's a ZODB.FileStorage.FileStorage.FileStorage
module. :)

If we had a do-over, we'd use ZODB.db.DB and
ZODB.filestorage.FileStorage, and ZODB.DB would be a convenience for
ZODB.db.DB.
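
The shadowing described above can be reproduced without ZODB. The following is only a sketch: "pkg" and its contents are hypothetical stand-ins built in memory, laid out the way the message describes ZODB.DB (a module and a class sharing one name).

```python
import importlib
import sys
import types

# Hypothetical stand-in for a package laid out like ZODB: a package "pkg"
# containing a module "pkg.DB" that defines a class named DB.
pkg = types.ModuleType("pkg")
pkg.__path__ = []                       # an empty __path__ marks it as a package
db_module = types.ModuleType("pkg.DB")


class DB:
    """Placeholder for the real database class."""


db_module.DB = DB
sys.modules["pkg"] = pkg
sys.modules["pkg.DB"] = db_module

# Effect of "from pkg.DB import DB" inside pkg/__init__.py: the attribute
# pkg.DB is now bound to the class rather than to the module.
pkg.DB = DB

# Attribute lookup on the package finds the class ...
from pkg import DB as name_from_package

# ... while a lookup by dotted path in sys.modules still finds the module.
module_by_path = importlib.import_module("pkg.DB")
```

Which object a given import spelling returns depends on whether it goes through the package attribute or through sys.modules, which is the ambiguity discussed in this thread.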


 I would actually prefer that clients explicitly import the intermediate
 modules::

   from ZODB import DB, FileStorage
   db = DB.DB(FileStorage.FileStorage('/path/to/Data.fs'))

So you don't mind shadowing FileStorage.FileStorage.FileStorage. ;)

 or even better::

   from ZODB.DB import DB
   # This one can even be ambiguous now

FTR, I don't like this style.  Somewhat a matter of taste.


   from ZODB.FileStorage import FileStorage
   db = DB(FileStorage('/path/to/Data.fs'))

 The driver for the question is getting the tests to pass under both
 'nosetests' and 'setup.py test', where the order of module imports etc.
 can make the ambiguous cases problematic.  It would be a good time to do
 whatever BBB stuff we need to (I would guess figuring out how to emit
 deprecation warnings for whichever variants) before releasing 4.0.0.

I'm pretty happy with the Zope test runner and I don't think using
nosetests is a good reason to cause backward-incompatibility. The zope
test runner works just fine with Python 3. Why do you feel compelled
to introduce nose?

I'm sort of in favor of moving to nose to follow the crowd, although
otherwise, nose is far too implicit for my taste. It doesn't handle
doctest well at all.

Having said that, if I was going to do something like this, I'd
rename the modules, ZODB.DB and ZODB.FileStorage to ZODB.db and
ZODB.filestorage and add module aliases for backward compatibility. I
don't know if that would be enough to satisfy nose.
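
A module alias of the kind mentioned above can be sketched with sys.modules. This is an illustration under assumptions, not ZODB code: "mypkg", "mypkg.db" and "mypkg.DB" are hypothetical names standing in for a renamed module and its old spelling.

```python
import importlib
import sys
import types

# Hypothetical package "mypkg" whose module was renamed from "DB" to "db".
mypkg = types.ModuleType("mypkg")
mypkg.__path__ = []                     # mark it as a package
new_module = types.ModuleType("mypkg.db")


class DB:
    """Placeholder for the real class."""


new_module.DB = DB
mypkg.db = new_module
mypkg.DB = DB                           # the convenience alias for the class
sys.modules["mypkg"] = mypkg
sys.modules["mypkg.db"] = new_module

# Backward-compatibility alias: the old dotted name resolves to the new module.
sys.modules["mypkg.DB"] = new_module

old_style = importlib.import_module("mypkg.DB")
new_style = importlib.import_module("mypkg.db")
from mypkg.DB import DB as via_old_name  # old import spelling keeps working
```

With the alias in place both dotted names give the same module object, so existing "from mypkg.DB import DB" statements continue to work.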

I'm not up for doing any of this for 4.0.  I'm not allergic to a 5.0 in
the not too distant future.  I'm guessing that a switch to nose would
also make you rewrite all of the doctests as unittests. As the
primary maintainer of ZODB, I'm -0.8 on that.

Back to APIs, I think 90% of users don't import the APIs but set up
ZODB via ZConfig (or probably should, if they don't).  For Python use,
I think the ZODB.DB class shortcut is useful.  Over the last few
years, ZODB has grown some additional shortcuts that I think are also
useful. Among them:

ZODB.DB(filename) - DB with a file storage
ZODB.DB(None) - DB with a mapping storage
ZODB.connection(filename) - connection to DB with file storage
ZODB.connection(None) - connection to DB with mapping storage

More importantly:

ZEO.client is a shortcut for ZEO.ClientStorage.ClientStorage
ZEO.DB(addr or port) - DB with a ZEO client
ZEO.connection(addr or port) - connection to DB with a ZEO client
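
A minimal sketch of the in-memory shortcut from the list above, assuming the ZODB and transaction packages are installed. ZODB.connection(None) comes from the list; the helper name scratch_roundtrip and the key "greeting" are made up for illustration.

```python
def scratch_roundtrip():
    """Store one value in an in-memory database and read it back."""
    # Imported here so the sketch can be loaded even without ZODB installed.
    import transaction
    import ZODB

    conn = ZODB.connection(None)        # None selects a mapping (in-memory) storage
    try:
        root = conn.root()
        root["greeting"] = "hello"
        transaction.commit()
        return root["greeting"]
    finally:
        transaction.abort()             # harmless after a successful commit
        conn.close()
```

Passing a file name instead of None should give the same flow backed by a file storage, per the list above.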

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton
Jerky is better than bacon! http://zo.pe/Kqm
___
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] API question

2013-01-15 Thread Jim Fulton
On Mon, Jan 14, 2013 at 7:20 PM, Tres Seaver tsea...@palladion.com wrote:
...
 I'm tempted to rename the 'DB.py' module 'db.py', and jam in a BBB entry
 in sys.modules for 'ZODB.DB';  likewise, I am tempted to rename the
 'FileStorage.py' package 'filestorage', its same-named module
 '_filestorage.py', and jam in BBB entries for the old names.

+.9 if done without backward-incompatible breakage. This would be a
4.1 thing.  +1 if you used zodb.filestorage.filestorage rather than
zodb.filestorage._filestorage.
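
One way the BBB entries could also emit the deprecation warnings raised earlier in the thread is a small shim module using a module-level __getattr__ (PEP 562, Python 3.7+). This is a sketch only: install_deprecated_alias and the "demo.*" names are hypothetical and not part of ZODB.

```python
import sys
import types
import warnings


def install_deprecated_alias(old_name, target):
    """Register old_name in sys.modules as a shim that forwards to target."""
    shim = types.ModuleType(old_name)

    def __getattr__(attr):
        # PEP 562 hook: called only for names the shim itself lacks.
        warnings.warn(
            f"{old_name} is deprecated; import {target.__name__} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(target, attr)

    shim.__getattr__ = __getattr__
    sys.modules[old_name] = shim
    return shim


# Hypothetical renamed module and the class it defines.
filestorage = types.ModuleType("demo.filestorage")


class FileStorage:
    """Placeholder for the real storage class."""


filestorage.FileStorage = FileStorage
sys.modules["demo.filestorage"] = filestorage
legacy = install_deprecated_alias("demo.FileStorage", filestorage)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = legacy.FileStorage       # old spelling still works, with a warning
```

Unlike a plain sys.modules alias, the shim is a distinct module object, so code comparing module identity would see a difference; that trade-off buys the warning.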

 Those renames would make the preferred API:
   from ZODB import DB # convenience alias for the class
   from ZODB import db # the module
   from ZODB.db import DB # my preferred spelling
   from ZODB.filestorage import FileStorage # conv. alias for class
   from ZODB import filestorage # the package
   from ZODB.filestorage import FileStorage # my preferred spelling

This is the same as one earlier.  I suspect you meant:

from ZODB.filestorage._filestorage import FileStorage

but couldn't type the underware.

I don't think the packagification of the FileStorage module was a win,
but it's too hard to fix it now.

Some day, I'd like to work on a filestorage2, but fear I won't ever
find the time. :(

from ZODB.filestorage import _filestorage # if needed

We shouldn't design an API where we expected people to grab underware.

Aside from not liking from imports and the _filestorage nit, +1

 For extra bonus fun, we could rename 'ZODB' to 'zodb' :)

In that case, we might switch to a namespace package, oodb, which I've
already reserved:

  http://pypi.python.org/pypi/oodb

But I doubt we're up for this much disruption.

Jim



[ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Claudiu Saftoiu
Hello all,

I'm looking to speed up my server and it seems memcached would be a good
way to do it - at least for the `Catalog` (I've already put the catalog in
a separate zodb with a separate zeoserver with persistent client caching
enabled and it still doesn't run as nice as I like...)

I've googled around a bit and found nothing definitive, though... what's
the best way to combine zodb/zeo + memcached as of now?

Thanks,
- Claudiu


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Leonardo Santagada
On Tue, Jan 15, 2013 at 3:10 PM, Jim Fulton j...@zope.com wrote:

 On Tue, Jan 15, 2013 at 12:00 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
  Hello all,

  I'm looking to speed up my server and it seems memcached would be a good
  way to do it - at least for the `Catalog` (I've already put the catalog in
  a separate zodb with a separate zeoserver with persistent client caching
  enabled and it still doesn't run as nice as I like...)

  I've googled around a bit and found nothing definitive, though... what's
  the best way to combine zodb/zeo + memcached as of now?

 My opinion is that a distributed memcached isn't
 a big enough win, but this likely depends on your use cases.

 We (ZC) took a different approach.  If there is a reasonable way
 to classify your corpus by URL (or other request parameter),
 then check out zc.resumelb.  This fit our use cases well.


Maybe I don't understand zodb correctly but if the catalog is small enough
to fit in memory wouldn't it be much faster to just cache the whole catalog
on the clients? Then at least for catalog searches it is all mostly as fast
as running through python objects. Memcache will put an extra
serialize/deserialize step into it (plus network io, plus context
switches).

-- 

Leonardo Santagada


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Claudiu Saftoiu
On Tue, Jan 15, 2013 at 2:07 PM, Leonardo Santagada santag...@gmail.com wrote:

 Maybe I don't understand zodb correctly but if the catalog is small enough
 to fit in memory wouldn't it be much faster to just cache the whole catalog
 on the clients? Then at least for catalog searches it is all mostly as fast
 as running through python objects. Memcache will put an extra
 serialize/deserialize step into it (plus network io, plus context
 switches).


That would be fine, actually. Is there a way to explicitly tell ZODB/ZEO to
load an entire object and keep it in the cache? I also want it to remain in
the cache on connection restart, but I think I've already accomplished that
with persistent client-side caching.


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
On Tue, Jan 15, 2013 at 2:08 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 That would be fine, actually. Is there a way to explicitly tell ZODB/ZEO to
 load an entire object and keep it in the cache? I also want it to remain in
 the cache on connection restart, but I think I've already accomplished that
 with persistent client-side caching.

You can't cause a specific object (or collection of objects) to stay
in the cache, but if your working set is small enough to fit in
the memory or client cache, you can get the same effect.

Jim



Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Leonardo Santagada
On Tue, Jan 15, 2013 at 5:15 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 You can't cause a specific object (or collection of objects) to stay
 in the cache, but if your working set is small enough to fit in
 the memory or client cache, you can get the same effect.


So just setting the cache size for the catalog db to be bigger than the
database is everything you need to have it mostly not touch the disc at
all.


 That makes sense. So, is there any way to give ZODB a Persistent and tell
 it to load everything about the object now for this transaction so that the
 cache mechanism then gets triggered, or do I have to do a custom search
 through every aspect of the object, touching all Persistents it touches,
 etc., in order to get everything loaded? Essentially, when the server
 restarts, I'd like to pre-load all these objects (my cache is indeed big
 enough), so that if a few hours later someone makes a request that uses it,
 the objects will already be cached instead of starting to be cached right
 then.


You can do that using a set of web requests that warm up your app. This way
you cache everything, not only ZODB data.


-- 

Leonardo Santagada


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
So, first, a concise partial answer to a previous question:

ZODB provides an in-memory object cache.  This is non-persistent.
If you restart, it is lost.  There is a cache per connection and the
cache size is limited by both object count and total object size (as
estimated by database record size).

ZEO also provides a disk-based cache of database records read
from the server.  This is normally much larger than the in-memory cache.
It can be configured to be persistent.  If you're using blobs, then there
is a separate blob cache.

On Tue, Jan 15, 2013 at 2:15 PM, Claudiu Saftoiu csaft...@gmail.com wrote:
 You can't cause a specific object (or collection of objects) to stay
 in the cache, but if your working set is small enough to fit in
 the memory or client cache, you can get the same effect.


 That makes sense. So, is there any way to give ZODB a Persistent and tell
 it to load everything about the object now for this transaction so that the
 cache mechanism then gets triggered, or do I have to do a custom search
 through every aspect of the object, touching all Persistents it touches,
 etc., in order to get everything loaded? Essentially, when the server
 restarts, I'd like to pre-load all these objects (my cache is indeed big
 enough), so that if a few hours later someone makes a request that uses it,
 the objects will already be cached instead of starting to be cached right
 then.

ZODB doesn't provide any pre-warming facility.  This would be
application dependent.

You're probably better off using a persistent ZEO cache
and letting the cache fill with objects you actually use.
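
Since ZODB itself has no pre-warming facility, any warm-up has to be written per application. The following is only a rough sketch of the "touch everything" traversal asked about above: the function name warm is made up, and it assumes children are reachable through plain attributes, dicts and sequences. _p_activate() is the persistent-object method that loads an object's state; C-implemented containers such as BTrees do not expose their children through __dict__ and would need their own iteration.

```python
def warm(root):
    """Walk everything reachable from root, activating persistent objects.

    Returns the number of objects on which _p_activate() was called.
    """
    atoms = (str, bytes, int, float, bool, type(None))
    seen = set()                        # ids of objects already visited
    stack = [root]
    activated = 0
    while stack:
        obj = stack.pop()
        if isinstance(obj, atoms) or id(obj) in seen:
            continue
        seen.add(id(obj))
        activate = getattr(obj, "_p_activate", None)
        if callable(activate):
            activate()                  # loads a ghost's state from storage
            activated += 1
        if isinstance(obj, dict):
            stack.extend(obj.keys())
            stack.extend(obj.values())
        elif isinstance(obj, (list, tuple, set, frozenset)):
            stack.extend(obj)
        elif hasattr(obj, "__dict__"):
            stack.extend(vars(obj).values())
    return activated
```

Objects loaded this way are still subject to the cache limits described above, so the walk only helps when the whole set fits in the cache.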

Jim



Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Claudiu Saftoiu
On Tue, Jan 15, 2013 at 2:40 PM, Jim Fulton j...@zope.com wrote:

 ZODB doesn't provide any pre-warming facility.  This would be
 application dependent.

 You're probably better off using a persistent ZEO cache
 and letting the cache fill with objects you actually use.


Okay, that makes sense. Would that be a server-side cache, or a client-side
cache? I believe I've already succeeded in getting a client-side persistent
disk-based cache to work (my zodb_indexdb_uri is
zeo://%(here)s/zeo_indexdb.sock?cache_size=2000MB&connection_cache_size=50&connection_pool_size=5&var=zeocache&client=index),
but this doesn't seem to be what you're referring to, as that is exactly the
same size as the in-memory cache. Could you provide some pointers as to how
to get a persistent disk-based cache on the ZEO server, if that is what you
meant? It seems ZODB/ZEO really lacks centralized documentation.

Thanks,
- Claudiu


Re: [ZODB-Dev] what's the latest on zodb/zeo+memcached?

2013-01-15 Thread Jim Fulton
On Tue, Jan 15, 2013 at 2:45 PM, Claudiu Saftoiu csaft...@gmail.com wrote:

 Okay, that makes sense. Would that be a server-side cache, or a client-side
 cache?

There are no server-side caches (other than the OS disk cache).

 I believe I've already succeeded in getting a client-side persistent
 disk-based cache to work (my zodb_indexdb_uri is
 zeo://%(here)s/zeo_indexdb.sock?cache_size=2000MB&connection_cache_size=50&connection_pool_size=5&var=zeocache&client=index),

This configuration syntax isn't part of ZODB.  I'm not familiar with
the options there.

 but this doesn't seem to be what you're referring to as that is exactly the
 same size as the in-memory cache.

I doubt it, but who knows?

 Could you provide some pointers as to how
 to get a persistent disk-based cache on the ZEO server, if that is what you
 meant?

ZODB is configured via ZConfig.  The parameters are defined here:

  https://github.com/zopefoundation/ZODB/blob/master/src/ZODB/component.xml

Not too readable, but at least precise. :/

Look at the parameters for zodb and zeoclient.

Here's an example:

<zodb main>
  cache-size 10
  pool-size 7

  <zeoclient>
    blob-cache-size 1GB
    blob-dir /home/zope/foo-classifieds/blob-cache
    cache-size 2GB
    server das-head1.foo.zope.net:11100
    server das-head2.foo.zope.net:11100
  </zeoclient>
</zodb>

If you want to use this syntax with paste, see:

  http://pypi.python.org/pypi/zc.zodbwsgi
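
For plain Python use, a ZConfig fragment like the one above can be loaded with ZODB.config.databaseFromString. The sketch below assumes ZODB is installed and, so that it runs without a ZEO server, swaps the zeoclient section for a mappingstorage section; that substitution and the helper name open_configured_db are assumptions for illustration only.

```python
CONFIG = """\
<zodb main>
  cache-size 10
  pool-size 7
  <mappingstorage>
  </mappingstorage>
</zodb>
"""


def open_configured_db(text=CONFIG):
    """Build a DB object from a ZConfig fragment."""
    # Imported here so the sketch can be loaded even without ZODB installed.
    import ZODB.config

    return ZODB.config.databaseFromString(text)
```

The returned object is a regular DB, so db.open() gives a connection as usual; with a real deployment the storage section would be the zeoclient block shown above.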

Jim
