On Thu, May 5, 2011 at 4:43 AM, Pedro Ferreira <jose.pedro.ferre...@cern.ch> wrote:
...

> Since we are talking about speed, does anyone have any tips on making
> ZODB (in general) faster?

It's hard to give general advice. The basic thing to be aware of is
that loads from the ZEO cache have some cost, but are fairly cheap,
especially if your ZEO cache can fit in the local operating-system
disk cache. (You may want to add RAM to your clients.) Loads from the
storage server are quite a bit more expensive. If you have a large
database (hundreds of gigabytes), then disk access times on your
storage server can also be a major factor.

Effective use of the ZEO cache can make a big difference. To decide
how big your ZEO cache should be, turn on ZEO cache tracing, starting
from an empty cache, and use ZEO/scripts/cache_simul.py to experiment
with different cache sizes. To enable cache tracing, run your
application with the ZEO_CACHE_TRACE environment variable set to a
non-empty value.

> In our project, the DB is apparently the
> bottleneck, and we are considering implementing a memcache layer in
> order to avoid fetching so often from DB.

You should look at your ZEO cache configuration first. You may be able
to improve performance quite a bit by simply increasing your ZEO cache
sizes. If you have larger ZEO caches, you should probably use
persistent caches. You probably also want to:

- set drop-cache-rather-verify to true on the client

- set invalidation-age on the server to at least an hour or two, so
  that you can deal with being disconnected from the storage server
  for a reasonable period of time without having to verify.

If you haven't already, make sure binary data like photos, movies,
whatever, are stored in blobs.

Consider compressing your database records:

  http://pypi.python.org/pypi/zc.zlibstorage

Not only will this save disk space, but it will allow more of your
database to fit in the storage server's disk cache. It will also allow
your ZEO caches to store 2-3 times as many records for a given cache
size.
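
To get a rough feel for that last point, here is a minimal sketch using
only the Python standard library. It is not how zc.zlibstorage is
implemented. The record contents, the make_record and records_that_fit
helpers, and the 200 KB budget are all made up for illustration, and the
sample text is more repetitive than real data, so the ratio you see on
your own database will differ.

```python
import pickle
import zlib


def make_record(i):
    """Build a made-up pickled record standing in for a ZODB object state."""
    state = {
        "id": i,
        "title": "Conference contribution %d" % i,
        "abstract": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 8,
        "keywords": ["physics", "detector", "analysis", "computing"],
    }
    return pickle.dumps(state, protocol=2)


def records_that_fit(records, budget):
    """Count how many records fit, in order, within a byte budget."""
    used = 0
    count = 0
    for data in records:
        if used + len(data) > budget:
            break
        used += len(data)
        count += 1
    return count


# Hypothetical workload: 2000 small records, raw and zlib-compressed.
raw = [make_record(i) for i in range(2000)]
packed = [zlib.compress(r) for r in raw]

budget = 200 * 1024  # hypothetical cache size, in bytes
print("uncompressed records that fit:", records_that_fit(raw, budget))
print("compressed records that fit:  ", records_that_fit(packed, budget))
```

The same byte budget holds more compressed records than raw ones, which
is the effect you'd be after in both the ZEO cache and the storage
server's disk cache.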

A disadvantage of ZEO caches is that they aren't shared between
processes, and I've been thinking of ways to leverage something like
memcached.

> However, we were also
> wondering if we could in some way take advantage of different computer
> hardware - since the ZEO server is mostly single-threaded we thought of
> getting a machine with higher clock freq and larger cache rather than a
> commodity 8-core server (which is what we are using now).

Is your storage server CPU bound?

Starting with ZODB 3.10, ZEO storage servers are multi-threaded. They
have a thread for each client. We have a storage server that has run
at 120% CPU on a 4-core box. Also, if you use zc.FileStorage, packing
is mostly done in a separate process.

> Any tips on the kind of hardware that performs best with ZODB/ZEO?

A major source of slowdown *can* be disk access times. How's IO wait
on your server? If IO wait is high, then consider adding RAM to get a
larger disk cache, or moving the database to an SSD. We're running our
largest, most active database on an SSD. (Blobs are still on magnetic
disk.) Again, compression can help a lot here, allowing databases to
fit on an SSD that otherwise wouldn't.

> Are
> there any adjustments that can be done at the OS or even application
> layer that might improve performance?

Look at how your application is using data. If you have requests that
have to load a lot of data, maybe you can refactor your application to
load less.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev