[Zope3-dev] 64-bit BTrees
I have a need for 64-bit BTrees (at least for IOBTree and OIBTree), and I'm not the first. I've created a feature development branch for this, and checked in my initial implementation.

I've modified the existing code to use PY_LONG_LONG instead of int for the key and/or value type; there's no longer a 32-bit version in the modified code. Any Python int or long that can fit in 64 bits is accepted; ValueError is raised for values that require 65 bits (or more). Keys and values that can be reported as Python ints are, and longs are only returned when the value cannot be converted to a Python int.

This can have a substantial effect on memory consumption, since keys and/or values now take twice the space. There may be performance issues as well, but those have not been tested.

There are new unit tests, but more are likely needed.

If you're interested in getting the code from Subversion, it's available at:

    svn://svn.zope.org/repos/main/ZODB/branches/fdrake-64bits/

Ideally, this or some variation on this could be folded back into the main development for ZODB. If this is objectionable, making 64-bit btrees available would require introducing new versions of the btrees (possibly named LLBTree, LOBTree, and OLBTree).

I welcome comments.

  -Fred

--
Fred L. Drake, Jr.    fdrake at gmail.com
Don't let schooling interfere with your education. -- Mark Twain
___
Zope3-dev mailing list
Zope3-dev@zope.org
Unsub: http://mail.zope.org/mailman/options/zope3-dev/archive%40mail-archive.com
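The acceptance rule described in the message above (anything that fits in 64 bits is accepted, anything needing 65 bits or more raises ValueError) can be sketched in pure Python. This is only an illustration, not the C code on the branch: `check_int64` is a hypothetical helper, it assumes the 64-bit type is signed (as PY_LONG_LONG is), and it assumes Python 3, where int and long are a single type.

```python
# Illustrative sketch only; check_int64 is a hypothetical name, not part of BTrees.
INT64_MIN = -(2 ** 63)      # smallest value a signed 64-bit C integer can hold
INT64_MAX = 2 ** 63 - 1     # largest value a signed 64-bit C integer can hold


def check_int64(value):
    """Return value unchanged if it fits in a signed 64-bit integer."""
    if not isinstance(value, int):
        raise TypeError("expected an integer, got %s" % type(value).__name__)
    if not INT64_MIN <= value <= INT64_MAX:
        # Mirrors the behaviour described in the message: too wide -> ValueError.
        raise ValueError("%d does not fit in 64 bits" % value)
    return value
```

With this sketch, `check_int64(2 ** 63 - 1)` returns its argument, while `check_int64(2 ** 63)` raises ValueError.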
[Zope3-dev] Re: 64-bit BTrees
Fred Drake wrote:
> Ideally, this or some variation on this could be folded back into the
> main development for ZODB. If this is objectionable, making 64-bit
> btrees available would require introducing new versions of the btrees
> (possibly named LLBTree, LOBTree, and OLBTree).

I think coming up with new types is the only reasonable thing to do, given the prevalence of persistent BTrees out in the wild. Changing the runtime behavior (footprint, performance) of those objects is probably not something which most users are going to want, at least not without carefully considering the implications.

Tres.
--
Tres Seaver          +1 202-558-7113          [EMAIL PROTECTED]
Palladion Software   Excellence by Design     http://palladion.com
[Zope3-dev] Re: [Zope-dev] Re: 64-bit BTrees
Tres Seaver wrote:
> Fred Drake wrote:
>> Ideally, this or some variation on this could be folded back into the
>> main development for ZODB. If this is objectionable, making 64-bit
>> btrees available would require introducing new versions of the btrees
>> (possibly named LLBTree, LOBTree, and OLBTree).
>
> I think coming up with new types is the only reasonable thing to do,
> given the prevalence of persistent BTrees out in the wild. Changing
> the runtime behavior (footprint, performance) of those objects is
> probably not something which most users are going to want, at least
> not without carefully considering the implications.

It really depends on what the impact is. It would be nice to get a feel for whether this really impacts memory or performance for real applications. This adds 4 bytes per key or value. That isn't much, especially in a typical Zope application. Similarly, it's hard to say what the difference in C integer operations will be. I can easily imagine it being negligible (or being significant :).

OTOH, adding a new type could be a huge PITA. We'd like to use these with existing catalog and index code, all of which uses IIBTrees. If the performance impacts are modest, I'd much rather declare IIBTrees to use 64-bit rather than 32-bit integers.

I suppose an alternative would be to add a mechanism to configure IIBTrees to use either 32-bit or 64-bit integers at run-time.

Jim

--
Jim Fulton           mailto:[EMAIL PROTECTED]       Python Powered!
CTO                  (540) 361-1714                http://www.python.org
Zope Corporation     http://www.zope.com           http://www.zope.org
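The "4 bytes per key or value" figure can be turned into a rough estimate of the added in-memory footprint. The sketch below is back-of-the-envelope arithmetic only: `extra_bytes` and the `INT_SLOTS` table are hypothetical names, and the estimate deliberately ignores bucket overhead, alignment, and the Python objects stored in the "O" slots.

```python
# Back-of-the-envelope sketch; extra_bytes is a hypothetical helper.
EXTRA_BYTES_PER_INT_SLOT = 8 - 4   # 64-bit slot minus 32-bit slot

# Number of integer slots per entry for each flavor (key and/or value).
INT_SLOTS = {"II": 2, "IO": 1, "OI": 1}


def extra_bytes(flavor, entries):
    """Estimate the added in-memory bytes for `entries` items of one flavor."""
    return INT_SLOTS[flavor] * EXTRA_BYTES_PER_INT_SLOT * entries


for flavor in ("II", "IO", "OI"):
    print(flavor, extra_bytes(flavor, 1_000_000))
```

Under these assumptions a million-entry IIBTree grows by about 8 MB and a million-entry IOBTree or OIBTree by about 4 MB; whether that matters for a real application is the open question raised above.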
Re: [Zope3-dev] Re: [Zope-dev] Re: 64-bit BTrees
[Tres Seaver]
> ... I would guess that if you could do a census of all the OIDs in all
> the Datas.fs in the world, a significant majority of them would be
> instances of classes declared in IOBTree / IIBTree (certainly the bulk
> of *transaction* records are going to be tied up with them).

Provided it still works, people can use ZODB's analyze.py to figure that out.

But supposing the I* flavors of BTrees are the only objects that exist, what follows from that? It's not clear. I can guarantee that multiunion() will run slower, because its workhorse radix sort will need 8 (instead of 4) passes. Beyond that, it requires someone to try it.

I'm reminded that when the MEMS Exchange wrote Durus (a kind of ZODB lite ;-):

    http://www.mems-exchange.org/software/durus/

) they left their entire BTree implementation coded in Python -- it was fast enough that way. The difference between ZODB's BTree C code pushing 4 or 8 bytes around at a time may well be insignificant overall.

If done carefully, pickle sizes probably won't change: cPickle has a large number of ways to pickle integers, and picks the smallest one needed to hold an integer's actual value. Provided the internal getstate() functions are careful to avoid Python longs when possible, bigger pickles won't happen unless more than 32 bits are actually needed to hold an integer.

There's also that ZODB's current I* trees are badly broken on 64-bit boxes (except for Win64) in at least this way:

    http://collector.zope.org/Zope/1592

That problem would go away by magic.

looks-like-a-case-of-measure-twice-cut-once-ly y'rs - tim
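The point about pickle sizes is easy to check directly. The sketch below is an assumption-laden stand-in for the situation described above: it uses Python 3's `pickle` module with protocol 2 rather than cPickle, and it pickles bare integers rather than real BTree state. `pickled_size` is a hypothetical helper.

```python
import pickle


def pickled_size(n, protocol=2):
    """Length in bytes of the pickle of the integer n."""
    return len(pickle.dumps(n, protocol=protocol))


for n in (5, 70000, 2 ** 31 - 1, 2 ** 31, 2 ** 40, 2 ** 63 - 1):
    # Round-trip to confirm the value survives, then report the size.
    assert pickle.loads(pickle.dumps(n, protocol=2)) == n
    print(n, pickled_size(n))
```

The printed sizes grow with the magnitude of the value, not with the width of the C type that held it, so values that fit in 32 bits should pickle no larger than before.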