> There are a couple factual inaccuracies on the site that I'd like to clear up
> first:
>
> Trivial benchmarks put cerealizer and banana/jelly on the same level as far
> as performance goes:
>
> $ python -m timeit -s 'from cereal import dumps; L = ["Hello", " ", ("w",
> "o", "r", "l", "d", ".")]' 'dumps(L)'
> 10000 loops, best of 3: 84.1 usec per loop
>
> $ python -m timeit -s 'from twisted.spread import banana, jelly; dumps =
> lambda o: banana.encode(jelly.jelly(o)); L = ["Hello", " ", ("w", "o", "r",
> "l", "d", ".")]' 'dumps(L)'
> 10000 loops, best of 3: 89.7 usec per loop
>
> This is with cBanana though, which has to be explicitly enabled and, of
> course, is written in C. So Cerealizer looks like it has the potential to do
> pretty well, performance-wise.

My personal benchmark was different; it used a list of 2000 objects defined
as follows:

    class O(object):
        def __init__(self):
            self.x = 1
            self.s = "jiba"
            self.o = None

with self.o referring to another O object. I think my benchmark, although
still very limited, is more representative since it involves objects, strings,
numbers and lists. See it here:

http://svn.gna.org/viewcvs/*checkout*/soya/trunk/cerealizer/test/test1.py?content-type=text%2Fplain&rev=31

The results are (using Psyco):

With old-style classes:

cerealizer
  dumps in 0.0619530677795 s, 114914 bytes length
  loads in 0.0313038825989 s
cPickle
  dumps in 0.0301840305328 s, 116356 bytes length
  loads in 0.023097038269 s
jelly + banana
  dumps in 0.168012142181 s, 169729 bytes length
  loads in 1.82081913948 s
jelly + cBanana
  dumps in 0.082946062088 s, 169729 bytes length
  loads in 0.156159877777 s

With new-style classes:

cerealizer
  dumps in 0.0575239658356 s, 114914 bytes length
  loads in 0.028165102005 s
cPickle
  dumps in 0.07634806633 s, 116428 bytes length
  loads in 0.0278959274292 s
jelly + banana
  dumps in 0.156242132187 s, 169729 bytes length
  loads: TypeError (I didn't investigate this problem yet, although it is
  surely solvable)
jelly + cBanana
  dumps in 0.10772895813 s, 169729 bytes length
  loads: TypeError (I didn't investigate this problem yet, although it is
  surely solvable)

As you can see, cPickle dumps about 2 times faster than cerealizer for
old-style classes, but for new-style classes cerealizer dumps faster than
cPickle and loads at about the same speed (which makes sense since I have
optimized it for new-style classes). However, Jelly is far behind, even using
cBanana, especially for loading.

> You talked about _Tuple and _Dereference on the website as well. These are
> internal implementation details. jelly also supports extension types, by way
> of setUnjellyableForClass and similar functions.

The problem arises only when the extension type expects an attribute of a
specific class, e.g.
(in Pyrex):

    cdef class MyClass:
        cdef MyClass other

The other attribute of MyClass can only contain a reference to an instance of
MyClass (or None). Thus it cannot be set to an instance of _Dereference or
_Tuple, even temporarily; doing other = _Dereference(...) raises an exception.

I solve this problem in Cerealizer with a 2-pass object creation: step 1,
create all the objects; step 2, set all the objects' states.

> As far as security goes, no obvious problems jump out at me, either
> from the API or from skimming the code. I think early-binding
> __new__, __getstate__, and __setstate__ may be going further than
> is necessary. If someone can find code to set attributes on classes
> in your process space, they can probably already do anything they
> want to your program and don't need to exploit security problems in
> your serializer.

I agree on that; however, I prefer being "over-secure" to being "just as
secure as necessary" :-)

Thank you for your opinion! I'm going to update my website.

Jiba
-- 
http://mail.python.org/mailman/listinfo/python-list
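
The 2-pass object creation mentioned above can be illustrated with a minimal
sketch in plain Python. This is not Cerealizer's actual code: the Node class,
dump() and load() are hypothetical names made up for the illustration, and a
type-checking property stands in for the typed Pyrex attribute. The point is
only that when every object exists before any state is set, a typed attribute
never has to hold a placeholder, even for cyclic references.

```python
class Node(object):
    """Hypothetical stand-in for an extension type with a typed attribute."""

    __slots__ = ("name", "_other")

    def __init__(self, name, other=None):
        self.name = name
        self.other = other

    @property
    def other(self):
        return self._other

    @other.setter
    def other(self, value):
        # Mimics "cdef MyClass other": only Node instances or None are accepted.
        if value is not None and not isinstance(value, Node):
            raise TypeError("other must be a Node or None")
        self._other = value


def dump(objs):
    """Flatten objects into (name, index-of-other) records."""
    index = dict((id(o), i) for i, o in enumerate(objs))
    return [(o.name, None if o.other is None else index[id(o.other)])
            for o in objs]


def load(records):
    """Rebuild the objects from the records in two passes."""
    # Pass 1: create all the objects, without any state.
    objs = [Node.__new__(Node) for _ in records]
    # Pass 2: set the states; every referenced object already exists, so the
    # typed attribute only ever receives a real Node (or None).
    for obj, (name, other_index) in zip(objs, records):
        obj.name = name
        obj.other = None if other_index is None else objs[other_index]
    return objs


if __name__ == "__main__":
    a = Node("a")
    b = Node("b", a)
    a.other = b  # a and b now reference each other
    ra, rb = load(dump([a, b]))
    print(ra.other is rb and rb.other is ra)
```

A single-pass loader would meet the reference from a to b before b exists and
would need some temporary placeholder object, which is exactly what a typed
attribute rejects.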