[ https://issues.apache.org/jira/browse/THRIFT-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084855#comment-13084855 ]
Will Pierce commented on THRIFT-1115: ------------------------------------- Here is sample comparison of the code generated for the trunk/test/ThriftTest.thrift's new "NestedMixedx2" struct, which is defined as: {code} struct NestedMixedx2 { 1: list<set<i32>> int_set_list 2: map<i32,set<string>> map_int_strset 3: list<map<i32,set<string>>> map_int_strset_list } {code} Here is the generated code from ttypes.py for this class using the new dynamic,slots option: {code} class NestedMixedx2(TBase): """ Attributes: - int_set_list - map_int_strset - map_int_strset_list """ __slots__ = [ 'int_set_list', 'map_int_strset', 'map_int_strset_list', ] thrift_spec = ( None, # 0 (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1 (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2 (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3 ) def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,): self.int_set_list = int_set_list self.map_int_strset = map_int_strset self.map_int_strset_list = map_int_strset_list {code} Here is the same class using old style default options: {code} class NestedMixedx2: """ Attributes: - int_set_list - map_int_strset - map_int_strset_list """ thrift_spec = ( None, # 0 (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1 (2, TType.MAP, 'map_int_strset', (TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2 (3, TType.LIST, 'map_int_strset_list', (TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3 ) def __init__(self, int_set_list=None, map_int_strset=None, map_int_strset_list=None,): self.int_set_list = int_set_list self.map_int_strset = map_int_strset self.map_int_strset_list = map_int_strset_list def read(self, iprot): if iprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is not None and f astbinary is not None: fastbinary.decode_binary(self, iprot.trans, (self.__class__, self.thrift_spec)) return iprot.readStructBegin() while True: (fname, ftype, fid) = iprot.readFieldBegin() if ftype == TType.STOP: break if fid == 1: if ftype == TType.LIST: self.int_set_list = [] (_etype176, _size173) = iprot.readListBegin() for _i177 in xrange(_size173): _elem178 = set() (_etype182, _size179) = iprot.readSetBegin() for _i183 in xrange(_size179): _elem184 = iprot.readI32(); _elem178.add(_elem184) iprot.readSetEnd() self.int_set_list.append(_elem178) iprot.readListEnd() else: iprot.skip(ftype) elif fid == 2: if ftype == TType.MAP: self.map_int_strset = {} (_ktype186, _vtype187, _size185 ) = iprot.readMapBegin() for _i189 in xrange(_size185): _key190 = iprot.readI32(); _val191 = set() (_etype195, _size192) = iprot.readSetBegin() for _i196 in xrange(_size192): _elem197 = iprot.readString(); _val191.add(_elem197) iprot.readSetEnd() self.map_int_strset[_key190] = _val191 iprot.readMapEnd() else: iprot.skip(ftype) elif fid == 3: if ftype == TType.LIST: self.map_int_strset_list = [] (_etype201, _size198) = iprot.readListBegin() for _i202 in xrange(_size198): _elem203 = {} (_ktype205, _vtype206, _size204 ) = iprot.readMapBegin() for _i208 in xrange(_size204): _key209 = iprot.readI32(); _val210 = set() (_etype214, _size211) = iprot.readSetBegin() for _i215 in xrange(_size211): _elem216 = iprot.readString(); _val210.add(_elem216) iprot.readSetEnd() _elem203[_key209] = _val210 iprot.readMapEnd() self.map_int_strset_list.append(_elem203) iprot.readListEnd() else: iprot.skip(ftype) else: iprot.skip(ftype) iprot.readFieldEnd() iprot.readStructEnd() def write(self, oprot): if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and self.thrift_spec is not None and fastbinary is not None: oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, self.thrift_spec))) return oprot.writeStructBegin('NestedMixedx2') if self.int_set_list is not None: oprot.writeFieldBegin('int_set_list', TType.LIST, 1) oprot.writeListBegin(TType.SET, len(self.int_set_list)) for iter217 in self.int_set_list: oprot.writeSetBegin(TType.I32, len(iter217)) for iter218 in iter217: oprot.writeI32(iter218) oprot.writeSetEnd() oprot.writeListEnd() oprot.writeFieldEnd() if self.map_int_strset is not None: oprot.writeFieldBegin('map_int_strset', TType.MAP, 2) oprot.writeMapBegin(TType.I32, TType.SET, len(self.map_int_strset)) for kiter219,viter220 in self.map_int_strset.items(): oprot.writeI32(kiter219) oprot.writeSetBegin(TType.STRING, len(viter220)) for iter221 in viter220: oprot.writeString(iter221) oprot.writeSetEnd() oprot.writeMapEnd() oprot.writeFieldEnd() if self.map_int_strset_list is not None: oprot.writeFieldBegin('map_int_strset_list', TType.LIST, 3) oprot.writeListBegin(TType.MAP, len(self.map_int_strset_list)) for iter222 in self.map_int_strset_list: oprot.writeMapBegin(TType.I32, TType.SET, len(iter222)) for kiter223,viter224 in iter222.items(): oprot.writeI32(kiter223) oprot.writeSetBegin(TType.STRING, len(viter224)) for iter225 in viter224: oprot.writeString(iter225) oprot.writeSetEnd() oprot.writeMapEnd() oprot.writeListEnd() oprot.writeFieldEnd() oprot.writeFieldStop() oprot.writeStructEnd() def validate(self): return def __repr__(self): L = ['%s=%r' % (key, value) for key, value in self.__dict__.iteritems()] return '%s(%s)' % (self.__class__.__name__, ', '.join(L)) def __eq__(self, other): return isinstance(other, self.__class__) and self.__dict__ == other.__dict__ def __ne__(self, other): return not (self == other) {code} If this (THRIFT-1115) passes muster, then the next direction that might be interesting to go would be to look at reviving the python .thrift compiler from branches/py-compiler/compiler/py/src/, removing all of its code text generation, and instead build up the thrift classes at runtime using metaclasses. If the parser/builder class lived in the python thrift libraries, then it would allow some kinds of applications (proxies maybe) to work more dynamically. > python TBase class for dynamic (de)serialization, and __slots__ option for > memory savings > ----------------------------------------------------------------------------------------- > > Key: THRIFT-1115 > URL: https://issues.apache.org/jira/browse/THRIFT-1115 > Project: Thrift > Issue Type: New Feature > Components: Build Process, Python - Compiler, Python - Library > Reporter: Will Pierce > Assignee: Will Pierce > Attachments: THRIFT-1115.python_dynamic_code_and_slots_v1.patch, > THRIFT-1115.python_dynamic_code_and_slots_v2.patch, > THRIFT-1115.python_dynamic_code_and_slots_v3.patch, > THRIFT-1115.python_dynamic_code_and_slots_v4.patch, > THRIFT-1115.python_dynamic_code_and_slots_v5.patch, > THRIFT-1115.python_dynamic_code_and_slots_v6.patch, test_dynser.py, > test_size.py > > > This patch adds several new features to the compiler for python and the > python libraries, and exercises the new features with expanded unit testing. > This adds support for generating python classes that have no {{read()}} or > {{write()}} methods. Instead, the generated classes inherit from a new base > class, {{TProtocolDynamic}}. This new class implements the de/serialization > with {{read()}} and {{write()}} methods that iterate over the derived class's > "{{thrift_spec}}" class member that describes the structure and types of the > thrift. This dynamic evaluation works with both binary and compact > protocols, and has the same hook as before for delegating de/serialization to > fastbinary.so for the "accelerated binary" protocol. This new baseclass > {{read()}} method may even be more efficient than the generated explicit > {{read()}} code for objects with lots of attributes because it doesn't have a > case/switch style series of "{{if field_id == X...}}" nested inside a loop. > Instead, it indexes the decoded field ID into the {{thrift_spec}} tuple > directly. That efficiency gain is probably just noise though, since the > dynamic process probably uses more CPU later on, though I haven't benchmarked > it. (Up[date: see the benchmarking results posted below for > construction/serialization/deserialization comparisons.) > If the 'dynamic' flag is given as a -gen py: flag to the compiler, then the > generated classes no longer get individual {{\_\_repr\_\_}} and > {{\_\_eq\_\_}} and {{\_\_ne\_\_}} methods, instead they inherit from the > TProtocolDynamic base class implementation, which uses {{\_\_slots\_\_}} > instead of {{\_\_dict\_\_}} for repr and equality testing. > When "dynamic" python classes are generated, they have very little code, just > a constructor and class data. All the work of serialization and > deserialization is done by the base class. This produces about 980 lines for > DebugProtoTest vs. 3540 lines in default "\-gen py" mode, or about 1/3 the > original code size. > The {{\_\_slots\_\_}} support is available without requiring the dynamic base > class, so users can save memory using the slots flag to generate non-dict > based instances. The memory difference between dict and slots based objects > is hard to measure, but seems to be around 10x smaller using slots, as long > as the base class also uses {{\_\_slots\_\_}}. If the generated classes are > old-style, and use slots, there's no memory savings at all, because the base > class still creates a {{\_\_dict\_\_}} object for every instance. Python is > just tricky when it comes to using {{\_\_slots\_\_}} best. > The memory savings is pretty astounding using new-style classes and > {{\_\_slots\_\_}}. Building DebugProtoTest.thrift with: -gen > py:dynamic,slots versus \-gen py results in some pretty amazing memory > savings. I tested by instantiating 1 million of the heavy > DebugProtoTest.thrift's {{CompactProtoTestStruct()}}, which has 49 attributes > in it, using regular "\-gen py" code versus "{{\-gen py:dynamic,slots}}" and > compared the VmRSS resident memory usage of both processes. I didn't set any > values to any attributes, so every attribute was left with the null value, > None. The slots technique used 441 MB with slots vs. 3485 MB using > non-slots, non-dynamic generated code. That's about 13% of the original > size, or 87% memory savings. > I tried the same test using a very tiny object instead, the DebugThrift.thift > {{Bonk()}} class, which only has two fields. For this, I made 10 million > instances of {{Bonk()}} and the results were very similar: 730 MB with slots > vs. 2622 MB using non-slots. > The speed difference between non-slots and slots is about 25% better with > slots, or 14.6 seconds for non-slots 10 million Bonk() test, versus 11.1 > seconds for the same slots test, or a savings of 24% time. This only > measured object creation time, not serialization/deserialization or instance > attribute access. (See the benchmarking results below for > serialization/deserialization comparisons). > If the user wants to subclass the {{TProtocolDynamic}} class and override the > read/write methods, or add more functionality, this patch also adds a > {{dynbase=CLS}} thrift compiler option and a "{{dynimport='from foo import > *}}" option, to specify a different base class (CLS) for the generated > "dynamic" objects, and an import line that brings that base class into scope. > The python unit tests now get yet another test loop layer, with the > {{Makefile.am}} specifying multiple thrift generation lines, exercising the > possible combinations of {{thrift \-gen py:blah}}, sending the output to > multiple gen-py-blah output directories (like gen\-py\-default, > gen\-py\-dynamicslots, gen\-py\-newstyle, etc), so that > {{RunClientServer.py}} can then loop over each of these and do all of its > testing (which is combinatorially exhaustive) on each of those > subdirectories. It's a pretty insane number of tests now, over 390 > individual tests on python >= 2.5 (ssl + multiprocessing). Since the test/py > "make check" tests aren't done for the 'casual' installer, I think it's > acceptable to have the "make check" tests take this long to complete- but it > is annoying to have to wait a few minutes for all the tests to > finish/succeed. I think it could be improved a fair bit. > This patch also makes a few other small changes: > * adds a comment to the top of each generated .py file with the options to > \-gen, i.e. "{{py:dynamic,slots}}", to make it easier to figure out how the > thrift compiler generated code was created, after the fact. This should > probably be backported to other languages. > * added another test for inequality to {{SerializationTest.py}} to compare > objects of different types, to exercise the {{\_\_eq\_\_}} code more > * added \-\-genpydir cmdline options to {{TestServer.py}} and > {{TestClient.py}} to let {{RunClientServer.py}} loop over the gen\-py\-\* > directories as a kind of test controller. > * reduced verbosity on the TestServer/Client.py code from 2 to 1, since the > number of invocations is now huge. > * changed TestServer's intentional Exception to have a message "Exception > test PASSES." instead of "foo", which previously made the unit test output > look kind of scary. > * Adds container read/write methods to {{TProtocol}}, only used by > {{TProtocolDynamic}} for "dynamic" generated classes > * Added a {{_VALUES_TO_NAMES}} list to {{TType}} in {{Thrift.py}} to make it > easier to log about {{TType}} related (de)serialization issues. > * Changes the import line in generated code from "{{from thrift.Thrift import > *}}" to "{{from thrift.Thrift import TType, TMessageType}}" to be a little > more controlled about what is imported into the generated\-code namespace. > May be slightly more efficient too. > * reduces the inter\-test {{sleep()}} delays somewhat, to try to speed up the > "make check" unit testing a bit. hopefully the delays are large enough that > no spurious errors pop up. > Things to ponder: > Using -gen py:dynamic,slots actually results in python rejecting any attempt > to assign to a non-existent member variable of a thrift object. It's not the > same as statically-typed languages blowing up at compile time for the same > reason, but it could expose more subtle user-code errors that otherwise are > undetected during serialization. > There's an interesting edge case in python 2.4 that arises due to old-style > classes and exception processing that only shows up with new-style classes > (those derived ultimately from object). I'm not sure how to handle this, > since it's a problem that you don't know exists at code-generation time, > since we can't really know what version of python is going to execute the > generated code. Usually it just doesn't matter, but the case of exceptions > with multiple inheritance is an edge case, it seems. Since this is only a > tricky problem for the unit tests, I put some failure detection code into > {{TestClient.py}} to skip the exception testing if the generated class causes > a {{TypeError}} exception when raised, instead of itself. The test code > prints a warning: "{{Skipping testing of Xception because python (2, 4) does > not support this type of exception (gen py is: gen\-py\-dynamic)}}". > I'm not sure the {{TProtocolDynamic}} class name is the best possible name > for the base class for "dynamic" objects to derive from. The > {{TProtocolDynamic}} class does embody the underlying protocol, at an > abstract level, but the only other class in thrift/protocol/ that it is > similar to is the {{TProtocolBase}} class. > This patch adds more dynamic behavior to the thrift library, and it makes > sense that we'd want to take it another step and implement a python class > that can parse a .thrift file, and generate the corresponding > message/struct/etc classes at runtime. This patch doesn't do that, but I > think it makes it easier. Obviously the use-cases are compelling for > generating the actual per-class {{read()}} and {{write()}} methods at > thrift-compile time. But there's some interesting possibilities ahead. > The unit testing code needs a little more TLC... I'm not too familiar with > best practices here, but I'd love to see some code-coverage metrics produced > by the unit testing / test cases. It would be nice to know if some code > paths aren't being exercised. I won't be surprised if there are edge-cases > that break the dynamic/slots generated code. It would be nice to know of > any, so we can update the unit tests to explore it (and fix whatever it is). > The unit tests don't exercise the py:twisted compatible code at all, so > there's likely some lurking bugs in there. > With this patch, the new output from {{thrift -h}} relating to python is: > {noformat} > --gen STR Generate code with a dynamically-registered generator. > STR has the form language[:key1=val1[,key2,[key3=val3]]]. > Keys and values are options passed to the generator. > Many options will not require values. > Available generators (and options): > [ ... snip ... ] > py (Python): > new_style: Generate new-style classes. > twisted: Generate Twisted-friendly RPC services. > slots: Generate code using slots for instance members. > dynamic: Generate dynamic code, less code generated but slower. > dynbase=CLS Derive generated classes from class CLS instead of > TProtocolDynamic. > dynimport='from foo.bar import CLS' > Add an import line to generated code to find the dynbase > class. > {noformat} > Sorry for such a long ticket body. I'll attach the patch in a minute or so, -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira