[jira] [Commented] (THRIFT-1115) python TBase class for dynamic (de)serialization, and slots option for memory savings

Will Pierce (JIRA) Sun, 14 Aug 2011 08:53:51 -0700

    [ 
https://issues.apache.org/jira/browse/THRIFT-1115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084855#comment-13084855
 ]


Will Pierce commented on THRIFT-1115:
-------------------------------------

Here is sample comparison of the code generated for the 
trunk/test/ThriftTest.thrift's new "NestedMixedx2" struct, which is defined as:
{code}
struct NestedMixedx2 {
  1: list<set<i32>> int_set_list
  2: map<i32,set<string>> map_int_strset
  3: list<map<i32,set<string>>> map_int_strset_list
}
{code}

Here is the generated code from ttypes.py for this class using the new 
dynamic,slots option:
{code}
class NestedMixedx2(TBase):
  """
  Attributes:
   - int_set_list
   - map_int_strset
   - map_int_strset_list
  """

  __slots__ = [ 
    'int_set_list',
    'map_int_strset',
    'map_int_strset_list',
   ]

  thrift_spec = (
    None, # 0
    (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1
    (2, TType.MAP, 'map_int_strset', 
(TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2
    (3, TType.LIST, 'map_int_strset_list', 
(TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3
  )

  def __init__(self, int_set_list=None, map_int_strset=None, 
map_int_strset_list=None,):
    self.int_set_list = int_set_list
    self.map_int_strset = map_int_strset
    self.map_int_strset_list = map_int_strset_list
{code}

Here is the same class using old style default options:
{code}
class NestedMixedx2:
  """
  Attributes:
   - int_set_list
   - map_int_strset
   - map_int_strset_list
  """

  thrift_spec = (
    None, # 0
    (1, TType.LIST, 'int_set_list', (TType.SET,(TType.I32,None)), None, ), # 1
    (2, TType.MAP, 'map_int_strset', 
(TType.I32,None,TType.SET,(TType.STRING,None)), None, ), # 2
    (3, TType.LIST, 'map_int_strset_list', 
(TType.MAP,(TType.I32,None,TType.SET,(TType.STRING,None))), None, ), # 3
  )

  def __init__(self, int_set_list=None, map_int_strset=None, 
map_int_strset_list=None,):
    self.int_set_list = int_set_list
    self.map_int_strset = map_int_strset
    self.map_int_strset_list = map_int_strset_list

  def read(self, iprot):
    if iprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and 
isinstance(iprot.trans, TTransport.CReadableTransport) and self.thrift_spec is 
not None and f
astbinary is not None:
      fastbinary.decode_binary(self, iprot.trans, (self.__class__, 
self.thrift_spec))
      return
    iprot.readStructBegin()
    while True:
      (fname, ftype, fid) = iprot.readFieldBegin()
      if ftype == TType.STOP:
        break
      if fid == 1:
        if ftype == TType.LIST:
          self.int_set_list = []
          (_etype176, _size173) = iprot.readListBegin()
          for _i177 in xrange(_size173):
            _elem178 = set()
            (_etype182, _size179) = iprot.readSetBegin()
            for _i183 in xrange(_size179):
              _elem184 = iprot.readI32();
              _elem178.add(_elem184)
            iprot.readSetEnd()
            self.int_set_list.append(_elem178)
          iprot.readListEnd()
        else:
          iprot.skip(ftype)
      elif fid == 2:
        if ftype == TType.MAP:
          self.map_int_strset = {}
          (_ktype186, _vtype187, _size185 ) = iprot.readMapBegin() 
          for _i189 in xrange(_size185):
            _key190 = iprot.readI32();
            _val191 = set()
            (_etype195, _size192) = iprot.readSetBegin()
            for _i196 in xrange(_size192):
              _elem197 = iprot.readString();
              _val191.add(_elem197)
            iprot.readSetEnd()
            self.map_int_strset[_key190] = _val191
          iprot.readMapEnd()
        else:
          iprot.skip(ftype)
      elif fid == 3:
        if ftype == TType.LIST:
          self.map_int_strset_list = []
          (_etype201, _size198) = iprot.readListBegin()
          for _i202 in xrange(_size198):
            _elem203 = {}
            (_ktype205, _vtype206, _size204 ) = iprot.readMapBegin() 
            for _i208 in xrange(_size204):
              _key209 = iprot.readI32();
              _val210 = set()
              (_etype214, _size211) = iprot.readSetBegin()
              for _i215 in xrange(_size211):
                _elem216 = iprot.readString();
                _val210.add(_elem216)
              iprot.readSetEnd()
              _elem203[_key209] = _val210
            iprot.readMapEnd()
            self.map_int_strset_list.append(_elem203)
          iprot.readListEnd()
        else:
          iprot.skip(ftype)
      else:
        iprot.skip(ftype)
      iprot.readFieldEnd()
    iprot.readStructEnd()

  def write(self, oprot):
    if oprot.__class__ == TBinaryProtocol.TBinaryProtocolAccelerated and 
self.thrift_spec is not None and fastbinary is not None:
      oprot.trans.write(fastbinary.encode_binary(self, (self.__class__, 
self.thrift_spec)))
      return
    oprot.writeStructBegin('NestedMixedx2')
    if self.int_set_list is not None:
      oprot.writeFieldBegin('int_set_list', TType.LIST, 1)
      oprot.writeListBegin(TType.SET, len(self.int_set_list))
      for iter217 in self.int_set_list:
        oprot.writeSetBegin(TType.I32, len(iter217))
        for iter218 in iter217:
          oprot.writeI32(iter218)
        oprot.writeSetEnd()
      oprot.writeListEnd()
      oprot.writeFieldEnd()
    if self.map_int_strset is not None:
      oprot.writeFieldBegin('map_int_strset', TType.MAP, 2)
      oprot.writeMapBegin(TType.I32, TType.SET, len(self.map_int_strset))
      for kiter219,viter220 in self.map_int_strset.items():
        oprot.writeI32(kiter219)
        oprot.writeSetBegin(TType.STRING, len(viter220))
        for iter221 in viter220:
          oprot.writeString(iter221)
        oprot.writeSetEnd()
      oprot.writeMapEnd()
      oprot.writeFieldEnd()
    if self.map_int_strset_list is not None:
      oprot.writeFieldBegin('map_int_strset_list', TType.LIST, 3)
      oprot.writeListBegin(TType.MAP, len(self.map_int_strset_list))
      for iter222 in self.map_int_strset_list:
        oprot.writeMapBegin(TType.I32, TType.SET, len(iter222))
        for kiter223,viter224 in iter222.items():
          oprot.writeI32(kiter223)
          oprot.writeSetBegin(TType.STRING, len(viter224))
          for iter225 in viter224:
            oprot.writeString(iter225)
          oprot.writeSetEnd()
        oprot.writeMapEnd()
      oprot.writeListEnd()
      oprot.writeFieldEnd()
    oprot.writeFieldStop()
    oprot.writeStructEnd()

  def validate(self):
    return


  def __repr__(self):
    L = ['%s=%r' % (key, value)
      for key, value in self.__dict__.iteritems()]
    return '%s(%s)' % (self.__class__.__name__, ', '.join(L))

  def __eq__(self, other):
    return isinstance(other, self.__class__) and self.__dict__ == other.__dict__

  def __ne__(self, other):
    return not (self == other)
{code}

If this (THRIFT-1115) passes muster, then the next direction that might be 
interesting to go would be to look at reviving the python .thrift compiler from 
branches/py-compiler/compiler/py/src/, removing all of its code text 
generation, and instead build up the thrift classes at runtime using 
metaclasses.  If the parser/builder class lived in the python thrift libraries, 
then it would allow some kinds of applications (proxies maybe) to work more 
dynamically.


> python TBase class for dynamic (de)serialization, and __slots__ option for 
> memory savings
> -----------------------------------------------------------------------------------------
>
>                 Key: THRIFT-1115
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1115
>             Project: Thrift
>          Issue Type: New Feature
>          Components: Build Process, Python - Compiler, Python - Library
>            Reporter: Will Pierce
>            Assignee: Will Pierce
>         Attachments: THRIFT-1115.python_dynamic_code_and_slots_v1.patch, 
> THRIFT-1115.python_dynamic_code_and_slots_v2.patch, 
> THRIFT-1115.python_dynamic_code_and_slots_v3.patch, 
> THRIFT-1115.python_dynamic_code_and_slots_v4.patch, 
> THRIFT-1115.python_dynamic_code_and_slots_v5.patch, 
> THRIFT-1115.python_dynamic_code_and_slots_v6.patch, test_dynser.py, 
> test_size.py
>
>
> This patch adds several new features to the compiler for python and the 
> python libraries, and exercises the new features with expanded unit testing.
> This adds support for generating python classes that have no {{read()}} or 
> {{write()}} methods. Instead, the generated classes inherit from a new base 
> class, {{TProtocolDynamic}}. This new class implements the de/serialization 
> with {{read()}} and {{write()}} methods that iterate over the derived class's 
> "{{thrift_spec}}" class member that describes the structure and types of the 
> thrift.  This dynamic evaluation works with both binary and compact 
> protocols, and has the same hook as before for delegating de/serialization to 
> fastbinary.so for the "accelerated binary" protocol.  This new baseclass 
> {{read()}} method may even be more efficient than the generated explicit 
> {{read()}} code for objects with lots of attributes because it doesn't have a 
> case/switch style series of "{{if field_id == X...}}" nested inside a loop.  
> Instead, it indexes the decoded field ID into the {{thrift_spec}} tuple 
> directly.  That efficiency gain is probably just noise though, since the 
> dynamic process probably uses more CPU later on, though I haven't benchmarked 
> it. (Up[date: see the benchmarking results posted below for 
> construction/serialization/deserialization comparisons.)
> If the 'dynamic' flag is given as a -gen py: flag to the compiler, then the 
> generated classes no longer get individual {{\_\_repr\_\_}} and 
> {{\_\_eq\_\_}} and {{\_\_ne\_\_}} methods, instead they inherit from the 
> TProtocolDynamic base class implementation, which uses {{\_\_slots\_\_}} 
> instead of {{\_\_dict\_\_}} for repr and equality testing.
> When "dynamic" python classes are generated, they have very little code, just 
> a constructor and class data.  All the work of serialization and 
> deserialization is done by the base class.  This produces about 980 lines for 
> DebugProtoTest vs. 3540 lines in default "\-gen py" mode, or about 1/3 the 
> original code size.
> The {{\_\_slots\_\_}} support is available without requiring the dynamic base 
> class, so users can save memory using the slots flag to generate non-dict 
> based instances.  The memory difference between dict and slots based objects 
> is hard to measure, but seems to be around 10x smaller using slots, as long 
> as the base class also uses {{\_\_slots\_\_}}.  If the generated classes are 
> old-style, and use slots, there's no memory savings at all, because the base 
> class still creates a {{\_\_dict\_\_}} object for every instance.  Python is 
> just tricky when it comes to using {{\_\_slots\_\_}} best.
> The memory savings is pretty astounding using new-style classes and 
> {{\_\_slots\_\_}}.  Building DebugProtoTest.thrift with: -gen 
> py:dynamic,slots versus \-gen py results in some pretty amazing memory 
> savings.  I tested by instantiating 1 million of the heavy 
> DebugProtoTest.thrift's {{CompactProtoTestStruct()}}, which has 49 attributes 
> in it, using regular "\-gen py" code versus "{{\-gen py:dynamic,slots}}" and 
> compared the VmRSS resident memory usage of both processes.  I didn't set any 
> values to any attributes, so every attribute was left with the null value, 
> None.  The slots technique used 441 MB with slots vs. 3485 MB using 
> non-slots, non-dynamic generated code.  That's about 13% of the original 
> size, or 87% memory savings.
> I tried the same test using a very tiny object instead, the DebugThrift.thift 
> {{Bonk()}} class, which only has two fields.  For this, I made 10 million 
> instances of {{Bonk()}} and the results were very similar: 730 MB with slots 
> vs. 2622 MB using non-slots.
> The speed difference between non-slots and slots is about 25% better with 
> slots, or 14.6 seconds for non-slots 10 million Bonk() test, versus 11.1 
> seconds for the same slots test, or a savings of 24% time.  This only 
> measured object creation time, not serialization/deserialization or instance 
> attribute access. (See the benchmarking results below for 
> serialization/deserialization comparisons).
> If the user wants to subclass the {{TProtocolDynamic}} class and override the 
> read/write methods, or add more functionality, this patch also adds a 
> {{dynbase=CLS}} thrift compiler option and a "{{dynimport='from foo import 
> *}}" option, to specify a different base class (CLS) for the generated 
> "dynamic" objects, and an import line that brings that base class into scope.
> The python unit tests now get yet another test loop layer, with the 
> {{Makefile.am}} specifying multiple thrift generation lines, exercising the 
> possible combinations of {{thrift \-gen py:blah}}, sending the output to 
> multiple gen-py-blah output directories (like gen\-py\-default, 
> gen\-py\-dynamicslots, gen\-py\-newstyle, etc), so that 
> {{RunClientServer.py}} can then loop over each of these and do all of its 
> testing (which is combinatorially exhaustive) on each of those 
> subdirectories.  It's a pretty insane number of tests now, over 390 
> individual tests on python >= 2.5 (ssl + multiprocessing).  Since the test/py 
> "make check" tests aren't done for the 'casual' installer, I think it's 
> acceptable to have the "make check" tests take this long to complete- but it 
> is annoying to have to wait a few minutes for all the tests to 
> finish/succeed.  I think it could be improved a fair bit.
> This patch also makes a few other small changes:
> * adds a comment to the top of each generated .py file with the options to 
> \-gen, i.e. "{{py:dynamic,slots}}", to make it easier to figure out how the 
> thrift compiler generated code was created, after the fact.  This should 
> probably be backported to other languages.
> * added another test for inequality to {{SerializationTest.py}} to compare 
> objects of different types, to exercise the {{\_\_eq\_\_}} code more
> * added \-\-genpydir cmdline options to {{TestServer.py}} and 
> {{TestClient.py}} to let {{RunClientServer.py}} loop over the gen\-py\-\* 
> directories as a kind of test controller.
> * reduced verbosity on the TestServer/Client.py code from 2 to 1, since the 
> number of invocations is now huge.
> * changed TestServer's intentional Exception to have a message "Exception 
> test PASSES." instead of "foo", which previously made the unit test output 
> look kind of scary.
> * Adds container read/write methods to {{TProtocol}}, only used by 
> {{TProtocolDynamic}} for "dynamic" generated classes
> * Added a {{_VALUES_TO_NAMES}} list to {{TType}} in {{Thrift.py}} to make it 
> easier to log about {{TType}} related (de)serialization issues.
> * Changes the import line in generated code from "{{from thrift.Thrift import 
> *}}" to "{{from thrift.Thrift import TType, TMessageType}}" to be a little 
> more controlled about what is imported into the generated\-code namespace.  
> May be slightly more efficient too.
> * reduces the inter\-test {{sleep()}} delays somewhat, to try to speed up the 
> "make check" unit testing a bit.  hopefully the delays are large enough that 
> no spurious errors pop up.
> Things to ponder:
> Using -gen py:dynamic,slots actually results in python rejecting any attempt 
> to assign to a non-existent member variable of a thrift object.  It's not the 
> same as statically-typed languages blowing up at compile time for the same 
> reason, but it could expose more subtle user-code errors that otherwise are 
> undetected during serialization.
> There's an interesting edge case in python 2.4 that arises due to old-style 
> classes and exception processing that only shows up with new-style classes 
> (those derived ultimately from object).  I'm not sure how to handle this, 
> since it's a problem that you don't know exists at code-generation time, 
> since we can't really know what version of python is going to execute the 
> generated code.  Usually it just doesn't matter, but the case of exceptions 
> with multiple inheritance is an edge case, it seems.  Since this is only a 
> tricky problem for the unit tests, I put some failure detection code into 
> {{TestClient.py}} to skip the exception testing if the generated class causes 
> a {{TypeError}} exception when raised, instead of itself.  The test code 
> prints a warning: "{{Skipping testing of Xception because python (2, 4) does 
> not support this type of exception (gen py is: gen\-py\-dynamic)}}".
> I'm not sure the {{TProtocolDynamic}} class name is the best possible name 
> for the base class for "dynamic" objects to derive from.  The 
> {{TProtocolDynamic}} class does embody the underlying protocol, at an 
> abstract level, but the only other class in thrift/protocol/ that it is 
> similar to is the {{TProtocolBase}} class.
> This patch adds more dynamic behavior to the thrift library, and it makes 
> sense that we'd want to take it another step and implement a python class 
> that can parse a .thrift file, and generate the corresponding 
> message/struct/etc classes at runtime.  This patch doesn't do that, but I 
> think it makes it easier.  Obviously the use-cases are compelling for 
> generating the actual per-class {{read()}} and {{write()}} methods at 
> thrift-compile time.  But there's some interesting possibilities ahead.
> The unit testing code needs a little more TLC... I'm not too familiar with 
> best practices here, but I'd love to see some code-coverage metrics produced 
> by the unit testing / test cases.  It would be nice to know if some code 
> paths aren't being exercised.  I won't be surprised if there are edge-cases 
> that break the dynamic/slots generated code.  It would be nice to know of 
> any, so we can update the unit tests to explore it (and fix whatever it is).  
> The unit tests don't exercise the py:twisted compatible code at all, so 
> there's likely some lurking bugs in there.
> With this patch, the new output from {{thrift -h}} relating to python is:
> {noformat}
>   --gen STR   Generate code with a dynamically-registered generator.
>                 STR has the form language[:key1=val1[,key2,[key3=val3]]].
>                 Keys and values are options passed to the generator.
>                 Many options will not require values.
> Available generators (and options):
> [ ... snip ... ]
>   py (Python):
>     new_style:       Generate new-style classes.
>     twisted:         Generate Twisted-friendly RPC services.
>     slots:           Generate code using slots for instance members.
>     dynamic:         Generate dynamic code, less code generated but slower.
>     dynbase=CLS      Derive generated classes from class CLS instead of 
> TProtocolDynamic.
>     dynimport='from foo.bar import CLS'
>                      Add an import line to generated code to find the dynbase 
> class.
> {noformat}
> Sorry for such a long ticket body.  I'll attach the patch in a minute or so,

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (THRIFT-1115) python TBase class for dynamic (de)serialization, and __slots__ option for memory savings

Reply via email to

[jira] [Commented] (THRIFT-1115) python TBase class for dynamic (de)serialization, and slots option for memory savings