Re: ActiveMQ implementation of protobuf

Kenton Varda Fri, 18 Sep 2009 23:16:44 -0700

Somehow I missed that message.  Sorry about that.
I'd definitely like to have lazy parsing (as an option) in the official
implementation.  The reason I'm "stressing" is that there are a lot of
things like this that I'd like protocol buffers to have, but I don't have
enough time to write them all myself, so I need help from contributors.
Unfortunately, it seems that a lot of people would rather write their own
implementations from scratch than try to contribute to the main one -- you
aren't the first person who has done this.  That said, having competition is
a good thing too.


Regarding maven plugins -- why can't the plugin just invoke protoc using
Runtime.exec()?  What's the benefit of having the code generator running
inside the Maven process?  Honest question -- I don't know very much about
Maven.
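
For illustration, invoking protoc from Java could look roughly like the sketch below.  It uses ProcessBuilder rather than Runtime.exec() (both are standard JDK ways to spawn a child process).  The class name ProtocInvoker, its methods, and the directory names are hypothetical, and it assumes a protoc binary is available on the PATH.  --proto_path and --java_out are protoc's standard flags.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not taken from any real Maven plugin.
final class ProtocInvoker {

    // Builds the protoc command line for a set of .proto files.
    static List<String> buildCommand(String protoc, Path protoDir, Path javaOutDir,
                                     List<Path> protoFiles) {
        List<String> command = new ArrayList<>();
        command.add(protoc);
        command.add("--proto_path=" + protoDir);
        command.add("--java_out=" + javaOutDir);
        for (Path file : protoFiles) {
            command.add(file.toString());
        }
        return command;
    }

    // Runs the command as a child process and returns its exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward protoc's stdout/stderr to the caller's console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the command; call run(command) to actually execute protoc.
        List<String> command = buildCommand(
                "protoc",
                Paths.get("src/main/proto"),
                Paths.get("target/generated-sources"),
                Arrays.asList(Paths.get("src/main/proto/envelope.proto")));
        System.out.println(String.join(" ", command));
    }
}
```

A plugin written this way depends on a protoc binary being installed on the build machine, which may be part of the motivation for a pure-Java generator.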

On Fri, Sep 18, 2009 at 7:36 PM, [email protected]
<[email protected]> wrote:

>
> Firstly, I want to clarify that I did not write the benchmark that I
> plugged into.  There is no ill intent.  I published the benchmark so
> that folks take the time to look into why my implementation performed
> so much better.  I think it's good to have healthy discussions about
> the pros and cons of alternative implementations which deliver
> different sets of features.
>
> The main reason I started from scratch is that I wanted to implement a
> Java-based code generator so that it would be easy to embed in a Maven
> plugin or Ant task.  Furthermore, it was just more expedient to start
> from a clean slate and design my ideal object model.
> I did ping this list over a year ago to gauge if there would be any
> interest in collaborating, but did not garner interest. So, I did not
> pursue it further:
>
>
> http://groups.google.com/group/protobuf/browse_thread/thread/fe7ea8706b40146f/bdd22ddf89e4a6d3?#bdd22ddf89e4a6d3
>
> Perhaps I'm misreading you, but it seems like there have been very few
> ideas that you are actually interested in from my implementation.  So
> I'm not sure why you're stressing about me rolling this out as a new
> implementation.
>
> Bottom line: I would LOVE IT if the Google implementation achieves
> feature parity with mine.  That way it's one less code base I need to
> maintain!  Best of luck, and if you do change your mind and want to
> poach any of the concepts or code, please feel free to do so.
>
> Regards,
> Hiram
>
> On Sep 18, 9:40 pm, Kenton Varda <[email protected]> wrote:
> > I think the usual way we would have solved this problem at Google would be
> > to have the message "payload" be encoded separately and embedded in the
> > "envelope" as a "bytes" field, e.g.:
> >   message Envelope {
> >     required string to_address = 1;
> >     optional string from_address = 2;
> >     required bytes payload = 3;  // an encoded message
> >   }
> >
> > It's not as transparent as your solution, but it is a whole lot simpler,
> > and the behavior is easy to understand.
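
As a rough sketch of the envelope pattern described above: the router reads only the addressing fields and forwards the payload bytes without ever decoding them.  The Envelope and EnvelopeRouter classes below are hypothetical plain-Java stand-ins for protobuf-generated code, so that the example is self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a generated Envelope message class.
final class Envelope {
    final String toAddress;
    final String fromAddress; // optional in the schema, so it may be null here
    final byte[] payload;     // an already-encoded inner message, opaque to the router

    Envelope(String toAddress, String fromAddress, byte[] payload) {
        this.toAddress = toAddress;
        this.fromAddress = fromAddress;
        this.payload = payload;
    }
}

// Hypothetical server-side router that never looks inside the payload.
final class EnvelopeRouter {
    private final Map<String, List<byte[]>> mailboxes = new HashMap<>();

    // Uses only the routing field; the payload bytes are stored untouched.
    void route(Envelope envelope) {
        mailboxes.computeIfAbsent(envelope.toAddress, key -> new ArrayList<>())
                 .add(envelope.payload);
    }

    // Returns the payloads queued for an address (empty if there are none).
    List<byte[]> delivered(String address) {
        return mailboxes.getOrDefault(address, Collections.emptyList());
    }

    public static void main(String[] args) {
        EnvelopeRouter router = new EnvelopeRouter();
        byte[] payload = "inner message bytes".getBytes(StandardCharsets.UTF_8);
        router.route(new Envelope("bob", "alice", payload));
        System.out.println(router.delivered("bob").size());
    }
}
```

Only the final consumer would decode the payload, so the server pays no parsing or re-encoding cost for it.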
> >
> > That said, again, there's nothing preventing lazy parsing from being added
> > to Google's Java protobuf implementation, so I'm not sure why writing
> > something completely new was necessary.
> >
> > As far as the performance arguments go, I'd again encourage you to create a
> > benchmark that actually measures the performance of the case where the
> > application code ends up accessing all the fields.  If you really think
> > there's no significant overhead, prove it.  :)
> >
> > I'd also suggest that you not publish benchmarks implying that your
> > implementation is an order of magnitude faster at parsing without
> > explaining what is really going on.  It's rather misleading.
> >
> > On Fri, Sep 18, 2009 at 5:53 PM, [email protected]
> > <[email protected]> wrote:
> >
> >
> >
> > > Hi Kenton,
> >
> > > Let me start off by describing my usage scenario.
> >
> > > I'm interested in using protobuf to implement the messaging protocol
> > > between clients and servers of a distributed messaging system.  For
> > > simplicity, let's pretend that the protocol is similar to XMPP and that
> > > there are servers which handle delivering messages to and from clients.
> >
> > > In this case, the server clearly is not interested in the meat of the
> > > messages being sent around.  It is typically only interested in routing
> > > data.  In this case, deferred decoding provides a substantial win.
> > > Furthermore, when the server passes on the message to the consumer, he
> > > does not need to encode the message again.  For important messages,
> > > the server may be configured to persist those messages as they come
> > > in, so the server would once again benefit from not having to encode
> > > the message yet again.
> >
> > > I don't think the user could implement those optimizations on their
> > > own without support from the protobuf implementation.  At least not as
> > > efficiently and elegantly.  You have to realize that the 'free
> > > encoding' holds true even for nested message structures in the
> > > message.  So let's say that the user is aggregating data from multiple
> > > source protobuf messages, picking data out of them and placing it
> > > into a new protobuf message that then gets encoded.  Only the outer
> > > message would need encoding; the inner nested elements which were
> > > picked from the other buffers would benefit from the 'free encoding'.
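
A minimal sketch of the 'free encoding' idea, under stated assumptions: a message keeps the bytes it was parsed from (or the bytes of its first encoding) and hands them back on later encode calls.  CachedMessage is a hypothetical class, and a UTF-8 string stands in for the real protobuf wire format.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical immutable message that remembers its encoded form.
final class CachedMessage {
    private final String body;
    private byte[] wire;     // cached encoded form; null until first needed
    private int encodeCalls; // counts how often real encoding work was done

    private CachedMessage(String body, byte[] wire) {
        this.body = body;
        this.wire = wire;
    }

    // A freshly built message has no encoded form yet.
    static CachedMessage of(String body) {
        return new CachedMessage(body, null);
    }

    // A received message keeps the bytes it arrived as.
    static CachedMessage parse(byte[] received) {
        return new CachedMessage(new String(received, StandardCharsets.UTF_8),
                                 received.clone());
    }

    // Encodes at most once; later calls reuse the cached bytes.
    byte[] toBytes() {
        if (wire == null) {
            encodeCalls++;
            wire = body.getBytes(StandardCharsets.UTF_8);
        }
        return wire.clone(); // defensive copy keeps the cache immutable
    }

    int encodeCalls() {
        return encodeCalls;
    }

    public static void main(String[] args) {
        CachedMessage forwarded = CachedMessage.parse("ping".getBytes(StandardCharsets.UTF_8));
        forwarded.toBytes(); // forwarding: no encoding work
        forwarded.toBytes(); // persisting: still no encoding work
        System.out.println(forwarded.encodeCalls());
    }
}
```

The same caching would apply to a nested message copied into a new outer message: only the outer message's own fields would need fresh encoding.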
> >
> > > The overhead of the lazy decoding is exactly 1 extra "if (bean ==
> > > null)" statement, which is probably cheaper than most virtual dispatch
> > > invocations.  But if you're really trying to milk the performance out
> > > of your app, you should just call buffer.copy() to get the bean
> > > backing the buffer.  All get operations on the bean do NOT have the
> > > overhead.
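
The null check described above might look something like the following sketch.  PersonBuffer and PersonBean are hypothetical names, not the actual generated classes, and a UTF-8 string stands in for real protobuf decoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical decoded form; getters on it carry no lazy-decoding check.
final class PersonBean {
    final String name;

    PersonBean(String name) {
        this.name = name;
    }
}

// Hypothetical immutable buffer that decodes on first access.
final class PersonBuffer {
    private final byte[] wire; // encoded form, never modified
    private PersonBean bean;   // decoded form; null until first needed
    private int decodeCalls;   // counts how often decoding actually ran

    PersonBuffer(byte[] wire) {
        this.wire = wire.clone();
    }

    // The single extra "if (bean == null)" paid by every buffer getter.
    private PersonBean bean() {
        if (bean == null) {
            decodeCalls++;
            bean = new PersonBean(new String(wire, StandardCharsets.UTF_8));
        }
        return bean;
    }

    String getName() {
        return bean().name;
    }

    // Hands out the backing bean so hot loops can skip the null check.
    // Sharing the instance is safe here only because PersonBean is immutable.
    PersonBean copy() {
        return bean();
    }

    int decodeCalls() {
        return decodeCalls;
    }

    public static void main(String[] args) {
        PersonBuffer buffer = new PersonBuffer("Ada".getBytes(StandardCharsets.UTF_8));
        System.out.println(buffer.getName());
        System.out.println(buffer.decodeCalls());
    }
}
```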
> >
> > > Regarding threading, since the buffer is immutable and decoding is
> > > idempotent, you don't really need to worry about thread safety.  Worst
> > > case scenario is that 2 threads decode the same buffer concurrently
> > > and then set the bean field of the buffer.  Since the resulting beans
> > > are equal, in most cases it would not really matter which thread wins
> > > when they overwrite the bean field.
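
One way to read the threading argument above is as the "racy single-check" idiom: no lock, a non-volatile field, and a decode that may occasionally run more than once.  The sketch below uses hypothetical names and assumes the decoded value is an immutable object with final fields (a String here).  Without that assumption, publishing the object through a data race would not be safe under the Java memory model.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical buffer whose lazy decode tolerates a benign race.
final class RacyLazyBuffer {
    private final byte[] wire;
    private String decoded; // deliberately not volatile
    final AtomicInteger decodeCalls = new AtomicInteger();

    RacyLazyBuffer(byte[] wire) {
        this.wire = wire.clone();
    }

    String get() {
        String local = decoded; // read the shared field exactly once
        if (local == null) {
            decodeCalls.incrementAndGet();
            local = new String(wire, StandardCharsets.UTF_8); // idempotent decode
            decoded = local; // last writer wins; all candidates are equal
        }
        return local;
    }

    // Calls get() from several threads and collects what each one saw.
    static List<String> readConcurrently(RacyLazyBuffer buffer, int threadCount) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread thread = new Thread(() -> seen.add(buffer.get()));
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        RacyLazyBuffer buffer = new RacyLazyBuffer("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readConcurrently(buffer, 8));
    }
}
```

The decode count can be anywhere from one up to the number of threads, which is the cost traded for avoiding a mutex on every getter.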
> >
> > > As for up front validation, in my use case, deferring validation is a
> > > feature.  The less work the server has to do the better, since it will
> > > help it scale vertically.  I do agree that in some use cases it would be
> > > desirable to fully validate up front.  I think it should be up to the
> > > application to decide if it wants up front validation or deferred
> > > decoding.  For example, it would be likely that the client of the
> > > messaging protocol would opt for up front validation.   On the other
> > > hand, the server would use deferred decoding.  It's definitely a
> > > performance versus consistency trade-off.
> >
> > > I think that once you make 'free encoding', and deferred decoding an
> > > option, users that have high performance use cases will design their
> > > application so that they can exploit those features as much as
> > > possible.
> >
> > > --
> > > Regards,
> > > Hiram
> >
> > > Blog: http://hiramchirino.com
> >
> > > Open Source SOA
> > > http://fusesource.com/
> >
> > > On Sep 18, 6:43 pm, Kenton Varda <[email protected]> wrote:
> > > > Hmm, your bean and buffer classes sound conceptually equivalent to my
> > > > builder and message classes.
> > > > Regarding lazy parsing, this is certainly something we've considered
> > > > before, but it introduces a lot of problems:
> >
> > > > 1) Every getter method must now first check whether the message is
> > > > parsed, and parse it if not.  Worse, for proper thread safety it really
> > > > needs to lock a mutex while performing this check.  For a fair
> > > > comparison of parsing speed, you really need another benchmark which
> > > > measures the speed of accessing all the fields of the message.  I think
> > > > you'll find that parsing a message *and* accessing all its fields is
> > > > significantly slower with the lazy approach.  Your approach might be
> > > > faster in the case of a very deep message in which the user only wants
> > > > to access a few shallow fields, but I think this case is relatively
> > > > uncommon.
> >
> > > > 2) What happens if the message is invalid?  The user will probably
> > > > expect that calling simple getter methods will not throw parse
> > > > exceptions, and probably isn't in a good position to handle these
> > > > exceptions.  You really want to detect parse errors at parse time, not
> > > > later on down the road.
> >
> > > > We might add lazy parsing to the official implementation at some point.
> > > > However, the approach we'd probably take is to use it only on fields
> > > > which are explicitly marked with a "[lazy=true]" option.  Developers
> > > > would use this to indicate fields for which the performance trade-offs
> > > > favor lazy parsing, and they are willing to deal with delayed
> > > > error-checking.
> >
> > > > In your blog post you also mention that encoding the same message
> > > > object multiple times without modifying it in between, or parsing a
> > > > message and then serializing it without modification, is "free"...  but
> > > > how often does this happen in practice?  These seem like unlikely cases,
> > > > and easy for the user to optimize on their own without support from the
> > > > protobuf implementation.
> >
> > > > On Fri, Sep 18, 2009 at 3:15 PM, [email protected]
> > > > <[email protected]> wrote:
> >
> > > > > Hi Kenton,
> >
> > > > > You're right, the reason that one benchmark has those results is
> > > > > because the implementation does lazy decoding.  While lazy decoding
> > > > > is nice, I think that implementation has a couple of other features
> > > > > which are equally nice.  See more details about them here:
> >
> > > > > http://hiramchirino.com/blog/2009/09/activemq-protobuf-implemtation-f...
> >
> > > > > It would have been hard, if not impossible, to implement some of the
> > > > > stuff without the completely different class structure it uses.  I'd
> > > > > be happy if its features could be absorbed into the official
> > > > > implementation.  I'm just not sure how you could do that and maintain
> > > > > compatibility with your existing users.
> >
> > > > > If you have any suggestions of how we can integrate better please
> > > > > advise.
> >
> > > > > Regards,
> > > > > Hiram
> >
> > > > > On Sep 18, 12:34 pm, Kenton Varda <[email protected]> wrote:
> > > > > > So, his implementation is a little bit faster in two of the
> > > > > > benchmarks, and impossibly faster in the other one.  I don't really
> > > > > > believe that it's possible to improve parsing time by as much as he
> > > > > > claims, except by doing something like lazy parsing, which would
> > > > > > just be deferring the work to later on.  Would have been nice if
> > > > > > he'd contributed his optimizations back to the official
> > > > > > implementation rather than write a whole new one...
> >
> > > > > > On Fri, Sep 18, 2009 at 1:38 AM, ijuma <[email protected]> wrote:
> >
> > > > > > > Hey all,
> >
> > > > > > > I ran across the following and thought it may be of interest to
> > > > > > > this list:
> >
> > > > > > > http://hiramchirino.com/blog/2009/09/activemq-protobuf-implementation...
> >
> > > > > > > Best,
> > > > > > > Ismael
> >
>
