Re: ActiveMQ implementation of protobuf

hi...@hiramchirino.com Sat, 19 Sep 2009 07:11:17 -0700

I've been founding and contributing to open source projects for over
nine years now, so I understand your situation.


Here are my suggestions for encouraging users to contribute:

1) Folks have language preferences, so ideally the code generator for
a language should be written in the language of the implementation.
Why?  Because then you have a better chance that a user will turn into
a contributor since they will be able to grok and be comfortable with
all the parts of the implementation, including the code generator.
2) Some enhancements require more drastic changes than others.  You
should provide an avenue where folks can research and explore the
bigger, more drastic changes within your project.
3) Be more open to contributor feedback.  Even if an idea seems wacky
at first, encourage the contribution and have it at least go into an
experimental branch.

Regarding the maven question, let me first explain the build
challenges that most of the projects I participate in experience.  I
spend most of my time working on ActiveMQ, Camel, and ServiceMix.  The
common thread with these projects is that they are integration
technologies.  And since they are integration technologies, their goal
is to integrate and leverage the strengths of as many technologies as
possible.

The build challenge this presents is that the laundry list of
dependencies that are needed to compile each project is
mind-boggling.  Manually installing all the dependencies is a waste of
time.  Maven automates dependency downloading and this even includes
downloading the maven plugins that are used to compile a maven build.

The net result is that users of maven builds hardly ever have to worry
about having the right prerequisites installed before kicking off the
build.  Having to exec out to protoc would break that concept.
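
As a rough sketch of what exec-ing out to protoc from a Java build step
involves: the class and method names below are made up for illustration
and are not taken from any real maven plugin; only the protoc flags
--proto_path and --java_out are real.  The point it shows is that the
build fails at start() unless protoc was installed by hand beforehand,
which is exactly the prerequisite a maven user does not expect.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper, for illustration only.
class ProtocExecSketch {

    /** Builds the protoc command line for generating Java sources from one .proto file. */
    static List<String> buildCommand(String protocExecutable, String protoDir,
                                     String javaOutDir, String protoFile) {
        return List.of(
            protocExecutable,
            "--proto_path=" + protoDir,   // where imports are resolved from
            "--java_out=" + javaOutDir,   // where the generated Java sources go
            protoFile);
    }

    /** Runs the external generator and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
            .inheritIO()
            .start(); // throws IOException if the executable cannot be found or started
        return process.waitFor();
    }
}
```

A code generator written in Java avoids that step, since maven can
download it like any other plugin dependency.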

--
Regards,
Hiram

Blog: http://hiramchirino.com

Open Source SOA
http://fusesource.com/

On Sep 19, 2:16 am, Kenton Varda <ken...@google.com> wrote:
> Somehow I missed that message.  Sorry about that.
> I'd definitely like to have lazy parsing (as an option) in the official
> implementation.  The reason I'm "stressing" is because there's a lot of
> these things that I'd like protocol buffers to have, but I don't have enough
> time to write them all myself, so I need help from contributors.
>  Unfortunately it seems that a lot of people would rather write their own
> implementations from scratch than try to contribute to the main one -- you
> aren't the first person who has done this.  That said, having competition is
> a good thing too.
>
> Regarding maven plugins -- why can't the plugin just invoke protoc using
> Runtime.exec()?  What's the benefit of having the code generator running
> inside the Maven process?  Honest question -- I don't know very much about
> Maven.
>
> On Fri, Sep 18, 2009 at 7:36 PM, hi...@hiramchirino.com
> <chir...@gmail.com> wrote:
>
>
>
> > Firstly, I want to clarify that I did not write the benchmark that I
> > plugged into.  There is no ill intent.  I published the benchmark so
> > that folks take the time to look into why my implementation performed
> > so much better.  I think it's good to have healthy discussions about
> > the pros and cons of alternative implementations which deliver
> > different sets of features.
>
> > The main reason I started from scratch is that I wanted to implement a
> > java based code generator so that it would be easy to embed in a maven
> > plugin or ant task.  Furthermore, it was just more expedient to start
> > from a clean slate and design my ideal object model.
> > I did ping this list over a year ago to gauge if there would be any
> > interest in collaborating, but did not garner interest. So, I did not
> > pursue it further:
>
> > http://groups.google.com/group/protobuf/browse_thread/thread/fe7ea870...
>
> > Perhaps I'm misreading you, but it seems like there have been very few
> > ideas that you are actually interested in from my implementation.  So
> > I'm not sure why you're stressing about me rolling this out as new
> > implementation.
>
> > Bottom line is, I would LOVE IT if the Google implementation achieves
> > feature parity with mine.  That way it's one less code base I need to
> > maintain!  Best of luck and if you do change your mind and want to
> > poach any of the concepts or code, please feel free to do so.
>
> > Regards,
> > Hiram
>
> > On Sep 18, 9:40 pm, Kenton Varda <ken...@google.com> wrote:
> > > I think the usual way we would have solved this problem at Google would be
> > > to have the message "payload" be encoded separately and embedded in the
> > > "envelope" as a "bytes" field, e.g.:
> > >   message Envelope {
> > >     required string to_address = 1;
> > >     optional string from_address = 2;
> > >     required bytes payload = 3;  // an encoded message
> > >   }
>
> > > It's not as transparent as your solution, but it is a whole lot simpler,
> > > and the behavior is easy to understand.
>
> > > That said, again, there's nothing preventing lazy parsing from being added
> > > to Google's Java protobuf implementation, so I'm not sure why writing
> > > something completely new was necessary.
>
> > > As far as the performance arguments go, I'd again encourage you to create a
> > > benchmark that actually measures the performance of the case where the
> > > application code ends up accessing all the fields.  If you really think
> > > there's no significant overhead, prove it.  :)
>
> > > I'd also suggest that you not publish benchmarks implying that your
> > > implementation is an order of magnitude faster at parsing without explaining
> > > what is really going on.  It's rather misleading.
>
> > > On Fri, Sep 18, 2009 at 5:53 PM, hi...@hiramchirino.com
> > > <chir...@gmail.com> wrote:
>
> > > > Hi Kenton,
>
> > > > Let me start off by describing my usage scenario.
>
> > > > I'm interested in using protobuf to implement the messaging protocol
> > > > between clients and servers of a distributed messaging system.  For
> > > > simplicity, let's pretend that the protocol is similar to XMPP and that
> > > > there are servers which handle delivering messages to and from clients.
>
> > > > In this case, the server clearly is not interested in the meat of the
> > > > messages being sent around.  It is typically only interested in routing
> > > > data.  In this case, deferred decoding provides a substantial win.
> > > > Furthermore, when the server passes on the message to the consumer, it
> > > > does not need to encode the message again.  For important messages,
> > > > the server may be configured to persist those messages as they come
> > > > in, so the server would once again benefit from not having to encode
> > > > the message yet again.
>
> > > > I don't think the user could implement those optimizations on their
> > > > own without support from the protobuf implementation.  At least not as
> > > > efficiently and elegantly.  You have to realize that the 'free
> > > > encoding' holds true for even nested message structures in the
> > > > message.  So let's say that the user is aggregating data from multiple
> > > > source protobuf messages and is picking data out of them and placing it
> > > > into a new protobuf message that then gets encoded.  Only the outer
> > > > message would need encoding; the inner nested elements which were
> > > > picked from the other buffers would benefit from the 'free encoding'.
>
> > > > The overhead of the lazy decoding is exactly 1 extra "if (bean ==
> > > > null)" statement, which is probably cheaper than most virtual dispatch
> > > > invocations.  But if you're really trying to milk the performance out
> > > > of your app, you should just call buffer.copy() to get the bean
> > > > backing the buffer.  All get operations on the bean do NOT have the
> > > > overhead.
>
> > > > Regarding threading, since the buffer is immutable and decoding is
> > > > idempotent, you don't really need to worry about thread safety.  Worst
> > > > case scenario is that 2 threads decode the same buffer concurrently
> > > > and then set the bean field of the buffer.  Since the resulting beans
> > > > are equal, in most cases it would not really matter which thread wins
> > > > when they overwrite the bean field.
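
The lazy decoding scheme described in the quoted paragraphs above can be
sketched roughly as follows.  This is not the ActiveMQ implementation:
Buffer, Bean, getText and decodeCount are hypothetical names, and the
"decoding" is just a UTF-8 string conversion standing in for real
protobuf parsing.  It illustrates the single null check on access, the
reuse of the original bytes when re-encoding, and the unsynchronized
caching of the decoded bean.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical types, for illustration only.
class LazyBufferSketch {

    /** Decoded form.  All fields are final, so publishing it through a data race is safe. */
    static final class Bean {
        final String text;
        Bean(String text) { this.text = text; }
    }

    /** Immutable buffer that keeps the encoded bytes and decodes them on first access. */
    static final class Buffer {
        final AtomicInteger decodeCount = new AtomicInteger(); // instrumentation for the sketch
        private final byte[] encoded;
        private Bean bean; // deliberately not volatile and not locked

        Buffer(byte[] encoded) { this.encoded = encoded.clone(); }

        /** Getter pays for one null check before delegating to the bean. */
        String getText() { return bean().text; }

        /** "Free encoding": the original bytes are handed back without re-serializing. */
        byte[] toByteArray() { return encoded.clone(); }

        private Bean bean() {
            Bean b = bean;        // read the field once into a local
            if (b == null) {      // the one extra check described above
                decodeCount.incrementAndGet();
                b = new Bean(new String(encoded, StandardCharsets.UTF_8));
                bean = b;         // two threads may both get here; the results are equal
            }
            return b;
        }
    }
}
```

In Java the unlocked race is only harmless under the assumptions made
here: the bean has nothing but final fields, decoding is deterministic,
and the field is read once into a local before being tested and
returned.  A mutable bean would need a volatile field or a lock.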
>
> > > > As for up front validation, in my use case, deferring validation is a
> > > > feature.  The less work the server has to do the better, since it will
> > > > help it scale vertically.  I do agree that in some use cases it would be
> > > > desirable to fully validate up front.  I think it should be up to the
> > > > application to decide if it wants up front validation or deferred
> > > > decoding.  For example, it would be likely that the client of the
> > > > messaging protocol would opt for up front validation.   On the other
> > > > hand, the server would use deferred decoding.  It's definitely a
> > > > performance versus consistency trade-off.
>
> > > > I think that once you make 'free encoding', and deferred decoding an
> > > > option, users that have high performance use cases will design their
> > > > application so that they can exploit those features as much as
> > > > possible.
>
> > > > --
> > > > Regards,
> > > > Hiram
>
> > > > Blog: http://hiramchirino.com
>
> > > > Open Source SOA
> > > > http://fusesource.com/
>
> > > > On Sep 18, 6:43 pm, Kenton Varda <ken...@google.com> wrote:
> > > > > Hmm, your bean and buffer classes sound conceptually equivalent to my
> > > > > builder and message classes.
> > > > > Regarding lazy parsing, this is certainly something we've considered
> > > > > before, but it introduces a lot of problems:
>
> > > > > 1) Every getter method must now first check whether the message is
> > > > > parsed, and parse it if not.  Worse, for proper thread safety it really
> > > > > needs to lock a mutex while performing this check.  For a fair
> > > > > comparison of parsing speed, you really need another benchmark which
> > > > > measures the speed of accessing all the fields of the message.  I think
> > > > > you'll find that parsing a message *and* accessing all its fields is
> > > > > significantly slower with the lazy approach.  Your approach might be
> > > > > faster in the case of a very deep message in which the user only wants
> > > > > to access a few shallow fields, but I think this case is relatively
> > > > > uncommon.
>
> > > > > 2) What happens if the message is invalid?  The user will probably
> > > > > expect that calling simple getter methods will not throw parse
> > > > > exceptions, and probably isn't in a good position to handle these
> > > > > exceptions.  You really want to detect parse errors at parse time, not
> > > > > later on down the road.
>
> > > > > We might add lazy parsing to the official implementation at some
> > > > > point.  However, the approach we'd probably take is to use it only on
> > > > > fields which are explicitly marked with a "[lazy=true]" option.
> > > > > Developers would use this to indicate fields for which the performance
> > > > > trade-offs favor lazy parsing, and they are willing to deal with
> > > > > delayed error-checking.
>
> > > > > In your blog post you also mention that encoding the same message
> > > > > object multiple times without modifying it in between, or parsing a
> > > > > message and then serializing it without modification, is "free"...  but
> > > > > how often does this happen in practice?  These seem like unlikely
> > > > > cases, and easy for the user to optimize on their own without support
> > > > > from the protobuf implementation.
>
> > > > > On Fri, Sep 18, 2009 at 3:15 PM, hi...@hiramchirino.com
> > > > > <chir...@gmail.com> wrote:
>
> > > > > > Hi Kenton,
>
> > > > > > You're right, the reason that one benchmark has those results is
> > > > > > because the implementation does lazy decoding.  While lazy decoding is
> > > > > > nice, I think that implementation has a couple of
>
> ...
