Re: Routing with Google Protocol Buffers

John O'Hara Sun, 22 Feb 2009 13:21:21 -0800

"The exchange then opens the protocol spec file and determines the key
number of the field we want."
So the broker needs to have access to the application message definitions at
runtime?


That was at the heart of the question I asked...

Cheers
John

2009/2/22 Joshua Kramer <[email protected]>

> I agree that it's preferable to have one exchange handle multiple message
> types, as it reduces code maintenance.  Here are a few relevant questions
> that I thought of.
>
> Does XQilla require the entire XML document, or can its document
> projection feature tell you how much of the PB message you have to XML-ify
> before it can get a valid match?
>
> How much value would a speed increase provide on structured data?  Does
> this discussion have practical value, or is it just a science experiment?
>
> If I were to implement a PB exchange, I would do so in the following
> manner, after having read the documentation on message formats (
> http://code.google.com/apis/protocolbuffers/docs/encoding.html).  I don't
> think I would implement a new query language; that doesn't make sense for
> the reasons you outlined.  I don't think it would be difficult to implement
> an XPath mechanism, if it contained a subset of XPath.
>
> A. On exchange subscription, the client gives the exchange an XPath query.
>  The exchange then opens the protocol spec file and determines the key
> number of the field we want.  If the requested element is at the top level,
> the rule is simple: field n equals value y.  If the requested element is one
> or more levels deep, we build a rule chain: first, get field n; then, get
> field o; then, get field p.  (Field p is in the object that is field o,
> which in turn is in the object that is field n).
>
> B. When we get a message:
>      1. If the MSB = 0 AND we are not working on something, skip to next
> byte.
>      2. If the MSB = 1:
>            i. right-shift 1 step.  Does the field number equal the one
> we're referencing?  If yes:
>                  a. Get the data and move forward the number of bytes
> specified by the type and length.
>                  b. Test the data to see if it matches the right of the =
> in the XPath.  If yes, route the message as specified by the subscription
> and return.
>                  c. If the XPath rule is a chain, and the data matches this
> link, then repeat from step 2 on this particular object using the next link
> in the rulechain.
>                  d. Goto 1.
>            ii. If no:
>                  a. Skip the number of bytes specified in the type and
> length.
>                  b. Goto 1.
>
> It seems that this would use fewer CPU cycles than XML-ifying the entire
> message, even before counting the cost of running the resulting XML through
> XQilla.
>
> Having said that - it may be necessary to use the full set of XPath
> functionality, and in that case we'd have to XMLify the message.
>
> Thoughts?
>
> Cheers,
> -Josh
>
>
> Jonathan Robie wrote:
>
>> Joshua Kramer wrote:
>>
>>> Jonathan Robie wrote:
>>>
>>>> There is a reflection API for protocol buffers that would allow you to
>>>> easily create an XML representation:
>>>>
>>> Good thoughts, Jonathan.  I hadn't considered doing it that way before.
>>>  Here's a question, though... how many CPU cycles would your method take,
>>> compared to modifying XQilla (or creating our own query mechanism) to
>>> route the messages directly in the wire format in which they enter the
>>> broker?  One of the primary benefits of using PB with Qpid is the speed with
>>> which structured data may be processed.
>>>
>> I rather suspect that the difference in processing time would be much
>> smaller than the overhead of reading the message, but this is something best
>> found by trying it and measuring it, then optimizing. If we can get good
>> enough performance, I see a real advantage to using one exchange type for
>> XML, Protocol Buffers, and JSON, and using the same language to specify
>> criteria for all three.
>>
>> If we create our own query mechanism, we wind up creating our own query
>> language. I've done this a few times in different settings, and it takes
>> work to get it right. And it would be a language used by a very small
>> community. If we use a standard structured query language, XQuery seems to
>> be the main contender.
>>
>> XQilla can query many kinds of input - a Xerces DOM tree, an istream
>> (which requires serialized XML),  a SAX Stream, among others. It probably
>> optimizes best for an istream, because it does "document projection", which
>> means that it does not parse the entire document if the query clearly
>> requires only part of the document.  This is of most benefit when the
>> message content is large.
>>
>> Jonathan
>>
>> ---------------------------------------------------------------------
>> Apache Qpid - AMQP Messaging Implementation
>> Project:      http://qpid.apache.org
>> Use/Interact: mailto:[email protected]
>>
>>
>

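For reference, steps A and B quoted above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the method anyone in
the thread settled on. The SCHEMA dictionary stands in for the parsed protocol
spec file, and the names compile_path, fields and matches are all hypothetical.
The sketch follows the published encoding document, in which each field key is
a varint holding (field number << 3) | wire type, so the field number is
recovered with a 3-bit shift; the MSB of a byte only marks a varint
continuation. Values are compared as raw bytes, and the deprecated group wire
types are not handled.

```python
from typing import Iterator, List, Tuple

# Hypothetical stand-in for the parsed spec file:
# message name -> {field name: (field number, nested message name or None)}
SCHEMA = {
    "Order": {"id": (1, None), "customer": (2, "Customer")},
    "Customer": {"name": (1, None), "region": (2, None)},
}

def compile_path(root: str, path: str) -> List[int]:
    """Step A: turn a path such as '/customer/region' into a chain of field numbers."""
    chain, msg = [], root
    for name in path.strip("/").split("/"):
        if msg is None:
            raise ValueError(f"{name!r} is below a scalar field")
        number, msg = SCHEMA[msg][name]
        chain.append(number)
    return chain

def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, position after it)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # MSB clear: last byte of this varint
            return result, pos
        shift += 7

def fields(buf: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (field number, wire type, raw payload) for each field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint
            start = pos
            _, pos = read_varint(buf, pos)
            payload = buf[start:pos]
        elif wire_type == 1:  # fixed 64-bit
            payload, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            payload, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            payload, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield number, wire_type, payload

def matches(buf: bytes, chain: List[int], expected: bytes) -> bool:
    """Step B: follow the rule chain and compare the raw payload at the last link."""
    head, rest = chain[0], chain[1:]
    for number, wire_type, payload in fields(buf):
        if number != head:
            continue  # fields() has already skipped over this payload
        if not rest:
            if payload == expected:
                return True
        elif wire_type == 2 and matches(payload, rest, expected):
            return True
    return False

# Hand-encoded example: Order{id: 150, customer: Customer{name: "Ann", region: "EU"}}
customer = b"\x0a\x03Ann\x12\x02EU"
order = b"\x08\x96\x01\x12" + bytes([len(customer)]) + customer
print(matches(order, compile_path("Order", "/customer/region"), b"EU"))
```

Fields that do not match are skipped without being decoded, which is where the
expected saving over XML-ifying the whole message would come from. Whether the
saving matters in practice would still need to be measured.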