[ 
https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623623#action_12623623
 ] 

Shalin Shekhar Mangar commented on THRIFT-110:
----------------------------------------------

We evaluated protocol buffers as a replacement for the custom binary 
serialization format in Solr. However, we stopped short because we already had 
a highly optimized client for Java (even more optimized than protobuf, e.g. 
with extern strings and streaming support). The C++ client was not very 
relevant because few Solr users consume it from C++. Python was the only other 
language protobuf supported that was interesting to us. Thrift, on the other 
hand, has the advantage of supporting multiple languages that are very relevant 
to Solr, and besides, it's in incubation at Apache, so we are kind of biased 
towards it :)

Looking at this discussion, I can gather a few points:
# DenseProtocol can be enhanced with these suggestions, provided people are 
willing to do it. Almost all suggestions seem to have been shot down. That 
cannot work if Thrift wants to gain acceptance as a de facto open source binary 
protocol.
# We don't have to support it in all languages right from the start. We can 
start with C++ and Java, release it, and then keep adding more languages to 
the fray (release early, release often). Let the early adopters use this new 
format and give us feedback. In these early stages, we don't even need to worry 
about backward compatibility until we get to 1.0.

From Solr's side I can say that we have the flexibility of not using certain 
features if they are impossible to support by Thrift across all languages. The 
format would anyway be faster than the XML which we were using previously. It 
would be great if we can provide the flexibility of an efficient binary 
protocol across multiple languages using Thrift rather than a custom format.

I suggest we start with:
# A list of features that would be good to have in the format
# Which of them can easily be accommodated across all languages? Not everything 
will be equally efficient, which is OK.

Btw is there an issue open on DenseProtocol?

> A more compact format 
> ----------------------
>
>                 Key: THRIFT-110
>                 URL: https://issues.apache.org/jira/browse/THRIFT-110
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Noble Paul
>
> Thrift is not very compact in writing out data (compared with, say, 
> protobuf). It does not have the concept of variable-length integers or 
> various other possible optimizations. In Solr we use a lot of such 
> optimizations to make a very compact payload. Thrift has a lot in common with 
> that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very 
> fast writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it 
> as compact as the current Java version.
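
For readers unfamiliar with two of the optimizations named in the issue 
description, here is a minimal sketch of variable-length integers and of 
writing a type and a small value in the same byte. The class name, the method 
names and the 3-bit/5-bit split are assumptions made for illustration only. 
This is not the NamedListCodec code and not a proposed Thrift wire format.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; names and bit layout are illustrative assumptions.
public class CompactSketch {

    // Writes an int 7 bits at a time, low bits first. The high bit of each
    // byte is set when more bytes follow, so small values take one byte.
    static void writeVarInt(int value, ByteArrayOutputStream out) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads back an int written by writeVarInt, starting at offset.
    static int readVarInt(byte[] buf, int offset) {
        int result = 0;
        int shift = 0;
        int i = offset;
        while (true) {
            int b = buf[i++] & 0xFF;
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Packs a type code (0..7) and a small value (0..31) into one byte:
    // upper 3 bits hold the type, lower 5 bits hold the value.
    static int packTag(int type, int smallValue) {
        return ((type & 0x07) << 5) | (smallValue & 0x1F);
    }

    static int tagType(int tag) {
        return (tag >>> 5) & 0x07;
    }

    static int tagValue(int tag) {
        return tag & 0x1F;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarInt(300, out);
        byte[] bytes = out.toByteArray();
        System.out.println("300 encoded in " + bytes.length + " bytes");
        System.out.println("decoded: " + readVarInt(bytes, 0));

        int tag = packTag(2, 17);
        System.out.println("type=" + tagType(tag) + " value=" + tagValue(tag));
    }
}
```

With this scheme a fixed 4-byte int such as 300 shrinks to 2 bytes, and a 
short string length or small integer can share a byte with its type marker 
instead of taking a separate field.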
