[
https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624544#action_12624544
]
Noble Paul commented on THRIFT-110:
-----------------------------------
Another point of compression would be in the way we write structs and fields:
{code}
public void writeFieldBegin(TField field) throws TException {
  writeByte(field.type); // 1 byte for the field type
  writeI16(field.id);    // 2 bytes for the field id
}
{code}
This is too much overhead: 3 bytes for every field.
The best solution would be to write a bitset for the fields included in this
struct. The bitset marks fields that are present as '1' and fields that are
absent as '0'. The type information is available at both ends of the pipe, so
writing it is redundant. If the type or name of a field is changed in a newer
version, the struct must assign it a different index. The bitset must be
serialized like a variable-length integer, so the overhead is 1 byte per seven
fields.
The advantages are that
* we do not have a per-field overhead of 3 bytes
* we can easily add or remove fields in different versions of the objects, and it
will be backward compatible too
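To make the idea concrete, here is a rough sketch of how the presence bitset could be written and read as a variable-length integer. This is not existing Thrift code: the class FieldBitset and the methods writePresenceBits and readPresenceBits are hypothetical names, and the bit order (lowest field index in the lowest bit, high bit of each byte used as a "more bytes follow" flag) is just one possible choice.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of Thrift: sketches the proposed bitset header.
final class FieldBitset {

    // Encodes one presence flag per field index, 7 flags per byte.
    // The high bit (0x80) of a byte is set when another byte follows.
    static byte[] writePresenceBits(boolean[] present) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        do {
            int b = 0;
            for (int bit = 0; bit < 7 && i < present.length; bit++, i++) {
                if (present[i]) {
                    b |= 1 << bit;
                }
            }
            if (i < present.length) {
                b |= 0x80; // continuation flag
            }
            out.write(b);
        } while (i < present.length);
        return out.toByteArray();
    }

    // Decodes the flags for a reader that knows about fieldCount fields.
    // Indexes beyond what the writer sent stay false (absent); extra bits
    // from a newer writer are ignored.
    static boolean[] readPresenceBits(byte[] in, int fieldCount) {
        boolean[] present = new boolean[fieldCount];
        int i = 0;
        for (int pos = 0; pos < in.length; pos++) {
            int b = in[pos] & 0xFF;
            for (int bit = 0; bit < 7 && i < fieldCount; bit++, i++) {
                present[i] = (b & (1 << bit)) != 0;
            }
            if ((b & 0x80) == 0) {
                break; // last byte of the bitset
            }
        }
        return present;
    }
}
```

With this layout a struct with 8 declared fields costs 2 header bytes in total, compared with 3 bytes for each field that is written today.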
I'll add this to the wiki and open a separate issue as well.
> A more compact format
> ----------------------
>
> Key: THRIFT-110
> URL: https://issues.apache.org/jira/browse/THRIFT-110
> Project: Thrift
> Issue Type: Improvement
> Reporter: Noble Paul
>
> Thrift is not very compact in writing out data compared to (say) protobuf. It
> does not have the concept of variable-length integers or various other
> possible optimizations. In Solr we use a lot of such optimizations to make a
> very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very fast
> writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it
> as compact as the current Java version.