[ https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Duxbury updated THRIFT-110:
---------------------------------
Attachment: compact-proto-spec-2.txt
Here's a new spec for the compact protocol that uses the 4 MSB of the type
header to encode field id deltas instead of extra value information. It's a LOT
simpler, and I think it gets nearly the same compaction, but I haven't coded it
up yet, so I'm not sure.
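To make the header layout concrete, here is a minimal Python sketch of how a field header might be written under this scheme. It is an illustration only, not the contents of compact-proto-spec-2.txt: the type id value (TYPE_I32 = 5), the function names, and the fallback for deltas that do not fit in 4 bits (a zero delta nibble followed by the full field id as a zigzag varint) are all assumptions made for the example.

```python
# Hypothetical type id, chosen only for this example; it must fit in 4 bits.
TYPE_I32 = 5


def zigzag(n: int) -> int:
    """Interleave signed values so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return n * 2 if n >= 0 else -n * 2 - 1


def varint(u: int) -> bytes:
    """Encode a non-negative integer 7 bits at a time, low bits first."""
    out = bytearray()
    while True:
        low7 = u & 0x7F
        u >>= 7
        if u:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def field_header(field_id: int, last_field_id: int, type_id: int) -> bytes:
    """Pack the field id delta into the 4 MSB and the type id into the 4 LSB.

    Assumed fallback: if the delta is not in 1..15, write a zero delta nibble
    and follow the header byte with the full field id as a zigzag varint.
    """
    delta = field_id - last_field_id
    if 1 <= delta <= 15:
        return bytes([(delta << 4) | type_id])
    return bytes([type_id]) + varint(zigzag(field_id))
```

With these assumptions, field_header(1, 0, TYPE_I32) is the single byte 0x15, while a jump from field 3 to field 20 falls back to two bytes. Structs whose field ids are dense and written in order would pay one header byte per field.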
One notable downside of this approach is that all ints (other than collection
sizes) become zigzag varints. This avoids the worst-case size explosions for
negative numbers, but it also makes some positive numbers take up more space,
since zigzag spends one bit on the sign. I think it's a decent tradeoff, though.
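The tradeoff can be checked with a small Python sketch that compares encoded sizes. The function names are hypothetical, and the "plain" variant assumes a negative number would otherwise be written as its 32-bit two's complement bit pattern, which is one plausible baseline rather than something the spec states.

```python
def zigzag(n: int) -> int:
    """Interleave signed values so small magnitudes stay small: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return n * 2 if n >= 0 else -n * 2 - 1


def varint(u: int) -> bytes:
    """Encode a non-negative integer 7 bits at a time, low bits first."""
    out = bytearray()
    while True:
        low7 = u & 0x7F
        u >>= 7
        if u:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def zigzag_varint(n: int) -> bytes:
    """Signed int -> zigzag -> varint."""
    return varint(zigzag(n))


def plain_varint32(n: int) -> bytes:
    """Assumed baseline: varint of the 32-bit two's complement bit pattern."""
    return varint(n & 0xFFFFFFFF)


if __name__ == "__main__":
    for value in (-1, 63, 64, -64):
        print(value, len(zigzag_varint(value)), len(plain_varint32(value)))
```

Under these assumptions, -1 takes 1 byte as a zigzag varint versus 5 bytes as a plain 32-bit varint, while 64 takes 2 bytes zigzagged versus 1 byte plain. Positive values just past each 7-bit boundary are the ones that pay the extra byte.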
> A more compact format
> ----------------------
>
> Key: THRIFT-110
> URL: https://issues.apache.org/jira/browse/THRIFT-110
> Project: Thrift
> Issue Type: Improvement
> Reporter: Noble Paul
> Attachments: compact-proto-spec-2.txt, compact_proto_spec.txt,
> compact_proto_spec.txt, thrift-110-v2.patch, thrift-110-v3.patch,
> thrift-110-v4.patch, thrift-110-v5.patch, thrift-110.patch
>
>
> Thrift is not as compact in writing out data as (say) protobuf. It does not
> have the concept of variable-length integers or various other possible
> optimizations. In Solr we use a lot of such optimizations to make a very
> compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very fast
> writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it
> as compact as the current Java version.