[
https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665691#action_12665691
]
Bryan Duxbury commented on THRIFT-110:
--------------------------------------
I've been thinking some more about the idea of using the extra bits in the type
header for field id deltas instead of as type modifier space, and it seems like
a good idea.
There are 4 bits in the type header that are free to modify, which means you can
represent 0-15. When you have dense structs, or even marginally sparse structs,
the delta encoding will save you 2 bytes per field, plus 1-2 bytes as
overhead for the first field. When you have very sparse structs (i.e. set fields
are > 15 field ids apart at all times), you will have at least one extra byte per
field for the id, and quickly end up with 2, since you're skipping so many ids
you must have hundreds of fields.
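To make the byte counts concrete, here is a minimal sketch of such a field header. It is not taken from any of the attached patches. It assumes the delta sits in the high 4 bits and the type code in the low 4 bits, that a delta of 0 is reserved to mean "full field id follows", and that the full id is written as a plain unsigned varint. The function names and the type code 8 are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Unsigned varint: 7 data bits per byte, high bit set when more bytes follow."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative ids")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes to come
        else:
            out.append(low7)         # last byte
            return bytes(out)


def encode_field_header(type_id: int, field_id: int, last_field_id: int) -> bytes:
    """Pack (delta, type) into one byte when the delta fits in 4 bits.

    Assumption: a high nibble of 0 is an escape meaning the full field id
    follows as a varint.
    """
    if not 0 <= type_id <= 0x0F:
        raise ValueError("type code must fit in 4 bits")
    delta = field_id - last_field_id
    if 1 <= delta <= 15:
        return bytes([(delta << 4) | type_id])
    return bytes([type_id]) + encode_varint(field_id)


def header_sizes(field_ids, type_id=8):
    """Header size in bytes for each field of one struct, ids in ascending order."""
    last = 0
    sizes = []
    for fid in field_ids:
        sizes.append(len(encode_field_header(type_id, fid, last)))
        last = fid
    return sizes
```

Under these assumptions, `header_sizes([1, 2, 3, 5])` gives `[1, 1, 1, 1]` for a dense struct, against a 3-byte header (1 type byte plus a 2-byte id) per field, which is what the "2 bytes per field" figure implies. A very sparse struct such as `header_sizes([100, 200, 300])` gives `[2, 3, 3]`: one extra id byte at first, then two once the ids pass 127.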
As a positive side effect of this approach, the type modifier wouldn't be used
to help with encoding the value anymore, which would probably significantly
reduce the complexity of the protocol. On the downside, we'll have to maintain
a stack of the last field id for the current struct so that we can descend into
nested structures, adding some complexity. Overall, I think it seems like it'll
be a win.
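The stack bookkeeping mentioned above could look roughly like the following sketch. The class and method names are hypothetical, and it assumes each struct starts counting from a last field id of 0.

```python
class FieldIdTracker:
    """Tracks the last field id written, per nesting level (illustrative only)."""

    def __init__(self):
        self._stack = []  # saved last-field-ids of the enclosing structs
        self._last = 0    # last field id seen in the current struct

    def struct_begin(self):
        # Entering a nested struct: remember where the outer struct was
        # and start the inner struct's deltas from 0.
        self._stack.append(self._last)
        self._last = 0

    def struct_end(self):
        # Leaving the struct: restore the outer struct's last field id.
        self._last = self._stack.pop()

    def next_delta(self, field_id):
        # Delta relative to the previous field in the same struct.
        delta = field_id - self._last
        self._last = field_id
        return delta
```

For example, after writing fields 1 and 4 of an outer struct, descending into a nested struct and writing its field 2 yields a delta of 2 rather than -2. Once `struct_end()` is called, the outer struct's field 5 yields a delta of 1, because the outer last field id of 4 was restored from the stack.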
> A more compact format
> ----------------------
>
> Key: THRIFT-110
> URL: https://issues.apache.org/jira/browse/THRIFT-110
> Project: Thrift
> Issue Type: Improvement
> Reporter: Noble Paul
> Attachments: compact_proto_spec.txt, compact_proto_spec.txt,
> thrift-110-v2.patch, thrift-110-v3.patch, thrift-110-v4.patch,
> thrift-110-v5.patch, thrift-110.patch
>
>
> Thrift is not very compact in writing out data compared to (say) protobuf. It
> does not have the concept of variable-length integers and various other
> possible optimizations. In Solr we use a lot of such optimizations to make a
> very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very fast
> writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it
> as compact as the current Java version.