[
https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670813#action_12670813
]
Bryan Duxbury commented on THRIFT-110:
--------------------------------------
bq. First: allow casting from bool to int, so that you can send the integer
values 0 and 1 as boolean-false and boolean-true respectively.
That would be cool, but it's one of those changes that would require all of
Thrift to change, too. I'm not usually one to avoid sweeping changes when I
think there's a benefit, but right now I'm not in favor of the whole "change
the Thrift interface" idea. We're talking about every library, protocol, and
code generator changing to match a different protocol/struct interface.
While Ben has mentioned this a few times, I haven't seen a complete proposal
for something like this yet, and it's definitely a nontrivial change.
For the sake of expediency, I'd really like to limit the scope of the
discussion of this protocol to the *current* Thrift interface. Changing Thrift
to be even more compact isn't going to happen in the near future (and possibly
not before our first release), while this protocol implementation could be
committed, working, now.
bq. Second: make up your mind--are you using zigzag ints or not?
Zigzag encoding is important in this protocol, but it isn't used in every
situation. User-entered data and field ids can be negative, so I have to use
zigzag there to protect against the worst case, where a sign-extended negative
number takes the maximum number of varint bytes. But there are other things,
like list and string lengths, which are never negative, so zigzagging them
would be a waste.
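To illustrate the distinction, here is a minimal Python sketch. It is not the
code from the attached patches; the function names are hypothetical and a
32-bit integer width is assumed. Signed values go through zigzag before being
written as a varint, while lengths are written as a plain varint.

```python
def zigzag_encode(n: int, bits: int = 32) -> int:
    """Map a signed int to an unsigned one: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    # Assumption: n fits in a signed integer of the given width.
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)


def write_varint(u: int) -> bytes:
    """Encode a non-negative int 7 bits per byte, least significant group first."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# A small negative value: one byte with zigzag, five bytes if its 32-bit
# two's-complement pattern is written as a plain varint.
with_zigzag = write_varint(zigzag_encode(-1))
without_zigzag = write_varint(-1 & 0xFFFFFFFF)

# A length is never negative, so it is written directly.
length_bytes = write_varint(300)
```

Skipping zigzag for lengths matters because zigzag doubles the magnitude: a
length of 64 to 127 fits in one varint byte as written, but would need two
after zigzagging.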
Also, while it would be nice to have specific-sized int headers available, this
doesn't help me when I'm in a map or list/set and I have to use one type header
without knowing the range of values up front. Zigzag allows me to just put
stuff in there and still get pretty respectable compression.
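As a rough illustration of that point, the sketch below computes how many
bytes each element of a list would take as a zigzag varint when the values
span very different ranges. The helper names are hypothetical, a 32-bit width
is assumed, and the container's type header itself is left out.

```python
def zigzag(n: int, bits: int = 32) -> int:
    """Zigzag-map a signed int (assumed to fit in `bits`) to an unsigned one."""
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def varint_size(u: int) -> int:
    """Number of bytes a non-negative int needs at 7 payload bits per byte."""
    return max(1, (u.bit_length() + 6) // 7)


def element_sizes(values):
    """Per-element byte counts when every element is a zigzag varint."""
    return [varint_size(zigzag(v)) for v in values]


# One list whose elements range from tiny to the 32-bit maximum.
values = [0, -3, 1000, -70000, 2**31 - 1]
sizes = element_sizes(values)   # [1, 1, 2, 3, 5]
total_varint = sum(sizes)       # 12 bytes
total_fixed = 4 * len(values)   # 20 bytes as fixed-width 32-bit ints
```

Each element pays only for its own magnitude, so one element type can cover
the whole list without the writer knowing the range of values in advance.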
bq. ... "followed by a variable-length type-header value" ...
I'm not sure I understand this proposal. What does this accomplish? Leaving
extra room for more types in the future? The current formulation of the
protocol leaves 3 open type spots, and while one might be spoken for by
externalized strings, I don't really know what other types we're likely to
introduce in the future.
> A more compact format
> ----------------------
>
> Key: THRIFT-110
> URL: https://issues.apache.org/jira/browse/THRIFT-110
> Project: Thrift
> Issue Type: Improvement
> Reporter: Noble Paul
> Assignee: Bryan Duxbury
> Attachments: compact-proto-spec-2.txt, compact_proto_spec.txt,
> compact_proto_spec.txt, thrift-110-v2.patch, thrift-110-v3.patch,
> thrift-110-v4.patch, thrift-110-v5.patch, thrift-110-v6.patch,
> thrift-110-v7.patch, thrift-110-v8.patch, thrift-110-v9.patch,
> thrift-110.patch
>
>
> Thrift is not as compact in writing out data as, say, protobuf. It does
> not have the concept of variable-length integers or the various other
> optimizations that are possible. In Solr we use a lot of such optimizations
> to make a very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very
> fast writes of Strings, externalizable strings, etc.
> We could use a Thrift format for non-Java clients, and I would like to see it
> be as compact as the current Java version.