Allow readFieldBegin() to pass back the field name instead of the field id
--------------------------------------------------------------------------

                 Key: THRIFT-1477
                 URL: https://issues.apache.org/jira/browse/THRIFT-1477
             Project: Thrift
          Issue Type: Improvement
          Components: Java - Compiler
            Reporter: Benjy Weinberger
            Priority: Minor


[Apologies if this has been addressed in another issue. I couldn't find 
anything relevant on JIRA or the mailing list archives.]

Background: I'm implementing a BSON protocol, in order to write Thrift messages 
to MongoDB (technically the protocol generates the object representation that 
the MongoDB driver expects, not a raw BSON string directly to the transport, 
but that's an unimportant detail here). 

BSON, like JSON, naturally uses human-readable string field names. 

When reading, the generated Thrift code (at least in Java) requires that 
readFieldBegin() pass back a TField with the id field set. It ignores the name 
field. Therefore the ids must appear in the stream. It's possible to contort 
these protocols to use ids instead of human-readable names (as TJSONProtocol 
does) but this isn't helpful in dealing with prior BSON or JSON data that we're 
trying to back-port into Thrift schemata.

However, the generated read() method already knows how to map names to ids. So 
I propose allowing a TProtocol's readFieldBegin() method to pass back a TField 
with the name set and no id set (indicated, say, by id==-1), and letting the 
read() method work out which id to switch on. 
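
To make the idea concrete, here is a rough sketch of the fallback I have in 
mind. It is not real generated code: Field is a minimal stand-in for TField, 
and NO_ID, IDS_BY_NAME and resolveId() are hypothetical names used only for 
illustration. The struct with fields "userName" (id 1) and "age" (id 2) is 
made up too.

```java
import java.util.HashMap;
import java.util.Map;

final class FieldNameResolutionSketch {
    // Hypothetical sentinel meaning "the protocol did not supply an id".
    static final short NO_ID = -1;

    /** Minimal stand-in for Thrift's TField; not the real class. */
    static final class Field {
        final String name;
        final byte type;
        final short id;

        Field(String name, byte type, short id) {
            this.name = name;
            this.type = type;
            this.id = id;
        }
    }

    // Name-to-id table for a made-up struct; generated code would own this.
    static final Map<String, Short> IDS_BY_NAME = new HashMap<>();
    static {
        IDS_BY_NAME.put("userName", (short) 1);
        IDS_BY_NAME.put("age", (short) 2);
    }

    /**
     * Returns the id to switch on: the id from the protocol if present,
     * otherwise the id looked up by name, otherwise NO_ID (unknown field).
     */
    static short resolveId(Field field) {
        if (field.id != NO_ID) {
            return field.id;
        }
        if (field.name == null) {
            return NO_ID;
        }
        Short id = IDS_BY_NAME.get(field.name);
        return id == null ? NO_ID : id;
    }

    public static void main(String[] args) {
        Field fromBson = new Field("age", (byte) 0, NO_ID);
        System.out.println("resolved id: " + resolveId(fromBson));
    }
}
```

A protocol that does put ids on the wire would be unaffected, since 
resolveId() only consults the name when the id is the sentinel.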

In some cases we could also allow the TField to omit the type information, 
which, again, is not naturally present in JSON. (BSON does embed type 
information, but its type system does not align fully with Thrift's, so it 
can't be used without further context.) If the field is unknown, the only use 
for the type is to skip the field value. But protocols like JSON and BSON can 
skip fields without this type information, since fields are delimited in the 
protocol in a type-independent way.
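
As an illustration of skipping without a declared type, here is a small 
sketch that steps over one value in JSON text by looking only at delimiters. 
It is an assumption-laden toy, not part of any Thrift protocol: the class and 
method names are hypothetical, it works on a String rather than a transport, 
and it does no validation beyond detecting unterminated input.

```java
final class TypeFreeSkipSketch {
    /**
     * Returns the index just past the JSON value that starts at pos.
     * No Thrift type is needed; the delimiters alone mark where it ends.
     */
    static int skipValue(String s, int pos) {
        while (pos < s.length() && Character.isWhitespace(s.charAt(pos))) {
            pos++;
        }
        if (pos >= s.length()) {
            throw new IllegalArgumentException("no value to skip");
        }
        char first = s.charAt(pos);
        if (first == '"') {
            return skipString(s, pos);
        }
        if (first == '{' || first == '[') {
            int depth = 0;
            while (pos < s.length()) {
                char c = s.charAt(pos);
                if (c == '"') {
                    // Brackets inside strings must not affect the depth.
                    pos = skipString(s, pos);
                    continue;
                }
                if (c == '{' || c == '[') {
                    depth++;
                } else if (c == '}' || c == ']') {
                    depth--;
                    if (depth == 0) {
                        return pos + 1;
                    }
                }
                pos++;
            }
            throw new IllegalArgumentException("unterminated container");
        }
        // Scalar (number, true, false, null): runs until the next delimiter.
        while (pos < s.length() && ",}] \t\r\n".indexOf(s.charAt(pos)) < 0) {
            pos++;
        }
        return pos;
    }

    /** Returns the index just past the closing quote of the string at pos. */
    static int skipString(String s, int pos) {
        pos++; // step over the opening quote
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c == '\\') {
                pos += 2; // skip the escaped character
            } else if (c == '"') {
                return pos + 1;
            } else {
                pos++;
            }
        }
        throw new IllegalArgumentException("unterminated string");
    }

    public static void main(String[] args) {
        String doc = "{\"a\":{\"x\":[1,\"]\"]},\"b\":2}";
        int end = skipValue(doc, 5); // value of field "a" starts at index 5
        System.out.println("rest after skipped value: " + doc.substring(end));
    }
}
```

A BSON protocol would skip differently in detail, but the point is the same: 
the reader can step over an unknown field without read() supplying a type.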

Basically, what I propose is that readFieldBegin() be allowed to pass back just 
an id or just a name (and, for some protocols, no type information), since that 
is all read() needs in order to figure out how to read or skip the field. 

I'm wondering what the Thrift elders think of this. Has it been discussed? 
Thanks!


PS: This does have a downside. If Thrift were to implement a pass-through 
feature for unrecognized fields (so that new messages read with old protocol 
versions serialize back out with no loss), we wouldn't be able to preserve 
fields for which we only had a name and no id. Or rather, we wouldn't be able 
to write them out to a protocol that requires ids, like the binary protocols. 
However, this feature doesn't exist anyway, and I don't know if it's on the 
roadmap. 


        
