If you're working on 0.8+, you could always just create a new Scheme to go with your Protocol that never uses IDs and always uses names. See the TupleProtocol for an example of using a non-standard scheme.
On Tue, Jan 3, 2012 at 9:53 AM, Benjy Weinberger <[email protected]> wrote: > Sadly no. I have tried this extensively, but if read() can't accept > this information directly then the TProtocol implementation has to > receive it out-of-band somehow. It seemed like it would be possible to > generate it up front from the available FieldValueMetaData - starting > at the root class and working your way down. Unfortunately this: > > A) greatly complicates the TProtocol implementation > > and > > B) is defeated by the presence of typedefs, since, unfortunately and > for reasons I'm not clear on, the FieldValueMetaData for a typedef > describes the typedef, not the thing it refers to. (*) > > > There is a trick using stack traces (see (**) for details), but it: > > A) greatly complicates the TProtocol implementation > > B) is fragile > > C) performs poorly and is unsuitable for production use > > D) is defeated by the presence of unions. > > > My conclusion, and believe me this was after several days of > deep-diving into the issue, is that it is much simpler and more > natural to fix up the read() method. > > It's simpler because it isn't a particularly big change. And it's > probably better to add complication to generated code than to > hand-written code. > > It's more natural because the read() method already has the > information it needs to read a field by name. Forcing the TProtocol to > do the conversion means passing it out-of-band information that > ultimately has to come from the generated struct code anyway. And the > TProtocol should be about wire representation, not about id<->field > mappings. That seems properly the role of the generated code. > > What do you think? I could put together a change to the compiler and > we can see the degree of complexity it adds to the generated code is a > problem. > > Benjy > > > > > (*) This typedef metadata doesn't seem to be useful at runtime, since > it just states the typedef alias, but this doesn't refer to anything > you can reflect on at runtime. > > (**) FWIW there is a "TTextProtocol" in the twitter/commons libraries on > github. It is essentially a JSON protocol that uses names. > > However because of the problem mentioned below it has to do some > non-trivial state tracking, including the following trick to figure > out which struct it's currently reading (so it knows which struct's > metadata to use to map names to ids): > > In readStructBegin() it generates a stack trace and walks up it frame > by frame, calling Class.forName() on the declaring class of each frame > and tests it for assignability to TBase. The first TBase subclass > encountered is assumed to be the currently read struct - calling > readStructBegin() from its read() method. > > This is fragile (e.g., it doesn't work with unions because the first > TBase subclass encountered will be TUnion) and has poor performance, > so it's only used in tests. > > > On Tue, Jan 3, 2012 at 9:40 AM, Bryan Duxbury (Commented) (JIRA) > <[email protected]> wrote: > > > > [ > https://issues.apache.org/jira/browse/THRIFT-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13178850#comment-13178850] > > > > Bryan Duxbury commented on THRIFT-1477: > > --------------------------------------- > > > > This feels like it would unnecessarily complicate the read() method. Can > you just figure out a way to do the name->id mapping yourself? > > > >> Allow readFieldBegin() to pass back the field name instead of the field > id > >> > -------------------------------------------------------------------------- > >> > >> Key: THRIFT-1477 > >> URL: https://issues.apache.org/jira/browse/THRIFT-1477 > >> Project: Thrift > >> Issue Type: Improvement > >> Components: Java - Compiler > >> Reporter: Benjy Weinberger > >> Priority: Minor > >> > >> [Apologies if this has been addressed in another issue. I couldn't find > anything relevant on JIRA or the mailing list archives.] > >> Background: I'm implementing a BSON protocol, in order to write Thrift > messages to MongoDB (technically the protocol generates the object > representation that the MongoDB driver expects, not a raw BSON string > directly to the transport, but that's an unimportant detail here). > >> BSON, like JSON, naturally uses human-readable string field names. > >> When reading, the generated Thrift code (at least in Java) requires > that readFieldBegin() pass back a TField with the id field set. It ignores > the name field. Therefore the ids must appear in the stream. It's possible > to contort these protocols to use ids instead of human-readable names (as > TJSONProtocol does) but this isn't helpful in dealing with prior BSON or > JSON data that we're trying to back-port into Thrift schemata. > >> However, the generated read() method already knows how to map names to > ids. So I propose allowing a TProtocol's readFieldBegin() method to pass > back a TField with the name set and no id set (indicated, say, by id==-1), > and let the read() method figure out the id to then switch on. > >> In some cases we could also allow the TField to omit the type > information, which, again, is not naturally present in JSON. (BSON does > embed type information, but its type system does not align fully with > Thrift's, so it can't be used without further context). If the field is > unknown, the only use for the type is for skipping the field value. But > protocols like JSON and BSON can skip fields without this type information, > since fields are delimited in the protocol in a type-independent way. > >> Basically, what I propose is that readFieldBegin() be allowed to pass > back just an id or just a name (and, for some protocols, no type > information), since that is all read() needs in order to figure out how to > read or skip the field. > >> I'm wondering what the Thrift elders think of this. Has it been > discussed? Thanks! > >> PS This does have the downside that if Thrift were to implement a > pass-through feature for unrecognized fields (so that new messages read > with old protocol versions will serialize back out with no loss) - we > wouldn't be able to preserve fields for which we only had a name and no id. > Or rather, we wouldn't be able to write them out to a protocol that > requires ids, like the binary protocols. However this feature doesn't exist > anyway, and I don't know if it's on the roadmap. > > > > -- > > This message is automatically generated by JIRA. > > If you think it was sent incorrectly, please contact your JIRA > administrators: > https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa > > For more information on JIRA, see: > http://www.atlassian.com/software/jira > > > > >
