[ https://issues.apache.org/jira/browse/THRIFT-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191481#comment-13191481 ]
Diwaker Gupta commented on THRIFT-1394: --------------------------------------- Bump. Jake? > Treatment of optional fields is not consistent between C++ and Java > ------------------------------------------------------------------- > > Key: THRIFT-1394 > URL: https://issues.apache.org/jira/browse/THRIFT-1394 > Project: Thrift > Issue Type: Bug > Components: C++ - Compiler, Java - Compiler > Affects Versions: 0.7 > Environment: Mac OSX, Thrift trunk > Reporter: Diwaker Gupta > Assignee: Jake Farrell > Labels: C++, Java > Attachments: THRIFT-1394.patch > > > *Motivation* > We are trying to implement message signing in the application layer for an > internal project. The transport is over Thrift -- the server side is using > Java and the client side is using C++. We noticed that the message signing > would fail because the client and server would serialize the same Thrift > object in different ways. > *Problem* > The semantics of "optional" fields differ between thrift code generated for > Java and CPP. In CPP, all optional fields are guarded by the {{isset}} helper > struct. On Java, however, the generated code takes advantage of nullable > types: for containers, structs, exceptions, enums, and, notably, strings, the > generator elides explicit use of an "isset" bit vector and instead emits > checks of the form "field null". This leads to varying behavior between the > two languages: an optional string field with a default value will have > {{isset[fieldid]}} false on C, but the equivalent test in Java will be true. > To be concrete, consider the following example: > {code} > struct Foo { > 1: optional string bar = "hello"; > 2: optional i32 baz; > } > {code} > In C++, the checks for 'bar' and 'i' are similar: > {code} > if (this->__isset.bar) { > xfer += oprot->writeFieldBegin("bar", > ::apache::thrift::protocol::T_STRING, 1); > xfer += oprot->writeString(this->bar); > xfer += oprot->writeFieldEnd(); > } > if (this->__isset.baz) { > xfer += oprot->writeFieldBegin("baz", ::apache::thrift::protocol::T_I32, > 2); > xfer += oprot->writeI32(this->baz); > xfer += oprot->writeFieldEnd(); > } > {code} > However, in Java, the checks are different: > {code} > /** Returns true if field bar is set (has been assigned a value) and false > otherwise */ > public boolean isSetBar() { > return this.bar != null; > } > /** Returns true if field baz is set (has been assigned a value) and false > otherwise */ > public boolean isSetBaz() { > return __isset_bit_vector.get(__BAZ_ISSET_ID); > } > {code} > As a result, when the Java object is serialized, the value for 'bar' _is_ > included. But when the C++ object is serialized, the value for 'bar' is NOT > included. > For a system like Thrift, I would eschew a few bytes saved over correctness > and reliability. As a user of Thrift, I *do* expect that the wire data > generated for identical Thrift objects will be identical, regardless of the > language used. > *Proposal* > We already use a BitSet to track primitive types in Java. The compiler should > extend the bit vector to also guard nullable types, to be consistent with > C++. This is pretty easy and low impact -- I'm happy to provide a patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira