In light of the recent proposal for IETF standardization of Thrift protocol [1] and the observation that the Compact specification does not match the implementations [2] (all languages are consistent so the specification file has to be updated).
I opened a pull request [3] in order to fix that but a wording question remains regarding the element-types (types encoded at the beginning of lists, sets, and map) by contrast to the field types in the structures. In binary protocol, the same values are used on both sides, which is made completely explicit with the following statement: "The element-type values are the same as field-types. The full list is included in the struct section above." The current description of Compact protocol indicates that element types and field-types are different but the issue is that the values are incorrect. The current description indicates (binary added for reference) | Value | Binary | Elements | Fields | +-------+--------+----------+--------+ | 0 | | | | | 1 | | | TRUE | | 2 | BOOL | BOOL | FALSE | | 3 | I8 | I8 | I8 | | 4 | DOUBLE | DOUBLE | I16 | | 5 | | | I32 | | 6 | I16 | I16 | I64 | | 7 | | | DOUBLE | | 8 | I32 | I32 | BINARY | | 9 | | | LIST | | 10 | I64 | I64 | SET | | 11 | STRING | STRING | MAP | | 12 | STRUCT | STRUCT | STRUCT | | 13 | MAP | MAP | | | 14 | SET | SET | | | 15 | LIST | LIST | | As we can see, the description indicates that compact element types are identical to binary types. Unfortunately, the implementations do not follow this structure (but are consistent as shown by the cross-language tests), but the followin one: | Value | Binary | Elements | Fields | +-------+--------+----------+--------+ | 0 | | | | | 1 | | | TRUE | | 2 | BOOL | BOOL | FALSE | | 3 | I8 | I8 | I8 | | 4 | DOUBLE | I16 | I16 | | 5 | | I32 | I32 | | 6 | I16 | I64 | I64 | | 7 | | DOUBLE | DOUBLE | | 8 | I32 | STRING | BINARY | | 9 | | LIST | LIST | | 10 | I64 | SET | SET | | 11 | STRING | MAP | MAP | | 12 | STRUCT | STRUCT | STRUCT | | 13 | MAP | | | | 14 | SET | | | | 15 | LIST | | | It's clear in this case that, except for boolean where both values are encoded as different types, the element-types and field-types are consistent within compact protocol and have no link to the binary values. Proposal 1: Wording indicates that element-types and field-types (for compact) are using independent values and could diverge more in the future. Proposal 2: Wording indicates that element-types and field-types (for compact) are using similar values and will remain aligned in the future (and also that element-type value 1 is invalid). (My vote would go to the 2nd proposal.) As a side note, I propose to align the "BINARY"/"STRING" type to "BINARY" on both sides as documentation explicitly indicates that "Strings are first encoded to UTF-8, and then send as binary." [5] Please comment and I will update my PR with the commonly-preferred wording (also do not hesitate to propose a specific wording for the compact specification with regards to the element-types [6]. Consistently, Triton. [1] http://mail-archives.apache.org/mod_mbox/thrift-user/202107.mbox/raw/%3C03439030-2F4C-4A2A-AE21-5DAEAAC2D99B%40contoso.com%3E/1 [2] https://issues.apache.org/jira/browse/THRIFT-5300 [3] https://github.com/apache/thrift/pull/2417 [4] https://github.com/apache/thrift/blame/master/doc/specs/thrift-binary-protocol.md#L214 [5] https://github.com/apache/thrift/blame/master/doc/specs/thrift-binary-protocol.md#L78 [6] https://github.com/apache/thrift/pull/2417/files#diff-1dd22d4f1230706bd4cdef9b71f0559886aeae62640292d6fc30c94ca160437fL248-R248