[ https://issues.apache.org/jira/browse/THRIFT-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henrique Mendonça updated THRIFT-1841:
--------------------------------------
    Affects Version/s:     (was: 1.0)
                       0.9.1

> NodeJS Thrift incorrectly parses non-UTF8-string types
> ------------------------------------------------------
>
>                 Key: THRIFT-1841
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1841
>             Project: Thrift
>          Issue Type: Bug
>          Components: Node.js - Compiler, Node.js - Library
>    Affects Versions: 0.9, 0.9.1
>            Reporter: Nabeel Shahzad
>            Assignee: Henrique Mendonça
>              Labels: node, nodejs
>             Fix For: 0.9.2
>
>
> **Edit**: See my comment below.
> When a double/float is used in a map (key or value), list, or set type, the
> value is decoded as a UTF-8 string, which corrupts some bytes and can add
> extra ones.
> For example:
> The bytes of a map<double, double> (this is coming out of the Thrift call):
> {noformat}
> 00 01 00 08 3f f4 00 00 00 00 00 00 00 08 40 02 00 00 00 00 00 00
> {noformat}
> But after it's been parsed out from the field as UTF-8:
> {noformat}
> 00 01 00 08 3f 3f 00 00 00 00 00 00 00 08 40 02 00 00 00 00 00 00
> {noformat}
> As you can see, there's an incorrect byte (a 3f where the f4 was, and an extra
> 00). For reference, this value was map<double, double> = {1.25: 2.25}. The
> behavior is the same for floats. The f4 byte is decimal 244, which I
> believe isn't a valid UTF-8 byte on its own.
> The actual value of the field becomes:
> {noformat}
> value:
> '\u0000\u0002\u0000\b??\u0000\u0000\u0000\u0000\u0000\u0000\u0000\b@\u0002\u0000\u0000\u0000\u0000\u0000\u0000''
> {noformat}
> Where the \b = 8, ? = f4, ? = unknown char.
> I have seen cases where there are *extra* bytes added in, which breaks the
> parsing based on byte size:
> {noformat}
> 00 01 00 08 40 24 48 72 c2 b0 20 c3 84 c2 9c 00 08 40 34 c3 bc c3 93 5a c2 85
> c2 87 c2 94
> {noformat}
> Where the map value was {10.1415: 20.9876}. On a list or set, using either
> value also yields extra bytes.
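For anyone trying to reproduce the extra bytes in the second dump, here is a minimal sketch. It assumes (this is a guess at the mechanism, not something the report confirms) that the raw bytes end up in a one-character-per-byte string and are then re-encoded as UTF-8. The variable names are hypothetical and none of this is code from the Thrift Node.js library; it only uses Node's built-in `Buffer`.

```javascript
// Hypothetical reproduction; this is not code from the Thrift Node.js library.

// The 8 big-endian IEEE 754 bytes of the map key from the report.
const original = Buffer.alloc(8);
original.writeDoubleBE(10.1415, 0);

// Assumed mechanism: the bytes are held as a one-character-per-byte string...
const asString = original.toString('latin1');

// ...and that string is later encoded as UTF-8, so every byte >= 0x80
// is expanded into a two-byte sequence.
const expanded = Buffer.from(asString, 'utf8');

console.log('original bytes:', original.toString('hex'));
console.log('expanded bytes:', expanded.toString('hex'));
console.log('lengths:', original.length, '->', expanded.length);
```

Under that assumption, each byte at or above 0x80 in the key turns into two bytes, which is consistent with the c2/c3 prefixes in the dump and explains why the 2-byte length prefix (00 08) no longer matches the data that follows it.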
> So this messes up any parsing based on the byte length of the field, since
> a variable number of extra bytes is added, either to the key or value
> of the map, or to any value of a list. I believe this could also happen with
> large integer values.
> It seems to me that when the "ftype" is parsed (int16) before the actual
> field, it returns a TYPE value of "11" (string) instead of the proper value
> for a map/set/list.
> For reference, the table, and an insert example:
> {noformat}
> CREATE TABLE sample_map (
>   id text PRIMARY KEY,
>   map_col_text map < text, text >,
>   map_col_int map < int, text >,
>   map_col_float map < float, float >,
>   map_col_double map < double, double >
> );
> INSERT INTO sample_map (id, map_col_double) VALUES('DOUBLE_ROW_SINGLE',
> {10.1415: 20.9876});
> {noformat}
> Not sure if it matters, but this was using CQL3. Also, we are not seeing this
> on the C++ generated Thrift interface.
> Versions:
> {noformat}
> cqlsh:orion> show version;
> [cqlsh 2.3.0 | Cassandra 1.2.0 | CQL spec 3.0.0 | Thrift protocol 19.35.0]
> {noformat}
> {noformat}
> $ thrift --version
> Thrift version 0.9.0
> {noformat}
> {noformat}
>   "name": "node-thrift",
>   "description": "node.js bindings for the Apache Thrift RPC system",
>   "homepage": "http://thrift.apache.org/",
>   "repository": {
>     "type": "svn",
>     "url": "http://svn.apache.org/repos/asf/thrift/trunk/"
>   },
>   "version": "1.0.0-dev",
> {noformat}
> The issue also appears in the 0.9.0 version of the thrift library.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
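For comparison, a minimal sketch of decoding the first dump from the report without any string conversion. The layout is inferred from the dump rather than taken from a specification: a 2-byte big-endian entry count, then a 2-byte length followed by that many bytes for each key and each value. The function name `decodeDoubleMap` is hypothetical and not part of any Thrift or Cassandra client API.

```javascript
// Hypothetical helper; the layout below is an assumption read off the hex dump.
function decodeDoubleMap(buf) {
  let offset = 0;

  // Assumed: 2-byte big-endian entry count.
  const count = buf.readUInt16BE(offset);
  offset += 2;

  const entries = [];
  for (let i = 0; i < count; i++) {
    const pair = [];
    // Each entry is a key followed by a value, both length-prefixed.
    for (let j = 0; j < 2; j++) {
      const len = buf.readUInt16BE(offset);
      offset += 2;
      if (len !== 8) {
        throw new Error('expected an 8-byte double, got ' + len + ' bytes');
      }
      // Read the raw bytes as a big-endian double; no string decoding involved.
      pair.push(buf.readDoubleBE(offset));
      offset += len;
    }
    entries.push(pair);
  }
  return entries;
}

// The 22 bytes from the first dump in the report.
const raw = Buffer.from(
  '0001' + '0008' + '3ff4000000000000' + '0008' + '4002000000000000',
  'hex'
);

console.log(decodeDoubleMap(raw));
```

Because the bytes stay in a `Buffer` the whole way, the length prefixes keep matching the data, and the single entry reads back as the pair 1.25 and 2.25 given in the report.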