[ https://issues.apache.org/jira/browse/ARROW-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phillip Cloud updated ARROW-1716:
---------------------------------
    Description:
Surprisingly, Java and C++ integration tests pass after ARROW-1588. This hides a bug, because we're writing decimal values as hex-encoded bytes.

C++ and Java compare that the bytes are the same, but because C++ is interpreting everything as little endian after ARROW-1588 and Java is big endian, the numbers these bytes represent will be different in their respective systems.

I propose that instead of encoding DecimalArray/DecimalVector values as hex-encoded bytes, we store the integer as a string when writing Arrow DecimalArray/DecimalVector data to JSON. This will allow us to compare that the bytes have the same meaning in both systems.

This requires a change to the way Arrow writes JSON.

cc [~icexelloss] [~wesmckinn] [~jnadeau]

  was:
Surprisingly, Java and C++ integration tests pass after ARROW-1588. This hides a bug, because we're writing decimal values as hex-encoded bytes.

C++ and Java compare that the bytes are the same, but because C++ is interpreting everything as little endian after ARROW-1588 and Java is big endian, the numbers these bytes represent will be different in their respective systems.

I propose that instead of encoding DecimalArray/DecimalVector values as hex-encoded bytes, we store the integer as a string when writing Arrow DecimalArray/DecimalVector data to JSON. This will allow us to compare that the bytes have the same meaning in both systems.

This requires a change to how Arrow writes JSON.

cc [~icexelloss] [~wesmckinn] [~jnadeau]


> [Format/JSON] Use string integer value for Decimals in JSON
> -----------------------------------------------------------
>
>                 Key: ARROW-1716
>                 URL: https://issues.apache.org/jira/browse/ARROW-1716
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Java - Vectors
>    Affects Versions: 0.7.1
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>             Fix For: 0.8.0
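The mismatch described in this issue can be illustrated with a minimal Python sketch. It is not Arrow code: the helper names (`decode_decimal`, `to_hex`), the 16-byte width, and the sample value 12345 are assumptions chosen for illustration, and the real JSON integration format is not reproduced here. The sketch only shows why comparing hex-encoded bytes can pass while the two readers disagree about the number, and why comparing the integer as a string exposes the disagreement.

```python
def decode_decimal(raw: bytes, byteorder: str) -> int:
    """Interpret raw bytes as a signed two's-complement integer.

    byteorder is "little" (how C++ is described as reading after
    ARROW-1588) or "big" (how Java is described as reading).
    """
    return int.from_bytes(raw, byteorder=byteorder, signed=True)


def to_hex(raw: bytes) -> str:
    """Hex-encode raw bytes, standing in for the old JSON representation."""
    return raw.hex().upper()


# Hypothetical unscaled decimal value, stored in 16 bytes, little endian.
raw = (12345).to_bytes(16, byteorder="little", signed=True)

# Old check: both sides emit the same hex string for the same bytes,
# so the comparison passes no matter how each side interprets them.
hex_values_match = to_hex(raw) == to_hex(raw)

# Proposed check: each side writes the integer it actually sees.
cpp_value = decode_decimal(raw, "little")
java_value = decode_decimal(raw, "big")
string_values_match = str(cpp_value) == str(java_value)

print("hex comparison passes:   ", hex_values_match)
print("string comparison passes:", string_values_match)
```

With the string form, an integration test would fail as soon as the two implementations assign different meanings to the same bytes, which is the behavior the proposal is after.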