[ 
https://issues.apache.org/jira/browse/ARROW-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phillip Cloud updated ARROW-1716:
---------------------------------
    Description: 
Surprisingly, the Java and C++ integration tests pass after ARROW-1588. This
hides a bug, because we're writing decimal values as hex-encoded bytes.

C++ and Java compare that the bytes are the same, but because C++ interprets 
everything as little endian after ARROW-1588 and Java is big endian, the 
numbers these bytes represent differ between the two systems.
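
A minimal sketch of the mismatch described above, using only the Python 
standard library. The 16-byte payload is a made-up value chosen for 
illustration, not data from the integration tests:

```python
# Hypothetical 16-byte decimal payload as it might appear hex encoded in JSON.
hex_payload = "01000000000000000000000000000000"
raw = bytes.fromhex(hex_payload)

# Decode the identical bytes as a two's complement integer under each byte order.
as_little = int.from_bytes(raw, byteorder="little", signed=True)
as_big = int.from_bytes(raw, byteorder="big", signed=True)

print(as_little)  # 1
print(as_big)     # 2**120

# A byte-for-byte comparison passes, yet the decoded numbers disagree.
print(as_little == as_big)  # False
```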

I propose that instead of encoding DecimalArray/DecimalVector values as 
hex-encoded bytes, we store the integer as a string when writing Arrow 
DecimalArray/DecimalVector data to JSON. This will allow us to verify that the 
bytes have the same meaning in both systems.
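
One way the proposed encoding could look, sketched in Python rather than in 
the actual C++ or Java writers. The helper name `decimal_to_json` and the 
`"DATA"` key are assumptions made for this illustration:

```python
import json

def decimal_to_json(raw: bytes, byteorder: str) -> str:
    """Hypothetical writer: decode with the producer's own byte order,
    then emit the integer as a base-10 string."""
    value = int.from_bytes(raw, byteorder=byteorder, signed=True)
    return json.dumps({"DATA": [str(value)]})

value = -12345

# Each side holds the same number in a different in-memory byte layout.
little_bytes = value.to_bytes(16, byteorder="little", signed=True)
big_bytes = value.to_bytes(16, byteorder="big", signed=True)

little_json = decimal_to_json(little_bytes, "little")
big_json = decimal_to_json(big_bytes, "big")

# The raw bytes differ, but the JSON text is identical because it carries
# the numeric value instead of the byte layout.
print(little_bytes != big_bytes)  # True
print(little_json == big_json)    # True
```

With a string integer in the JSON, a comparison between the two 
implementations checks the value each one believes it holds, which is what 
the hex-encoded form could not do.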

This requires a change to how Arrow writes JSON.

cc [~icexelloss] [~wesmckinn] [~jnadeau]



  was:
Surprisingly, the Java and C++ integration tests pass after ARROW-1588. This
hides a bug, because we're writing decimal values as hex-encoded bytes.

C++ and Java compare that the bytes are the same, but because C++ interprets 
everything as little endian after ARROW-1588 and Java is big endian, the 
numbers these bytes represent differ between the two systems.

I propose that instead of encoding DecimalArray/DecimalVector values as 
hex-encoded bytes, we store the integer as a string when writing Arrow data to 
JSON. This will allow us to verify that the bytes have the same meaning in 
both systems.

This requires a change to how Arrow writes JSON.

cc [~icexelloss] [~wesmckinn] [~jnadeau]




> [Format/JSON] Use string integer value for Decimals in JSON
> -----------------------------------------------------------
>
>                 Key: ARROW-1716
>                 URL: https://issues.apache.org/jira/browse/ARROW-1716
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Java - Vectors
>    Affects Versions: 0.7.1
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>             Fix For: 0.8.0
>
>
> Surprisingly, the Java and C++ integration tests pass after ARROW-1588. This
> hides a bug, because we're writing decimal values as hex-encoded bytes.
> C++ and Java compare that the bytes are the same, but because C++ interprets 
> everything as little endian after ARROW-1588 and Java is big endian, the 
> numbers these bytes represent differ between the two systems.
> I propose that instead of encoding DecimalArray/DecimalVector values as 
> hex-encoded bytes, we store the integer as a string when writing Arrow 
> DecimalArray/DecimalVector data to JSON. This will allow us to verify that 
> the bytes have the same meaning in both systems.
> This requires a change to how Arrow writes JSON.
> cc [~icexelloss] [~wesmckinn] [~jnadeau]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
