Antony Mayi created ARROW-2153:
----------------------------------

             Summary: decimal conversion not working for exponential notation
                 Key: ARROW-2153
                 URL: https://issues.apache.org/jira/browse/ARROW-2153
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.8.0
            Reporter: Antony Mayi


{code:java}
import pyarrow as pa
import pandas as pd
import decimal

pa.Table.from_pandas(pd.DataFrame({'a': [decimal.Decimal('1.1'), decimal.Decimal('2E+1')]}))
{code}
 
{code:java}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/table.pxi", line 875, in pyarrow.lib.Table.from_pandas (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:44927)
  File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 350, in dataframe_to_arrays
    convert_types)]
  File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 349, in <listcomp>
    for c, t in zip(columns_to_convert,
  File "/home/skadlec/.local/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 345, in convert_column
    return pa.array(col, from_pandas=True, type=ty)
  File "pyarrow/array.pxi", line 170, in pyarrow.lib.array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:29224)
  File "pyarrow/array.pxi", line 70, in pyarrow.lib._ndarray_to_array (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:28465)
  File "pyarrow/error.pxi", line 77, in pyarrow.lib.check_status (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:8270)
pyarrow.lib.ArrowInvalid: Expected base ten digit or decimal point but found 'E' instead.
{code}
When constructing values by hand we can obviously write {{decimal.Decimal('20')}} 
instead of {{decimal.Decimal('2E+1')}}, but arithmetic inside an application can 
produce exponential notation outside the caller's control (it is actually the 
_normalized_ form of the decimal number). For some values, exponential notation 
is also the only form that expresses the significance, so it should be 
accepted.
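
The following is a small sketch of that point using only the standard {{decimal}} module (pyarrow is not involved). The variable names are illustrative only and are not taken from any real application.

```python
from decimal import Decimal

# Plain addition keeps a zero exponent, so the result prints as '20'.
total = Decimal('10') + Decimal('10')

# normalize() strips the trailing zero and yields exponential form.
normalized = total.normalize()

# Both values compare equal, yet they carry different significance:
# '20' stores two digits, '2E+1' stores only one.
digits_plain = total.as_tuple().digits
digits_normalized = normalized.as_tuple().digits

print(str(total), str(normalized), digits_plain, digits_normalized)
```

So a value that entered the application as {{20}} can reach the conversion step as {{2E+1}} without anyone having typed an exponent.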

The [documentation|https://docs.python.org/3/library/decimal.html] suggests 
the following transformation, but it is only usable when the significance 
information does not need to be kept:
{code:java}
from decimal import Decimal

def remove_exponent(d):
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()
{code}
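
As a rough illustration of why this workaround loses information, the sketch below applies the function to two sample values. The inputs and variable names are chosen here for demonstration and do not come from a real workload.

```python
from decimal import Decimal

def remove_exponent(d):
    return d.quantize(Decimal(1)) if d == d.to_integral() else d.normalize()

# An integral value in exponential form is rewritten without the exponent.
as_plain = remove_exponent(Decimal('2E+1'))

# Trailing zeros after the decimal point are dropped, so '1.10' and '1.1'
# can no longer be told apart afterwards.
stripped = remove_exponent(Decimal('1.10'))

print(as_plain, stripped)
```

The first result avoids the exponent, but the second shows the cost: the trailing zero that recorded the original precision is gone.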



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
