Yes, if I tell it to make a decimal128 array from the same ints, that works. The snag is specific to handling long Python ints, and it hits on these two paths:

- making an array with automatic type inference (no target type specified)
- making a float64 array
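For reference, a minimal sketch of the two-step path Felipe suggested below. The decimal128 step is confirmed to work as described above; the final cast to float64 is assumed to behave, with the expected loss of precision:

import pyarrow as pa

# Step 1: build the array as decimal128 -- this path accepts
# Python ints that are too large for int64 (confirmed above).
arr = pa.array([12345678901234567890], type=pa.decimal128(38, 0))

# Step 2: cast the decimal array to float64 (the suggested step;
# assumed to succeed, losing precision beyond 2**53 as expected).
floats = arr.cast(pa.float64())
# expected: [1.2345678901234567e+19]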
On Thu, May 11, 2023, 8:00 a.m. Felipe Oliveira Carvalho <[email protected]> wrote:

> Does creating a decimal128 array, then casting that array to float64 work?
>
> On Mon, May 8, 2023 at 3:08 PM Chris Comeau <[email protected]> wrote:
>
>> Is there any way to have pa.compute.cast handle int -> float64 with
>> accepted loss of precision?
>>
>> Source value is a Python int that's too long for int64, like
>> 12345678901234567890, and I'd like to put it into a float64 field in an
>> Arrow table.
>>
>> Using pyarrow 12.0.0:
>>
>> pa.array([12345678901234567890], type=pa.float64())
>> -> ArrowInvalid: PyLong is too large to fit int64
>>
>> Converting it myself works, with the expected loss of precision:
>>
>> pa.array([float(12345678901234567890)], type=pa.float64())
>> -> [1.2345678901234567e+19]
>>
>> but I can't get pa.compute to do the same. Some examples:
>>
>> pa.compute.cast([20033613169503999008], target_type=pa.float64(),
>> safe=False)
>> -> OverflowError: int too big to convert
>>
>> pa.compute.cast(
>>     [12345678901234567890],
>>     options=pa.compute.CastOptions.unsafe(target_type=pa.float64())
>> )
>> -> OverflowError: int too big to convert
>>
>> I tried the other options, like allowing int overflow and float
>> truncation, with no luck.
>>
>> Asking Arrow to infer types hits the same error:
>>
>> pa_array = pa.array([12345678901234567890])
>> -> OverflowError: int too big to convert
>>
>> Casting to decimal128(38, 0) works if it's set explicitly:
>>
>> pa.array([12345678901234567890], type=pa.decimal128(38, 0))
>> <pyarrow.lib.Decimal128Array object at 0x000001C45FAA8B80>
>> -> [12345678901234567890]
>>
>> I'm working around it by doing the float() conversion myself, but this
>> is slower, of course.
