[ https://issues.apache.org/jira/browse/ARROW-12539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen Bias updated ARROW-12539:
---------------------------------
Description:
When importing CSV data with dates in the format {{"%d-%b-%y"}} or
{{"%d-%b-%Y"}}, a conversion error is raised for both date64 and date32 columns.
Example:
{code:python}
import pyarrow as pa
from pyarrow import csv

data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"
tp = ["%d-%b-%y"]

# Read column "b" as date64, passing the custom format via timestamp_parsers.
try:
    schema_d64 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date64())])
    co_d64 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d64)
    a_d64 = csv.read_csv(pa.py_buffer(data), convert_options=co_d64)
except Exception as e:
    print(e)

# Same attempt with date32.
try:
    schema_d32 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date32())])
    co_d32 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d32)
    a_d32 = csv.read_csv(pa.py_buffer(data), convert_options=co_d32)
except Exception as e:
    print(e)
{code}
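As a possible stopgap (not something confirmed in this issue), the date column could be parsed outside of pyarrow. The following is a minimal sketch using only the Python standard library on the sample data from this report. It assumes an English/C locale for the month abbreviations, and {{parse_dates}} is a hypothetical helper name, not a pyarrow function.

```python
import csv  # standard-library csv module, not pyarrow.csv
import io
from datetime import datetime

# Sample data from this report.
data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"


def parse_dates(raw, column, fmt):
    """Parse one CSV column into datetime.date objects.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
    return [datetime.strptime(row[column], fmt).date() for row in reader]


dates = parse_dates(data, "b", "%d-%b-%y")
print(dates)
```

The resulting list of {{datetime.date}} objects could then be handed to pyarrow separately. This only shows that the format string itself matches the sample values; it does not address the conversion error inside the CSV reader.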
was:
when importing csv data with dates in the format "%d-%b-%y" or "%d-%b-%Y" an
error is given in conversion:
{{pyarrow.lib.ArrowInvalid: In CSV column #1: CSV conversion error to
date64[ms]: invalid value '15-JAN-16'}}
example:
{{ import pyarrow as pa}}
{{ from pyarrow import csv}}
{{data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"}}
{{ tp = ["%d-%b-%y"]}}
{{try:}}
{{ schema_d64 = pa.schema([pa.field("a", pa.int64()), pa.field("b",
pa.date64())])}}
{{ co_d64 = csv.ConvertOptions(timestamp_parsers=tp,
column_types=schema_d64)}}
{{ a_d64 = csv.read_csv(pa.py_buffer(data), convert_options=co_d64)}}
{{ except Exception as e:}}
{{ print(e)}}
{{try:}}
{{ schema_d32 = pa.schema([pa.field("a", pa.int64()), pa.field("b",
pa.date32())])}}
{{ co_d32 = csv.ConvertOptions(timestamp_parsers=tp,
column_types=schema_d32)}}
{{ a_d32 = csv.read_csv(pa.py_buffer(data), convert_options=co_d32)}}
{{ except Exception as e:}}
{{ print(e)}}
> Unable to read date64 or date32 in specific format
> --------------------------------------------------
>
> Key: ARROW-12539
> URL: https://issues.apache.org/jira/browse/ARROW-12539
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 3.0.0
> Reporter: Stephen Bias
> Priority: Major
>
> When importing CSV data with dates in the format {{"%d-%b-%y"}} or
> {{"%d-%b-%Y"}}, a conversion error is raised for both date64 and date32 columns.
> Example:
> {code:python}
> import pyarrow as pa
> from pyarrow import csv
>
> data = b"a,b\n1,15-OCT-15\n2,18-JUN-90\n"
> tp = ["%d-%b-%y"]
>
> # Read column "b" as date64, passing the custom format via timestamp_parsers.
> try:
>     schema_d64 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date64())])
>     co_d64 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d64)
>     a_d64 = csv.read_csv(pa.py_buffer(data), convert_options=co_d64)
> except Exception as e:
>     print(e)
>
> # Same attempt with date32.
> try:
>     schema_d32 = pa.schema([pa.field("a", pa.int64()), pa.field("b", pa.date32())])
>     co_d32 = csv.ConvertOptions(timestamp_parsers=tp, column_types=schema_d32)
>     a_d32 = csv.read_csv(pa.py_buffer(data), convert_options=co_d32)
> except Exception as e:
>     print(e)
> {code}
>