[ 
https://issues.apache.org/jira/browse/ARROW-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185784#comment-17185784
 ] 

Frank Smith commented on ARROW-9561:
------------------------------------

Oh, nice, that's great news. I see this functionality was added to the C++ 
source code, but it doesn't seem to be available in the Python wheel I'm using:
{code:java}
(venv) frank@frank-devel:~/arrow/data$ python3
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow as pa
>>> from pyarrow import csv
>>> s = b"""t\n2018-11-13T17:11:10.777000"""
>>> convert_options = csv.ConvertOptions(column_types={'t': pa.timestamp('us')})
>>> table = csv.read_csv(pa.py_buffer(s), convert_options=convert_options)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/_csv.pyx", line 714, in pyarrow._csv.read_csv
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: In CSV column #0: CSV conversion error to timestamp[us]: invalid value '2018-11-13T17:11:10.777000'
>>>

{code}
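As a possible stopgap until a wheel with the fix is available, the value can be parsed in Python instead, since Python's own datetime.strptime accepts %f for fractional seconds. This is only a sketch using the standard library: the helper name to_epoch_micros is made up for illustration, and it assumes naive timestamps that always carry a fractional part and are meant as UTC.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def to_epoch_micros(text):
    """Parse 'YYYY-MM-DD[T ]HH:MM:SS.ffffff' into microseconds since the Unix epoch.

    Hypothetical helper: assumes a naive timestamp with a fractional part.
    """
    # Accept either the ISO 'T' separator or a plain space.
    sep = "T" if "T" in text else " "
    parsed = datetime.strptime(text, "%Y-%m-%d" + sep + "%H:%M:%S.%f")
    delta = parsed - EPOCH
    # Integer arithmetic only, so no precision is lost to floats.
    return (delta.days * 86400 + delta.seconds) * 1000000 + delta.microseconds


print(to_epoch_micros("2018-11-13T17:11:10.777000"))
```

The result is an integer count of microseconds, which matches the unit of timestamp[us]. Reading the CSV column as a string first and converting it this way is untested against pyarrow 1.0.1 here.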
These are the wheels I'm using:

{code:java}
(venv) frank@frank-devel:~/arrow/data$ pip3 install pyarrow
Collecting pyarrow
  Using cached https://files.pythonhosted.org/packages/3a/9b/887d1d03d3d43706dee3a71cdad9f9bbb8fe74fc93d8db5d663f5bf34e48/pyarrow-1.0.1-cp36-cp36m-manylinux1_x86_64.whl
Collecting numpy>=1.14 (from pyarrow)
  Using cached https://files.pythonhosted.org/packages/22/e7/4b2bdddb99f5f631d8c1de259897c2b7d65dcfcc1e0a6fd17a7f62923500/numpy-1.19.1-cp36-cp36m-manylinux1_x86_64.whl
Installing collected packages: numpy, pyarrow
Successfully installed numpy-1.19.1 pyarrow-1.0.1

{code}
Also, these are the Arrow libraries included in the wheel. I see the .100 
suffix in the names. Does that mean the wheel is using the (older) 1.0.0 
versions of the libs?
{code:java}
(venv) frank@frank-devel:~/arrow/data$ find venv/ -name '*.so.*'
venv/lib/python3.6/site-packages/pyarrow/libplasma.so.100
venv/lib/python3.6/site-packages/pyarrow/libarrow_flight.so.100
venv/lib/python3.6/site-packages/pyarrow/libarrow_python.so.100
venv/lib/python3.6/site-packages/pyarrow/libarrow_python_flight.so.100
venv/lib/python3.6/site-packages/pyarrow/libarrow_boost_regex.so.1.73.0
venv/lib/python3.6/site-packages/pyarrow/libarrow.so.100
venv/lib/python3.6/site-packages/pyarrow/libparquet.so.100
venv/lib/python3.6/site-packages/pyarrow/libarrow_dataset.so.100
{code}
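One way to read the suffix, sketched below, rests on an assumption that is not confirmed anywhere in this thread: that the number after .so. encodes major * 100 + minor. If that holds, .100 would only say "1.0.x" and could not distinguish 1.0.0 from 1.0.1. The function name decode_so_suffix is hypothetical.

```python
import re


def decode_so_suffix(filename):
    """Guess (major, minor) from a name such as 'libarrow.so.100'.

    Assumption: the suffix is major * 100 + minor. Returns None when the
    name does not end in a single integer suffix (e.g. the bundled boost lib).
    """
    match = re.search(r"\.so\.(\d+)$", filename)
    if match is None:
        return None
    return divmod(int(match.group(1)), 100)


for name in ("libarrow.so.100", "libarrow_boost_regex.so.1.73.0"):
    print(name, decode_so_suffix(name))
```

Under that assumption the patch level is not part of the suffix at all, so the file names alone would not show whether the wheel bundles 1.0.0 or 1.0.1 libraries.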

> [C++] CSV parse fractional seconds in timestamps
> ------------------------------------------------
>
>                 Key: ARROW-9561
>                 URL: https://issues.apache.org/jira/browse/ARROW-9561
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Frank Smith
>            Priority: Minor
>             Fix For: 2.0.0
>
>
> It would be great to be able to parse fractional seconds from timestamps in 
> CSV files, e.g. 2017-06-26 16:58:20.651901.
> strptime does not have a format specifier for fractional seconds, and the 
> built-in ISO8601 parser does not parse fractional seconds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
