[ https://issues.apache.org/jira/browse/ARROW-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489565#comment-17489565 ]
Dragoș Moldovan-Grünfeld commented on ARROW-13625:
--------------------------------------------------

[ARROW-14442|https://issues.apache.org/jira/browse/ARROW-14442] looks at timestamps without a timezone and assumes (just like R does) that they are in the local timezone. As far as I can tell from a quick skim of this conversation, the two issues are slightly different. I think the timestamp Neal is referring to contains the timezone as an offset from UTC (the {{%z}} in R's format), which doesn't seem to be recognised by arrow (which recognises only string timezones, R's {{%Z}}).

> [C++][CSV] Timestamp parsing should accept any valid ISO 8601 without
> requiring custom parse strings
> ----------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-13625
>                 URL: https://issues.apache.org/jira/browse/ARROW-13625
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Neal Richardson
>            Priority: Major
>             Fix For: 8.0.0
>
>
> I was trying to read in some git logs and got this parse error for a column I
> had declared as timestamp type:
> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]:
> invalid value '2021-08-11T17:39:50-04:00'
> This is valid ISO 8601 and is what git log produces with the {{I}} "strict
> ISO 8601 format" option (https://git-scm.com/docs/pretty-formats).
> I see mentioned on ARROW-10343 that timezone indicators are not supported--is
> that still true? And I recognize that it's not trivial because a timestamp
> array has to have the same timezone for all values, so if some rows in this
> CSV had different timezones listed, we would have to handle that (converting
> everything to UTC is probably the most useful thing but technically loses
> information).
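
To make the {{%z}} point and the "convert everything to UTC" idea concrete, here is a minimal sketch using only Python's standard library. It is not Arrow's parser and says nothing about how Arrow would implement this; the helper names {{parse_to_utc}} and {{normalize_column}} are hypothetical and exist only for illustration. It assumes every value carries a numeric UTC offset.

```python
from datetime import datetime, timezone

# %z matches a numeric UTC offset such as -04:00 (Python 3.7+ accepts the colon).
# %Z, by contrast, matches a timezone name and would not parse this value.
ISO_WITH_OFFSET = "%Y-%m-%dT%H:%M:%S%z"


def parse_to_utc(value: str) -> datetime:
    """Parse a timestamp with a numeric UTC offset and normalize it to UTC."""
    parsed = datetime.strptime(value, ISO_WITH_OFFSET)
    return parsed.astimezone(timezone.utc)


def normalize_column(values):
    """Bring a column of mixed-offset timestamps onto a single timezone (UTC)."""
    return [parse_to_utc(v) for v in values]


# The first value is the one from the error message; the second is a made-up
# row with a different offset that denotes the same instant.
rows = ["2021-08-11T17:39:50-04:00", "2021-08-11T23:39:50+02:00"]
utc_rows = normalize_column(rows)
print([t.isoformat() for t in utc_rows])
```

Both rows normalize to 2021-08-11T21:39:50+00:00, which also shows the information loss mentioned in the description: once the values are in UTC, the original offsets (-04:00 and +02:00) can no longer be recovered.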