[ https://issues.apache.org/jira/browse/ARROW-15599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488436#comment-17488436 ]
Nicola Crane commented on ARROW-15599: -------------------------------------- Here's a reprex with more verbose output. {code:r} library(arrow) #> #> Attaching package: 'arrow' #> The following object is masked from 'package:utils': #> #> timestamp tf <- tempfile() write.csv(data.frame(x = '2018-10-07 19:04:05.005'), tf, row.names = FALSE) # successfully read in file read_csv_arrow(tf, as_data_frame = FALSE) #> Table #> 1 rows x 1 columns #> $x <timestamp[ns]> # successfully read in with col_names and col_types specified read_csv_arrow( tf, col_names = "x", col_types = "?", skip = 1, as_data_frame = FALSE ) #> Table #> 1 rows x 1 columns #> $x <timestamp[ns]> read_csv_arrow( tf, col_names = "x", col_types = "T", skip = 1, as_data_frame = FALSE ) #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: invalid value '2018-10-07 19:04:05.005' #> /home/nic2/arrow/cpp/src/arrow/csv/converter.cc:550 decoder_.Decode(data, size, quoted, &value) #> /home/nic2/arrow/cpp/src/arrow/csv/parser.h:123 status #> /home/nic2/arrow/cpp/src/arrow/csv/converter.cc:554 parser.VisitColumn(col_index, visit) read_csv_arrow( tf, col_names = "x", col_types = "T", as_data_frame = FALSE, skip = 1, timestamp_parsers = "%Y-%m-%d %H:%M:%S" ) #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: invalid value '2018-10-07 19:04:05.005' #> /home/nic2/arrow/cpp/src/arrow/csv/converter.cc:550 decoder_.Decode(data, size, quoted, &value) #> /home/nic2/arrow/cpp/src/arrow/csv/parser.h:123 status #> /home/nic2/arrow/cpp/src/arrow/csv/converter.cc:554 parser.VisitColumn(col_index, visit) {code} > [R] can't explicitly convert a column as a sub-seconds typestamp from CSV (or > other delimited) file > --------------------------------------------------------------------------------------------------- > > Key: ARROW-15599 > URL: https://issues.apache.org/jira/browse/ARROW-15599 > Project: Apache Arrow > Issue Type: Bug > Affects Versions: 6.0.1 > Environment: R version 4.1.2 (2021-11-01) > Platform: x86_64-pc-linux-gnu (64-bit) > Running under: Ubuntu 20.04.3 LTS > Reporter: SHIMA Tatsuya > Priority: Major > > I tried to read the csv column type as timestamp, but I could only get it to > work well when `col_types` was not specified. > I'm sorry if I missed something and this is the expected behavior. (It would > be great if you could add an example with `col_types` in the documentation.) > {code:r} > library(arrow) > #> > #> Attaching package: 'arrow' > #> The following object is masked from 'package:utils': > #> > #> timestamp > t_string <- tibble::tibble( > x = "2018-10-07 19:04:05.005" > ) > write_csv_arrow(t_string, "tmp.csv") > read_csv_arrow( > "tmp.csv", > as_data_frame = FALSE > ) > #> Table > #> 1 rows x 1 columns > #> $x <timestamp[ns]> > read_csv_arrow( > "tmp.csv", > col_names = "x", > col_types = "?", > skip = 1, > as_data_frame = FALSE > ) > #> Table > #> 1 rows x 1 columns > #> $x <timestamp[ns]> > read_csv_arrow( > "tmp.csv", > col_names = "x", > col_types = "T", > skip = 1, > as_data_frame = FALSE > ) > #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: > invalid value '2018-10-07 19:04:05.005' > read_csv_arrow( > "tmp.csv", > col_names = "x", > col_types = "T", > as_data_frame = FALSE, > skip = 1, > timestamp_parsers = "%Y-%m-%d %H:%M:%S" > ) > #> Error: Invalid: In CSV column #0: CSV conversion error to timestamp[s]: > invalid value '2018-10-07 19:04:05.005' > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)