[ 
https://issues.apache.org/jira/browse/ARROW-9186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461663#comment-17461663
 ] 

Dewey Dunnington commented on ARROW-9186:
-----------------------------------------

It doesn't look like this is implemented as part of the [C++ 
ReadOptions|https://github.com/apache/arrow/blob/master/cpp/src/arrow/csv/options.h#L138-L171];
instead, Python uses the ReadOptions class to carry this information (with a 
Python-only {{.encoding}} attribute). We could (and should) do this, too.

Judging by the Python PR, we'd have to provide our own {{iconv}} and 
wrap a {{TransformInputStream}}. R provides an {{iconv}} at the C level, so we 
shouldn't need to call back into R for this. I [wrote about this a while 
ago|https://fishandwhistle.net/post/2021/using-rs-cross-platform-iconv-wrapper-from-cpp11/]
and I think there's another example in either readr or vroom where this bit of 
code is wrapped up in a helper class.
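
As a rough illustration of the transcoding step, here is a minimal sketch that converts a complete buffer to UTF-8. It uses the POSIX {{iconv}} API as a stand-in, on the assumption that R's C-level wrappers ({{Riconv_open}}, {{Riconv}}, {{Riconv_close}} from {{R_ext/Riconv.h}}) can be swapped in with the same call pattern. The function name {{TranscodeToUtf8}} is hypothetical and not part of Arrow or R.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not an Arrow or R API): transcode a complete buffer
// from `from_encoding` to UTF-8 using POSIX iconv.
std::string TranscodeToUtf8(const std::string& input, const char* from_encoding) {
  // Note the argument order: destination encoding first, then source.
  iconv_t cd = iconv_open("UTF-8", from_encoding);
  if (cd == reinterpret_cast<iconv_t>(-1)) {
    throw std::runtime_error("iconv_open failed: unsupported encoding");
  }

  std::string output;
  char buffer[4096];
  // iconv takes a non-const char** even though it does not modify the input.
  char* in_ptr = const_cast<char*>(input.data());
  std::size_t in_left = input.size();

  while (in_left > 0) {
    char* out_ptr = buffer;
    std::size_t out_left = sizeof(buffer);
    std::size_t result = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    int err = errno;  // save before any other call can overwrite it
    output.append(buffer, sizeof(buffer) - out_left);
    // E2BIG only means the scratch buffer filled up, so loop again.
    // Anything else (EILSEQ, EINVAL) is invalid or truncated input.
    if (result == static_cast<std::size_t>(-1) && err != E2BIG) {
      iconv_close(cd);
      throw std::runtime_error("iconv failed: invalid or incomplete input");
    }
  }
  // Stateful encodings would also need a final flush call with a null input
  // pointer; that is omitted here for brevity.
  iconv_close(cd);
  return output;
}
```

For example, {{TranscodeToUtf8("caf\xE9", "ISO-8859-1")}} yields the UTF-8 bytes {{caf\xC3\xA9}}. A version used as a stream transform would process the input chunk by chunk, so it would have to carry any trailing incomplete multibyte sequence over to the next chunk rather than treating it as an error.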

Where Python creates its wrapper around {{TransformInputStream}}: 
https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/io.cc#L340-L371

Where R creates the {{CsvReadOptions}}: 
https://github.com/apache/arrow/blob/master/r/R/csv.R#L402-L416

> [R] Allow specifying CSV file encoding
> --------------------------------------
>
>                 Key: ARROW-9186
>                 URL: https://issues.apache.org/jira/browse/ARROW-9186
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>            Reporter: Neal Richardson
>            Priority: Major
>             Fix For: 7.0.0
>
>
> ARROW-9106 did this for Python and we should have the same in R


