[ https://issues.apache.org/jira/browse/ARROW-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17148072#comment-17148072 ]
Michael Quinn commented on ARROW-9235:
--------------------------------------

Yeah. It's fundamentally the same issue. They are both instances of R's "connection" class.

https://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html

Unfortunately, I don't understand arrow well enough to implement something like this.

> [R] Support for `connection` class when reading and writing files
> -----------------------------------------------------------------
>
>                 Key: ARROW-9235
>                 URL: https://issues.apache.org/jira/browse/ARROW-9235
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Michael Quinn
>            Priority: Major
>
> We have an internal filesystem that we interact with through objects that
> inherit from the connection class. These files aren't necessarily local,
> which makes it slightly more complicated to read and write parquet files,
> for example.
> For now, we're generating raw vectors and using those to create the file.
> For example, to read files:
> {noformat}
> ReadParquet <- function(filename, ...) {
>   # Open the connection in binary read mode and close it on exit.
>   file <- file(filename, "rb")
>   on.exit(close(file))
>   # Read the whole file into a raw vector, then let arrow parse it.
>   raw <- readBin(file, "raw", FileInfo(filename)$size)
>   return(arrow::read_parquet(raw, ...))
> }
> {noformat}
> And to write:
> {noformat}
> WriteParquet <- function(df, filepath, ...) {
>   # Serialize to an in-memory buffer and extract it as a raw vector.
>   stream <- BufferOutputStream$create()
>   write_parquet(df, stream, ...)
>   raw <- stream$finish()$data()
>   # Write the raw bytes out through the connection.
>   file <- file(filepath, "wb")
>   on.exit(close(file))
>   writeBin(raw, file)
>   return(invisible())
> }
> {noformat}
> At the C++ level, we are interacting with `R_new_custom_connection`, defined
> here:
> [https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h]
> I've been very impressed with how feature-rich arrow is. It would be nice to
> see this API supported as well.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)