[ https://issues.apache.org/jira/browse/ARROW-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Neal Richardson resolved ARROW-9235.
------------------------------------
    Resolution: Fixed

Issue resolved by pull request 12323
[https://github.com/apache/arrow/pull/12323]

> [R] Support for `connection` class when reading and writing files
> -----------------------------------------------------------------
>
>                 Key: ARROW-9235
>                 URL: https://issues.apache.org/jira/browse/ARROW-9235
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: R
>            Reporter: Michael Quinn
>            Assignee: Dewey Dunnington
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 8.0.0
>
>          Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> We have an internal filesystem that we interact with through objects that
> inherit from the connection class. These files aren't necessarily local,
> making it slightly more complicated to read and write parquet files, for
> example.
> For now, we're generating raw vectors and using those to create the file.
> For example, to read files:
> {noformat}
> ReadParquet <- function(filename, ...) {
>   file <- file(filename, "rb")
>   on.exit(close(file))
>   raw <- readBin(file, "raw", FileInfo(filename)$size)
>   return(arrow::read_parquet(raw, ...))
> }
> {noformat}
> And to write:
> {noformat}
> WriteParquet <- function(df, filepath, ...) {
>   stream <- BufferOutputStream$create()
>   write_parquet(df, stream, ...)
>   raw <- stream$finish()$data()
>   file <- file(filepath, "wb")
>   on.exit(close(file))
>   writeBin(raw, file)
>   return(invisible())
> }
> {noformat}
> At the C++ level, we are interacting with `R_new_custom_connection`, defined
> here:
> [https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h]
> I've been very impressed with how feature-rich arrow is. It would be nice to
> see this API supported as well.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)