[ https://issues.apache.org/jira/browse/ARROW-11120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270940#comment-17270940 ]
Laurent commented on ARROW-11120:
---------------------------------

I looked briefly into it, and the issue might be caused by a combination of the API exposed by the R arrow package and R's performance when creating many R6 objects. The R constructor for ChunkedArray expects a list of Array objects. In my example, a ChunkedArray has ~2200 chunks. Getting R to build that many dummy Array objects (`arrow::Array$create(1)`) takes over half a second. Multiplying this by 18 (the number of columns in my tables) gives slightly over 10 seconds (almost half of the 24 seconds observed).

It feels like a pair of functions, `pyarrow.ChunkedArray._export_to_c()` and `arrow:::ImportChunkedArray()`, would be needed.

> [Python][R] Prove out plumbing to pass data between Python and R using rpy2
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-11120
>                 URL: https://issues.apache.org/jira/browse/ARROW-11120
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python, R
>            Reporter: Wes McKinney
>            Priority: Major
>
> Per discussion on the mailing list, we should see what is required (if
> anything) to be able to pass data structures using the C interface between
> Python and R from the perspective of the Python user using rpy2. rpy2 is sort
> of the Python version of reticulate. Unit tests will then validate that it's
> working.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
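
As a rough check of the estimate in the comment above, the arithmetic can be sketched in a few lines of Python. The chunk count, column count, and 24-second total come from the comment. The per-column timing is only stated as "over half a second", so 0.5 s is used here as an assumed lower bound, and the helper name is hypothetical.

```python
# Back-of-the-envelope estimate of the R6 wrapper-creation overhead.
# Figures are taken from the comment; SECONDS_PER_COLUMN is an assumed
# lower bound, since the comment only says "over half a second".

N_CHUNKS = 2200            # approximate chunks per ChunkedArray (one column)
N_COLUMNS = 18             # columns in the table
SECONDS_PER_COLUMN = 0.5   # assumed lower bound for building ~2200 Array objects
OBSERVED_TOTAL = 24.0      # seconds observed for the whole transfer


def wrapper_overhead_seconds(seconds_per_column: float, n_columns: int) -> float:
    """Time spent creating per-chunk Array wrappers across all columns."""
    return seconds_per_column * n_columns


total_objects = N_CHUNKS * N_COLUMNS
lower_bound = wrapper_overhead_seconds(SECONDS_PER_COLUMN, N_COLUMNS)
share_of_total = lower_bound / OBSERVED_TOTAL
per_object_ms = SECONDS_PER_COLUMN / N_CHUNKS * 1000

print(f"R6 objects created: about {total_objects}")
print(f"Overhead lower bound: {lower_bound:.1f} s")
print(f"Share of observed time: at least {share_of_total:.0%}")
print(f"Cost per object: at least {per_object_ms:.2f} ms")
```

With the 0.5 s lower bound, the overhead comes to at least 9 seconds. The "slightly over 10 seconds" quoted in the comment corresponds to a measured per-column time closer to 10 / 18, or about 0.56 s. Either way, the cost scales with the number of chunks times the number of columns, which is why a single import call per ChunkedArray looks attractive.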