[ 
https://issues.apache.org/jira/browse/ARROW-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2249:
--------------------------------
    Fix Version/s:     (was: 0.12.0)
                   0.13.0

> [Java/Python] in-process vector sharing from Java to Python
> -----------------------------------------------------------
>
>                 Key: ARROW-2249
>                 URL: https://issues.apache.org/jira/browse/ARROW-2249
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java, Python
>            Reporter: Uwe L. Korn
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: beginner
>             Fix For: 0.13.0
>
>
> Currently, all applications of Arrow seem to use the IPC capabilities to 
> move data between a Java process and a Python process. While this requires 
> no serialization, it is not zero-copy. By taking the address and offset, we 
> can already create Python buffers from Java buffers: 
> https://github.com/apache/arrow/pull/1693. This is still a very low-level 
> interface and we should provide the user with:
> * A guide on how to load the Apache Arrow Java libraries in Python (either 
> through a fat-jar that is shipped with Arrow or by integrating them into 
> the user's own Java packaging)
> * {{pyarrow.Array.from_jvm}}, {{pyarrow.RecordBatch.from_jvm}}, … functions 
> that take the respective Java objects and emit Python objects. These Python 
> objects should also ensure that the underlying memory regions are kept alive 
> as long as the Python objects exist.
> This issue can also be used as a tracker for the various sub-tasks that will 
> need to be done to complete this rather large milestone.
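
The two requirements in the description (wrap existing memory by address without
copying, and keep the owning object alive as long as the Python object exists)
can be illustrated with a minimal sketch. It uses only the standard-library
ctypes module as a stand-in: ForeignView is a hypothetical name, the "owner"
array merely plays the role of a buffer allocated on the Java side, and none of
this is the pyarrow API or the code from the pull request linked above.

```python
import ctypes


class ForeignView:
    """Hypothetical zero-copy view over memory owned by another object."""

    def __init__(self, address, size, base):
        # Hold a reference to the owner so its memory is not freed
        # while this view exists.
        self.base = base
        # Map the existing memory in place; from_address does not copy.
        self.data = (ctypes.c_ubyte * size).from_address(address)


# Stand-in for a buffer allocated on the Java side (assumption for the demo).
owner = (ctypes.c_ubyte * 4)(1, 2, 3, 4)
view = ForeignView(ctypes.addressof(owner), len(owner), base=owner)

# A write through the owner is visible through the view: same memory.
owner[0] = 42
print(view.data[0])
```

Storing the owner on the view is the simplest way to tie the two lifetimes
together; a real implementation would hold a reference to the Java object
instead of a ctypes array.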



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
