[jira] [Assigned] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2022-07-12 Thread Todd Farmer (Jira)


 [ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Farmer reassigned ARROW-12976:
---

Assignee: (was: Micah Kornfield)

This issue was last updated over 90 days ago, which may indicate that it is
no longer being actively worked on. To better reflect its current state, the
issue is being unassigned. Please feel free to reassign the issue to yourself
if you are actively working on it, or if you plan to start work soon.

> [Python] Arrow-to-Python conversion is slow
> ---
>
> Key: ARROW-12976
> URL: https://issues.apache.org/jira/browse/ARROW-12976
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Antoine Pitrou
> Priority: Major
>
> It seems that we are roughly 20x slower than NumPy at converting the exact
> same data to a Python list.
> With integers:
> {code:python}
> >>> import numpy as np
> >>> import pyarrow as pa
> >>> arr = np.arange(0, 1000, dtype=np.int64)
> >>> %timeit arr.tolist()
> 8.24 µs ± 3.46 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)
> >>> parr = pa.array(arr)
> >>> %timeit parr.to_pylist()
> 218 µs ± 2.39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
> {code}
> With floats:
> {code:python}
> >>> arr = np.arange(0, 1000, dtype=np.float64)
> >>> %timeit arr.tolist()
> 10.2 µs ± 25.5 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)
> >>> parr = pa.array(arr)
> >>> %timeit parr.to_pylist()
> 199 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 1 loops each)
> {code}
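
For anyone wanting to re-run the comparison outside IPython, here is a minimal sketch using the standard `timeit` module. The helper names (`best_time_us`, `slowdown`, `compare_to_list`) are made up for illustration and are not part of NumPy or PyArrow. Note that it reports the best of several runs, whereas `%timeit` above reports mean ± std. dev., so the numbers will not match exactly. Applied to the timings quoted in the issue, the ratio works out to about 26.5x for integers and about 19.5x for floats.

```python
import timeit


def best_time_us(fn, number=1000, repeat=7):
    """Return the best per-call time of fn() in microseconds.

    Hypothetical helper: takes the minimum over `repeat` runs of
    `number` calls each (unlike %timeit, which reports the mean).
    """
    totals = timeit.Timer(fn).repeat(repeat=repeat, number=number)
    return min(totals) / number * 1e6


def slowdown(baseline_us, candidate_us):
    """Return how many times slower the candidate is than the baseline."""
    return candidate_us / baseline_us


def compare_to_list(dtype="int64", n=1000):
    """Time ndarray.tolist() against pyarrow.Array.to_pylist() on the same data."""
    # Imported lazily so the helpers above work without these packages.
    import numpy as np
    import pyarrow as pa

    arr = np.arange(0, n, dtype=dtype)
    parr = pa.array(arr)
    # Sanity check: both conversions should produce the same Python list.
    assert arr.tolist() == parr.to_pylist()

    numpy_us = best_time_us(arr.tolist)
    arrow_us = best_time_us(parr.to_pylist)
    return numpy_us, arrow_us, slowdown(numpy_us, arrow_us)


if __name__ == "__main__":
    # Ratios implied by the timings quoted in the issue description.
    print("reported int64 slowdown:  ", round(slowdown(8.24, 218), 1))
    print("reported float64 slowdown:", round(slowdown(10.2, 199), 1))

    try:
        for dtype in ("int64", "float64"):
            numpy_us, arrow_us, ratio = compare_to_list(dtype)
            print(f"{dtype}: numpy {numpy_us:.1f} us, "
                  f"arrow {arrow_us:.1f} us, ratio {ratio:.1f}x")
    except ImportError as exc:
        print("numpy and pyarrow are needed for the live comparison:", exc)
```

The measured ratio will depend on the PyArrow version, the machine, and the array length, so treat any single run as indicative only.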



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (ARROW-12976) [Python] Arrow-to-Python conversion is slow

2021-10-09 Thread Micah Kornfield (Jira)


 [ https://issues.apache.org/jira/browse/ARROW-12976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Micah Kornfield reassigned ARROW-12976:
---

Assignee: Micah Kornfield

> [Python] Arrow-to-Python conversion is slow
> ---
>
> Key: ARROW-12976
> URL: https://issues.apache.org/jira/browse/ARROW-12976
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Antoine Pitrou
> Assignee: Micah Kornfield
> Priority: Major


