[ 
https://issues.apache.org/jira/browse/ARROW-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358990#comment-16358990
 ] 

ASF GitHub Bot commented on ARROW-2121:
---------------------------------------

robertnishihara commented on issue #1581: ARROW-2121: [Python] Handle object 
arrays directly in pandas serializer.
URL: https://github.com/apache/arrow/pull/1581#issuecomment-364573786
 
 
   Some performance numbers. They vary somewhat from run to run, so treat the 
figures below as approximate.
   
   ```python
   import pyarrow as pa
   import pandas as pd
   # 10**6 columns of object dtype, each holding one int and one str.
   df = pd.DataFrame(data={str(i): [i, str(i)] for i in range(10 ** 6)})
   ```
   
   Before this PR
   
   ```python
   # %time and %timeit below are IPython magics; run this in IPython or Jupyter.
   context = pa.pandas_serialization_context()
   
   %time s = pa.serialize(df, context=context).to_buffer()  # 570ms
   %time d = pa.deserialize(s, context=context)  # 485ms
   
   %timeit s = pa.serialize(df, context=context).to_buffer()  # 482ms
   %timeit d = pa.deserialize(s, context=context)  # 376ms
   ```
   
   After this PR
   
   ```python
   %time s = pa.serialize(df).to_buffer()  # 577ms
   %time d = pa.deserialize(s)  # 672ms
   
   %timeit s = pa.serialize(df).to_buffer()  # 467ms
   %timeit d = pa.deserialize(s)  # 349ms
   ```
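
Since the timings above vary between runs, one way to make them easier to compare is to repeat each measurement and report the spread. The following is only a sketch using the standard library; `time_stats` is a hypothetical helper name, not part of pyarrow or of this PR, and the workload in the demo is a stand-in rather than the serialization call itself.

```python
import statistics
import timeit


def time_stats(fn, repeat=5, number=1):
    """Time a zero-argument callable and summarize the spread.

    `time_stats` is a hypothetical helper for illustration only.
    Each trial calls `fn` `number` times; results are seconds per call.
    """
    trials = timeit.repeat(fn, repeat=repeat, number=number)
    per_call = [t / number for t in trials]
    return {
        "min": min(per_call),
        "median": statistics.median(per_call),
        "max": max(per_call),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without pyarrow installed.
    payload = list(range(100_000))
    stats = time_stats(lambda: sorted(payload, reverse=True))
    print({name: f"{seconds * 1e3:.2f} ms" for name, seconds in stats.items()})
```

With a pyarrow version that still provides `pa.serialize`, the same helper could presumably be called as `time_stats(lambda: pa.serialize(df).to_buffer())`. Comparing the minimum and median from before and after the PR should be less sensitive to noise than a single `%time` run.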



> Consider special casing object arrays in pandas serializers.
> ------------------------------------------------------------
>
>                 Key: ARROW-2121
>                 URL: https://issues.apache.org/jira/browse/ARROW-2121
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Robert Nishihara
>            Priority: Major
>              Labels: pull-request-available
>




