You're welcome!
On Wed, May 16, 2018 at 6:13 PM Corey Nolet wrote:
> I must say, I’m super excited about using Arrow and Plasma.
>
> The code you just posted worked for me at home and I’m sure I’ll figure
> out what I was doing wrong tomorrow at work.
>
> Anyways, thanks so
I must say, I’m super excited about using Arrow and Plasma.
The code you just posted worked for me at home and I’m sure I’ll figure out
what I was doing wrong tomorrow at work.
Anyways, thanks so much for your help and fast replies!
> On May 16, 2018, at 7:42 PM, Robert
You should be able to do something like the following.
# Start the store.
plasma_store -s /tmp/store -m 10
Then in Python, do the following:
import pandas as pd
import pyarrow.plasma as plasma
import numpy as np
client = plasma.connect('/tmp/store', '', 0)
series =
Robert,
Thank you for the quick response. I've been playing around for a few hours
to get a feel for how this works.
If I understand correctly, it's better to have the Plasma client objects
instantiated within each separate process? Weird things seemed to happen
when I attempted to share a
Take a look at the Plasma object store
https://arrow.apache.org/docs/python/plasma.html.
Here's an example that uses it, along with multiprocessing, to sort a pandas
DataFrame:
https://github.com/apache/arrow/blob/master/python/examples/plasma/sorting/sort_df.py
It's possible the example is a bit out
I've been reading through the PyArrow documentation and trying to
understand how to use the tool effectively for IPC (using zero-copy).
I'm on a system with 586 cores and 1 TB of RAM. I'm using pandas DataFrames
to process several tens of gigabytes of data in memory, and the pickling that is
done by
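For readers following the thread, here is a rough sketch of the underlying idea of sharing data between processes without pickling the payload. It is not Plasma or Arrow: it substitutes the standard library's multiprocessing.shared_memory as a stand-in, and the helper names (create_block, double_in_place, read_block, run_demo) are hypothetical. Only the block's name is passed to the worker; the values themselves stay in the shared buffer.

```python
import struct
from multiprocessing import Process, shared_memory


def create_block(values):
    """Allocate a shared-memory block and store the values as C doubles."""
    shm = shared_memory.SharedMemory(create=True, size=8 * len(values))
    struct.pack_into(f"{len(values)}d", shm.buf, 0, *values)
    return shm


def double_in_place(name, count):
    """Attach to an existing block by name and double each stored value.

    Only the short name string crosses the process boundary; the payload
    itself is never pickled.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        values = struct.unpack_from(f"{count}d", shm.buf, 0)
        struct.pack_into(f"{count}d", shm.buf, 0, *(2.0 * v for v in values))
    finally:
        shm.close()  # detach this handle; the block itself stays alive


def read_block(shm, count):
    """Read the doubles currently stored in the block."""
    return list(struct.unpack_from(f"{count}d", shm.buf, 0))


def run_demo():
    """Create a block, let a worker process modify it, and read the result."""
    shm = create_block([1.0, 2.0, 3.0])
    try:
        worker = Process(target=double_in_place, args=(shm.name, 3))
        worker.start()
        worker.join()
        return read_block(shm, 3)
    finally:
        shm.close()
        shm.unlink()  # free the block once no process needs it
```

To try it, call run_demo() from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Note that each process opens its own handle to the block rather than inheriting one, which mirrors the per-process client question raised earlier in the thread.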