Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-09-27 Thread Wes McKinney
To be clear, if someone wants to step up as the Plasma maintainer in Apache Arrow, that's completely fine -- that would be a good outcome. Many of us had already been concerned for a while about Plasma's maintenance status -- lots of stale PRs and low engagement on JIRA issues and mailing list

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-09-27 Thread Niklas B
We too rely heavily on Plasma (we use Ray as well, but also Plasma independent of Ray). I’ve started a thread on the Ray dev list to see if Ray's Plasma can be used standalone outside of Ray as well. That would allow those of us who use Plasma to move to a standalone “Ray Plasma” when/if it’s removed from

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-09-25 Thread Wes McKinney
I'd suggest as a preliminary that we stop packaging Plasma for 1-2 releases to see who is affected by the component's removal. Usage may be more widespread than we realize, and we don't have much telemetry to know for certain.

On Tue, Aug 18, 2020 at 1:26 PM Antoine Pitrou wrote:
>
> Also, the

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-18 Thread Antoine Pitrou
Also, the fact that Ray has forked Plasma means their implementation becomes potentially incompatible with Arrow's. So even if we keep Plasma in our codebase, we can't guarantee interoperability with Ray.

Regards
Antoine.

On 18/08/2020 at 19:51, Wes McKinney wrote:
> I do not think there

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-18 Thread Wes McKinney
I do not think there is an urgency to remove Plasma from the Arrow codebase (as it currently does not cause much maintenance burden), but the reality is that Ray has already hard-forked and so new maintainers will need to come out of the woodwork to help support the project if it is to continue

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-18 Thread Matthias Vallentin
We are very interested in Plasma as a stand-alone project. The fork would hit us doubly hard, because it reduces the appeal of both an Arrow-specific use case and our planned Ray integration. We are effectively developing a database for network activity data that runs with Arrow as data

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-17 Thread Robert Nishihara
To answer Wes's question, the Plasma inside of Ray is not currently usable in a C++ library context, though it wouldn't be impossible to make that happen. I (or someone) could conduct a simple poll via Google Forms on the user mailing list to gauge demand if we are concerned about breaking a lot

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-17 Thread Antoine Pitrou
On 15/08/2020 at 17:56, Wes McKinney wrote:
>
> What isn't clear is whether the Plasma that's in Ray is usable in a
> C++ library context (e.g. what we currently ship as libplasma-dev e.g.
> on Ubuntu/Debian). That seems still useful, but if the project isn't
> being actively maintained /

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-15 Thread Wes McKinney
On Fri, Aug 14, 2020 at 11:56 PM Micah Kornfield wrote:
>
> >
> > Regarding Plasma, you're right we should have started this conversation
> > earlier! The way it's being developed in Ray currently isn't useful as a
> > standalone project. We realized that tighter integration with Ray's object
> >

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-08-14 Thread Micah Kornfield
>
> Regarding Plasma, you're right we should have started this conversation
> earlier! The way it's being developed in Ray currently isn't useful as a
> standalone project. We realized that tighter integration with Ray's object
> lifetime tracking could be important, and removing IPCs and making

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-07-21 Thread Robert Nishihara
Hi all, Regarding Plasma, you're right we should have started this conversation earlier! The way it's being developed in Ray currently isn't useful as a standalone project. We realized that tighter integration with Ray's object lifetime tracking could be important, and removing IPCs and making it

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-07-12 Thread Wes McKinney
I'll add deprecation warnings to the pyarrow.serialize functions in question; it will be pretty simple.

On Sun, Jul 12, 2020, 6:34 PM Neal Richardson wrote:
> This seems like something to investigate after the 1.0 release.
>
> Neal
>
> On Sun, Jul 12, 2020 at 11:53 AM Antoine Pitrou
> wrote:
>
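
For readers unfamiliar with the pattern, a deprecation warning of the kind discussed in this thread can be sketched as follows. This is an illustration only, not the change that was made in pyarrow: `legacy_serialize` is a hypothetical stand-in for a deprecated entry point, and the warning text is an assumption. FutureWarning is used because, unlike DeprecationWarning, Python shows it to end users by default.

```python
import pickle
import warnings


def legacy_serialize(value):
    """Hypothetical stand-in for a deprecated serialization function."""
    # stacklevel=2 attributes the warning to the caller's line rather
    # than to this wrapper, so users can find the code they must change.
    warnings.warn(
        "legacy_serialize is deprecated; use pickle with protocol 5 instead",
        FutureWarning,
        stacklevel=2,
    )
    # Assumed replacement path for this sketch: the standard pickle module.
    return pickle.dumps(value, protocol=5)


# Record warnings so the example can show what a caller would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = legacy_serialize([1, 2, 3])

print(caught[0].category.__name__)
print(pickle.loads(data))
```

The function keeps working while it warns, which gives users at least one release to migrate before removal.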

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-07-12 Thread Neal Richardson
This seems like something to investigate after the 1.0 release.

Neal

On Sun, Jul 12, 2020 at 11:53 AM Antoine Pitrou wrote:
>
> I'd certainly like to deprecate our custom Python serialization format,
> and using pickle protocol 5 instead is a very good idea.
>
> We can probably keep it in 1.0

Re: [DISCUSS] Plasma appears to have been forked, consider deprecating pyarrow.serialization

2020-07-12 Thread Antoine Pitrou
I'd certainly like to deprecate our custom Python serialization format, and using pickle protocol 5 instead is a very good idea. We can probably keep it in 1.0 while raising a FutureWarning.

Regards
Antoine.

On 12/07/2020 at 19:22, Wes McKinney wrote:
> It appears that the Ray developers
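
As background on why pickle protocol 5 is a plausible replacement for a custom format: it lets large buffers travel out-of-band, so the pickle stream carries only metadata and the payload bytes are not copied into it. The sketch below uses only the standard library (Python 3.8 or later); the object layout and variable names are assumptions made for illustration and do not come from pyarrow or Ray.

```python
import pickle

# A large binary payload, standing in for e.g. a column of data.
payload = bytearray(b"x" * 1024)

# PickleBuffer marks the payload as eligible for out-of-band transfer.
obj = {"name": "example", "data": pickle.PickleBuffer(payload)}

# With buffer_callback set, each buffer is handed to the callback
# instead of being copied into the pickle stream.
buffers = []
meta = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)

# The receiver must supply the same buffers, in the same order.
restored = pickle.loads(meta, buffers=buffers)

# The restored entry exposes the original bytes via the buffer protocol.
view = memoryview(restored["data"])
print(len(meta) < len(payload))
print(bytes(view) == bytes(payload))
```

How the out-of-band buffers reach the receiver (shared memory, a socket, a file) is left to the application, which is what makes the protocol usable with an external object store.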