Re: Apache Arrow at JupyterCon

Gang(Gary) Wang Thu, 31 Aug 2017 14:09:45 -0700

Hi Wes,

Thank you for the explanation. The use case described in
https://issues.apache.org/jira/browse/ARROW-721 could be directly supported
by Mnemonic through DurableBuffer and DurableChunk. DurableChunk makes use
of unsafe to expose a plain memory space that Arrow can use without
performance penalties, which is why most big data frameworks take advantage
of unsafe. Please refer to
https://mnemonic.apache.org/docs/domusecases.html for the use cases. We
could work on this ticket if you think that's exactly what you want.
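
To make the zero-copy idea concrete, here is a minimal sketch that uses only
the standard java.nio memory-mapping API. It does not use Mnemonic's
DurableBuffer or DurableChunk, nor Arrow's Java API; the class name
MappedColumnSketch, its methods, and the flat little-endian int layout are
all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration: a flat column of ints stored in a
// memory-mapped file and read back without copying it onto the Java heap.
public class MappedColumnSketch {

    // Map the file read-write and store the values directly in the mapping.
    static void writeInts(Path file, int[] values) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            long size = (long) values.length * Integer.BYTES;
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
            buf.order(ByteOrder.LITTLE_ENDIAN);
            for (int v : values) {
                buf.putInt(v);
            }
            buf.force(); // ask the OS to flush the mapped pages to the file
        }
    }

    // Map the file read-only and aggregate straight from the mapped pages.
    static long sumInts(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.order(ByteOrder.LITTLE_ENDIAN);
            long sum = 0;
            while (buf.remaining() >= Integer.BYTES) {
                sum += buf.getInt();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("column", ".bin");
        writeInts(file, new int[] {1, 2, 3, 4});
        System.out.println(sumInts(file));
        Files.deleteIfExists(file);
    }
}
```

A second process that maps the same file sees the same bytes, which is the
property the shared-memory and memory-mapped use case depends on. A real
integration would have to follow Arrow's own buffer layout and metadata
rather than this ad hoc format.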


Regarding NVM technology, that is what Mnemonic was created for. It can be
used to directly persist generic Java objects and collections on NVM with no
SerDe. Which basic tools did you have in mind? We can probably also help
identify the gaps in Mnemonic. Thanks!

Very truly yours,
Gary


On Thu, Aug 31, 2017 at 12:32 PM, Wes McKinney <wesmck...@gmail.com> wrote:

> hi Gary,
>
> The Java libraries are not yet capable of writing or zero-copy reads
> of Arrow datasets to/from shared memory or memory-mapped files:
> https://issues.apache.org/jira/browse/ARROW-721. We've developed quite
> a bit of technology on the C++ side for dealing with shared memory IPC
> but we need someone to help with that on the Java side.
>
> In the context of NVM technologies, it would be nice to be able to
> persist a dataset to NVM and continue to do analytics on it, while
> retaining a "handle" so that the dataset can be easily recovered in
> the event of process failure. We may arrive at new use cases once some
> of the basic tools exist.
>
> - Wes
>
> On Wed, Aug 30, 2017 at 6:19 PM, Gang(Gary) Wang <ga...@apache.org> wrote:
> > Thank you for sharing the videos. We are very interested in supporting
> > the Arrow data format and collections closely. Could you please point
> > out which interfaces would allow Mnemonic to act as a memory provider
> > for users to store and access Arrow-managed datasets? Thanks!
> >
> > Very truly yours,
> > Gary.
> >
> >
> > On Wed, Aug 30, 2017 at 2:11 PM, Ivan Sadikov <ivan.sadi...@gmail.com>
> > wrote:
> >
> >> Great presentation! Thank you for sharing.
> >>
> >>
> >> On Thu, 31 Aug 2017 at 8:02 AM, Wes McKinney <wesmck...@gmail.com> wrote:
> >>
> >> > Absolutely. I will do that now
> >> >
> >> > On Wed, Aug 30, 2017 at 3:33 PM, Julian Hyde <jh...@apache.org> wrote:
> >> > > Thanks for sharing. Can we tweet those videos as well? I see that
> >> > > https://twitter.com/apachearrow only tweeted your slides.
> >> > >
> >> > >> On Aug 26, 2017, at 1:11 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> >> > >>
> >> > >> hi all,
> >> > >>
> >> > >> In case folks here are interested, I gave a keynote this week at
> >> > >> JupyterCon explaining my motivations for being involved in Apache
> >> > >> Arrow and how I see it fitting in with the data science ecosystem
> >> > >> long term:
> >> > >>
> >> > >> https://www.youtube.com/watch?v=wdmf1msbtVs
> >> > >>
> >> > >> I also gave an interview going a little deeper into some of the
> >> > >> topics from the talk:
> >> > >>
> >> > >> https://www.youtube.com/watch?v=Q7y9l-L8yiU
> >> > >>
> >> > >> I believe we have an exciting journey ahead of us, but it's
> >> > >> certainly going to take a lot of collaboration and community
> >> > >> development.
> >> > >>
> >> > >> - Wes
> >> > >
> >> >
> >>
>
