fwiw I've seen some projects use hangouts/webex pretty effectively.

Patrick

On Wed, Mar 30, 2016 at 11:15 PM, Wang, Yanping <[email protected]> wrote:
> Yeah, I was so busy and in a hurry to catch other sessions. We only talked
> for about 2 minutes :-)
> After Jacques and Wes's Arrow presentation, someone in the audience asked if
> Arrow is going to use RDMA. I answered: RDMA is going to be used in the
> Mnemonic project to support data transfer among nodes and clusters.
> It makes perfect sense that we position Mnemonic under Arrow to support its
> use of persistent storage media.
>
> Thanks Patrick, Henry, Taylor G for the guidelines. We can brainstorm ideas in
> both dev lists, and post those ideas in jira so developers can see where our
> projects are heading.
> Gary and I are located in Portland, Oregon; we usually plan our SC visits 2
> weeks ahead.
>
> Thanks,
> Yanping
>
>
> -----Original Message-----
> From: Jacques Nadeau [mailto:[email protected]]
> Sent: Wednesday, March 30, 2016 7:34 PM
> To: [email protected]
> Cc: [email protected]; [email protected]
> Subject: Re: A Proposal Apache Incubator Mnemonic as an alternative infra. for Apache Arrow
>
> Yup. Will do.
>
> The discussion today was limited to "let's meet".
>
>
> On Wed, Mar 30, 2016 at 7:13 PM, P. Taylor Goetz <[email protected]> wrote:
>
>> +1
>>
>> Discussions should be summarized and brought back to the mailing list(s).
>> Recommendations are fine, but any decisions should be made on-list.
>>
>> -Taylor
>>
>> > On Mar 30, 2016, at 8:31 PM, Patrick Hunt <[email protected]> wrote:
>> >
>> > Remember that no decisions should be made at the meeting. It's fine to
>> > have discussions, but those need to be brought back to the community
>> > before decisions are made. Summarizing for the dev@ mailing list, also
>> > jiras, etc... are good ways to socialize the issues.
>> >
>> > Patrick
>> >
>> >> On Wed, Mar 30, 2016 at 5:17 PM, Henry Saputra <[email protected]> wrote:
>> >> The communities for both podlings are bigger than the ones who show up
>> >> at Strata =)
>> >>
>> >> Would love to have the summary of the discussions in the dev@ list if
>> >> indeed some discussions are happening at Strata.
>> >>
>> >> - Henry
>> >>
>> >> On Wed, Mar 30, 2016 at 5:03 PM, Wang, Yanping <[email protected]> wrote:
>> >>
>> >>> Hi, All
>> >>>
>> >>> I met with Jacques today at Strata. We think it would be great if the
>> >>> Arrow and Mnemonic communities could have a F2F meeting together to
>> >>> talk about our integration.
>> >>> I have the following two days: 4/11 Monday afternoon, or 4/15 Friday.
>> >>> We can meet at the Intel SC campus.
>> >>>
>> >>> Would you let me know if you are able to join us and which day you'd
>> >>> prefer?
>> >>>
>> >>> Thanks
>> >>> Yanping
>> >>>
>> >>>
>> >>> On Mar 29, 2016, at 4:38 PM, Gary <[email protected]<mailto:[email protected]>> wrote:
>> >>>
>> >>> Yes, I agree with you, and it would be great if we could brainstorm
>> >>> here to collect more ideas about enabling non-volatile memory usage for
>> >>> Apache Arrow through Mnemonic.
>> >>>
>> >>> For the questions, my ideas are:
>> >>>
>> >>>
>> >>> - Right now you are using unpooled persistent memory. Does that make sense
>> >>> or does chunking make more sense?
>> >>>
>> >>> Gary: I think it could make some sense if developers know that their
>> >>> datasets are very big and want Apache Arrow to keep most of them in
>> >>> memory for intensive computing, e.g. sort.
>> >>> Developers can certainly spill their Mnemonic-managed datasets to disk,
>> >>> but this seems a bit inefficient in some scenarios, depending on the
>> >>> concrete application logic.
>> >>>
>> >>>
>> >>> - What do you think is the right way to transition back and forth between
>> >>> persistent and ephemeral memory?
>> >>> What do you think will be the first
>> >>> pattern to be adopted? For example, do you think we should try to use it as
>> >>> a tiered storage for sort spilling (before hitting the disk), or should we
>> >>> use it for caching?
>> >>> Gary: my 2 cents, the netty library does not yet seem to provide an
>> >>> elegant switch mechanism for Arrow to use. We could probably change the
>> >>> logic around "initialCapacity > directArena.chunkSize" to control which
>> >>> buffers are put off-heap and which are managed by Mnemonic. Another
>> >>> approach is to let the memory clustering mechanism of Mnemonic manage
>> >>> hybrid memory-like spaces instead of parts of the logic of class
>> >>> PooledByteBufAllocatorL.
>> >>> Regarding the sorting, I think it is a typical case of random access to
>> >>> the data, so we should avoid spilling as much as possible.
>> >>> my 2 cents, the performance could be
>> >>> all in off-heap if possible > mnemonic used as cache > all in mnemonic
>> >>> using NVMe/disk > off-heap + spilling
>> >>> the code simplicity would be
>> >>> all in off-heap if possible > all in mnemonic using NVMe/disk > mnemonic
>> >>> used as cache > off-heap + spilling
>> >>>
>> >>> The reason why the mode "mnemonic used as cache + spilling" is probably
>> >>> unnecessary is that mnemonic could provide nearly the same capacity as
>> >>> disk.
>> >>>
>> >>> Thanks.
>> >>> Gary.
>> >>>
>> >>>
>> >>> -----Original Message-----
>> >>> From: Jacques Nadeau [mailto:[email protected]]
>> >>> Sent: Tuesday, March 29, 2016 8:05 AM
>> >>> To: <mailto:[email protected]> [email protected]<mailto:[email protected]>
>> >>> Subject: Re: A Proposal Apache Incubator Mnemonic as an alternative infra. for Apache Arrow
>> >>>
>> >>> This is super cool. A couple of questions:
>> >>>
>> >>> - Right now you are using unpooled persistent memory. Does that make sense
>> >>> or does chunking make more sense?
>> >>>
>> >>> - What do you think is the right way to transition back and forth between
>> >>> persistent and ephemeral memory? What do you think will be the first
>> >>> pattern to be adopted? For example, do you think we should try to use it as
>> >>> a tiered storage for sort spilling (before hitting the disk), or should we
>> >>> use it for caching?
>> >>>
>> >>> I think it will be much easier to think about this in the context of a
>> >>> primary or first use case. Do you have something in mind or should we
>> >>> brainstorm here?
>> >>>
>> >>> On Wed, Mar 23, 2016 at 7:16 PM, Gary <[email protected]<mailto:[email protected]>> wrote:
>> >>>
>> >>>> Hello,
>> >>>>
>> >>>> We have created a patch for Apache Arrow to leverage Apache
>> >>>> Incubator Mnemonic as an alternative infra. for underlying memory
>> >>>> resource allocation. You can find it in the forked repo below.
>> >>>>
>> >>>> https://github.com/NonVolatileComputing/arrow
>> >>>>
>> >>>> This way, Apache Arrow could gain some structural benefits from
>> >>>> the Mnemonic project. They are:
>> >>>>
>> >>>> - Arrow is able to leverage the larger capacity of high-performance
>> >>>> hybrid storage devices, e.g.
>> >>>> high-end SSD, NVMe
>> >>>>
>> >>>> - Mnemonic provides a potential opportunity for Arrow to
>> >>>> optimize/tune its allocation algorithms as a native Arrow-oriented
>> >>>> allocation service
>> >>>>
>> >>>> - The non-volatile features of Mnemonic make it possible for
>> >>>> Arrow to share its columnar in-memory data between different
>> >>>> applications or across the life-cycle of a single application
>> >>>>
>> >>>> - Arrow could take advantage of upcoming Mnemonic features of
>> >>>> memory clustering/DOG (distributed object graph) and massive native
>> >>>> computing
>> >>>>
>> >>>> - Mnemonic helps to reduce the pressure of main memory utilization
>> >>>> and its related system-wide overheads.
>> >>>>
>> >>>> This patch is designed to minimize the changes users need to make to
>> >>>> use Arrow. Please check out the test cases provided by this patch for
>> >>>> your reference.
>> >>>>
>> >>>> Note that we need to put the allocator services in a specified
>> >>>> position (indicated by pom.xml) for the Mnemonic-backed Arrow-related
>> >>>> test cases to run, because those services are required for external
>> >>>> memory-like device management.
>> >>>>
>> >>>> Please give your comments and review feedback for better
>> >>>> collaboration of Apache Arrow and Mnemonic. Thanks.
>> >>>>
>> >>>> Best Regards.
>> >>>> Gary.
