Josh,

That makes sense. Thanks for the help; let me know when you hear back from
general@incubator and how we can best proceed. I'll keep working on the things
Chris mentioned on my end.

R/
Eric

>When we say 'contrib', we don't actually mean the 'contrib' directory in
>the core of Accumulo. Look at the first four repos on [1] (bsp, pig,
>instamo-archetype and the wikisearch). When we say "contrib repository",
>a brand new repository here is where we would see raccumulo fitting in.
>
>Despite this code being strongly tied to Accumulo, Accumulo is not
>strongly tied to it. That said, I'm still open to help you get raccumulo
>included underneath the Accumulo "umbrella" which will give anyone
>providing back a way to build up merit and recognition among the
>community (so we can get to a point where all interested parties can
>manage it themselves).
>
>I'll go ahead and contact general@incubator to see about any potential
>licensing issues due to R's GPL-ness and see what other guidance they
>might have about importing the code.
>
>- Josh

On Sun, Oct 27, 2013 at 6:14 PM, Eric Whyne <[email protected]> wrote:

> 1. I will take care of the ASF ICLAs from contributors this week. I don't
> think that's a problem based on conversations I've had.
>
> 2. CCLA: I don't see any major hurdles to doing this, I'll set up some
> discussions with other company leadership to move this forward. I'll be
> advocating for this because I think it's a great idea.
>
> 3a. I don't see the contributed code being able to sustain a new
> community. Although there is a demand for the R interface this creates,
> there aren't many other directions or advancements that could be made that
> would be useful to undertake independent of the core accumulo roadmap.
>
> 3b. After some discussion on our end, its inclusion in the contrib where
> the pull request places it is optimal for its current state. It will make
> an R interface to accumulo part of the core distribution.
> This could potentially increase adoption of accumulo across more of the
> data science community, I know it would intrigue us from a corporate
> standpoint... that's why it was written.
>
> 3c. As a future roadmap for the capability, I suspect that there's a
> better way to do what this code accomplishes (sans proxy?) and that the
> functionality will get rolled somewhere into the rest of the project in a
> better way. That effort would probably A: use large chunks of this code (so
> you'd want to include it to ensure licensing and avoid potential conflicts)
> and B: make any contrib project obsolete, leading to its demise. If it's
> an independent contrib project or incubating as an apache project a lot of
> time would have been wasted. If it's part of the core distribution it
> equates to merely moving a directory from the contrib section of the code
> base, and the PMC can feel free to remove the parts that are obsolete and
> much confusion would be avoided.
>
> 4. We're willing to move forward with it in whatever way makes sense
> to all involved. The core developer (Phil Grim) has an interest in
> maintaining the capability since we have customers depending on it. I think
> the best way would be to keep our github fork updated and then just send
> maintenance pull requests as modifications are made on our end. Of course,
> we'd be looking for guidance on how to best license it and
> establishing/signing whatever we have to to ensure compliance with ASF
> requirements.
>
> R/
> Eric
>
>
>
> On Sun, Oct 27, 2013 at 2:37 PM, Chris Mattmann <[email protected]> wrote:
>
>> Hi Josh, and Eric,
>>
>> Thanks. As for the IP clearances and stuff, yes, makes total sense. Eric
>> and anyone that contributed to this patch need and should have ASF ICLAs
>> on file.
>> It's a simple process, download the ICLA here:
>>
>> http://www.apache.org/licenses/icla.txt
>>
>>
>> And submit it for each individual contributor to [email protected]
>>
>> Beyond that it may be nice in this case since it's got corporate stuff
>> attached to it to see if Data Tactics is willing to sign an Apache
>> Corporate Contributor License Agreement (CCLA):
>>
>> http://www.apache.org/licenses/cla-corporate.txt
>>
>>
>> Not a requirement by any means, but a nice thing. So Eric and others at
>> Data Tactics, that's something to consider.
>>
>> Once that's taken care of, the Accumulo PMC can take on the stewardship of
>> the code if someone on that PMC like Josh is willing to work through any
>> issues that that person (or others on the PMC) have with bringing it on
>> board. Also since Apache is a meritocracy, Eric and others, you will be
>> credited for the work that you contribute and if you keep doing so over
>> time, and creating work for the Accumulo PMC members, your work will
>> likely be recognized.
>>
>> The Incubator PMC is there as a clearinghouse for new projects and new
>> communities to begin their journey towards Apache individual project
>> status (TLP) and the Apache way. Eric: do you see this project as an
>> entirely new community that is complementary to Accumulo? If so, the
>> Incubator would be a good route to go. If you see this as part of core
>> Accumulo (or even contrib) and you can convince someone on the PMC like
>> Josh to help shepherd this code into their PMC, that's also another route.
>>
>> Cheers,
>> Chris
>>
>>
>> -----Original Message-----
>> From: Josh Elser <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Sunday, October 27, 2013 11:22 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: accumulo pull request: raccumulo Packaged: 2013-05-09
>> 22:18:20 UTC; pgrim
>>
>> >Thanks again, Eric (and Phil).
>> >
>> >It's awesome to see this amount of work put in to integrate with R. But,
>> >personally, I don't think direct inclusion in Accumulo is the proper
>> >place for it.
>> >
>> >It definitely cannot be directly merged as such: we would need to make
>> >sure we have ICLAs from all individuals and a CCLA from Data-Tactics (if
>> >memory serves). Essentially, we need to make sure the proper paperwork
>> >exists that the ownership is assigned to the ASF (instead of individuals
>> >or Data-Tactics as the notices alternate between currently). Also, the
>> >ASF has a general process for handling imports of code. [1]
>> >
>> >It looks like it's missing any documentation on how to use it too, e.g.
>> >the user needs to start an instance of the thrift proxy themselves, but
>> >that's a little nit-picky on my end :)
>> >
>> >Given the chatter on ACCUMULO-1804, it seems like it's desired for this
>> >to be its own contrib repo as a part of the ASF. The next step here
>> >would be for us to contact the ASF incubator to figure out the IP rules
>> >and shake out any licensing concerns.
>> >
>> >Let me know for sure and I can kick off a message to the incubator if
>> >this is how you (and Data-Tactics) want to proceed. [2]
>> >
>> >- Josh
>> >
>> >[1] https://www.apache.org/dev/pmc.html#import
>> >[2] http://incubator.apache.org/
>> >
>> >
>> >On 10/25/13, 12:13 PM, ericwhyne wrote:
>> >> GitHub user ericwhyne opened a pull request:
>> >>
>> >>      https://github.com/apache/accumulo/pull/4
>> >>
>> >>      raccumulo Packaged: 2013-05-09 22:18:20 UTC; pgrim
>> >>
>> >>      This pull request is in response to this issue:
>> >>      https://issues.apache.org/jira/browse/ACCUMULO-1804
>> >>
>> >>      What this code is:
>> >>      Need to be able to support users who utilize RStudio to conduct
>> >>analysis of data residing in the Accumulo data space instead of moving
>> >>data from one repository to a stand alone system to have the analytic
>> >>run in memory.
>> >>RStudio should be able to make calls directly to the data
>> >>space and provide the output within the RStudio interface.
>> >>
>> >>
>> >> You can merge this pull request into a Git repository by running:
>> >>
>> >>      $ git pull https://github.com/DataTacticsCorp/accumulo master
>> >>
>> >> Alternatively you can review and apply these changes as the patch at:
>> >>
>> >>      https://github.com/apache/accumulo/pull/4.patch
>> >>
>> >> ----
>> >> commit 116c045d05074b0e0ccf907e42235f94aa7c1703
>> >> Author: Eric Whyne <[email protected]>
>> >> Date:   2013-10-25T16:08:38Z
>> >>
>> >>      raccumulo Packaged: 2013-05-09 22:18:20 UTC; pgrim
>> >>
>> >> ----
