Hi,

I fully agree. Accumulo looks cool, but I for one have no experience with it. Besides, the HBase dependency disaster is still fresh in my mind: 50+ dependencies, several of them incompatible with Hadoop and Crunch, and no way to be sure it all actually works.
BTW: Do we have someone on the team who could help us solve the HBase issues?

Regards,
Matthias

On Wednesday, 2012-10-31, Josh Wills wrote:
> +crunch-user, to see if we have any lurking accumulo users
>
> Hey Anthony,
>
> I don't think that we have much Accumulo experience yet among the
> committers, so I'm hesitant to add a crunch-accumulo subproject w/o having
> someone on the team who is dedicated to maintaining it. If you have stuff
> you want to open source on github, we would be happy to link to it on the
> Crunch homepage (something we should do for crunchR, come to think of it),
> and we're all very happy to work together on bug fixes and new features to
> support your use cases. Ideally, we would all work together for awhile and
> get to like working with each other, and then you would join the committers
> and own the submodule.
>
> I'll let other folks weigh in, but that's my two cents.
>
> J
>
>
> On Wed, Oct 31, 2012 at 7:29 AM, Anthony Fox <[email protected]> wrote:
>
> > Hi all,
> >
> > I've started exploring Apache Crunch for use in developing some analytics
> > on top of the Apache Accumulo column family store. So far, it looks very
> > promising. I've implemented the source and sink and exposed tables through
> > the scrunch repl. Being able to interactively define and submit map/reduce
> > jobs from the repl will make developing new analytics much easier. There
> > are some enhancements that I'll need to put together to support some of my
> > analytical workflows. Much of this effort can be abstracted and applied to
> > the HBase support as well. If this is of interest to anyone, I'd be happy
> > to contribute back to the crunch project. Let me know.
> >
> > Thanks,
> > Anthony
