Hi all,
I've started exploring Apache Crunch for use in developing some
analytics on top of the Apache Accumulo column family store. So far, it
looks very promising. I've implemented an Accumulo source and sink and exposed
Accumulo tables through the Scrunch REPL. Being able to interactively define and
submit MapReduce jobs from the REPL will make developing new analytics
much easier. There are some enhancements that I'll need to put together
to support some of my analytical workflows. Much of this effort can be
abstracted and applied to the HBase support as well. If this is of
interest to anyone, I'd be happy to contribute this work back to the Crunch
project. Let me know.
Thanks,
Anthony
- accumulo integration Anthony Fox