Hi,
The Dachis Group data analytics team are big users of Pig and, so far,
only light users of Mahout, but that's changing soon, so we'd like to
contribute some of our work and know-how back to the Mahout / Pig
communities.
In the near term, we're planning a Pig-Mahout hackday (code-named
Pigout) at our
On Wed, May 2, 2012 at 11:06 AM, Timothy Potter thelabd...@gmail.com wrote:
We're really keen on Ted's pig-vector project
(https://github.com/tdunning/pig-vector) as we're building a number of
classifiers on Mahout's SGD framework, with the bulk of our data being
in Cassandra processed almost
On Wed, May 2, 2012 at 11:13 AM, Ted Dunning ted.dunn...@gmail.com wrote:
On Wed, May 2, 2012 at 11:06 AM, Timothy Potter thelabd...@gmail.com
wrote:
We're really keen on Ted's pig-vector project
(https://github.com/tdunning/pig-vector) as we're building a number of
classifiers on
Thanks Ted! Removing the elephant-bird dependency / build problems
sounds like a good task we should include in our plans for the hackday
... what are your thoughts on adding pig-vector to Mahout as a contrib
module? Do you want to keep it separate, or should it eventually make
its way into the project?
Hi Tim, Ted,
I wanted to chime in here regarding Elephant Bird utilities for
Pig-Mahout integration. I'm the author of EB's SequenceFileLoader,
SequenceFileStorage, and all the supporting WritableConverters,
including the VectorWritableConverter which facilitates conversion of
Mahout Vector data
Making a Pig module for Mahout is a fine idea. The Twitter guys may have
something better, though, so we should explore that as well. Andy's
comments make that possibility very interesting.
On Wed, May 2, 2012 at 5:20 PM, Timothy Potter thelabd...@gmail.com wrote:
Thanks Ted! Removing the
It's true that Mahout is missing integration tools. Data has to be
converted to an input format Mahout accepts. One way of doing it is outlined below:
1) Collect the unique terms from your data and build a dictionary of terms.
This can be done by any means, e.g. a Hadoop streaming job in 2 steps -
collect
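To illustrate step 1 above, here is a minimal in-memory sketch of building a term dictionary in Python. It is not the poster's actual job: the function names (tokenize, build_dictionary) are hypothetical, the lowercase whitespace tokenization is an assumption, and the two passes only mirror the "2 steps" idea loosely. A real Hadoop streaming job would split these passes into separate map and reduce scripts.

```python
from typing import Dict, Iterable, List


def tokenize(text: str) -> List[str]:
    # Assumption: lowercase + whitespace split; a real pipeline
    # would likely use a proper analyzer.
    return text.lower().split()


def build_dictionary(docs: Iterable[str]) -> Dict[str, int]:
    # Pass 1: collect the set of unique terms across all documents.
    terms = set()
    for doc in docs:
        terms.update(tokenize(doc))
    # Pass 2: assign each term a stable integer index (sorted order),
    # which can later serve as the term's position in a vector.
    return {term: index for index, term in enumerate(sorted(terms))}


if __name__ == "__main__":
    sample_docs = ["Pig loves Mahout", "Mahout loves vectors"]
    print(build_dictionary(sample_docs))
```

Sorting before numbering keeps the indices reproducible between runs, which matters if the dictionary is rebuilt while previously encoded data is still in use.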
On Wed, May 2, 2012 at 8:07 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Making a Pig module for Mahout is a fine idea. The Twitter guys may have
something better, though, so we should explore that as well. Andy's
comments make that possibility very interesting.
What I'd want to suggest is
On Wed, May 2, 2012 at 9:05 PM, Jake Mannix jake.man...@gmail.com wrote:
On Wed, May 2, 2012 at 8:07 PM, Ted Dunning ted.dunn...@gmail.com wrote:
Making a Pig module for Mahout is a fine idea. The Twitter guys may have
something better, though, so we should explore that as well. Andy's