Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-14 Thread Timothy Potter
Hi Ted, Re: In the readme, there is an example of using elephant-bird to store the Classifier in a SequenceFile, i.e. /* the trained model is passed to us as a bytearray so we just pass it on out. The classifier class just contains the list of target values and the

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-14 Thread Ted Dunning
Tim, Sorry for the confusion and lack of help. Pig-vector is half-done and not even quite half-baked. Your help in updating the readme is very much appreciated. On Mon, May 14, 2012 at 10:17 AM, Timothy Potter thelabd...@gmail.com wrote: Hi Ted, Re: In the readme, there is an example of

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-14 Thread Timothy Potter
My pleasure and hoping to do more with it ;-) Cheers, Tim On Mon, May 14, 2012 at 1:11 PM, Ted Dunning ted.dunn...@gmail.com wrote: Tim, Sorry for the confusion and lack of help. Pig-vector is half-done and not even quite half-baked. Your help in updating the readme is very much

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-12 Thread Jake Mannix
Well, actually, elephant-bird has a generic com.twitter.elephantbird.pig.store.SequenceFileStorage which lets you use generic WritableConverters (com.twitter.elephantbird.pig.util.TextConverter, com.twitter.elephantbird.pig.util.IntWritableConverter, etc.) to produce *Writable types as keys and

Question about storage in Pig-vector (Pig + Mahout)

2012-05-11 Thread Timothy Potter
I'm trying to run the simple 20-newsgroups example to train a Mahout classifier using Pig and am unsure about the elephant-bird stuff. First, after battling with getting a build of elephant-bird, the store to SequenceFile didn't work for me. Then I saw the PigModelStorage and just used that and

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-11 Thread Jake Mannix
On Fri, May 11, 2012 at 11:38 AM, Timothy Potter thelabd...@gmail.com wrote: I'm trying to run the simple 20-newsgroups example to train a Mahout classifier using Pig and am unsure about the elephant-bird stuff. First, after battling with getting a build of elephant-bird, Why did you have to

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-11 Thread Timothy Potter
Thanks for the help, Jake. Makes sense about interfacing with other Mahout classes. What is confusing is that the PigModelStorage class also seems to produce a SequenceFile, i.e., public OutputFormat getOutputFormat() throws IOException { return new SequenceFileOutputFormat(); } Maven

Re: Question about storage in Pig-vector (Pig + Mahout)

2012-05-11 Thread Ted Dunning
PigModelStorage stores SGD models. The elephant bird stuff stores data in the form of vectors. On Fri, May 11, 2012 at 11:38 AM, Timothy Potter thelabd...@gmail.com wrote: So my main question is what does the elephant-bird model storage stuff do that PigModelStorage doesn't?