Re: Needs clue to create a Proof of Concept recommender

2011-08-09 Thread Jeffrey
Hi Sean, Thanks for the help. I am currently reading for more information (please let me know if I am not reading the right document). So in short, by using the API, I can produce a SequenceFile by feeding the SQL result containing image and tag data

Re: Needs clue to create a Proof of Concept recommender

2011-08-09 Thread Sean Owen
You need some glue code here -- what you need to create in Java is a SequenceFile.Writer, and append VectorWritable values to it; VectorWritable knows how to write vectors in the right format. It's straightforward but needs some coding. There's no magic that ingests SQL and outputs this. Yes, but where the

Re: distributed RandomSampler job?

2011-08-09 Thread Timothy Potter
Hi Ted, Can you clarify your point about "each mapper needs to retain as many samples are desired in the end"? Does this mean I'm restricted to sample sizes based on the max number of key/value pairs in a split? From what I've read in the Hadoop docs, the number of map tasks for a job is determine

Re: distributed RandomSampler job?

2011-08-09 Thread Ted Dunning
Well, yes, that does sound like what I said. But in that case, the mapper should just pass all of the data on to the reducer. You are limited to sample sizes that are the size of your data. And I certainly phrased it in a way that implied that your sample has to fit into memory. There are out-
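
A minimal sketch of one way to read the "each mapper retains as many samples as are desired in the end" idea from this thread. It is not Mahout's RandomSampler job and not necessarily the scheme Ted had in mind: it swaps in a random-priority variant, where every record gets a random number, each mapper keeps the k records with the smallest numbers, and a single reducer keeps the k smallest overall. The class and method names (DistributedSampleSketch, mapSide, reduceSide) are hypothetical, and plain Java lists stand in for Hadoop input splits.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical illustration only; plain lists stand in for Hadoop splits.
final class DistributedSampleSketch {

    /** A record paired with the random priority drawn for it. */
    static final class Tagged<T> {
        final double priority;
        final T value;

        Tagged(double priority, T value) {
            this.priority = priority;
            this.value = value;
        }
    }

    /** Max-heap on priority, so the largest retained priority is evicted first. */
    private static <T> PriorityQueue<Tagged<T>> newHeap() {
        return new PriorityQueue<>(
            Comparator.comparingDouble((Tagged<T> t) -> t.priority).reversed());
    }

    /** Adds one entry and evicts the largest priority if more than k are held. */
    private static <T> void offer(PriorityQueue<Tagged<T>> heap, Tagged<T> t, int k) {
        heap.add(t);
        if (heap.size() > k) {
            heap.poll();
        }
    }

    /** "Mapper": streams over one split and retains at most k tagged records. */
    static <T> List<Tagged<T>> mapSide(Iterable<T> split, int k, Random rng) {
        PriorityQueue<Tagged<T>> heap = newHeap();
        for (T value : split) {
            offer(heap, new Tagged<>(rng.nextDouble(), value), k);
        }
        return new ArrayList<>(heap);
    }

    /** "Reducer": merges the per-mapper candidates and keeps the k smallest priorities. */
    static <T> List<T> reduceSide(List<List<Tagged<T>>> perMapper, int k) {
        PriorityQueue<Tagged<T>> heap = newHeap();
        for (List<Tagged<T>> candidates : perMapper) {
            for (Tagged<T> t : candidates) {
                offer(heap, t, k);
            }
        }
        List<T> sample = new ArrayList<>();
        for (Tagged<T> t : heap) {
            sample.add(t.value);
        }
        return sample;
    }
}
```

In this sketch a split with fewer than k records simply passes all of its records on, so the final sample size is bounded by the total data size rather than by the size of any one split. Each mapper holds at most k records in memory, which matches the in-memory assumption discussed in the reply above.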

reason for getting clustering result like 0 belongs to cluster 1.0: [ ]

2011-08-09 Thread eric skinner
I ran NewsKMeansClustering.java (an example given in chapter 9 of Mahout in Action) against a set of sequence files. However, the generated result looks like this: 0 belongs to cluster 1.0: [] 0 belongs to cluster 1.0: [] 0 belongs to cluster 1.0: [] 0 belongs to cluster 1.0: [] 0 belongs to cl

Re: Errors in SSVD

2011-08-09 Thread Dmitriy Lyubimov
I will get back on this after my vacation. At this point I haven't tested the code with u0. Also one of the jobs must have one reducer (the parallelism of this job relies heavily on the use of combiners and a high degree of aggregated output), this may have been broken by one of those attempts to unif

Re: Errors in SSVD

2011-08-09 Thread Dmitriy Lyubimov
Damn Android autocorrection. I meant from one of my github accounts. Sometimes it (Android spellcheck) makes me feel stupid. On Aug 9, 2011 5:22 PM, "Dmitriy Lyubimov" wrote: > I will get back on this after my vacation. At this point I haven't tested > the code with u0. Also one of the jobs must ha