Extremely Slow ALS Recommender in 0.8, but faster in 0.9.

2014-08-22 Thread jay vyas
*very* slowly, so slowly that it's essentially hanging? Like I said, this bug appears gone in 0.9. Any thoughts would be appreciated. thanks! -- jay vyas

Re: Confusion on runtime of mahout.

2014-05-27 Thread Jay Vyas
, the runtime of the program is basically the same. Shouldn't it be faster when the program runs on more machines? Any hint? Regards, Dong -- Jay Vyas http://jayunit100.blogspot.com

ParallelALSFactorizationJob: Long job names : not always picked up by JobHistoryServer?

2014-05-23 Thread Jay Vyas
a failure in the ALS job, because when it checks counters from the previous job, an NPE is thrown. I'm pretty lost on this, been looking into it on and off for some time, so if anyone has a thought let me know. -- Jay Vyas http://jayunit100.blogspot.com

getCounters Exception in ALS jobs

2014-05-15 Thread Jay Vyas
) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.main(ParallelALSFactorizationJob.java:111) -- Jay Vyas http://jayunit100.blogspot.com

Re: simple idea for improving mahout docs over the next month?

2014-04-18 Thread Jay Vyas
(there's still lots of work to do on the website...), how do others see this? --sebastian On 04/17/2014 05:06 PM, Jay Vyas wrote: Hi sebastian: theoretically, one could extract all the information from a mailing list search but I think a rolling FAQ would much more (1) be likely evolve

simple idea for improving mahout docs over the next month?

2014-04-16 Thread Jay Vyas
will volunteer to help translate the QA stream into real documentation / JIRAs etc. -- Jay Vyas http://jayunit100.blogspot.com

Re: (help!) Can someone scan this

2014-03-31 Thread Jay Vyas
dir env properties. - exported MAHOUT_HOME. In any case, I think something about the way mahout nests jobs, or else the way it logs, makes it tricky to debug when failures happen in local mode, but I was never able to put my finger on just what. On Sat, Mar 29, 2014 at 11:34 AM, Jay Vyas

Recommendation thresholds

2014-03-31 Thread Jay Vyas
Hi again mahout! What is the lowest that we can set a threshold in the item recommender? I'd like to set it low enough to guarantee output, to confirm that my recommender actually worked structurally, and then start tightening it up. But with --threshold=.0001 I still get no results.

Re: (help!) Can someone scan this

2014-03-29 Thread Jay Vyas
On Sat, Mar 29, 2014 at 2:01 AM, Sebastian Schelter s...@apache.org wrote: Jay, which version of Mahout are you using? Have you tried to explicitly set the temp path? --sebastian On 03/29/2014 01:52 AM, Jay Vyas wrote: Hi again mahout: I'm wrapping a distributed recommender like

The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
Hi mahout: Looking through the source code there are 3 distributed recommenders: the ALS recommender, the item recommender, and the pseudo recommender. Any docs differentiating these? -- Jay Vyas http://jayunit100.blogspot.com

apache maven repo: Hadoop 2.2 compilations?

2014-03-28 Thread Jay Vyas
Does the apache maven release repo contain hadoop 2x compiled jars? Or shall we just compile those manually? -- Jay Vyas http://jayunit100.blogspot.com

Re: The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
http://ssc.io/wp-content/uploads/2012/06/rec11-schelter.pdf On 03/28/2014 02:04 PM, Jay Vyas wrote: Hi mahout: Looking through the source code there are 3 distributed recommenders: the ALS recommender, the item recommender, and the pseudo recommender. Any docs differentiating

Re: The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
... I'm thinking I must be doing something horribly wrong in my recommender, but can't figure out exactly what. On Fri, Mar 28, 2014 at 2:29 PM, Jay Vyas jayunit...@gmail.com wrote: Thanks sebastian. I guess I'm looking more for some hints on how to use the mahout API for this. 1) Does

(help!) Can someone scan this

2014-03-28 Thread Jay Vyas
, but it's not clear what is really done for me by mahout, and what I have to do on my own for the distributed recommender APIs. -- Jay Vyas http://jayunit100.blogspot.com

Re: Does Recommender System Overview Demo work?

2014-03-24 Thread Jay Vyas
. Bhargav Golla Committer, ASF Github http://www.github.com/bhargavgolla | LinkedIn http://www.linkedin.com/in/bhargavgolla | Website http://www.bhargavgolla.com/ -- Jay Vyas http://jayunit100.blogspot.com

Re: Problem with K-Means clustering on Amazon EMR

2014-03-16 Thread Jay Vyas
I specifically have fixed mapreduce jobs by doing what the error message suggests. But maybe (hopefully) there is another workaround that is configuration driven. Just a hunch, but maybe mahout needs to be refactored to create fs objects using the get(uri,conf) calls? As hadoop evolves to

Re: Problem with K-Means clustering on Amazon EMR

2014-03-16 Thread Jay Vyas
the 's3' protocol from Hadoop and got things working by using the 's3n' protocol instead. On Mar 16, 2014, at 8:41 AM, Jay Vyas jayunit...@gmail.com wrote: I specifically have fixed mapreduce jobs by doing what the error message suggests. But maybe (hopefully) there is another workaround

Re: Adapters for mahout inputs .... anyone working on this?

2014-02-22 Thread Jay Vyas
Yes, it will be tricky. But that said, I think we should be able to simply change the existing parsers to be more flexible, to accommodate at least slightly more diverse inputs. For example, variable columns in CSV etc...

Re: Adapters for mahout inputs .... anyone working on this?

2014-02-21 Thread Jay Vyas
? On Fri, Feb 21, 2014 at 8:01 AM, Jay Vyas jayunit...@gmail.com wrote: Hi mahout. Was thinking about building adapters so it would be easier to run algorithms on a broader set of input data structures, unless there is already an initiative to do so: Any thoughts on this JIRA? https

Re: Use Naïve Bayes on a large CSV

2014-02-20 Thread Jay Vyas
This relates to a previous question I have: does mahout have a concept of adapters which allow us to read CSV-style data with filters to create the exact format for its various inputs (e.g. the recommender three-column format)? If not, is it worth a JIRA? On Feb 20, 2014, at 7:50 AM, Kevin

Alternative input formats for Distributed Recommenders.

2014-02-16 Thread Jay Vyas
, iphone, .3 So I'd like to tell the recommender engine at runtime to read in fields 0, 2, and 3, skipping the garbage text in column 1. Any ideas on how to handle this without having to write a mapreduce job just to scrape 3 out of the 4 columns out of the file? -- Jay Vyas http://jayunit100
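
For small local files, the column selection asked about in this last thread can be sketched as a preprocessing step in plain Python, outside Mahout. This is only an illustration under assumptions, not a Mahout feature: the function name `project_columns`, the sample rows, and the choice to skip short rows are all hypothetical; only the field indices (0, 2, 3) come from the post.

```python
import csv
import io


def project_columns(lines, keep=(0, 2, 3)):
    """Yield each CSV row reduced to the fields at the indices in `keep`.

    `lines` is any iterable of text lines (an open file, a StringIO, ...).
    Rows too short to contain every requested field are skipped
    (an assumption; the post does not say how to treat them).
    """
    reader = csv.reader(lines, skipinitialspace=True)
    for row in reader:
        if len(row) <= max(keep):
            continue  # malformed or short row
        yield [row[i].strip() for i in keep]


# Hypothetical sample input: column 1 holds the text to be dropped.
sample = io.StringIO(
    "1, some garbage text, iphone, .3\n"
    "2, more garbage, ipad, .7\n"
    "3, too short\n"
)

rows = list(project_columns(sample))
for row in rows:
    print(",".join(row))
```

A single-process script like this only suits data that fits on one machine; for input already sitting on a cluster it is at most a way to check what the three-column format should look like.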