Re: Mahout on the cloud

2015-07-23 Thread Jay Vyas
An aside: You can also deploy Mahout via ASF Bigtop on EMR or OpenStack if you are interested in building your own distribution or patching but still want the convenience of automated deployment and rpm/deb packages. Boston University is going this route currently. > On Jul 23, 2015, at 10:45 AM

Extremely Slow ALS Recommender in 0.8, but faster in 0.9.

2014-08-22 Thread jay vyas
*very* slowly, so slowly that it's essentially hanging? Like I said, this bug appears to be gone in 0.9. Any thoughts would be appreciated. Thanks! -- jay vyas

Re: Confusion on runtime of mahout.

2014-05-27 Thread Jay Vyas
; the runtime of the program is basically the same. Shouldn't it be faster > when the program runs on more machines? Any hint? > > Regards, Dong > > -- Jay Vyas http://jayunit100.blogspot.com

ParallelALSFactorizationJob: Long job names : not always picked up by JobHistoryServer?

2014-05-23 Thread Jay Vyas
nt properly process a file, you get a failure in the ALS job, because when it checks counters from the previous job, an NPE is thrown. I'm pretty lost on this, and have been looking into it on and off for some time, so if anyone has a thought, let me know. -- Jay Vyas http://jayunit100.blogspot.com

getCounters Exception in ALS jobs

2014-05-15 Thread Jay Vyas
org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.main(ParallelALSFactorizationJob.java:111) -- Jay Vyas http://jayunit100.blogspot.com

Re: simple idea for improving mahout docs over the next month?

2014-04-18 Thread Jay Vyas
source for documentation > than the webpage would be so helpful (there is still lots of work to do on the > website...), how do others see this? > > > > --sebastian > > > >> On 04/17/2014 05:06 PM, Jay Vyas wrote: > >> Hi sebastian: theoretically, one could extra

Re: simple idea for improving mahout docs over the next month?

2014-04-17 Thread Jay Vyas
> > > On 04/16/2014 09:31 PM, Jay Vyas wrote: > >> hi mahout... i finally thought of a really easy way of ad-hoc improvement >> of mahout docs, that can feed into the efforts to get formal docs >> improved. >> >> Any interest in creating a shared mahout F

simple idea for improving mahout docs over the next month?

2014-04-16 Thread Jay Vyas
volunteer to help translate the QA stream into "real" documentation / JIRAs etc. -- Jay Vyas http://jayunit100.blogspot.com

Recommendation thresholds

2014-03-31 Thread Jay Vyas
Hi again mahout! What is the lowest that we can set a threshold in the item recommender? I'd like to set it low enough to guarantee output, to confirm that my recommender actually worked structurally, and then start tightening it up. But with --threshold=.0001 I still get no results.

Re: (help!) Can someone scan this

2014-03-31 Thread Jay Vyas
dir env properties. - exported MAHOUT_HOME. In any case, I think something about the way mahout nests jobs, or else the way it logs, makes it tricky to debug when failures happen in local mode, but I was never able to put my finger on just what. On Sat, Mar 29, 2014 at 11:34 AM, Jay Vyas wrote

Re: (help!) Can someone scan this

2014-03-29 Thread Jay Vyas
On Sat, Mar 29, 2014 at 2:01 AM, Sebastian Schelter wrote: > Jay, > > which version of Mahout are you using? Have you tried to explicitly set > the temp path? > > --sebastian > > > On 03/29/2014 01:52 AM, Jay Vyas wrote: > >> Hi again mahout: >> >>

(help!) Can someone scan this

2014-03-28 Thread Jay Vyas
ithms work, but it's not clear what is really done for me by mahout, and what I have to do on my own for the distributed recommender APIs. -- Jay Vyas http://jayunit100.blogspot.com

Re: The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
n , then it returns a new exception: /tmp/preparePreferenceMatrix/ratingMatrix not Found. So... I'm thinking I must be doing something horribly wrong in my recommender, but can't figure out exactly what. On Fri, Mar 28, 2014 at 2:29 PM, Jay Vyas wrote: > Thanks sebastian. I guess i

Re: The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
le Similarity-Based Neighborhood Methods with MapReduce, RecSys'12 > > http://ssc.io/wp-content/uploads/2012/06/rec11-schelter.pdf > > > > On 03/28/2014 02:04 PM, Jay Vyas wrote: > >> Hi mahout: >> >> Looking through the source code there are 3 distribu

apache maven repo: Hadoop 2.2 compilations?

2014-03-28 Thread Jay Vyas
Does the Apache Maven release repo contain jars compiled against Hadoop 2.x? Or shall we just compile those manually? -- Jay Vyas http://jayunit100.blogspot.com

The 3 distributed recommenders

2014-03-28 Thread Jay Vyas
Hi mahout: Looking through the source code there are 3 distributed recommenders: the ALS recommender, the item recommender, and the pseudo recommender. Any docs differentiating these? -- Jay Vyas http://jayunit100.blogspot.com

Re: Does Recommender System Overview Demo work?

2014-03-24 Thread Jay Vyas
>>> works. I don't find webapp directory in integration/ and hence even > >>> after I add jetty plugin in the pom.xml in integration/, it is > throwing an > >>> exception. > >>> > >>> Bhargav Golla > >>> Committer, ASF > >>> Github <http://www.github.com/bhargavgolla> | > >>> LinkedIN<http://www.linkedin.com/in/bhargavgolla> > >>>| Website <http://www.bhargavgolla.com/> > >>> > >>> > >> > > > -- Jay Vyas http://jayunit100.blogspot.com

Re: Problem with K-Means clustering on Amazon EMR

2014-03-16 Thread Jay Vyas
Another wild guess, I've had issues trying to use the 's3' protocol from >> Hadoop and got things working by using the 's3n' protocol instead. >> >>> On Mar 16, 2014, at 8:41 AM, Jay Vyas wrote: >>> >>> I specifically have fixed ma

Re: Problem with K-Means clustering on Amazon EMR

2014-03-16 Thread Jay Vyas
I have specifically fixed mapreduce jobs by doing what the error message suggests. But maybe (hopefully) there is another workaround that is configuration driven. Just a hunch, but maybe mahout needs to be refactored to create fs objects using the get(uri,conf) calls? As hadoop evolves to supp

Re: Adapters for mahout inputs .... anyone working on this?

2014-02-22 Thread Jay Vyas
Yes, it will be tricky. But that said, I think we should be able to simply change the existing parsers to be more flexible, to accommodate at least slightly more diverse inputs. For example, variable columns in CSV etc...

Re: Adapters for mahout inputs .... anyone working on this?

2014-02-21 Thread Jay Vyas
> > > > On Fri, Feb 21, 2014 at 8:01 AM, Jay Vyas wrote: > > > Hi mahout. Was thinking about building adapters so it would be easier to > > run algorithms on a broader set of input data structures, unless there is > > already an initiative to do

Adapters for mahout inputs .... anyone working on this?

2014-02-21 Thread Jay Vyas
Hi mahout. Was thinking about building adapters so it would be easier to run algorithms on a broader set of input data structures, unless there is already an initiative to do so: Any thoughts on this JIRA? https://issues.apache.org/jira/browse/MAHOUT-1421 -- Jay Vyas http://jayunit100

Re: Use Naïve Bayes on a large CSV

2014-02-20 Thread Jay Vyas
This relates to a previous question I had: Does mahout have a concept of adapters which allow us to read CSV-style data with filters to create the exact format for its various inputs (e.g. the recommender three-column format)? If not, is it worth a JIRA? > On Feb 20, 2014, at 7:50 AM, Kevin M

Re: Mahout on Spark?

2014-02-19 Thread Jay Vyas
+100 for this, different execution engines, like the direction pig and crunch take Sent from my iPhone > On Feb 19, 2014, at 5:19 AM, Gokhan Capan wrote: > > I imagine in Mahout offering an option to the users to select from > different execution engines (just like we currently do by giving

Alternative input formats for Distributed Recommenders.

2014-02-16 Thread Jay Vyas
x123x, iphone, .3 So I'd like to tell the recommender engine at runtime to read in fields 0, 2, and 3, skipping the garbage text in column 1. Any ideas on how to handle this without having to write a mapreduce job just to scrape 3 out of the 4 columns out of the file? -- Jay Vyas http://
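For illustration only, here is a minimal sketch of the column projection described above, done as a local preprocessing step instead of a mapreduce job. It is not a Mahout feature: the function name project_columns, the sample rows, and the user ids u1/u2 are hypothetical, and it assumes the input is small enough to filter before loading it into HDFS.

```python
import csv
import io


def project_columns(lines, keep=(0, 2, 3)):
    """Yield each CSV row reduced to the column indexes in `keep`.

    Rows with too few columns (including blank lines) are skipped.
    """
    reader = csv.reader(lines, skipinitialspace=True)
    for row in reader:
        if len(row) <= max(keep):
            continue  # malformed or blank line: not enough columns
        yield [row[i].strip() for i in keep]


# Hypothetical four-column input: user, garbage text, item, preference.
sample = io.StringIO("u1, x123x, iphone, .3\nu2, x456x, android, .7\n")
triples = list(project_columns(sample))

for triple in triples:
    # Emit the three-column user,item,preference layout.
    print(",".join(triple))
```

The same function can be pointed at an open file handle in place of the in-memory sample, with the printed lines redirected to a new file that is then used as the recommender input.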