date:20130116

RE: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Stuti Awasthi

Hey Sean, thanks for the response. The MatrixMultiplicationJob help shows the usage as: usage: command [Generic Options] [Job-Specific Options]. Here a generic option can be provided with -D property=value. Hence I tried command-line -D options, but they do not seem to have any effect. It

RE: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Stuti Awasthi

Hi, I also tried calling it programmatically but face the same issue: only a single MapTask runs, and it spills the map output continuously. Hence I'm not able to generate the output for a large matrix multiplication. Code snippet: DistributedRowMatrix a = new DistributedRowMatrix(new

RE: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Sean Owen

Why do you need multiple mappers? Is one too slow? Many are not necessarily faster for small input. On Jan 16, 2013 10:46 AM, Stuti Awasthi stutiawas...@hcl.com wrote: Hi, I tried to call programmatically also but facing same issue : Only single MapTask is running and that too spilling the map

RE: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Stuti Awasthi

The issue is that my matrix currently has dimension (100x100k); later it can be (1Mx10M) or bigger. Even now, running with a single mapper, the job cannot complete for (100x100k). As I mentioned, the map task just proceeds to 0.99% and starts spilling the map output.

Re: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Ashish

MatrixMultiplicationJob internally sets the InputFormat to CompositeInputFormat: JobConf conf = new JobConf(initialConf, MatrixMultiplicationJob.class); conf.setInputFormat(CompositeInputFormat.class); and AFAIK, CompositeInputFormat ignores the splits. See this

RE: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Stuti Awasthi

Thanks Ashish. So according to the link, if one uses CompositeInputFormat, it takes the entire file as input to a mapper without considering InputSplits/blocksize. If I understand it correctly, it asks to break [Original Input File]-[file1,file2,] . So If my file is

Re: MatrixMultiplicationJob runs with 1 mapper only ?

2013-01-16 Thread Ashish

I am afraid I don't know the answer. I need to experiment a bit more. I have not used CompositeInputFormat, so I cannot comment. Probably someone else on the ML (mailing list) will be able to guide here. On Wed, Jan 16, 2013 at 6:01 PM, Stuti Awasthi stutiawas...@hcl.com wrote: Thanks Ashish,

Re: Test multiple similarities using the same data

2013-01-16 Thread Sean Owen

You can try resetting all the random seeds with RandomUtils.useTestSeed(). On Jan 16, 2013 4:01 PM, Zia mel ziad.kame...@gmail.com wrote: Hi How to evaluate a recommender using different similarities ? Once we call evaluator.evaluate(recommenderBuilder,..) it will decide the training and test
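The reasoning behind the seed suggestion can be illustrated with a small sketch. It does not call Mahout at all: the class SeededSplit, its method trainingIndices, and the seed value 42 are hypothetical names chosen for the example. It only shows, with java.util.Random, that a fixed seed makes a random training/test split repeatable, so that two runs with different similarities would be evaluated on the same data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration, not Mahout code: a fixed seed makes a random
// training/test split repeatable across evaluation runs.
class SeededSplit {

    // Returns the indices (0..count-1) that land in the training set.
    static List<Integer> trainingIndices(int count, double trainingFraction, long seed) {
        Random random = new Random(seed);
        List<Integer> training = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (random.nextDouble() < trainingFraction) {
                training.add(i);
            }
        }
        return training;
    }

    public static void main(String[] args) {
        long seed = 42L; // arbitrary fixed seed, chosen only for the example
        List<Integer> runForSimilarityA = trainingIndices(20, 0.7, seed);
        List<Integer> runForSimilarityB = trainingIndices(20, 0.7, seed);
        // Same seed, same split: both runs train on identical indices.
        System.out.println(runForSimilarityA.equals(runForSimilarityB));
    }
}
```

With an unseeded Random, the two lists would generally differ, which is why scores from separate evaluation runs are otherwise hard to compare.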

Recommend to a group of users

2013-01-16 Thread Zia mel

Hi, can we use Mahout to recommend to a group of users who share similar interests? Maybe with some clustering or so. Thanks

Re: Recommend to a group of users

2013-01-16 Thread Sean Owen

Not really directly, no. You can make N individual recommendations and combine them, and there are many ways to do that. You can blindly rank them on their absolute scores. You can interleave rankings so each gets every Nth slot in the recommendation. A popular metric is to rank by least-aversion
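Two of the combination strategies named above, ranking on absolute scores and interleaving the per-user rankings, can be sketched as follows. This is a minimal sketch and not part of Mahout: the class GroupRecommendations, the Scored holder, and both method names are hypothetical, and the per-user lists are assumed to be already sorted best-first, as a recommender would return them. Dropping duplicate items is an added assumption; it means a user can lose a slot in the interleaved list when their item was already taken.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout: merges per-user recommendation
// lists into a single list for the whole group.
class GroupRecommendations {

    // Stand-in for an (item ID, estimated preference) pair.
    static final class Scored {
        final long itemId;
        final double score;

        Scored(long itemId, double score) {
            this.itemId = itemId;
            this.score = score;
        }
    }

    // Strategy 1: pool all recommendations and rank by absolute score,
    // keeping only the highest-scoring occurrence of each item.
    static List<Long> rankByScore(List<List<Scored>> perUser, int howMany) {
        List<Scored> pooled = new ArrayList<>();
        for (List<Scored> recs : perUser) {
            pooled.addAll(recs);
        }
        pooled.sort((x, y) -> Double.compare(y.score, x.score)); // descending
        Set<Long> picked = new LinkedHashSet<>();
        for (Scored s : pooled) {
            if (picked.size() == howMany) {
                break;
            }
            picked.add(s.itemId);
        }
        return new ArrayList<>(picked);
    }

    // Strategy 2: round-robin over the users, taking each user's next-best
    // item in turn; an item that was already picked is skipped.
    static List<Long> interleave(List<List<Scored>> perUser, int howMany) {
        int longest = 0;
        for (List<Scored> recs : perUser) {
            longest = Math.max(longest, recs.size());
        }
        Set<Long> picked = new LinkedHashSet<>();
        for (int rank = 0; rank < longest; rank++) {
            for (List<Scored> recs : perUser) {
                if (picked.size() == howMany) {
                    return new ArrayList<>(picked);
                }
                if (rank < recs.size()) {
                    picked.add(recs.get(rank).itemId);
                }
            }
        }
        return new ArrayList<>(picked);
    }

    public static void main(String[] args) {
        List<List<Scored>> perUser = List.of(
            List.of(new Scored(1L, 4.5), new Scored(2L, 4.2)),
            List.of(new Scored(3L, 4.0), new Scored(1L, 2.0)));
        System.out.println(rankByScore(perUser, 3));
        System.out.println(interleave(perUser, 3));
    }
}
```

Ranking by absolute score favors users whose estimated preferences run high, whereas interleaving gives every group member a turn regardless of score scale.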

Re: Which ML Algorithms i can run without hadoop..

2013-01-16 Thread Ted Dunning

Logistic regression is a good place to start. The Mahout implementation stands alone without Hadoop. Look for OnlineLogisticRegression. On Mon, Jan 14, 2013 at 10:23 PM, VIGNESH S vigneshkln...@gmail.com wrote: Hi, I am looking for a light weight library for email classification.. can

Re: Which ML Algorithms i can run without hadoop..

2013-01-16 Thread VIGNESH S

Hi Ted, Thanks .. On Thu, Jan 17, 2013 at 12:41 AM, Ted Dunning ted.dunn...@gmail.com wrote: Logistic regression is a good place to start. The Mahout implementation stands alone without Hadoop. Look for OnlineLogisticRegression. On Mon, Jan 14, 2013 at 10:23 PM, VIGNESH S

