Hey Sean,
Thanks for the response. The MatrixMultiplicationJob help shows the usage as:
usage: command [Generic Options] [Job-Specific Options]
Here a generic option can be provided with -D property=value. Hence I tried the
command-line -D options, but they do not seem to have any effect. It
Hi,
I also tried calling it programmatically but am facing the same issue: only a
single MapTask is running, and it keeps spilling the map output. Hence I'm
not able to generate the output for large matrix multiplication.
Code Snippet :
DistributedRowMatrix a = new DistributedRowMatrix(new
Why do you need multiple mappers? Is one too slow? Many are not necessarily
faster for small input
On Jan 16, 2013 10:46 AM, Stuti Awasthi stutiawas...@hcl.com wrote:
Hi,
I also tried calling it programmatically but am facing the same issue: only a
single MapTask is running, and it keeps spilling the map
The issue is that my matrix currently has dimension (100x100k); later it can
be (1Mx10M) or bigger.
Even now the job runs with a single mapper for (100x100k) and is not able to
complete. As I mentioned, the map task just proceeds to 0.99% and then
starts spilling the map output.
MatrixMultiplicationJob internally sets InputFormat as CompositeInputFormat
JobConf conf = new JobConf(initialConf, MatrixMultiplicationJob.class);
conf.setInputFormat(CompositeInputFormat.class);
and AFAIK, CompositeInputFormat ignores the splits. See this
Thanks Ashish,
So, according to the link, if one uses CompositeInputFormat, it will take the
entire file as input to a mapper without considering InputSplits or block size.
If I understand it correctly, it is asking to break [Original Input
File] into [file1,file2,].
So if my file is
I am afraid I don't know the answer. Need to experiment a bit more. I have
not used CompositeInputFormat so cannot comment.
Probably someone else on the ML (mailing list) would be able to guide you here.
On Wed, Jan 16, 2013 at 6:01 PM, Stuti Awasthi stutiawas...@hcl.com wrote:
Thanks Ashish,
You can try resetting all the random seeds with RandomUtils.useTestSeed()
On Jan 16, 2013 4:01 PM, Zia mel ziad.kame...@gmail.com wrote:
Hi
How can I evaluate a recommender using different similarities? Once we call
evaluator.evaluate(recommenderBuilder,..)
it will decide the training and test
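
The point about resetting the random seeds can be illustrated without Mahout at all. The following is a minimal sketch in plain Java, not Mahout's evaluator code: the class SeededSplit and the method trainingIndices are hypothetical names, and the split rule (one random draw per item) is an assumption made for illustration. It shows only that a fixed seed reproduces the same training/test split, which is what makes runs with different similarities comparable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; this is not part of Mahout.
class SeededSplit {
    // Returns the indices chosen for training; the remaining indices form the test set.
    static List<Integer> trainingIndices(int n, double trainingFraction, long seed) {
        Random random = new Random(seed); // fixed seed gives a repeatable sequence
        List<Integer> training = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if (random.nextDouble() < trainingFraction) {
                training.add(i);
            }
        }
        return training;
    }

    public static void main(String[] args) {
        List<Integer> first = trainingIndices(20, 0.7, 42L);
        List<Integer> second = trainingIndices(20, 0.7, 42L);
        // Same seed, same split, so two evaluations see identical data.
        System.out.println(first.equals(second));
    }
}
```

With an unseeded generator each call would usually produce a different split, so score differences between two similarities could come from the data rather than from the similarity itself.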
Hi
Can we use Mahout to recommend to a group of users that share similar
interests? Maybe some clustering or so.
Thanks
Not really directly, no. You can make N individual recommendations and
combine them, and there are many ways to do that. You can blindly rank
them on their absolute scores. You can interleave rankings so each
gets every Nth slot in the recommendation. A popular metric is to rank
by least-aversion
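
The first two combination strategies mentioned above (ranking on absolute scores, and interleaving the per-user rankings) can be sketched in plain Java. This is only an illustration under assumptions: the class GroupRecommendations, the holder Scored, and the sample items and scores are all hypothetical, and nothing here is a Mahout API. Items recommended to more than one user are kept once, so the "every Nth slot" pattern is only approximate when lists overlap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; this is not part of Mahout.
class GroupRecommendations {
    // One recommended item with its estimated score for a single user.
    static final class Scored {
        final String item;
        final double score;
        Scored(String item, double score) { this.item = item; this.score = score; }
    }

    // Strategy 1: pool all lists and rank on absolute score (best score per item).
    static List<String> rankByScore(List<List<Scored>> perUser) {
        Map<String, Double> best = new HashMap<>();
        for (List<Scored> list : perUser) {
            for (Scored s : list) {
                best.merge(s.item, s.score, Math::max);
            }
        }
        List<String> items = new ArrayList<>(best.keySet());
        items.sort((a, b) -> {
            int byScore = Double.compare(best.get(b), best.get(a)); // descending
            return byScore != 0 ? byScore : a.compareTo(b);         // stable tie-break
        });
        return items;
    }

    // Strategy 2: round-robin over the per-user rankings, skipping duplicates.
    static List<String> interleave(List<List<Scored>> perUser) {
        int longest = 0;
        for (List<Scored> list : perUser) {
            longest = Math.max(longest, list.size());
        }
        Set<String> merged = new LinkedHashSet<>();
        for (int rank = 0; rank < longest; rank++) {
            for (List<Scored> list : perUser) {
                if (rank < list.size()) {
                    merged.add(list.get(rank).item);
                }
            }
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        List<Scored> userA = List.of(new Scored("a", 4.8), new Scored("b", 4.1), new Scored("c", 3.0));
        List<Scored> userB = List.of(new Scored("d", 3.5), new Scored("e", 3.2), new Scored("b", 2.0));
        List<List<Scored>> group = List.of(userA, userB);
        System.out.println(rankByScore(group)); // [a, b, d, e, c]
        System.out.println(interleave(group));  // [a, d, b, e, c]
    }
}
```

Ranking on absolute scores lets a user with generally high estimates dominate the list, while interleaving gives every group member a turn regardless of score scale.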
Logistic regression is a good place to start.
The Mahout implementation stands alone without Hadoop. Look for
OnlineLogisticRegression.
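
To show what "online" logistic regression means in practice, here is a minimal sketch in plain Java that updates its weights one example at a time with stochastic gradient descent. It is not Mahout's OnlineLogisticRegression and does not reproduce its API; the class name TinyOnlineLogisticRegression, the learning rate, and the toy word-count features are all assumptions chosen for illustration.

```java
// Hypothetical illustration; this is not Mahout's OnlineLogisticRegression.
class TinyOnlineLogisticRegression {
    private final double[] weights;     // last slot holds the bias term
    private final double learningRate;

    TinyOnlineLogisticRegression(int numFeatures, double learningRate) {
        this.weights = new double[numFeatures + 1];
        this.learningRate = learningRate;
    }

    // Estimated probability that the example belongs to class 1.
    double classify(double[] features) {
        double z = weights[features.length]; // start from the bias
        for (int i = 0; i < features.length; i++) {
            z += weights[i] * features[i];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // One stochastic gradient step on a single labeled example (label is 0 or 1).
    void train(int label, double[] features) {
        double error = label - classify(features);
        for (int i = 0; i < features.length; i++) {
            weights[i] += learningRate * error * features[i];
        }
        weights[features.length] += learningRate * error;
    }

    public static void main(String[] args) {
        // Toy features: [count of "offer", count of "meeting"]; label 1 means spam.
        double[][] x = {{3, 0}, {2, 0}, {0, 2}, {0, 3}};
        int[] y = {1, 1, 0, 0};
        TinyOnlineLogisticRegression model = new TinyOnlineLogisticRegression(2, 0.1);
        for (int pass = 0; pass < 200; pass++) {
            for (int i = 0; i < x.length; i++) {
                model.train(y[i], x[i]);
            }
        }
        System.out.println(model.classify(new double[] {4, 0}) > 0.5); // true
        System.out.println(model.classify(new double[] {0, 4}) < 0.5); // true
    }
}
```

Because each update touches only one example, this style of training needs no cluster and very little memory, which is why it suits a lightweight email classifier.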
On Mon, Jan 14, 2013 at 10:23 PM, VIGNESH S vigneshkln...@gmail.com wrote:
Hi,
I am looking for a lightweight library for email classification.
can
Hi Ted,
Thanks ..
On Thu, Jan 17, 2013 at 12:41 AM, Ted Dunning ted.dunn...@gmail.com wrote:
Logistic regression is a good place to start.
The Mahout implementation stands alone without Hadoop. Look for
OnlineLogisticRegression.
On Mon, Jan 14, 2013 at 10:23 PM, VIGNESH S