This is great news and will automatically boost the performance of all
our ALS-based recommenders, which all use QRDecomposition internally.
On 28.01.2013 04:02, Ted Dunning wrote:
Did that.
You are right. The QRD in Mahout is abysmally slow. I wrote a new version
on the airplane that
Is it worth simply using the Commons Math implementation?
On Mon, Jan 28, 2013 at 8:04 AM, Sebastian Schelter s...@apache.org wrote:
This is great news and will automatically boost the performance of all
our ALS-based recommenders, which all use QRDecomposition internally.
On 28.01.2013
Hi,
I would like to consolidate again all the steps that I performed.
Issue: the MatrixMultiplication example is executed with only 1 map task.
Steps:
1. I created a 104 MB file, which is divided into 11 blocks of
10 MB each. The file contains a 200x10 matrix.
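The block count in step 1 can be checked with a rough sketch of how Hadoop's FileInputFormat is commonly described as sizing input splits for a splittable file. The function name `estimate_splits`, its parameters, and the 1.1 slop factor are assumptions made for illustration and do not come from this thread; actual behaviour depends on the Hadoop version, the input format, and the cluster configuration.

```python
def estimate_splits(file_size, block_size, min_size=1, max_size=2**63 - 1, slop=1.1):
    """Approximate the number of input splits for one splittable file.

    Assumption: split size = max(min_size, min(max_size, block_size)),
    and a trailing remainder smaller than `slop` splits is kept whole.
    """
    split_size = max(min_size, min(max_size, block_size))
    remaining = file_size
    count = 0
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > slop:
        count += 1
        remaining -= split_size
    # Whatever is left over becomes the final split.
    if remaining > 0:
        count += 1
    return count


MB = 1024 * 1024
print(estimate_splits(104 * MB, 10 * MB))                     # 11
print(estimate_splits(104 * MB, 10 * MB, min_size=200 * MB))  # 1
```

Under these assumptions a 104 MB file with 10 MB blocks yields 11 splits, while a minimum split size larger than the file collapses it to a single split. That is only one possible way to end up with one map task, not a diagnosis of this job.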
These are settings to Hadoop, not Mahout. You may need to set them in
your cluster config. They are still only suggestions.
The question still remains why you think you need several mappers. Why?
On Mon, Jan 28, 2013 at 1:28 PM, Stuti Awasthi stutiawas...@hcl.com wrote:
Hi,
I would like to
I faced this problem too.
Split the sequence file that holds your data into multiple files, then
run the matrix multiplication with the folder as input.
If the folder contains N sequence files, N mappers will be created.
On Monday, 28 January 2013, Sean Owen wrote:
These are settings to
A wrapper is needed then, because Commons Math takes its input and returns
its output in a different data structure.
On Mon, Jan 28, 2013 at 3:14 AM, Sean Owen sro...@gmail.com wrote:
Is it worth simply using the Commons Math implementation?
On Mon, Jan 28, 2013 at 8:04 AM, Sebastian Schelter s...@apache.org
Yeah... having to copy the matrix is a pain in the butt.
On Mon, Jan 28, 2013 at 8:13 AM, Ying Liao yliao...@gmail.com wrote:
A wrapper is needed then, because Commons Math takes its input and returns
its output in a different data structure.
On Mon, Jan 28, 2013 at 3:14 AM, Sean Owen sro...@gmail.com wrote:
Any thoughts on this?
On Sat, Jan 26, 2013 at 10:55 AM, Zia mel ziad.kame...@gmail.com wrote:
OK, for precision, when we reduce the sample size to 0.1 or 0.05,
would the results be related to those we get with all the data? For
example, if we have data1 and data2 and test them using 0.1
Impossible to say. More data means a more reliable estimate, all else equal.
That's about it.
On Jan 28, 2013 5:17 PM, Zia mel ziad.kame...@gmail.com wrote:
Any thoughts on this?
On Sat, Jan 26, 2013 at 10:55 AM, Zia mel ziad.kame...@gmail.com wrote:
OK, for precision, when we reduce the
What about running several tests on small data? Can't that give an
indication of how the big data will perform?
Thanks
On Mon, Jan 28, 2013 at 11:19 AM, Sean Owen sro...@gmail.com wrote:
Impossible to say. More data means a more reliable estimate, all else equal.
That's about it.
On Jan 28, 2013
Yes, several independent samples of all the data will, together, give
you a better estimate of the real metric value than any individual
one.
On Mon, Jan 28, 2013 at 5:41 PM, Zia mel ziad.kame...@gmail.com wrote:
What about running several tests on small data? Can't that give an
indication of
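The point about independent samples can be illustrated with a small simulation. This is only a sketch under stated assumptions: the per-user scores are random placeholder values rather than output from any Mahout evaluator, and the helper names `sample_estimate` and `averaged_estimate` are hypothetical. It compares the spread of an estimate taken from one 5% sample with the spread of the average over ten independent 5% samples.

```python
import random
import statistics


def sample_estimate(population, fraction, rng):
    """Mean of one random subsample covering `fraction` of the population."""
    size = max(1, int(len(population) * fraction))
    return statistics.mean(rng.sample(population, size))


def averaged_estimate(population, fraction, runs, rng):
    """Average of the estimates from `runs` independently drawn subsamples."""
    return statistics.mean(
        sample_estimate(population, fraction, rng) for _ in range(runs)
    )


rng = random.Random(42)
# Hypothetical per-user precision scores; real values would come from an evaluator.
population = [rng.random() for _ in range(2000)]

# Repeat each procedure many times to see how much its result varies.
single = [sample_estimate(population, 0.05, rng) for _ in range(200)]
averaged = [averaged_estimate(population, 0.05, 10, rng) for _ in range(200)]

print("value on all data:      ", round(statistics.mean(population), 4))
print("spread of single runs:  ", round(statistics.stdev(single), 4))
print("spread of averaged runs:", round(statistics.stdev(averaged), 4))
```

In this toy setup the averaged runs cluster more tightly around the value computed on all the data than the single runs do. Whether that carries over to a real recommender evaluation depends on the data and the metric.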
Sorry, that was sent by accident.
But as I was saying,
I was looking at this link:
https://cwiki.apache.org/confluence/display/MAHOUT/Top+Down+Clustering
but it's not very clear how to perform hierarchical clustering.
Also, in the end.. I would also want to get the ids where each of the
cluster center