Re: Matrix multiplication in Hadoop

2011-11-19 Thread Lance Norskog
Look for uses of the DistributedRowMatrix in the Mahout code. The existing Mahout jobs are generally end-to-end algorithm implementations which do things like matrix multiplication in the middle. Also, the Mahout algorithms generally prefer to use sparse data for distributed work. What is a "large

Re: Matrix multiplication in Hadoop

2011-11-19 Thread Stephen Boesch
Hi, there are two solutions suggested that take advantage of either (a) a vector x matrix (your CF / Mahout example) or (b) a small matrix x large matrix (an earlier suggestion of putting the small matrix into the Distributed Cache). Not clear yet on good approaches for (c) large matrix x la
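
A minimal sketch of option (b), assuming the small matrix A fits in memory on every mapper (which is what shipping it through the Distributed Cache achieves) and the large matrix B arrives one row at a time. It runs locally in plain Python rather than on Hadoop; the function names map_row, reduce_rows and multiply are hypothetical and only illustrate the map and reduce steps.

```python
from collections import defaultdict

def map_row(small, j, row):
    """Mapper: receives row j of the large matrix B; the small matrix A is in memory.

    Row j of B contributes A[i][j] * B[j] to row i of the product C = A x B.
    """
    for i, a_row in enumerate(small):
        a_ij = a_row[j]
        if a_ij != 0:  # skip zero entries, they contribute nothing
            yield i, [a_ij * b for b in row]

def reduce_rows(partials):
    """Reducer: sum all partial rows that share the same output row index."""
    return [sum(col) for col in zip(*partials)]

def multiply(small, large_rows):
    """Local stand-in for the shuffle: group mapper output by key, then reduce."""
    grouped = defaultdict(list)
    for j, row in large_rows:
        for i, partial in map_row(small, j, row):
            grouped[i].append(partial)
    return {i: reduce_rows(parts) for i, parts in sorted(grouped.items())}

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]              # small matrix, held in memory
    B = [[5, 6], [7, 8]]              # large matrix, streamed row by row
    print(multiply(A, enumerate(B)))  # {0: [19, 22], 1: [43, 50]}
```

Each mapper only needs A plus the single row of B it is handed, so B never has to fit in memory. This does not cover case (c), where neither matrix is small.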

Re: Matrix multiplication in Hadoop

2011-11-19 Thread bejoy . hadoop
Hey Mike, in Mahout one place where matrix multiplication is used is the distributed implementation of Collaborative Filtering. The recommendations here are generated by multiplying a co-occurrence matrix with a user vector. This user vector is treated as a single column matrix a
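
To illustrate the idea described above, here is a small in-memory sketch of multiplying an item co-occurrence matrix by one user's preference vector. It is not Mahout's implementation; the function names cooccurrence and recommend and the sample data are hypothetical.

```python
from collections import defaultdict

def cooccurrence(user_items):
    """Count how often each pair of distinct items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        for a in items:
            for b in items:
                if a != b:
                    counts[a][b] += 1
    return counts

def recommend(counts, prefs):
    """Multiply the co-occurrence matrix by a sparse user vector (item -> weight)."""
    scores = defaultdict(float)
    for item, weight in prefs.items():             # non-zero entries of the user vector
        for other, c in counts.get(item, {}).items():
            scores[other] += c * weight            # accumulate one matrix column, scaled
    # Items the user already has are not recommendations.
    return {i: s for i, s in scores.items() if i not in prefs}

if __name__ == "__main__":
    history = {"u1": ["a", "b"], "u2": ["a", "b", "c"], "u3": ["b", "c"]}
    print(recommend(cooccurrence(history), {"a": 1.0}))  # {'b': 2.0, 'c': 1.0}
```

Because the user vector is sparse, only the matrix columns for items the user has interacted with are touched, which is one reason this vector x matrix case distributes well.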

new LAB VM online

2011-11-19 Thread Alexander C.H. Lorenz
Hi, I created a new testing environment as a VirtualBox image. It contains 4 servers with CDH3u2, HBase, Hive, Stargate, and Sqoop. I use them for testing; I don't know if anyone else will use them too. The image is around 4 GB and will deploy 4 servers with 40 GB HDD. I wrote a page about it on my blog. I think for n

Re: Matrix multiplication in Hadoop

2011-11-19 Thread Michel Segel
You really don't need to wait... If you're going to go down this path you can use a JNI wrapper to call the C/C++ code for the GPU... You can do that now... If you want to go beyond 1D you can do it, but you have to get a bit creative... but it's doable... Sent from a remote device. Please e

Re: Matrix multiplication in Hadoop

2011-11-19 Thread Tommaso Teofili
I agree Hama (and BSP model) could be a good option, plus Hama also supports MR nextgen now [1]. I know MM has been implemented with Hama in the past so it may be worth asking on the mailing list. My 2 cents, Tommaso [1] : http://svn.apache.org/repos/asf/incubator/hama/trunk/yarn/ 2011/11/19 He

Re: Matrix multiplication in Hadoop

2011-11-19 Thread He Chen
Right, I agree with Edward Capriolo, Hadoop + GPGPU is a better choice. On Sat, Nov 19, 2011 at 10:53 AM, Edward Capriolo wrote: > Sounds like a job for next gen map reduce native libraries and gpu's. A > modern day Dr frankenstein for sure. > > On Saturday, November 19, 2011, Tim Broberg wrot

Re: Matrix multiplication in Hadoop

2011-11-19 Thread He Chen
Did you try Hama? There are many methods: 1) use Hadoop MPI, which allows you to use MPI MM code based on Hadoop; 2) Hama is designed for MM; 3) use pure Hadoop Java MapReduce. I did this before, but it may not be the optimal algorithm. Put your first matrix in DistributedCache and take second matrix line a

Re: Matrix multiplication in Hadoop

2011-11-19 Thread Edward Capriolo
Sounds like a job for next gen map reduce native libraries and gpu's. A modern day Dr frankenstein for sure. On Saturday, November 19, 2011, Tim Broberg wrote: > Perhaps this is a good candidate for a native library, then? > > > From: Mike Davis [xmikeda..

RE: Matrix multiplication in Hadoop

2011-11-19 Thread Tim Broberg
Perhaps this is a good candidate for a native library, then? From: Mike Davis [xmikeda...@gmail.com] Sent: Friday, November 18, 2011 7:39 PM To: common-user@hadoop.apache.org Subject: Re: Matrix multiplication in Hadoop On Friday, November 18, 2011, Mike S

Re: Announcing Bangalore Hadoop Meetup Group

2011-11-19 Thread real great..
Very Good effort..thanks a lot.:) On Sat, Nov 19, 2011 at 3:51 PM, Rajesh Balamohan < rajesh.balamo...@gmail.com> wrote: > Excellent effort Sharad, > > Please do let us know the event timing. > > ~Rajesh.B > > On Thu, Nov 17, 2011 at 6:31 PM, Sharad Agarwal >wrote: > > > Hi Bangalore Area Hadoop

Re: Announcing Bangalore Hadoop Meetup Group

2011-11-19 Thread Rajesh Balamohan
Excellent effort Sharad, Please do let us know the event timing. ~Rajesh.B On Thu, Nov 17, 2011 at 6:31 PM, Sharad Agarwal wrote: > Hi Bangalore Area Hadoop Developers and Users, > > There is a lot of interest in Hadoop and Big Data space in Bangalore. Many > folks have been asking for Bangalor

Re: How many files can a hdfs client can access simultaneously

2011-11-19 Thread Harsh J
Oops, s/you're/you. I rephrased half-way. On 19-Nov-2011, at 2:02 PM, kartheek muthyala wrote: > Thanks harsh for the confirmation and for referring me to this dl. :) > > On Sat, Nov 19, 2011 at 12:20 PM, Harsh J wrote: > >> Kartheek, >> >> (Moving to hdfs-user@. Bcc'd common-user@. Please us

Re: How many files can a hdfs client can access simultaneously

2011-11-19 Thread kartheek muthyala
Thanks harsh for the confirmation and for referring me to this dl. :) On Sat, Nov 19, 2011 at 12:20 PM, Harsh J wrote: > Kartheek, > > (Moving to hdfs-user@. Bcc'd common-user@. Please use the right lists for > reaching to the proper audience :)) > > Writing multiple files from a single DFS clie