Writing a simple sort application for Hadoop

2010-02-28 Thread aa225
Hello, I am trying to write a simple sorting application for Hadoop. This is what I have come up with so far. Suppose I have 100 lines of data and 10 mappers; each of the 10 mappers will sort the data given to it. But what I am unable to figure out is how to join these outputs into one big sorted
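
The question above, how to combine several individually sorted outputs into one sorted result, is a k-way merge. What follows is a minimal plain-Python sketch of that idea, not Hadoop code: `sort_chunks` and `merge_sorted_chunks` are hypothetical helper names, and the "mappers" are simulated by sorting list slices. In Hadoop itself the framework already sorts keys between the map and reduce phases, so a job with a single reducer yields one fully sorted output without a hand-written merge.

```python
import heapq


def sort_chunks(lines, num_chunks):
    """Split lines into contiguous chunks and sort each one.

    Each sorted chunk stands in for the output of one mapper
    (an assumption made for illustration only).
    """
    size = max(1, -(-len(lines) // num_chunks))  # ceiling division, at least 1
    return [sorted(lines[i:i + size]) for i in range(0, len(lines), size)]


def merge_sorted_chunks(chunks):
    """Merge already-sorted chunks into one sorted list (k-way merge)."""
    return list(heapq.merge(*chunks))
```

With 100 lines and 10 chunks, `merge_sorted_chunks(sort_chunks(lines, 10))` returns the same list as `sorted(lines)`, but it only ever compares the current head of each chunk.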

Hadoop: Divide and Conquer Algorithms

2010-02-28 Thread aa225
Hello Everybody, I have a small question. I want to know how one would implement divide and conquer algorithms in Hadoop. For example, suppose I want to implement a merge sort of 100 lines in Hadoop. There will be 10 mappers, each sorting 10 lines. Now comes the tough part. In the

Re: Re: Writing a simple sort application for Hadoop

2010-02-28 Thread aa225
Hi, Is there any way we can chain the reducers? That is, initially the reducers work on some data. The output of these reducers is then sent to the same reducers again, and so on, similar to how the conquer step takes place in divide and conquer algorithms. I hope you got what I am trying to
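
One way to picture the chaining asked about above is as repeated rounds, where each round's output becomes the next round's input. The sketch below simulates that in plain Python for the merge-sort case; `merge_round` and `merge_until_one` are hypothetical names, and treating each round as one separately submitted job driven by a loop is an assumption for illustration, not a description of a built-in Hadoop feature.

```python
import heapq


def merge_round(runs):
    """One 'conquer' round: merge neighbouring sorted runs pairwise.

    An odd run left over at the end is carried forward unchanged.
    """
    merged = []
    for i in range(0, len(runs), 2):
        pair = runs[i:i + 2]
        merged.append(list(heapq.merge(*pair)))
    return merged


def merge_until_one(runs):
    """Repeat merge rounds until a single sorted run remains.

    Returns the final run and the number of rounds that were needed.
    """
    rounds = 0
    while len(runs) > 1:
        runs = merge_round(runs)  # output of this round feeds the next one
        rounds += 1
    return (runs[0] if runs else []), rounds
```

Because every round roughly halves the number of runs, 10 sorted runs need 4 rounds.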

Some information on Hadoop Sort

2010-02-19 Thread aa225
Hello, I was wondering if someone could give me some information on how Hadoop does the sorting. From what I have read, there does not seem to be a map class or a reduce class. Where and how is the sorting parallelized? Best Regards from Buffalo Abhishek Agrawal SUNY- Buffalo (716-435-7122)
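
A common way to parallelize a sort in a MapReduce setting is range partitioning: keys are routed to reducers by range, each reducer sorts only its own range, and the reducer outputs concatenated in order form a total order. The plain-Python sketch below illustrates that technique under stated assumptions; it is not the source of Hadoop's Sort example, and `choose_split_points`, `partition`, and `total_order_sort` are hypothetical names. The split points are taken from a random sample of the keys, which is an illustrative choice.

```python
import bisect
import random


def choose_split_points(keys, num_reducers, sample_size=50, seed=0):
    """Pick num_reducers - 1 boundary keys from a sorted random sample."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(keys, min(sample_size, len(keys))))
    if not sample:
        return []
    step = len(sample) / num_reducers
    return [sample[int(step * i)] for i in range(1, num_reducers)]


def partition(keys, split_points):
    """Route each key to the bucket (reducer) that owns its key range."""
    buckets = [[] for _ in range(len(split_points) + 1)]
    for key in keys:
        buckets[bisect.bisect_right(split_points, key)].append(key)
    return buckets


def total_order_sort(keys, num_reducers):
    """Sort each bucket independently, then concatenate the buckets in order."""
    buckets = partition(keys, choose_split_points(keys, num_reducers))
    output = []
    for bucket in buckets:
        output.extend(sorted(bucket))  # each bucket could be sorted in parallel
    return output
```

No merge step is needed at the end, since every key in one bucket is less than or equal to every key in the next.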

Re: Re: Inverse of a matrix using Map - Reduce

2010-02-03 Thread aa225
Hi, Any idea how this method will scale for dense matrices? The kind of matrices I am going to be working with are 500,000*500,000. Will this be a problem? Also, have you used this patch? Best Regards from Buffalo Abhishek Agrawal SUNY- Buffalo (716-435-7122) On Wed 02/03/10 1:41 AM ,

Inverse of a matrix using Map - Reduce

2010-02-02 Thread aa225
Hello People, My name is Abhishek Agrawal. For the last few days I have been trying to figure out how to calculate the inverse of a matrix using MapReduce. Matrix inversion has 2 common approaches: Gauss-Jordan and the transpose-of-cofactor method. But both of them don't seem to

Eclipse Plugin for Hadoop

2010-01-16 Thread aa225
Hi all, I was just looking around and I stumbled across the Eclipse plugin for Hadoop. Have any of you used this plugin? Any thoughts on it? Best Regards from Buffalo Abhishek Agrawal SUNY- Buffalo (716-435-7122)

Re: Re: Re: Re: Doubt in Hadoop

2009-11-29 Thread aa225
Hi, Actually, I just made the change suggested by Aaron and my code worked. But I would still like to know why the setJarByClass() method has to be called when the Main class and the Map and Reduce classes are in the same package. Thank You Abhishek Agrawal SUNY- Buffalo

Object Serialization

2009-11-29 Thread aa225
Hello Everybody, I have a question about object serialization in Hadoop. I have an object A which I want to pass to every map function. The code I am currently using for this is given below. The problem is that when I run my program, the code crashes the first time with an error saying that

Re: please help in setting hadoop

2009-11-26 Thread aa225
Hi, Just a thought, but you do not need to set up the temp directory in conf/hadoop-site.xml, especially if you are running basic examples. Give that a shot; maybe it will work out. Otherwise, see if you can find additional info in the logs. Thank You Abhishek Agrawal SUNY- Buffalo

Re: RE: please help in setting hadoop

2009-11-26 Thread aa225
Hi, There should be a folder called logs in $HADOOP_HOME. Also try going through http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29. This is a pretty good tutorial. Abhishek Agrawal SUNY- Buffalo (716-435-7122) On Fri 11/27/09 1:18 AM , Krishna

Re: Re: Doubt in Hadoop

2009-11-26 Thread aa225
Hi, I am running the job from the command line. The job runs fine in local mode, but something goes wrong when I try to run it in distributed mode. Abhishek Agrawal SUNY- Buffalo (716-435-7122) On Fri 11/27/09 2:31 AM , Jeff Zhang zjf...@gmail.com sent: Do you run the map reduce job

Help in Hadoop

2009-11-22 Thread aa225
Hello Everybody, I have a question about a map-reduce program and I would appreciate any help. I run the program using the command bin/hadoop jar HomeWork.jar prg1 input output. Ideally, from within prg1, I want to sequentially launch 10 map-reduce tasks. I want to store the output of

Re: Re: Help in Hadoop

2009-11-22 Thread aa225
Hello, If I write the output of the 10 tasks to 10 different files, how do I go about merging the output? Is there some built-in functionality, or do I have to write some code for that? Thank You Abhishek Agrawal SUNY- Buffalo (716-435-7122) On Sun 11/22/09 5:40 PM , Gang Luo

Re: Re: Re: Re: Help in Hadoop

2009-11-22 Thread aa225
I am still getting the same exception. This is the stack trace of it. java.io.IOException: Not a file: hdfs://zeus:18004/user/hadoop/output6/MatrixA-Row1 at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:195) at

Re: Re: Using hadoop for Matrix Multiplication in NFS?

2009-11-13 Thread aa225
Hi, I do not know if this will be helpful or not, but I also wanted to use Hadoop to do matrix multiplication. I came across a package called Hama, which uses map-reduce programs to multiply 2 matrices. To store the 2 matrices it uses HBase. You could give that a shot. Thank You Abhishek
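
To make the idea of multiplying matrices with map and reduce steps concrete, here is a simplified plain-Python simulation. It is a generic sketch, not Hama's implementation and not Hadoop code: `map_phase`, `reduce_phase`, and `multiply` are hypothetical names, both matrices are assumed to fit in memory as lists of rows, and the join of matching elements that a distributed job would need is skipped. The map step emits one partial product per output cell, and the reduce step sums the partial products that share a cell.

```python
from collections import defaultdict


def map_phase(a, b):
    """Emit ((i, j), a[i][k] * b[k][j]) for every contributing pair of elements."""
    rows, inner, cols = len(a), len(b), len(b[0])
    for i in range(rows):
        for k in range(inner):
            for j in range(cols):
                yield (i, j), a[i][k] * b[k][j]


def reduce_phase(pairs):
    """Sum all partial products that share the same output cell (i, j)."""
    sums = defaultdict(int)
    for key, value in pairs:
        sums[key] += value
    return sums


def multiply(a, b):
    """Run the simulated map and reduce steps and rebuild the result matrix."""
    sums = reduce_phase(map_phase(a, b))
    return [[sums[(i, j)] for j in range(len(b[0]))] for i in range(len(a))]
```

For example, `multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])` gives `[[19, 22], [43, 50]]`.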