Hi Joey,

I added a new user and ran start-all.sh, and it works now, but when I tried to run the wordcount example, it gave me this:
11/12/26 11:52:51 INFO input.FileInputFormat: Total input paths to process : 1
11/12/26 11:53:01 INFO mapred.JobClient: Running job: job_201112261118_0002
11/12/26 11:53:03 INFO mapred.JobClient: map 0% reduce 0%
11/12/26 11:56:46 INFO mapred.JobClient: map 100% reduce 0%
11/12/26 12:11:10 INFO mapred.JobClient: Task Id : attempt_201112261118_0002_r_000000_0, Status : FAILED
Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
11/12/26 12:11:46 WARN mapred.JobClient: Error reading task outputConnection timed out
11/12/26 12:12:07 WARN mapred.JobClient: Error reading task outputConnection timed out

Would you be so kind as to tell me how to fix this? Thanks.

On Mon, Dec 26, 2011 at 10:31 AM, Martinus Martinus <martinus...@gmail.com> wrote:
> Hi Joey,
>
> Can you give more explanation about that? Do you mean I should make a new
> user and a new group, or just set up ssh?
>
> Thanks.
>
> On Mon, Dec 26, 2011 at 3:57 AM, Joey Echeverria <j...@cloudera.com> wrote:
>> Don't start your daemons as root. They should be started as a system
>> account. Typically hdfs for the HDFS services and mapred for the
>> MapReduce ones.
>>
>> -Joey
>>
>> On Fri, Dec 23, 2011 at 4:04 AM, Martinus Martinus <martinus...@gmail.com> wrote:
>> > Hi Ayon,
>> >
>> > I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems
>> > to be OK, but when I tried to use another version of Hadoop, such as
>> > hadoop-0.20.3, running start-all.sh gave me an error like this:
>> >
>> > uvm12dk: Unrecognized option: -jvm
>> > uvm12dk: Could not create the Java virtual machine.
>> >
>> > Would you be so kind as to help me with this problem?
>> >
>> > Thanks.
>> >
>> > Martinus
>> >
>> > On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:
>> >> Couple of things:
>> >> 1. Hadoop's strength is in data locality, so keep most of your heavy
>> >> lifting in HDFS, where the computation is shipped to the nodes that
>> >> hold the data.
>> >> 2. Assuming that pulling data from Mongo into Hadoop to crunch it, and
>> >> putting the results back into Mongo, are only the first and last steps
>> >> in your entire workflow, you are basically looking for a MongoInputFormat
>> >> and a MongoOutputFormat (I made up the class names). You are probably
>> >> looking for https://jira.mongodb.org/browse/HADOOP/component/10736
>> >>
>> >> Your other option, if you are using Pig or Hive, is to write loader
>> >> UDFs similar to PigStorage, HBaseStorage, etc.
>> >>
>> >> -Ayon
>> >> See My Photos on Flickr
>> >> Also check out my Blog for answers to commonly asked questions.
>> >>
>> >> ________________________________
>> >> From: Martinus Martinus <martinus...@gmail.com>
>> >> To: hdfs-user@hadoop.apache.org
>> >> Sent: Tuesday, December 20, 2011 7:31 PM
>> >> Subject: hadoop cluster for querying data on mongodb
>> >>
>> >> Hi,
>> >>
>> >> I have a Hadoop cluster running and my data is inside a MongoDB database.
>> >> I have already written Java code to query the data in MongoDB using the
>> >> mongodb-java driver. Now I want to use the Hadoop cluster to run my Java
>> >> code to get data from, and put data into, the Mongo database. Has anyone
>> >> done this before? Can you explain to me how to do it?
>> >>
>> >> Thanks.
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
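
To make Ayon's suggestion concrete, below is a minimal sketch of a word-count-style job driver wired to Mongo input and output formats. The MongoConfigUtil, MongoInputFormat, and MongoOutputFormat names are assumptions modeled on the mongo-hadoop connector linked above (Ayon made his class names up), the key/value types are guesses, and the "text" document field is hypothetical, so check the connector's actual API before relying on this.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.bson.BSONObject;

public class MongoWordCount {

  // Assumption: the connector's input format hands each Mongo document
  // to the mapper as a BSONObject value.
  public static class TokenMapper
      extends Mapper<Object, BSONObject, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, BSONObject doc, Context context)
        throws java.io.IOException, InterruptedException {
      // Hypothetical field name "text"; use whatever field your
      // documents actually carry.
      for (String token : doc.get("text").toString().split("\\s+")) {
        word.set(token);
        context.write(word, ONE);
      }
    }
  }

  public static class SumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values,
        Context context) throws java.io.IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      // Assumption: the output format turns each key/value pair into a
      // document in the output collection.
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed connector helpers for pointing the job at Mongo collections;
    // the real connector may configure this differently.
    // MongoConfigUtil.setInputURI(conf, "mongodb://localhost/test.in");
    // MongoConfigUtil.setOutputURI(conf, "mongodb://localhost/test.out");

    Job job = new Job(conf, "mongo wordcount");
    job.setJarByClass(MongoWordCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // job.setInputFormatClass(MongoInputFormat.class);   // assumed class
    // job.setOutputFormatClass(MongoOutputFormat.class); // assumed class
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Keeping Mongo at only the first and last steps, as Ayon suggests, leaves the intermediate shuffle and sort work on HDFS, where data locality applies.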