Hi Ayon,

I tried to set up the Hadoop cluster using hadoop-0.20.2 and it seems to be OK, but when I tried to use another version of Hadoop, such as hadoop-0.20.3, running start-all.sh gives me an error like this:
uvm12dk: Unrecognized option: -jvm
uvm12dk: Could not create the Java virtual machine.

Would you be so kind as to help me with this problem?

Thanks.

Martinus

On Wed, Dec 21, 2011 at 1:12 PM, Ayon Sinha <ayonsi...@yahoo.com> wrote:

> Couple of things:
> 1. Hadoop's strength is in data locality, so keep most of your Hadoop
> heavy lifting on the local filesystem (HDFS, where Hadoop computation is
> shipped to the nodes with the data).
> 2. Assuming you are pulling data into Hadoop from Mongo to crunch and put
> the resulting data back into Mongo as only the first and the last step in
> your entire workflow, you are basically looking for a MongoInputFormat and
> MongoOutputFormat (I made up the class names). You are probably looking for
> https://jira.mongodb.org/browse/HADOOP/component/10736
>
> Your other option, if using Pig or Hive, is to write Loader UDFs, similar
> to PigStorage, HBaseStorage, etc.
>
> -Ayon
> See My Photos on Flickr <http://www.flickr.com/photos/ayonsinha/>
> Also check out my Blog for answers to commonly asked
> questions.<http://dailyadvisor.blogspot.com>
>
> ------------------------------
> *From:* Martinus Martinus <martinus...@gmail.com>
> *To:* hdfs-user@hadoop.apache.org
> *Sent:* Tuesday, December 20, 2011 7:31 PM
> *Subject:* hadoop cluster for querying data on mongodb
>
> Hi,
>
> I have a Hadoop cluster running and have my data inside a MongoDB database. I
> have already written Java code to query data on MongoDB using the mongodb-java
> driver. Right now, I want to use the Hadoop cluster to run my Java code to
> get and put the data from and to the Mongo database. Has anyone done this
> before? Can you explain to me how to do that?
>
> Thanks.