Re: hadoop debugging tools

2013-08-27 Thread Shekhar Sharma
You can get the stats for a job using rumen. http://ksssblogs.blogspot.in/2013/06/getting-job-statistics-using-rumen.html Regards, Som Shekhar Sharma +91-8197243810 On Tue, Aug 27, 2013 at 10:54 AM, Gopi Krishna M mgo...@gmail.com wrote: Harsh: thanks for the quick response. we often see an

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Smith, Joshua D.
Harsh- Yes, I intend to use HA. That's what I'm trying to configure right now. Unfortunately I cannot share my complete configuration files. They're on a disconnected network. Are there any configuration items that you'd like me to post my settings for? The deployment is CDH 4.3 on a brand

Re: reg hadoop streaming

2013-08-27 Thread Shahab Yunus
Yes, I think so. The TaskTracker that launched the mapper and reducer in the child JVM (which in turn invoked the streaming process) can (and does) communicate with the JobTracker. Regards, Shahab On Tue, Aug 27, 2013 at 8:34 AM, Manoj Babu manoj...@gmail.com wrote: Team, Does streaming

Tutorials that work with modern Hadoop (v1.x.y)?

2013-08-27 Thread Andrew Pennebaker
There are a number of Hadoop tutorials and textbooks available, but they always seem to target older versions of Hadoop. Does anyone know of good tutorials that work with modern Hadoop versions (v1.x.y)?

client side I/O buffering

2013-08-27 Thread Lukas Kairies
Hello, I know that the client caches write requests before they are sent to a datanode and that the client uses read-ahead, but where exactly is this implemented, Hadoop or HDFS? Or, a better question: will write caching and read-ahead also be available when Hadoop uses another filesystem than HDFS?

MapReduce Tutorial tweak

2013-08-27 Thread Andrew Pennebaker
In https://hadoop.apache.org/docs/stable/mapred_tutorial.html#Source+Code, line 16 declares: private Text word = new Text(); ... But only lines 22 and 23 use this, and only to pass the value along to output: word.set(tokenizer.nextToken()); output.collect(word, one); Wouldn't this be better
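The pattern the tutorial uses is object reuse: Hadoop's Text is a mutable holder, and keeping one instance as a field avoids allocating a new object for every token across millions of map() calls. A self-contained sketch of the same pattern, using a hypothetical stand-in class since Hadoop's Text is not on every classpath:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReuseDemo {
    // Stand-in for org.apache.hadoop.io.Text: a mutable, reusable holder.
    static final class Holder {
        private String value = "";
        void set(String v) { value = v; }
        String get() { return value; }
    }

    // Reused across calls, like the tutorial's "private Text word" field.
    private final Holder word = new Holder();

    List<String> map(String line) {
        List<String> collected = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken()); // mutate in place, no new allocation
            collected.add(word.get());       // the collector serializes the current value
        }
        return collected;
    }
}
```

A local `new Text()` per token would behave the same but allocate once per word, which is what the field-reuse idiom is avoiding.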

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Smith, Joshua D.
Harsh- Here are all of the other values that I have configured. hdfs-site.xml - dfs.webhdfs.enabled true dfs.client.failover.proxy.provider.mycluster org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider dfs.ha.automatic-falover.enabled true

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Azuryy Yu
dfs.ha.namenodes.mycluster nn.domain,snn.domain it should be: dfs.ha.namenodes.mycluster nn1,nn2 On Aug 27, 2013 11:22 PM, Smith, Joshua D. joshua.sm...@gd-ais.com wrote: Harsh- Here are all of the other values that I have configured. hdfs-site.xml - dfs.webhdfs.enabled

Simplifying MapReduce API

2013-08-27 Thread Andrew Pennebaker
There seems to be an abundance of boilerplate patterns in MapReduce: * Write a class extending Map (1), implementing Mapper (2), with a map method (3) * Write a class extending Reduce (4), implementing Reducer (5), with a reduce method (6) Could we achieve the same behavior with a single Job
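In the old org.apache.hadoop.mapred API, Mapper and Reducer are interfaces, so a single class can already implement both and be registered for both roles. A sketch of that shape (assumes Hadoop 1.x jars on the classpath; this is illustrative, not code from the thread):

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// One class playing both roles; the driver sets it as mapper and reducer.
public class WordCountJob extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable>,
                   Reducer<Text, IntWritable, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        StringTokenizer tok = new StringTokenizer(value.toString());
        while (tok.hasMoreTokens()) {
            word.set(tok.nextToken());
            output.collect(word, ONE);
        }
    }

    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext()) sum += values.next().get();
        output.collect(key, new IntWritable(sum));
    }
}
// Driver: conf.setMapperClass(WordCountJob.class); conf.setReducerClass(WordCountJob.class);
```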

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Smith, Joshua D.
nn.domain is a placeholder for the actual fully qualified hostname of my NameNode. snn.domain is a placeholder for the actual fully qualified hostname of my StandbyNameNode. Of course both the NameNode and the StandbyNameNode are running exactly the same software with the same configuration

Jar issue

2013-08-27 Thread jamal sasha
Hi, For one of my map reduce jobs I want to use a different version of the slf4j jar (1.6.4). But I guess hadoop has a different version of the jar on the hadoop classpath: lib/slf4j-log4j12-1.4.3.jar. And when I try to run my code, I am getting this error: Exception in thread main

Re: Simplifying MapReduce API

2013-08-27 Thread Shahab Yunus
For starters (experts might have more complex reasons), what if your respective map and reduce logic becomes complex enough to demand separate classes? Why tie clients to implementing both by moving these into one Job interface? In the current design you can always implement both (map and reduce)

Re: Jar issue

2013-08-27 Thread Shahab Yunus
One idea is, you can use the exclusion property of maven (provided you are using that to build your application) while including hadoop dependencies: exclude the slf4j that comes within hadoop, and then include your own slf4j as a separate dependency. Something like this: dependency
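A sketch of the exclusion being described, as a pom.xml fragment (the artifact and version numbers are assumptions based on the thread, not confirmed):

```xml
<!-- Exclude the slf4j binding bundled with the Hadoop dependency,
     then declare the version the job needs (1.6.4, per the thread). -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>1.2.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>1.6.4</version>
</dependency>
```

Note this only controls the jars Maven packages; it does not remove the older slf4j already on the cluster's classpath.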

Re: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Jitendra Yadav
Hi, Please follow the HA configuration steps available at below link. http://hadoop.apache.org/docs/r2.1.0-beta/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html *dfs.ha.namenodes.[nameservice ID] - unique identifiers for each NameNode in the nameservice * *Configure with a list
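Putting the property the thread keeps circling back to in context, a minimal hdfs-site.xml sketch in the shape the linked QJM guide describes (nn.domain/snn.domain and port 8020 stand in for the actual hosts, as elsewhere in the thread; nn1/nn2 are logical IDs, not hostnames):

```xml
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>nn.domain:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>snn.domain:8020</value>
</property>
```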

Re: MapReduce Tutorial tweak

2013-08-27 Thread Shahab Yunus
As far as I understand, StringTokenizer.nextToken returns a Java String object, which does not implement the Writable and Comparable interfaces needed for Hadoop MapReduce serialization and transport. The Text class does, and is compatible, and that is why it is being used to

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Azuryy Yu
not yet. please correct it. On Aug 27, 2013 11:39 PM, Smith, Joshua D. joshua.sm...@gd-ais.com wrote: nn.domain is a place holder for the actual fully qualified hostname of my NameNode snn.domain is a place holder for the actual fully qualified hostname of my StandbyNameNode. **

Re: Jar issue

2013-08-27 Thread jamal sasha
I am right now using the libjars option. How do I do what you suggested via that route? On Tue, Aug 27, 2013 at 8:51 AM, Shahab Yunus shahab.yu...@gmail.comwrote: One idea is, you can use the exclusion property of maven (provided you are using that to build your application) while including

RE: HDFS Startup Failure due to dfs.namenode.rpc-address and Shared Edits Directory

2013-08-27 Thread Smith, Joshua D.
That fixed it. I was assuming that nn1 and nn2 were hostnames and not IDs. Once I replaced the value with nn1,nn2, everything started to make sense. Thank you Azurry and Jitendra. Much appreciated! Josh From: Jitendra Yadav [mailto:jeetuyadav200...@gmail.com] Sent: Tuesday, August 27, 2013

libjars error

2013-08-27 Thread jamal sasha
I have a bunch of jars which I want to pass. I am using the libjars option to do so. But to do that I have to implement Tool? So I changed my code to the following, but I am still getting this warning: 13/08/27 11:32:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications
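The warning goes away when the job entry point lets ToolRunner parse the generic options before the job sees its own arguments. A sketch of that skeleton (assumes Hadoop 1.x on the classpath; MyJob and the paths are illustrative names, not the poster's actual code):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already carries the settings ToolRunner parsed
        // from -libjars / -files / -D before calling run().
        Job job = new Job(getConf(), "my job");
        job.setJarByClass(MyJob.class);
        // ... set mapper, reducer, input and output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips the generic options and passes the rest to run().
        System.exit(ToolRunner.run(new Configuration(), new MyJob(), args));
    }
}
```

Invoked as, for example: hadoop jar myjob.jar MyJob -libjars dep1.jar,dep2.jar /in /out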

Creation of symlink from failed

2013-08-27 Thread Pradeep Singh
Hi All, I am trying to run a simple streaming command as mentioned below. bin/hadoop jar /windows/Hadoop/hadoop-1.2.1/contrib/streaming/hadoop-streaming-1.2.1.jar -input /usr/pradeep/input/'Good words' -output /usr/pradeep/output -mapper /bin/cat -reducer /bin/wc -w I am getting below as message

Namenode joining error in HA configuration

2013-08-27 Thread orahad bigdata
Hi All, I'm new in Hadoop administration, Can someone please help me? Hadoop-version :- 2.0.5 alpha and using QJM I'm getting below error messages while starting Hadoop hdfs using 'start-dfs.sh' 2013-01-23 03:25:43,208 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Image file of size 121

Re: Simplifying MapReduce API

2013-08-27 Thread Don Nelson
I agree with @Shahab - it's simple enough to declare both interfaces in one class if that's what you want to do. But given the distributed behavior of Hadoop, it's likely that your mappers will be running on different nodes than your reducers anyway - why ship around duplicate code? On Tue, Aug

RE: Jar issue

2013-08-27 Thread java8964 java8964
I am not sure the original suggestion will work for your case. My understanding is that you want to use some API that only exists in slf4j version 1.6.4, but this library with a different version already exists in your hadoop environment, which is quite possible. To change the maven build of the

Re: Namenode joining error in HA configuration

2013-08-27 Thread Harsh J
What OS are you starting this on? Are you able to run the command df -k /tmp/hadoop-hadoop/dfs/name/ as user hadoop? On Wed, Aug 28, 2013 at 12:53 AM, orahad bigdata oracle...@gmail.com wrote: Hi All, I'm new in Hadoop administration, Can someone please help me? Hadoop-version :- 2.0.5

RE: Apache Hadoop cluster monitoring

2013-08-27 Thread Leo Leung
Yes ganglia Please see http://wiki.apache.org/hadoop/GangliaMetrics From: Viswanathan J [mailto:jayamviswanat...@gmail.com] Sent: Tuesday, August 27, 2013 7:45 PM To: user@hadoop.apache.org Subject: Apache Hadoop cluster monitoring Hi, What are the best monitoring tools for hadoop other than
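The wiki page linked above wires Hadoop metrics to Ganglia through hadoop-metrics.properties; a sketch for Hadoop 1.x with Ganglia 3.1 (the gmond host and port are assumptions for your setup):

```properties
# Send DFS, MapReduce and JVM metrics to a gmond every 10 seconds.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=gmond.example.com:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=gmond.example.com:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=gmond.example.com:8649
```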

Re: Apache Hadoop cluster monitoring

2013-08-27 Thread Jagat Singh
Would add Hannibal for an HBase bird's-eye view. On 28/08/2013 12:45 PM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi, What are the best monitoring tools for hadoop other than JT and NN default UIs. Please share the doc for configuring in production cluster. Any tools that can be

Re: Writing multiple tables from reducer

2013-08-27 Thread Ravi Kiran
I have written a blog on this a while ago where I was writing to multiple tables from my mapper class. You can look into it at http://bigdatabuzz.wordpress.com/2012/04/24/how-to-write-to-multiple-hbase-tables-in-a-mapreduce-job/ Key things are, a) job.setOutputFormatClass
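The mechanism the linked post relies on is MultiTableOutputFormat, where the output key names the destination table for each Put. A sketch of the shape (assumes HBase 0.94-era APIs on the classpath; table, family and column names are illustrative):

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MultiTableMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Put put = new Put(Bytes.toBytes(value.toString()));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("v"));
        // The output key selects which table receives this Put.
        context.write(new ImmutableBytesWritable(Bytes.toBytes("table1")), put);
        context.write(new ImmutableBytesWritable(Bytes.toBytes("table2")), put);
    }
}
// Driver: job.setOutputFormatClass(MultiTableOutputFormat.class);
```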

Re: MapReduce Tutorial tweak

2013-08-27 Thread Ravi Kiran
Also to add, the default serialization libraries supported are specified in core-default.xml under the property name io.serializations