Re: Hadoop cluster monitoring

2014-04-15 Thread Arun Murthy
Lots of folks use Apache Ambari (http://ambari.apache.org/) to deploy and monitor their Hadoop cluster. Ambari uses Ganglia/Nagios as the underlying technology and has a much better UI, etc. hth, Arun On Mon, Apr 14, 2014 at 9:08 PM, Shashidhar Rao raoshashidhar...@gmail.com wrote: Hi, Can somebody

Re: Hadoop cluster monitoring

2014-04-15 Thread Shashidhar Rao
Thanks Arun Murthy On Tue, Apr 15, 2014 at 11:32 AM, Arun Murthy a...@hortonworks.com wrote: Lots of folks use Apache Ambari (http://ambari.apache.org/) to deploy and monitor their Hadoop cluster. Ambari uses Ganglia/Nagios as the underlying technology and has a much better UI, etc. hth, Arun

Hadoop NoClassDefFoundError

2014-04-15 Thread laozh...@sina.cn
Hello everyone: I am new to Hadoop, and I am reading Hadoop in Action. When I tried to run a demo from this book, I got a problem and could not find an answer on the net. Can you help me with this? Below is the error info: $ hadoop jar myjob.jar MyJob input output Exception in thread main

Re: Offline image viewer - account for edits ?

2014-04-15 Thread Mingjiang Shi
I think you are right, because the offline image viewer only takes the fsimage file as input. On Tue, Apr 15, 2014 at 9:29 AM, Manoj Samel manojsamelt...@gmail.com wrote: Hi, Is it correct to say that the offline image viewer does not account for any edits that are not yet merged into

Re: Hadoop NoClassDefFoundError

2014-04-15 Thread Azuryy Yu
Please use: hadoop jar myjob.jar myjob.MyJob input output On Tue, Apr 15, 2014 at 3:06 PM, laozh...@sina.cn laozh...@sina.cn wrote: Hello everyone: I am new to Hadoop, and I am reading Hadoop in Action. When I tried to run a demo from this book, I got a problem and could not find an answer

About Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode

2014-04-15 Thread Anacristing
Hi, I'm trying to set up Hadoop (version 2.2.0) on Windows (32-bit) with Cygwin (version 1.7.5). I export JAVA_HOME=/cygdrive/c/Java/jdk1.7.0_51 in hadoop-env.sh and the classpath is /home/Administrator/hadoop-2.2.0/etc/hadoop:

Re: About Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode

2014-04-15 Thread Shengjun Xin
Try bin/hadoop classpath to check whether the classpath is what you set. On Tue, Apr 15, 2014 at 4:16 PM, Anacristing 99403...@qq.com wrote: Hi, I'm trying to set up Hadoop (version 2.2.0) on Windows (32-bit) with Cygwin (version 1.7.5). I export

hadoop eclipse plugin compile path

2014-04-15 Thread Alex Lee
Trying to use the below command to generate the Hadoop Eclipse plugin, but it seems the directory /usr/local/hadoop-2.2.0 is not correct. I just used Ambari to install Hadoop. $ ant jar -Dversion=2.2.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/local/hadoop-2.2.0 error log BUILD
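For what it's worth, Ambari-managed installs of that era usually live under /usr/lib/hadoop rather than /usr/local (that path is an assumption; verify with ls or which hadoop), so the invocation would look roughly like:
$ ant jar -Dversion=2.2.0 -Declipse.home=/usr/local/eclipse -Dhadoop.home=/usr/lib/hadoop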

Re: Setting debug log level for individual daemons

2014-04-15 Thread Gordon Wang
Put the following line in the log4j settings file. log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=DEBUG,console On Tue, Apr 15, 2014 at 8:33 AM, Ashwin Shankar ashwinshanka...@gmail.com wrote: Hi, How do we set the log level to debug for, let's say, only the ResourceManager and not the
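For context, a minimal log4j.properties sketch showing where that line sits (the console appender here is an assumption; reuse whatever appender the daemon's file already defines):
log4j.rootLogger=INFO,console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# only the ResourceManager package drops to DEBUG; everything else stays at INFO
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=DEBUG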

Re: Re: Hadoop NoClassDefFoundError

2014-04-15 Thread laozh...@sina.cn
Thank you for your advice. When I use your command, I get the below error info. $ hadoop jar myjob.jar myjob.MyJob input output Exception in thread main java.lang.ClassNotFoundException: myjob.MyJob at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at

memory java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread leiwang...@gmail.com
I can fix this by changing the heap size. But what confuses me is that when I change the reducer number from 24 to 84, this error does not occur. Any insight on this? Thanks, Lei Failed to merge in memory java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2786)

unsubscribe

2014-04-15 Thread Levin Ding
Pls unsubscribe me. Thx. On 2013-3-16, at 3:03 AM, kishore raju hadoop1...@gmail.com wrote: HI, We are having an issue where multiple Task Trackers are running out of memory. I have collected HeapDump on those TaskTrackers to analyze further. They are currently running with 1GB Heap. we are

Re: memory java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread Thomas Bentsen
When you increase the number of reducers they each have less to work with, provided the data is distributed evenly between them - in this case about one third of the original work. It is essentially the same thing as increasing the heap size - it's just distributed between more reducers. /th

Re: HDFS file system size issue

2014-04-15 Thread Saumitra Shahapure
Hi Rahman, These are a few lines from hadoop fsck / -blocks -files -locations /mnt/hadoop/hive/warehouse/user.db/table1/000255_0 44323326 bytes, 1 block(s): OK 0. blk_-7919979022650423857_446500 len=44323326 repl=3 [ip1:50010, ip2:50010, ip3:50010]

Re: Offline image viewer - account for edits ?

2014-04-15 Thread Akira AJISAKA
If you want to parse the edits, please use the Offline Edits Viewer. http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html Thanks, Akira (2014/04/15 16:41), Mingjiang Shi wrote: I think you are right, because the offline image viewer only takes the
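For reference, both viewers ship with the hdfs command; a rough invocation looks like this (the fsimage/edits file names are illustrative, of the form found under a typical dfs.namenode.name.dir/current directory):
$ hdfs oiv -i fsimage_0000000000000012345 -o fsimage.xml -p XML
$ hdfs oev -i edits_0000000000000012346-0000000000000012400 -o edits.xml -p xml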

Re: Re: memory java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread leiwang...@gmail.com
Thanks Thomas. Another question: I have no idea what 'Failed to merge in memory' means. Is the 'merge' the shuffle phase on the reducer side? Why is it in memory? Apart from the two methods (increasing the reducer number and increasing the heap size), are there any other alternatives to fix this issue? Thanks

Re: Update interval of default counters

2014-04-15 Thread Akira AJISAKA
Moved to user@hadoop.apache.org. You can configure the interval by setting the mapreduce.client.progressmonitor.pollinterval parameter. The default value is 1000 ms. For more details, please see
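For example, to poll every 5 seconds instead (the value is illustrative), the parameter can be passed per job on the command line, assuming the job goes through ToolRunner/GenericOptionsParser (jar and class names here are placeholders):
$ hadoop jar myjob.jar MyJob -Dmapreduce.client.progressmonitor.pollinterval=5000 input output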

RE: Re: memory java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread German Florez-Larrahondo
Lei, A good explanation of this can be found in Hadoop: The Definitive Guide by Tom White. Here is an excerpt that explains a bit of the behavior at the reduce side and some possible tweaks to control it.
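For quick reference, the reduce-side merge knobs the book discusses are roughly these (MRv1 property names; the defaults quoted are from memory, so treat them as assumptions to verify against your version's mapred-default.xml):
mapred.job.shuffle.input.buffer.percent (default 0.70) - fraction of reducer heap used to buffer map outputs during the shuffle
mapred.job.shuffle.merge.percent (default 0.66) - buffer fill level that triggers a merge to disk
mapred.inmem.merge.threshold (default 1000) - number of buffered map outputs that triggers a merge to disk
mapred.job.reduce.input.buffer.percent (default 0.0) - heap fraction allowed to keep holding map outputs once the reduce begins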

Re: RE: memory java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread leiwang...@gmail.com
Thanks, let me take a careful look at it. leiwang...@gmail.com From: German Florez-Larrahondo Date: 2014-04-15 23:27 To: user; 'th' Subject: RE: Re: memory java.lang.OutOfMemoryError related with number of reducer? Lei, A good explanation of this can be found in Hadoop: The Definitive

Re: Offline image viewer - account for edits ?

2014-04-15 Thread Manoj Samel
So, is it correct to say that if one wants to get the latest state of the NameNode, the information from the image viewer and from the edits viewer has to be combined somehow? Thanks, On Tue, Apr 15, 2014 at 7:26 AM, Akira AJISAKA ajisa...@oss.nttdata.co.jp wrote: If you want to parse the edits,

Re: Setting debug log level for individual daemons

2014-04-15 Thread Ashwin Shankar
Thanks Gordon and Stanley, but this would require us to bounce the process. Is there a way to change log levels without bouncing the process? On Tue, Apr 15, 2014 at 3:23 AM, Gordon Wang gw...@gopivotal.com wrote: Put the following line in the log4j settings file.

Find the task and its datanode which is taking the most time in a cluster

2014-04-15 Thread Shashidhar Rao
Hi, Can somebody please help me find the task, and the datanode it ran on, which has failed or is taking the most time to execute, considering thousands of mappers and reducers are running? Regards Shashi

Compiling from Source

2014-04-15 Thread Justin Mrkva
I’m using the guide at http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/SingleCluster.html to try to compile the native Hadoop libraries, because I’m running a 64-bit OS and it keeps complaining that the native libraries can’t be found. After running the third command (mvn

Warning: $HADOOP_HOME is deprecated

2014-04-15 Thread Radhe Radhe
Hello All, I have configured Apache Hadoop 1.2.0 and set the $HADOOP_HOME env. variable. I keep getting: Warning: $HADOOP_HOME is deprecated. Solution (after googling): I replaced HADOOP_HOME with HADOOP_PREFIX and the warning disappeared. Does that mean HADOOP_HOME is replaced by HADOOP_PREFIX? If
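As an aside, if I remember the 1.x launcher scripts correctly, the warning can also be silenced without renaming the variable:
$ export HADOOP_HOME_WARN_SUPPRESS=1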

Re: HDFS file system size issue

2014-04-15 Thread Abdelrahman Shettia
Hi Saumitra, It looks like the over-replicated blocks root cause is not the issue that the cluster is experiencing. I can only think of misconfiguring the dfs.data.dir parameter. Can you ensure that each one of the data directories is using only one partition (mount) and there is no other data

Re: Find the task and its datanode which is taking the most time in a cluster

2014-04-15 Thread Abdelrahman Shettia
Hi Shashi, I am assuming that you are running Hadoop 1.x. There is an option to see the failed tasks on the JobTracker UI. Please replace the jobtracker host with the actual host and click on the following link and look for the task failure.
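The JobTracker web UI listens on port 50030 by default, so the link is typically of this shape (the host is a placeholder): http://jobtracker-host:50030/jobtracker.jsp, with per-task status reachable from each job's page there.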

Apache Hadoop 2.x installation *environment variables*

2014-04-15 Thread Radhe Radhe
Hello All, For Apache Hadoop 2.x (YARN) installation, which *environment variables* are REALLY needed? By referring to various blogs I am getting a mix: HADOOP_COMMON_HOME, HADOOP_CONF_DIR, HADOOP_HDFS_HOME, HADOOP_HOME, HADOOP_MAPRED_HOME, HADOOP_PREFIX, YARN_HOME
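In practice a minimal single-node setup often gets by with just the prefix and the conf dir, since the other *_HOME variables default relative to them in the 2.x scripts; a sketch under that assumption (paths illustrative):
$ export HADOOP_PREFIX=/opt/hadoop-2.2.0
$ export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop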

RE: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Radhe Radhe
Thanks John for your comments. I believe MRv2 has support for both the old *mapred* APIs and the new *mapreduce* APIs. I see it this way: [1.] One may have binaries, i.e. a jar file, of the M\R program that used the old *mapred* APIs. This will work directly on MRv2 (YARN). [2.] One may have the source code i.e.

Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Zhijie Shen
1. If you have the binaries that were compiled against MRv1 *mapred* libs, it should just work with MRv2. 2. If you have the source code that refers to MRv1 *mapred* libs, it should be compilable without code changes. Of course, you're free to change your code. 3. If you have the binaries that

RE: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Radhe Radhe
Thanks Zhijie for the explanation. Regarding #3, if I have ONLY the binaries, i.e. a jar file (compiled\built against the old MRv1 mapred APIs), then how can I compile it, since I don't have the source code, i.e. the Java files? All I can do with binaries, i.e. a jar file, is execute it. -RR Date: Tue, 15 Apr

Re: Doubt regarding Binary Compatibility\Source Compatibility with old *mapred* APIs and new *mapreduce* APIs in Hadoop

2014-04-15 Thread Zhijie Shen
bq. Regarding #3 if I have ONLY the binaries i.e. jar file (compiled\build against old MRv1 mapred APIS) Which APIs are you talking about, *mapred* or *mapreduce*? In #3, I was talking about *mapreduce*. If this is the case, you may be in trouble, unfortunately, because MRv2 has evolved so much

Re: About Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode

2014-04-15 Thread Anacristing
It's the same. -- Original -- From: Shengjun Xin s...@gopivotal.com Date: Tue, Apr 15, 2014 04:43 PM To: user user@hadoop.apache.org Subject: Re: About Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode try to use bin/hadoop

Re: Compiling from Source

2014-04-15 Thread Shengjun Xin
I think you can use the command 'mvn package -Pnative,dist -DskipTests' in the source code root directory to build the binaries. On Wed, Apr 16, 2014 at 2:31 AM, Justin Mrkva m...@justinmrkva.com wrote: I'm using the guide at
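BUILDING.txt at the top of the source tree documents the native build in full; the variant given there, as I recall it, also produces the distribution tarball, with the result landing under hadoop-dist/target/:
$ mvn package -Pdist,native -DskipTests -Dtar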

Re: Offline image viewer - account for edits ?

2014-04-15 Thread Akira AJISAKA
Yes, I think you are right. (2014/04/16 1:20), Manoj Samel wrote: So, is it correct to say that if one wants to get the latest state of the NameNode, the information from the image viewer and from the edits viewer has to be combined somehow? Thanks, On Tue, Apr 15, 2014 at 7:26 AM, Akira AJISAKA

Re: Re: Hadoop NoClassDefFoundError

2014-04-15 Thread Stanley Shi
Can you do an unzip -l myjob.jar to see if your jar file has the correct hierarchy? Regards, *Stanley Shi* On Tue, Apr 15, 2014 at 6:53 PM, laozh...@sina.cn laozh...@sina.cn wrote: Thank you for your advice. When I use your command, I get the below error info. $ hadoop jar myjob.jar
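For comparison, if MyJob really is declared in package myjob, the listing should show the class under the package directory, roughly like this (sizes illustrative):
$ unzip -l myjob.jar
       25  META-INF/MANIFEST.MF
     3150  myjob/MyJob.class
If instead MyJob.class sits at the jar root with no myjob/ directory, the class has no package, and the original invocation without the prefix is the right one: hadoop jar myjob.jar MyJob input output.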

Re: Setting debug log level for individual daemons

2014-04-15 Thread Stanley Shi
Is this what you are looking for? http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/CommandsManual.html#daemonlog Regards, *Stanley Shi* On Wed, Apr 16, 2014 at 2:06 AM, Ashwin Shankar ashwinshanka...@gmail.com wrote: Thanks Gordon and Stanley, but this would require us
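That page covers hadoop daemonlog, which gets and sets a logger's level over the running daemon's HTTP port, with no restart (the change lasts until the daemon restarts). A rough example against the ResourceManager, assuming its web UI is on the default port 8088 (the host is a placeholder):
$ hadoop daemonlog -getlevel rm-host:8088 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
$ hadoop daemonlog -setlevel rm-host:8088 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager DEBUG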

Re: Re: java.lang.OutOfMemoryError related with number of reducer?

2014-04-15 Thread leiwang...@gmail.com
Hi German, Thomas, Seems I found the data that causes the error, but I still don't know the exact reason. I just do a group with Pig Latin: domain_device_group = GROUP data_filter BY (custid, domain, level, device); domain_device = FOREACH domain_device_group { distinct_ip =
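Since 84 reducers avoided the error where 24 did not, one Pig-side knob worth noting is the PARALLEL clause, which pins the reducer count on the heavy GROUP directly instead of relying on the job-wide default; a sketch (the count 84 is just the value from the earlier test):
domain_device_group = GROUP data_filter BY (custid, domain, level, device) PARALLEL 84;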