Re: Building Mahout Issue

2014-06-03 Thread Harsh J
This is not a Hadoop issue per se. It's Mahout not having a build profile that supports hadoop-2. It appears this was only added for 1.0, per https://issues.apache.org/jira/browse/MAHOUT-1329. As a hack you could modify the 0.9 pom.xml to use hadoop-client instead of hadoop-core. You may have to

How can a task know if it's running as an MR1 or MR2 job?

2014-06-03 Thread Michael Segel
Just a quick question... Suppose you have an M/R job running. How does the Mapper or Reducer task know or find out if it's running as an M/R 1 or M/R 2 job? I would suspect the job context would hold that information... but at first glance I didn't see it. So what am I missing? Thx -Mike

Re: change yarn application priority

2014-06-03 Thread Michael Segel
WRT the capacity scheduler, it's not so much changing the priority of a job, but allowing for pre-emption. Note that I guess you could raise the one job's priority, and then the other job's priority so that when a task finishes the other job gets the next slot. However, you're still stuck waiting

100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Shayan Pooya
I have a three-node HDFS cluster with a name-node. There is absolutely no IO going on in this cluster, nor any jobs running on it, and I just use it for testing the Disco HDFS integration. I noticed that two of the three data-nodes are using 100% CPU. They have been running for a long time (2 months)

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Ted Yu
Can you pastebin a snippet of the datanode log and a jstack of the datanode process? Thanks On Tue, Jun 3, 2014 at 9:34 AM, Shayan Pooya sha...@liveve.org wrote: I have a three-node HDFS cluster with a name-node. There is absolutely no IO going on in this cluster, nor any jobs running on it, and I just use it

unsubscribe

2014-06-03 Thread Luiz Fernando Figueiredo
unsubscribe me pls. can't find the right way.

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Ian Brooks
Hi Shayan, If you restart one of the datanodes, does that node go back to normal CPU usage? If so, that looks like the same issue I'm seeing on my nodes, though mine will go to 200% over time on a 4 CPU host. I haven't been able to track the cause down yet. Heavy use of HDFS will cause the node

Re: unsubscribe

2014-06-03 Thread Ted Yu
Please send email to user-unsubscr...@hadoop.apache.org See http://hadoop.apache.org/mailing_lists.html#User On Tue, Jun 3, 2014 at 9:40 AM, Luiz Fernando Figueiredo luiz.figueir...@auctorita.com.br wrote: unsubscribe me pls. can't find the right way.

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Shayan Pooya
1. The jstack result is attached. 2. These are the two processes: 25050 shayan 20 0 1643M 87872 2512 S 100. 10.11434h /usr/lib/jvm/java-1.6.0//bin/java -Dproc_datanode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/shayan/hadoop/logs -Dhadoo 25094 shayan 20 0

Re: (Very) newbie questions

2014-06-03 Thread Ted Yu
Do you mind listing the unit tests that failed on your computer? Cheers On Tue, Jun 3, 2014 at 3:31 PM, Christian Convey christian.con...@gmail.com wrote: I'm completely new to Hadoop, and I'm trying to build it for the first time. I cloned the Git repository and I'm building the tag

Re: (Very) newbie questions

2014-06-03 Thread Christian Convey
On Tue, Jun 3, 2014 at 8:19 PM, Christian Convey christian.con...@gmail.com wrote: On Tue, Jun 3, 2014 at 6:37 PM, Ted Yu yuzhih...@gmail.com wrote: Do you mind listing the unit tests that failed on your computer? Cheers On Tue, Jun 3, 2014 at 3:31 PM, Christian Convey

Re: (Very) newbie questions

2014-06-03 Thread Ted Yu
TestSymlinkLocalFS is an abstract class. The actual tests are in TestSymlinkLocalFSFileContext / TestSymlinkLocalFSFileSystem. Do they pass on your computer? On Tue, Jun 3, 2014 at 5:21 PM, Christian Convey christian.con...@gmail.com wrote: On Tue, Jun 3, 2014 at 8:19 PM, Christian Convey

Re: (Very) newbie questions

2014-06-03 Thread Tsuyoshi OZAWA
Hi Ted and Christian, In fact, the problem is filed as HADOOP-10510. https://issues.apache.org/jira/browse/HADOOP-10510 This problem also reproduces on my local machine, but not on the other environment. Maybe it's an environment-dependent problem. However, I cannot understand the condition this problem

Re: Should I just assign the history server address on the NN, or do I have to assign it on each node?

2014-06-03 Thread Stanley Shi
You should set it on the RM node. Regards, *Stanley Shi,* On Wed, Jun 4, 2014 at 9:24 AM, ch huang justlo...@gmail.com wrote: Hi, mailing list: I installed my job history server on one of my NNs (I use NN HA). I want to ask whether I need to set the history server address on each node?