CPU Utilization for each job

2016-06-03 Thread Deepak Goel
Hey Namaskara~Nalama~Guten Tag~Bonjour How do I find out how much CPU (CPU time and other CPU statistics), disk (disk time and other disk statistics), and memory a particular job has used on a node? I am not looking for node-level resource utilization, which Ambari gives. But more at a

Re: CPU utilization in map function

2015-04-07 Thread Harsh J
Asking the Giraph user lists may be a better way to get your answer: http://giraph.apache.org/mail-lists.html On Sun, Apr 5, 2015 at 1:57 PM, Ravikant Dindokar wrote: > I am a newbie learning Hadoop. I am running Apache Giraph on Hadoop 2.2.0. > I want to find out how much CPU utilization a

CPU utilization in map function

2015-04-05 Thread Ravikant Dindokar
I am a newbie learning Hadoop. I am running Apache Giraph on Hadoop 2.2.0. I want to find out the CPU utilization, as well as the time spent sending messages, in each superstep of a Giraph application. I am not familiar with the Hadoop code. Can you suggest the functions I should look into to get

Re: CPU utilization

2014-09-12 Thread Adam Kawa
> Adam, how did you come to the conclusion that it is memory bounded? > I mean the number of containers running on your NodeManager, not the job itself.

Re: CPU utilization

2014-09-12 Thread Jakub Stransky
xml and > yarn-site.xml. But they don't change anything in your case (as you are > limited by the memory, not vcores). > > >> The job wasn't the smallest but wasn't PB of data. Was run on 1.5GB of >> data and run for 60min. I wasn't able to make any signif

Re: CPU utilization

2014-09-12 Thread Adam Kawa
B of > data and run for 60min. I wasn't able to make any significant improvement. > It is a map-only job. And I wasn't able to achieve more than 30% of total > machine CPU utilization. However, the top command was displaying 100 %cpu for > the process running on the data node, that's why I was

Re: CPU utilization

2014-09-12 Thread Jakub Stransky
n 1.5GB of data and run for 60min. I wasn't able to make any significant improvement. It is a map-only job. And I wasn't able to achieve more than 30% of total machine CPU utilization. However, the top command was displaying 100 %cpu for the process running on the data node, that's why I was thinkin

Re: CPU utilization

2014-09-12 Thread Adam Kawa
) jobs on the cluster concurrently? Then you might see higher CPU utilization than 30%. Cheers! Adam 2014-09-12 17:51 GMT+02:00 Jakub Stransky: > Hello experienced Hadoop users, > > I have a beginner's question regarding CPU utilization on datanodes when > running an MR job. Cluster o

CPU utilization

2014-09-12 Thread Jakub Stransky
Hello experienced Hadoop users, I have a beginner's question regarding CPU utilization on datanodes when running an MR job. Cluster of 5 machines, 2 NN + 3 DN, really inexpensive hardware, using the following parameters: # hadoop - yarn-site.xml yarn.nodemanager.resource.memory-mb : 2048 yarn.scheduler.minimum
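
A rough sketch of the memory arithmetic behind the replies in this thread, which say the job is limited by memory rather than vcores. The 2048 MB figure is the yarn.nodemanager.resource.memory-mb value quoted above; the per-container size is not visible in the truncated post, so the 1024 MB used here is purely a hypothetical value for illustration, and the helper name max_containers is likewise made up.

```python
def max_containers(node_memory_mb: int, container_memory_mb: int) -> int:
    """Upper bound on containers one NodeManager can run at once, by memory alone."""
    if container_memory_mb <= 0:
        raise ValueError("container_memory_mb must be positive")
    return node_memory_mb // container_memory_mb


# From the post: yarn.nodemanager.resource.memory-mb : 2048
node_memory_mb = 2048
# Hypothetical container size; the actual setting is cut off in the post.
container_memory_mb = 1024

print(max_containers(node_memory_mb, container_memory_mb))
```

If only a couple of containers fit per node, only that many map tasks run at once there, which may be one reason total machine CPU stays low even when a single process shows 100 %cpu in top.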

RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
Yes, the client process used the most CPU shares. But could you please help explain why the CPU utilization kept increasing? We are sure that the traffic of provisioned data into HDFS was stable. Thanks BR/Shiyuan From: Gordon Wang [mailto:gw...@pivotal.io] Sent: September 1, 2014 15:48 To: user

Re: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Gordon Wang
tat” to check the CPU utilization of our > application, I can confirm the CPU utilization of our application was > increasing and the CPU utilization of datanode, namenode, resourcemanager > and NodeManager processes kept stable. > > > > > > Below the “top” command’s o

RE: CPU utilization keeps increasing when using HDFS

2014-09-01 Thread Shiyuan Xiao
Because we are now running the application against local disk, I can’t give the “top” command’s output from running with HDFS. But we used “top” and “pidstat” to check the CPU utilization of our application, and I can confirm the CPU utilization of our application was increasing and the

Re: CPU utilization keeps increasing when using HDFS

2014-08-31 Thread Stanley Shi
ing data from HDFS (pseudo-distributed mode in one node). And we > found the CPU system time and user time of the application keeps increasing > when it is running. If we changed the application to read data from local > disk without changing any other business logic, the CPU utilization kept

CPU utilization keeps increasing when using HDFS

2014-08-31 Thread Shiyuan Xiao
disk without changing any other business logic, the CPU utilization kept stable. So we concluded that the CPU utilization is related to HDFS. We want to know whether this issue is really related to HDFS, and whether there is any solution to fix it. Thanks a lot

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Shayan Pooya
1. The jstack result is attached. 2. These are the two processes: 25050 shayan 20 0 1643M 87872 2512 S 100. 10.11434h /usr/lib/jvm/java-1.6.0//bin/java -Dproc_datanode -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/shayan/hadoop/logs -Dhadoo 25094 shayan 20 0 1643

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Ian Brooks
Hi Shayan, If you restart one of the datanodes, does that node go back to normal CPU usage? If so, that looks like the same issue I'm seeing on my nodes, though mine will go to 200% over time on a 4-CPU host. I haven't been able to track the cause down yet. Heavy use of HDFS will cause the node t

Re: 100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Ted Yu
Can you pastebin a datanode log snippet and a jstack of the datanode process? Thanks On Tue, Jun 3, 2014 at 9:34 AM, Shayan Pooya wrote: > I have a three node HDFS cluster with a name-node. There is absolutely no > IO going on this cluster or any jobs running on it and I just use it for > testing t

100% CPU utilization on idle HDFS Data Nodes

2014-06-03 Thread Shayan Pooya
I have a three node HDFS cluster with a name-node. There is absolutely no IO going on this cluster or any jobs running on it and I just use it for testing the Disco HDFS integration. I noticed that two of the three data-nodes are using 100% CPU. They have been running for a long time (2 months) b