Hey
Namaskara~Nalama~Guten Tag~Bonjour
How do I find out how much CPU (CPU time and other CPU statistics), disk
(disk time and other disk statistics), and memory a particular job
has used on a node?
I am not looking for resource utilization at the per-node level, which Ambari
gives, but more at a
Asking the Giraph user lists may be a better way to get your answer:
http://giraph.apache.org/mail-lists.html
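Per-application (rather than per-node) numbers are exactly what the YARN ResourceManager records, and they can be fetched over its REST API. A hedged sketch, assuming a Hadoop 2.x cluster: the endpoint shape and the `memorySeconds`/`vcoreSeconds` fields follow the standard Cluster Applications API, but the host name and application id below are placeholders; rather than making the HTTP call, the snippet parses a response of that shape to show where the statistics live.

```python
import json

# Hedged sketch: the YARN ResourceManager REST API exposes per-application
# aggregate resource usage (Hadoop 2.x), roughly at:
#   GET http://<rm-host>:8088/ws/v1/cluster/apps/<application_id>
# "rm-host" and the application id below are placeholders, not real values.
sample_response = """
{
  "app": {
    "id": "application_0000000000000_0001",
    "state": "FINISHED",
    "memorySeconds": 151730,
    "vcoreSeconds": 103
  }
}
"""

app = json.loads(sample_response)["app"]
# memorySeconds: MB of memory allocated to the app's containers, integrated
# over their lifetime; vcoreSeconds: the same integral for virtual cores.
print(app["memorySeconds"], app["vcoreSeconds"])
```

In practice you would `curl` that URL (or use `yarn application -status <app_id>`) and read the same two fields from the live response.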
On Sun, Apr 5, 2015 at 1:57 PM, Ravikant Dindokar wrote:
I am a newbie learning Hadoop. I am running Apache Giraph on Hadoop 2.2.0.
I want to find out the CPU utilization as well as the time spent
sending messages in each superstep of a Giraph application.
I am not familiar with the Hadoop code. Can you suggest the functions I should
look into to get
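A starting point that does not require reading Hadoop internals is the job counter API: MapReduce publishes a `CPU_MILLISECONDS` counter per task, and Giraph typically adds its own "Giraph Timers" counter group with per-superstep wall times. A minimal sketch; the counter names follow the usual MapReduce/Giraph conventions but should be verified against your job's actual counter dump (e.g. `mapred job -status <job_id>`):

```python
# Hedged sketch: aggregate CPU time and per-superstep times from a job's
# counters. The dictionary mimics the counter groups a real job report
# would contain; the names are typical but should be checked against
# your own job's output.
counters = {
    "org.apache.hadoop.mapreduce.TaskCounter": {
        "CPU_MILLISECONDS": 482_130,   # total CPU time across all tasks
    },
    "Giraph Timers": {                  # populated by Giraph itself
        "Superstep 0 (ms)": 1_204,
        "Superstep 1 (ms)": 951,
        "Shutdown (ms)": 88,
    },
}

cpu_seconds = counters["org.apache.hadoop.mapreduce.TaskCounter"]["CPU_MILLISECONDS"] / 1000
superstep_ms = {
    name: ms
    for name, ms in counters["Giraph Timers"].items()
    if name.startswith("Superstep")
}
print(f"CPU time: {cpu_seconds:.1f}s, supersteps: {superstep_ms}")
```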
> Adam, how did you come to the conclusion that it is memory bounded?
I mean the number of containers running on your NodeManager, not the job
itself.
xml and yarn-site.xml. But they don't change anything in your case (as you are
limited by the memory, not vcores).
The job wasn't the smallest but wasn't PBs of data. It was run on 1.5GB of
data and ran for 60 min. I wasn't able to make any significant improvement.
It is a map-only job, and I wasn't able to achieve more than 30% of total
machine CPU utilization. However, the top command was displaying 100 %cpu
for the process running on the data node; that's why I was thinking
) jobs on the cluster concurrently?
Then you might see higher CPU utilization than 30%.
Cheers!
Adam
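The 30% ceiling is consistent with plain container arithmetic rather than a slow CPU. A back-of-envelope sketch, assuming the 2048 MB NodeManager budget quoted later in this thread, a hypothetical 1024 MB map container, and a 4-core node (the last two figures are assumptions, not from the thread):

```python
# Hedged back-of-envelope: with yarn.nodemanager.resource.memory-mb = 2048
# (as in this thread) and a hypothetical 1024 MB map container, memory caps
# how many containers run at once, which in turn caps node CPU utilization.
nm_memory_mb = 2048          # per-NodeManager memory budget (from the thread)
container_mb = 1024          # assumed per-map-container allocation
cores_per_node = 4           # assumed machine size

concurrent_containers = nm_memory_mb // container_mb
# Each single-threaded map task can saturate at most one core, so:
max_cpu_fraction = min(concurrent_containers, cores_per_node) / cores_per_node
print(f"{concurrent_containers} containers, <= {max_cpu_fraction:.0%} CPU")
```

With only two single-threaded map tasks running at once on a 4-core box, node-wide utilization tops out near 50%, even while top shows each task process at 100% of its own core, which matches both observations in the thread.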
2014-09-12 17:51 GMT+02:00 Jakub Stransky :
Hello experienced hadoop users,
I have one beginner's question regarding CPU utilization on datanodes when
running an MR job. A cluster of 5 machines, 2 NN + 3 DN, really inexpensive hw,
using the following parameters:
# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb : 2048
yarn.scheduler.minimum
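The last line above is cut off mid-parameter. For orientation, this is how such properties are normally spelled in yarn-site.xml; a hedged fragment where the property names are the standard YARN ones, but the values are illustrative (only the 2048 figure comes from the message above), not a tuning recommendation:

```xml
<!-- Hedged sketch of how these yarn-site.xml properties are usually
     written; values are illustrative, not a tuning recommendation. -->
<configuration>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value> <!-- total MB the NodeManager offers to containers -->
  </property>
  <property>
    <!-- smallest container the scheduler will grant; container requests
         are rounded up to a multiple of this -->
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
  </property>
</configuration>
```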
Yes, the client process used the most CPU shares.
But could you please help explain why the CPU utilization kept increasing? We
are sure that the traffic of provisioned data into HDFS was stable.
Thanks
BR/Shiyuan
From: Gordon Wang [mailto:gw...@pivotal.io]
Sent: 2014-09-01 15:48
To: user
Because we are running the application with accessing local disk now, I can’t
give the “top” command’s output when running with HDFS.
But we used “top” and “pidstat” to check the CPU utilization of our
application; I can confirm the CPU utilization of our application was
increasing while the CPU utilization of the datanode, namenode,
resourcemanager and NodeManager processes kept stable.
ing data from HDFS (pseudo-distributed mode on one node). And we
found the CPU system time and user time of the application keep increasing
when it is running. If we change the application to read data from local
disk without changing any other business logic, the CPU utilization keeps
stable. So we conclude that the CPU utilization is related to HDFS.
We want to know whether this issue is really related to HDFS, and whether
there is any solution to fix it?
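The "kept increasing" observation can be double-checked without pidstat by sampling the process's cumulative CPU ticks from /proc, which is where top and pidstat read them too. A minimal sketch, Linux-only; the field positions follow the proc(5) layout for `/proc/<pid>/stat`, and the sample line's values are illustrative:

```python
# Hedged sketch: read cumulative user+system CPU ticks for a pid from
# /proc/<pid>/stat (fields 14 and 15, utime and stime, per proc(5)).
# Sampling this twice and differencing gives the rate pidstat reports.
def cpu_ticks(stat_line: str) -> int:
    # The process name (field 2) sits in parentheses and may contain
    # spaces, so split on the closing paren first.
    rest = stat_line.rsplit(")", 1)[1].split()
    # After the ")", rest[k] is stat field k+3, so utime (field 14) is
    # rest[11] and stime (field 15) is rest[12].
    utime, stime = int(rest[11]), int(rest[12])
    return utime + stime

# Illustrative line in the /proc/<pid>/stat format:
sample = "25050 (java) S 1 25050 25050 0 -1 4202496 9999 0 0 0 51500 8200 0 0 20 0 60 0 100 0 0"
print(cpu_ticks(sample))
```

In a steady-state workload the tick delta per interval should be flat; a delta that grows interval over interval reproduces the "CPU utilization kept increasing" symptom described above.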
Thanks a lot
1. The jstack result is attached.
2. These are the two processes:
25050 shayan 20 0 1643M 87872 2512 S 100. 10.1 1434h
/usr/lib/jvm/java-1.6.0//bin/java -Dproc_datanode -Xmx1000m
-Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/shayan/hadoop/logs
-Dhadoo
25094 shayan 20 0 1643
Hi Shayan,
If you restart one of the datanodes, does that node go back to normal CPU
usage? If so, that looks like the same issue I'm seeing on my nodes, though
mine will go to 200% over time on a 4-CPU host.
I haven't been able to track the cause down yet. Heavy use of HDFS will cause
the node t
Can you pastebin data node log snippet and jstack of datanode process ?
Thanks
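Once a jstack is captured (as requested above), the standard way to pin down which Java thread is burning the CPU is to take the OS thread id that `top -H` or `pidstat -t` reports and match it against the `nid=0x…` field in the jstack output, which is that same id in hex. A small sketch of the conversion; the thread id used is illustrative:

```python
# Hedged sketch: top -H / pidstat -t show per-thread ids in decimal,
# while jstack labels each thread with the same id in hex as "nid=0x...".
# Converting lets you find the hot thread's stack in the jstack dump.
def to_jstack_nid(os_thread_id: int) -> str:
    return f"nid={hex(os_thread_id)}"

hot_thread = 25094                      # illustrative tid from `top -H`
print(to_jstack_nid(hot_thread))        # search the jstack output for this
```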
On Tue, Jun 3, 2014 at 9:34 AM, Shayan Pooya wrote:
I have a three-node HDFS cluster with a name-node. There is absolutely no
IO going on in this cluster, nor any jobs running on it; I just use it for
testing the Disco HDFS integration.
I noticed that two of the three data-nodes are using 100% CPU. They have
been running for a long time (2 months) b