Hi group,

I have run into a problem: sometimes, while my MapReduce job is running, a YARN 
process goes into state D (uninterruptible sleep). In this state the YARN 
process cannot be killed, and only rebooting the server recovers it. While this 
is happening, commands such as ps, reboot, and lsof also hang, so I cannot 
investigate further. As far as I have observed, this occurs on servers with 
8 TB hard disks.

Here is the output from the top command for the stuck process:
8382 yarn      20   0 3456408 277324  28040 D   0.0  0.2  11:16.21 java
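
Since ps hangs in this situation, one way to list the stuck processes might be 
to read /proc directly. The following is only a minimal sketch, assuming a 
Linux /proc filesystem and assuming that reading /proc/<pid>/stat and 
/proc/<pid>/wchan does not block the way ps does (this has not been verified 
on the affected servers). The function names parse_stat_line and find_d_state 
are made up for illustration.

```python
import os


def parse_stat_line(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Return (pid, comm, wchan) for every process currently in state D."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # not a Linux-style /proc, nothing to scan
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry is unreadable
        if state != "D":
            continue
        try:
            # wchan names the kernel function the task is sleeping in
            with open(os.path.join(proc_root, entry, "wchan")) as f:
                wchan = f.read().strip()
        except OSError:
            wchan = "?"
        found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in find_d_state():
        print(pid, comm, wchan)
```

If this runs without hanging, the wchan column should hint at which kernel 
call (for example a disk or filesystem wait) the java process is blocked in.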

Here is my environment.

· Cloudera CDH 5.9.0

· Each disk on the server is an 8 TB volume

· My MapReduce job reads data from HBase and stores the results in HBase

I have checked the disk with fsck, and it found no errors. I am not sure 
whether some configuration needs to be tuned for Hadoop / YARN on large-volume 
disks. Does anyone have any idea about this issue?
Thanks in advance.

Thanks,
Eric
