So I ran the first command and did find some offenders:
5.9G    /var/log/ambari-infra-solr
5.9G    /var/log/hadoop
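
Next I want to see which individual files in those two directories are the 
biggest, probably with something like this (assuming those are the right paths 
and that GNU sort is available for -h):

# list the 20 largest files under the two log directories
sudo du -ah /var/log/ambari-infra-solr /var/log/hadoop | sort -rh | head -20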

While those are big numbers, they are sitting on a 1TB disk. This is the actual 
message I’m getting:
Capacity Used: [60.52%, 32.5 GB], Capacity Total: [53.7 GB], path=/usr/hdp
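
To double-check which filesystem that path actually lives on (and whether the 
53.7 GB is a separate partition rather than the whole 1 TB disk), I figure I 
can just run:

# show size, usage, and mount point of the filesystem containing /usr/hdp
df -h /usr/hdp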

It turns out HDFS isn't actually taking up the whole disk, which I didn't 
know. I've figured out how to expand it, but before I do that, I want to know 
what is eating my space. I ran your command again with a modification:
sudo du -h --max-depth=1 /usr/hdp

That output is shown here:
395M    /usr/hdp/share
4.8G    /usr/hdp/2.5.0.0-1245
4.0K    /usr/hdp/current
5.2G    /usr/hdp

None of that adds up to 32.5 GB.
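
My working guess is that the 32.5 GB is counted against the whole filesystem 
that /usr/hdp sits on, not just that directory, so I was going to account for 
the rest with something along these lines (the mount-point lookup is my best 
guess at the syntax):

# find the mount point backing /usr/hdp, then list its biggest top-level
# directories without crossing into other filesystems (-x)
mount_point=$(df --output=target /usr/hdp | tail -1)
sudo du -xm --max-depth=1 "$mount_point" | sort -rn | head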

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.massstreet.net
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData


From: Shane Kumpf [mailto:shane.kumpf.apa...@gmail.com]
Sent: Wednesday, July 12, 2017 7:17 AM
To: Adaryl Wakefield <adaryl.wakefi...@hotmail.com>
Cc: user@hadoop.apache.org
Subject: Re: Disk maintenance

Hello Bob,

It's difficult to say based on the information provided, but I would suspect 
namenode and datanode logs to be the culprit. What does "sudo du -h 
--max-depth=1 /var/log" return?

If it is not logs, is there a specific filesystem or directory that you see 
filling up or alerting? e.g. /, /var, /data, etc.? If you are unsure, you can 
start at / to track down where the space is going via "sudo du -xm 
--max-depth=1 / | sort -rn" and then walk the filesystem hierarchy from the 
directory listed as using the most space (change / in the previous command to 
that directory, and repeat until you locate the files using up the space).
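
Roughly, that walk looks like this (the /var and /var/log steps below are just 
an example of descending into whichever directory shows up at the top):

# largest top-level directories on the root filesystem, without crossing mounts
sudo du -xm --max-depth=1 / | sort -rn | head
# repeat on the biggest directory from the previous output, e.g.:
sudo du -xm --max-depth=1 /var | sort -rn | head
sudo du -xm --max-depth=1 /var/log | sort -rn | head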

-Shane

On Tue, Jul 11, 2017 at 9:22 PM, Adaryl Wakefield 
<adaryl.wakefi...@hotmail.com> wrote:
I'm running a test cluster that normally has no data in it. Despite that, I've 
been getting warnings of disk space usage. Something is growing on disk and I'm 
not sure what. Are there scripts that I should be running to clean out logs or 
something? What is really interesting is that this is only affecting the name 
node and one data node. The other data node isn’t having a space issue.

I'm running Hortonworks Data Platform 2.5 with HDFS 2.7.3 on CentOS 7. I 
thought it might be a Linux issue, but the problem is clearly confined to the 
parts of the disk taken up by HDFS.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.massstreet.net
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

