Stefan Will wrote:
Hi Raghu,
Each DN machine has 3 partitions, e.g.:
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sda1    20G  8.0G    11G   44%  /
/dev/sda3   1.4T  756G   508G   60%  /data
tmpfs       3.9G     0   3.9G    0%  /dev/shm
All of the paths in hadoop-site.xml point to /data, which is the partition
that filled up to 100% (I deleted a bunch of files from HDFS since then). So
I guess the question is whether the DN looks at just the partition its data
directory is on, or all partitions when it determines disk usage.
The Datanode checks df on /data alone. What is "dfs.df.interval" set to?
Also, if you set multiple paths for "dfs.data.dir", the "available" space
reported for each of them is added up. That would be wrong in your case,
since all of them are on one partition.
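
To illustrate the double counting, here is a rough Python sketch (not the actual Datanode code, which is Java). It compares summing free space per configured path with summing it once per underlying device. The function names, the injectable free_of/device_of helpers, and the example paths /data/dfs1 and /data/dfs2 are all made up for illustration:

```python
import os
import shutil


def _free(path):
    # Free bytes on the filesystem that holds `path`.
    return shutil.disk_usage(path).free


def _device(path):
    # Device ID of the filesystem that holds `path`.
    return os.stat(path).st_dev


def naive_available(paths, free_of=_free):
    """Add up free space per path; a shared partition is counted once per path."""
    return sum(free_of(p) for p in paths)


def deduped_available(paths, free_of=_free, device_of=_device):
    """Add up free space once per underlying device."""
    free_by_device = {}
    for p in paths:
        # Only the first path seen on each device contributes.
        free_by_device.setdefault(device_of(p), free_of(p))
    return sum(free_by_device.values())


if __name__ == "__main__":
    # Hypothetical layout: two data directories on the same /data partition.
    dirs = ["/data/dfs1", "/data/dfs2"]
    if all(os.path.isdir(d) for d in dirs):
        print("naive:  ", naive_available(dirs))
        print("deduped:", deduped_available(dirs))
```

With two directories on one partition, the naive total comes out at about twice the real free space, which is the overcount described above.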
Raghu.