I'm running 0.19.0 on a 10 node cluster (8 core, 16GB RAM, 4x1.5TB). The
current status of my FS is approximately 1 million files and directories,
950k blocks, and heap size of 7GB (16GB reserved). Average block replication
is 3.8. I'm concerned that the heap size is steadily climbing... a 7GB heap
is substantially more per file than what I see on a similar 0.18.2 cluster,
which runs closer to a 1GB heap.
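
For scale: if the commonly cited rule of thumb of roughly 150 bytes of
namenode heap per namespace object holds, I'd expect something on the order
of

  (1,000,000 files/dirs + 950,000 blocks) * ~150 bytes = ~300MB

which is a factor of 20 or so below what I'm actually seeing.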
My typical usage pattern is to 1) write a number of small files into HDFS
(tens or hundreds of thousands at a time), 2) archive those files into
Hadoop archives (HARs), and 3) delete the originals. I've tried dropping
the replication factor of the _index and
_masterindex files without much effect on overall heap size. While I had
trash enabled at one point, I've since disabled it and deleted the .Trash
folders.
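
For reference, each cycle looks roughly like the following (paths invented
for illustration):

  hadoop fs -put local-batch /user/sean/incoming
  hadoop archive -archiveName batch.har /user/sean/incoming /user/sean/archived
  hadoop fs -setrep 2 /user/sean/archived/batch.har/_index
  hadoop fs -setrep 2 /user/sean/archived/batch.har/_masterindex
  hadoop fs -rmr /user/sean/incoming

(As far as I understand, setrep only changes how many datanode replicas each
block gets, so it wouldn't reduce the number of namespace objects the
namenode holds anyway.)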

On namenode startup, I get a massive number of lines like the following in
my log file:
2009-01-31 21:41:23,283 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.processReport: block blk_-2389330910609345428_7332878 on
172.16.129.33:50010 size 798080 does not belong to any file.
2009-01-31 21:41:23,283 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
NameSystem.addToInvalidates: blk_-2389330910609345428 is added to invalidSet
of 172.16.129.33:50010
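
If it helps to diagnose, I was planning to cross-check the namespace against
the block reports with something like

  hadoop fsck / -files -blocks -locations > fsck-full.txt

though as far as I know fsck only lists blocks that do belong to files, not
orphans like the ones above.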

I suspect the original files may be getting left behind somewhere, causing
the heap bloat. Is there any accounting mechanism to determine what is
contributing to my heap size?
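
(Short of an HDFS-level report, I realize I can take a histogram of the live
heap with the stock JDK tools, e.g.

  jmap -histo <namenode-pid> | head -30

and eyeball the inode and block object counts, but I was hoping for
something built into the namenode.)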

Thanks,
Sean
