On Thu, Oct 29, 2015 at 1:08 AM, qihuang.zheng <qihuang.zh...@fraudmetrix.cn > wrote:
> *We have some nodes Load too large, but some are normal.*

tl;dr - Clear the snapshots on the nodes which are too large.

Longer:

Are you sure that the nodes which are too large differ in the actual *data* size, or do they just contain snapshots?

Cassandra snapshots are hard links to SSTables, which means a number of odd things:

1) Snapshots grow in actual disk usage over time, as they only consume "extra" disk space when the SSTable they are a hard link to is removed from the data directory.
2) Unless you use du --apparent-size, the order in which du sees files determines which file is counted as using the disk, so you might see weird results from du in the data directory if you are also involving the snapshots.

--apparent-size
       print apparent sizes, rather than disk usage; although the apparent size is usually smaller, it may be larger due to holes in (`sparse') files, internal fragmentation, indirect blocks, and the like

=Rob
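
P.S. One way to check how much space the snapshots alone are holding is to count each inode once and see which inodes are reachable only through a snapshots directory. The following is a rough sketch, not a tested tool: it assumes Linux (where st_blocks is in 512-byte units) and that snapshots sit in directories named "snapshots" under the data directory. The function name is made up for illustration.

```python
import os

def disk_usage_by_inode(root):
    """Return (live_bytes, snapshot_only_bytes) under root, counting each inode once."""
    live = {}  # (device, inode) -> allocated bytes, for files outside snapshots/
    snap = {}  # same, for files found under a directory named "snapshots"
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: any path component named "snapshots" marks snapshot data.
        in_snapshot = "snapshots" in os.path.relpath(dirpath, root).split(os.sep)
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            # st_blocks counts 512-byte units on Linux; this is allocated size,
            # not apparent size.
            (snap if in_snapshot else live)[key] = st.st_blocks * 512
    # A snapshot only costs "extra" space once its live SSTable is gone.
    snapshot_only = sum(size for key, size in snap.items() if key not in live)
    return sum(live.values()), snapshot_only
```

Running it against the data directory on a large node and on a normal node, and comparing the second number, should show whether old snapshots explain the difference.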