Thanks Aaron. No tmp files and not even a single exception in the system.log.
If the file was last modified on 20-Nov then there must be an entry for that in the log (either completed streaming or compacted).

On Tue, Dec 17, 2013 at 7:23 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> -tmp- files will sit in the data dir, if there was an error creating them
> during compaction or flushing to disk they will sit around until a restart.
>
> Check the logs for errors to see if compaction was failing on something.
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 17/12/2013, at 12:28 pm, Narendra Sharma <narendra.sha...@gmail.com>
> wrote:
>
> No snapshots.
>
> I restarted the node and now the Load in ring is in sync with the disk
> usage. Not sure what caused it to go out of sync. However, the Live SSTable
> count doesn't match exactly with the number of data files on disk.
>
> I am going through the Cassandra code to understand what could be the
> reason for the mismatch in the sstable count and also why there is no
> reference to some of the data files in system.log.
>
> On Mon, Dec 16, 2013 at 2:45 PM, Arindam Barua <aba...@247-inc.com> wrote:
>
>> Do you have any snapshots on the nodes where you are seeing this issue?
>>
>> Snapshots will link to sstables, which will cause them not to be deleted.
>>
>> -Arindam
>>
>> *From:* Narendra Sharma [mailto:narendra.sha...@gmail.com]
>> *Sent:* Sunday, December 15, 2013 1:15 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Cassandra 1.1.6 - Disk usage and Load displayed in ring
>> doesn't match
>>
>> We have an 8 node cluster. Replication factor is 3.
>>
>> For some of the nodes the disk usage (du -ksh .) in the data directory
>> for the CF doesn't match the Load reported by the nodetool ring command.
>> When we expanded the cluster from 4 nodes to 8 nodes (4 weeks back),
>> everything was okay. Over the last 2-3 weeks the disk usage has gone up.
>> We increased the RF from 2 to 3 two weeks ago.
>>
>> I am not sure if increasing the RF is causing this issue.
>>
>> For one of the nodes that I analyzed:
>>
>> 1. nodetool ring reported load as 575.38 GB
>>
>> 2. nodetool cfstats for the CF reported:
>>    SSTable count: 28
>>    Space used (live): 572671381955
>>    Space used (total): 572671381955
>>
>> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
>>    46
>>
>> 4. 'du -ksh .' in the data folder for CF returned
>>    720G
>>
>> The above numbers indicate that there are some sstables that are obsolete
>> and are still occupying space on disk. What could be wrong? Will restarting
>> the node help? The cassandra process has been running for the last 45 days
>> with no downtime. However, because the disk usage is high, we are not able
>> to run a full compaction.
>>
>> Also, I can't find a reference to each of the sstables on disk in the
>> system.log file. For example, I have one data file on disk as (ls -lth):
>>
>> 86G Nov 20 06:14
>>
>> I have a system.log file with first line:
>>
>> INFO [main] 2013-11-18 09:41:56,120 AbstractCassandraDaemon.java (line 101) Logging initialized
>>
>> The 86G file must be the result of some compaction. I see no reference to
>> the data file in system.log between 11/18 and 11/25. What could be the
>> reason for that? The only reference is dated 11/29, when the file was being
>> streamed to another node (new node).
>>
>> How can I identify the obsolete files and remove them? I am thinking
>> about the following. Let me know if it makes sense.
>>
>> 1. Restart the node and check the state.
>> 2. Move the oldest data files to another location (to another mount point)
>> 3. Restart the node again
>> 4. Run repair on the node so that it can get the missing data from its
>>    peers.
>>
>> I compared the numbers of a healthy node for the same CF:
>>
>> 1. nodetool ring reported load as 662.95 GB
>>
>> 2. nodetool cfstats for the CF reported:
>>    SSTable count: 16
>>    Space used (live): 670524321067
>>    Space used (total): 670524321067
>>
>> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
>>    16
>>
>> 4. 'du -ksh .' in the data folder for CF returned
>>    625G
>>
>> -Naren
>>
>> --
>> Narendra Sharma
>> Software Engineer
>> http://www.aeris.com
>> http://narendrasharma.blogspot.com/
>
> --
> Narendra Sharma
> Software Engineer
> http://www.aeris.com
> http://narendrasharma.blogspot.com/

--
Narendra Sharma
Software Engineer
http://www.aeris.com
http://narendrasharma.blogspot.com/
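
For anyone wanting to repeat the "which data files are never mentioned in system.log" check from this thread, here is a rough Python sketch. It is not a Cassandra tool: the function name `unreferenced_data_files` is made up for illustration, it reuses the same `*Data*` pattern as the `ls` command above, and it assumes the log mentions an sstable by its full file name. It reads only the one log file you give it, so rotated logs would need to be checked separately.

```python
from pathlib import Path


def unreferenced_data_files(data_dir, log_path):
    """Return the names of *Data* files in data_dir that never appear in the log.

    Hypothetical helper: matching is a plain substring test on the file name,
    so it assumes the log prints the full data file name.
    """
    # Same selection as 'ls -1 *Data*' in the CF data folder.
    remaining = {p.name for p in Path(data_dir).glob("*Data*") if p.is_file()}

    # Stream the log line by line so a large system.log is not loaded at once.
    with open(log_path, errors="replace") as log:
        for line in log:
            remaining -= {name for name in remaining if name in line}
            if not remaining:
                break  # every data file has been seen in the log

    return sorted(remaining)
```

Calling `unreferenced_data_files(cf_data_dir, system_log_path)` with the CF data folder and the log path returns a sorted list of file names. A file in that list is only a candidate for a closer look; it does not prove that the sstable is obsolete.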