Re: Understanding cassandra data directory contents

Vladimir Yudovin Mon, 10 Oct 2016 22:56:08 -0700

Snapshots are created inside the table folder (the one with the ID suffix):



$ nodetool snapshot music

Requested creating snapshot(s) for [music] with snapshot name [1476165047920]

Snapshot directory: 1476165047920



$ pwd

cassandra/data/data/music/songs-6060ae608dd811e68e340f08799f1f06/snapshots/1476165047920
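To see which tables on a node are carrying snapshots, the layout in the path above can be walked with a short script. This is only a sketch: it assumes every snapshot sits at <data root>/<keyspace>/<table>-<id>/snapshots/<snapshot name>, as in the example, and the function name list_snapshots is made up for illustration.

```python
from pathlib import Path


def list_snapshots(data_root):
    """Return sorted (keyspace, table_dir, snapshot_name) tuples under data_root.

    Assumes the layout <data_root>/<keyspace>/<table>-<id>/snapshots/<name>.
    """
    found = []
    for snap in Path(data_root).glob("*/*/snapshots/*"):
        if snap.is_dir():
            table_dir = snap.parent.parent      # .../<keyspace>/<table>-<id>
            keyspace = table_dir.parent.name
            found.append((keyspace, table_dir.name, snap.name))
    return sorted(found)


if __name__ == "__main__":
    # The data root is an assumption; adjust it to your installation.
    for keyspace, table_dir, name in list_snapshots("cassandra/data/data"):
        print(keyspace, table_dir, name)
```

If the data root does not exist, the function simply returns an empty list.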




Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra on Azure and SoftLayer.
Launch your cluster in minutes.





---- On Mon, 10 Oct 2016 17:00:03 -0400 Nicolas Douillet 
<nicolas.douil...@gmail.com> wrote ----




Hi Jason,



I'm not familiar enough with Cassandra 3, but it might be snapshots. Snapshots 
are usually hard links to the SSTable files.
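A hard-linked snapshot file takes no extra space while the live SSTable still exists, but it keeps the data on disk once the original file is removed (for example after compaction). Whether two paths are the same file on disk can be checked with a minimal sketch like the one below; the function name and the paths passed to it are placeholders, not real file names.

```python
import os


def describe_link(live_path, snapshot_path):
    """Report whether snapshot_path is a hard link to live_path."""
    same = os.path.samefile(live_path, snapshot_path)  # same device and inode
    links = os.stat(snapshot_path).st_nlink            # number of directory entries
    return {"same_file": same, "link_count": links}
```

A link count of 1 on a snapshot file means the live copy is gone and the snapshot alone is holding that space.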



Try this : 

    nodetool clearsnapshot



Does it change anything?



--

Nicolas




On Sat, 8 Oct 2016 at 21:26, Jason Kania <jason.ka...@ymail.com> wrote:





Hi Vladimir,



Thanks for the response. I assume then that it is safe to remove the 
directories that are not current as per the system_schema.tables table. I have 
dozens of directories for the same table and haven't dropped and re-added it 
nearly that many times. Do any of the nodetool or other commands clean up these 
unused directories?



Thanks,



Jason Kania


From: Vladimir Yudovin <vla...@winguzone.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Saturday, October 8, 2016 2:05 PM
 Subject: Re: Understanding cassandra data directory contents







Each table has a unique ID (the directory suffix). If you drop and then recreate 
a table with the same name, it gets a new ID.



Try

SELECT keyspace_name, table_name, id FROM system_schema.tables ;

to determine the actual ID.



You can limit the request to a specific keyspace or table.
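The directory suffix is the table ID with the dashes removed, so the query result can be matched against the directory names. A minimal sketch of that comparison follows. The function names are made up, and which of the two periodicReading directories is current is only assumed here for illustration; take the real ID from the query.

```python
import uuid


def directory_name(table_name, table_id):
    """Build the on-disk directory name: <table name>-<ID without dashes>."""
    return f"{table_name}-{uuid.UUID(table_id).hex}"


def split_current_and_stale(dir_names, schema_rows):
    """Split directory names into those matching a current table ID and the rest.

    schema_rows: (table_name, id) pairs as returned by the SELECT above.
    """
    current = {directory_name(name, tid) for name, tid in schema_rows}
    live = sorted(d for d in dir_names if d in current)
    stale = sorted(d for d in dir_names if d not in current)
    return live, stale


if __name__ == "__main__":
    # ASSUMPTION: this row is hypothetical, not actual query output.
    schema_rows = [("periodicReading", "453d55a0-501d-11e6-8623-a9d2b6f96e86")]
    on_disk = [
        "periodicReading-76eb7510096811e68a7421c8b9466352",
        "periodicReading-453d55a0501d11e68623a9d2b6f96e86",
    ]
    live, stale = split_current_and_stale(on_disk, schema_rows)
    print("current:", live)
    print("not in schema:", stale)
```

A directory listed as "not in schema" may still contain snapshots, so look inside before deleting anything.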





Best regards, Vladimir Yudovin, 






---- On Sat, 08 Oct 2016 13:42:19 -0400 Jason Kania 
<jason.ka...@ymail.com> wrote ----


Hello,



I am using Cassandra 3.0.9 and I have encountered an issue where the nodes in 
my 3-node cluster have vastly different amounts of data even though they should 
be roughly the same. When I looked through the data directory for my database 
on two of the nodes, I saw a number of directories with the same prefix, e.g.:



periodicReading-76eb7510096811e68a7421c8b9466352,

periodicReading-453d55a0501d11e68623a9d2b6f96e86

...



Only one directory with a specific table name prefix has a current date and the 
rest are older.



In contrast, on the node with the least space used, each directory has a unique 
prefix (not shared).



I am wondering what the contents of a Cassandra database directory should look 
like. Are there supposed to be multiple entries for a given table or just one?



If just one, what would be a procedure to determine whether the other 
directories for the same table are junk that can be removed?



Thanks,



Jason
