[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17525440#comment-17525440 ]
Tibor Repasi edited comment on CASSANDRA-17568 at 4/21/22 6:37 AM:
-------------------------------------------------------------------

Hi, thanks for the feedback.

{quote}
1) return all data directories of a particular table(s).
2) return all data directories which are eligible to be deleted as the respective keyspace / table (or both) does not exist anymore in Cassandra.

You implemented 1) but I miss 2).
{quote}

That's right. However, it is not trivial to identify the directories of deleted tables and keyspaces, since Cassandra doesn't keep track of them. Deleting every directory that isn't a data path of an existing table would assume that nothing else could have created it, which makes this approach particularly dangerous.

The main goal is presumably to clean up directories that Cassandra created and no longer uses. I like CASSANDRA-16843 and I really love CASSANDRA-16451, which both improve control and handling of snapshots. I could imagine Cassandra keeping track of directories belonging to dropped tables and keyspaces and cleaning them up automatically under specific circumstances after some time. Maybe data directories could have a bounded TTL? But I think that's a completely different discussion track. This ticket and my patch are about making things visible to support operators.

BTW, I am aware of the deadline for the version cut.

> Tool to list data directories
> -----------------------------
>
> Key: CASSANDRA-17568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17568
> Project: Cassandra
> Issue Type: New Feature
> Components: Tool/nodetool
> Reporter: Tibor Repasi
> Assignee: Tibor Repasi
> Priority: Normal
> Fix For: 4.x
>
>
> When a table is created, dropped and re-created with the same name,
> directories remain within data paths. Operators may have trouble finding out
> which directories belong to existing tables and which may be subject to
> removal. Although the information is available in CQL as well as in MBeans
> via JMX, convenient access to it is still missing.
> My proposal is a new nodetool subcommand that lists the data paths of all
> existing tables.
> {code}
> % bin/nodetool datapaths -- example
> Keyspace : example
> Table : test
> Paths :
> /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301
> ----------------
> {code}
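
To illustrate the situation the ticket describes, here is a minimal Python sketch that is not part of the patch. It assumes table directories are named `<table>-<32 hex digits>`, as in the example path above, and that they sit under `<data root>/<keyspace>/`. The names `TABLE_DIR`, `group_table_dirs` and `ambiguous_tables` are hypothetical. The sketch only reports tables that have more than one directory on disk; it cannot tell which directory belongs to the existing table, and it deletes nothing.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: directory name is "<table>-<32 lowercase hex digits>",
# inferred from the example path in the ticket description.
TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<table_id>[0-9a-f]{32})$")


def group_table_dirs(data_root):
    """Map (keyspace, table) to the sorted list of matching directories."""
    groups = defaultdict(list)
    for ks_dir in sorted(Path(data_root).iterdir()):
        if not ks_dir.is_dir():
            continue
        for tbl_dir in sorted(ks_dir.iterdir()):
            match = TABLE_DIR.match(tbl_dir.name)
            if tbl_dir.is_dir() and match:
                groups[(ks_dir.name, match.group("table"))].append(tbl_dir)
    return dict(groups)


def ambiguous_tables(data_root):
    """Return only the tables that have more than one directory on disk."""
    groups = group_table_dirs(data_root)
    return {key: dirs for key, dirs in groups.items() if len(dirs) > 1}
```

A table that shows up in the result of `ambiguous_tables("/var/lib/cassandra/data")` has leftover directories, but deciding which one is in use still requires the paths Cassandra itself reports, which is what the proposed subcommand makes visible.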