[ https://issues.apache.org/jira/browse/CASSANDRA-16335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcus Eriksson updated CASSANDRA-16335:
----------------------------------------
    Change Category: Operability
         Complexity: Normal
        Component/s: Local/Config
             Status: Open  (was: Triage Needed)

> Expose data dirs in ColumnFamilyStoreMBean
> -------------------------------------------
>
>                 Key: CASSANDRA-16335
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16335
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Config
>            Reporter: Stefan Miklosovic
>            Assignee: Stefan Miklosovic
>            Priority: Low
>
> As of now, I am not aware of any way to find out where a CF stores its
> data. While this might look like a detail, it is important for backup and
> restore purposes. Let's consider this workflow:
> 1) There is a keyspace "abc" with table "def"; on disk, it looks like
> /my/data/abc/def-12345/...
> 2) I take a backup; all SSTables are put somewhere under the path
> /backups/abc/def-12345/...
> 3) I drop this table via CQL; its data ends up in a "dropped" snapshot
> 4) I create this table again, but now it gets a different ID, like
> /my/data/abc/def-6789/...
> 5) I want to restore /backups/abc/def-12345/... but right now there are two
> directory structures:
> {code:java}
> ├── data
> │   ├── abc
> │   │   ├── def-12345...
> │   │   │   ├── backups
> │   │   │   └── snapshots
> │   │   │       └── dropped-1607699318139-ghi
> │   │   │           ├── manifest.json
> │   │   │           ├── na-1-big-CompressionInfo.db
> │   │   │           ├── na-1-big-Data.db
> │   │   │           ├── na-1-big-Digest.crc32
> │   │   │           ├── na-1-big-Filter.db
> │   │   │           ├── na-1-big-Index.db
> │   │   │           ├── na-1-big-Statistics.db
> │   │   │           ├── na-1-big-Summary.db
> │   │   │           ├── na-1-big-TOC.txt
> │   │   │           └── schema.cql
> │   │   └── def-6789...
> │   │       ├── backups
> │   │       ├── na-1-big-CompressionInfo.db
> │   │       ├── na-1-big-Data.db
> │   │       ├── na-1-big-Digest.crc32
> │   │       ├── na-1-big-Filter.db
> │   │       ├── na-1-big-Index.db
> │   │       ├── na-1-big-Statistics.db
> │   │       ├── na-1-big-Summary.db
> │   │       └── na-1-big-TOC.txt
> {code}
> The question now is: which directory should I restore this to? Sure, into
> the "active" one, but I cannot know which one that is, because one of them
> is no longer used, and I do not want to do something very smelly like
> listing directories on disk and checking which one does not contain a
> "dropped" directory. Yes, one might import the SSTables, a feature
> introduced in Cassandra 4, but for Cassandra 3 one can only copy the files
> over or hard-link them and then refresh.
> The second scenario is like this:
> There is just one "active" table and no structure with a "dropped" dir
> exists, but its ID (the part after the table name) differs from the one in
> the backup. If I want to copy files over and refresh, I need to resolve
> this discrepancy and copy the SSTables into a directory whose ID suffix
> differs from the ID in the backup.
> I was trying to get this information from CFSMB, but it is not exposed
> there.
> Is there any way to retrieve, via JMX, where a table actually stores its
> data?
> I have put this together: https://github.com/apache/cassandra/pull/850/files

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
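
To illustrate the second scenario in the quoted issue, here is a minimal sketch of how a restore tool might resolve the target directory if the node did report a table's active data directories. Everything in it is an assumption for illustration: the class `DataDirResolver`, the method `resolveRestoreTarget`, and the idea that the list of active paths arrives as plain strings (for example from a JMX attribute like the one the issue asks for) are hypothetical and not taken from the linked pull request. It matches on keyspace and table name and ignores the ID suffix, relying on the fact that Cassandra table names cannot contain a dash.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class DataDirResolver {

    // Returns the last path element as a string, or "" when there is none.
    private static String name(Path p) {
        return (p == null || p.getFileName() == null) ? "" : p.getFileName().toString();
    }

    // Strips the "-<id>" suffix from a table directory name such as "def-12345".
    public static String tableName(String dirName) {
        int dash = dirName.lastIndexOf('-');
        return dash < 0 ? dirName : dirName.substring(0, dash);
    }

    // Picks the active directory that belongs to the same keyspace and table as
    // the backup directory, regardless of the ID suffix.
    // activeDataPaths is assumed input: where it comes from is hypothetical.
    public static Optional<Path> resolveRestoreTarget(Path backupTableDir,
                                                      List<String> activeDataPaths) {
        String keyspace = name(backupTableDir.getParent());
        String table = tableName(name(backupTableDir));
        return activeDataPaths.stream()
                .map(Paths::get)
                .filter(p -> name(p.getParent()).equals(keyspace))
                .filter(p -> tableName(name(p)).equals(table))
                .findFirst();
    }

    public static void main(String[] args) {
        // Paths mirror the example from the issue; the IDs are illustrative.
        Path backup = Paths.get("/backups/abc/def-12345");
        List<String> active = Arrays.asList("/my/data/abc/def-6789");

        Optional<Path> target = resolveRestoreTarget(backup, active);
        System.out.println(target.map(Path::toString)
                                 .orElse("no active directory found"));
    }
}
```

With such a lookup, the copy-and-refresh step for Cassandra 3 would no longer need to inspect the disk for a "dropped" snapshot to tell the active directory from the stale one.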