> On Jan 2, 2021, at 7:30 AM, Manu Chadha <manu.cha...@hotmail.com> wrote:
>
>
> Hi
>
> Can I just copy the keyspace folders into a new Cassandra installation as a
> backup and restore strategy? I am trying to do that but it isn’t working.
>
> I am using `K8ssandra` to run my single node C* cluster. I am experimenting
> with data backup and restore. Though K8ssandra uses Medusa for data backup
> and restore, I couldn’t use it, so I thought I would test by simply copying
> the data directory. But I don’t see my data after the restore. There could be
> mistakes in my approach, so I am not really sure where to look. For example,
> K8ssandra uses Kubernetes’ persistent Volume Claims. Does that mean that the
> data is actually stored somewhere else and not in data directories of
> keyspaces?
> Is there a way to look into the files in the data directories of keyspaces to
> check what data is there? Maybe the data isn’t backed up properly.
>
> The steps I did to copy the data are:
> GKE cluster-> default-pool -> found node running k8ssandra-dc1-default-sts-0
> container
> Go to VM instances -> SSH to the node which is running
> k8ssandra-dc1-default-sts-0 container
> Once SSHed, ran “docker exec -it
> k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0
> /bin/bash”
> I noticed that the container has Cassandra :
> /opt/cassandra
> ./opt/cassandra/bin/cassandra
> ./opt/cassandra/javadoc/org/apache/cassandra
> ./var/lib/cassandra
> ./var/log/cassandra
>
> cd opt/cassandra/data/data. There were directories for each keyspace. I
> assume that when taking backups we can take a copy of this data directory.
> Then once we need to restore, we can simply copy them back to new node’s data
> directory.
>
> Note that I couldn’t run nodetool inside the container (nodetool flush or
> nodetool refresh) due to a JMX issue. I don’t know how important it is to run
> these commands. There is no traffic running on the system, though.
>
> I copied data directory from OUTSIDE container (from the node) using “docker
> cp container name:src_path dest_path” (eg. docker cp
> k8s_cassandra_k8ssandra-dc1-default-sts-0_default_00b0d72a-c124-4b04-b25d-9e0f17edc582_0:/opt/cassandra/data/data
> backup/)
>
> Then to transfer the backup directory to cloudshell (the console on web
> browser), I used “gcloud compute scp --recurse
> gke-k8ssandra-cluster-default-pool-1b1cc22a-rd6t:~/backup/data
> ~/K8ssandra_data_backup”
> Then I copied from cloudshell to my laptop/workstation, using cloudshell
> editor. This downloaded a tar of the backup (using a download link).
>
> Then I downloaded a new .gz of C*3.11.6 on my laptop. After unzipping it, I
> noticed that it hasn’t got a data directory. I ran C* and noticed that only
> default keyspaces were present. I also noticed that data directory was now
> created. I then stopped C*.
>
> Then I copied contents of backup folder (only keyspace name folders, not all
> folders) in data/data directory of a new Cassandra system which wasn’t
> running. Then I restarted the C* system but I can’t see the data via cqlsh. I
> can’t see the keyspace either, probably because I should also copy the
> system and system-* folders. But is it safe to do so? I tried it but
> ran into several issues around cluster name, snitch, data center names etc.
The schemas are stored in the system_schema keyspace, so unless you copy that
too it’s not gonna work.
Alternatively, you can issue the DDL / CREATE statements on your laptop. That
will make new table directories, and you can then copy the data files into
those directories. This is your safest and easiest option most of the time.
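The copy step can be sketched roughly as below. This is a minimal
illustration, not a tested restore tool: it assumes the Cassandra 3.x layout
in which each table directory is named "<table>-<32 hex characters>", and it
assumes you have already run the CREATE statements on the new node. The
function names (table_dirs, copy_sstables) are made up for this example.

```python
import re
import shutil
from pathlib import Path

# Assumption: Cassandra 3.x names each table directory "<table>-<32 hex chars>".
TABLE_DIR = re.compile(r"^(?P<table>.+)-(?P<id>[0-9a-f]{32})$")


def table_dirs(keyspace_dir: Path) -> dict:
    """Map table name -> table directory inside one keyspace directory."""
    found = {}
    for child in keyspace_dir.iterdir():
        match = TABLE_DIR.match(child.name)
        if child.is_dir() and match:
            found[match.group("table")] = child
    return found


def copy_sstables(backup_keyspace: Path, target_keyspace: Path) -> list:
    """Copy data files from each backed-up table directory into the
    same-named table directory that CREATE TABLE made on the new node.

    Returns the sorted names of the tables whose files were copied.
    """
    sources = table_dirs(backup_keyspace)
    targets = table_dirs(target_keyspace)
    copied = []
    for table, src in sorted(sources.items()):
        dst = targets.get(table)
        if dst is None:
            continue  # table was not created on the target; skip it
        for item in src.iterdir():
            if item.is_file():  # ignores snapshots/ and backups/ subdirectories
                shutil.copy2(item, dst / item.name)
        copied.append(table)
    return copied
```

You would call it once per keyspace, for example
copy_sstables(Path("backup/data/my_ks"), Path("data/data/my_ks")), where both
paths are placeholders for your own directories. Do this while Cassandra is
stopped and then start it, or run nodetool refresh <keyspace> <table>
afterwards so the node picks up the new files.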
>
> Would the approach of just copy/pasting folder work ?
>
> Thanks
> Manu
>