Hey guys, can someone please explain why snapshot restore doesn't work with control.sh?
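For reference, the control.sh flow under discussion (pieced together from the replies quoted below) boils down to the following sketch. The snapshot name snapshot_1 comes from the thread; how you stop or destroy the caches beforehand depends on your setup:

    # Create the snapshot while the cluster is up and active:
    ./control.sh --snapshot create snapshot_1

    # Stop/destroy the cache groups you intend to restore (they must not
    # exist at restore time), then restore them and start them back up:
    ./control.sh --snapshot restore snapshot_1 --start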
On Fri, Feb 4, 2022, 18:57 Surinder Mehra <redni...@gmail.com> wrote:

> Hey,
> Did you get a chance to review my queries, please?
>
> On Thu, Feb 3, 2022 at 4:40 PM Surinder Mehra <redni...@gmail.com> wrote:
>
>> Hi,
>> The way I am thinking of using it: if we lose the EBS volume and need
>> to restore the cluster state, I would have a secondary EBS volume as
>> my snapshot directory, so I can restore from it.
>> That means the application would need to be restarted after the EBS
>> data is copied back to the work directory. I see two options here:
>> 1. Manual, as described in the previous reply: manually copy the data
>> from the snapshot directory to work/db and restart the cluster.
>> 2. Use the control script: I am not clear on how this will work,
>> because if I restart the cluster, it is going to create the directory
>> structure again, and then when we run the restore command, it does not
>> copy the data.
>>
>> Could you please suggest how this would work? The directory structure
>> is attached.
>> Also, can you suggest a better way to copy the snapshot directory data
>> to S3? I am thinking of using a Kubernetes CSI driver to do it. Any
>> objections to it?
>>
>> On Thu, Feb 3, 2022 at 4:23 PM Maxim Muzafarov <mmu...@apache.org> wrote:
>>
>>> Hello,
>>>
>>> You don't need to stop the cluster or delete/move any snapshot files
>>> if you are using the restore procedure from control.sh, so the
>>> following should work:
>>> - create a snapshot
>>> - stop the caches you intend to restore
>>> - run ./control.sh --snapshot restore snapshot_1 --start
>>>
>>> Can you provide the directory structure of the Ignite working
>>> directory? (use the `tree` command)
>>>
>>> On Wed, 2 Feb 2022 at 22:15, Surinder Mehra <redni...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> > Could you please point out if I missed something?
>>> >
>>> > On Wed, Feb 2, 2022, 13:39 Surinder Mehra <redni...@gmail.com> wrote:
>>> >>
>>> >> Hey, thanks for your suggestions.
>>> >>
>>> >> I tried restoring using control.sh, but it doesn't seem to work.
>>> >> Below are the steps:
>>> >> 1. Started 3 nodes and added data using a thick client.
>>> >> 2. Created a snapshot with ./control.sh --snapshot create snapshot_1
>>> >> 3. Verified that the snapshot directory has data.
>>> >> 4. Stopped the cluster and cleared the binary_data, marshaller and
>>> >> node directories under /db.
>>> >> 5. Started the cluster again, all 3 nodes.
>>> >> 6. Activated the cluster using ./control.sh --set-state ACTIVE
>>> >> 7. Ran the restore command: ./control.sh --snapshot restore
>>> >> snapshot_1 --start
>>> >> 8. The command was successful, but the data was not copied to the
>>> >> cluster nodes.
>>> >>
>>> >> Please note that when I restarted the cluster, it created the
>>> >> binary_data, marshaller and node directories by default.
>>> >>
>>> >> Did I miss anything?
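The manual alternative that the older messages below converge on can be sketched like this (assumptions: the whole cluster is stopped first, snapshots live under the default $IGNITE_HOME/work/snapshots directory, and snapshot_1 and <node_id> are placeholders for your actual names):

    SNAP=$IGNITE_HOME/work/snapshots/snapshot_1
    WORK=$IGNITE_HOME/work/db

    # Back up, then clear, the current persistence files (and, per the
    # advice below, back up and clear the wal directory as well):
    rm -rf $WORK/binary_data/* $WORK/marshaller/* $WORK/<node_id>

    # The snapshot mirrors the native persistence layout, so it can be
    # copied back one-to-one:
    cp -r $SNAP/db/binary_data $WORK/
    cp -r $SNAP/db/marshaller  $WORK/
    cp -r $SNAP/db/<node_id>   $WORK/

    # Restart all nodes; the cluster should auto-activate once every
    # baseline node has joined.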
>>> >> On Tue, Feb 1, 2022 at 8:21 PM Maxim Muzafarov <mmu...@apache.org> wrote:
>>> >>>
>>> >>> Hello,
>>> >>>
>>> >>> Your case looks correct to me; however, I'd like to mention some
>>> >>> important points that may help you:
>>> >>> - the snapshot has the same directory structure as the Ignite
>>> >>> native persistence, so you may back up the original cluster node
>>> >>> directory (binary_data, marshaller and db) and move all the files
>>> >>> right from the snapshot.
>>> >>> - do not forget to back up and clear the original wal directory
>>> >>> when restoring.
>>> >>> - you may use the control.sh --snapshot restore command to restore
>>> >>> from a snapshot (this was added in 2.11 [1]).
>>> >>>
>>> >>> [1] https://issues.apache.org/jira/browse/IGNITE-13805
>>> >>>
>>> >>> On Tue, 1 Feb 2022 at 16:28, Surinder Mehra <redni...@gmail.com> wrote:
>>> >>> >
>>> >>> > Hi,
>>> >>> > After a few hiccups, I managed to restore the cluster state from
>>> >>> > the snapshot. Please confirm that these steps look correct. If
>>> >>> > so, the documentation page needs to be updated.
>>> >>> >
>>> >>> > 1. Create N nodes.
>>> >>> > 2. Add some data to them.
>>> >>> > 3. Create a snapshot.
>>> >>> > 4. Stop all nodes (the whole cluster).
>>> >>> > 5. Delete binary_data, marshaller and the subdirectories of
>>> >>> > /work/db.
>>> >>> > 6. Copy snapshots/<snapshot_name>/db/binary_data to /work/db/.
>>> >>> > 7. Copy snapshots/<snapshot_name>/db/marshaller to /work/db/.
>>> >>> > 8. Copy the snapshots/<snapshot_name>/db/{node_id} dir to
>>> >>> > /work/db/.
>>> >>> > 9. Start the cluster.
>>> >>> > 10. The cluster should auto-activate after all nodes join it.
>>> >>> > 11. The cluster is ready.
>>> >>> >
>>> >>> > On Mon, Jan 31, 2022 at 7:14 PM Surinder Mehra <redni...@gmail.com> wrote:
>>> >>> >>
>>> >>> >> Hi,
>>> >>> >> We are using Ignite 2.11.1 to experiment with Ignite snapshots.
>>> >>> >> We tried the steps mentioned on the page below to restore
>>> >>> >> Ignite data from a snapshot:
>>> >>> >> https://ignite.apache.org/docs/latest/snapshots/snapshots
>>> >>> >>
>>> >>> >> But we get the error below when we start the cluster after
>>> >>> >> copying the data manually as described on the page.
>>> >>> >>
>>> >>> >> Steps:
>>> >>> >> 1. Created 3 nodes and added 3 records.
>>> >>> >> 2. Created a snapshot.
>>> >>> >> 3. Stopped the cluster and removed the files from binary_data
>>> >>> >> and marshaller (not the directories; they are present but
>>> >>> >> empty).
>>> >>> >> 4. Removed the node-id directories and the files under them
>>> >>> >> from /work/db/.
>>> >>> >> 5. Copied the node-id directories from the snapshot directory
>>> >>> >> to /work/db/. I guess the step below is meant to say
>>> >>> >> $IGNITE_HOME/work/db/, right?
>>> >>> >>
>>> >>> >> "Copy the files belonging to a node with the {node_id} from the
>>> >>> >> snapshot into the $IGNITE_HOME/work/ directory. If the
>>> >>> >> db/{node_id} directory is not located under the Ignite work dir
>>> >>> >> then you need to copy data files there."
>>> >>> >>
>>> >>> >> Error: do we need to copy the binary_data and marshaller files
>>> >>> >> as well, or is something else missing?
>>> >>> >>
>>> >>> >> Caused by: class org.apache.ignite.IgniteCheckedException:
>>> >>> >> Cannot find metadata for object with compact footer (Ignite
>>> >>> >> work directory might have been cleared after restart. Make sure
>>> >>> >> that IGNITE_HOME does not point to a temp folder or any other
>>> >>> >> folder that is destroyed/cleared on restarts)
>>> >>> >> [typeId=-88020438, IGNITE_HOME='null']
>>> >>> >>
>>> >>> >> Please note that the $IGNITE_HOME/work/db directory has all the
>>> >>> >> node data copied from the snapshot; it is not cleared, as the
>>> >>> >> error above suggests.
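On the side question of copying the snapshot directory to S3: besides a Kubernetes CSI driver, a plain object-store sync run from a sidecar or cron job is another option. A minimal sketch, with a made-up bucket name and the default snapshot path assumed:

    # One-way sync of the local snapshot directory to S3; subsequent runs
    # upload only files that have changed.
    aws s3 sync "$IGNITE_HOME/work/snapshots/snapshot_1" \
        s3://my-ignite-backups/snapshot_1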