Hello,

As far as I can see from the directory structure, you are running 3
nodes on the same persistent data storage.

So, for a manual restore you should do the following (a rough command
sketch follows the list):
- stop the cluster;
- back up the whole `work/db` directory (e.g. using the mv command);
- copy everything from e.g. `snapshot_1/db` to `work/db`; note that the
`work/db` directory must be completely empty at this point;
- make sure the db/wal and db/{node}/cp directories are empty at this
point for all the nodes you're restoring;
- start the cluster and check the restored snapshot, e.g. by running the
idle_verify command from control.sh;
- remove the backup from step 2 if you don't need it anymore.
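
A rough sketch of those steps in commands, assuming the default layout
under $IGNITE_HOME/work (including the default snapshot location
work/snapshots) and a snapshot named snapshot_1; adjust the paths to
your setup:

  # all nodes are stopped at this point
  mv work/db work/db.bak                            # step 2: back up the whole work/db
  mkdir work/db
  cp -r work/snapshots/snapshot_1/db/* work/db/     # step 3: copy the snapshot files in
  ls -A work/db/wal work/db/*/cp 2>/dev/null        # step 4: these must be absent or empty
  # start the nodes, then verify the restored data:
  ./control.sh --cache idle_verify
  # once verified, remove work/db.bak if you no longer need it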

If you are using control.sh to restore from a snapshot, you don't need
to stop the cluster or copy any files manually, and you don't need to
deactivate the cluster either. You only need to stop the caches you
intend to restore and run a command from the CLI; see the documentation
page [1] and the example below.
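
A minimal sketch of that flow, reusing the snapshot name from the manual
example above; the --status action should be available alongside --start
in 2.11 as an optional progress check:

  # stop/destroy the caches being restored from your application first, then:
  ./control.sh --snapshot restore snapshot_1 --start
  ./control.sh --snapshot restore snapshot_1 --status   # optional: check the restore progress
  ./control.sh --cache idle_verify                      # verify once the restore has finished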

You can configure the snapshot directory path [2] to take snapshots
right on your secondary EBS volume.
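
For instance (a minimal sketch; it assumes the snapshot path in the node
configuration already points at the secondary EBS mount, and the mount
point below is only a made-up example):

  ./control.sh --snapshot create snapshot_1
  ls /mnt/ebs2/ignite-snapshots/snapshot_1/db   # the snapshot lands directly on the secondary volume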


[1] https://ignite.apache.org/docs/2.11.1/snapshots/snapshots#restoring-cache-group-from-the-snapshot
[2] https://ignite.apache.org/docs/2.11.1/persistence/snapshot-directory#configuring-snapshot-directory

On Mon, 7 Feb 2022 at 12:35, Surinder Mehra <redni...@gmail.com> wrote:
>
> Hey guys,
> Can someone please explain why snapshot restore doesn't work with control.sh?
>
> On Fri, Feb 4, 2022, 18:57 Surinder Mehra <redni...@gmail.com> wrote:
>>
>> Hey,
>> Did you get a chance to review my queries, please?
>>
>> On Thu, Feb 3, 2022 at 4:40 PM Surinder Mehra <redni...@gmail.com> wrote:
>>>
>>> Hi,
>>> The way I am thinking of using it: if we lose the EBS volume and need to
>>> restore the cluster state, I would have a secondary EBS volume as my
>>> snapshot directory so that I can restore from it.
>>> That means the application would need to be restarted after the EBS data is
>>> copied back to the work directory. I see two options here:
>>> 1. Manual, as described in the previous reply: manually copy data from the
>>> snapshot directory to work/db and restart the cluster.
>>> 2. Use the control script: I am not clear on how this will work, because if I
>>> restart the cluster, it is going to create the directory structure again, and
>>> then when we run the restore command, it does not copy data.
>>>
>>> Could you please suggest how it would work? The directory structure is attached.
>>> Also, can you suggest a better way to copy the snapshot directory data to S3?
>>> I am thinking of using a Kubernetes CSI driver to do it. Any objections to
>>> that?
>>>
>>>
>>>
>>> On Thu, Feb 3, 2022 at 4:23 PM Maxim Muzafarov <mmu...@apache.org> wrote:
>>>>
>>>> Hello,
>>>>
>>>> You don't need to stop the cluster or delete/move any snapshot files
>>>> if you are using the restore procedure from control.sh, so
>>>> the following should work:
>>>> - create a snapshot
>>>> - stop the caches you intend to restore
>>>> - run ./control.sh --snapshot restore snapshot_1 --start
>>>>
>>>> Can you provide the directory structure of the Ignite working
>>>> directory? (use the `tree` command)
>>>>
>>>> On Wed, 2 Feb 2022 at 22:15, Surinder Mehra <redni...@gmail.com> wrote:
>>>> >
>>>> > Hi,
>>>> > Could you please point out if I missed something?
>>>> >
>>>> > On Wed, Feb 2, 2022, 13:39 Surinder Mehra <redni...@gmail.com> wrote:
>>>> >>
>>>> >> Hey, thanks for your suggestions.
>>>> >>
>>>> >> I tried restoring using control.sh but it doesn't seem to work. Below
>>>> >> are the steps:
>>>> >> 1. Started 3 nodes and added data using a thick client
>>>> >> 2. Created a snapshot using ./control.sh --snapshot create
>>>> >> snapshot_1
>>>> >> 3. Verified that the snapshot directory has data
>>>> >> 4. Stopped the cluster and cleared the binary_data, marshaller and node
>>>> >> directories under /db
>>>> >> 5. Started the cluster again, all 3 nodes
>>>> >> 6. Activated the cluster using ./control.sh --set-state ACTIVE
>>>> >> 7. Ran the restore command: ./control.sh --snapshot restore snapshot_1
>>>> >> --start
>>>> >> 8. The command was successful but the data was not copied to the cluster nodes.
>>>> >>
>>>> >> Please note that when I restarted the cluster, it created the binary_data,
>>>> >> marshaller and node directories by default.
>>>> >>
>>>> >> Did I miss anything?
>>>> >>
>>>> >>
>>>> >> On Tue, Feb 1, 2022 at 8:21 PM Maxim Muzafarov <mmu...@apache.org> 
>>>> >> wrote:
>>>> >>>
>>>> >>> Hello,
>>>> >>>
>>>> >>> Your case looks correct to me; however, I'd like to mention some
>>>> >>> important points that may help you:
>>>> >>> - the directory structure of a snapshot is the same as that of
>>>> >>> Ignite native persistence, so you may back up the original cluster
>>>> >>> node directories (binary_data, marshaller and db) and move all the
>>>> >>> files right from the snapshot;
>>>> >>> - do not forget to back up and clear the original wal directory when
>>>> >>> restoring;
>>>> >>> - you may use the control.sh --snapshot restore command to restore from a
>>>> >>> snapshot (this was added in 2.11 [1])
>>>> >>>
>>>> >>> [1] https://issues.apache.org/jira/browse/IGNITE-13805
>>>> >>>
>>>> >>> On Tue, 1 Feb 2022 at 16:28, Surinder Mehra <redni...@gmail.com> wrote:
>>>> >>> >
>>>> >>> > Hi,
>>>> >>> > After a few hiccups, I managed to restore the cluster state from the
>>>> >>> > snapshot. Please confirm whether the steps below look correct. If so, the
>>>> >>> > documentation page needs to be updated.
>>>> >>> >
>>>> >>> > Create N nodes
>>>> >>> > Add some data to them
>>>> >>> > Create a snapshot
>>>> >>> > Stop all nodes (the whole cluster)
>>>> >>> > Delete binary_data, marshaller and the subdirectories of /work/db
>>>> >>> > Copy snapshots/snapshotname/db/binary_data to /work/db/
>>>> >>> > Copy snapshots/snapshotname/db/marshaller to /work/db/
>>>> >>> > Copy snapshots/snapshotname/db/{nodeid} dir to /work/db/
>>>> >>> > Start the cluster
>>>> >>> > The cluster should auto-activate after all nodes join it
>>>> >>> > The cluster is ready
>>>> >>> >
>>>> >>> >
>>>> >>> > On Mon, Jan 31, 2022 at 7:14 PM Surinder Mehra <redni...@gmail.com> 
>>>> >>> > wrote:
>>>> >>> >>
>>>> >>> >> Hi,
>>>> >>> >> We are using Ignite 2.11.1 to experiment with Ignite snapshots. We
>>>> >>> >> tried the steps mentioned on the page below to restore Ignite data from a
>>>> >>> >> snapshot:
>>>> >>> >> https://ignite.apache.org/docs/latest/snapshots/snapshots
>>>> >>> >>
>>>> >>> >> But we get the error below when we start the cluster after copying the
>>>> >>> >> data manually as described on the page.
>>>> >>> >>
>>>> >>> >> Steps:
>>>> >>> >> 1. Created 3 nodes and added 3 records
>>>> >>> >>
>>>> >>> >> 2. Created a snapshot.
>>>> >>> >> 3. Stopped the cluster and removed the files from binary_data and
>>>> >>> >> marshaller, not the directories; they are present but empty
>>>> >>> >> 4. Removed the nodeId directories and the files under them from /work/db/
>>>> >>> >>
>>>> >>> >> 5. Copied the node id directories from the snapshot directory to /work/db/.
>>>> >>> >> I guess the step below meant to say $IGNITE_HOME/work/db/, right?
>>>> >>> >>
>>>> >>> >> Copy the files belonging to a node with the {node_id} from the 
>>>> >>> >> snapshot into the $IGNITE_HOME/work/ directory. If the db/{node_id} 
>>>> >>> >> directory is not located under the Ignite work dir then you need to 
>>>> >>> >> copy data files there.
>>>> >>> >>
>>>> >>> >> Error: do we need to copy the binary_data and marshaller files as well,
>>>> >>> >> or is something else missing?
>>>> >>> >>
>>>> >>> >> Caused by: class org.apache.ignite.IgniteCheckedException: Cannot 
>>>> >>> >> find metadata for object with compact footer (Ignite work directory 
>>>> >>> >> might have been cleared after restart. Make sure that IGNITE_HOME 
>>>> >>> >> does not point to a temp folder or any other folder that is 
>>>> >>> >> destroyed/cleared on restarts) [typeId=-88020438, 
>>>> >>> >> IGNITE_HOME='null']
>>>> >>> >>
>>>> >>> >> Please note that the $IGNITE_HOME/work/db directory has all nodes' data
>>>> >>> >> copied from the snapshot; it is not cleared, contrary to what the error above suggests.
>>>> >>> >>
>>>> >>> >>
