Re: [DISCUSSION] Hot cache backup
👍

https://twitter.com/ApacheIgnite/status/1256249943846576129

-
Denis

On Fri, May 1, 2020 at 2:05 AM Maxim Muzafarov wrote:

> Folks,
>
> I've merged the changes.
> Thanks everyone for the help.
>
> Here are a few tasks that I'm going to complete too.
>
> https://issues.apache.org/jira/browse/IGNITE-12968
> https://issues.apache.org/jira/browse/IGNITE-12967
> https://issues.apache.org/jira/browse/IGNITE-12961
Re: [DISCUSSION] Hot cache backup
Folks,

I've merged the changes.
Thanks everyone for the help.

Here are a few tasks that I'm going to complete too.

https://issues.apache.org/jira/browse/IGNITE-12968
https://issues.apache.org/jira/browse/IGNITE-12967
https://issues.apache.org/jira/browse/IGNITE-12961

On Wed, 29 Apr 2020 at 21:21, Denis Magda wrote:
>
> Maxim,
>
> Ok, let's follow your plan. Ping me once the docs are ready. It's
> definitely not a blocker for merging the feature into the master. We can
> always adjust the implementation in the master before a public release.
>
> Btw, could you please fill in the "readiness estimated data" column in the
> roadmap draft? I've added the snapshots to the table earlier:
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+Roadmap
>
> -
> Denis
Re: [DISCUSSION] Hot cache backup
Maxim,

Ok, let's follow your plan. Ping me once the docs are ready. It's
definitely not a blocker for merging the feature into the master. We can
always adjust the implementation in the master before a public release.

Btw, could you please fill in the "readiness estimated data" column in the
roadmap draft? I've added the snapshots to the table earlier:
https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+Roadmap

-
Denis

On Wed, Apr 29, 2020 at 9:17 AM Maxim Muzafarov wrote:

> Denis,
>
> No, I don't. I'm planning to work on the documentation pages right after
> we finish with the source code changes. I will be very grateful if
> you help with the review of the documentation.
>
> Currently, the approach is very straightforward and simple, and I doubt
> we can change anything from the user's standpoint:
> 1. There is a single method for creating snapshots of all persisted
> cluster caches - createSnapshot(name);
> 2. Users can change the location of the base snapshot directory to any
> location they like (an absolute or a relative path can be used; the
> setting is available from IgniteConfiguration);
> 3. The created snapshot will have the same directory structure as the
> Ignite instances have;
> 4. Users will be able to start Ignite instances right from the snapshot
> directory, and everything will work fine for them (with respect to the
> consistent nodeId).
Re: [DISCUSSION] Hot cache backup
Denis,

No, I don't. I'm planning to work on the documentation pages right after
we finish with the source code changes. I will be very grateful if
you help with the review of the documentation.

Currently, the approach is very straightforward and simple, and I doubt
we can change anything from the user's standpoint:
1. There is a single method for creating snapshots of all persisted
cluster caches - createSnapshot(name);
2. Users can change the location of the base snapshot directory to any
location they like (an absolute or a relative path can be used; the
setting is available from IgniteConfiguration);
3. The created snapshot will have the same directory structure as the
Ignite instances have;
4. Users will be able to start Ignite instances right from the snapshot
directory, and everything will work fine for them (with respect to the
consistent nodeId).

On Wed, 29 Apr 2020 at 19:06, Denis Magda wrote:
>
> Hi Maxim,
>
> Do you have a draft of docs in any form explaining how the feature is
> supposed to be used (snapshot creation, restore procedure,
> setting/changing the snapshot location, etc. - essential operations for such
> capabilities)? I can help with the review from the user standpoint and
> might advise usability improvements.
>
> -
> Denis
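As an illustration of point 2 in Maxim's list above (an absolute or a relative base snapshot directory), here is a minimal Java sketch of how such a setting could be resolved. The class `SnapshotPaths`, its methods, the idea of resolving a relative path against the node's work directory, and the example paths are all assumptions made for this sketch. They are not part of the Ignite API, and Ignite's actual resolution logic may differ.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Ignite): works out where a snapshot would be stored. */
public class SnapshotPaths {
    /**
     * Resolves the configured base snapshot directory.
     * An absolute path is used as-is; a relative path is assumed
     * (for this sketch only) to be resolved against the node's work directory.
     */
    public static Path resolveBaseDir(Path workDir, String configured) {
        Path base = Paths.get(configured);
        return base.isAbsolute() ? base.normalize() : workDir.resolve(base).normalize();
    }

    /** Returns the directory of a single named snapshot under the base directory. */
    public static Path snapshotDir(Path workDir, String configured, String snapshotName) {
        return resolveBaseDir(workDir, configured).resolve(snapshotName);
    }

    public static void main(String[] args) {
        Path workDir = Paths.get("/opt/ignite/work"); // example value only

        // Relative setting: ends up under the work directory.
        System.out.println(snapshotDir(workDir, "snapshots", "backup_1"));

        // Absolute setting: the work directory is ignored.
        System.out.println(snapshotDir(workDir, "/mnt/backups", "backup_1"));
    }
}
```

Under point 3, each snapshot directory would then mirror the layout of a regular node directory, which is what makes starting a node directly from it (point 4) plausible.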
Re: [DISCUSSION] Hot cache backup
Hi Maxim,

Do you have a draft of docs in any form explaining how the feature is
supposed to be used (snapshot creation, restore procedure,
setting/changing the snapshot location, etc. - essential operations for such
capabilities)? I can help with the review from the user standpoint and
might advise usability improvements.

-
Denis

On Wed, Apr 29, 2020 at 8:57 AM Maxim Muzafarov wrote:

> Folks,
>
> I'm going to merge this issue [1] on the 1st of May.
> If you still have any questions or PR improvement suggestions, please
> let me know.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11073
Re: [DISCUSSION] Hot cache backup
Folks,

I'm going to merge this issue [1] on the 1st of May.
If you still have any questions or PR improvement suggestions, please
let me know.

[1] https://issues.apache.org/jira/browse/IGNITE-11073

On Mon, 27 Apr 2020 at 18:27, Maxim Muzafarov wrote:
>
> Alexey,
>
> From my point of view, the feature is fully self-sufficient and ready
> for a release (with a small caveat):
> - administrators will be able to create snapshots without writing Java code;
> - developers will be able to create snapshots through the Java API;
>
> The documentation pages for the create and restore procedures, with
> examples, will be completed by me prior to releasing this feature to our
> end users.
>
> All other features mentioned in this list [1] add convenience for
> users but are not mandatory. I'll try to finish these tasks from the list
> [1] prior to release:
> - support snapshot creation from a client node
> - add starting a snapshot via control.sh
>
> Are there any details I've missed?
>
> [1] https://github.com/apache/ignite/pull/7607#issuecomment-618964647
Re: [DISCUSSION] Hot cache backup
Alexey,

From my point of view, the feature is fully self-sufficient and ready
for a release (with a small caveat):
- administrators will be able to create snapshots without writing Java code;
- developers will be able to create snapshots through the Java API;

The documentation pages for the create and restore procedures, with
examples, will be completed by me prior to releasing this feature to our
end users.

All other features mentioned in this list [1] add convenience for
users but are not mandatory. I'll try to finish these tasks from the list
[1] prior to release:
- support snapshot creation from a client node
- add starting a snapshot via control.sh

Are there any details I've missed?

[1] https://github.com/apache/ignite/pull/7607#issuecomment-618964647

On Mon, 27 Apr 2020 at 18:12, Alexey Goncharuk wrote:
>
> Maxim,
>
> I saw the list of the tickets you want to work on in the PR, it looks nice.
> I was wondering, what part of that list are you planning to implement
> before the feature is released to end users? For example, I agree with
> Slava that we should implement a command-line utility part for snapshots
> before the release, however I think it's better to do it in a separate
> ticket.
>
> I know we do not have a strict policy regarding the development of big
> features in the community, so perhaps it's a good time to discuss this? If
> we are ok with merging separate tickets to master, how do we ensure that a
> complete feature is released to the public? If not, should we create a
> feature branch and wait for all related tickets to be merged there? Will be
> glad to discuss this in a separate thread if needed.
Re: [DISCUSSION] Hot cache backup
Maxim,

I saw the list of the tickets you want to work on in the PR, it looks
nice. I was wondering, what part of that list are you planning to
implement before the feature is released to end users? For example, I
agree with Slava that we should implement the command-line utility part
for snapshots before the release; however, I think it's better to do it
in a separate ticket.

I know we do not have a strict policy regarding the development of big
features in the community, so perhaps it's a good time to discuss this?
If we are OK with merging separate tickets to master, how do we ensure
that a complete feature is released to the public? If not, should we
create a feature branch and wait for all related tickets to be merged
there? I will be glad to discuss this in a separate thread if needed.

On Mon, 27 Apr 2020 at 14:38, Maxim Muzafarov wrote:
> Are there any cases left which we need to discuss?
> Do you have any questions?
Re: [DISCUSSION] Hot cache backup
Folks,

Are there any cases left which we need to discuss?
Do you have any questions?
I'm ready to provide all the details you need for the review.

Who else wants to take a look at my changes [1] [2]?

[1] https://issues.apache.org/jira/browse/IGNITE-11073
[2] https://github.com/apache/ignite/pull/7607

On Fri, 24 Apr 2020 at 15:01, Maxim Muzafarov wrote:
> Additional tests were added.
> Additional comments with further steps were added.
Re: [DISCUSSION] Hot cache backup
Alexey,

I've addressed all your comments; please take a look at the PR [1].
Additional tests were added.
Additional comments with further steps were added.

[1] https://github.com/apache/ignite/pull/7607
[2] https://issues.apache.org/jira/browse/IGNITE-11073

On Tue, 21 Apr 2020 at 09:53, Alexey Goncharuk wrote:
> Maxim,
>
> I've left my comments in the PR.
Re: [DISCUSSION] Hot cache backup
Maxim,

I've left my comments in the PR.

On Mon, 20 Apr 2020 at 12:52, Maxim Muzafarov wrote:
> Alexey G,
> Will you take a look at my changes[1]?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11073
Re: [DISCUSSION] Hot cache backup
Alex P,
Thank you for the great, sophisticated review.

Alexey G,
Will you take a look at my changes [1]?
A fresh TC.Bot visa is attached.

[1] https://issues.apache.org/jira/browse/IGNITE-11073

On Mon, 20 Apr 2020 at 11:54, Alex Plehanov wrote:
> Maxim, I've reviewed your PR and it looks good to me. Good job!
Re: [DISCUSSION] Hot cache backup
Maxim, I've reviewed your PR and it looks good to me. Good job!

On Fri, 10 Apr 2020 at 19:43, Alexey Goncharuk wrote:
> Maxim,
>
> Thanks for raising this PR. I will do a review during next week.
>
> --AG
Re: [DISCUSSION] Hot cache backup
Maxim, Thanks for raising this PR. I will do a review during next week. --AG
Re: [DISCUSSION] Hot cache backup
Andrey,

> What about primary/backup node data consistency.

Primary and backup partitions must be fully consistent in a snapshot;
no additional recovery procedures are required. So, when we restore a
snapshot on the same topology, everything will work right out of the
box - no WAL needed.

This is achieved by triggering PME [1]. Doing this, we get a point in
time when all started transactions are finished (on backups too) and
new ones are blocked on a new topology version. That's the point in
time when the snapshot operation starts. This is also a weak point of
the current solution, since the process blocks all cluster transactions
for a while. See [2].

> I cant quite picture how persistence rebalancing works

The WAL rebalance will not happen. The full rebalance will be used when
restoring a snapshot on a different topology. For now, only restoring
on the same cluster topology (same baseline) will work fine; other
cases must be explicitly tested, but in theory they will work too.

> You analyze alternative snapshot solutions based on WAL?

Do you mean taking snapshots from the cluster without blocking
transactions (without PME)? It's not a trivial task from my point of
view. Currently, I have no design for it that covers all corner cases.

[1] https://cwiki.apache.org/confluence/display/IGNITE/%28Partition+Map%29+Exchange+-+under+the+hood
[2] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/preloader/GridDhtPartitionsExchangeFuture.java#L1524

On Thu, 9 Apr 2020 at 00:52, Andrey Dolmatov wrote:
> I would like to understand your solution deeper.
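The consistency argument in the reply above (wait until running transactions finish, block new ones, then start the snapshot) can be illustrated with a toy model. This is only an analogy in plain Java and not Ignite's PME code: the class name, the two counters standing in for a primary and a backup copy, and the use of a read-write lock as the barrier are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model (hypothetical, not Ignite code) of "finish running updates, block new ones, then snapshot". */
final class SnapshotBarrierSketch {
    /** Updates share the read lock; the snapshot takes the write lock. */
    private final ReentrantReadWriteLock barrier = new ReentrantReadWriteLock();

    /** Stand-ins for a primary partition and its backup copy. */
    private final AtomicLong primary = new AtomicLong();
    private final AtomicLong backup = new AtomicLong();

    /** A "transaction": updates the primary first and the backup second (not atomic as a pair). */
    void update(long delta) {
        barrier.readLock().lock();
        try {
            primary.addAndGet(delta);
            backup.addAndGet(delta);
        }
        finally {
            barrier.readLock().unlock();
        }
    }

    /** Waits for in-flight updates, blocks new ones, and copies both values. */
    long[] snapshot() {
        barrier.writeLock().lock();
        try {
            return new long[] {primary.get(), backup.get()};
        }
        finally {
            barrier.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SnapshotBarrierSketch store = new SnapshotBarrierSketch();
        Thread[] writers = new Thread[4];

        for (int i = 0; i < writers.length; i++) {
            writers[i] = new Thread(() -> {
                for (int n = 0; n < 10_000; n++)
                    store.update(1);
            });
            writers[i].start();
        }

        // Snapshots taken while writers are running: both copies always match.
        for (int i = 0; i < 5; i++) {
            long[] snap = store.snapshot();
            System.out.println("primary=" + snap[0] + " backup=" + snap[1]);
        }

        for (Thread w : writers)
            w.join();
    }
}
```

Because the snapshot only proceeds once no update is half-applied, the two copies it reads always agree, which mirrors why no WAL replay is needed on restore. The same lock also shows the cost named in the reply: every update stalls while the barrier is held.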
Re: [DISCUSSION] Hot cache backup
I would like to understand your solution more deeply. I hope my
questions are interesting not only to me:

- What about primary/backup node data consistency? I found that
  Cassandra uses eventually consistent backups [1], so some backup data
  could be missing from a snapshot. If I apply a snapshot, would Ignite
  detect this and rebalance data to backup nodes?
- I can't quite picture how persistence rebalancing works, but
  according to [2] it uses WAL logs. A snapshot doesn't contain WAL
  data, correct? Did you analyze alternative snapshot solutions based
  on WAL?

[1] https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsAboutSnapshots.html
[2] https://cwiki.apache.org/confluence/display/IGNITE/Persistent+Store+Architecture#PersistentStoreArchitecture-Rebalancing

On Wed, 8 Apr 2020 at 18:22, Maxim Muzafarov wrote:
> A snapshot contains a full copy of persistence data on each local
> node. This means all primary, backup partitions and the SQL index file
> available on the local node are copied to snapshot.
Re: [DISCUSSION] Hot cache backup
Andrey,

Thanks for your questions. I've also clarified some details on the
IEP-43 [1] page according to them.

> Does snapshot contain only primary data or backup partitions or both?

A snapshot contains a full copy of the persistence data on each local
node. This means all primary and backup partitions and the SQL index
file available on the local node are copied to the snapshot.

> Could I create snapshot from m-node cluster and apply it to n-node cluster
> (n<>m)?

Currently, the restore procedure is fully manual, but in general it is
possible to restore on a different topology. There are a few options
here:
- m == n, the easiest and fastest way
- m < n, the cluster will start and the rebalance will happen (see
  testClusterSnapshotWithRebalancing in the PR). If some SQL indexes
  exist, it may take quite a long time to complete.
- m > n, the hardest case. For instance, if backups > 1 you can start
  a cluster and remove nodes one by one from the baseline. I think this
  case should be covered by additional recovery scripts which will be
  developed later.

> - Should data node has extra space on persistent store to create snapshot?
> Or, from another point of view, would size of temporary file be equal to size
> of all data on cluster node?

If a cluster has no load, you only need free space to store the
snapshot, which is almost equal to the size of the node's `db`
directory.

If a cluster is under load, it needs some extra space to store
intermediate snapshot results. The amount of such space depends on how
fast cache partition files are copied to the snapshot directory (it
grows if disks are slow). The maximum size of the temporary file for
each partition is equal to the size of the corresponding partition
file. So, in the worst case you need x3 extra disk size. But according
to my measurements, assuming an SSD is used and the size of each
partition is 300MB, a cluster under high load will require no more
than 1-3%.

> - What resulted snapshot is, single file or collection of files (one for
> every data node)?

Check the example of the snapshot directory structure on the IEP-43
page [1]; this is what a completed snapshot will look like.

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots#IEP-43:Clustersnapshots-Restoresnapshot(manually)

On Wed, 8 Apr 2020 at 17:18, Andrey Dolmatov wrote:
> Hi, Maxim!
> It is very useful feature, great job!
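The disk space figures in the answer above can be turned into a rough per-node estimate. The sketch below is plain arithmetic in Java, not an Ignite API: the class and method names are hypothetical, the partition count of 1024 is an assumed example, and reading the "1-3%" as a percentage of the `db` directory size is an assumption. The thread does not spell out how the "x3" figure is derived, so the code only adds up the two bounds that are stated explicitly (the snapshot copy and the per-partition temporary files).

```java
import java.util.Arrays;

/** Back-of-the-envelope disk estimate for one node; all names are hypothetical. */
final class SnapshotSpaceEstimate {
    /** Space for the snapshot copy: roughly the size of the node's db directory. */
    static long snapshotCopyBytes(long dbDirBytes) {
        return dbDirBytes;
    }

    /** Upper bound for temporary files: one per partition, each at most as large as its partition file. */
    static long worstCaseTempBytes(long[] partitionFileBytes) {
        long total = 0;
        for (long size : partitionFileBytes)
            total += size;
        return total;
    }

    /** Temporary space for an observed overhead, given as a whole percent of the db directory size. */
    static long observedTempBytes(long dbDirBytes, int overheadPercent) {
        return dbDirBytes * overheadPercent / 100;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        long[] parts = new long[1024];   // assumed partition count
        Arrays.fill(parts, 300 * mb);    // 300MB per partition, as in the measurement

        // Assumption: the db directory is roughly the sum of the partition files.
        long dbDir = worstCaseTempBytes(parts);

        long idle = snapshotCopyBytes(dbDir);
        long worst = snapshotCopyBytes(dbDir) + worstCaseTempBytes(parts);
        long observed = snapshotCopyBytes(dbDir) + observedTempBytes(dbDir, 3);

        System.out.println("no load, free space needed (MB):     " + idle / mb);
        System.out.println("under load, upper bound (MB):        " + worst / mb);
        System.out.println("under load, 3% overhead case (MB):   " + observed / mb);
    }
}
```

The gap between the upper bound and the 3% case is the point of the measurement quoted above: the temporary files can in principle grow to the full partition size, but on fast disks they stay small because partitions are copied quickly.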
Re: [DISCUSSION] Hot cache backup
Hi, Maxim!
It is a very useful feature, great job!

But could you explain some aspects to me?

- Does a snapshot contain only primary data, or backup partitions, or
  both?
- Could I create a snapshot from an m-node cluster and apply it to an
  n-node cluster (n<>m)?
- Should a data node have extra space on the persistent store to create
  a snapshot? Or, from another point of view, would the size of the
  temporary file be equal to the size of all data on the cluster node?
- What is the resulting snapshot: a single file or a collection of
  files (one for every data node)?

I apologize for my questions, but I am really interested in this
feature.

On Tue, 7 Apr 2020 at 22:10, Maxim Muzafarov wrote:
> I'd like to back to the discussion of a snapshot operation for Apache
> Ignite for persistence cache groups and I propose my changes below.
Re: [DISCUSSION] Hot cache backup
That's cool. I'm waiting for this thing.
Re: [DISCUSSION] Hot cache backup
Hello, Maxim.

Great to see such an important feature in Ignite!
Please let me know if you need any help with the review.

On 7 Apr 2020, at 22:10, Maxim Muzafarov wrote:
> Changes are ready for review.
Re: [DISCUSSION] Hot cache backup
Igniters, I'd like to back to the discussion of a snapshot operation for Apache Ignite for persistence cache groups and I propose my changes below. I have prepared everything so that the discussion is as meaningful and specific as much as possible: - IEP-43: Cluster snapshot [1] - The Jira task IGNITE-11073 [2] - PR with described changes, Patch Available [4] Changes are ready for review. Here are a few implementation details and my thoughts: 1. Snapshot restore assumed to be manual at the first step. The process will be described on our documentation pages, but it is possible to start node right from the snapshot directory since the directory structure is preserved (check `testConsistentClusterSnapshotUnderLoad` in the PR). We also have some options here about how the restore process must look like: - fully manual snapshot restore (will be documented) - ansible or shell scripts for restore - Java API for restore (I doubt we should go this way). 3. The snapshot `create` procedure creates a snapshot of all persistent caches available on the cluster (see limitations [1]). 2. The snapshot `create` procedure is available through Java API and JMX (control.sh may be implemented further). Java API: IgniteFuture fut = ignite.snapshot() .createSnapshot(name); JMX: SnapshotMXBean mxBean = getMBean(ignite.name()); mxBean.createSnapshot(name); 3. The Distribute Process [3] is used to perform a cluster-wide snapshot procedure, so we've avoided a lot of boilerplate code here. 4. The design document [1] contains also an internal API for creating a consistent local snapshot of requested cache groups and transfer it to another node using the FileTransmission protocol [6]. This is one of the parts of IEP-28 [5] for cluster rebalancing via partition files and an important part for understanding the whole design. 
Java API:

    public IgniteInternalFuture createRemoteSnapshot(
        UUID rmtNodeId,
        Map> parts,
        BiConsumer partConsumer);

Please share your thoughts and take a look at my changes [4].

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-43%3A+Cluster+snapshots
[2] https://issues.apache.org/jira/browse/IGNITE-11073
[3] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/util/distributed/DistributedProcess.java#L49
[4] https://github.com/apache/ignite/pull/7607
[5] https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing#IEP-28:Clusterpeer-2-peerbalancing-Filetransferbetweennodes
[6] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/managers/communication/TransmissionHandler.java#L42

On Thu, 28 Feb 2019 at 14:43, Dmitriy Pavlov wrote:
Re: [DISCUSSION] Hot cache backup
Hi Maxim,

I agree with Denis and I have just one concern here.

Apache Ignite has quite a long history (it started even before Apache), and by now it has far too many features. Some of these features

- are developed and well known by community members,
- some of them were contributed a long time ago and nobody develops them,
- and, actually, in some rare cases, nobody in the community knows how they work and how to change them.

Such features may attract users, but a bug in one of them may ruin the impression of the product. Even worse, nobody can help to solve it, and only the user himself or herself may be encouraged to contribute a fix.

And my concern here: such a big feature should have a number of interested contributors who can support it in case others lose interest. I will be happy if 3-5 members come and say, yes, I will do a review/I will help with further changes.

Just to be clear, I'm not against it, and I'll never cast -1 for it, but it would be more comfortable to develop this feature with the understanding that this work will not be useless.

Sincerely,
Dmitriy Pavlov

On Wed, 27 Feb 2019 at 23:36, Denis Magda wrote:
Re: [DISCUSSION] Hot cache backup
Maxim,

GridGain has this exact feature available for Ignite native persistence deployments. It's not as easy as it might seem from the enablement perspective. It took us many years to make it production ready, involving many engineers. If the rest of the community wants to create something similar and available in open source, then please take this estimate into consideration.

-
Denis

On Wed, Feb 27, 2019 at 8:53 AM Maxim Muzafarov wrote:
[DISCUSSION] Hot cache backup
Igniters,

Some of the stores with which Apache Ignite is often compared have a feature called Snapshots [1] [2]. This feature provides an eventually consistent view of stored data for different purposes (e.g. moving data between environments, saving a backup of data for a later restore procedure and so on). Apache Ignite has all the opportunities and machinery to provide cache and/or data region snapshots out of the box but still doesn't have them.

This issue derives from IEP-28 [5], which I'm currently working on (partially described in the section [6]). I would like to solve this issue too and make Apache Ignite more attractive to use in a production environment. I haven't investigated in-memory caches yet, but for caches with persistence enabled, we can do it without any performance impact on cache operations (some additional IO operations are needed to copy cache data to the backup store; a copy-on-write technique is used here). We just need to use our DiscoverySpi, PME and Checkpointer process the right way.

As the first step, we can store all backup data locally on each cache affinity node. For instance, the `backup\snapshotId\cache0` folder will be created and all `cache0` partitions will be stored there for each local node for the snapshot process with id `snapshotId`. In the future, we can teach nodes to upload snapshotted partitions to a single remote node or to the cloud.

--

High-level process overview

A new snapshot process is managed via DiscoverySpi and CommunicationSpi messages.

1. The initiator sends a request to the cluster (DiscoveryMessage).
2. When a node receives the message it initiates PME.
3. The node begins the checkpoint process (holding the write lock for a short time).
4. The node starts to track any write attempts to the partition being snapshotted and places a copy of the original pages into a temp file.
5. The node merges the partition file with the corresponding delta.
6. When the node finishes the backup process it sends an ack message with the saved partitions to the initiator (or an error response).
7. When all ack messages have been received the backup is finished.

The only problem here is that if the request message arrives at a particular node while a checkpoint is running, PME will be blocked until the checkpoint ends. This is not good. But hopefully, it will be fixed here [4].

--

Probable API

From the cache perspective:

    IgniteFuture snapshotFut =
        ignite.cache("default")
            .snapshotter()
            .create("mySnapshotId");

    IgniteSnapshot cacheSnapshot = snapshotFut.get();

    IgniteCache copiedCache =
        ignite.createCache("CopyCache")
            .withConfiguration(defaultCache.getConfiguration())
            .loadFromSnapshot(cacheSnapshot.id());

From the command line perspective:

    control.sh --snapshot take cache0,cache1,cache2

--

WDYT?
Will it be a useful feature for Apache Ignite?

[1] https://geode.apache.org/docs/guide/10/managing/cache_snapshots/chapter_overview.html
[2] https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupTakesSnapshot.html
[3] http://apache-ignite-developers.2346864.n4.nabble.com/Data-Snapshots-in-Ignite-td4183.html
[4] https://issues.apache.org/jira/browse/IGNITE-10508
[5] https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing
[6] https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing#IEP-28:Clusterpeer-2-peerbalancing-Checkpointer
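
For illustration only, the copy-on-write idea behind steps 3-5 of the process overview can be sketched in plain Java. This is not Ignite code: the class `CopyOnWriteSnapshotSketch` and all of its methods are hypothetical, the partition is modelled as an in-memory array of pages, and the "temp file" with original pages is modelled as a map. It only shows how preserving the original copy of each page on its first write lets a snapshot stay consistent while writes continue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of a single partition; not part of Apache Ignite. */
public class CopyOnWriteSnapshotSketch {
    /** Live partition pages (stands in for the partition file). */
    private final byte[][] pages;

    /** Original content of pages overwritten during a snapshot (stands in for the temp delta file). */
    private final Map<Integer, byte[]> delta = new HashMap<>();

    /** Whether a snapshot is currently in progress. */
    private boolean snapshotActive;

    public CopyOnWriteSnapshotSketch(int pageCnt, int pageSize) {
        pages = new byte[pageCnt][pageSize];
    }

    /** Marks the point in time the snapshot must reflect (roughly step 3). */
    public synchronized void beginSnapshot() {
        delta.clear();
        snapshotActive = true;
    }

    /** Writes a page; on the first write during a snapshot the original page is preserved (step 4). */
    public synchronized void write(int pageIdx, byte[] data) {
        if (snapshotActive)
            delta.putIfAbsent(pageIdx, pages[pageIdx].clone());

        pages[pageIdx] = data.clone();
    }

    /** Copies the live pages and overlays the preserved originals on top of them (step 5). */
    public synchronized byte[][] finishSnapshot() {
        byte[][] copy = new byte[pages.length][];

        for (int i = 0; i < pages.length; i++) {
            byte[] orig = delta.get(i);

            copy[i] = (orig != null ? orig : pages[i]).clone();
        }

        snapshotActive = false;
        delta.clear();

        return copy;
    }

    /** Reads the current (live) content of a page. */
    public synchronized byte[] read(int pageIdx) {
        return pages[pageIdx].clone();
    }

    public static void main(String[] args) {
        CopyOnWriteSnapshotSketch part = new CopyOnWriteSnapshotSketch(2, 1);

        part.write(0, new byte[] {1});   // Written before the snapshot starts.
        part.beginSnapshot();
        part.write(0, new byte[] {2});   // Written while the snapshot is running.

        byte[][] snap = part.finishSnapshot();

        System.out.println(snap[0][0]);      // prints 1 (state at snapshot start)
        System.out.println(part.read(0)[0]); // prints 2 (live state)
    }
}
```

In this sketch cache writes are never blocked for the duration of the copy; the only extra cost is one page copy per page on its first modification, which matches the "some additional IO operations" remark above.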