Ivan,

If you come up with any ideas that may make this feature better, don't hesitate to share them!

Thank you!

On Tue, Sep 22, 2020 at 11:27 AM Ivan Pavlukhin <vololo...@gmail.com> wrote:

> Sergey,
>
> Thank you for your answer. I am not happy with the proposed approach, but things were never easy. Unfortunately I cannot suggest 100% better approaches so far. So, I should trust your vision.
>
> 2020-09-22 10:29 GMT+03:00, Sergey Chugunov <sergey.chugu...@gmail.com>:
> > Ivan,
> >
> > Checkpointer in Maintenance Mode is started and allows normal operations as it may be needed for defragmentation and possibly other cases.
> >
> > Discovery is started with a special implementation of SPI that doesn't make attempts to seek and/or connect to the rest of the cluster. From that perspective node in MM is totally isolated.
> >
> > Communication is started as usual but I believe it doesn't matter, since with isolated discovery no other nodes are observed in the topology and connection attempts should not happen. But it may make sense to implement an isolated version of communication SPI as well to have a 100% guarantee that no communication with other nodes will happen.
> >
> > It is important to note that GridRestProcessor is started normally as we need it to connect to the node via control utility.
> >
> > On Mon, Sep 21, 2020 at 7:04 PM Ivan Pavlukhin <vololo...@gmail.com> wrote:
> >
> >> Sergey,
> >>
> >> > From the code complexity perspective I'm trying to design the feature in such a way that all maintenance code is as encapsulated as possible and avoids massive interventions into main workflows of components.
> >>
> >> Could you please briefly tell what means you use to achieve encapsulation? Are Discovery, Communication, Checkpointer and other components started in a maintenance mode in the current design?
> >>
> >> 2020-09-21 15:19 GMT+03:00, Nikolay Izhikov <nizhi...@apache.org>:
> >> > Hello, Sergey.
> >> >
> >> >> At the moment I'm aware of two use cases for this feature: corrupted PDS cleanup and defragmentation.
> >> >
> >> > AFAIU, there is a third use case for this mode.
> >> >
> >> > Change encryption master key in case node was down during cluster master key change.
> >> > In this case, node can’t join the cluster, because its master key differs from the cluster's.
> >> > To recover node Ignite should locally change master key before join.
> >> >
> >> > Please, take a look into source code [1]
> >> >
> >> > [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/managers/encryption/GridEncryptionManager.java#L710
> >> >
> >> >> On Sep 21, 2020, at 14:37, Sergey Chugunov <sergey.chugu...@gmail.com> wrote:
> >> >>
> >> >> Ivan,
> >> >>
> >> >> Sorry for some confusion, MM indeed is not a normal mode. What I was trying to say is that when in MM node still starts and allows the user to perform actions with it like sending commands via control utility/JMX APIs or reading metrics.
> >> >>
> >> >> This is the key point: although the node is not in the cluster, it is still alive, can be monitored, and supports management operations for maintenance.
> >> >>
> >> >> From the code complexity perspective I'm trying to design the feature in such a way that all maintenance code is as encapsulated as possible and avoids massive interventions into main workflows of components.
> >> >> At the moment I'm aware of two use cases for this feature: corrupted PDS cleanup and defragmentation. As far as I know it won't bring too much complexity in both cases.
> >> >>
> >> >> I cannot say for other components but I believe it will be possible to integrate MM feature into their workflow as well with a reasonable amount of refactoring.
> >> >>
> >> >> Does it make sense to you?
> >> >>
> >> >> On Sun, Sep 6, 2020 at 8:08 AM Ivan Pavlukhin <vololo...@gmail.com> wrote:
> >> >>
> >> >>> Sergey,
> >> >>>
> >> >>> Thank you for your answer!
> >> >>>
> >> >>> Maybe I am looking at the subject from a different angle.
> >> >>>
> >> >>>> I think of a node in MM as an almost normal one
> >> >>> I cannot think of such a mode as a normal one, because it apparently does not perform usual cluster node functions. It is not a part of a cluster, cache data is not available, Discovery and Communication are not needed.
> >> >>>
> >> >>> I fear that with "node started in a special mode" approach we will get an additional flag in the code making the code more complex and fragile. Should I not worry about it?
> >> >>>
> >> >>> 2020-09-02 10:45 GMT+03:00, Sergey Chugunov <sergey.chugu...@gmail.com>:
> >> >>>> Vladislav, Ivan,
> >> >>>>
> >> >>>> Thank you for your questions and suggestions. Let me answer them.
> >> >>>>
> >> >>>> Vladislav,
> >> >>>>
> >> >>>> If I understood you correctly, you're talking about a node performing some automatic actions to fix the problem and then joining the cluster as usual.
> >> >>>>
> >> >>>> However the original ticket [1] where we faced the need for Maintenance Mode is about exactly the opposite: avoid doing automatic actions and give a user the ability to decide what to do.
> >> >>>>
> >> >>>> Also the idea of Maintenance Mode is that the node is able to accept commands, expose metrics and so on, thus we need all components to be initialized (some of them may be partially initialized due to their own maintenance).
> >> >>>> To achieve that we need to go through a full cycle of node initialization including discovery initialization. When discovery is initialized (in special isolated mode) I don't think it is easy to switch back to normal operations without a restart.
> >> >>>>
> >> >>>> Ivan,
> >> >>>>
> >> >>>> I think of a node in MM as an almost normal one (maybe with some components skipped some steps of their initialization). Commands are accepted, appropriate metrics are exposed e.g. through JMX API and so on.
> >> >>>>
> >> >>>> So as I see it we'll have special commands for control.{sh|bat} CLI allowing user to see reasons why node switched to maintenance mode and/or trigger actions to fix the problem (I'm still thinking about proper design of these actions though).
> >> >>>>
> >> >>>> Of course the user should also be able to fix the problem manually e.g. by manually deleting corrupted PDS files when node is down. Ideally Maintenance Mode should be smart enough to figure that out and switch to normal operations without a restart but I'm not sure if it is possible without invasive changes of our components' lifecycle.
> >> >>>> So I believe this model (node truly started in Maintenance Mode and new commands in control.{sh|bat}) is a good fit for our current APIs and ways to interact with the node.
> >> >>>>
> >> >>>> Does it sound reasonable to you?
> >> >>>>
> >> >>>> Thank you!
> >> >>>>
> >> >>>> [1] https://issues.apache.org/jira/browse/IGNITE-13366
> >> >>>>
> >> >>>> On Tue, Sep 1, 2020 at 2:07 PM Ivan Pavlukhin <vololo...@gmail.com> wrote:
> >> >>>>
> >> >>>>> Sergey,
> >> >>>>>
> >> >>>>> Actually, I missed the point that the discussed mode affects a single node but not a whole cluster. Perhaps I mixed up the terms "mode" and "state".
> >> >>>>>
> >> >>>>> My next thoughts about maintenance routines are about special utilities. As far as I remember MySQL provides a bunch of scripts for various maintenance purposes. What user interface for maintenance tasks execution is assumed? And what do we mean by "starting" a node in a maintenance mode? Can we do some routines without "starting" (e.g. try to recover PDS or cleanup)?
> >> >>>>>
> >> >>>>> 2020-08-31 23:41 GMT+03:00, Vladislav Pyatkov <vldpyat...@gmail.com>:
> >> >>>>>> Hi Sergey.
> >> >>>>>>
> >> >>>>>> As I understand, any switching to or from MM is possible only through a manual restart of the node.
> >> >>>>>> But in your example that looks like a technical action that is only possible in that case.
> >> >>>>>> Do you plan to provide a way for a client to make a decision without manual intervention?
> >> >>>>>>
> >> >>>>>> For example: start the node, manually agree to an option, and then have the conflict resolved automatically and the node returned to the topology as a stable node.
> >> >>>>>>
> >> >>>>>> On Mon, Aug 31, 2020 at 5:41 PM Sergey Chugunov <sergey.chugu...@gmail.com> wrote:
> >> >>>>>>
> >> >>>>>>> Hello Ivan,
> >> >>>>>>>
> >> >>>>>>> Thank you for raising the good question, I didn't think of Maintenance Mode from that perspective.
> >> >>>>>>>
> >> >>>>>>> In short, Maintenance Mode isn't related to the Cluster States concept.
> >> >>>>>>> According to javadoc documentation of ClusterState enum [1] it is solely about cache operations and to some extent doesn't affect other components of Ignite node.
> >> >>>>>>> From APIs perspective putting the methods to manage Cluster State to IgniteCluster interface doesn't look ideal to me but it is as it is.
> >> >>>>>>>
> >> >>>>>>> On the other hand Maintenance Mode as I see it will be managed through different APIs than a ClusterState and this difference definitely will be reflected in the documentation of the feature.
> >> >>>>>>>
> >> >>>>>>> Ignite node is a complex piece of many components interacting with each other, they may have different lifecycles and states; states of different components cannot be reduced to the lowest common denominator.
> >> >>>>>>>
> >> >>>>>>> However if you have an idea of how to name the feature better so the user can more easily distinguish it from other similar features please share it with us. Personally I very much welcome any suggestions that make the design more intuitive and easy-to-use.
> >> >>>>>>>
> >> >>>>>>> Thanks!
> >> >>>>>>>
> >> >>>>>>> [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/cluster/ClusterState.java
> >> >>>>>>>
> >> >>>>>>> On Mon, Aug 31, 2020 at 12:32 PM Ivan Pavlukhin <vololo...@gmail.com> wrote:
> >> >>>>>>>
> >> >>>>>>>> Hi Sergey,
> >> >>>>>>>>
> >> >>>>>>>> Thank you for bringing attention to that important subject!
> >> >>>>>>>>
> >> >>>>>>>> My note here is about one more cluster mode. As far as I know currently we already have 3 modes (inactive, read-only, read-write) and the subject is about one more. At first glance it could be hard for a user to understand and use all modes properly. Do we really need the whole spectrum? Could we simplify things somehow?
> >> >>>>>>>>
> >> >>>>>>>> 2020-08-27 15:59 GMT+03:00, Sergey Chugunov <sergey.chugu...@gmail.com>:
> >> >>>>>>>>> Hello Nikolay,
> >> >>>>>>>>>
> >> >>>>>>>>> Created one, available by link [1]
> >> >>>>>>>>>
> >> >>>>>>>>> Initially there was an intention to develop it under IEP-47 [2] and there is even a separate section for Maintenance Mode there.
> >> >>>>>>>>> But it looks like this feature is useful in more cases and deserves its own IEP.
> >> >>>>>>>>>
> >> >>>>>>>>> [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-53%3A+Maintenance+Mode
> >> >>>>>>>>> [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-47:+Native+persistence+defragmentation
> >> >>>>>>>>>
> >> >>>>>>>>> On Thu, Aug 27, 2020 at 11:01 AM Nikolay Izhikov <nizhi...@apache.org> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Hello, Sergey!
> >> >>>>>>>>>>
> >> >>>>>>>>>> Thanks for the proposal.
> >> >>>>>>>>>> Let’s have IEP for this feature.
> >> >>>>>>>>>>
> >> >>>>>>>>>>> On Aug 27, 2020, at 10:25, Sergey Chugunov <sergey.chugu...@gmail.com> wrote:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Hello Igniters,
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I want to start a discussion about a new supporting feature that could be very useful in many scenarios where persistent storage is involved: Maintenance Mode.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> *Summary*
> >> >>>>>>>>>>> Maintenance Mode (MM for short) is a special state of Ignite node when node doesn't serve user requests nor joins the cluster but waits for user commands or performs automatic actions for maintenance purposes.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> *Motivation*
> >> >>>>>>>>>>> There are situations when node cannot participate in regular operations but at the same time should not be shut down.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> One example is a ticket [1] where I developed the first draft of Maintenance Mode.
> >> >>>>>>>>>>> Here we get into a situation when node has potentially corrupted PDS thus cannot proceed with restore routine and join the cluster as usual.
> >> >>>>>>>>>>> At the same time node should not fail nor be stopped for manual cleanup.
> >> >>>>>>>>>>> Manual cleanup is not always an option (e.g. restricted access to file system); in managed environments failed node will be restarted automatically so user won't have time for performing necessary operations.
> >> >>>>>>>>>>> Thus node needs to function in a special mode allowing user to connect to it and perform necessary actions.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Another example is described in IEP-47 [2] where defragmentation is being developed. Node defragmenting its PDS should not join the cluster until the process is finished so it needs to enter Maintenance Mode as well.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> *Suggested design*
> >> >>>>>>>>>>> I suggest that MM work as follows:
> >> >>>>>>>>>>> 1. Node enters MM if special markers are found on disk. These markers, called Maintenance Records, could be created automatically (e.g. when storage component detects corrupted storage) or by user request (when user requests defragmentation of some caches). So entering MM requires node restart.
> >> >>>>>>>>>>> 2. A node started in MM doesn't join the cluster but finishes startup routine so it is able to receive commands and provide metrics to the user.
> >> >>>>>>>>>>> 3. When all necessary maintenance operations are finished, Maintenance Records for these operations are deleted from disk and node is restarted again to enter normal service.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> *Example*
> >> >>>>>>>>>>> To put it into context let's consider an example of how I see the MM workflow in case of PDS corruption.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> 1. Node has failed in the middle of checkpoint when WAL is disabled for a particular cache -> data files of the cache are potentially corrupted.
> >> >>>>>>>>>>> 2. On next startup node detects this situation, creates Maintenance Record on disk and shuts down.
> >> >>>>>>>>>>> 3. On next startup node sees Maintenance Record, enters Maintenance Mode and waits for user to do specific actions: clean potentially corrupted PDS.
> >> >>>>>>>>>>> 4. When user has done necessary actions he/she removes Maintenance Record using Maintenance Mode API exposed via control.{sh|bat} script or JMX.
> >> >>>>>>>>>>> 5. On next startup node goes to normal operations as maintenance reason is fixed.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I prepared a PR [3] for ticket [1] with draft implementation. It is not ready to be merged to master branch but is already fully functional and can be reviewed.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Hope you'll share your feedback on the feature and/or any thoughts on implementation.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Thank you!
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-13366
> >> >>>>>>>>>>> [2] https://cwiki.apache.org/confluence/display/IGNITE/IEP-47:+Native+persistence+defragmentation
> >> >>>>>>>>>>> [3] https://github.com/apache/ignite/pull/8189
> >> >>>>>>>>
> >> >>>>>>>> --
> >> >>>>>>>>
> >> >>>>>>>> Best regards,
> >> >>>>>>>> Ivan Pavlukhin
> >> >>>>>>
> >> >>>>>> --
> >> >>>>>> Vladislav Pyatkov
> >> >>>>>
> >> >>>>> --
> >> >>>>>
> >> >>>>> Best regards,
> >> >>>>> Ivan Pavlukhin
> >> >>>
> >> >>> --
> >> >>>
> >> >>> Best regards,
> >> >>> Ivan Pavlukhin
> >>
> >> --
> >>
> >> Best regards,
> >> Ivan Pavlukhin
>
> --
>
> Best regards,
> Ivan Pavlukhin
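
For readers following the thread, the three-step design quoted above (a Maintenance Record on disk decides the startup mode, and deleting the record followed by a restart returns the node to normal service) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class, method and file names are hypothetical and are not taken from the Ignite code base or from the PR mentioned in the thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch only: none of these names are Ignite APIs.
class MaintenanceModeSketch {
    // Assumed marker file name standing in for a "Maintenance Record".
    static final String RECORD_FILE = "maintenance_record.txt";

    enum StartupMode { NORMAL, MAINTENANCE }

    // Design step 1: the mode is decided at startup by looking for a record on disk.
    static StartupMode detectMode(Path workDir) {
        return Files.exists(workDir.resolve(RECORD_FILE))
            ? StartupMode.MAINTENANCE
            : StartupMode.NORMAL;
    }

    // A component (e.g. storage detecting corruption) or a user request
    // leaves a record that takes effect on the next start.
    static void registerRecord(Path workDir, String reason) throws IOException {
        Files.write(workDir.resolve(RECORD_FILE), reason.getBytes(StandardCharsets.UTF_8));
    }

    // Design step 3: once maintenance is done the record is removed,
    // so the next restart is a normal one.
    static void clearRecord(Path workDir) throws IOException {
        Files.deleteIfExists(workDir.resolve(RECORD_FILE));
    }

    public static void main(String[] args) throws IOException {
        Path workDir = Files.createTempDirectory("mm-sketch");

        System.out.println("first start:   " + detectMode(workDir));
        registerRecord(workDir, "potentially corrupted PDS");
        System.out.println("after failure: " + detectMode(workDir));
        clearRecord(workDir);
        System.out.println("after cleanup: " + detectMode(workDir));
    }
}
```

The sketch deliberately checks the marker only at startup, mirroring the point made in the thread that switching between modes requires a restart; how a real node isolates discovery while in MM is outside its scope.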