Re: Ignite rebalancing when a server is rebooted w/ persistence enabled.

2021-01-26 Thread maxi628
The cluster was almost idle.
It didn't receive lots of updates while that node was down.

Is there any way to confirm which of those two options you mentioned was
executed?
Is there any way to configure a threshold to choose one of those two
options?




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Ignite rebalancing when a server is rebooted w/ persistence enabled.

2021-01-26 Thread Ilya Kasnacheev
Hello!

While the node was down, the partitions that it previously owned had their
data updated.

At this point we only have two options:
- Throw out the existing partitions and rebalance them from scratch. AFAIK this
involves the WAL, so it will take some time. I have heard that if you wipe the
node's persistence then it won't use the WAL during rebalancing, which should
help a lot. However, I'm not confident here.
- Use historical rebalance, where the node will try to use other nodes' WALs
to get its partitions up to speed. This should be pretty fast, at least if the
rate of change in the cluster is low. However, as far as I know it will only be
used under specific circumstances, so maybe you didn't get lucky here.
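For context on the second option: as far as I know, whether historical
(WAL-based) rebalance is chosen depends on peers still holding enough WAL
history and on a partition-size threshold. The sketch below shows the knobs I
believe are involved; the IGNITE_PDS_WAL_REBALANCE_THRESHOLD property name and
the values shown are assumptions to verify against your Ignite version, not
settings confirmed for 2.7.6.

    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class HistoricalRebalanceHint {
        public static void main(String[] args) {
            // Assumption: this system property is the threshold that decides
            // between full and historical rebalance (minimum partition size,
            // in entries, for WAL-based rebalance); the value is illustrative.
            System.setProperty("IGNITE_PDS_WAL_REBALANCE_THRESHOLD", "1000");

            DataStorageConfiguration storageCfg = new DataStorageConfiguration()
                // Keep more checkpoint history so peers still hold the WAL
                // segments needed for historical rebalance (value illustrative).
                .setWalHistorySize(100);

            storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg);

            Ignition.start(cfg);
        }
    }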

Regards,
-- 
Ilya Kasnacheev


Thu, 14 Jan 2021 at 20:02, maxi628:

> Sorry, I'm attaching the log here: ignite_eviction.log
> <http://apache-ignite-users.70518.x6.nabble.com/file/t3058/ignite_eviction.log>
>
>
> I've read https://issues.apache.org/jira/browse/IGNITE-11974 and the thing
> is, this isn't an infinite loop.
> The remainingPartsToEvict=$something starts going down until it reaches 0,
> and that's when we consider the node completely up.
>
> My question is: is it expected for a node to try to rebalance if it only
> went down for 2 minutes while being part of the baseline topology with
> persistence enabled?
> All caches are partitioned with 2 backups, and only 1 node is restarted at a
> time.
> So shouldn't the other nodes holding backups of this node's primary
> partitions cover for it until it boots up again?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


RE: Ignite rebalancing when a server is rebooted w/ persistence enabled.

2021-01-14 Thread maxi628
Sorry, I'm attaching the log here: ignite_eviction.log
<http://apache-ignite-users.70518.x6.nabble.com/file/t3058/ignite_eviction.log>

I've read https://issues.apache.org/jira/browse/IGNITE-11974 and the thing
is, this isn't an infinite loop. 
The remainingPartsToEvict=$something starts going down until it reaches 0,
and that's when we consider the node completely up.

My question is: is it expected for a node to try to rebalance if it only went
down for 2 minutes while being part of the baseline topology with persistence
enabled?
All caches are partitioned with 2 backups, and only 1 node is restarted at a
time.
So shouldn't the other nodes holding backups of this node's primary partitions
cover for it until it boots up again?
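For reference, the setup described above corresponds roughly to the following
configuration. This is only a minimal sketch: the cache name, key/value types
and data region settings are placeholders, not the actual cluster config; only
PARTITIONED mode, 2 backups and persistence come from the description above.

    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class ClusterSetupSketch {
        public static void main(String[] args) {
            // Native persistence on the default data region, as in the
            // cluster described above.
            DataStorageConfiguration storageCfg = new DataStorageConfiguration();
            storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

            // "myCache" is a placeholder name; each real cache is PARTITIONED
            // with 2 backups, so a single node going down should leave two
            // copies of every partition online.
            CacheConfiguration<Long, byte[]> cacheCfg =
                new CacheConfiguration<Long, byte[]>("myCache")
                    .setCacheMode(CacheMode.PARTITIONED)
                    .setBackups(2);

            IgniteConfiguration cfg = new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg)
                .setCacheConfiguration(cacheCfg);

            // With persistence enabled, the cluster must also be activated
            // (baseline topology set) before caches serve traffic.
            Ignition.start(cfg);
        }
    }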



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


RE: Ignite rebalancing when a server is rebooted w/ persistence enabled.

2021-01-14 Thread Alexandr Shapkin
Hi,

Looks like the error message is truncated. Could you please re-send it or
attach the full log file?

PartitionsEvictManager is part of the rebalancing routine; it clears local
data before demanding it from other nodes.

Also, I see the following JIRA: https://issues.apache.org/jira/browse/IGNITE-11974

From: maxi628
Sent: Thursday, January 14, 2021 12:16 AM
To: user@ignite.apache.org
Subject: Ignite rebalancing when a server is rebooted w/ persistence enabled.

Hello everyone.

I have several Ignite clusters with version 2.7.6 and persistence enabled.
I have 3 caches on every cluster, with ~10M records each.

Sometimes when I reboot a node, it takes a lot of time to boot; it can take
hours.

By rebooting I mean stopping the container that's running Ignite and starting
it again, without ever changing the baseline topology; restarting the
container itself takes about 2 minutes.
The node joins the topology just fine but takes a long time to start serving
traffic.

Checking the logs, I've found that there are several lines like the ones here:

So for some reason after booting it starts a process called
PartitionsEvictManager, which can take a lot of time.
What is the intended functionality behind PartitionsEvictManager?
Is it something that we should expect?

This is a problem because a rolling restart of all nodes in a cluster can
take up to a day.

Thanks.

--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Ignite rebalancing when a server is rebooted w/ persistence enabled.

2021-01-13 Thread maxi628
Hello everyone.

I have several Ignite clusters with version 2.7.6 and persistence enabled.
I have 3 caches on every cluster, with ~10M records each.

Sometimes when I reboot a node, it takes a lot of time to boot; it can take
hours.

By rebooting I mean stopping the container that's running Ignite and starting
it again, without ever changing the baseline topology; restarting the
container itself takes about 2 minutes.
The node joins the topology just fine but takes a long time to start serving
traffic.

Checking the logs, I've found that there are several lines like the ones here:



So for some reason after booting it starts a process called
PartitionsEvictManager, which can take a lot of time.
What is the intended functionality behind PartitionsEvictManager?
Is it something that we should expect?

This is a problem because a rolling restart of all nodes in a cluster can
take up to a day.
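
For what it's worth, the rolling restart would only need to pause until the
restarted node reports no partitions left to rebalance before moving on to the
next node. Below is a rough sketch of that check; the cache name is a
placeholder, it assumes cache statistics are enabled
(CacheConfiguration.setStatisticsEnabled(true)), and I haven't verified these
metrics against 2.7.6.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMetrics;

    public class RebalanceWait {
        public static void main(String[] args) throws InterruptedException {
            // Starts a node with the default configuration; in practice this
            // would be a client-mode IgniteConfiguration pointing at the cluster.
            Ignite ignite = Ignition.start();

            // "myCache" is a placeholder name. Statistics must be enabled on
            // the cache for these metrics to be populated.
            while (true) {
                CacheMetrics metrics = ignite.cache("myCache").metrics();
                int partsLeft = metrics.getRebalancingPartitionsCount();
                System.out.println("Partitions still rebalancing: " + partsLeft);

                if (partsLeft == 0)
                    break;

                Thread.sleep(5_000);
            }
        }
    }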

Thanks.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/