[ 
https://issues.apache.org/jira/browse/IGNITE-18963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mirza Aliev updated IGNITE-18963:
---------------------------------
    Description: 
{*}Motivation{*}:

Altering filters from SQL must lead to data nodes recalculation.

*Definition of done:*
 
* Altering filters from SQL leads to data nodes recalculation.
* Any scale up timers must be cancelled, and the corresponding nodes must be included in the data nodes, if the filter was altered and data nodes recalculation was triggered.


*Implementation details:*

A proper algorithm must be implemented under this ticket; currently it is unclear how we guarantee atomic cancellation of the existing scale up timers together with data nodes recalculation.

The list of open questions that we have to address to implement this feature 
properly: 

1) How should we handle the situation when an alter filter event and a scale up event happen at the same time?
   1.1) We should be able to handle the case where a scale up and an alter filter event happen concurrently: if we decide to cancel the scale up, it is possible that a new scale up with a greater revision has already rescheduled this timer concurrently, so cancelling it could lose the latest scale up event.

The scenario would look like this (a small sketch of the naive-cancellation hazard follows it):


Topology = [A,B,C], filter is (A, B), data nodes = [A,B]
# Node D is added, the scale up timer is set to 5 minutes
# Node E is added, the scale up timer is rescheduled
# The filter is changed to filter(A, B, D, E) before the scale up timer fires, the revision of the event is 10
# Node F, which also fits the filter, is added concurrently with revision 11, and the timer is rescheduled
#* Here we come to a situation where the change filter event must cancel the scale up, but it cannot, because node F was added and a scale up for it must be triggered later.
#* The event about node F being added is not visible to the change filter event (in the topology augmentation map), because it has a higher revision
# Nodes [A, B, D, E] become the data nodes after the filter change event
# Scale up is triggered and [A, B, D, E, F] becomes the new data nodes.
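
Below is a rough, purely illustrative sketch of the hazard from 1.1 in plain Java. It is not the actual Ignite code: the map layout, the revisions assigned to D and E, and the explicit "cancel the timer" step are simplified assumptions used only to show how a blind cancellation would drop node F.

{code:java}
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class NaiveCancelSketch {
    public static void main(String[] args) {
        // Simplified topology augmentation map: revision -> node added at that revision.
        NavigableMap<Long, String> augmentationMap = new TreeMap<>();
        augmentationMap.put(5L, "D"); // node D joins, scale up timer scheduled
        augmentationMap.put(7L, "E"); // node E joins, timer rescheduled

        Set<String> dataNodes = new TreeSet<>(Set.of("A", "B"));

        long filterChangeRevision = 10L;

        // Concurrently with the filter change, node F joins at revision 11
        // and reschedules the same scale up timer.
        augmentationMap.put(11L, "F");

        // Naive handling: the filter change applies every augmentation it can see
        // (revisions <= 10) and then cancels the pending scale up timer.
        dataNodes.addAll(augmentationMap.headMap(filterChangeRevision, true).values());
        boolean scaleUpTimerCancelled = true; // the timer rescheduled for revision 11 is gone

        System.out.println(dataNodes);             // [A, B, D, E] -- F is missing
        System.out.println(scaleUpTimerCancelled); // true, so nothing will ever add F
    }
}
{code}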



2) Currently we have {{topologyAugmentationMap}}, and data nodes are stored as a map node -> counter. When we apply the filter, we don't want to remove the corresponding node from the map in the metastore if the node does not pass the filter. The reason is that the filter could be changed in the future and the node could pass it after that, so we must still be able to decide whether this node needs to be included according to the scale up/scale down events.
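
A minimal sketch of that idea in plain Java (not the actual metastore representation; the node -> counter layout and the positive-counter check are simplified assumptions): the stored map keeps every node, and the filter is applied only when the visible data nodes are computed, so a later filter change can make a previously rejected node visible again without touching the stored map.

{code:java}
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DataNodesFilterSketch {
    /** Computes the visible data nodes; the filter is applied on read, never on the stored map. */
    static Set<String> visibleDataNodes(Map<String, Integer> storedDataNodes, Predicate<String> filter) {
        return storedDataNodes.entrySet().stream()
                .filter(e -> e.getValue() > 0) // node is present according to scale up/down events
                .map(Map.Entry::getKey)
                .filter(filter)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Counters reflect only scale up / scale down events, not the filter.
        Map<String, Integer> stored = Map.of("A", 1, "B", 1, "C", 1, "D", 1);

        Predicate<String> oldFilter = Set.of("A", "B")::contains;
        Predicate<String> newFilter = Set.of("A", "B", "D")::contains;

        // C and D stay in the stored map even though the old filter rejects them,
        // so changing the filter is enough to make D visible again.
        System.out.println(visibleDataNodes(stored, oldFilter)); // [A, B]
        System.out.println(visibleDataNodes(stored, newFilter)); // [A, B, D]
    }
}
{code}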



Taking into account scenario 1, I propose not to cancel scale up timers after the filter was changed, and to treat that event as an immediate scale up. In terms of code, a filter change will immediately call {{DistributionZoneManager#saveDataNodesToMetaStorageOnScaleUp(zoneId, rev)}}, where {{rev}} is the revision of the filter change. With that solution we are automatically defended against any concurrent scale up, because we trigger {{saveDataNodesToMetaStorageOnScaleUp}} with the corresponding revision of the event.
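
A minimal single-class model of that guard (plain Java, not the actual {{DistributionZoneManager}} code; the method below is a simplified stand-in for {{saveDataNodesToMetaStorageOnScaleUp}} and the in-memory field stands in for the {{zoneScaleUpChangeTriggerKey}} condition in the metastore invoke):

{code:java}
import java.util.Set;
import java.util.TreeSet;

class ScaleUpRevisionGuardSketch {
    /** Stand-in for the zoneScaleUpChangeTriggerKey stored in the metastore. */
    private long zoneScaleUpChangeTriggerKey = 0;

    /** Stand-in for the zone's data nodes. */
    private final Set<String> dataNodes = new TreeSet<>();

    /**
     * Stand-in for saveDataNodesToMetaStorageOnScaleUp(zoneId, rev): the recalculated
     * nodes are applied only if this event's revision is newer than the trigger key.
     */
    synchronized boolean saveDataNodesOnScaleUp(long revision, Set<String> recalculatedNodes) {
        if (revision <= zoneScaleUpChangeTriggerKey) {
            return false; // stale event, skipped
        }

        dataNodes.clear();
        dataNodes.addAll(recalculatedNodes);
        zoneScaleUpChangeTriggerKey = revision;

        return true;
    }

    synchronized Set<String> dataNodes() {
        return Set.copyOf(dataNodes);
    }
}
{code}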

Let's consider some examples, each followed by a short demo based on the sketch above:

1)
Topology = [A,B,C], filter is (A, B), data nodes = [A,B]
# Node D is added, the scale up timer is set to 5 minutes
# Node E is added, the scale up timer is rescheduled, the revision of the event is 7
# The filter is changed to filter(A, B, D, E) before the scale up timer fires, the revision of the event is 10
# {{saveDataNodesToMetaStorageOnScaleUp(rev = 10)}} is triggered, nodes [A, B, D, E] become the data nodes after the filter change event, {{zoneScaleUpChangeTriggerKey}} is set to 10
# The scale up event with revision 7 is running; it will not pass the condition for {{zoneScaleUpChangeTriggerKey}} and will just be skipped
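
The same steps replayed against the sketch class above (illustrative only):

{code:java}
import java.util.Set;

class FilterChangeExample1 {
    public static void main(String[] args) {
        ScaleUpRevisionGuardSketch zone = new ScaleUpRevisionGuardSketch();

        // The filter change at revision 10 is applied as an immediate scale up.
        zone.saveDataNodesOnScaleUp(10, Set.of("A", "B", "D", "E")); // true, trigger key becomes 10

        // The scale up scheduled earlier arrives with revision 7 and is skipped.
        boolean applied = zone.saveDataNodesOnScaleUp(7, Set.of("A", "B", "D", "E"));

        System.out.println(applied);          // false
        System.out.println(zone.dataNodes()); // [A, B, D, E]
    }
}
{code}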

2)
# Node D is added, the scale up timer is set to 5 minutes
# Node E is added, the scale up timer is rescheduled, the revision of the event is 7
# The filter is changed to filter(A, B, D, E) before the scale up timer fires, the revision of the event is 10
# Node F, which also fits the filter, is added concurrently with revision 11, and the timer is rescheduled
# Nodes [A, B, D, E] become the data nodes after the filter change event, {{zoneScaleUpChangeTriggerKey}} is set to 10
# The scale up event with revision 7 is running; it will not pass the condition for {{zoneScaleUpChangeTriggerKey}} and will just be skipped
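
Example 2 replayed the same way (illustrative only). The final step, where node F's scale up at revision 11 still passes the condition and is applied, is not listed above, but it follows from scenario 1:

{code:java}
import java.util.Set;

class FilterChangeExample2 {
    public static void main(String[] args) {
        ScaleUpRevisionGuardSketch zone = new ScaleUpRevisionGuardSketch();

        zone.saveDataNodesOnScaleUp(10, Set.of("A", "B", "D", "E"));                          // filter change, applied
        System.out.println(zone.saveDataNodesOnScaleUp(7, Set.of("A", "B", "D", "E")));       // false, skipped
        System.out.println(zone.saveDataNodesOnScaleUp(11, Set.of("A", "B", "D", "E", "F"))); // true, node F added later
        System.out.println(zone.dataNodes());                                                 // [A, B, D, E, F]
    }
}
{code}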



> Altering filters must recalculate data nodes
> --------------------------------------------
>
>                 Key: IGNITE-18963
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18963
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Mirza Aliev
>            Priority: Major
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>


