[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-06-10 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Fix Version/s: 2.9

> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Assignee: Vladislav Pyatkov
>            Priority: Major
>             Fix For: 2.9
>
>   Original Estimate: 80h
>          Time Spent: 20m
>  Remaining Estimate: 79h 40m
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of the optimization is to calculate its result only on the first PME 
> (ideally, even before the first PME, at the recovery stage), keep the resulting 
> map (grpId, partId -> earliestCheckpoint) on heap and update it when necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> We should also extract WAL pointer reservation and filtering small partitions 
> from reserveHistoryForExchange(), but this shouldn't be a problem.
> Another point for optimization: searchPartitionCounter() and 
> searchCheckpointEntry() are executed for each (grpId, partId). That means 
> we'll perform O(number of partitions) linear lookups in history. This should 
> be optimized as well: we can perform one lookup for all (grpId, partId) 
> pairs. This is especially critical for reserveHistoryForPreloading() method 
> complexity: it's executed from exchange thread.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.
> P.S. Possibly, instead of storing map, we can keep earliestCheckpoint right 
> in GridDhtLocalPartition. It may simplify implementation.
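
The two update rules from the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not Ignite code: the class EarliestCheckpointMap and its methods are hypothetical, a checkpoint is represented by a plain long timestamp, and the caller is assumed to pass the set of locally OWNING partitions per group.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical on-heap cache: (grpId, partId) -> earliest usable checkpoint. Not an Ignite class. */
class EarliestCheckpointMap {
    /** grpId -> (partId -> timestamp of the earliest checkpoint usable for the partition). */
    private final Map<Integer, Map<Integer, Long>> byGroup = new HashMap<>();

    /**
     * Rule 1: update on checkpoint.
     *
     * @param cpTs Timestamp of the current checkpoint (assumed representation).
     * @param owning grpId -> ids of partitions that are OWNING on the local node.
     * @param inapplicableGrps Groups for which this checkpoint is marked as inapplicable.
     */
    synchronized void onCheckpoint(long cpTs, Map<Integer, Set<Integer>> owning, Set<Integer> inapplicableGrps) {
        // Groups without any local OWNING partition are dropped.
        byGroup.keySet().retainAll(owning.keySet());

        for (Map.Entry<Integer, Set<Integer>> e : owning.entrySet()) {
            Map<Integer, Long> parts = byGroup.computeIfAbsent(e.getKey(), k -> new HashMap<>());

            // Evicted or non-OWNING partitions are removed.
            parts.keySet().retainAll(e.getValue());

            // New partitions are registered with the current checkpoint.
            for (Integer partId : e.getValue())
                parts.putIfAbsent(partId, cpTs);
        }

        // A checkpoint that is inapplicable for a group removes the whole group.
        byGroup.keySet().removeAll(inapplicableGrps);
    }

    /** Rule 2: update on history cleanup; newEarliestTs is the oldest checkpoint that survives. */
    synchronized void onHistoryCleanup(long newEarliestTs) {
        for (Map<Integer, Long> parts : byGroup.values())
            parts.replaceAll((partId, ts) -> Math.max(ts, newEarliestTs)); // setIfGreater
    }

    /** @return Earliest checkpoint timestamp for the partition, or null if none is known. */
    synchronized Long earliest(int grpId, int partId) {
        Map<Integer, Long> parts = byGroup.get(grpId);

        return parts == null ? null : parts.get(partId);
    }
}
```

With such a structure a PME would only read the map instead of walking the checkpoint history. WAL pointer reservation and filtering of small partitions are left out of the sketch on purpose, since the description proposes extracting them separately.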





[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point for optimization: searchPartitionCounter() and 
searchCheckpointEntry() are executed for each (grpId, partId). That means we'll 
perform O(number of partitions) linear lookups in history. This should be 
optimized as well: we can perform one lookup for all (grpId, partId) pairs. 
This is especially critical for reserveHistoryForPreloading() method 
complexity: it's executed from exchange thread.
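
A rough sketch of the "one lookup for all (grpId, partId) pairs" idea, under loudly stated assumptions: the types below are hypothetical and are not Ignite classes, a checkpoint is reduced to a timestamp plus per-partition update counters, and the search condition (newest checkpoint whose stored counter does not exceed the requested one) is an assumed simplification of what searchPartitionCounter() looks for.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-pass search over checkpoint history. Not an Ignite class. */
class SinglePassHistoryLookup {
    /** Simplified checkpoint entry: timestamp plus update counters per partition key. */
    static final class Checkpoint {
        final long ts;
        final Map<Long, Long> counters;

        Checkpoint(long ts, Map<Long, Long> counters) {
            this.ts = ts;
            this.counters = counters;
        }
    }

    /** Packs (grpId, partId) into one long key. */
    static long key(int grpId, int partId) {
        return ((long)grpId << 32) | (partId & 0xFFFFFFFFL);
    }

    /**
     * Walks the history once, newest to oldest, and resolves all requested partitions together.
     *
     * @param history Checkpoints ordered from oldest to newest.
     * @param requested Partition key -> counter to search for.
     * @return Partition key -> timestamp of the newest checkpoint whose counter is <= the requested one.
     */
    static Map<Long, Long> search(List<Checkpoint> history, Map<Long, Long> requested) {
        Map<Long, Long> res = new HashMap<>();
        Set<Long> pending = new HashSet<>(requested.keySet());

        // Each checkpoint entry is visited at most once, whatever the number of partitions.
        for (int i = history.size() - 1; i >= 0 && !pending.isEmpty(); i--) {
            Checkpoint cp = history.get(i);

            for (Iterator<Long> it = pending.iterator(); it.hasNext(); ) {
                Long k = it.next();
                Long cntr = cp.counters.get(k);

                if (cntr != null && cntr <= requested.get(k)) {
                    res.put(k, cp.ts);
                    it.remove(); // Resolved, no need to look at older checkpoints.
                }
            }
        }

        return res;
    }
}
```

The number of counter comparisons can still grow with the number of partitions, but every history entry is touched only once, which matters if reading an entry may involve retrieving GroupState from WAL.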

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

P.S. Possibly, instead of storing map, we can keep earlistCheckpoint right in 
GridDhtLocalPartition. It may simplify implementation.


  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point for optimization: searchPartitionCounter() and 
searchCheckpointEntry() are executed for each (grpId, partId). That means we'll 
perform O(number of partitions) linear lookups in history. This should be 
optimized as well: we can perform one lookup for all (grpId, partId) pairs. 
This is especially critical for reserveHistoryForPreloading() method 
complexity: it's executed from exchange thread.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

P.S. Possibly, instead of storing map, we can keep earliestCheckpoint right in 
GridDhtLocalPartition. It may simplify implementation.


  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point for optimization: searchPartitionCounter() and 
searchCheckpointEntry() are executed for each (grpId, partId). That means we'll 
perform O(number of partitions) linear lookups in history. This should be 
optimized as well: we can perform one lookup for all (grpId, partId) pairs. 
This is especially critical for reserveHistoryForPreloading() method 
complexity: it's executed from exchange thread.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

P.S. Possibly, instead of storing map, we can keep earlistCheckpoint right in 
GridDhtLocalPartition. It may simplify implementation.



> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. Its size isn't 
greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. Its size isn't 
greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should also extract WAL pointer reservation and filtering small partitions 
from reserveHistoryForExchange(), but this shouldn't be a problem.
Another point: possibly, instead of storing map, we can keep earlistCheckpoint 
right in GridDhtLocalPartition. It may simplify implementation.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should extract WAL pointer reservation

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
>                 Key: IGNITE-13052
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13052
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Ivan Rakov
>            Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h

[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieval of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.
We should extract WAL pointer reservation

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earlisetCheckpoint) on heap and update it if necessary. From the first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> (grpId, partId -> earliestCheckpoint) on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> We should extract WAL pointer reservation
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieval of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earlisetCheckpoint) on heap and update it if necessary. From the first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should removed from the map. 
If checkpoint is marked as inapplicable for a certain group, the whole group 
should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> (grpId, partId -> earliestCheckpoint) on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.





[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieval of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME 
(ideally, even before first PME, on recovery stage), keep resulting map (grpId, 
partId -> earliestCheckpoint) on heap and update it if necessary. At first 
glance, the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate it's result only on first PME 
(ideally, even before first PME, on recovery stage), keep resulting map {grpId, 
partId -> earlisetCheckpoint} on heap and update it if necessary. From the 
first glance, map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should removed from the map. 
If checkpoint is marked as inapplicable for a certain group, the whole group 
should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> (grpId, partId -> earliestCheckpoint) on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.





[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Remaining Estimate: 80h  (was: 240h)
 Original Estimate: 80h  (was: 240h)

> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> {grpId, partId -> earliestCheckpoint} on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.





[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieval of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earlisetCheckpoint) on heap and update it if necessary. From the first glance, 
map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should removed from the map. 
If checkpoint is marked as inapplicable for a certain group, the whole group 
should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> (grpId, partId -> earliestCheckpoint) on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.





[jira] [Updated] (IGNITE-13052) Calculate result of reserveHistoryForExchange in advance

2020-05-21 Thread Ivan Rakov (Jira)


 [ 
https://issues.apache.org/jira/browse/IGNITE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Rakov updated IGNITE-13052:

Description: 
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieval of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate its result only on first PME (ideally, 
even before first PME, on recovery stage), keep resulting map (grpId, partId -> 
earliestCheckpoint) on heap and update it if necessary. At first glance, 
the map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changes its state to non-OWNING, it should be removed from the 
map. If checkpoint is marked as inapplicable for a certain group, the whole 
group should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap is insignificant. Its size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.

  was:
Method reserveHistoryForExchange() is called on every partition map exchange. 
It's an expensive call: it requires iteration over the whole checkpoint history 
with possible retrieve of GroupState from WAL (it's stored on heap with 
SoftReference). On some deployments this operation can take several minutes.

The idea of optimization is to calculate it's result only on first PME 
(ideally, even before first PME, on recovery stage), keep resulting map (grpId, 
partId -> earlisetCheckpoint) on heap and update it if necessary. From the 
first glance, map should be updated:
1) On checkpoint. If a new partition appears on local node, it should be 
registered in the map with current checkpoint. If a partition is evicted from 
local node, or changed its state to non-OWNING, it should removed from the map. 
If checkpoint is marked as inapplicable for a certain group, the whole group 
should be removed from the map.
2) On checkpoint history cleanup. For every (grpId, partId), previous earliest 
checkpoint should be changed with setIfGreater to new earliest checkpoint.

Memory overhead of storing described map on heap in significant. It's size 
isn't greater than size of map returned from reserveHistoryForExchange().

Described fix should be much simpler than IGNITE-12429.


> Calculate result of reserveHistoryForExchange in advance
> 
>
> Key: IGNITE-13052
> URL: https://issues.apache.org/jira/browse/IGNITE-13052
> Project: Ignite
>  Issue Type: Improvement
>Reporter: Ivan Rakov
>Priority: Major
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Method reserveHistoryForExchange() is called on every partition map exchange. 
> It's an expensive call: it requires iteration over the whole checkpoint 
> history with possible retrieval of GroupState from WAL (it's stored on heap 
> with SoftReference). On some deployments this operation can take several 
> minutes.
> The idea of optimization is to calculate its result only on first PME 
> (ideally, even before first PME, on recovery stage), keep resulting map 
> (grpId, partId -> earliestCheckpoint) on heap and update it if necessary. 
> At first glance, the map should be updated:
> 1) On checkpoint. If a new partition appears on local node, it should be 
> registered in the map with current checkpoint. If a partition is evicted from 
> local node, or changes its state to non-OWNING, it should be removed from the 
> map. If checkpoint is marked as inapplicable for a certain group, the whole 
> group should be removed from the map.
> 2) On checkpoint history cleanup. For every (grpId, partId), previous 
> earliest checkpoint should be changed with setIfGreater to new earliest 
> checkpoint.
> Memory overhead of storing described map on heap is insignificant. Its size 
> isn't greater than size of map returned from reserveHistoryForExchange().
> Described fix should be much simpler than IGNITE-12429.


