[ 
https://issues.apache.org/jira/browse/IGNITE-21633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18066824#comment-18066824
 ] 

Konstantin Orlov commented on IGNITE-21633:
-------------------------------------------

It is worth mentioning that 
{{RemotelyTriggeredResourceRegistry#remoteHostsToResource}} is a source of 
contention, because the cardinality of remote hosts is usually small.

To be more specific: the default partition count calculator uses the formula 
{{dataNodesCount * max(cores, 8) * scaleFactor / replicas}}. This gives us a 
minimum of 24 partitions per node in the default configuration (and the number 
is very likely to be higher, since many-core CPUs are common nowadays). When a 
user issues a non-colocated query ordered by a secondary index, the SQL 
engine, in order to leverage the sort order provided by the index, performs 
NumberOfPartitions (24 in our example) simultaneous calls to storage. The 
number of such calls quickly adds up when multiple threads or clients issue 
similar queries concurrently.
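
To make the arithmetic concrete, here is a minimal sketch of the formula above. 
The class and method names are hypothetical, and the values scaleFactor = 3 and 
replicas = 1 are assumptions picked only because they reproduce the "24 
partitions per node" figure; they are not taken from the Ignite sources.

```java
// Hypothetical sketch; not the actual Ignite partition calculator.
final class PartitionCountSketch {
    /** dataNodesCount * max(cores, 8) * scaleFactor / replicas, as quoted in the comment. */
    static int partitions(int dataNodesCount, int cores, int scaleFactor, int replicas) {
        return dataNodesCount * Math.max(cores, 8) * scaleFactor / replicas;
    }

    /** One storage call per partition for every concurrently running query. */
    static long storageCalls(int partitions, int concurrentQueries) {
        return (long) partitions * concurrentQueries;
    }

    public static void main(String[] args) {
        // Assumed values: scaleFactor = 3, replicas = 1.
        int small = partitions(1, 4, 3, 1);   // fewer than 8 cores: the floor of 8 applies
        int large = partitions(1, 32, 3, 1);  // many-core machine
        System.out.println(small);                   // 24
        System.out.println(large);                   // 96
        System.out.println(storageCalls(small, 50)); // 1200
    }
}
```

Under these assumptions, 50 concurrent queries of this kind on a single node 
already produce 1200 simultaneous storage calls, each of which touches the 
registry.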

> Get rid of RemotelyTriggeredResourceRegistry#remoteHostsToResources
> -------------------------------------------------------------------
>
>                 Key: IGNITE-21633
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21633
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Denis Chudov
>            Priority: Major
>              Labels: ignite-3
>
> *Motivation*
> RemotelyTriggeredResourceRegistry has an API that allows closing the 
> resources using the following parameters:
>  * #close(UUID contextId)
>  * #close(FullyQualifiedResourceId resourceId)
>  * #close(String remoteHostId) - is used when the remote host is no longer in 
> topology and we can close all resources that it has triggered, because they 
> are no longer needed
> IGNITE-21293 added a map, 
> _RemotelyTriggeredResourceRegistry#remoteHostsToResource_, which groups the 
> resources by remote host; it is needed to implement the last method without 
> iterating over all resources. The main map (_#resources_, an ordered map of 
> FullyQualifiedResourceId to RemotelyTriggeredResource object), which is used 
> to store the resources, cannot provide the resources for a given remote host 
> id, because FullyQualifiedResourceId does not contain the remote host id.
> The context id is included in the FullyQualifiedResourceId, but the 
> transaction id (which is the contextId in the case of a cursor resource) does 
> not contain a node identifier, only an integer hash code of the coordinator 
> node name.
> *Definition of done*
> The _RemotelyTriggeredResourceRegistry#remoteHostsToResource_ is removed.
> *Implementation notes*
> We can change the transaction id generation to replace the node name hash 
> with the order in which the node joined the cluster; then we will be able to 
> determine the transaction coordinator from the transaction id alone. This 
> will also require postulating that context id generation for every type of 
> resource follows this rule.
> After that we will be able to get a submap of resources created by some node 
> from _#resources_ map (FullyQualifiedResourceId to RemotelyTriggeredResource 
> object).
> As one possible implementation for getting all resources triggered by the 
> nodes that are no longer in topology, we can iterate over the currently 
> online nodes (in the order in which they joined) and get a submap of 
> resources belonging to the space between each two of them. As the number of 
> nodes is significantly less than the number of resources, this operation 
> should be more efficient than iterating over the whole map.
> For example:
>  * there were 3 nodes: A (join order 0), B (join order 1), C (join order 2);
>  * node B left the topology;
>  * there are 1000 resources, 200 of them are created by A, 500 by B and 300 
> by C;
>  * iterating over the pairs of remaining nodes gives the following intervals: 
> (MIN_ORDER; 0) - submap is empty, (0; 2) - submap includes 500 resources 
> created by B, (2; MAX_ORDER) - submap is empty.
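
The interval scan from the quoted implementation notes could look roughly like 
the sketch below. Everything here is an assumption made for illustration: the 
ResourceKey record (join order first, then a per-node sequence number) stands 
in for a FullyQualifiedResourceId whose context id encodes the join order, the 
String values stand in for RemotelyTriggeredResource, and join orders are 
assumed to be non-negative ints.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; not the actual registry implementation.
final class OrphanedResourceScan {
    /** Hypothetical key, ordered by the creator's join order and then by sequence number. */
    record ResourceKey(int joinOrder, long seq) implements Comparable<ResourceKey> {
        @Override public int compareTo(ResourceKey o) {
            int c = Integer.compare(joinOrder, o.joinOrder);
            return c != 0 ? c : Long.compare(seq, o.seq);
        }
    }

    /** Collects keys whose join order falls strictly between (or outside) the online join orders. */
    static List<ResourceKey> orphaned(NavigableMap<ResourceKey, String> resources,
                                      SortedSet<Integer> onlineOrders) {
        List<ResourceKey> result = new ArrayList<>();
        int from = 0; // smallest join order not yet covered by an online node
        for (int online : onlineOrders) {
            if (from < online) {
                // Gap between two online nodes: [from, online) in join-order space.
                result.addAll(resources.subMap(new ResourceKey(from, Long.MIN_VALUE), true,
                        new ResourceKey(online, Long.MIN_VALUE), false).keySet());
            }
            from = online + 1; // assumes join orders stay below Integer.MAX_VALUE
        }
        // Everything created by nodes that joined after the last online node.
        result.addAll(resources.tailMap(new ResourceKey(from, Long.MIN_VALUE), true).keySet());
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<ResourceKey, String> resources = new TreeMap<>();
        for (long i = 0; i < 200; i++) resources.put(new ResourceKey(0, i), "A");
        for (long i = 0; i < 500; i++) resources.put(new ResourceKey(1, i), "B");
        for (long i = 0; i < 300; i++) resources.put(new ResourceKey(2, i), "C");

        // Node B (join order 1) left the topology; A and C are still online.
        SortedSet<Integer> online = new TreeSet<>(List.of(0, 2));
        System.out.println(orphaned(resources, online).size()); // 500
    }
}
```

The loop performs one range lookup per online node plus one for the tail, so 
its cost depends on the number of nodes and on the number of orphaned 
resources, not on the total size of the map.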


