[ 
https://issues.apache.org/jira/browse/IGNITE-27871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Valuyskiy updated IGNITE-27871:
------------------------------------
    Description: 
When a Compute task is executed from a thin client and the task class is 
available on the node classpath (e.g. placed in the libs directory),
*GridDeploymentManager#getLocalDeployment* creates *GridDeploymentMetadata* 
without *classLoader* and *classLoaderId*.

*GridDeploymentLocalStore#getDeployment(meta)* first attempts to find an 
existing deployment via *deployment(meta)*. However, *deployment(meta)* 
matches cached deployments only if:
{code:java}
dep.classLoaderId() == meta.classLoaderId() || dep.classLoader() == meta.classLoader()
{code}
Since both *meta.classLoader* and *meta.classLoaderId* are null, the cached 
local deployment can never be matched.

As a result, *GridDeploymentLocalStore#deploy(...)* is invoked on every task 
execution. This method is synchronized and performs additional lookup and 
bookkeeping logic, which introduces unnecessary contention and latency under 
high load.
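
The effect can be illustrated with a small standalone model. This is only a sketch under stated assumptions: Meta, Dep, LocalStore and deployCallsFor are hypothetical stand-ins written for this example, not Ignite's real classes, and the only part taken from the description is the matching condition. It counts how often the synchronized deploy path is taken when the metadata carries no class loader information versus when it carries the class loader.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Toy model of the lookup described above. All types here are hypothetical, not Ignite classes. */
public class DeploymentCacheMissSketch {
    /** Stand-in for deployment metadata; both fields may be null. */
    static final class Meta {
        final UUID classLoaderId;
        final ClassLoader classLoader;

        Meta(UUID classLoaderId, ClassLoader classLoader) {
            this.classLoaderId = classLoaderId;
            this.classLoader = classLoader;
        }
    }

    /** Stand-in for a cached deployment; both fields are always set. */
    static final class Dep {
        final UUID classLoaderId;
        final ClassLoader classLoader;

        Dep(UUID classLoaderId, ClassLoader classLoader) {
            this.classLoaderId = classLoaderId;
            this.classLoader = classLoader;
        }
    }

    /** Stand-in for the local store: a cache plus a synchronized slow path. */
    static final class LocalStore {
        private final List<Dep> cache = new ArrayList<>();
        private int deployCalls;

        /** Same shape as the condition quoted in the description. */
        Dep deployment(Meta meta) {
            for (Dep dep : cache) {
                if (dep.classLoaderId == meta.classLoaderId || dep.classLoader == meta.classLoader)
                    return dep;
            }
            return null;
        }

        /** Slow path: counts every call, reuses an existing entry for the same loader. */
        synchronized Dep deploy(ClassLoader ldr) {
            deployCalls++;
            for (Dep dep : cache) {
                if (dep.classLoader == ldr)
                    return dep;
            }
            Dep dep = new Dep(UUID.randomUUID(), ldr);
            cache.add(dep);
            return dep;
        }

        /** Fast path first, slow path on a miss. */
        Dep getDeployment(Meta meta, ClassLoader ldr) {
            Dep dep = deployment(meta);
            return dep != null ? dep : deploy(ldr);
        }
    }

    /** Runs the same lookup repeatedly and returns how many times deploy() was entered. */
    static int deployCallsFor(boolean metaHasLoader, int executions) {
        ClassLoader ldr = ClassLoader.getSystemClassLoader();
        Meta meta = new Meta(null, metaHasLoader ? ldr : null);
        LocalStore store = new LocalStore();
        for (int i = 0; i < executions; i++)
            store.getDeployment(meta, ldr);
        return store.deployCalls;
    }

    public static void main(String[] args) {
        System.out.println("meta without loader info: " + deployCallsFor(false, 5));
        System.out.println("meta with class loader:   " + deployCallsFor(true, 5));
    }
}
```

In this model, metadata with both fields null never matches a cached entry, so every execution falls through to the synchronized method. Once the metadata carries the class loader, only the first execution does.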

The issue is reproducible with:
 * peerClassLoadingEnabled = true
 * task class present in node classpath (libs)
 * thin client executing the same task repeatedly by name

However, when *peerClassLoadingEnabled=false*, *GridDeploymentManager* 
initializes *locDep* and reuses it directly, bypassing 
*GridDeploymentLocalStore*, which avoids this problem.

  was:
When a Compute task is executed from a thin client and the task class is 
available in the node classpath (e.g. placed in libs directory),
*GridDeploymentManager#getLocalDeployment* creates *GridDeploymentMetadata* 
without *classLoader* and {*}classLoaderId{*}.

*GridDeploymentLocalStore#getDeployment(meta)* first attempts to find an 
existing deployment via {*}deployment(meta){*}. However, *deployment(meta)* 
matches cached deployments only if:
{code:java}
dep.classLoaderId() == meta.classLoaderId() || dep.classLoader() == 
meta.classLoader(){code}
Since both *meta.classLoader* and *meta.classLoaderId* are null, the cached 
local deployment can never be matched.

As a result, *GridDeploymentLocalStore#deploy(...)* is invoked on every task 
execution. This method is synchronized and performs additional lookup and 
bookkeeping logic, which introduces unnecessary contention and latency under 
high load.

The issue is reproducible with:
 * peerClassLoadingEnabled = true
 * task class present in node classpath (libs)
 * thin client executing the same task repeatedly by name

Hoewver, when {*}peerClassLoadingEnabled=false{*}, *GridDeploymentManager* 
initializes *locDep* and reuses it directly, bypassing 
{*}GridDeploymentLocalStore{*}, which avoids this problem.


> Local deployment cache miss when peerClassLoadingEnabled=true leads to 
> repeated synchronized deploy() calls
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-27871
>                 URL: https://issues.apache.org/jira/browse/IGNITE-27871
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Oleg Valuyskiy
>            Assignee: Oleg Valuyskiy
>            Priority: Major
>              Labels: ise



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
