[ 
https://issues.apache.org/jira/browse/IGNITE-17385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Timonin updated IGNITE-17385:
------------------------------------
    Ignite Flags: Release Notes Required  (was: Docs Required, Release Notes Required)

> Frequent commits of single cache transactions can lead to
> GridCacheAdapter#asyncOpsSem permits overflow
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-17385
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17385
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.13
>            Reporter: Ilya Shishkov
>            Assignee: Maksim Timonin
>            Priority: Major
>              Labels: ise
>         Attachments: SemaphorePermitsExceeded.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When you commit a transaction that was _explicitly started over only a
> single cache_, {{GridCacheAdapter#asyncOpRelease}} is called without a
> preceding {{GridCacheAdapter#asyncOpAcquire}}. This can lead to continuous
> growth of the permit count in {{GridCacheAdapter#asyncOpsSem}} and, eventually,
> to an overflow that fails the node that started the transaction:
> {code}
> Critical system error detected. Will be handled accordingly to configured 
> handler 
> [hnd=o.a.i.i.processors.cache.transactions.TxAsyncOpsSemaphorePermitsExeededTest$$Lambda$42/1924582348@7379bebb,
>  failureCtx=FailureContext [type=CRITICAL_ERROR, err=java.lang.Error: Maximum 
> permit count exceeded]]
> {code}
> As you can see in [1], in the case of a single cache context, the transaction
> is committed by calling {{GridCacheAdapter#commitTxAsync}}, which later invokes
> {{GridCacheAdapter#asyncOpRelease}}. But when multiple caches are affected
> by the transaction, {{GridNearTxLocal#commitNearTxLocalAsync}} is called to
> commit it, and {{GridCacheAdapter#asyncOpRelease}} is not invoked.
> So, the greater the load (RPS / TPS) of such single-cache transactions,
> the sooner the node will fail.
> Reproducer of the problem: [^SemaphorePermitsExceeded.patch]. It prints
> additional messages when the semaphore is released or acquired.
> Links:
> [1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/GridCacheSharedContext.java#L1122
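
The failure mode described above can be illustrated outside Ignite. The following is a minimal sketch, not Ignite code: it assumes {{asyncOpsSem}} behaves like a plain {{java.util.concurrent.Semaphore}} (which the error message in the log suggests), and the class and method names ({{PermitLeakDemo}}, {{releasesUntilOverflow}}, {{balancedPermits}}) are hypothetical. It starts the semaphore close to {{Integer.MAX_VALUE}} so the overflow is reached after a few unpaired releases instead of billions.

```java
import java.util.concurrent.Semaphore;

// Hypothetical demo class; it only mimics the unpaired release pattern
// described in the issue and does not use any Ignite API.
class PermitLeakDemo {
    /**
     * Calls release() without a matching acquire() until the semaphore
     * reports a permit overflow. Returns the number of release() calls
     * made, including the one that failed.
     * Pass a value close to Integer.MAX_VALUE to keep the loop short.
     */
    static int releasesUntilOverflow(int initialPermits) {
        Semaphore sem = new Semaphore(initialPermits);
        int calls = 0;
        try {
            while (true) {
                calls++;
                sem.release(); // unpaired release: permit count grows by one
            }
        } catch (Error e) {
            // Semaphore signals the overflow with java.lang.Error.
            if (!"Maximum permit count exceeded".equals(e.getMessage()))
                throw e;
            return calls;
        }
    }

    /**
     * Correct pairing for comparison: every release() follows an acquire(),
     * so the permit count returns to its initial value.
     */
    static int balancedPermits(int initialPermits, int ops) {
        Semaphore sem = new Semaphore(initialPermits);
        for (int i = 0; i < ops; i++) {
            sem.acquireUninterruptibly();
            sem.release();
        }
        return sem.availablePermits();
    }

    public static void main(String[] args) {
        System.out.println("release() calls until overflow: "
            + releasesUntilOverflow(Integer.MAX_VALUE - 2));
        System.out.println("permits after balanced use: "
            + balancedPermits(5, 1000));
    }
}
```

In the real scenario the semaphore starts from a much smaller configured permit count, so roughly two billion leaked permits are needed before the error appears; this is why the time to failure shrinks as the rate of single-cache transaction commits grows.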



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
