>> This invariant is violated in many places.
At least in half of the messages:

org.apache.ignite.internal.processors.cache.distributed.GridDistributedLockRequest#partition
org.apache.ignite.internal.processors.cache.distributed.GridDistributedUnlockRequest#partition
org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxFinishResponse#partition
org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxPrepareResponse#partition

+1 to use nearXidVersion.hash() as a stripe index, but, listed methods made
me doubt.

On Fri, May 19, 2023 at 7:32 PM Ivan Daschinsky <ivanda...@gmail.com> wrote:

> >> Tx processing is supposed to be thread bound by hashing the version to a
> partition
> This invariant is violated in many places. The most notorious example is tx
> recovery.
>
> Another example: I just added an assertion that checks tId of a creator
> thread with tId of an accessor thread.
> TxMultiCacheAsyncOpsTest fails immediately on processing of a tx prepare
> request. Looks like a big issue, IMO
>
>
> пт, 19 мая 2023 г. в 19:11, Alexei Scherbakov <
> alexey.scherbak...@gmail.com
> >:
>
> > Tx processing is supposed to be thread bound by hashing the version to a
> > partition, see methods like [1]
> > If for some cases this invariant is broken, this should be fixed.
> >
> > [1]
> >
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareRequest#partition
> >
> > пт, 19 мая 2023 г. в 15:57, Anton Vinogradov <a...@apache.org>:
> >
> > > Igniters,
> > >
> > > My team was faced with node failure [1] because of non-threadsafe
> > > collections usage.
> > >
> > > IgniteTxStateImpl's fields
> > > - activeCacheIds
> > > - txMap
> > > are not thread safe, but are widely used from different threads without
> > the
> > > proper sync.
> > >
> > > The main question is ... why?
> > >
> > > According to the research, we have no guarantee that tx will be
> processed
> > > at the single thread.
> > > It may be processed at the several! threads at the striped pool and at
> > the
> > > tx recovery thread as well.
> > >
> > > Thread at the striped pool will be selected by the message's
> partition()
> > > method, which can be calculated like this:
> > > - return keys != null && !keys.isEmpty() ? keys.get(0).partition() :
> -1;
> > > - return U.safeAbs(version().hashCode());
> > > - ...,
> > > so, no guarantee it is processed at the same thread (proven by tests).
> > >
> > > Seems, we MAY lose the data.
> > > For example, ignoring some or all keys from txMap at commit.
> > >
> > > If anyone knows why this is not a problem (I mean sync lack, not data
> > loss)
> > > or how to fix this properly, please give me a hint, or correct my
> > > conclusions if necessary.
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-19445
> > >
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>
>
> --
> Sincerely yours, Ivan Daschinskiy
>

Reply via email to