Dmitry,

There are other cases that can result in an inconsistent state of an atomic cache with 2 or more backups.
1. For PRIMARY_SYNC: the primary sends requests to all backups and responds to
the near node... and then the update on one of the backups fails. Will the
primary retry the update operation? I doubt it.

2. For all sync modes: the primary sends the request to the 1st backup and
fails to send to the 2nd backup... and then the near node suddenly dies. No
one will retry, as the near node is gone.

On Tue, Jun 5, 2018 at 7:16 PM, Dmitriy Govorukhin <
dmitriy.govoruk...@gmail.com> wrote:

> Denis,
>
> It seems that you are right; it is a problem.
> I guess in this case the primary node should send a
> CachePartialUpdateException to the near node.
>
> On Tue, Jun 5, 2018 at 6:13 PM, Denis Garus <garus....@gmail.com> wrote:
>
> > Fix formatting
> >
> > Hello Igniters!
> >
> > I have found some confusing behavior of the atomic partitioned cache with
> > the `PRIMARY_SYNC` write synchronization mode.
> > The node with a primary partition sends a message to remote nodes with
> > backup partitions via `GridDhtAtomicAbstractUpdateFuture#sendDhtRequests`.
> > If an error occurs during sending, it will in fact be ignored, see [1]:
> > ```
> > try {
> >     ....
> >
> >     cctx.io().send(req.nodeId(), req, cctx.ioPolicy());
> >
> >     ....
> > }
> > catch (ClusterTopologyCheckedException ignored) {
> >     ....
> >
> >     registerResponse(req.nodeId());
> > }
> > catch (IgniteCheckedException ignored) {
> >     ....
> >
> >     registerResponse(req.nodeId());
> > }
> > ```
> > This behavior results in the primary partition and the backup partitions
> > having different values for a given key.
> >
> > There is a reproducer [2].
> >
> > Should we consider this behavior as valid?
> >
> > [1].
> > https://github.com/dgarus/ignite/blob/d473b507f04e2ec843c1da1066d8908e882396d7/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/atomic/GridDhtAtomicAbstractUpdateFuture.java#L473
> > [2].
> > https://github.com/apache/ignite/pull/4126/files#diff-5e5bfb73bd917d85f56a05552b1d014aR26

--
Best regards,
Andrey V. Mashenkov
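The swallowed-exception pattern discussed in this thread can be reduced to a minimal, self-contained sketch. This is plain Java, not Ignite code; all class and method names below are hypothetical stand-ins for the primary/backup roles, and `IllegalStateException` stands in for the checked exceptions swallowed in `sendDhtRequests`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Simulation of the failure mode: a primary replica applies an update,
 * then propagates it to each backup. If the send to a backup throws and
 * the exception is swallowed (the backup is still counted as having
 * "responded", like registerResponse() does), the update completes
 * successfully from the caller's point of view while the replicas diverge.
 */
public class SwallowedSendDemo {
    static class Replica {
        final Map<String, String> store = new HashMap<>();
        final boolean unreachable;

        Replica(boolean unreachable) { this.unreachable = unreachable; }

        void receive(String key, String val) {
            if (unreachable)
                throw new IllegalStateException("send failed");

            store.put(key, val);
        }
    }

    /** Primary applies locally, then replicates; send errors are ignored. */
    static boolean put(Replica primary, List<Replica> backups, String k, String v) {
        primary.store.put(k, v);

        int acked = 0;

        for (Replica b : backups) {
            try {
                b.receive(k, v);
                acked++;
            }
            catch (IllegalStateException ignored) {
                acked++; // counterpart of registerResponse(): failure counted as done
            }
        }

        return acked == backups.size(); // the "future" completes successfully
    }

    public static void main(String[] args) {
        Replica primary = new Replica(false);
        Replica backup1 = new Replica(false);
        Replica backup2 = new Replica(true); // send to this backup fails

        boolean ok = put(primary, List.of(backup1, backup2), "key", "v1");

        System.out.println("update reported ok: " + ok);
        System.out.println("primary:  " + primary.store.get("key"));
        System.out.println("backup-1: " + backup1.store.get("key"));
        System.out.println("backup-2: " + backup2.store.get("key"));
    }
}
```

Running it reports the update as successful even though `backup-2` never stored the value, which is the primary/backup divergence the reproducer [2] demonstrates against a real cluster.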