Re: IGNITE-1948 ClusterTopologyCheckedException can return null for retryReadyFuture()

Александр Меньшиков Tue, 07 Feb 2017 06:18:55 -0800

Alexey, I ran into some difficulties.

Look at the method:
GridNearTxFinishFuture.CheckBackupMiniFuture#onDhtFinishResponse(GridDhtTxFinishResponse
res)


            *Throwable err* = res.checkCommittedError();

            if (*err* != null) {
                if (*err* instanceof IgniteCheckedException) {
                    *ClusterTopologyCheckedException cause* =
                        ((IgniteCheckedException)*err*).
*getCause(ClusterTopologyCheckedException.class)*;

                    if (*cause* != null)
                        *cause.retryReadyFuture(*
cctx.nextAffinityReadyFuture(tx.topologyVersion()));
*                                   //^_____Here update the readyFut in
some first "cause". *
                }

                onDone(*err*);
            }


So I can't rewrite "cause" field in exception without reflection. It means
if I try to use two exception: one with "readyFut" and second without
"readyFut", I see no way to do it in this place.

Perhaps I misunderstood your original idea. Or maybe this code some kind of
a crutch and should be removed. What do you think?

2017-01-30 16:54 GMT+03:00 Alexey Goncharuk <[email protected]>:

> In the example above there is no point of setting the retry future in the
> exception because it will be serialized and sent to a remote node.
>
> I see the only possible way to ensure this: make this be verified at
> compile time. This looks like a big change, but the draft may look like so:
> 1) Introduce new exception type, like CacheTopology*Checked*Exception
> which
> *must* take the minimum topology version to wait for
> 2) In all cases when a cache operation fails because of topology change,
> provide the appropriate exception
> 3) When CacheTopologyException is being constructed, create a corresponding
> local future based on wait version and throw the exception.
>
> Even though this still does not give us 100% guarantee that we did not miss
> anything, it should cover a lot more cases than now.
>
> 2017-01-30 14:20 GMT+03:00 Александр Меньшиков <[email protected]>:
>
> > Alexey,
> >
> > For GridCacheIoManager#sendNoRetry it's not necessary because exception
> > will be ignored (or will cast to String). Perhaps my message was unclear.
> > I try to say that three methods can throw "
> ClusterTopologyCheckedException"
> > without any problem.
> >
> > But you are right, in some methods I can't set "retryFuture". We must
> > ensure that exception doesn't escape away. The best option is to ignore
> it
> > (or cast to String).
> >
> > But any way, look here:
> >
> > IgniteTxHandler#sendReply(UUID nodeId, GridDhtTxFinishRequest req,
> boolean
> > committed, GridCacheVersion nearTxId)
> >
> > Inside you can found next lines:
> > __________________________
> > ClusterTopologyCheckedException *cause* =
> >     new ClusterTopologyCheckedException("Primary node left grid.");
> >
> > res.checkCommittedError(new IgniteTxRollbackCheckedException("Failed to
> > commit transaction " +
> >     "(transaction has been rolled back on backup node): " +
> req.version(),
> > *cause*));
> > __________________________
> >
> > How do you unsure that *"cause"* can't come to GridCacheUtils#
> retryTopologySafe(final
> > Callable<S> c) ?
> >
> >
> > Perhaps I don't know some rules which you use to maintain it now?
> >
> >
> > 2017-01-30 12:24 GMT+03:00 Alexey Goncharuk <[email protected]
> >:
> >
> >> Alexander,
> >>
> >> Do you have a use-case in mind which results in IgniteTopologyException
> >> thrown from public API with retryFuture unset?
> >>
> >> retryFuture makes sense only for certain cache operations and may be set
> >> only in certain places in the code where we already know what to wait
> for.
> >> I do not see how you can initialize this future, for example, in
> >>  GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage msg,
> >> byte
> >> plc)
> >>
> >> --AG
> >>
> >> 2017-01-27 15:45 GMT+03:00 Александр Меньшиков <[email protected]>:
> >>
> >> > I found a lot of using "ClusterTopologyCheckedException" exception.
> In
> >> > most
> >> > use-case these exceptions were just ignored. It's easier to see if
> >> > issue-4612 will be applied (I'm waiting for code review by my team
> >> leader)
> >> > where I renamed all unused exceptions in the "ignored".
> >> >
> >> > It means that in some case "retryReadyFuture" can left unset. But
> there
> >> are
> >> > code where exception is being getting from "get()" method in some
> >> "future"
> >> > class and don't ignored (exception is sending to method
> >> > "GridFutureAdapter#onDone()" for example).
> >> >
> >> > So I decided not to do a deep global analysis of code and make some
> >> simple
> >> > rules.
> >> > 1. If some method "X" throws ClusterTopologyCheckedException and
> >> ignores
> >> > it
> >> > (or converts to String, there are some variants to lose exception like
> >> > object) in all methods that call the method "X", then we can don't set
> >> > "retryReadyFuture" for optimization. But this should be clarified in
> >> > javadoc.
> >> > 2. But if exception escapes at least once like object: we must set
> >> > "retryReadyFuture" in that method.
> >> >
> >> > A few times when call tree was simple, I went deeper.
> >> >
> >> > So I found only three methods which can throw
> >> > "ClusterTopologyCheckedException" without setting "retryReadyFuture":
> >> >
> >> > 1. IgfsFragmentizerManager#sendWithRetries(UUID nodeId,
> >> > IgfsCommunicationMessage msg)
> >> > 2. GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage
> >> msg,
> >> > byte plc)
> >> > 3. GridCacheIoManager#sendOrderedMessage(ClusterNode node, Object
> >> topic,
> >> > GridCacheMessage msg, byte plc, long timeout)
> >> >
> >> > In all other methods "ClusterTopologyCheckedException" escapes away
> too
> >> > far.
> >> >
> >> > What do you think about this approach to make compromise between
> >> > optimization and correctness?
> >> >
> >>
> >
> >
>

Re: IGNITE-1948 ClusterTopologyCheckedException can return null for retryReadyFuture()

Reply via email to