Alexey, I ran into some difficulties. Look at the method: GridNearTxFinishFuture.CheckBackupMiniFuture#onDhtFinishResponse(GridDhtTxFinishResponse res)
*Throwable err* = res.checkCommittedError(); if (*err* != null) { if (*err* instanceof IgniteCheckedException) { *ClusterTopologyCheckedException cause* = ((IgniteCheckedException)*err*). *getCause(ClusterTopologyCheckedException.class)*; if (*cause* != null) *cause.retryReadyFuture(* cctx.nextAffinityReadyFuture(tx.topologyVersion())); * //^_____Here update the readyFut in some first "cause". * } onDone(*err*); } So I can't rewrite "cause" field in exception without reflection. It means if I try to use two exception: one with "readyFut" and second without "readyFut", I see no way to do it in this place. Perhaps I misunderstood your original idea. Or maybe this code some kind of a crutch and should be removed. What do you think? 2017-01-30 16:54 GMT+03:00 Alexey Goncharuk <alexey.goncha...@gmail.com>: > In the example above there is no point of setting the retry future in the > exception because it will be serialized and sent to a remote node. > > I see the only possible way to ensure this: make this be verified at > compile time. This looks like a big change, but the draft may look like so: > 1) Introduce new exception type, like CacheTopology*Checked*Exception > which > *must* take the minimum topology version to wait for > 2) In all cases when a cache operation fails because of topology change, > provide the appropriate exception > 3) When CacheTopologyException is being constructed, create a corresponding > local future based on wait version and throw the exception. > > Even though this still does not give us 100% guarantee that we did not miss > anything, it should cover a lot more cases than now. > > 2017-01-30 14:20 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>: > > > Alexey, > > > > For GridCacheIoManager#sendNoRetry it's not necessary because exception > > will be ignored (or will cast to String). Perhaps my message was unclear. > > I try to say that three methods can throw " > ClusterTopologyCheckedException" > > without any problem. > > > > But you are right, in some methods I can't set "retryFuture". We must > > ensure that exception doesn't escape away. The best option is to ignore > it > > (or cast to String). > > > > But any way, look here: > > > > IgniteTxHandler#sendReply(UUID nodeId, GridDhtTxFinishRequest req, > boolean > > committed, GridCacheVersion nearTxId) > > > > Inside you can found next lines: > > __________________________ > > ClusterTopologyCheckedException *cause* = > > new ClusterTopologyCheckedException("Primary node left grid."); > > > > res.checkCommittedError(new IgniteTxRollbackCheckedException("Failed to > > commit transaction " + > > "(transaction has been rolled back on backup node): " + > req.version(), > > *cause*)); > > __________________________ > > > > How do you unsure that *"cause"* can't come to GridCacheUtils# > retryTopologySafe(final > > Callable<S> c) ? > > > > > > Perhaps I don't know some rules which you use to maintain it now? > > > > > > 2017-01-30 12:24 GMT+03:00 Alexey Goncharuk <alexey.goncha...@gmail.com > >: > > > >> Alexander, > >> > >> Do you have a use-case in mind which results in IgniteTopologyException > >> thrown from public API with retryFuture unset? > >> > >> retryFuture makes sense only for certain cache operations and may be set > >> only in certain places in the code where we already know what to wait > for. > >> I do not see how you can initialize this future, for example, in > >> GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage msg, > >> byte > >> plc) > >> > >> --AG > >> > >> 2017-01-27 15:45 GMT+03:00 Александр Меньшиков <sharple...@gmail.com>: > >> > >> > I found a lot of using "ClusterTopologyCheckedException" exception. > In > >> > most > >> > use-case these exceptions were just ignored. It's easier to see if > >> > issue-4612 will be applied (I'm waiting for code review by my team > >> leader) > >> > where I renamed all unused exceptions in the "ignored". > >> > > >> > It means that in some case "retryReadyFuture" can left unset. But > there > >> are > >> > code where exception is being getting from "get()" method in some > >> "future" > >> > class and don't ignored (exception is sending to method > >> > "GridFutureAdapter#onDone()" for example). > >> > > >> > So I decided not to do a deep global analysis of code and make some > >> simple > >> > rules. > >> > 1. If some method "X" throws ClusterTopologyCheckedException and > >> ignores > >> > it > >> > (or converts to String, there are some variants to lose exception like > >> > object) in all methods that call the method "X", then we can don't set > >> > "retryReadyFuture" for optimization. But this should be clarified in > >> > javadoc. > >> > 2. But if exception escapes at least once like object: we must set > >> > "retryReadyFuture" in that method. > >> > > >> > A few times when call tree was simple, I went deeper. > >> > > >> > So I found only three methods which can throw > >> > "ClusterTopologyCheckedException" without setting "retryReadyFuture": > >> > > >> > 1. IgfsFragmentizerManager#sendWithRetries(UUID nodeId, > >> > IgfsCommunicationMessage msg) > >> > 2. GridCacheIoManager#sendNoRetry(ClusterNode node, GridCacheMessage > >> msg, > >> > byte plc) > >> > 3. GridCacheIoManager#sendOrderedMessage(ClusterNode node, Object > >> topic, > >> > GridCacheMessage msg, byte plc, long timeout) > >> > > >> > In all other methods "ClusterTopologyCheckedException" escapes away > too > >> > far. > >> > > >> > What do you think about this approach to make compromise between > >> > optimization and correctness? > >> > > >> > > > > >