Igniters, since there are no more comments and/or review feedback, I'm going to merge the changes by the end of the week.
On Mon, Mar 22, 2021 at 10:37 PM Pavel Tupitsyn <ptupit...@apache.org> wrote: > Ready for review: > https://github.com/apache/ignite/pull/8870 > > On Sun, Mar 21, 2021 at 8:09 PM Pavel Tupitsyn <ptupit...@apache.org> > wrote: > >> Simple benchmark added - see JmhCacheAsyncListenBenchmark in the PR. >> There is a 6-8% drop (1 client, 2 servers, 1 machine, int key/val). >> I expect this difference to become barely observable on real-world >> workloads. >> >> On Thu, Mar 18, 2021 at 12:35 PM Pavel Tupitsyn <ptupit...@apache.org> >> wrote: >> >>> Denis, >>> >>> For a reproducer, please see CacheAsyncContinuationExecutorTest.java in >>> the linked PoC [1] >>> >>> [1] >>> https://github.com/apache/ignite/pull/8870/files#diff-c788c12013622093df07d8f628a6e8c5fb0c15007f0787f2d459dbb3e377fc5aR54 >>> >>> On Thu, Mar 18, 2021 at 1:58 AM Raymond Wilson < >>> raymond_wil...@trimble.com> wrote: >>> >>>> We implemented the ContinueWith() suggestion from Pavel, and it works >>>> well >>>> so far in testing, though we do not have data to support if there is or >>>> is >>>> not a performance penalty in our use case.. >>>> >>>> To lend another vote to the 'Don't do continuations on the striped >>>> thread >>>> pool' line of thinking: Deadlocking is an issue as is starvation. In >>>> some >>>> ways starvation is more insidious because by the time things stop >>>> working >>>> the cause and effect distance may be large. >>>> >>>> I appreciate the documentation does make statements about not performing >>>> cache operations in a continuation due to deadlock possibilities, but >>>> that >>>> statement does not reveal why this is an issue. It is less a case of a >>>> async cache operation being followed by some other cache operation (an >>>> immediate issue), and more a general case of the continuation of >>>> application logic using a striped pool thread in a way that might mean >>>> that >>>> thread is never given back - it's now just a piece of the application >>>> infrastructure until some other application activity schedules away from >>>> that thread (eg: by ContinueWith or some other async operation in the >>>> application code that releases the thread). To be fair, beyond >>>> structures >>>> like ContinueWith(), it is not obvious how that continuation thread >>>> should >>>> be handed back. This will be the same behaviour for dedicated >>>> continuation >>>> pools, but with far less risk in the absence of ContinueWith() >>>> constructs. >>>> >>>> In the .Net world this is becoming more of an issue as fewer .Net use >>>> cases >>>> outside of UI bother with synchronization contexts by default. >>>> >>>> Raymond. >>>> >>>> >>>> On Thu, Mar 18, 2021 at 9:56 AM Valentin Kulichenko < >>>> valentin.kuliche...@gmail.com> wrote: >>>> >>>> > Hi Denis, >>>> > >>>> > I think Pavel's main point is that behavior is unpredictable. For >>>> example, >>>> > AFAIK, putAsync can be executed in the main thread instead of the >>>> striped >>>> > pool thread if the operation is local. The listener can also be >>>> executed in >>>> > the main thread - this happens if the future is completed prior to >>>> listener >>>> > invocation (this is actually quite possible in the unit test >>>> environment >>>> > causing the test to pass). Finally, I'm pretty sure there are many >>>> cases >>>> > when a deadlock does not occur right away, but instead it will reveal >>>> > itself under high load due to thread starvation. The latter type of >>>> issues >>>> > is the most dangerous because they are often reproduced only in >>>> production. >>>> > Finally, there are performance considerations as well - cache >>>> operations >>>> > and listeners share the same fixed-sized pool which can have negative >>>> > effects. >>>> > >>>> > I'm OK with the change. Although, it might be better to introduce a >>>> new >>>> > fixed-sized pool instead of ForkJoinPool for listeners, simply >>>> because this >>>> > is the approach taken throughout the project. But this is up to a >>>> debate. >>>> > >>>> > -Val >>>> > >>>> > On Wed, Mar 17, 2021 at 11:31 AM Denis Garus <garus....@gmail.com> >>>> wrote: >>>> > >>>> > > Pavel, >>>> > > I tried this: >>>> > > >>>> > > @Test >>>> > > public void test() throws Exception { >>>> > > IgniteCache<Integer, String> cache = >>>> > > startGrid().getOrCreateCache("test_cache"); >>>> > > >>>> > > cache.putAsync(1, "one").listen(f -> cache.replace(1, "two")); >>>> > > >>>> > > assertEquals("two", cache.get(1)); >>>> > > } >>>> > > >>>> > > and this test is green. >>>> > > I believe that an user can make listener that leads to deadlock, but >>>> > > the example in the IEP does not reflect this. >>>> > > >>>> > > >>>> > > >>>> > > ср, 17 мар. 2021 г. в 17:36, Вячеслав Коптилин < >>>> slava.kopti...@gmail.com >>>> > >: >>>> > > >>>> > > > Hi Pavel, >>>> > > > >>>> > > > > Not a good excuse really. We have a usability problem, you have >>>> to >>>> > > admit >>>> > > > it. >>>> > > > Fair enough. I agree that this is a usability issue, but I have >>>> doubts >>>> > > that >>>> > > > the proposed approach to overcome it is the best one. >>>> > > > >>>> > > > > Documentation won't help - no one is going to read the Javadoc >>>> for a >>>> > > > trivial method like putAsync >>>> > > > That is sad... However, I don't think that this is a strong >>>> argument >>>> > > here. >>>> > > > >>>> > > > This is just my opinion. Let's see what other community members >>>> have to >>>> > > > say. >>>> > > > >>>> > > > Thanks, >>>> > > > S. >>>> > > > >>>> > > > >>>> > > > ср, 17 мар. 2021 г. в 17:01, Pavel Tupitsyn <ptupit...@apache.org >>>> >: >>>> > > > >>>> > > > > > the user should use the right API >>>> > > > > >>>> > > > > Not a good excuse really. We have a usability problem, you have >>>> to >>>> > > admit >>>> > > > > it. >>>> > > > > "The brakes did not work on your car - too bad, you should have >>>> known >>>> > > > that >>>> > > > > on Sundays only your left foot is allowed on the pedal" >>>> > > > > >>>> > > > > This particular use case is too intricate. >>>> > > > > Even when you know about that, it is difficult to decide what >>>> can run >>>> > > on >>>> > > > > the striped pool, >>>> > > > > and what can't. It is too easy to forget. >>>> > > > > And most people don't know, even among Ignite developers. >>>> > > > > >>>> > > > > Documentation won't help - no one is going to read the Javadoc >>>> for a >>>> > > > > trivial method like putAsync. >>>> > > > > >>>> > > > > >>>> > > > > So I propose to have a safe default. >>>> > > > > Then document the performance tuning opportunity on [1]. >>>> > > > > >>>> > > > > Think about how many users abandon a product because it >>>> mysteriously >>>> > > > > crashes and hangs. >>>> > > > > >>>> > > > > [1] >>>> > > > > >>>> > > > > >>>> > > > >>>> > > >>>> > >>>> https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips >>>> > > > > >>>> > > > > >>>> > > > > On Wed, Mar 17, 2021 at 4:21 PM Вячеслав Коптилин < >>>> > > > > slava.kopti...@gmail.com> >>>> > > > > wrote: >>>> > > > > >>>> > > > > > Hi Pavel, >>>> > > > > > >>>> > > > > > Well, I think that the user should use the right API instead >>>> of >>>> > > > > introducing >>>> > > > > > uncontested overhead for everyone. >>>> > > > > > For instance, the code that is provided by IEP can changed as >>>> > > follows: >>>> > > > > > >>>> > > > > > IgniteFuture fut = cache.putAsync(1, 1); >>>> > > > > > fut.listenAync(f -> { >>>> > > > > > // Executes on Striped pool and deadlocks. >>>> > > > > > cache.replace(1, 2); >>>> > > > > > }, ForkJoinPool.commonPool()); >>>> > > > > > >>>> > > > > > Of course, it does not mean that this fact should not be >>>> properly >>>> > > > > > documented. >>>> > > > > > Perhaps, I am missing something. >>>> > > > > > >>>> > > > > > Thanks, >>>> > > > > > S. >>>> > > > > > >>>> > > > > > ср, 17 мар. 2021 г. в 16:01, Pavel Tupitsyn < >>>> ptupit...@apache.org >>>> > >: >>>> > > > > > >>>> > > > > > > Slava, >>>> > > > > > > >>>> > > > > > > Your suggestion is to keep things as is and discard the IEP, >>>> > right? >>>> > > > > > > >>>> > > > > > > > this can lead to significant overhead >>>> > > > > > > Yes, there is some overhead, but the cost of accidentally >>>> > starving >>>> > > > the >>>> > > > > > > striped pool is worse, >>>> > > > > > > not to mention the deadlocks. >>>> > > > > > > >>>> > > > > > > I believe that we should favor correctness over performance >>>> in >>>> > any >>>> > > > > case. >>>> > > > > > > >>>> > > > > > > >>>> > > > > > > On Wed, Mar 17, 2021 at 3:34 PM Вячеслав Коптилин < >>>> > > > > > > slava.kopti...@gmail.com> >>>> > > > > > > wrote: >>>> > > > > > > >>>> > > > > > > > Well, the specified method already exists :) >>>> > > > > > > > >>>> > > > > > > > /** >>>> > > > > > > > * Registers listener closure to be asynchronously >>>> notified >>>> > > > > > whenever >>>> > > > > > > > future completes. >>>> > > > > > > > * Closure will be processed in specified executor. >>>> > > > > > > > * >>>> > > > > > > > * @param lsnr Listener closure to register. Cannot be >>>> > {@code >>>> > > > > > null}. >>>> > > > > > > > * @param exec Executor to run listener. Cannot be >>>> {@code >>>> > > > null}. >>>> > > > > > > > */ >>>> > > > > > > > public void listenAsync(IgniteInClosure<? super >>>> > > > IgniteFuture<V>> >>>> > > > > > > lsnr, >>>> > > > > > > > Executor exec); >>>> > > > > > > > >>>> > > > > > > > Thanks, >>>> > > > > > > > S. >>>> > > > > > > > >>>> > > > > > > > ср, 17 мар. 2021 г. в 15:25, Вячеслав Коптилин < >>>> > > > > > slava.kopti...@gmail.com >>>> > > > > > > >: >>>> > > > > > > > >>>> > > > > > > > > Hello Pavel, >>>> > > > > > > > > >>>> > > > > > > > > I took a look at your IEP and pool request. I have the >>>> > > following >>>> > > > > > > > concerns. >>>> > > > > > > > > First of all, this change breaks the contract of >>>> > > > > > > > IgniteFuture#listen(lsnr) >>>> > > > > > > > > >>>> > > > > > > > > /** >>>> > > > > > > > > * Registers listener closure to be asynchronously >>>> > notified >>>> > > > > > > whenever >>>> > > > > > > > > future completes. >>>> > > > > > > > > * Closure will be processed in thread that >>>> completes >>>> > this >>>> > > > > future >>>> > > > > > > or >>>> > > > > > > > > (if future already >>>> > > > > > > > > * completed) immediately in current thread. >>>> > > > > > > > > * >>>> > > > > > > > > * @param lsnr Listener closure to register. Cannot >>>> be >>>> > > {@code >>>> > > > > > > null}. >>>> > > > > > > > > */ >>>> > > > > > > > > public void listen(IgniteInClosure<? super >>>> > IgniteFuture<V>> >>>> > > > > > lsnr); >>>> > > > > > > > > >>>> > > > > > > > > In your pull request, the listener is always called >>>> from >>>> > a >>>> > > > > > > specified >>>> > > > > > > > > thread pool (which is fork-join by default) >>>> > > > > > > > > even though the future is already completed at the >>>> moment >>>> > > the >>>> > > > > > > listen >>>> > > > > > > > > method is called. >>>> > > > > > > > > In my opinion, this can lead to significant >>>> overhead - >>>> > > > > submission >>>> > > > > > > > > requires acquiring a lock and notifying a pool thread. >>>> > > > > > > > > >>>> > > > > > > > > It seems to me, that we should not change the >>>> current >>>> > > > behavior. >>>> > > > > > > > > However, thread pool executor can be added as an >>>> optional >>>> > > > parameter >>>> > > > > > of >>>> > > > > > > > > listen() method as follows: >>>> > > > > > > > > >>>> > > > > > > > > public void listen(IgniteInClosure<? super >>>> > > > IgniteFuture<V>> >>>> > > > > > > lsnr, >>>> > > > > > > > > Executor exec); >>>> > > > > > > > > >>>> > > > > > > > > Thanks, >>>> > > > > > > > > S. >>>> > > > > > > > > >>>> > > > > > > > > пн, 15 мар. 2021 г. в 19:24, Pavel Tupitsyn < >>>> > > > ptupit...@apache.org >>>> > > > > >: >>>> > > > > > > > > >>>> > > > > > > > >> Igniters, >>>> > > > > > > > >> >>>> > > > > > > > >> Please review the IEP [1] and let me know your >>>> thoughts. >>>> > > > > > > > >> >>>> > > > > > > > >> [1] >>>> > > > > > > > >> >>>> > > > > > > > >> >>>> > > > > > > > >>>> > > > > > > >>>> > > > > > >>>> > > > > >>>> > > > >>>> > > >>>> > >>>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-70%3A+Async+Continuation+Executor >>>> > > > > > > > >> >>>> > > > > > > > > >>>> > > > > > > > >>>> > > > > > > >>>> > > > > > >>>> > > > > >>>> > > > >>>> > > >>>> > >>>> >>>> >>>> -- >>>> <http://www.trimble.com/> >>>> Raymond Wilson >>>> Solution Architect, Civil Construction Software Systems (CCSS) >>>> 11 Birmingham Drive | Christchurch, New Zealand >>>> raymond_wil...@trimble.com >>>> >>>> < >>>> https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch >>>> > >>>> >>>