Denis,

At first glance, the main problem here is the rollback of cache
creation after an exception. An incorrect rollback leads, as expected,
to a critical failure of the ExchangeWorker. That is normal behavior:
since May 2018 we have `FailureHandler`, which lets us handle such
failures (`NoOpFailureHandler` is used in tests by default).

AFAIK, each dynamic cache creation should lead to a minor topology
version change. So the expected result here, from my point of view:

 - PME happens, topVer [1, 1], cache NOT created
 - PME happens, topVer [1, 2], cache created
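
To make that sequence concrete, here is a toy model in plain Java. It is
only a sketch of the expected behavior, not Ignite code: `TopologyModel`,
`exchange` and the `initFails` flag are hypothetical names, and I assume a
single-node grid whose topology version starts at [1, 0].

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model (not Ignite code): each cache start request triggers one exchange. */
public class TopologyModel {
    /** Major version: assumed single-node grid. */
    private final int major = 1;

    /** Minor version: bumped by every dynamic cache start request. */
    private int minor = 0;

    /** Names of caches that were actually created. */
    private final Set<String> caches = new LinkedHashSet<>();

    /**
     * Simulates one PME caused by a cache start request.
     *
     * @param cacheName Cache name.
     * @param initFails Hypothetical switch that simulates a failure during cache start.
     * @return {@code true} if the cache was created.
     */
    public boolean exchange(String cacheName, boolean initFails) {
        minor++; // The exchange completes either way, so the version moves on.

        if (initFails)
            return false; // Rollback: nothing is registered, later exchanges still work.

        caches.add(cacheName);

        return true;
    }

    /** @return Topology version in the "[major, minor]" form used above. */
    public String topVer() {
        return "[" + major + ", " + minor + "]";
    }

    /** @return Created caches. */
    public Set<String> caches() {
        return caches;
    }

    public static void main(String[] args) {
        TopologyModel grid = new TopologyModel();

        boolean first = grid.exchange("FIRST_CACHE", true);
        System.out.println("PME happens, topVer " + grid.topVer() + ", created=" + first);

        boolean second = grid.exchange("SECOND_CACHE", false);
        System.out.println("PME happens, topVer " + grid.topVer() + ", created=" + second);
    }
}
```

The point of the model is that a failed start still consumes a minor
version and leaves the grid able to process the next request.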


I think we should provide a proper fix for handling cache creation.
Please look at my PR [1]. It's very crude and I'm sure it does not cover
the whole problem, but these changes seem to fix your reproducer.

Hope this helps.

[1] https://github.com/apache/ignite/pull/4487/files


On Fri, 3 Aug 2018 at 14:54 Denis Garus <garus....@gmail.com> wrote:

> Hello, Igniters!
>
>
> If an error occurs during dynamic cache creation, the
> CacheAffinitySharedManager#processCacheStartRequests method tries to
> roll back the cache start routine.
> ```
> try {
>     if (startCache) {
>         cctx.cache().prepareCacheStart(req.startCacheConfiguration(),
>             cacheDesc,
>             nearCfg,
>             evts.topologyVersion(),
>             req.disabledAfterStart());
>         // some code
>     }
> }
> catch (IgniteCheckedException e) {
>     U.error(log, "Failed to initialize cache. Will try to rollback cache start routine. " +
>         "[cacheName=" + req.cacheName() + ']', e);
>
>     cctx.cache().closeCaches(Collections.singleton(req.cacheName()), false);
>
>     cctx.cache().completeCacheStartFuture(req, false, e);
> }
> ```
> Assume that GridDhtPartitionsExchangeFuture finishes without any error,
> because the exception is just logged.
> Is this the right way? What should the Ignite#createCache method return in
> this case?
>
> I can't check what it would return in that case, because it just doesn't
> work this way now.
> Later on, we get a critical error that stops the ExchangeWorker
> in a test environment
> or stops the node in a production environment.
>
> Reproducer:
> ```
> package org.apache.ignite.internal.processors.cache;
>
> import org.apache.ignite.IgniteCheckedException;
> import org.apache.ignite.configuration.CacheConfiguration;
> import org.apache.ignite.internal.IgniteEx;
> import org.apache.ignite.internal.util.typedef.internal.U;
> import org.apache.ignite.testframework.GridTestUtils;
> import org.apache.ignite.testframework.junits.common.GridCommonAbstractTest;
>
> public class CreateCacheFreezeTest extends GridCommonAbstractTest {
>     public void test() throws Exception {
>         IgniteEx ignite = startGrid(0);
>
>         U.registerMBean(ignite.context().config().getMBeanServer(),
>             ignite.name(), "FIRST_CACHE",
>             "org.apache.ignite.internal.processors.cache.CacheLocalMetricsMXBeanImpl",
>             new DummyMBeanImpl(), DummyMBean.class);
>
>         GridTestUtils.assertThrowsWithCause(() -> {
>             ignite.createCache(new CacheConfiguration<>("FIRST_CACHE"));
>
>             return 0;
>         }, IgniteCheckedException.class);
>         // The creation of SECOND_CACHE will hang because ExchangeWorker is stopped.
>         assertNotNull(ignite.createCache(new CacheConfiguration<>("SECOND_CACHE")));
>     }
>
>     public interface DummyMBean {
>         void noop();
>     }
>     static class DummyMBeanImpl implements DummyMBean {
>         @Override public void noop() {
>         }
>     }
> }
> ```
-- 
Maxim Muzafarov
