Mikhail Petrov created IGNITE-15966:
---------------------------------------

             Summary: [Security] Node can hang with authentication enabled after user drop operation
                 Key: IGNITE-15966
                 URL: https://issues.apache.org/jira/browse/IGNITE-15966
             Project: Ignite
          Issue Type: Bug
         Environment: 

            Reporter: Mikhail Petrov


Reproducer: 

{code:java}
/** */
public class UserDropTest extends GridCommonAbstractTest {
    /** {@inheritDoc} */
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        cfg.setAuthenticationEnabled(true);

        cfg.setDataStorageConfiguration(new DataStorageConfiguration()
            .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
                .setPersistenceEnabled(true)));

        return cfg;
    }

    /** */
    @Test
    public void test() throws Exception {
        startGrid(0);
        startGrid(1);

        grid(0).cluster().state(ClusterState.ACTIVE);

        grid(0).createCache(DEFAULT_CACHE_NAME);

        try (AutoCloseable ignored = withSecurityContextOnAllNodes(authenticate(grid(0), "ignite", "ignite"))) {
            grid(0).context().security().createUser("cli", "pwd".toCharArray());
        }

        IgniteClient client = Ignition.startClient(new ClientConfiguration()
            .setAddresses("127.0.0.1:10800")
            .setUserName("cli")
            .setUserPassword("pwd"));

        ClientCache<Object, Object> cache = client.cache(DEFAULT_CACHE_NAME);

        try (AutoCloseable ignored = withSecurityContextOnAllNodes(authenticate(grid(0), "ignite", "ignite"))) {
            grid(0).context().security().dropUser("cli");
        }

        Map<Integer, Integer> entries = new HashMap<>();

        for (int i = 0; i < 10000; i++)
            entries.put(i, i);

        cache.putAll(entries);
    }

    /** {@inheritDoc} */
    @Override protected void beforeTest() throws Exception {
        super.beforeTest();

        cleanPersistenceDir();
    }
}

{code}

Exception:

{code:java}
[2021-11-22 11:04:32,390][ERROR][sys-stripe-3-#92%ignite.UserDropTest1%][IgniteTestResources] Critical system error detected. Will be handled accordingly to configured handler [hnd=NoOpFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Failed to find security context for subject with given ID : 0898b227-30d5-3afc-9394-d8e4889ece4a]]
java.lang.IllegalStateException: Failed to find security context for subject with given ID : 0898b227-30d5-3afc-9394-d8e4889ece4a
        at org.apache.ignite.internal.processors.security.IgniteSecurityProcessor.withContext(IgniteSecurityProcessor.java:167)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1906)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1528)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$5300(GridIoManager.java:242)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.execute(GridIoManager.java:1421)
        at org.apache.ignite.internal.managers.communication.TraceRunnable.run(TraceRunnable.java:55)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:569)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
        at java.lang.Thread.run(Thread.java:748)
{code}

The main problem is:

The authentication plugin implementation ties each security user to a subject ID 
that is propagated across cluster nodes.

If a node receives an operation initiated by a deleted user, it fails to 
obtain the security context by subject ID, because the context was removed 
together with the user, and the node hangs with the exception shown above.
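
The lookup failure can be pictured with a small standalone model. This is a simplified sketch, not Ignite code: the class SubjectRegistryDemo, its methods, and the way the subject ID is derived from the user name are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified model of a node-side subject registry. Hypothetical, not Ignite code. */
class SubjectRegistryDemo {
    /** Security contexts keyed by subject ID (modeled here as plain user names). */
    private final Map<UUID, String> ctxs = new ConcurrentHashMap<>();

    /** Derives a subject ID from the user name. The derivation is an assumption of this sketch. */
    private static UUID subjectId(String name) {
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8));
    }

    /** Registers a user and returns the subject ID that later travels with its operations. */
    UUID createUser(String name) {
        UUID subjId = subjectId(name);

        ctxs.put(subjId, name);

        return subjId;
    }

    /** Drops a user together with its security context. */
    void dropUser(String name) {
        ctxs.remove(subjectId(name));
    }

    /** Resolves the context for an incoming operation and fails hard if it is missing. */
    String withContext(UUID subjId) {
        String ctx = ctxs.get(subjId);

        if (ctx == null)
            throw new IllegalStateException("Failed to find security context for subject with given ID : " + subjId);

        return ctx;
    }

    public static void main(String[] args) {
        SubjectRegistryDemo node = new SubjectRegistryDemo();

        // The client authenticates and keeps using this subject ID.
        UUID subjId = node.createUser("cli");

        node.dropUser("cli");

        try {
            // An operation from the already connected client arrives after the drop.
            node.withContext(subjId);
        }
        catch (IllegalStateException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
    }
}
```

In the model, as in the reproducer, the client obtains its identity before the drop and sends the operation after it, so the receiving side no longer has anything to resolve the ID against.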

This exposes a problem in the security implementation: there is no mechanism 
to determine that a security subject is no longer needed and can be safely 
removed, and at the same time a missing security subject causes an 
unrecoverable exception that kills the system worker and hangs the node.
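
Why one failed lookup can stall everything queued behind it can be sketched with a toy worker loop. This is only an illustration under assumptions: StripeWorkerDemo and runUntilFailure are hypothetical names, and the loop does not reproduce the actual StripedExecutor logic.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of a single worker draining a task queue. Hypothetical, not Ignite code. */
class StripeWorkerDemo {
    /**
     * Runs queued tasks in order until one of them throws.
     *
     * @return Number of tasks left unprocessed once the worker body has exited.
     */
    static int runUntilFailure(Queue<Runnable> tasks) {
        try {
            while (!tasks.isEmpty())
                tasks.poll().run();
        }
        catch (RuntimeException e) {
            // The worker body ends here. Nothing drains the queue afterwards.
        }

        return tasks.size();
    }

    public static void main(String[] args) {
        Queue<Runnable> tasks = new ArrayDeque<>();

        tasks.add(() -> { /* Regular message. */ });
        tasks.add(() -> {
            // Message from the dropped user: the context lookup fails.
            throw new IllegalStateException("Failed to find security context");
        });
        tasks.add(() -> { /* Never processed. */ });
        tasks.add(() -> { /* Never processed. */ });

        System.out.println("Unprocessed tasks: " + runUntilFailure(tasks));
    }
}
```

In this model the two messages queued after the failing one are never handled, which corresponds to the hang observed in the reproducer.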




--
This message was sent by Atlassian Jira
(v8.20.1#820001)
