[ https://issues.apache.org/jira/browse/IGNITE-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikolay updated IGNITE-21737:
-----------------------------
    Description:
A server node ({_}<property name="authenticationEnabled" value="true"/>{_}) fails with the following error
{code:java}
JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Failed to find security context for subject with given ID]
{code}
when an arbitrary client node leaves the topology and a new one joins it. We host thick client nodes in Kubernetes, so the error occurs when we simply restart one pod.

One assumption: the fairly large ring of around 40 nodes (5 servers, 35 clients) somehow prevents the security context of the disconnected node from being found.

Each disconnected node uses a graceful shutdown timeout:
{code:java}
public void onLifecycleEvent(LifecycleEventType evt) {
    if (evt == LifecycleEventType.BEFORE_NODE_STOP) {
        try {
            Thread.sleep(40_000); // delay node stop by 40 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }
}
{code}

  was:
A server node (<property name="authenticationEnabled" value="true"/>) fails with the following error

JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Failed to find security context for subject with given ID]

when an arbitrary client node leaves the topology and a new one joins it. We host thick client nodes in Kubernetes, so the error occurs when we simply restart one pod.

One assumption: the fairly large ring of around 40 nodes somehow prevents the security context of the disconnected node from being found.

> Server node failure when nodes join and leave a cluster with security enabled
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-21737
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21737
>             Project: Ignite
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 2.16
>            Reporter: Nikolay
>            Priority: Major
>
> A server node ({_}<property name="authenticationEnabled" value="true"/>{_})
> fails with the following error
> {code:java}
> JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Failed to find security context for subject with given ID]
> {code}
> when an arbitrary client node leaves the topology and a new one joins it. We
> host thick client nodes in Kubernetes, so the error occurs when we simply
> restart one pod.
>
> One assumption: the fairly large ring of around 40 nodes (5 servers, 35
> clients) somehow prevents the security context of the disconnected node from
> being found.
>
> Each disconnected node uses a graceful shutdown timeout:
> {code:java}
> public void onLifecycleEvent(LifecycleEventType evt) {
>     if (evt == LifecycleEventType.BEFORE_NODE_STOP) {
>         try {
>             Thread.sleep(40_000); // delay node stop by 40 seconds
>         } catch (InterruptedException e) {
>             Thread.currentThread().interrupt(); // preserve the interrupt status
>         }
>     }
> }
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)