[ https://issues.apache.org/jira/browse/FLINK-14110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Knauf reopened FLINK-14110:
--------------------------------------

Re-opening in accordance with https://issues.apache.org/jira/browse/FLINK-23206.

> Deleting state.backend.rocksdb.localdir causes silent failure
> -------------------------------------------------------------
>
>                 Key: FLINK-14110
>                 URL: https://issues.apache.org/jira/browse/FLINK-14110
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 1.8.1, 1.9.0
>         Environment: Flink {{1.8.1}} and {{1.9.0}}.
>                      JVM 8
>            Reporter: Aaron Levin
>            Priority: Minor
>              Labels: auto-closed
>
> Suppose {{state.backend.rocksdb.localdir}} is configured as:
> {noformat}
> state.backend.rocksdb.localdir: /flink/tmp
> {noformat}
> If I then run {{rm -rf /flink/tmp/job_*}} on a host while a Flink
> application is running, I will observe the following:
> * throughput of my operators running on that host will drop to zero
> * the application will not fail or restart
> * the task manager will not fail or restart
> * in most cases there is nothing in the logs to indicate a failure (I've run
> this several times and only once seen an exception - I believe I was lucky
> and deleted those directories during a checkpoint or something)
> The desired behaviour here would be to throw an exception and crash, instead
> of silently dropping throughput to zero. Restarting the Task Manager will
> resolve the issues.
> I only tried this on Flink {{1.8.1}} and {{1.9.0}}.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)