Narayanan Arunachalam created FLINK-9291:
--------------------------------------------

             Summary: Checkpoint failure (CIRCULAR REFERENCE:java.lang.NegativeArraySizeException)
                 Key: FLINK-9291
                 URL: https://issues.apache.org/jira/browse/FLINK-9291
             Project: Flink
          Issue Type: Bug
            Reporter: Narayanan Arunachalam


Using RocksDB as the state backend, checkpointing fails with the following error 
after the job has been running for a few hours. The job recovers fine afterwards.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint 215 for operator makeSalpTrace -> countTraces -> makeZipkinTrace -> (Map -> Sink: bs, Sink: es) (14/80).}
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:948)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.Exception: Could not materialize checkpoint 215 for operator makeSalpTrace -> countTraces -> makeZipkinTrace -> (Map -> Sink: bs, Sink: es) (14/80).
        ... 6 more
Caused by: java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:894)
        ... 5 more
        Suppressed: java.lang.Exception: Could not properly cancel managed keyed state future.
                at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:91)
                at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.cleanup(StreamTask.java:976)
                at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:939)
                ... 5 more
        Caused by: java.util.concurrent.ExecutionException: java.lang.NegativeArraySizeException
                at java.util.concurrent.FutureTask.report(FutureTask.java:122)
                at java.util.concurrent.FutureTask.get(FutureTask.java:192)
                at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:43)
                at org.apache.flink.runtime.state.StateUtil.discardStateFuture(StateUtil.java:66)
                at org.apache.flink.streaming.api.operators.OperatorSnapshotResult.cancel(OperatorSnapshotResult.java:89)
                ... 7 more
        Caused by: java.lang.NegativeArraySizeException
                at org.rocksdb.RocksIterator.value0(Native Method)
                at org.rocksdb.RocksIterator.value(RocksIterator.java:50)
                at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBMergeIterator.value(RocksDBKeyedStateBackend.java:1898)
                at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullSnapshotOperation.writeKVStateData(RocksDBKeyedStateBackend.java:704)
                at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullSnapshotOperation.writeDBSnapshot(RocksDBKeyedStateBackend.java:556)
                at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$3.performOperation(RocksDBKeyedStateBackend.java:466)
                at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$3.performOperation(RocksDBKeyedStateBackend.java:424)
                at org.apache.flink.runtime.io.async.AbstractAsyncCallableWithResources.call(AbstractAsyncCallableWithResources.java:75)
                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:40)
                at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:894)
                ... 5 more
        [CIRCULAR REFERENCE:java.lang.NegativeArraySizeException]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
