[jira] [Commented] (FLINK-4150) Problem with Blobstore in Yarn HA setting on recovery after cluster shutdown

ASF GitHub Bot (JIRA) Fri, 15 Jul 2016 07:15:36 -0700

    [ https://issues.apache.org/jira/browse/FLINK-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379447#comment-15379447 ]


ASF GitHub Bot commented on FLINK-4150:
---------------------------------------

Github user tillrohrmann commented on the issue:

    https://github.com/apache/flink/pull/2256
  
    Just a quick question: do we also want to remove failed jobs from the
    BlobStore and ZK, or only finished or cancelled jobs?


> Problem with Blobstore in Yarn HA setting on recovery after cluster shutdown
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-4150
>                 URL: https://issues.apache.org/jira/browse/FLINK-4150
>             Project: Flink
>          Issue Type: Bug
>          Components: Job-Submission
>            Reporter: Stefan Richter
>            Assignee: Ufuk Celebi
>            Priority: Blocker
>             Fix For: 1.1.0
>
>
> Submitting a job in Yarn with HA can lead to the following exception:
> {code}
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
> ClassLoader info: URL ClassLoader:
>     file: '/tmp/blobStore-ccec0f4a-3e07-455f-945b-4fcd08f5bac1/cache/blob_7fafffe9595cd06aff213b81b5da7b1682e1d6b0' (invalid JAR: zip file is empty)
> Class not resolvable through given classloader.
>       at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:207)
>       at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:222)
>       at org.apache.flink.runtime.taskmanager.Task.run(Task.java:588)
>       at java.lang.Thread.run(Thread.java:745)
> {code}
> Some job information, including the Blob ids, is stored in Zookeeper. The
> actual Blobs are stored in a dedicated BlobStore if the recovery mode is set
> to Zookeeper. This BlobStore is typically located in a file system such as
> HDFS. When the cluster is shut down, the path for the BlobStore is deleted.
> When the cluster is then restarted, recovering jobs cannot be restored
> because their Blob ids stored in Zookeeper now point to deleted files.
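
The failure mode described above can be reproduced in miniature. The following is only a toy sketch, not Flink code: the class BlobRecoverySketch and all of its methods are hypothetical names, an in-memory map stands in for Zookeeper, and a local temporary directory stands in for the BlobStore path on HDFS. It assumes, as the description says, that shutdown deletes the blob files while the job metadata survives.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Toy model of the reported problem. None of these names are Flink APIs. */
final class BlobRecoverySketch {

    /** Stand-in for Zookeeper: job id -> blob key. Survives a shutdown. */
    final Map<String, String> jobToBlobKey = new LinkedHashMap<>();

    /** Stand-in for the BlobStore directory on a file system such as HDFS. */
    final Path blobStoreDir;

    BlobRecoverySketch(Path blobStoreDir) {
        this.blobStoreDir = blobStoreDir;
    }

    /** Stores the job's JAR in the blob directory and records its key. */
    void submitJob(String jobId, String blobKey, byte[] jar) throws IOException {
        Files.createDirectories(blobStoreDir);
        Files.write(blobStoreDir.resolve(blobKey), jar);
        jobToBlobKey.put(jobId, blobKey);
    }

    /** Models the shutdown: the blob path is deleted, the metadata is kept. */
    void shutdownCluster() throws IOException {
        if (!Files.exists(blobStoreDir)) {
            return;
        }
        List<Path> toDelete;
        try (Stream<Path> paths = Files.walk(blobStoreDir)) {
            // Reverse order so that files are deleted before their directory.
            toDelete = paths.sorted(Comparator.reverseOrder())
                            .collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }
    }

    /** Returns the jobs whose recorded blob key no longer points to a file. */
    List<String> findUnrecoverableJobs() {
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, String> e : jobToBlobKey.entrySet()) {
            if (!Files.isRegularFile(blobStoreDir.resolve(e.getValue()))) {
                broken.add(e.getKey());
            }
        }
        return broken;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("blobStore-sketch");
        BlobRecoverySketch cluster = new BlobRecoverySketch(tmp.resolve("cache"));
        cluster.submitJob("job-1", "blob_a", new byte[] {1, 2, 3});
        System.out.println("before shutdown: " + cluster.findUnrecoverableJobs());
        cluster.shutdownCluster();
        System.out.println("after shutdown:  " + cluster.findUnrecoverableJobs());
    }
}
```

In this model, any fix has to keep the two stores consistent: either the blob files outlive the shutdown for jobs that are still recoverable, or the metadata entry is removed together with the blob.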



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
