Re: Standalone HA cluster: Fatal error occurred in the cluster entrypoint.
Hi Miki,

Thank you for the reply! I have deleted the ZooKeeper data and was able to restart the cluster.

Olga

Sent from my iPhone

On Nov 16, 2018, at 4:38 AM, miki haiat <miko5...@gmail.com> wrote:

I "solved" this issue by cleaning the ZooKeeper information and starting the cluster again. All the checkpoint and job graph data will be erased, and basically you will start a new cluster... It happened to me a lot on 1.5.x; on 1.6 things are running perfectly. I'm not sure why this error is back again on 1.6.1?

On Fri, 16 Nov 2018, 0:42 Olga Luganska <trebl...@hotmail.com> wrote:

Hello,

I am running a Flink 1.6.1 standalone HA cluster. Today I am unable to start the cluster because of a "Fatal error in cluster entrypoint". (I used to see this error when running Flink 1.5; after the upgrade to 1.6.1, which had a fix for this bug, everything worked well for a while.)

Question: what exactly needs to be done to clean the "state handle store"?

2018-11-15 15:09:53,181 DEBUG org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor - Fencing token not set: Ignoring message LocalFencedMessage(null, org.apache.flink.runtime.rpc.messages.RunAsync@21fd224c) because the fencing token is null.
2018-11-15 15:09:53,182 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error occurred in the cluster entrypoint.
java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:61)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:692)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:677)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobs(Dispatcher.java:658)
    at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$26(Dispatcher.java:817)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:59)
    ... 9 more
Caused by: java.io.FileNotFoundException: /checkpoint_repo/ha/submittedJobGraphdd865937d674 (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
    at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:142)
    at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
    at org.apache.flink.runtime.state.RetrievableStreamStateHandle.openInputStream(RetrievableStreamStateHandle.java:64)
    at org.apache.flink.runtime.state.RetrievableStreamStateHandle.retrieveState(RetrievableStreamStateHandle.java:57)
    at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:202)
    ... 14 more
2018-11-15 15:09:53,185 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache

thank you,
Olga
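For reference, "cleaning the ZooKeeper information" usually means stopping the cluster and then deleting Flink's HA znodes (and, if desired, the HA storage directory). Below is a hedged sketch; the host, port, and znode path are placeholders/defaults (high-availability.zookeeper.path.root defaults to /flink), so check your flink-conf.yaml before deleting anything:

# Stop the Flink cluster first.
# Remove Flink's HA metadata from ZooKeeper (path depends on
# high-availability.zookeeper.path.root, default /flink).
$ ./bin/zkCli.sh -server <zookeeper-host>:2181
[zk: ...] rmr /flink            # ZooKeeper 3.4.x; use "deleteall /flink" on 3.5+

# Optionally also clear the HA storage directory (high-availability.storageDir),
# e.g. /checkpoint_repo/ha in the log above. All submitted job graphs and
# checkpoint metadata are lost, so jobs must be resubmitted.

This matches what Miki describes: afterwards the cluster comes back empty, effectively as a new cluster.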
Standalone HA cluster: Fatal error occurred in the cluster entrypoint.
Hello,

I am running a Flink 1.6.1 standalone HA cluster. Today I am unable to start the cluster because of a "Fatal error in cluster entrypoint". (I used to see this error when running Flink 1.5; after the upgrade to 1.6.1, which had a fix for this bug, everything worked well for a while.)

Question: what exactly needs to be done to clean the "state handle store"?

2018-11-15 15:09:53,181 DEBUG org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor - Fencing token not set: Ignoring message LocalFencedMessage(null, org.apache.flink.runtime.rpc.messages.RunAsync@21fd224c) because the fencing token is null.
2018-11-15 15:09:53,182 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error occurred in the cluster entrypoint.
java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:61)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /e13034f83a80072204facb2cec9ea6a3. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:692)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:677)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobs(Dispatcher.java:658)
    at org.apache.flink.runtime.dispatcher.Dispatcher.lambda$null$26(Dispatcher.java:817)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$1(FunctionUtils.java:59)
    ... 9 more
Caused by: java.io.FileNotFoundException: /checkpoint_repo/ha/submittedJobGraphdd865937d674 (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
    at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:142)
    at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
    at org.apache.flink.runtime.state.RetrievableStreamStateHandle.openInputStream(RetrievableStreamStateHandle.java:64)
    at org.apache.flink.runtime.state.RetrievableStreamStateHandle.retrieveState(RetrievableStreamStateHandle.java:57)
    at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:202)
    ... 14 more
2018-11-15 15:09:53,185 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache

thank you,
Olga
Enabling Flink’s checkpointing
Hello,

Reading the Flink documentation, I see that to enable checkpointing we need to:

1. Enable checkpointing on the execution environment.
2. Make sure that your source/sink implements either the CheckpointedFunction or the ListCheckpointed interface.

Is #2 a must, and how is the checkpointing mechanism affected if your source does not implement the interfaces mentioned above? (I see an example of using RuntimeContext to access keyed state.) Please explain.

Thank you very much,
Olga

Sent from my iPhone
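For what it's worth, a minimal sketch of both steps against the Flink 1.6 DataStream API is shown below; the class, interval, and field names are purely illustrative. Step 2 only matters for operators that have state of their own to snapshot; a stateless source still participates in checkpointing once step 1 is done, it just has nothing to restore.

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class CheckpointingSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Step 1: enable checkpointing on the execution environment (every 10 s, exactly-once).
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);

        env.addSource(new CountingSource()).print();
        env.execute("checkpointing-sketch");
    }

    // Step 2 (only needed for stateful sources/sinks): a source that keeps its
    // read position in operator state so it can resume after a failure.
    public static class CountingSource implements SourceFunction<Long>, CheckpointedFunction {

        private volatile boolean running = true;
        private long offset = 0L;                 // position restored on recovery
        private transient ListState<Long> state;  // Flink-managed operator state

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            while (running) {
                // Emit and advance the offset under the checkpoint lock so that
                // snapshots always see a consistent value.
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(offset++);
                }
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }

        @Override
        public void snapshotState(FunctionSnapshotContext context) throws Exception {
            state.clear();
            state.add(offset);
        }

        @Override
        public void initializeState(FunctionInitializationContext context) throws Exception {
            state = context.getOperatorStateStore()
                    .getListState(new ListStateDescriptor<>("offset", Long.class));
            for (Long value : state.get()) {
                offset = value;
            }
        }
    }
}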
Re: FlinkKafkaProducer and Confluent Schema Registry
Dawid,

Is there a projected date to deliver ConfluentRegistryAvroSerializationSchema?

thank you,
Olga

From: Dawid Wysakowicz
Sent: Monday, October 22, 2018 10:40 AM
To: trebl...@hotmail.com
Cc: user
Subject: Re: FlinkKafkaProducer and Confluent Schema Registry

Hi Olga,

There is an open PR [1] that has some in-progress work on a corresponding AvroSerializationSchema; you can have a look at it. The bigger issue there is that SerializationSchema does not have access to the event's key, so using a topic pattern might be problematic.

Best,
Dawid

[1] https://github.com/apache/flink/pull/6259

On Mon, 22 Oct 2018 at 16:51, Kostas Kloudas <k.klou...@data-artisans.com> wrote:

Hi Olga,

Sorry for the late reply. I think that Gordon (cc'ed) should be able to answer your question.

Cheers,
Kostas

On Oct 13, 2018, at 3:10 PM, Olga Luganska <trebl...@hotmail.com> wrote:

Any suggestions? Thank you

Sent from my iPhone

On Oct 9, 2018, at 9:28 PM, Olga Luganska <trebl...@hotmail.com> wrote:

Hello,

I would like to use the Confluent Schema Registry in my streaming job. I was able to make it work with the help of a generic Kafka producer and a FlinkKafkaConsumer which uses ConfluentRegistryAvroDeserializationSchema:

FlinkKafkaConsumer011 consumer = new FlinkKafkaConsumer011<>(MY_TOPIC, ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, SCHEMA_URI), kafkaProperties);

My question: is it possible to implement producer logic in the FlinkKafkaProducer to serialize the message and store the schema id in the Confluent Schema Registry? I don't think this is going to work with the current interface, because creation and caching of the schema id in the Confluent Schema Registry is done with the help of io.confluent.kafka.serializers.KafkaAvroSerializer, and all FlinkKafkaProducer constructors take either SerializationSchema or KeyedSerializationSchema (part of Flink's own serialization stack) as one of the parameters. If my assumption is wrong, could you please provide details of the implementation?

Thank you very much,
Olga
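Until a ConfluentRegistryAvroSerializationSchema ships, one possible workaround (not an official Flink API, just a hedged sketch) is to wrap Confluent's KafkaAvroSerializer inside a plain SerializationSchema and hand that to FlinkKafkaProducer011; the class name, topic, and registry URL below are illustrative:

import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.serialization.SerializationSchema;

import java.util.Collections;
import java.util.Map;

// Wraps Confluent's KafkaAvroSerializer so that FlinkKafkaProducer011 can
// register/look up the schema id in the Schema Registry for each record.
public class ConfluentAvroSerializationSchema implements SerializationSchema<GenericRecord> {

    private final String topic;
    private final String schemaRegistryUrl;
    private transient KafkaAvroSerializer serializer;

    public ConfluentAvroSerializationSchema(String topic, String schemaRegistryUrl) {
        this.topic = topic;
        this.schemaRegistryUrl = schemaRegistryUrl;
    }

    @Override
    public byte[] serialize(GenericRecord element) {
        if (serializer == null) {
            // Created lazily on the task manager, because KafkaAvroSerializer itself
            // is not serializable and cannot be shipped with the job graph.
            serializer = new KafkaAvroSerializer();
            Map<String, String> config =
                    Collections.singletonMap("schema.registry.url", schemaRegistryUrl);
            serializer.configure(config, false); // false = value serializer, not key
        }
        return serializer.serialize(topic, element);
    }
}

It would then be used much like the consumer above, e.g. new FlinkKafkaProducer011<>(MY_TOPIC, new ConfluentAvroSerializationSchema(MY_TOPIC, SCHEMA_URI), kafkaProperties). The limitation Dawid mentions still applies: a plain SerializationSchema never sees the record's key or target topic, so per-key subjects or topic patterns are not covered by this sketch.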
Re: FlinkKafkaProducer and Confluent Schema Registry
Any suggestions? Thank you

Sent from my iPhone

On Oct 9, 2018, at 9:28 PM, Olga Luganska <trebl...@hotmail.com> wrote:

Hello,

I would like to use the Confluent Schema Registry in my streaming job. I was able to make it work with the help of a generic Kafka producer and a FlinkKafkaConsumer which uses ConfluentRegistryAvroDeserializationSchema:

FlinkKafkaConsumer011 consumer = new FlinkKafkaConsumer011<>(MY_TOPIC, ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, SCHEMA_URI), kafkaProperties);

My question: is it possible to implement producer logic in the FlinkKafkaProducer to serialize the message and store the schema id in the Confluent Schema Registry? I don't think this is going to work with the current interface, because creation and caching of the schema id in the Confluent Schema Registry is done with the help of io.confluent.kafka.serializers.KafkaAvroSerializer, and all FlinkKafkaProducer constructors take either SerializationSchema or KeyedSerializationSchema (part of Flink's own serialization stack) as one of the parameters. If my assumption is wrong, could you please provide details of the implementation?

Thank you very much,
Olga
FlinkKafkaProducer and Confluent Schema Registry
Hello,

I would like to use the Confluent Schema Registry in my streaming job. I was able to make it work with the help of a generic Kafka producer and a FlinkKafkaConsumer which uses ConfluentRegistryAvroDeserializationSchema:

FlinkKafkaConsumer011 consumer = new FlinkKafkaConsumer011<>(MY_TOPIC, ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, SCHEMA_URI), kafkaProperties);

My question: is it possible to implement producer logic in the FlinkKafkaProducer to serialize the message and store the schema id in the Confluent Schema Registry? I don't think this is going to work with the current interface, because creation and caching of the schema id in the Confluent Schema Registry is done with the help of io.confluent.kafka.serializers.KafkaAvroSerializer, and all FlinkKafkaProducer constructors take either SerializationSchema or KeyedSerializationSchema (part of Flink's own serialization stack) as one of the parameters. If my assumption is wrong, could you please provide details of the implementation?

Thank you very much,
Olga
Using several Kerberos keytabs in standalone cluster
Hello,

According to the documentation, the security setup is shared by all jobs on the same cluster, and if users need to use a different keytab, this is easily achievable in a YARN cluster setup by starting a new cluster with a different flink-conf.yaml.

Is it possible to set up a standalone cluster in such a way that each user can run his respective jobs with his own keytab?

Thank you very much,
Olga
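For context, the cluster-wide Kerberos setup referred to above lives in flink-conf.yaml and looks roughly like the sketch below (the keytab path and principal are placeholders). In a standalone cluster every JobManager and TaskManager process, and therefore every job running on it, uses this single identity:

# flink-conf.yaml -- one keytab/principal for the whole standalone cluster
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/flink-user.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
# Components that should use the Kerberos credentials (e.g. ZooKeeper, Kafka)
security.kerberos.login.contexts: Client,KafkaClient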
Flink support for multiple data centers
Hello,

Does Flink support deployments across multiple data centers, with failover procedures in case one of the data centers goes down?

Another question I have is about data encryption. If the application state that needs to be checkpointed contains data elements which are considered personally identifiable information, or perhaps credit card information, do you provide any encryption mechanisms to make sure that this data is secured?

Thank you very much,
Olga