Re: Challenges using Flink REST API

Chesnay Schepler Wed, 13 Mar 2019 02:42:16 -0700

You should get the full stacktrace if you upgrade to 1.7.2.


On 13.03.2019 09:55, Wouter Zorgdrager wrote:

Hey all!
I'm looking for some advice on the following: I'm working on an abstraction on top of Apache Flink to 'pipeline' Flink applications using Kafka. For deployment this means that all these Flink jobs are embedded into one jar and each job is started using a program argument (e.g. "--stage 'FirstFlinkJob'"). To ease deploying a set of interconnected Flink jobs onto a cluster I wrote a Python script which basically communicates with the REST API of the JobManager. So you can do things like "pipeline start --jar 'JarWithThePipeline.jar'" and this would deploy every Flink application separately.
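To make the deployment step concrete, here is a minimal sketch of how such a script might start one job per stage through the REST API. It is not the actual script: the function names, the base URL, and the jar id are hypothetical, and it assumes the jar was already uploaded and that the `/jars/:jarid/run` endpoint accepts a JSON body with a `programArgs` field (older versions such as 1.4.2 took program arguments as a query parameter instead, so check the REST documentation for the Flink version in use).

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url, jar_id, stage):
    """Build (but do not send) the POST request that starts one stage.

    Assumption: the run endpoint reads program arguments from a JSON
    body field named "programArgs".
    """
    url = "{}/jars/{}/run".format(
        base_url.rstrip("/"), urllib.parse.quote(jar_id, safe="")
    )
    body = json.dumps({"programArgs": "--stage '{}'".format(stage)})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_pipeline(base_url, jar_id, stages):
    """Start every stage as a separate Flink job and collect the job ids."""
    job_ids = []
    for stage in stages:
        request = build_run_request(base_url, jar_id, stage)
        with urllib.request.urlopen(request) as response:
            # Assumption: the response is a JSON object with a "jobid" field.
            job_ids.append(json.load(response)["jobid"])
    return job_ids
```

Calling `start_pipeline("http://localhost:8081", jar_id, stages)` would then submit one job per stage name; the list of stage names is exactly what the script has to discover first.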
However, this script was written a while ago against Flink version "1.4.2". This week I tried to upgrade it to the latest Flink version but I noticed a change in the REST responses. In order to get the "pipeline start" command working, we need to know all the Flink jobs that are in the jar (we call these Flink jobs 'stages') because we need the stage names as arguments for the jar. For the 1.4.2 version we used a dirty trick: we ran the jar with '--list --asException' as program arguments, which basically runs the jar file and immediately throws an exception with the stage names. These are then parsed and used to start every stage separately. The error message that Flink threw looked something like this:
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: Could not run the jar.
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:90)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not run the jar.
    ... 9 more
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:542)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:417)
    at org.apache.flink.client.program.OptimizerPlanEnvironment.getOptimizedPlan(OptimizerPlanEnvironment.java:83)
    at org.apache.flink.client.program.ClusterClient.getOptimizedPlan(ClusterClient.java:334)
    at org.apache.flink.runtime.webmonitor.handlers.JarActionHandler.getJobGraphAndClassLoader(JarActionHandler.java:87)
    at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.lambda$handleJsonRequest$0(JarRunHandler.java:69)
    ... 8 more
Caused by: org.codefeedr.pipeline.PipelineListException: ["org.codefeedr.plugin.twitter.stages.TwitterStatusInput","mongo_tweets","elasticsearch_tweets"]
    at org.codefeedr.pipeline.Pipeline.showList(Pipeline.scala:114)
    at org.codefeedr.pipeline.Pipeline.start(Pipeline.scala:100)
    at nl.wouterr.Main$.main(Main.scala:23)
    at nl.wouterr.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:525)
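For illustration, the parsing step could look roughly like the sketch below. The helper name `parse_stage_names` is hypothetical; the only things taken from the trace are the exception class name `PipelineListException` and the fact that its message is a JSON-style list. It assumes no stage name contains a `]` character.

```python
import json
import re

# Exception class name as it appears in the stack trace.
MARKER = "PipelineListException:"


def parse_stage_names(stacktrace):
    """Extract the list of stage names from the exception message.

    Raises ValueError when the response does not contain the marker,
    which is what happens when the REST response is truncated.
    """
    # Capture the first bracketed list that follows the marker.
    match = re.search(re.escape(MARKER) + r"\s*(\[.*?\])", stacktrace)
    if match is None:
        raise ValueError("no PipelineListException found in the response")
    return json.loads(match.group(1))
```

The `ValueError` branch matters for the upgrade: if the REST response no longer carries the nested "Caused by" entries, the marker is simply absent and the script has nothing to parse.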
However, for 1.7.0 this trick doesn't work anymore because instead of returning the full stack trace, it only returns the following:
org.apache.flink.client.program.ProgramInvocationException: The program caused an error:
In the console of the JobManager it does give the full stack trace though. So first of all I'm wondering if there might be a way to enable more detailed stacktraces for Flink 1.7 in the REST responses. If not, do you have any suggestions on how to tackle this problem? I know, in the end this isn't really a Flink problem; however, you might know a workaround in the Flink REST API to achieve the same.
Some solutions I already considered:
- Running the jar with "--list --asException" locally through the Python script; however, Flink and Scala are not provided in the jar. Technically I could add them both to the classpath, but this would require users to have the Flink jar locally (and also Scala somewhere, but I assume most have).
- Letting users provide a list of stage names for all their (interconnected) Flink jobs. This is not really an option, because the (main) idea behind this framework is to reduce the boilerplate and cumbersomeness of setting up complex stream processing architectures.
Any help is appreciated. Thanks in advance!

Kind regards,
Wouter Zorgdrager
