Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
Hi Zili, Great to hear that! Hope to see the new client soon! Thanks, Biao /'bɪ.aʊ/ On Tue, 24 Sep 2019 at 19:23, Zili Chen wrote: > Actually there is an ongoing client API refactoring on this stuff[1] and > one of the main purpose is > eliminating hijacking env.execute... > > Best, >

Re: Recommended approach to debug this

2019-09-24 Thread Zili Chen
Actually there is an ongoing client API refactoring on this stuff[1] and one of the main purpose is eliminating hijacking env.execute... Best, tison. [1] https://lists.apache.org/x/thread.html/ce99cba4a10b9dc40eb729d39910f315ae41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E Biao Liu

Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
So I believe (I did't test it) the solution for this case is keeping the original exception thrown from `env.execute()` and throwing this exception out of main method. It's a bit tricky, maybe we could have a better design of this scenario. Thanks, Biao /'bɪ.aʊ/ On Tue, 24 Sep 2019 at 18:55,

Re: Recommended approach to debug this

2019-09-24 Thread Biao Liu
The key point of this case is in `PackagedProgram#callMainMethod`. The `ProgramAbortException` is expected when executing the main method here. This `ProgramAbortException` thrown is wrapped with `InvocationTargetException` by Java reflection layer [1]. There is a piece of codes handling

Re: Recommended approach to debug this

2019-09-24 Thread Debasish Ghosh
Well, I think I got the solution though I am not yet sure of the problem .. The original code looked like this .. Try { // from a parent class called Runner which runs a streamlet // run returns an abstraction which completes a Promise depending on whether // the Job was successful or not

Re: Recommended approach to debug this

2019-09-23 Thread Biao Liu
Hi Zili, Thanks for pointing that out. I didn't realize that it's a REST API based case. Debasish's case has been discussed not only in this thread... It's really hard to analyze the case without the full picture. I think the reason of why `ProgramAbortException` is not caught is that he did

Re: Recommended approach to debug this

2019-09-23 Thread Zili Chen
Hi Biao, The log below already infers that the job was submitted via REST API and I don't think it matters. at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$ JarHandlerContext.toJobGraph(JarHandlerUtils.java:126) at

Re: Recommended approach to debug this

2019-09-23 Thread Biao Liu
> We submit the code through Kubernetes Flink Operator which uses the REST API to submit the job to the Job Manager So you are submitting job through REST API, not Flink client? Could you explain more about this? Thanks, Biao /'bɪ.aʊ/ On Tue, 24 Sep 2019 at 03:44, Debasish Ghosh wrote: > Hi

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
Hi Dian - We submit one job through the operator. We just use the following to complete a promise when the job completes .. Try { createLogic.executeStreamingQueries(ctx.env) }.fold( th ⇒ completionPromise.tryFailure(th), _ ⇒ completionPromise.trySuccess(Dun)

Re: Recommended approach to debug this

2019-09-23 Thread Dian Fu
Hi Debasish, In which case will the exception occur? Does it occur when you submit one job at a time or when multiple jobs are submitted at the same time? I'm asking this because I noticed that you used Future to execute the job unblocking. I guess ThreadLocal doesn't work well in this case.

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
Hi tison - Please find my response below in >>. regards. On Mon, Sep 23, 2019 at 6:20 PM Zili Chen wrote: > Hi Debasish, > > The OptimizerPlanEnvironment.ProgramAbortException should be caught at > OptimizerPlanEnvironment#getOptimizedPlan > in its catch (Throwable t) branch. > >> true but

Re: Recommended approach to debug this

2019-09-23 Thread Zili Chen
Hi Debasish, The OptimizerPlanEnvironment.ProgramAbortException should be caught at OptimizerPlanEnvironment#getOptimizedPlan in its catch (Throwable t) branch. It should always throw a ProgramInvocationException instead of OptimizerPlanEnvironment.ProgramAbortException if any exception thrown

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
This is the complete stack trace which we get from execution on Kubernetes using the Flink Kubernetes operator .. The boxed error comes from the fact that we complete a Promise with Success when it returns a JobExecutionResult and with Failure when we get an exception. And here we r getting an

Re: Recommended approach to debug this

2019-09-23 Thread Dian Fu
Regarding to the code you pasted, personally I think nothing is wrong. The problem is how it's executed. As you can see from the implementation of of StreamExecutionEnvironment.getExecutionEnvironment, it may created different StreamExecutionEnvironment implementations under different

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
Can it be the case that the threadLocal stuff in https://github.com/apache/flink/blob/release-1.9/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L1609 does not behave deterministically when we submit job through a Kubernetes Flink

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
ah .. Ok .. I get the Throwable part. I am using import org.apache.flink.streaming.api.scala._ val env = StreamExecutionEnvironment.getExecutionEnvironment How can this lead to a wrong StreamExecutionEnvironment ? Any suggestion ? regards. On Mon, Sep 23, 2019 at 3:53 PM Dian Fu wrote: > Hi

Re: Recommended approach to debug this

2019-09-23 Thread Dian Fu
Hi Debasish, As I said before, the exception is caught in [1]. It catches the Throwable and so it could also catch "OptimizerPlanEnvironment.ProgramAbortException". Regarding to the cause of this exception, I have the same feeling with Tison and I also think that the wrong

Re: Recommended approach to debug this

2019-09-23 Thread Debasish Ghosh
Hi Tison - This is the code that builds the computation graph. readStream reads from Kafka and writeStream writes to Kafka. override def buildExecutionGraph = { val rides: DataStream[TaxiRide] = readStream(inTaxiRide) .filter { ride ⇒ ride.getIsStart().booleanValue }

Re: Recommended approach to debug this

2019-09-22 Thread Vijay Bhaskar
One more suggestion is to run the same job in regular 2 node cluster and see whether you are getting the same exception. So that you can narrow down the issue easily. Regards Bhaskar On Mon, Sep 23, 2019 at 7:50 AM Zili Chen wrote: > Hi Debasish, > > As mentioned by Dian, it is an internal

Re: Recommended approach to debug this

2019-09-22 Thread Zili Chen
Hi Debasish, As mentioned by Dian, it is an internal exception that should be always caught by Flink internally. I would suggest you share the job(abstractly). Generally it is because you use StreamPlanEnvironment/OptimizerPlanEnvironment directly. Best, tison. Austin Cawley-Edwards

Re: Recommended approach to debug this

2019-09-22 Thread Austin Cawley-Edwards
Have you reached out to the FlinkK8sOperator team on Slack? They’re usually pretty active on there. Here’s the link: https://join.slack.com/t/flinkk8soperator/shared_invite/enQtNzIxMjc5NDYxODkxLTEwMThmN2I0M2QwYjM3ZDljYTFhMGRiNDUzM2FjZGYzNTRjYWNmYTE1NzNlNWM2YWM5NzNiNGFhMTkxZjA4OGU Best, Austin

Re: Recommended approach to debug this

2019-09-22 Thread Debasish Ghosh
The problem is I am submitting Flink jobs to Kubernetes cluster using a Flink Operator. Hence it's difficult to debug in the traditional sense of the term. And all I get is the exception that I reported .. Caused by: org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException

Re: Recommended approach to debug this

2019-09-20 Thread Debasish Ghosh
Thanks for the pointer .. I will try debugging. I am getting this exception running my application on Kubernetes using the Flink operator from Lyft. regards. On Sat, 21 Sep 2019 at 6:44 AM, Dian Fu wrote: > This exception is used internally to get the plan of a job before > submitting it for

Re: Recommended approach to debug this

2019-09-20 Thread Dian Fu
This exception is used internally to get the plan of a job before submitting it for execution. It's thrown with special purpose and will be caught internally in [1] and will not be thrown to end users usually. You could check the following places to find out the cause to this problem: 1. Check

Recommended approach to debug this

2019-09-20 Thread Debasish Ghosh
Hi - When you get an exception stack trace like this .. Caused by: org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException at org.apache.flink.streaming.api.environment.StreamPlanEnvironment.execute(StreamPlanEnvironment.java:66) at