Hi Dongwon,

This should work but it could also interfere with Flink itself exiting in
case of a fatal error.
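
To illustrate the trade-off, here is a minimal sketch (not Flink code) of a
SecurityManager that rejects System.exit() only while user code runs on the
current thread, so the process can still exit on its own otherwise. The class
name UserExitGuard and the methods install() and runGuarded() are made up for
this example. Threads started by the user code are not covered, and
SecurityManager is deprecated since Java 17; newer JDKs restrict or refuse
installing one.

```java
import java.security.Permission;

/** Hypothetical guard: blocks System.exit() only while user code runs on this thread. */
class UserExitGuard extends SecurityManager {

    // True only on threads that are currently inside runGuarded().
    private static final ThreadLocal<Boolean> IN_USER_CODE =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    @Override
    public void checkExit(int status) {
        if (IN_USER_CODE.get()) {
            throw new SecurityException("System.exit(" + status + ") blocked in user code");
        }
        // Outside user code the exit is allowed, e.g. for a fatal-error shutdown.
    }

    @Override
    public void checkPermission(Permission perm) {
        // Permit everything else; this guard only cares about exit.
    }

    @Override
    public void checkPermission(Permission perm, Object context) {
        // Same as above for the context-aware overload.
    }

    /** Installs the guard for the whole JVM (assumes the JDK still allows this). */
    static void install() {
        System.setSecurityManager(new UserExitGuard());
    }

    /** Runs user code with exit interception enabled on the current thread. */
    static void runGuarded(Runnable userMain) {
        IN_USER_CODE.set(Boolean.TRUE);
        try {
            userMain.run();
        } finally {
            IN_USER_CODE.set(Boolean.FALSE);
        }
    }
}
```

The idea would be to call UserExitGuard.install() once at startup and wrap the
invocation of the uploaded program's main() in runGuarded(...), catching the
SecurityException and reporting it as a failed submission.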

Regards,
Roman


On Fri, Dec 6, 2019 at 2:54 AM Dongwon Kim <eastcirc...@gmail.com> wrote:

> FYI, we've launched a session cluster where multiple jobs are managed by a
> single job manager. If one job calls System.exit(), all the other jobs also
> fail because the job manager shuts down and the task managers fall into
> chaos (failing to connect to the job manager).
>
> I just searched for a way to prevent System.exit() calls from terminating
> JVMs and found [1]. Could it be a possible solution to the problem?
>
> [1]
> https://stackoverflow.com/questions/5549720/how-to-prevent-calls-to-system-exit-from-terminating-the-jvm
>
> Best,
> - Dongwon
>
> On Fri, Dec 6, 2019 at 10:39 AM Dongwon Kim <eastcirc...@gmail.com> wrote:
>
>> Hi Robert and Roman,
>>
>> Thank you for taking a look at this.
>>
>>> what is your main() method / client doing when it's receiving wrong
>>> program parameters? Does it call System.exit(), or something like that?
>>>
>>
>> I just found that our HTTP client is programmed to call System.exit(1). I
>> should advise against calling System.exit() in Flink applications.
>>
>> p.s. Just out of curiosity, is there no way for the web app to intercept
>> System.exit() and prevent the job manager from being shut down?
>>
>> Best,
>>
>> - Dongwon
>>
>> On Fri, Dec 6, 2019 at 3:59 AM Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi Dongwon,
>>>
>>> what is your main() method / client doing when it's receiving wrong
>>> program parameters? Does it call System.exit(), or something like that?
>>>
>>> By the way, the http address from the error message is
>>> publicly available. Not sure if this is internal data or not.
>>>
>>> On Thu, Dec 5, 2019 at 6:32 PM Khachatryan Roman <
>>> khachatryan.ro...@gmail.com> wrote:
>>>
>>>> Hi Dongwon,
>>>>
>>>> I wasn't able to reproduce your problem with Flink JobManager 1.9.1
>>>> with various kinds of errors in the job.
>>>> I suggest you try it on a fresh Flink installation without any other
>>>> jobs submitted.
>>>>
>>>> Regards,
>>>> Roman
>>>>
>>>>
>>>> On Thu, Dec 5, 2019 at 3:48 PM Dongwon Kim <eastcirc...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Roman,
>>>>>
>>>>> We're using the latest version 1.9.1 and those two lines are all I've
>>>>> seen after executing the job on the web ui.
>>>>>
>>>>> Best,
>>>>>
>>>>> Dongwon
>>>>>
>>>>> On Thu, Dec 5, 2019 at 11:36 PM r_khachatryan <
>>>>> khachatryan.ro...@gmail.com> wrote:
>>>>>
>>>>>> Hi Dongwon,
>>>>>>
>>>>>> Could you please provide the Flink version you are running and the
>>>>>> job manager logs?
>>>>>>
>>>>>> Regards,
>>>>>> Roman
>>>>>>
>>>>>>
>>>>>> eastcirclek wrote
>>>>>> > Hi,
>>>>>> >
>>>>>> > I tried to run a program by uploading a jar on the Flink UI. When I
>>>>>> > intentionally enter a wrong parameter to my program, JobManager dies.
>>>>>> > Below are all the log messages I can get from JobManager; JobManager
>>>>>> > dies as soon as it prints the second line:
>>>>>> >
>>>>>> >> 2019-12-05 04:47:58,623 WARN  org.apache.flink.runtime.webmonitor.handlers.JarRunHandler    - Configuring the job submission via query parameters is deprecated. Please migrate to submitting a JSON request instead.
>>>>>> >>
>>>>>> >> 2019-12-05 04:47:59,133 ERROR com.skt.apm.http.HTTPClient                   - Cannot connect: http://52.141.38.11:8380/api/spec/poc_asset_model_01/model/imbalance/models: com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize instance of `java.util.ArrayList` out of START_OBJECT token at [Source: (String)"{"code":"GB0001","resource":"msg.comm.unknown.error","details":"NullPointerException: "}"; line: 1, column: 1]
>>>>>> >> 2019-12-05 04:47:59,166 INFO  org.apache.flink.runtime.blob.BlobServer                      - Stopped BLOB server at 0.0.0.0:6124
>>>>>> >
>>>>>> >
>>>>>> > The second line is obviously from my program and it shouldn't cause
>>>>>> > JobManager to be shut down. Is it intended behavior?
>>>>>> >
>>>>>> > Best,
>>>>>> >
>>>>>> > Dongwon
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent from:
>>>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>>>>>
>>>>>
