Re: Old Flink jobs restarting on Job Manager failover

2018-04-15 Thread Gary Yao
Hi Steve,

What is the Flink version you are using?

Jobs are recovered from metadata stored in ZooKeeper. The behavior you describe
indicates that the submitted job graph was not deleted from ZooKeeper. By
default, the jobs that should be running/recovered are stored under the znode:

  /flink/default/jobgraphs

Can you check if the job id is still present in ZK after the cancelation? If
it is, there should be relevant warnings or errors in the JobManager log that
help to debug why the deletion failed.
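
To make the check concrete, here is a rough Python sketch of the comparison,
not an official Flink tool. It assumes you have already listed the child nodes
of the jobgraphs znode with whatever ZooKeeper client you use. The function
name, its arguments, and the sample job ids are all made up for illustration.

```python
# Sketch only: the path is the default mentioned above and may differ if the
# HA settings were changed. find_stale_job_graphs is a hypothetical helper.
FLINK_JOBGRAPHS_PATH = "/flink/default/jobgraphs"


def find_stale_job_graphs(jobgraph_children, cancelled_job_ids):
    """Return cancelled job ids that still have a job graph entry in ZooKeeper.

    jobgraph_children: child node names listed under FLINK_JOBGRAPHS_PATH
    cancelled_job_ids: ids of jobs that were cancelled and should be gone
    """
    # Normalize both sides so stray whitespace or case does not hide a match.
    present = {name.strip().lower() for name in jobgraph_children}
    cancelled = {job_id.strip().lower() for job_id in cancelled_job_ids}
    return sorted(cancelled & present)


if __name__ == "__main__":
    # Made-up ids for illustration; real ones come from the web UI or the logs.
    children = [
        "aaaa0000aaaa0000aaaa0000aaaa0000",
        "bbbb1111bbbb1111bbbb1111bbbb1111",
    ]
    cancelled = [
        "bbbb1111bbbb1111bbbb1111bbbb1111",
        "cccc2222cccc2222cccc2222cccc2222",
    ]
    for job_id in find_stale_job_graphs(children, cancelled):
        print(f"{FLINK_JOBGRAPHS_PATH}/{job_id} still exists")
```

The child list can be obtained, for example, with `ls /flink/default/jobgraphs`
in the ZooKeeper command line client (zkCli.sh). Any id this reports is a job
that a newly elected JobManager would likely try to recover.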

Best,
Gary

On Thu, Apr 12, 2018 at 6:01 PM,  wrote:

> Hi all,
>
> We have a Flink environment using ZooKeeper to manage the cluster. The
> high availability option is set up with the high-availability.storageDir
> parameter set to a shared directory on NAS; this is available to all nodes.
>
> When ZooKeeper fails over to the standby JobManager during a cluster
> change, we see old jobs that have long been cancelled being restarted
> automatically by Flink. It seems like the standby JobManager is
> reconnecting with old configuration and old job details.
>
> I can’t see anything in the log that gives any indication of why this old
> job is restarting. I have noticed that the blob.storage.directory is set
> to a local directory.
>
> Are there any other settings in Flink that might cause a JobManager to
> restart against an old local state rather than the latest shared state?
>
> Thanks,
>
> Steve


RE: Old Flink jobs restarting on Job Manager failover

2018-04-12 Thread Stephen.Hesketh
Hi all,

We have a Flink environment using ZooKeeper to manage the cluster. The high
availability option is set up with the high-availability.storageDir parameter
set to a shared directory on NAS; this is available to all nodes.

When ZooKeeper fails over to the standby JobManager during a cluster change, we
see old jobs that have long been cancelled being restarted automatically by
Flink. It seems like the standby JobManager is reconnecting with old
configuration and old job details.

I can't see anything in the log that gives any indication of why this old job
is restarting. I have noticed that the blob.storage.directory is set to a local
directory.

Are there any other settings in Flink that might cause a JobManager to restart
against an old local state rather than the latest shared state?

Thanks,

Steve



Stephen Hesketh
Reporting Shared Services, NatWest Markets
250 Bishopsgate, London EC2M 4AA
Office: +44 (0)20 7678 1482 (internal 381482) | Mobile: +44 (0)7968 039848

**

NatWest Markets is a marketing name of The Royal Bank of Scotland plc. 

This communication and any attachments are confidential and intended solely for 
the addressee. If you are not the intended recipient please advise us 
immediately and delete it. Unless specifically stated in the message or 
otherwise indicated, you may not duplicate, redistribute or forward this 
message and any attachments are not intended for distribution to, or use by any 
person or entity in any jurisdiction or country where such distribution or use 
would be contrary to local law or regulation. The Royal Bank Of Scotland plc or 
any affiliated entity ("RBS") accepts no responsibility for any changes made to 
this message after it was sent.

Unless otherwise specifically indicated, the contents of this communication and 
its attachments are for information purposes only and should not be regarded as 
an offer or solicitation to buy or sell a product or service, confirmation of 
any transaction, a valuation, indicative price or an official statement. This 
communication has been prepared by the RBS trading desk, which may have a 
position or interest in the products or services mentioned that is inconsistent 
with any views expressed in this message. In evaluating the information 
contained in this message, you should know that it could have been previously 
provided to other clients and/or internal RBS personnel, who could have already 
acted on it.

RBS cannot provide absolute assurances that all electronic communications (sent 
or received) are secure, error free, not corrupted, incomplete or virus free 
and/or that they will not be lost, mis-delivered, destroyed, delayed or 
intercepted/decrypted by others. Therefore RBS disclaims all liability with 
regards to electronic communications (and the contents therein) if they are 
corrupted, lost destroyed, delayed, incomplete, mis-delivered, intercepted, 
decrypted or otherwise misappropriated by others.

Any electronic communication that is conducted within or through RBS systems 
will be subject to being archived, monitored and produced to regulators and in 
litigation in accordance with RBS's policy and local laws, rules and 
regulations. Unless expressly prohibited by local law, electronic 
communications may be archived in countries other than the country in which you 
are located, and may be treated in accordance with the laws and regulations of 
the country of each individual included in the entire chain.

Copyright 2014 The Royal Bank of Scotland plc. All rights reserved. See 
http://www.natwestmarkets.com/legal/s-t-discl.html for further risk disclosure.

**