Hi Till,

I will have to test it with Flink 1.7.1 and get back to you. Thanks!

Best,
Ethan


> On Feb 15, 2019, at 4:01 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
> Hi Ethan,
> 
> can you observe a similar behaviour with Flink 1.7.1? Flink 1.4.2 is no 
> longer supported by the community.
> 
> Cheers,
> Till
> 
> On Thu, Feb 14, 2019 at 5:06 PM Ethan Li <ethanopensou...@gmail.com> wrote:
> The related JobManager log is
> https://gist.github.com/Ethanlm/86a10e786ad9025ddaa27c113c536da8
> 
>> On Feb 14, 2019, at 9:40 AM, Ethan Li <ethanopensou...@gmail.com> wrote:
>> 
>> Hello,
>> 
>> I have a standalone flink-1.4.2 cluster with one JobManager, one
>> TaskManager, and ZooKeeper. I first started the JM and TM and waited for
>> them to be stable. Then I restarted the JM. That is when the TM got confused.
>> 
>> The TM was notified that the leader node had changed and tried to register
>> with the new leader (the new RPC port is 34561). It then received an
>> acknowledgement saying it was already registered, but it kept trying to
>> associate with the old JM RPC port (35213) and failing.
>> 
>> 2019-02-14 14:56:54,059 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Trying to register at JobManager akka.ssl.tcp://fl...@openstorm10blue-n1.blue.ygrid.yahoo.com:34561/user/jobmanager (attempt 1, timeout: 500 milliseconds)
>> 2019-02-14 14:56:54,157 DEBUG org.apache.flink.shaded.akka.org.jboss.netty.handler.ssl.SslHandler  - [id: 0x77ac93ae, /10.215.68.243:46796 => openstorm10blue-n1.blue.ygrid.yahoo.com/10.215.68.98:34561] HANDSHAKEN: TLS_RSA_WITH_AES_128_CBC_SHA
>> 2019-02-14 14:56:54,276 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Successful registration at JobManager (akka.ssl.tcp://fl...@openstorm10blue-n1.blue.ygrid.yahoo.com:34561/user/jobmanager), starting network stack and library cache.
>> 2019-02-14 14:56:54,276 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Determined BLOB server address to be openstorm10blue-n1.blue.ygrid.yahoo.com/10.215.68.98:50100. Starting BLOB cache.
>> 2019-02-14 14:56:54,278 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Created BLOB cache storage directory /home/y/var/flink/blobstorage/blobStore-927b523f-f3ff-4ccc-83a0-362e09a3b858
>> 2019-02-14 14:56:54,279 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Created BLOB cache storage directory /home/y/var/flink/blobstorage/blobStore-8492465e-0e94-4792-a346-66e6da299f7a
>> 2019-02-14 14:56:54,572 DEBUG org.apache.flink.runtime.taskmanager.TaskManager              - TaskManager was triggered to register at JobManager, but is already registered
>> 2019-02-14 14:56:56,359 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: Connection refused: openstorm10blue-n1.blue.ygrid.yahoo.com/10.215.68.98:35213
>> 2019-02-14 14:56:56,360 DEBUG org.apache.flink.runtime.taskmanager.TaskManager              - The association error event's root cause is not of type InvalidAssociationException.
>> 
>> 
>> 
>> Full TaskManager log:
>> https://gist.github.com/Ethanlm/e6f1b29d27d26813f5f8f40cd2c12643
>> 
>> 
>> Is this expected or is this a bug? 
>> 
>> Thank you!
>> 
>> Ethan
> 
