Re: Cannot start native K8s

2020-05-28 Thread Yang Wang
A quick update on this issue.

The root cause of this issue is a compatibility problem between the fabric8
kubernetes-client and Java 8u252 [1]. We have bumped the fabric8
kubernetes-client version from 4.5.2 to 4.9.2 on the master and release-1.11
branches. Users can now deploy Flink on K8s natively with Java 8u252.

If you really cannot use the latest Flink version, you can set the
environment variable "HTTP2_DISABLE=true" on the Flink client, JobManager,
and TaskManager side (a sketch follows the link below).

[1]. https://github.com/fabric8io/kubernetes-client/issues/2212
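A minimal sketch of that workaround, assuming the HTTP2_DISABLE variable is
honoured by the bundled okhttp/fabric8 client as described in [1], and that
the containerized.master.env.* / containerized.taskmanager.env.* option
prefixes forward environment variables to the JobManager and TaskManager
containers (please verify both assumptions against your Flink version):

  # on the client machine
  export HTTP2_DISABLE=true
  ./bin/kubernetes-session.sh \
    -Dcontainerized.master.env.HTTP2_DISABLE=true \
    -Dcontainerized.taskmanager.env.HTTP2_DISABLE=true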

Best,
Yang

Yang Wang  wrote on Mon, May 11, 2020 at 11:51 AM:

> Glad to hear that you could deploy the Flink cluster on K8s natively.
> Thanks for trying the in-preview feature and giving your feedback.
>
>
> Moreover, I want to give a very simple conclusion here. Currently, because
> of a compatibility issue in the fabric8 kubernetes-client, the native K8s
> integration has the following known limitations:
> * For JDK 8u252, the native K8s integration only works on Kubernetes
> v1.16 and lower versions.
> * For other JDK versions (e.g. 8u242, JDK 11), I am not aware of the same
> issues. The native K8s integration works well.
>
>
> Best,
> Yang
>
> Dongwon Kim  wrote on Sat, May 9, 2020 at 11:46 AM:
>
>> Hi Yang,
>>
>> Oops, I forgot to copy /etc/kube/admin.conf to $HOME/.kube/config so that
>> the current user account can access K8s.
>> Now that I copied it, I found that kubernetes-session.sh is working fine.
>> Thanks very much!
>>
>> Best,
>> Dongwon
>>
>> [flink@DAC-E04-W06 ~]$ kubernetes-session.sh
>> 2020-05-09 12:43:49,961 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.rpc.address, DAC-E04-W06
>> 2020-05-09 12:43:49,962 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2020-05-09 12:43:49,962 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.heap.size, 1024m
>> 2020-05-09 12:43:49,962 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: taskmanager.memory.process.size, 24g
>> 2020-05-09 12:43:49,963 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: taskmanager.numberOfTaskSlots, 24
>> 2020-05-09 12:43:49,963 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: parallelism.default, 1
>> 2020-05-09 12:43:49,963 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability, zookeeper
>> 2020-05-09 12:43:49,963 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.zookeeper.path.root, /flink
>> 2020-05-09 12:43:49,964 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
>> 2020-05-09 12:43:49,964 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
>> 2020-05-09 12:43:49,965 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.execution.failover-strategy, region
>> 2020-05-09 12:43:49,965 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: rest.port, 8082
>> 2020-05-09 12:43:51,122 INFO
>>  org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
>> derived from fraction jvm overhead memory (2.400gb (2576980416 bytes)) is
>> greater than its max value 1024.000mb (1073741824 bytes), max value will be
>> used instead
>> 2020-05-09 12:43:51,123 INFO
>>  org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
>> derived from fraction network memory (2.291gb (2459539902 bytes)) is
>> greater than its max value 1024.000mb (1073741824 bytes), max value will be
>> used instead
>> 2020-05-09 12:43:51,131 INFO
>>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
>> deployment requires a fixed port. Configuration blob.server.port will be
>> set to 6124
>> 2020-05-09 12:43:51,131 INFO
>>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
>> deployment requires a fixed port. Configuration taskmanager.rpc.port will
>> be set to 6122
>> 2020-05-09 12:43:51,134 INFO
>>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
>> deployment requires a fixed port. Configuration
>> high-availability.jobmanager.port will be set to 6123
>> 2020-05-09 12:43:52,167 INFO
>>  org.apache.flink.kubernetes.KubernetesClusterDescriptor   - Create
>> flink session cluster flink-cluster-4a82d41b-af15-4205-8a44-62351e270242
>> successfully, JobManager Web Interface: 

Re: Cannot start native K8s

2020-05-10 Thread Yang Wang
Glad to hear that you could deploy the Flink cluster on K8s natively.
Thanks for trying the in-preview feature and giving your feedback.


Moreover, I want to give a very simple conclusion here. Currently, because
of a compatibility issue in the fabric8 kubernetes-client, the native K8s
integration has the following known limitations (a quick version check is
sketched after the list):
* For JDK 8u252, the native K8s integration only works on Kubernetes
v1.16 and lower versions.
* For other JDK versions (e.g. 8u242, JDK 11), I am not aware of the same
issues. The native K8s integration works well.
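A quick way to check which of the two cases applies to your environment
(standard java/kubectl commands; output formats vary slightly by version):

  java -version              # 1.8.0_252 is the affected build
  kubectl version --short    # shows client and server Kubernetes versions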


Best,
Yang

Dongwon Kim  wrote on Sat, May 9, 2020 at 11:46 AM:

> Hi Yang,
>
> Oops, I forgot to copy /etc/kube/admin.conf to $HOME/.kube/config so that
> the current user account can access K8s.
> Now that I copied it, I found that kubernetes-session.sh is working fine.
> Thanks very much!
>
> Best,
> Dongwon
>
> [flink@DAC-E04-W06 ~]$ kubernetes-session.sh
> 2020-05-09 12:43:49,961 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.address, DAC-E04-W06
> 2020-05-09 12:43:49,962 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.port, 6123
> 2020-05-09 12:43:49,962 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.heap.size, 1024m
> 2020-05-09 12:43:49,962 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.memory.process.size, 24g
> 2020-05-09 12:43:49,963 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.numberOfTaskSlots, 24
> 2020-05-09 12:43:49,963 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: parallelism.default, 1
> 2020-05-09 12:43:49,963 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability, zookeeper
> 2020-05-09 12:43:49,963 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.path.root, /flink
> 2020-05-09 12:43:49,964 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
> 2020-05-09 12:43:49,964 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
> 2020-05-09 12:43:49,965 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.execution.failover-strategy, region
> 2020-05-09 12:43:49,965 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: rest.port, 8082
> 2020-05-09 12:43:51,122 INFO
>  org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
> derived from fraction jvm overhead memory (2.400gb (2576980416 bytes)) is
> greater than its max value 1024.000mb (1073741824 bytes), max value will be
> used instead
> 2020-05-09 12:43:51,123 INFO
>  org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
> derived from fraction network memory (2.291gb (2459539902 bytes)) is
> greater than its max value 1024.000mb (1073741824 bytes), max value will be
> used instead
> 2020-05-09 12:43:51,131 INFO
>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
> deployment requires a fixed port. Configuration blob.server.port will be
> set to 6124
> 2020-05-09 12:43:51,131 INFO
>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
> deployment requires a fixed port. Configuration taskmanager.rpc.port will
> be set to 6122
> 2020-05-09 12:43:51,134 INFO
>  org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
> deployment requires a fixed port. Configuration
> high-availability.jobmanager.port will be set to 6123
> 2020-05-09 12:43:52,167 INFO
>  org.apache.flink.kubernetes.KubernetesClusterDescriptor   - Create
> flink session cluster flink-cluster-4a82d41b-af15-4205-8a44-62351e270242
> successfully, JobManager Web Interface: http://cluster-endpoint:31513
>
>
> On Sat, May 9, 2020 at 12:29 PM Yang Wang  wrote:
>
>> Hi Dongwon Kim,
>>
>> Thanks a lot for your information. I will dig into this issue.
>>
>> I think the "UnknownHostException" is caused by incorrectly setting the
>> Kubernetes
>> ApiServer address. Maybe you are using "kubernetes.default.svc". However,
>> it
>> could not be accessed outside of the Kubernetes cluster. You need to
>> configure
>> a correct ip/hostname for ApiServer address, which could be accessed in
>> your
>> local environment. You could use `kubectl auth can-i create pods` to
>> verify
>> whether the kube config is correct.
>>
>> BTW, currently we only find the flink on native 

Re: Cannot start native K8s

2020-05-08 Thread Dongwon Kim
Hi Yang,

Oops, I forgot to copy /etc/kube/admin.conf to $HOME/.kube/config so that
the current user account can access K8s.
Now that I copied it, I found that kubernetes-session.sh is working fine.
Thanks very much!
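For completeness, the copy step as a minimal sketch (the admin.conf path is
the one from my setup; kubeadm installations may place it elsewhere):

  mkdir -p $HOME/.kube
  sudo cp /etc/kube/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
  kubectl get nodes    # verify that the current user can reach the cluster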

Best,
Dongwon

[flink@DAC-E04-W06 ~]$ kubernetes-session.sh
2020-05-09 12:43:49,961 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.rpc.address, DAC-E04-W06
2020-05-09 12:43:49,962 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.rpc.port, 6123
2020-05-09 12:43:49,962 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.heap.size, 1024m
2020-05-09 12:43:49,962 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: taskmanager.memory.process.size, 24g
2020-05-09 12:43:49,963 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: taskmanager.numberOfTaskSlots, 24
2020-05-09 12:43:49,963 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: parallelism.default, 1
2020-05-09 12:43:49,963 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability, zookeeper
2020-05-09 12:43:49,963 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.zookeeper.path.root, /flink
2020-05-09 12:43:49,964 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
2020-05-09 12:43:49,964 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
2020-05-09 12:43:49,965 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-05-09 12:43:49,965 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: rest.port, 8082
2020-05-09 12:43:51,122 INFO
 org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
derived from fraction jvm overhead memory (2.400gb (2576980416 bytes)) is
greater than its max value 1024.000mb (1073741824 bytes), max value will be
used instead
2020-05-09 12:43:51,123 INFO
 org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils  - The
derived from fraction network memory (2.291gb (2459539902 bytes)) is
greater than its max value 1024.000mb (1073741824 bytes), max value will be
used instead
2020-05-09 12:43:51,131 INFO
 org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
deployment requires a fixed port. Configuration blob.server.port will be
set to 6124
2020-05-09 12:43:51,131 INFO
 org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
deployment requires a fixed port. Configuration taskmanager.rpc.port will
be set to 6122
2020-05-09 12:43:51,134 INFO
 org.apache.flink.kubernetes.utils.KubernetesUtils - Kubernetes
deployment requires a fixed port. Configuration
high-availability.jobmanager.port will be set to 6123
2020-05-09 12:43:52,167 INFO
 org.apache.flink.kubernetes.KubernetesClusterDescriptor   - Create
flink session cluster flink-cluster-4a82d41b-af15-4205-8a44-62351e270242
successfully, JobManager Web Interface: http://cluster-endpoint:31513


On Sat, May 9, 2020 at 12:29 PM Yang Wang  wrote:

> Hi Dongwon Kim,
>
> Thanks a lot for your information. I will dig into this issue.
>
> I think the "UnknownHostException" is caused by incorrectly setting the
> Kubernetes
> ApiServer address. Maybe you are using "kubernetes.default.svc". However,
> it
> could not be accessed outside of the Kubernetes cluster. You need to
> configure
> a correct ip/hostname for ApiServer address, which could be accessed in
> your
> local environment. You could use `kubectl auth can-i create pods` to verify
> whether the kube config is correct.
>
> BTW, currently we only find the flink on native K8s could not work on
> 8u252. For
> 8u242 and lower version, it works well.
>
>
> Best,
> Yang
>
> Dongwon Kim  wrote on Sat, May 9, 2020 at 10:43 AM:
>
>> Hello Yang,
>>
>> I'm using K8s v1.18.2 installed by Kubeadm over a cluster of 5 nodes (not
>> a Minikube).
>> Previously, as you pointed out, openjdk version "1.8.0_252" was installed.
> I bumped the Java version up to openjdk 11.0.7 but got something different:
>>
>> [flink@DAC-E04-W06 bin]$ ./kubernetes-session.sh
>> 2020-05-09 11:39:36,737 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.rpc.address, DAC-E04-W06
>> 2020-05-09 11:39:36,739 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: 

Re: Cannot start native K8s

2020-05-08 Thread Yang Wang
Hi Dongwon Kim,

Thanks a lot for your information. I will dig into this issue.

I think the "UnknownHostException" is caused by incorrectly setting the
Kubernetes
ApiServer address. Maybe you are using "kubernetes.default.svc". However, it
could not be accessed outside of the Kubernetes cluster. You need to
configure
a correct ip/hostname for ApiServer address, which could be accessed in your
local environment. You could use `kubectl auth can-i create pods` to verify
whether the kube config is correct.
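A minimal check of the local kube config before running kubernetes-session.sh
(standard kubectl commands; the printed server URL must be reachable from
your machine, not the in-cluster name kubernetes.default.svc):

  kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'
  kubectl auth can-i create pods      # should print "yes"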

BTW, currently we have only found that Flink on native K8s does not work on
8u252. For 8u242 and lower versions, it works well.


Best,
Yang

Dongwon Kim  wrote on Sat, May 9, 2020 at 10:43 AM:

> Hello Yang,
>
> I'm using K8s v1.18.2 installed by Kubeadm over a cluster of 5 nodes (not
> a Minikube).
> Previously, as you pointed out, openjdk version "1.8.0_252" was installed.
> I bumped the Java version up to openjdk 11.0.7 but got something different:
>
> [flink@DAC-E04-W06 bin]$ ./kubernetes-session.sh
> 2020-05-09 11:39:36,737 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.address, DAC-E04-W06
> 2020-05-09 11:39:36,739 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.port, 6123
> 2020-05-09 11:39:36,739 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.heap.size, 1024m
> 2020-05-09 11:39:36,739 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.memory.process.size, 24g
> 2020-05-09 11:39:36,739 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.numberOfTaskSlots, 24
> 2020-05-09 11:39:36,739 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: parallelism.default, 1
> 2020-05-09 11:39:36,740 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability, zookeeper
> 2020-05-09 11:39:36,740 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.path.root, /flink
> 2020-05-09 11:39:36,740 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
> 2020-05-09 11:39:36,740 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
> 2020-05-09 11:39:36,741 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.execution.failover-strategy, region
> 2020-05-09 11:39:36,741 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: rest.port, 8082
> 2020-05-09 11:39:36,817 WARN  io.fabric8.kubernetes.client.Config
>   - Error reading service account token from:
> [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
> 2020-05-09 11:39:36,823 WARN  io.fabric8.kubernetes.client.Config
>   - Error reading service account token from:
> [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
> 2020-05-09 11:39:37,080 WARN  io.fabric8.kubernetes.client.Config
>   - Error reading service account token from:
> [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
> 2020-05-09 11:39:37,082 WARN  io.fabric8.kubernetes.client.Config
>   - Error reading service account token from:
> [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
> 2020-05-09 11:39:37,334 ERROR
> org.apache.flink.kubernetes.cli.KubernetesSessionCli  - Error while
> running the Flink session.
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]
>  for kind: [Service]  with name:
> [flink-cluster-6adb7c62-8940-4828-990c-a87379102d61]  in namespace:
> [default]  failed.
> at
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
> at
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
> at
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
> at
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
> at
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
> at
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
> at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
> at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
> at
> 

Re: Cannot start native K8s

2020-05-08 Thread Dongwon Kim
Hello Yang,

I'm using K8s v1.18.2 installed by Kubeadm over a cluster of 5 nodes (not a
Minikube).
Previously, as you pointed out, openjdk version "1.8.0_252" was installed.
I bumped the Java version up to openjdk 11.0.7 but got something different:

[flink@DAC-E04-W06 bin]$ ./kubernetes-session.sh
2020-05-09 11:39:36,737 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.rpc.address, DAC-E04-W06
2020-05-09 11:39:36,739 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.rpc.port, 6123
2020-05-09 11:39:36,739 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.heap.size, 1024m
2020-05-09 11:39:36,739 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: taskmanager.memory.process.size, 24g
2020-05-09 11:39:36,739 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: taskmanager.numberOfTaskSlots, 24
2020-05-09 11:39:36,739 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: parallelism.default, 1
2020-05-09 11:39:36,740 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability, zookeeper
2020-05-09 11:39:36,740 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.zookeeper.path.root, /flink
2020-05-09 11:39:36,740 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
2020-05-09 11:39:36,740 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
2020-05-09 11:39:36,741 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-05-09 11:39:36,741 INFO
 org.apache.flink.configuration.GlobalConfiguration- Loading
configuration property: rest.port, 8082
2020-05-09 11:39:36,817 WARN  io.fabric8.kubernetes.client.Config
- Error reading service account token from:
[/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:36,823 WARN  io.fabric8.kubernetes.client.Config
- Error reading service account token from:
[/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,080 WARN  io.fabric8.kubernetes.client.Config
- Error reading service account token from:
[/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,082 WARN  io.fabric8.kubernetes.client.Config
- Error reading service account token from:
[/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
2020-05-09 11:39:37,334 ERROR
org.apache.flink.kubernetes.cli.KubernetesSessionCli  - Error while
running the Flink session.
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]
 for kind: [Service]  with name:
[flink-cluster-6adb7c62-8940-4828-990c-a87379102d61]  in namespace:
[default]  failed.
at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
at
org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
at
org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
at
org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
at
org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
at
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at
org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Name or
service not known
at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at
java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(InetAddress.java:929)
at
java.base/java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1515)
at
java.base/java.net.InetAddress$NameServiceAddresses.get(InetAddress.java:848)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1505)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1364)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1298)
at 

Re: Cannot start native K8s

2020-05-08 Thread Yang Wang
Hi Dongwon Kim,

Are you running Flink on a Minikube or a real Kubernetes cluster? I just
could not reproduce it in a real Kubernetes cluster with Java 8u252. For
Minikube, I get the same exception as you do.


Best,
Yang

Yang Wang  wrote on Wed, May 6, 2020 at 9:29 AM:

> Hi Dongwon Kim,
>
> I think it is a known issue. The native Kubernetes integration does not
> work with JDK 8u252 due to an okhttp issue [1]. Currently, you can upgrade
> your JDK to a newer version as a workaround.
>
>
> [1]. https://issues.apache.org/jira/browse/FLINK-17416
>
> Dongwon Kim  wrote on Wed, May 6, 2020 at 7:15 AM:
>
>> Hi,
>>
>> I'm using Flink-1.10 and tested everything [1] successfully.
>> While trying [2], I got the following message.
>> Can anyone help please?
>>
>> [root@DAC-E04-W06 bin]# ./kubernetes-session.sh
>>> 2020-05-06 08:10:49,411 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: jobmanager.rpc.address, DAC-E04-W06
>>> 2020-05-06 08:10:49,412 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: jobmanager.rpc.port, 6123
>>> 2020-05-06 08:10:49,412 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: jobmanager.heap.size, 1024m
>>> 2020-05-06 08:10:49,412 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: taskmanager.memory.process.size, 24g
>>> 2020-05-06 08:10:49,413 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: taskmanager.numberOfTaskSlots, 24
>>> 2020-05-06 08:10:49,413 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: parallelism.default, 1
>>> 2020-05-06 08:10:49,413 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: high-availability, zookeeper
>>> 2020-05-06 08:10:49,413 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: high-availability.zookeeper.path.root, /flink
>>> 2020-05-06 08:10:49,414 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
>>> 2020-05-06 08:10:49,414 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
>>> 2020-05-06 08:10:49,414 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: jobmanager.execution.failover-strategy, region
>>> 2020-05-06 08:10:49,415 INFO
>>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>>> configuration property: rest.port, 8082
>>> 2020-05-06 08:10:50,386 ERROR
>>> org.apache.flink.kubernetes.cli.KubernetesSessionCli  - Error while
>>> running the Flink session.
>>> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]
>>>  for kind: [Service]  with name:
>>> [flink-cluster-5c12bd50-a540-4614-96d0-549785a8bc62]  in namespace:
>>> [default]  failed.
>>> at
>>> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>>> at
>>> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
>>> at
>>> io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
>>> at
>>> io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
>>> at
>>> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
>>> at
>>> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
>>> at
>>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
>>> at
>>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
>>> at
>>> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>>> at
>>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
>>> Caused by: java.net.SocketException: Broken pipe (Write failed)
>>> at java.net.SocketOutputStream.socketWrite0(Native Method)
>>> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
>>> at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>>> at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
>>> at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
>>> at
>>> sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
>>> at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
>>> at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
>>> at 

Re: Cannot start native K8s

2020-05-05 Thread Yang Wang
Hi Dongwon Kim,

I think it is a known issue. The native Kubernetes integration does not work
with JDK 8u252 due to an okhttp issue [1]. Currently, you can upgrade your
JDK to a newer version as a workaround (a sketch follows the link below).


[1]. https://issues.apache.org/jira/browse/FLINK-17416
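A minimal sketch of the JDK workaround for the machine running the Flink
client (the JDK path is only an example; any installed JDK other than 8u252
should do):

  java -version                                    # 1.8.0_252 is the affected build
  export JAVA_HOME=/usr/lib/jvm/java-11-openjdk    # example path, adjust to your install
  export PATH=$JAVA_HOME/bin:$PATH
  ./bin/kubernetes-session.sh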

Dongwon Kim  wrote on Wed, May 6, 2020 at 7:15 AM:

> Hi,
>
> I'm using Flink-1.10 and tested everything [1] successfully.
> While trying [2], I got the following message.
> Can anyone help please?
>
> [root@DAC-E04-W06 bin]# ./kubernetes-session.sh
>> 2020-05-06 08:10:49,411 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.rpc.address, DAC-E04-W06
>> 2020-05-06 08:10:49,412 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.rpc.port, 6123
>> 2020-05-06 08:10:49,412 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.heap.size, 1024m
>> 2020-05-06 08:10:49,412 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: taskmanager.memory.process.size, 24g
>> 2020-05-06 08:10:49,413 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: taskmanager.numberOfTaskSlots, 24
>> 2020-05-06 08:10:49,413 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: parallelism.default, 1
>> 2020-05-06 08:10:49,413 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability, zookeeper
>> 2020-05-06 08:10:49,413 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.zookeeper.path.root, /flink
>> 2020-05-06 08:10:49,414 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
>> 2020-05-06 08:10:49,414 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
>> 2020-05-06 08:10:49,414 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: jobmanager.execution.failover-strategy, region
>> 2020-05-06 08:10:49,415 INFO
>>  org.apache.flink.configuration.GlobalConfiguration- Loading
>> configuration property: rest.port, 8082
>> 2020-05-06 08:10:50,386 ERROR
>> org.apache.flink.kubernetes.cli.KubernetesSessionCli  - Error while
>> running the Flink session.
>> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]
>>  for kind: [Service]  with name:
>> [flink-cluster-5c12bd50-a540-4614-96d0-549785a8bc62]  in namespace:
>> [default]  failed.
>> at
>> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
>> at
>> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
>> at
>> io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
>> at
>> io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
>> at
>> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
>> at
>> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
>> at
>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
>> at
>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
>> at
>> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>> at
>> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
>> Caused by: java.net.SocketException: Broken pipe (Write failed)
>> at java.net.SocketOutputStream.socketWrite0(Native Method)
>> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
>> at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
>> at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
>> at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
>> at
>> sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
>> at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
>> at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
>> at org.apache.flink.kubernetes.shadded.okio.Okio$1.write(Okio.java:79)
>> at
>> org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$1.write(AsyncTimeout.java:180)
>> at
>> org.apache.flink.kubernetes.shadded.okio.RealBufferedSink.flush(RealBufferedSink.java:224)
>> at
>> org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Writer.settings(Http2Writer.java:203)
>> at
>> 

Cannot start native K8s

2020-05-05 Thread Dongwon Kim
Hi,

I'm using Flink-1.10 and tested everything [1] successfully.
While trying [2], I got the following message.
Can anyone help please?

[root@DAC-E04-W06 bin]# ./kubernetes-session.sh
> 2020-05-06 08:10:49,411 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.address, DAC-E04-W06
> 2020-05-06 08:10:49,412 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.rpc.port, 6123
> 2020-05-06 08:10:49,412 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.heap.size, 1024m
> 2020-05-06 08:10:49,412 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.memory.process.size, 24g
> 2020-05-06 08:10:49,413 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: taskmanager.numberOfTaskSlots, 24
> 2020-05-06 08:10:49,413 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: parallelism.default, 1
> 2020-05-06 08:10:49,413 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability, zookeeper
> 2020-05-06 08:10:49,413 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.path.root, /flink
> 2020-05-06 08:10:49,414 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.storageDir, hdfs:///user/flink/ha/
> 2020-05-06 08:10:49,414 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: high-availability.zookeeper.quorum, DAC-E04-W06:2181
> 2020-05-06 08:10:49,414 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: jobmanager.execution.failover-strategy, region
> 2020-05-06 08:10:49,415 INFO
>  org.apache.flink.configuration.GlobalConfiguration- Loading
> configuration property: rest.port, 8082
> 2020-05-06 08:10:50,386 ERROR
> org.apache.flink.kubernetes.cli.KubernetesSessionCli  - Error while
> running the Flink session.
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]
>  for kind: [Service]  with name:
> [flink-cluster-5c12bd50-a540-4614-96d0-549785a8bc62]  in namespace:
> [default]  failed.
> at
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
> at
> io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
> at
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:231)
> at
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:164)
> at
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getService(Fabric8FlinkKubeClient.java:334)
> at
> org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.getInternalService(Fabric8FlinkKubeClient.java:246)
> at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.run(KubernetesSessionCli.java:104)
> at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.lambda$main$0(KubernetesSessionCli.java:185)
> at
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> at
> org.apache.flink.kubernetes.cli.KubernetesSessionCli.main(KubernetesSessionCli.java:185)
> Caused by: java.net.SocketException: Broken pipe (Write failed)
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
> at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:431)
> at sun.security.ssl.OutputRecord.write(OutputRecord.java:417)
> at
> sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:894)
> at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:865)
> at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
> at org.apache.flink.kubernetes.shadded.okio.Okio$1.write(Okio.java:79)
> at
> org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$1.write(AsyncTimeout.java:180)
> at
> org.apache.flink.kubernetes.shadded.okio.RealBufferedSink.flush(RealBufferedSink.java:224)
> at
> org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Writer.settings(Http2Writer.java:203)
> at
> org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:515)
> at
> org.apache.flink.kubernetes.shadded.okhttp3.internal.http2.Http2Connection.start(Http2Connection.java:505)
> at
> org.apache.flink.kubernetes.shadded.okhttp3.internal.connection.RealConnection.startHttp2(RealConnection.java:298)
> at
>