Re: The issue of "Failed to shutdown socket with fd xx: Transport endpoint is not connected" on Mesos master

Nan Xiao Mon, 28 Dec 2015 19:32:49 -0800

Hi Klaus,

Firstly, thanks very much for your answer!


The km processes are all alive:
root     129474 128024  2 22:26 pts/0    00:00:00 km apiserver
--address=15.242.100.60 --etcd-servers=http://15.242.100.60:4001
--service-cluster-ip-range=10.10.10.0/24 --port=8888
--cloud-provider=mesos --cloud-config=mesos-cloud.conf --secure-port=0
--v=1
root     129509 128024  2 22:26 pts/0    00:00:00 km
controller-manager --master=15.242.100.60:8888 --cloud-provider=mesos
--cloud-config=./mesos-cloud.conf --v=1
root     129538 128024  0 22:26 pts/0    00:00:00 km scheduler
--address=15.242.100.60 --mesos-master=15.242.100.56:5050
--etcd-servers=http://15.242.100.60:4001 --mesos-user=root
--api-servers=15.242.100.60:8888 --cluster-dns=10.10.10.10
--cluster-domain=cluster.local --v=2

All the logs also seem OK, except for scheduler.log:
......
I1228 22:26:37.883092  129538 messenger.go:381] Receiving message
mesos.internal.InternalMasterChangeDetected from
scheduler(1)@15.242.100.60:33077
I1228 22:26:37.883225  129538 scheduler.go:374] New master
master@15.242.100.56:5050 detected
I1228 22:26:37.883268  129538 scheduler.go:435] No credentials were
provided. Attempting to register scheduler without authentication.
I1228 22:26:37.883356  129538 scheduler.go:928] Registering with
master: master@15.242.100.56:5050
I1228 22:26:37.883460  129538 messenger.go:187] Sending message
mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
I1228 22:26:37.883504  129538 scheduler.go:881] will retry
registration in 1.209320575s if necessary
I1228 22:26:37.883758  129538 http_transporter.go:193] Sending message
to master@15.242.100.56:5050 via http
I1228 22:26:37.883873  129538 http_transporter.go:587] libproc target
URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
I1228 22:26:39.093560  129538 scheduler.go:928] Registering with
master: master@15.242.100.56:5050
I1228 22:26:39.093659  129538 messenger.go:187] Sending message
mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
I1228 22:26:39.093702  129538 scheduler.go:881] will retry
registration in 3.762036352s if necessary
I1228 22:26:39.093765  129538 http_transporter.go:193] Sending message
to master@15.242.100.56:5050 via http
I1228 22:26:39.093847  129538 http_transporter.go:587] libproc target
URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
......
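
To make the retry pattern in the excerpt above easier to see, here is a
minimal Python sketch that pulls the announced retry delays out of the
scheduler log. The sample text is copied from the excerpt with the wrapped
lines re-joined; the function name and regular expression are only
illustrative, and the sketch assumes the delay is always printed in plain
seconds (such as "1.209320575s").

```python
import re

# Two lines copied from the scheduler.log excerpt (wrapped lines re-joined).
SAMPLE_LOG = """\
I1228 22:26:37.883504  129538 scheduler.go:881] will retry registration in 1.209320575s if necessary
I1228 22:26:39.093702  129538 scheduler.go:881] will retry registration in 3.762036352s if necessary
"""

# \s+ tolerates the line wrapping that mail clients add inside a message.
RETRY_RE = re.compile(r"will retry\s+registration in ([0-9.]+)s")


def retry_delays(log_text):
    """Return the announced retry delays, in seconds, in log order."""
    return [float(value) for value in RETRY_RE.findall(log_text)]


if __name__ == "__main__":
    for delay in retry_delays(SAMPLE_LOG):
        print(f"next registration attempt announced in {delay:.3f}s")
```

A growing list of delays with no line showing a successful registration
in between is consistent with the scheduler never getting an answer from
the master.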

From the log, the Mesos master rejected the k8s registration, and
k8s keeps retrying.
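
One note on the message in the subject line: as far as I know, "Transport
endpoint is not connected" is the Linux text for errno ENOTCONN, which
shutdown(2) returns when the socket is not connected. The Python sketch
below is not Mesos code; it only reproduces the same errno on a TCP socket
that was never connected, assuming Linux. The function name is made up for
illustration.

```python
import errno
import os
import socket


def shutdown_unconnected_socket():
    """Call shutdown() on a TCP socket that was never connected.

    Returns the errno of the resulting OSError, or None if no error
    was raised.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.shutdown(socket.SHUT_RDWR)
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()
    return None


if __name__ == "__main__":
    code = shutdown_unconnected_socket()
    # Print the symbolic errno name and the platform's message for it.
    print(errno.errorcode.get(code), "-", os.strerror(code))
```

This only shows where the text comes from, not why the master's connection
to the scheduler was already gone when it tried to shut the socket down.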

Have you run into this issue before? Thanks very much in advance!

Best Regards
Nan Xiao


On Mon, Dec 28, 2015 at 7:26 PM, Klaus Ma <klaus1982...@gmail.com> wrote:
> It seems Kubernetes is down; would you help to check kubernetes's status
> (km)?
>
> ----
> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> Platform Symphony/DCOS Development & Support, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>
> On Mon, Dec 28, 2015 at 6:35 PM, Nan Xiao <xiaonan830...@gmail.com> wrote:
>>
>> Hi all,
>>
>> Greetings from me!
>>
>> I am trying to follow this tutorial
>>
>> (https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/mesos.md)
>> to deploy "k8s on Mesos" on local machines. k8s is the latest
>> master branch, and Mesos is version 0.26.
>>
>> After starting the Mesos master (IP: 15.242.100.56), the Mesos
>> slave (IP: 15.242.100.16), and k8s (IP: 15.242.100.60), I can see the
>> following logs from the Mesos master:
>>
>> ......
>> I1227 22:52:34.494478  8069 master.cpp:4269] Received update of slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 at slave(1)@15.242.100.16:5051
>> (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed resources
>> I1227 22:52:34.494940  8065 hierarchical.cpp:400] Slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0
>> (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed
>> resources  (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> ports(*):[31000-32000], allocated: )
>> I1227 22:53:06.740757  8053 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56219 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:07.736419 8065 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56241 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:07.767196  8070 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56252 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:08.808171  8053 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56272 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:08.815811  8060 master.cpp:2176] Received SUBSCRIBE call
>> for framework 'Kubernetes' at scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.816182 8060 master.cpp:2247] Subscribing framework
>> Kubernetes with checkpointing enabled and capabilities [  ]
>> I1227 22:53:08.817294  8052 hierarchical.cpp:195] Added framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> I1227 22:53:08.817464  8050 master.cpp:1122] Framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488 disconnected
>> E1227 22:53:08.817497 8073 process.cpp:1911] Failed to shutdown
>> socket with fd 17: Transport endpoint is not connected
>> I1227 22:53:08.817533  8050 master.cpp:2472] Disconnecting framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.817595 8050 master.cpp:2496] Deactivating framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.817797 8050 master.cpp:1146] Giving framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488 7625.14222623576weeks to failover
>> W1227 22:53:08.818389 8062 master.cpp:4840] Master returning
>> resources offered to framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 because the framework has
>> terminated or is inactive
>> I1227 22:53:08.818397  8052 hierarchical.cpp:273] Deactivated
>> framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> I1227 22:53:08.819046  8066 hierarchical.cpp:744] Recovered
>> cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000]
>> (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> ports(*):[31000-32000], allocated: ) on slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 from framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> ......
>>
>> I can't figure out why the Mesos master complains "Failed to shutdown
>> socket with fd 17: Transport endpoint is not connected".
>> Could someone give me some clues about this issue?
>>
>> Thanks very much in advance!
>>
>> Best Regards
>> Nan Xiao
>
>
