Hi Klaus,

Firstly, thanks very much for your answer!
The km processes are all live:

root 129474 128024 2 22:26 pts/0 00:00:00 km apiserver --address=15.242.100.60 --etcd-servers=http://15.242.100.60:4001 --service-cluster-ip-range=10.10.10.0/24 --port=8888 --cloud-provider=mesos --cloud-config=mesos-cloud.conf --secure-port=0 --v=1
root 129509 128024 2 22:26 pts/0 00:00:00 km controller-manager --master=15.242.100.60:8888 --cloud-provider=mesos --cloud-config=./mesos-cloud.conf --v=1
root 129538 128024 0 22:26 pts/0 00:00:00 km scheduler --address=15.242.100.60 --mesos-master=15.242.100.56:5050 --etcd-servers=http://15.242.100.60:4001 --mesos-user=root --api-servers=15.242.100.60:8888 --cluster-dns=10.10.10.10 --cluster-domain=cluster.local --v=2

All the logs also seem OK, except for scheduler.log:

......
I1228 22:26:37.883092 129538 messenger.go:381] Receiving message mesos.internal.InternalMasterChangeDetected from scheduler(1)@15.242.100.60:33077
I1228 22:26:37.883225 129538 scheduler.go:374] New master master@15.242.100.56:5050 detected
I1228 22:26:37.883268 129538 scheduler.go:435] No credentials were provided. Attempting to register scheduler without authentication.
I1228 22:26:37.883356 129538 scheduler.go:928] Registering with master: master@15.242.100.56:5050
I1228 22:26:37.883460 129538 messenger.go:187] Sending message mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
I1228 22:26:37.883504 129538 scheduler.go:881] will retry registration in 1.209320575s if necessary
I1228 22:26:37.883758 129538 http_transporter.go:193] Sending message to master@15.242.100.56:5050 via http
I1228 22:26:37.883873 129538 http_transporter.go:587] libproc target URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
I1228 22:26:39.093560 129538 scheduler.go:928] Registering with master: master@15.242.100.56:5050
I1228 22:26:39.093659 129538 messenger.go:187] Sending message mesos.internal.RegisterFrameworkMessage to master@15.242.100.56:5050
I1228 22:26:39.093702 129538 scheduler.go:881] will retry registration in 3.762036352s if necessary
I1228 22:26:39.093765 129538 http_transporter.go:193] Sending message to master@15.242.100.56:5050 via http
I1228 22:26:39.093847 129538 http_transporter.go:587] libproc target URL http://15.242.100.56:5050/master/mesos.internal.RegisterFrameworkMessage
......

From the log, it appears that the Mesos master rejected k8s's registration, and k8s keeps retrying. Have you met this issue before?

Thanks very much in advance!

Best Regards
Nan Xiao

On Mon, Dec 28, 2015 at 7:26 PM, Klaus Ma <klaus1982...@gmail.com> wrote:
> It seems Kubernetes is down; would you help to check kubernetes's status
> (km)?
>
> ----
> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
> Platform Symphony/DCOS Development & Support, STG, IBM GCG
> +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me
>
> On Mon, Dec 28, 2015 at 6:35 PM, Nan Xiao <xiaonan830...@gmail.com> wrote:
>>
>> Hi all,
>>
>> Greetings from me!
>>
>> I am trying to follow this tutorial
>> (https://github.com/kubernetes/kubernetes/blob/master/docs/getting-started-guides/mesos.md)
>> to deploy "k8s on Mesos" on local machines: The k8s is the newest
>> master branch, and Mesos is the 0.26 edition.
>>
>> After running Mesos master(IP:15.242.100.56), Mesos
>> slave(IP:15.242.100.16), and the k8s(IP:15.242.100.60), I can see the
>> following logs from Mesos master:
>>
>> ......
>> I1227 22:52:34.494478 8069 master.cpp:4269] Received update of slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 at slave(1)@15.242.100.16:5051
>> (pqsfc016.ftc.rdlabs.hpecorp.net) with total oversubscribed resources
>> I1227 22:52:34.494940 8065 hierarchical.cpp:400] Slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0
>> (pqsfc016.ftc.rdlabs.hpecorp.net) updated with oversubscribed
>> resources (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> ports(*):[31000-32000], allocated: )
>> I1227 22:53:06.740757 8053 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56219 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:07.736419 8065 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56241 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:07.767196 8070 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56252 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:08.808171 8053 http.cpp:334] HTTP GET for
>> /master/state.json from 15.242.100.60:56272 with
>> User-Agent='Go-http-client/1.1'
>> I1227 22:53:08.815811 8060 master.cpp:2176] Received SUBSCRIBE call
>> for framework 'Kubernetes' at scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.816182 8060 master.cpp:2247] Subscribing framework
>> Kubernetes with checkpointing enabled and capabilities [ ]
>> I1227 22:53:08.817294 8052 hierarchical.cpp:195] Added framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> I1227 22:53:08.817464 8050 master.cpp:1122] Framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488 disconnected
>> E1227 22:53:08.817497 8073 process.cpp:1911] Failed to shutdown
>> socket with fd 17: Transport endpoint is not connected
>> I1227 22:53:08.817533 8050 master.cpp:2472] Disconnecting framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.817595 8050 master.cpp:2496] Deactivating framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488
>> I1227 22:53:08.817797 8050 master.cpp:1146] Giving framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 (Kubernetes) at
>> scheduler(1)@15.242.100.60:59488 7625.14222623576weeks to failover
>> W1227 22:53:08.818389 8062 master.cpp:4840] Master returning
>> resources offered to framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000 because the framework has
>> terminated or is inactive
>> I1227 22:53:08.818397 8052 hierarchical.cpp:273] Deactivated
>> framework 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> I1227 22:53:08.819046 8066 hierarchical.cpp:744] Recovered
>> cpus(*):32; mem(*):127878; disk(*):4336; ports(*):[31000-32000]
>> (total: cpus(*):32; mem(*):127878; disk(*):4336;
>> ports(*):[31000-32000], allocated: ) on slave
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-S0 from framework
>> 9c3c6c78-0b62-4eaa-b27a-498f172e7fe6-0000
>> ......
>>
>> I can't figure out why Mesos master complains "Failed to shutdown
>> socket with fd 17: Transport endpoint is not connected".
>> Could someone give some clues on this issue?
>>
>> Thanks very much in advance!
>>
>> Best Regards
>> Nan Xiao
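
To tell whether the Kubernetes framework ever stays registered, one option is to read the master's state document, the same /master/state.json endpoint that the master log shows being fetched from 15.242.100.60. The sketch below is only an illustration: it assumes the document has a top-level "frameworks" list whose entries carry "id", "name" and "active" fields, and the helper names fetch_state and framework_status are made up for this example.

```python
import json
import urllib.request


def fetch_state(master="15.242.100.56:5050"):
    """Fetch the master's state document.

    The endpoint path is the one seen in the master log; the default
    address is the Mesos master from this thread.
    """
    url = "http://%s/master/state.json" % master
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def framework_status(state, name="Kubernetes"):
    """Return a list of (framework id, active flag) for entries matching name.

    Assumption: frameworks are listed under "frameworks" and each entry
    has "id", "name" and "active" keys; missing keys are tolerated.
    """
    return [
        (fw.get("id"), bool(fw.get("active")))
        for fw in state.get("frameworks", [])
        if fw.get("name") == name
    ]
```

Calling framework_status(fetch_state()) from the k8s host would list each matching framework with its active flag. An empty list, or an entry that stays inactive while the scheduler keeps retrying, would be consistent with the disconnect seen in the master log.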