I have a version of the memcached example running on a docker image, and now I'd like to port that to a real cluster (to get a working starting point for the actual service I want to run in slider).
I suspect the configuration issue could be in the ZooKeeper or YARN service registry configuration. I am running the following (sanitized) commands:

    slider install-package \
        --package /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/jmemcached-1.0.1.zip \
        --name jmemcached --debug --replacepkg

    slider create jmemcached \
        --template /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/appConfig.json \
        --resources /home/foolish_ewe/mybuild/incubator-slider/app-packages/memcached/resources-default.json \
        --manager rm.yarn.cluster.mycompany.com:8032 --debug \
        --zkhosts zookeeper.cluster.mycompany.com:2181 \
        --zkpath /slider_test/clustername/

I'm seeing failed ZooKeeper connections to localhost:2181 in the AM logs:

    2017-05-02 16:16:07,992 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
    java.net.ConnectException: Connection refused

How can I tweak the connection string? If I look at slider/conf/slider-client.xml, I am still using the default configuration and see the following setting:

    <property>
      <name>hadoop.registry.zk.quorum</name>
      <value>@ZK-QUORUM</value>
    </property>

First off, I'm not sure what the @ZK-QUORUM syntax means. Overriding this value with a connection string for a single host provides no relief from the dreaded symptom.
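As a sanity check, a small sketch like the following could confirm which quorum value the client config file actually holds. It assumes the file uses the usual Hadoop <configuration>/<property> layout. The function names, the classification labels, and the embedded sample are hypothetical, made up for illustration. It only inspects the XML text; it says nothing about what the AM ends up receiving at launch.

```python
import xml.etree.ElementTree as ET

QUORUM_KEY = "hadoop.registry.zk.quorum"

# Hypothetical sample mirroring the property quoted above.
SAMPLE = """
<configuration>
  <property>
    <name>hadoop.registry.zk.quorum</name>
    <value>@ZK-QUORUM</value>
  </property>
</configuration>
"""


def read_property(xml_text, name):
    """Return the value of the named Hadoop-style property, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


def diagnose_quorum(value):
    """Classify a quorum string with made-up labels (illustrative only)."""
    if value is None:
        return "missing"
    if value.startswith("@"):
        # Looks like an unsubstituted template token such as @ZK-QUORUM.
        return "placeholder"
    hosts = [h.strip() for h in value.split(",") if h.strip()]
    if not hosts:
        return "empty"
    if any(h.split(":")[0] in ("localhost", "127.0.0.1") for h in hosts):
        return "localhost"
    return "ok"


if __name__ == "__main__":
    value = read_property(SAMPLE, QUORUM_KEY)
    print(QUORUM_KEY, "=", value, "->", diagnose_quorum(value))
```

To run it against a real file, read the contents of slider/conf/slider-client.xml and pass them to read_property in place of SAMPLE. A "missing" or "placeholder" result would at least be consistent with the AM falling back to localhost:2181, though that is a guess rather than a confirmed cause.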
The AM logs look like:

2017-05-02 16:16:07,401 [main] INFO appmaster.SliderAppMaster - Registry service username =fooolish_ewe
2017-05-02 16:16:07,462 [main] INFO appmaster.SliderAppMaster - Service Record ServiceRecord{description='Slider Application Master'; external endpoints: {{ "api" : "http://", "addressType" : "uri", "protocolType" : "webui", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734" } ] }; { "api" : "classpath:org.apache.slider.management", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/mgmt" } ] }; { "api" : "classpath:org.apache.slider.publisher", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher" } ] }; { "api" : "classpath:org.apache.slider.registry", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/registry" } ] }; { "api" : "classpath:org.apache.slider.publisher.configurations", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher/slider" } ] }; { "api" : "classpath:org.apache.slider.publisher.exports", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "http://cluster.mycompany.com:42734/ws/v1/slider/publisher/exports" } ] }; }; internal endpoints: {{ "api" : "classpath:org.apache.slider.agents.secure", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "https://cluster.mycompany.com:40466/ws/v1/slider/agents" } ] }; { "api" : "classpath:org.apache.slider.agents.oneway", "addressType" : "uri", "protocolType" : "REST", "addresses" : [ { "uri" : "https://cluster.mycompany.com:59141/ws/v1/slider/agents" } ] }; }, attributes: {"yarn:id"="application_1492599342357_0064" "yarn:persistence"="application" }}
2017-05-02 16:16:07,992 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

[Several repetitions of the previous error omitted for clarity and then...]

2017-05-02 16:16:12,877 [780172372@qtp-747004588-0] ERROR webapp.Dispatcher - error handling URI: /slideram
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:164)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1286)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.NullPointerException
    at org.apache.slider.providers.AbstractProviderService.buildEndpointDetails(AbstractProviderService.java:352)
    at org.apache.slider.providers.AbstractProviderService.buildMonitorDetails(AbstractProviderService.java:337)
    at org.apache.slider.providers.agent.AgentProviderService.buildMonitorDetails(AgentProviderService.java:810)
    at org.apache.slider.server.appmaster.web.view.IndexBlock.addProviderServiceOptions(IndexBlock.java:129)
    at org.apache.slider.server.appmaster.web.view.IndexBlock.doIndex(IndexBlock.java:85)
    at org.apache.slider.server.appmaster.web.view.IndexBlock.render(IndexBlock.java:60)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67)
    at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77)
    at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
    at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
    at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845)
    at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56)
    at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
    at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
    at org.apache.slider.server.appmaster.web.SliderAMController.index(SliderAMController.java:47)
    ... 39 more
2017-05-02 16:16:13,495 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

[More repetitions of the previous error deleted]

2017-05-02 16:16:22,474 [main] ERROR curator.ConnectionState - Connection timed out for connection string (localhost:2181) and timeout (15000) / elapsed (18944)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198)
    at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88)
    at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:113)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:457)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:239)
    at org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:234)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:215)
    at org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:42)
    at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkDelete(CuratorService.java:673)
    at org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.delete(RegistryOperationsService.java:160)
    at org.apache.slider.server.services.yarnregistry.YarnRegistryViewForProviders.putService(YarnRegistryViewForProviders.java:186)
    at org.apache.slider.server.services.yarnregistry.YarnRegistryViewForProviders.registerSelf(YarnRegistryViewForProviders.java:224)
    at org.apache.slider.server.appmaster.SliderAppMaster.registerServiceInstance(SliderAppMaster.java:1084)
    at org.apache.slider.server.appmaster.SliderAppMaster.createAndRunCluster(SliderAppMaster.java:885)
    at org.apache.slider.server.appmaster.SliderAppMaster.runService(SliderAppMaster.java:525)
    at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:188)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:475)
    at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:403)
    at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:630)
    at org.apache.slider.server.appmaster.SliderAppMaster.main(SliderAppMaster.java:2240)
2017-05-02 16:16:23,403 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2017-05-02 16:16:24,504 [main-SendThread(localhost:2181)] WARN zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)

With best regards: Bill