Thanks, Billie, for the role group explanation. It seems like a good feature to have!
Thinking a bit more about the role group, here are my thoughts:

1. It seems a user will either specify an explicit component name for each component (as in my case) OR have a group like ZOOKEEPER, where Slider generates unique component names like ZOOKEEPER_n (n=1,2..).
2. If #1 is true, then I feel it would be a cleaner design to explicitly introduce a new entity called the group. In metaInfo.xml, one <component> element can have either <name> or <group>, but not both.
   1. When <name> is specified, the code should not parse for a group.
   2. When <group> is specified, the code will generate individual names and expect to get the group back from the container name.
3. This will not just solve the currently reported problem but will also make it clearer for Slider users to understand and implement groups.

I will create the JIRA and add these thoughts to it.

Thanks

On Wed, May 10, 2017 at 7:45 AM, Billie Rinaldi <billie.rina...@gmail.com> wrote:

> The role group is used in the unique component names feature. This allows you to specify, say, 3 instances of a component SOLR, and you will get components SOLR1, SOLR2, and SOLR3. SOLR would be the group for these 3 components. This is helpful for apps like ZooKeeper, HBase, and Kafka that like component instances to be distinguishable by unique IDs. It will be even more helpful once RegistryDNS is available, which will give each a unique hostname.
>
> Yeah, we could think about possible ways of solving this problem. Please open a ticket along the lines of "allow LABEL_MAKER in component names OR document that it should not be used."
>
> On Tue, May 9, 2017 at 7:33 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>
> > I think I found out what causes the NPE above in .92 and why it works in version 0.80
> >
> > The component name (a.k.a. role name) is "solo___super" i.e. it has 3 "_"
> >
> > In 0.92, it seems a new concept of "Role Group" is introduced, which was not present in 0.80.
> >
> > In 0.92 - AgentProviderService.java
> >
> >   private static final String LABEL_MAKER = "___";
> >   ...
> >   private String getRoleName(String label) {
> >     int index1 = label.indexOf(LABEL_MAKER);
> >     int index2 = label.lastIndexOf(LABEL_MAKER);
> >     if (index1 == index2) {
> >       return label.substring(index1 + LABEL_MAKER.length());
> >     } else {
> >       return label.substring(index1 + LABEL_MAKER.length(), index2);
> >     }
> >   }
> >
> >   private String getRoleGroup(String label) {
> >     return label.substring(label.lastIndexOf(LABEL_MAKER) + LABEL_MAKER.length());
> >   }
> >
> > So when the real role name contains 3 "_", e.g. for "solo___super", getRoleName on the container name will return just "solo" and not "solo___super", and that bad role name can cause the NPE
> >
> > The same role name works in 0.80 because in 0.80, there is no concept of roleGroup
> >
> > In 0.80 - AgentProviderService.java
> >
> >   private String getRoleName(String label) {
> >     return label.substring(label.indexOf(LABEL_MAKER) + LABEL_MAKER.length());
> >   }
> >
> > so in 0.80, the role name "solo___super" is returned correctly from the container label
> >
> > 1) I tried to understand what the roleGroup is and what its usage is but could not locate any doc. Can someone give a few lines of explanation?
> > 2) Should this be considered a bug in .92? If not, and if you think LABEL_MAKER should not be used in any role names, at least a clear doc AND a clear check when accepting config files will help. I.e. if LABEL_MAKER should not be used in any role names, then slider 0.92 should give an error (saying invalid role name etc.) when creating a cluster or accepting configs during any other operations.
> >
> > Thanks in advance,
> >
> > On Tue, Apr 11, 2017 at 6:09 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
> >
> > > Hi
> > >
> > > Running slider 0.92 on CDH 5.5.1 (which is Hadoop 2.6), with Kerberos
> > >
> > > I am deploying an application with multiple components. The components start but fail to heartbeat to the slider AM. The slider AM log shows an NPE at the container heartbeat URLs as below.
> > >
> > > I have attached the complete slider AM log
> > >
> > > 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO agent.AgentProviderService - Handling registration: responseId=-1
> > > timestamp=1491957845550
> > > label=container_e95_1476898378926_91401_01_000003___solo___super
> > > hostname=node1078
> > > expectedState=INIT
> > > actualState=INIT
> > > appVersion=null
> > >
> > > 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO agent.AgentProviderService - label: container_e95_1476898378926_91401_01_000003___solo___super
> > > pkg: null
> > > 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO agent.AgentProviderService - Registration response: RegistrationResponse{response=OK, responseId=0, statusCommands=null}
> > > 2017-04-12 00:44:05,871 [Socket Reader #1 for port 32120] INFO ipc.Server - Auth successful for slideradmin@BIGDATA (auth:SIMPLE)
> > > 2017-04-12 00:44:05,873 [Socket Reader #1 for port 32120] INFO authorize.ServiceAuthorizationManager - Authorization successful for slideradmin@BIGDATA (auth:TOKEN) for protocol=interface org.apache.slider.server.appmaster.rpc.SliderClusterProtocolPB
> > > 2017-04-12 00:44:15,749 [1005856666@qtp-814377348-7] ERROR mortbay.log - /ws/v1/slider/agents/container_e95_1476898378926_91401_01_000002___pdx__svt___ten85/heartbeat
> > > java.lang.NullPointerException
> > >     at org.apache.slider.providers.agent.AgentProviderService.handleHeartBeat(AgentProviderService.java:1090)
> > >     at org.apache.slider.server.appmaster.web.rest.agent.AgentResource.heartbeat(AgentResource.java:98)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:497)
> > >     at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
> > >     at com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:185)
> > >     at com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
> > >     at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
> > >     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> > >     at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:134)
> > >     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> > >     at com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
> > >     at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
> > >     at com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
> > >     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
> > >     at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
> > >     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
> > >     at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
> > >     at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
> > >     at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
> > >     at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
> > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
> > >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > >     at org.mortbay.jetty.Server.handle(Server.java:326)
> > >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > >     at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
> > >     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
> > >     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > >     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > >     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > >     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > 2017-04-12 00:44:15,750 [2011871076@qtp-814377348-5] ERROR mortbay.log - /ws/v1/slider/agents/container_e95_1476898378926_91401_01_000004___pdx__svt___ten83/heartbeat
> > > java.lang.NullPointerException ....
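
For anyone who wants to reproduce the parsing difference discussed in this thread outside of Slider, here is a small standalone Java sketch. The two getRoleName variants and getRoleGroup are copied from the snippets quoted above. Everything else is an assumption made only for illustration and is not Slider code: the class name LabelParsingDemo, the method suffixes 080 and 092, and the isValidComponentName check, which is just one possible shape for the "reject LABEL_MAKER in component names" validation suggested earlier.

```java
// Illustrative sketch only; not part of Apache Slider.
final class LabelParsingDemo {
    private static final String LABEL_MAKER = "___";

    // 0.80 logic as quoted in the thread: everything after the first marker.
    static String getRoleName080(String label) {
        return label.substring(label.indexOf(LABEL_MAKER) + LABEL_MAKER.length());
    }

    // 0.92 logic as quoted in the thread: text between the first and last marker,
    // or everything after the marker when there is only one.
    static String getRoleName092(String label) {
        int index1 = label.indexOf(LABEL_MAKER);
        int index2 = label.lastIndexOf(LABEL_MAKER);
        if (index1 == index2) {
            return label.substring(index1 + LABEL_MAKER.length());
        } else {
            return label.substring(index1 + LABEL_MAKER.length(), index2);
        }
    }

    // 0.92 logic as quoted in the thread: everything after the last marker.
    static String getRoleGroup092(String label) {
        return label.substring(label.lastIndexOf(LABEL_MAKER) + LABEL_MAKER.length());
    }

    // Hypothetical validation (an assumption, not an existing Slider method):
    // reject component names that contain the marker itself.
    static boolean isValidComponentName(String name) {
        return name != null && !name.isEmpty() && !name.contains(LABEL_MAKER);
    }

    public static void main(String[] args) {
        // Label taken from the AM log quoted above.
        String label = "container_e95_1476898378926_91401_01_000003___solo___super";
        System.out.println("0.80 role name:  " + getRoleName080(label));
        System.out.println("0.92 role name:  " + getRoleName092(label));
        System.out.println("0.92 role group: " + getRoleGroup092(label));
        System.out.println("name accepted:   " + isValidComponentName("solo___super"));
    }
}
```

With the label from the log, the 0.80 logic yields the full configured name, while the 0.92 logic splits it at the second marker into a role name and a role group, which matches the mismatch described above. A check like the hypothetical one here would flag such a name at config time instead of letting it fail at heartbeat.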