Filed https://issues.apache.org/jira/browse/SLIDER-1124
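
For reference, here is a minimal sketch of the kind of check discussed in the quoted message: parse the allowed-ports value and fail fast when a token is neither a single number nor a "-" range. This is not the actual PortScanner source. The class name PortRangeParser, the parse() method and both regular expressions are assumptions made for illustration. Only the names SINGLE_NUMBER and NUMBER_RANGE come from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the real org.apache.slider PortScanner code.
final class PortRangeParser {
    // Assumed patterns: the real SINGLE_NUMBER / NUMBER_RANGE may differ.
    private static final Pattern SINGLE_NUMBER = Pattern.compile("\\d+");
    private static final Pattern NUMBER_RANGE =
            Pattern.compile("(\\d+)\\s*-\\s*(\\d+)");

    private PortRangeParser() {}

    // Parses values such as "8080,9000-9002" into a list of ports.
    // Throws IllegalArgumentException for any token that is neither
    // a single number nor an ASCII-hyphen range.
    static List<Integer> parse(String portRange) {
        if (portRange == null || portRange.trim().isEmpty()) {
            throw new IllegalArgumentException("Port range is empty");
        }
        List<Integer> ports = new ArrayList<>();
        for (String range : portRange.split(",")) {
            String token = range.trim();
            Matcher single = SINGLE_NUMBER.matcher(token);
            Matcher span = NUMBER_RANGE.matcher(token);
            if (single.matches()) {
                ports.add(Integer.parseInt(token));
            } else if (span.matches()) {
                int start = Integer.parseInt(span.group(1));
                int end = Integer.parseInt(span.group(2));
                if (start > end) {
                    throw new IllegalArgumentException(
                            "Port range start exceeds end: \"" + token + "\"");
                }
                for (int port = start; port <= end; port++) {
                    ports.add(port);
                }
            } else {
                // The branch that was missing: report the offending input.
                throw new IllegalArgumentException(
                        "Invalid or unparsable port range \"" + token
                        + "\" in \"" + portRange + "\"");
            }
        }
        return ports;
    }
}
```

With a check like this, a value such as "32201–33100" (en dash instead of "-") would be rejected at configuration time with the offending text in the message, instead of producing an empty port set.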
On Thu, May 19, 2016 at 7:18 AM, Billie Rinaldi <billie.rina...@gmail.com> wrote:

> I agree, it would be good to throw an Exception if the format of the port
> range is bad.
>
> On Wed, May 18, 2016 at 7:15 PM, Manoj Samel <manojsamelt...@gmail.com>
> wrote:
>
> > Found an issue in the appConfig.json. The JSON was updated using an IDE,
> > and while at first glance it looks like "32201-33100", it was really
> > "32201–33100". The character in the second case is not a "-" but
> > actually three characters that together appear somewhat like it (but
> > it's wider and lower than -).
> >
> > So this is neither a ","-separated list nor a "-" range as the code
> > expects, and it errors out.
> >
> > It would be useful if such a "bad" range were caught earlier with a
> > clearer message, like "invalid or unparsable port range specified".
> >
> > Looking at the code:
> >
> > SliderAppMaster.java buildPortScanner() reads the key
> > KEY_ALLOWED_PORT_RANGE and passes the associated value to
> > portScanner.setPortRange().
> >
> > In PortScanner.java setPortRange(), it first tries to split on "," or
> > else tries to split on "-". However, there is no "else" part if it does
> > not find the "-" pattern (which will happen in the above case). Since
> > there is no else part, no exception is thrown at this point and
> > this.remainingPortsToCheck gets set to an empty set, resulting in a
> > more obscure error later in getAvailablePortViaPortArray().
> >
> > I think it would be good to have an "else" part added to the range
> > matchers below and an exception with the input text thrown at that
> > point, so the misconfigured value will be obvious.
> >
> > Matcher m = SINGLE_NUMBER.matcher(range.trim());
> > if (m.find()) {
> >   inputPorts.add(Integer.parseInt(m.group()));
> > } else {
> >   m = NUMBER_RANGE.matcher(range.trim());
> >   if (m.find()) {
> >   } // else is missing ..... Add with an exception ???
> >
> > Thoughts?
> >
> > Manoj
> >
> > On Mon, May 16, 2016 at 6:45 PM, Manoj Samel <manojsamelt...@gmail.com>
> > wrote:
> >
> > > Here is slider.log for Slider AM. Note the port range does not always
> > > fail - it does work occasionally!
> > >
> > > "appConf" :{
> > >   "schema" : "http://example.org/specification/v2.0.0",
> > >   "metadata" : { },
> > >   "global" : {
> > >     "site.global.qaprod___gms.tenant" : "qaprod___gms",
> > >     "site.global.memory_val" : "200m",
> > >     "site.global.service_mgmt_log_dir" : "/data/processlauncher/log",
> > >     "site.global.xms_val" : "128m",
> > >     "site.global.slider.allowed.ports" : "32201–33100",
> > >     "java_home" : "/usr/java/latest",
> > >     "site.global.xmx_val" : "256m",
> > >     "zookeeper.quorum" : "host1:2181,host2:2181,host3:2181",
> > >     "site.global.qaprod___gms.principal" : "qaprod.gms",
> > >     "env.MALLOC_ARENA_MAX" : "4",
> > >     "site.global.qaprod___gms.http_port" : "${qaprod___gms.ALLOCATED_PORT}{PER_CONTAINER}",
> > >     "site.global.bas_host" : "host4",
> > >     "site.global.app_version" : "1.0.0",
> > >     "site.global.bas_port" : "9009",
> > >     "application.def" : ".slider/package/spas/spas-1.0.0.zip",
> > >     "site.global.qaprod___gms.keytabfile" : "/etc/hadoop/conf/qaprod.gms.keytab",
> > >     "zookeeper.hosts" : "host1,host2,host3",
> > >     "zookeeper.path" : "/services/slider/users/slideradmin/spas",
> > >     "site.global.additional_cp" : "/usr/lib/hadoop/lib/*",
> > >     "site.global.qaprod___gms.https_port" : "${qaprod___gms.ALLOCATED_PORT}{PER_CONTAINER}",
> > >     "site.fs.defaultFS" : "hdfs://bdsqaprod2",
> > >     "site.dfs.namenode.kerberos.principal" : "hdfs/_HOST@BIGDATA",
> > >     "site.fs.default.name" : "hdfs://bdsqaprod2"
> > >   },
> > >   "credentials" : { },
> > >   "components" : {
> > >     "slider-appmaster" : {
> > >       "jvm.heapsize" : "256M",
> > >       "slider.keytab.principal.name" : "slideradmin",
> > >       "slider.am.keytab.local.path" : "/etc/hadoop/conf/slideradmin.keytab"
> > >     },
> > >     "qaprod___gms" : { }
> > >   }
> > > }}:
> > >
> > > 2016-05-16 19:24:51,456 [main] INFO appmaster.SliderAppMaster - Conf dir /hadoop/disk4/yarn/local/usercache/slideradmin/appcache/application_1462834537586_7708/container_e15_1462834537586_7708_09_000001/propagatedconf does not exist.
> > > 2016-05-16 19:24:51,456 [main] INFO appmaster.SliderAppMaster - Parent dir /hadoop/disk4/yarn/local/usercache/slideradmin/appcache/application_1462834537586_7708/container_e15_1462834537586_7708_09_000001:
> > > tmp
> > > launch_container.sh
> > > container_tokens
> > > lib
> > > confdir
> > > expandedarchive
> > >
> > > 2016-05-16 19:24:51,457 [main] INFO appmaster.SliderAppMaster - Cluster provider type is agent
> > > 2016-05-16 19:24:51,500 [main] INFO appmaster.SliderAppMaster - RM is at host9/10.52.88.218:23130
> > > 2016-05-16 19:24:51,553 [main] INFO appmaster.SliderAppMaster - AM for ID 7708
> > > 2016-05-16 19:24:51,596 [main] INFO impl.NMClientAsyncImpl - Upper bound of the thread pool size is 500
> > > 2016-05-16 19:24:51,597 [main] INFO impl.ContainerManagementProtocolProxy - yarn.client.max-cached-nodemanagers-proxies : 0
> > > 2016-05-16 19:24:51,654 [main] ERROR main.ServiceLauncher - No available ports found in configured range {}
> > > 2016-05-16 19:24:51,656 [main] INFO util.ExitUtil - Exiting with status 77
> > > 2016-05-16 19:24:51,658 [Thread-1] INFO appmaster.SliderAppMaster - Process has exited with exit code 0 mapped to 0 -ignoring
> > > 2016-05-16 19:24:51,659 [AMRM Callback Handler Thread] INFO impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> > > java.lang.InterruptedException
> > >   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> > >   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
> > >   at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> > >   at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> > >
> > > On Mon, May 16, 2016 at 6:07 PM, Billie Rinaldi <billie.rina...@gmail.com>
> > > wrote:
> > >
> > >> Manoj,
> > >>
> > >> Attachments don't come through on ASF mailing lists. What value are
> > >> you using for site.global.slider.allowed.ports?
> > >>
> > >> Billie
> > >>
> > >> On Mon, May 16, 2016 at 4:20 PM, Manoj Samel <manojsamelt...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Tim,
> > >> >
> > >> > Thanks for your reply.
> > >> >
> > >> > In my case, the slider AM itself fails to start. The app remains in
> > >> > accepted state and the container_001 log (the slider AM) errors. I
> > >> > have attached the container_001 slider.log below. It has all the
> > >> > configs, including app config (host names and some other info are
> > >> > altered).
> > >> >
> > >> > The specified port range is not being used on the cluster. Also, the
> > >> > error is intermittent, i.e. it sometimes works for a port range and
> > >> > at other times does not work for the same range. If I remove the
> > >> > allocated_port, it works fine.
> > >> >
> > >> > On Mon, May 16, 2016 at 2:41 PM, Tim I <t...@timisrael.com> wrote:
> > >> >
> > >> >> I'm pretty sure the PER_CONTAINER statement works as follows for
> > >> >> each instantiation of a particular service within a slider app:
> > >> >> 1) check the allowable port range
> > >> >> 2) look for an available port on the machine within the allowed
> > >> >>    port range
> > >> >> 3) launch with selected port
> > >> >>
> > >> >> In the case of Accumulo, if you're running two tservers on the same
> > >> >> box and have a range of ports that they can use, they will not
> > >> >> conflict. However, if you statically set the tserver's port and two
> > >> >> launch on the same machine, they will both try to use it and
> > >> >> conflict. One will eventually relaunch on a node with an available
> > >> >> port provided resources exist and you don't exceed your failure
> > >> >> threshold.
> > >> >>
> > >> >> If you are having an issue with your app master not launching, it
> > >> >> might be because you specified a port or port range that is fully
> > >> >> utilized by another application on the same box.
> > >> >>
> > >> >> Is your cluster idle or do you have other slider apps running? Do
> > >> >> you have more complete output of the logs and possibly the
> > >> >> appConfig that you can share? Are you sure it's the AM failing to
> > >> >> start and not a service within your slider app?
> > >> >>
> > >> >> Tim
> > >> >>
> > >> >> On May 16, 2016 4:02 PM, "Manoj Samel" <manojsamelt...@gmail.com>
> > >> >> wrote:
> > >> >>
> > >> >> Hello,
> > >> >>
> > >> >> When using the ALLOCATED_PORT clause, there is an option
> > >> >> "PER_CONTAINER".
> > >> >>
> > >> >> Can someone explain what the "PER_CONTAINER" option does? It says
> > >> >> keep port allocation private to container. What does that mean? If
> > >> >> multiple containers are chosen to run on the same host machine,
> > >> >> will this cause an issue?
> > >> >>
> > >> >> When using a specific port range using
> > >> >> site.global.slider.allowed.ports, I am getting frequent errors in
> > >> >> starting slider AM. The log says
> > >> >> *2016-05-14 20:39:29,236 [main] ERROR main.ServiceLauncher - No available ports found in configured range {}*
> > >> >> 2016-05-14 20:39:29,237 [main] INFO util.ExitUtil - Exiting with status 77
> > >> >>
> > >> >> The error is not hard; sometimes the same port range works fine. I
> > >> >> am wondering if this has anything to do with PER_CONTAINER ...
> > >> >>
> > >> >> This is on slider 0.80 on Hadoop 2.6 secured cluster ...
> > >> >>
> > >> >> Than
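
The three steps Tim describes earlier in the thread (check the allowed range, look for an available port on the machine, launch with the selected port) could be sketched roughly as follows. This is an illustration under assumptions, not Slider's implementation: PortProbe, isFree() and firstFreePort() are hypothetical names, and probing by briefly binding a java.net.ServerSocket is just one simple way to test availability.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper; not Slider's PortScanner implementation.
final class PortProbe {
    private PortProbe() {}

    // Returns true if a listening socket can be bound to the port right now.
    // The probe socket is closed again before returning.
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a BindException: the port is in use or not permitted.
            return false;
        }
    }

    // Walks the allowed ports in order and returns the first free one,
    // or an empty OptionalInt if none can be bound.
    static OptionalInt firstFreePort(List<Integer> allowedPorts) {
        for (int port : allowedPorts) {
            if (isFree(port)) {
                return OptionalInt.of(port);
            }
        }
        return OptionalInt.empty();
    }
}
```

A probe like this is inherently racy: another process can take the port between the check and the actual launch, so a caller would still need to handle a bind failure at startup. An empty result corresponds to the "No available ports found in configured range" situation in the log, which is also what an empty list of allowed ports would produce.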