Filed https://issues.apache.org/jira/browse/SLIDER-1124
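
For reference, here is a rough sketch of the kind of strict parsing discussed
in the quoted thread below. It is not the PortScanner code and not the
SLIDER-1124 patch: the class name StrictPortRangeParser, the two patterns, and
the error message are assumptions made up for illustration. The idea is only
that every token must be either a single number or a "-" range, and anything
else (such as a value containing an en dash, U+2013) fails right away with the
offending text in the message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not the actual Slider PortScanner. */
final class StrictPortRangeParser {
  // Assumed patterns: matches() requires the whole token to fit the pattern.
  private static final Pattern SINGLE_NUMBER = Pattern.compile("\\d+");
  private static final Pattern NUMBER_RANGE =
      Pattern.compile("(\\d+)\\s*-\\s*(\\d+)");

  private StrictPortRangeParser() {}

  /** Parses a spec such as "32201-33100" or "8080, 9000-9010" into ports. */
  static List<Integer> parse(String spec) {
    if (spec == null || spec.trim().isEmpty()) {
      throw new IllegalArgumentException("Port range is empty");
    }
    List<Integer> ports = new ArrayList<>();
    for (String token : spec.split(",")) {
      String range = token.trim();
      Matcher m = SINGLE_NUMBER.matcher(range);
      if (m.matches()) {
        ports.add(Integer.parseInt(range));
        continue;
      }
      m = NUMBER_RANGE.matcher(range);
      if (m.matches()) {
        int start = Integer.parseInt(m.group(1));
        int end = Integer.parseInt(m.group(2));
        if (start > end) {
          throw new IllegalArgumentException(
              "Port range start is greater than end: \"" + range + "\"");
        }
        for (int port = start; port <= end; port++) {
          ports.add(port);
        }
        continue;
      }
      // The "else" branch discussed in the thread: fail fast, quoting the input.
      throw new IllegalArgumentException(
          "Invalid or unparsable port range \"" + range
              + "\" in \"" + spec + "\"");
    }
    return ports;
  }
}
```

With something along these lines, "32201-32203" yields three ports, while the
en dash variant throws an IllegalArgumentException instead of silently
producing an empty set.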

On Thu, May 19, 2016 at 7:18 AM, Billie Rinaldi <billie.rina...@gmail.com>
wrote:

> I agree, it would be good to throw an Exception if the format of the port
> range is bad.
>
> On Wed, May 18, 2016 at 7:15 PM, Manoj Samel <manojsamelt...@gmail.com>
> wrote:
>
> > Found an issue in the appConfig.json. The JSON was updated using an IDE,
> > and while at first glance it looks like "32201-33100", it was really
> > "32201–33100". The character in the second case is not a "-" but actually
> > three characters that together appear somewhat like it (but it's wider and
> > lower than "-").
> >
> > So this is neither a "," separated list nor a "-" range as the code
> > expects, and it errors out.
> >
> > It would be useful if such a "bad" range were caught earlier with a clearer
> > message, like "invalid or unparsable port range specified".
> >
> > Looking at the code
> >
> > SliderAppMaster.java buildPortScanner() reads the
> > key KEY_ALLOWED_PORT_RANGE and passes the associated value
> > to portScanner.setPortRange().
> >
> > In PortScanner.java setPortRange(), it first tries to split on ",", or else
> > tries to split on "-". However, there is no "else" part if it does not
> > find the "-" pattern (which will happen in the above case). Since there is
> > no else part, no exception is thrown at this point
> > and this.remainingPortsToCheck gets set to an empty set, resulting in a more
> > obscure error later in getAvailablePortViaPortArray().
> >
> > I think it would be good to have an "else" part added to the range matchers
> > below and an exception with the input text thrown at that point, so the
> > misconfigured value will be obvious.
> >
> >       Matcher m = SINGLE_NUMBER.matcher(range.trim());
> >       if (m.find()) {
> >         inputPorts.add(Integer.parseInt(m.group()));
> >       } else {
> >         m = NUMBER_RANGE.matcher(range.trim());
> >         if (m.find()) {
> >         } // else is missing ... add one that throws an exception?
> >
> > Thoughts ?
> >
> > Manoj
> >
> >
> > On Mon, May 16, 2016 at 6:45 PM, Manoj Samel <manojsamelt...@gmail.com>
> > wrote:
> >
> > > Here is slider.log for the Slider AM. Note the port range does not always
> > > fail - it does work occasionally!
> > >
> > >
> > > "appConf" :{
> > >   "schema" : "http://example.org/specification/v2.0.0",
> > >   "metadata" : { },
> > >   "global" : {
> > >     "site.global.qaprod___gms.tenant" : "qaprod___gms",
> > >     "site.global.memory_val" : "200m",
> > >     "site.global.service_mgmt_log_dir" : "/data/processlauncher/log",
> > >     "site.global.xms_val" : "128m",
> > >     "site.global.slider.allowed.ports" : "32201–33100",
> > >     "java_home" : "/usr/java/latest",
> > >     "site.global.xmx_val" : "256m",
> > >     "zookeeper.quorum" : "host1:2181,host2:2181,host3:2181",
> > >     "site.global.qaprod___gms.principal" : "qaprod.gms",
> > >     "env.MALLOC_ARENA_MAX" : "4",
> > >     "site.global.qaprod___gms.http_port" :
> > > "${qaprod___gms.ALLOCATED_PORT}{PER_CONTAINER}",
> > >     "site.global.bas_host" : "host4",
> > >     "site.global.app_version" : "1.0.0",
> > >     "site.global.bas_port" : "9009",
> > >     "application.def" : ".slider/package/spas/spas-1.0.0.zip",
> > >     "site.global.qaprod___gms.keytabfile" :
> > > "/etc/hadoop/conf/qaprod.gms.keytab",
> > >     "zookeeper.hosts" : "host1,host2,host3",
> > >     "zookeeper.path" : "/services/slider/users/slideradmin/spas",
> > >     "site.global.additional_cp" : "/usr/lib/hadoop/lib/*",
> > >     "site.global.qaprod___gms.https_port" :
> > > "${qaprod___gms.ALLOCATED_PORT}{PER_CONTAINER}",
> > >     "site.fs.defaultFS" : "hdfs://bdsqaprod2",
> > >     "site.dfs.namenode.kerberos.principal" : "hdfs/_HOST@BIGDATA",
> > >     "site.fs.default.name" : "hdfs://bdsqaprod2"
> > >   },
> > >   "credentials" : { },
> > >   "components" : {
> > >     "slider-appmaster" : {
> > >       "jvm.heapsize" : "256M",
> > >       "slider.keytab.principal.name" : "slideradmin",
> > >       "slider.am.keytab.local.path" :
> > > "/etc/hadoop/conf/slideradmin.keytab"
> > >     },
> > >     "qaprod___gms" : { }
> > >   }
> > > }}:
> > > 2016-05-16 19:24:51,456 [main] INFO  appmaster.SliderAppMaster - Conf dir
> > > /hadoop/disk4/yarn/local/usercache/slideradmin/appcache/application_1462834537586_7708/container_e15_1462834537586_7708_09_000001/propagatedconf
> > > does not exist.
> > > 2016-05-16 19:24:51,456 [main] INFO  appmaster.SliderAppMaster - Parent dir
> > > /hadoop/disk4/yarn/local/usercache/slideradmin/appcache/application_1462834537586_7708/container_e15_1462834537586_7708_09_000001:
> > > tmp
> > > launch_container.sh
> > > container_tokens
> > > lib
> > > confdir
> > > expandedarchive
> > >
> > > 2016-05-16 19:24:51,457 [main] INFO  appmaster.SliderAppMaster - Cluster provider type is agent
> > > 2016-05-16 19:24:51,500 [main] INFO  appmaster.SliderAppMaster - RM is at host9/10.52.88.218:23130
> > > 2016-05-16 19:24:51,553 [main] INFO  appmaster.SliderAppMaster - AM for ID 7708
> > > 2016-05-16 19:24:51,596 [main] INFO  impl.NMClientAsyncImpl - Upper bound of the thread pool size is 500
> > > 2016-05-16 19:24:51,597 [main] INFO  impl.ContainerManagementProtocolProxy - yarn.client.max-cached-nodemanagers-proxies : 0
> > > 2016-05-16 19:24:51,654 [main] ERROR main.ServiceLauncher - No available ports found in configured range {}
> > > 2016-05-16 19:24:51,656 [main] INFO  util.ExitUtil - Exiting with status 77
> > > 2016-05-16 19:24:51,658 [Thread-1] INFO  appmaster.SliderAppMaster - Process has exited with exit code 0 mapped to 0 -ignoring
> > > 2016-05-16 19:24:51,659 [AMRM Callback Handler Thread] INFO  impl.AMRMClientAsyncImpl - Interrupted while waiting for queue
> > > java.lang.InterruptedException
> > >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
> > >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
> > >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> > >         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> > >
> > >
> > >
> > > On Mon, May 16, 2016 at 6:07 PM, Billie Rinaldi <billie.rina...@gmail.com>
> > > wrote:
> > >
> > >> Manoj,
> > >>
> > >> Attachments don't come through on ASF mailing lists.  What value are you
> > >> using for site.global.slider.allowed.ports?
> > >>
> > >> Billie
> > >>
> > >> On Mon, May 16, 2016 at 4:20 PM, Manoj Samel <manojsamelt...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Tim,
> > >> >
> > >> > Thanks for your reply
> > >> >
> > >> > In my case, the slider AM itself fails to start. The app remains in
> > >> > accepted state and the container_001 log (the slider AM) errors. I have
> > >> > attached the container_001 slider.log below. It has all the configs,
> > >> > including app config (host names and some other info are altered).
> > >> >
> > >> > The specified port range is not being used on the cluster. Also, the error
> > >> > is intermittent, i.e. it works for a port range sometimes and does not
> > >> > work for the same range at other times. If I remove the allocated_port, it
> > >> > works fine.
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > On Mon, May 16, 2016 at 2:41 PM, Tim I <t...@timisrael.com> wrote:
> > >> >
> > >> >> I'm pretty sure the PER_CONTAINER statement works as follows for each
> > >> >> instantiation of a particular service within a slider app:
> > >> >> 1) check the allowable port range
> > >> >> 2) look for an available port on the machine within the allowed port range
> > >> >> 3) launch with selected port
> > >> >>
> > >> >> In the case of Accumulo, if you're running two tservers on the same box
> > >> >> and have a range of ports that they can use, they will not conflict.
> > >> >> However, if you statically set the tserver's port and two launch on the same
> > >> >> machine, they will both try to use it and conflict.  One will eventually
> > >> >> relaunch on a node with an available port provided resources exist and you
> > >> >> don't exceed your failure threshold.
> > >> >>
> > >> >> If you are having an issue with your app master not launching, it might be
> > >> >> because you specified a port or port range that is fully utilized by
> > >> >> another application on the same box.
> > >> >>
> > >> >> Is your cluster idle or do you have other slider apps running?  Do you
> > >> >> have more complete output of the logs and possibly the appConfig that you
> > >> >> can share?  Are you sure it's the AM failing to start and not a service
> > >> >> within your slider app?
> > >> >>
> > >> >> Tim
> > >> >> On May 16, 2016 4:02 PM, "Manoj Samel" <manojsamelt...@gmail.com> wrote:
> > >> >>
> > >> >> Hello,
> > >> >>
> > >> >> When using the ALLOCATED_PORT clause, there is an option "PER_CONTAINER".
> > >> >>
> > >> >> Can someone explain what the "PER_CONTAINER" option does? It says keep
> > >> >> port allocation private to the container. What does that mean? If multiple
> > >> >> containers are placed on the same host machine, will this cause an issue?
> > >> >>
> > >> >> When using a specific port range using site.global.slider.allowed.ports, I
> > >> >> am getting frequent errors in starting the slider AM. The log says
> > >> >> *2016-05-14 20:39:29,236 [main] ERROR main.ServiceLauncher - No available
> > >> >> ports found in configured range {}*
> > >> >> 2016-05-14 20:39:29,237 [main] INFO  util.ExitUtil - Exiting with status 77
> > >> >>
> > >> >> The error is not a hard failure; sometimes the same port range works fine.
> > >> >> I am wondering if this has anything to do with PER_CONTAINER ...
> > >> >>
> > >> >> This is on slider 0.80 on Hadoop 2.6 secured cluster ...
> > >> >>
> > >> >> Than
> > >> >>
> > >> >
> > >> >
> > >>
> > >
> > >
> >
>
