The issue I am currently facing with Storm is the nimbus port being
incorrect in storm.yaml. My question is: in Slider, command ordering
enforces that supervisors don't start until nimbus has started, but at
what point are the markers for the nimbus host and port replaced with
the actual values from the nimbus service?

I am seeing, many times when nimbus takes a little longer to start, that
many supervisors end up with a nimbus port of 0 in their config, which
leads to failures when starting Storm workers.
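
For illustration only (the host value is a placeholder, not taken from a
real cluster), a bad supervisor's storm.yaml ends up with something
like:

    nimbus.host: "nimbus-host.example.com"
    nimbus.thrift.port: 0

where 0 should have been the port actually allocated to the nimbus
container.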

Another question I have: if nimbus restarts due to some issue after a
few hours, how would the new host and port information be propagated to
the already-running supervisors?

On Sat, Apr 11, 2015 at 8:19 AM, Gour Saha <gs...@hortonworks.com> wrote:

> There were no issues. The variable was renamed to be more user-friendly.
>
> -Gour
>
> On 4/10/15, 3:48 PM, "Nitin Aggarwal" <nitin3588.aggar...@gmail.com>
> wrote:
>
> >My mistake, we are running slider version 0.50. I believe these configs
> >were changed in the 0.60 version.
> >Also, were there any fixes around port allocation that address this
> >issue, so that I can back-port them to 0.50?
> >
> >On Wed, Apr 1, 2015 at 3:19 PM, Sumit Mohanty <smoha...@hortonworks.com>
> >wrote:
> >
> >> "${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}",
> >>
> >> should be changed to
> >>
> >> "${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}",
> >>
> >> As a reference -
> >>
> >>
> >> https://github.com/apache/incubator-slider/blob/develop/app-packages/storm/appConfig-default.json
> >>
> >> I think even DEF_ZK_PATH should be DEFAULT_ZK_PATH
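> >>
> >> For example (a sketch only, using the markers from the appConfig you
> >> posted below), the affected entries would become:
> >>
> >>     "site.storm-site.logviewer.port" :
> >>         "${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}",
> >>     "site.storm-site.supervisor.slots.ports" :
> >>         "[${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}]",
> >>     "site.storm-site.storm.zookeeper.root" : "${DEFAULT_ZK_PATH}",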
> >> ________________________________________
> >> From: Nitin Aggarwal <nitin3588.aggar...@gmail.com>
> >> Sent: Monday, March 30, 2015 10:38 AM
> >> To: dev@slider.incubator.apache.org
> >> Subject: Re: Invalid port 0 for storm instances
> >>
> >> Yes, the storm package is built internally.
> >>
> >> App configuration:
> >>
> >> "appConf" :{
> >>   "schema" : "http://example.org/specification/v2.0.0";,
> >>   "metadata" : { },
> >>   "global" : {
> >>     "agent.conf" : "/apps/slider/agent/conf/agent.ini",
> >>     "application.def" :
> >>"/apps/slider/app-packages/storm/storm_v0_9_4.zip",
> >>     "config_types" : "storm-site",
> >>     "create.default.zookeeper.node" : "true",
> >>     "env.MALLOC_ARENA_MAX" : "4",
> >>     "java_home" : "/usr/java/jdk1.7.0_40",
> >>     "package_list" : "files/storm-0.9.4-SNAPSHOT-bin.tar.gz",
> >>     "site.fs.default.name" : "hdfs://XXXXX/",
> >>     "site.fs.defaultFS" : "hdfs://XXXXXX:8020/",
> >>     "site.global.app_install_dir" : "${AGENT_WORK_ROOT}/app/install",
> >>     "site.global.app_log_dir" : "/srv/var/hadoop/logs/deathstar",
> >>     "site.global.app_pid_dir" : "${AGENT_WORK_ROOT}/app/run",
> >>     "site.global.app_root" :
> >> "${AGENT_WORK_ROOT}/app/install/apache-storm-0.9.4-SNAPSHOT",
> >>     "site.global.app_user" : "yarn",
> >>     "site.global.ganglia_enabled" : "false",
> >>     "site.global.ganglia_server_host" : "${NN_HOST}",
> >>     "site.global.ganglia_server_id" : "Application2",
> >>     "site.global.ganglia_server_port" : "8668",
> >>     "site.global.hbase_instance_name" : "XXXXXX",
> >>     "site.global.opentsdb_server_host" : "XXXXX",
> >>     "site.global.opentsdb_server_port" : "4242",
> >>     "site.global.rest_api_admin_port" :
> >>"${STORM_REST_API.ALLOCATED_PORT}",
> >>     "site.global.rest_api_port" : "${STORM_REST_API.ALLOCATED_PORT}",
> >>     "site.global.security_enabled" : "false",
> >>     "site.global.storm_instance_name" : "XXXXX",
> >>     "site.global.user_group" : "hadoop",
> >>     "site.storm-site.dev.zookeeper.path" :
> >> "${AGENT_WORK_ROOT}/app/tmp/dev-storm-zookeeper",
> >>     "site.storm-site.drpc.childopts" : "-Xmx768m",
> >>     "site.storm-site.drpc.invocations.port" : "0",
> >>     "site.storm-site.drpc.port" : "0",
> >>     "site.storm-site.drpc.queue.size" : "128",
> >>     "site.storm-site.drpc.request.timeout.secs" : "600",
> >>     "site.storm-site.drpc.worker.threads" : "64",
> >>     "site.storm-site.java.library.path" :
> >>
> >>
> >>"/etc/hadoop/conf:/usr/lib/hadoop/lib/native:/usr/local/lib:/opt/local/li
> >>b:/usr/lib",
> >>     "site.storm-site.logviewer.appender.name" : "A1",
> >>     "site.storm-site.logviewer.childopts" : "-Xmx128m",
> >>     "site.storm-site.logviewer.port" :
> >> "${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}",
> >>     "site.storm-site.nimbus.childopts" : "-Xmx1024m",
> >>     "site.storm-site.nimbus.cleanup.inbox.freq.secs" : "600",
> >>     "site.storm-site.nimbus.file.copy.expiration.secs" : "600",
> >>     "site.storm-site.nimbus.host" : "${NIMBUS_HOST}",
> >>     "site.storm-site.nimbus.inbox.jar.expiration.secs" : "3600",
> >>     "site.storm-site.nimbus.monitor.freq.secs" : "10",
> >>     "site.storm-site.nimbus.reassign" : "true",
> >>     "site.storm-site.nimbus.supervisor.timeout.secs" : "60",
> >>     "site.storm-site.nimbus.task.launch.secs" : "120",
> >>     "site.storm-site.nimbus.task.timeout.secs" : "5",
> >>     "site.storm-site.nimbus.thrift.max_buffer_size" : "1048576",
> >>     "site.storm-site.nimbus.thrift.port" : "${NIMBUS.ALLOCATED_PORT}",
> >>     "site.storm-site.nimbus.topology.validator" :
> >> "backtype.storm.nimbus.DefaultTopologyValidator",
> >>     "site.storm-site.storm.cluster.mode" : "distributed",
> >>     "site.storm-site.storm.local.dir" :
> >>"${AGENT_WORK_ROOT}/app/tmp/storm",
> >>     "site.storm-site.storm.local.mode.zmq" : "false",
> >>     "site.storm-site.storm.messaging.netty.buffer_size" : "5242880",
> >>     "site.storm-site.storm.messaging.netty.client_worker_threads" : "1",
> >>     "site.storm-site.storm.messaging.netty.max_retries" : "300",
> >>     "site.storm-site.storm.messaging.netty.max_wait_ms" : "1000",
> >>     "site.storm-site.storm.messaging.netty.min_wait_ms" : "100",
> >>     "site.storm-site.storm.messaging.netty.server_worker_threads" : "1",
> >>     "site.storm-site.storm.messaging.transport" :
> >> "backtype.storm.messaging.netty.Context",
> >>     "site.storm-site.storm.thrift.transport" :
> >> "backtype.storm.security.auth.SimpleTransportPlugin",
> >>     "site.storm-site.storm.zookeeper.connection.timeout" : "15000",
> >>     "site.storm-site.storm.zookeeper.port" : "2181",
> >>     "site.storm-site.storm.zookeeper.retry.interval" : "1000",
> >>     "site.storm-site.storm.zookeeper.retry.intervalceiling.millis" :
> >> "30000",
> >>     "site.storm-site.storm.zookeeper.retry.times" : "5",
> >>     "site.storm-site.storm.zookeeper.root" : "${DEF_ZK_PATH}",
> >>     "site.storm-site.storm.zookeeper.servers" : "['${ZK_HOST}']",
> >>     "site.storm-site.storm.zookeeper.session.timeout" : "20000",
> >>     "site.storm-site.supervisor.childopts" : "-Xms512m -Xmx512m
> >> -XX:PermSize=64m -XX:MaxPermSize=64m",
> >>     "site.storm-site.supervisor.enable" : "true",
> >>     "site.storm-site.supervisor.heartbeat.frequency.secs" : "5",
> >>     "site.storm-site.supervisor.monitor.frequency.secs" : "3",
> >>     "site.storm-site.supervisor.slots.ports" :
> >> "[${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}]",
> >>     "site.storm-site.supervisor.worker.start.timeout.secs" : "120",
> >>     "site.storm-site.supervisor.worker.timeout.secs" : "5",
> >>     "site.storm-site.task.heartbeat.frequency.secs" : "3",
> >>     "site.storm-site.task.refresh.poll.secs" : "10",
> >>     "site.storm-site.topology.acker.executors" : "null",
> >>     "site.storm-site.topology.builtin.metrics.bucket.size.secs" : "60",
> >>     "site.storm-site.topology.debug" : "false",
> >>     "site.storm-site.topology.disruptor.wait.strategy" :
> >> "com.lmax.disruptor.BlockingWaitStrategy",
> >>     "site.storm-site.topology.enable.message.timeouts" : "true",
> >>     "site.storm-site.topology.error.throttle.interval.secs" : "10",
> >>     "site.storm-site.topology.executor.receive.buffer.size" : "1024",
> >>     "site.storm-site.topology.executor.send.buffer.size" : "1024",
> >>     "site.storm-site.topology.fall.back.on.java.serialization" : "true",
> >>     "site.storm-site.topology.kryo.factory" :
> >> "backtype.storm.serialization.DefaultKryoFactory",
> >>     "site.storm-site.topology.max.error.report.per.interval" : "5",
> >>     "site.storm-site.topology.max.spout.pending" : "null",
> >>     "site.storm-site.topology.max.task.parallelism" : "null",
> >>     "site.storm-site.topology.message.timeout.secs" : "30",
> >>     "site.storm-site.topology.optimize" : "true",
> >>     "site.storm-site.topology.receiver.buffer.size" : "8",
> >>     "site.storm-site.topology.skip.missing.kryo.registrations" :
> >>"false",
> >>     "site.storm-site.topology.sleep.spout.wait.strategy.time.ms" : "1",
> >>     "site.storm-site.topology.spout.wait.strategy" :
> >> "backtype.storm.spout.SleepSpoutWaitStrategy",
> >>     "site.storm-site.topology.state.synchronization.timeout.secs" :
> >>"60",
> >>     "site.storm-site.topology.stats.sample.rate" : "0.05",
> >>     "site.storm-site.topology.tick.tuple.freq.secs" : "null",
> >>     "site.storm-site.topology.transfer.buffer.size" : "1024",
> >>     "site.storm-site.topology.trident.batch.emit.interval.millis" :
> >>"500",
> >>     "site.storm-site.topology.tuple.serializer" :
> >> "backtype.storm.serialization.types.ListDelegateSerializer",
> >>     "site.storm-site.topology.worker.childopts" : "null",
> >>     "site.storm-site.topology.worker.shared.thread.pool.size" : "4",
> >>     "site.storm-site.topology.workers" : "1",
> >>     "site.storm-site.transactional.zookeeper.port" : "null",
> >>     "site.storm-site.transactional.zookeeper.root" : "/transactional",
> >>     "site.storm-site.transactional.zookeeper.servers" : "null",
> >>     "site.storm-site.ui.port" : "${STORM_UI_SERVER.ALLOCATED_PORT}",
> >>     "site.storm-site.worker.childopts" : "-Xms2048m -Xmx2048m
> >> -XX:PermSize=64m -XX:MaxPermSize=64m",
> >>     "site.storm-site.worker.heartbeat.frequency.secs" : "1",
> >>     "site.storm-site.zmq.hwm" : "0",
> >>     "site.storm-site.zmq.linger.millis" : "5000",
> >>     "site.storm-site.zmq.threads" : "1"
> >>
> >>   },
> >>
> >> Also, it is working for some supervisors, which hints that the
> >> parameter should be fine. I am not very familiar with the Slider
> >> codebase, but do we rely on nimbus being up and running before we
> >> install the supervisor configuration, since these markers have to be
> >> replaced with actual host and port values? Or do supervisor agents
> >> poll these config parameters later from some service where nimbus
> >> publishes them after it starts?
> >>
> >> Thanks
> >> Nitin
> >>
> >>
> >> On Mon, Mar 30, 2015 at 10:28 AM, Jon Maron <jma...@hortonworks.com>
> >> wrote:
> >>
> >> > We transitioned from "ALLOCATED_PORT" to "PER_CONTAINER" - I just
> >> > can't recall whether that was in the 0.60 timeframe?
> >> >
> >> > -- Jon
> >> >
> >> > > On Mar 30, 2015, at 1:21 PM, Sumit Mohanty <sumit.moha...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > Did you create the storm package yourself? Can you share the
> >> > appConfig.json
> >> > > you are using?
> >> > >
> >> > > On Mon, Mar 30, 2015 at 10:07 AM, Nitin Aggarwal <
> >> > > nitin3588.aggar...@gmail.com> wrote:
> >> > >
> >> > >> It's a typo just in the mail. It is replaced correctly in some of
> >> > >> the supervisors' configurations.
> >> > >> I am running slider version 0.60.
> >> > >>
> >> > >> On Mon, Mar 30, 2015 at 10:04 AM, Sumit Mohanty
> >> > >> <smoha...@hortonworks.com> wrote:
> >> > >>
> >> > >>> If the exact text was "ALOOCATED_PORT" then replacing it with
> >> > >>> "ALLOCATED_PORT" might solve it.
> >> > >>>
> >> > >>> Otherwise, what version of Slider are you using?
> >> > >>> ________________________________________
> >> > >>> From: Nitin Aggarwal <nitin3588.aggar...@gmail.com>
> >> > >>> Sent: Monday, March 30, 2015 9:51 AM
> >> > >>> To: dev@slider.incubator.apache.org
> >> > >>> Subject: Invalid port 0 for storm instances
> >> > >>>
> >> > >>> Hi,
> >> > >>>
> >> > >>> I am trying to run a storm cluster with 200 instances, using
> >> > >>> Slider. While submitting topologies to the cluster, some of the
> >> > >>> workers failed to start due to the following error:
> >> > >>>
> >> > >>> 2015-03-27T13:27:40.168-0400 b.s.event [ERROR] Error when processing event
> >> > >>> java.lang.IllegalArgumentException: invalid port: 0
> >> > >>> at backtype.storm.security.auth.ThriftClient.<init>(ThriftClient.java:54)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:47)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:43)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:36)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.utils.Utils.downloadFromMaster(Utils.java:253)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.daemon.supervisor$fn__6900.invoke(supervisor.clj:482)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at clojure.lang.MultiFn.invoke(MultiFn.java:241) ~[clojure-1.5.1.jar:na]
> >> > >>> at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6820.invoke(supervisor.clj:371)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at backtype.storm.event$event_manager$fn__2825.invoke(event.clj:40)
> >> > >>> ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> >> > >>> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> >> > >>> at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
> >> > >>>
> >> > >>> I found that some of the supervisors don't have the correct
> >> > >>> configuration. Their configuration still has markers like
> >> > >>> ${NIMBUS_HOST}, ${NIMBUS.ALOOCATED_PORT}.
> >> > >>>
> >> > >>> Are these markers expected in the supervisor storm configuration?
> >> > >>>
> >> > >>> Thanks
> >> > >>> Nitin
> >> > >>>
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > thanks
> >> > > Sumit
> >> >
> >> >
> >>
>
>
