My mistake, we are running Slider version 0.50; I believe these configs were changed in version 0.60. Also, were there any fixes around port allocation that could address this issue, so that I can back-port them to 0.50?
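For concreteness, the change I would back-port looks like this in our appConfig.json, going by the reply below (a sketch only; whether 0.50 can be taught to honor the {PER_CONTAINER} marker is exactly what I am asking about):

    "site.storm-site.logviewer.port" : "${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}",
    "site.storm-site.supervisor.slots.ports" : "[${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}]",
    "site.storm-site.storm.zookeeper.root" : "${DEFAULT_ZK_PATH}"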
On Wed, Apr 1, 2015 at 3:19 PM, Sumit Mohanty <smoha...@hortonworks.com> wrote:

> "${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}",
>
> should be changed to
>
> "${SUPERVISOR.ALLOCATED_PORT}{PER_CONTAINER}",
>
> As a reference -
> https://github.com/apache/incubator-slider/blob/develop/app-packages/storm/appConfig-default.json
>
> I think even DEF_ZK_PATH should be DEFAULT_ZK_PATH
> ________________________________________
> From: Nitin Aggarwal <nitin3588.aggar...@gmail.com>
> Sent: Monday, March 30, 2015 10:38 AM
> To: dev@slider.incubator.apache.org
> Subject: Re: Invalid port 0 for storm instances
>
> Yes, the storm package is built internally.
>
> App configuration:
>
> "appConf" : {
>   "schema" : "http://example.org/specification/v2.0.0",
>   "metadata" : { },
>   "global" : {
>     "agent.conf" : "/apps/slider/agent/conf/agent.ini",
>     "application.def" : "/apps/slider/app-packages/storm/storm_v0_9_4.zip",
>     "config_types" : "storm-site",
>     "create.default.zookeeper.node" : "true",
>     "env.MALLOC_ARENA_MAX" : "4",
>     "java_home" : "/usr/java/jdk1.7.0_40",
>     "package_list" : "files/storm-0.9.4-SNAPSHOT-bin.tar.gz",
>     "site.fs.default.name" : "hdfs://XXXXX/",
>     "site.fs.defaultFS" : "hdfs://XXXXXX:8020/",
>     "site.global.app_install_dir" : "${AGENT_WORK_ROOT}/app/install",
>     "site.global.app_log_dir" : "/srv/var/hadoop/logs/deathstar",
>     "site.global.app_pid_dir" : "${AGENT_WORK_ROOT}/app/run",
>     "site.global.app_root" : "${AGENT_WORK_ROOT}/app/install/apache-storm-0.9.4-SNAPSHOT",
>     "site.global.app_user" : "yarn",
>     "site.global.ganglia_enabled" : "false",
>     "site.global.ganglia_server_host" : "${NN_HOST}",
>     "site.global.ganglia_server_id" : "Application2",
>     "site.global.ganglia_server_port" : "8668",
>     "site.global.hbase_instance_name" : "XXXXXX",
>     "site.global.opentsdb_server_host" : "XXXXX",
>     "site.global.opentsdb_server_port" : "4242",
>     "site.global.rest_api_admin_port" : "${STORM_REST_API.ALLOCATED_PORT}",
>     "site.global.rest_api_port" : "${STORM_REST_API.ALLOCATED_PORT}",
>     "site.global.security_enabled" : "false",
>     "site.global.storm_instance_name" : "XXXXX",
>     "site.global.user_group" : "hadoop",
>     "site.storm-site.dev.zookeeper.path" : "${AGENT_WORK_ROOT}/app/tmp/dev-storm-zookeeper",
>     "site.storm-site.drpc.childopts" : "-Xmx768m",
>     "site.storm-site.drpc.invocations.port" : "0",
>     "site.storm-site.drpc.port" : "0",
>     "site.storm-site.drpc.queue.size" : "128",
>     "site.storm-site.drpc.request.timeout.secs" : "600",
>     "site.storm-site.drpc.worker.threads" : "64",
>     "site.storm-site.java.library.path" : "/etc/hadoop/conf:/usr/lib/hadoop/lib/native:/usr/local/lib:/opt/local/lib:/usr/lib",
>     "site.storm-site.logviewer.appender.name" : "A1",
>     "site.storm-site.logviewer.childopts" : "-Xmx128m",
>     "site.storm-site.logviewer.port" : "${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}",
>     "site.storm-site.nimbus.childopts" : "-Xmx1024m",
>     "site.storm-site.nimbus.cleanup.inbox.freq.secs" : "600",
>     "site.storm-site.nimbus.file.copy.expiration.secs" : "600",
>     "site.storm-site.nimbus.host" : "${NIMBUS_HOST}",
>     "site.storm-site.nimbus.inbox.jar.expiration.secs" : "3600",
>     "site.storm-site.nimbus.monitor.freq.secs" : "10",
>     "site.storm-site.nimbus.reassign" : "true",
>     "site.storm-site.nimbus.supervisor.timeout.secs" : "60",
>     "site.storm-site.nimbus.task.launch.secs" : "120",
>     "site.storm-site.nimbus.task.timeout.secs" : "5",
>     "site.storm-site.nimbus.thrift.max_buffer_size" : "1048576",
>     "site.storm-site.nimbus.thrift.port" : "${NIMBUS.ALLOCATED_PORT}",
"site.storm-site.nimbus.topology.validator" : > "backtype.storm.nimbus.DefaultTopologyValidator", > "site.storm-site.storm.cluster.mode" : "distributed", > "site.storm-site.storm.local.dir" : "${AGENT_WORK_ROOT}/app/tmp/storm", > "site.storm-site.storm.local.mode.zmq" : "false", > "site.storm-site.storm.messaging.netty.buffer_size" : "5242880", > "site.storm-site.storm.messaging.netty.client_worker_threads" : "1", > "site.storm-site.storm.messaging.netty.max_retries" : "300", > "site.storm-site.storm.messaging.netty.max_wait_ms" : "1000", > "site.storm-site.storm.messaging.netty.min_wait_ms" : "100", > "site.storm-site.storm.messaging.netty.server_worker_threads" : "1", > "site.storm-site.storm.messaging.transport" : > "backtype.storm.messaging.netty.Context", > "site.storm-site.storm.thrift.transport" : > "backtype.storm.security.auth.SimpleTransportPlugin", > "site.storm-site.storm.zookeeper.connection.timeout" : "15000", > "site.storm-site.storm.zookeeper.port" : "2181", > "site.storm-site.storm.zookeeper.retry.interval" : "1000", > "site.storm-site.storm.zookeeper.retry.intervalceiling.millis" : > "30000", > "site.storm-site.storm.zookeeper.retry.times" : "5", > "site.storm-site.storm.zookeeper.root" : "${DEF_ZK_PATH}", > "site.storm-site.storm.zookeeper.servers" : "['${ZK_HOST}']", > "site.storm-site.storm.zookeeper.session.timeout" : "20000", > "site.storm-site.supervisor.childopts" : "-Xms512m -Xmx512m > -XX:PermSize=64m -XX:MaxPermSize=64m", > "site.storm-site.supervisor.enable" : "true", > "site.storm-site.supervisor.heartbeat.frequency.secs" : "5", > "site.storm-site.supervisor.monitor.frequency.secs" : "3", > "site.storm-site.supervisor.slots.ports" : > "[${SUPERVISOR.ALLOCATED_PORT}{DO_NOT_PROPAGATE}]", > "site.storm-site.supervisor.worker.start.timeout.secs" : "120", > "site.storm-site.supervisor.worker.timeout.secs" : "5", > "site.storm-site.task.heartbeat.frequency.secs" : "3", > "site.storm-site.task.refresh.poll.secs" : "10", > "site.storm-site.topology.acker.executors" : "null", > "site.storm-site.topology.builtin.metrics.bucket.size.secs" : "60", > "site.storm-site.topology.debug" : "false", > "site.storm-site.topology.disruptor.wait.strategy" : > "com.lmax.disruptor.BlockingWaitStrategy", > "site.storm-site.topology.enable.message.timeouts" : "true", > "site.storm-site.topology.error.throttle.interval.secs" : "10", > "site.storm-site.topology.executor.receive.buffer.size" : "1024", > "site.storm-site.topology.executor.send.buffer.size" : "1024", > "site.storm-site.topology.fall.back.on.java.serialization" : "true", > "site.storm-site.topology.kryo.factory" : > "backtype.storm.serialization.DefaultKryoFactory", > "site.storm-site.topology.max.error.report.per.interval" : "5", > "site.storm-site.topology.max.spout.pending" : "null", > "site.storm-site.topology.max.task.parallelism" : "null", > "site.storm-site.topology.message.timeout.secs" : "30", > "site.storm-site.topology.optimize" : "true", > "site.storm-site.topology.receiver.buffer.size" : "8", > "site.storm-site.topology.skip.missing.kryo.registrations" : "false", > "site.storm-site.topology.sleep.spout.wait.strategy.time.ms" : "1", > "site.storm-site.topology.spout.wait.strategy" : > "backtype.storm.spout.SleepSpoutWaitStrategy", > "site.storm-site.topology.state.synchronization.timeout.secs" : "60", > "site.storm-site.topology.stats.sample.rate" : "0.05", > "site.storm-site.topology.tick.tuple.freq.secs" : "null", > "site.storm-site.topology.transfer.buffer.size" : "1024", > 
"site.storm-site.topology.trident.batch.emit.interval.millis" : "500", > "site.storm-site.topology.tuple.serializer" : > "backtype.storm.serialization.types.ListDelegateSerializer", > "site.storm-site.topology.worker.childopts" : "null", > "site.storm-site.topology.worker.shared.thread.pool.size" : "4", > "site.storm-site.topology.workers" : "1", > "site.storm-site.transactional.zookeeper.port" : "null", > "site.storm-site.transactional.zookeeper.root" : "/transactional", > "site.storm-site.transactional.zookeeper.servers" : "null", > "site.storm-site.ui.port" : "${STORM_UI_SERVER.ALLOCATED_PORT}", > "site.storm-site.worker.childopts" : "-Xms2048m -Xmx2048m > -XX:PermSize=64m -XX:MaxPermSize=64m", > "site.storm-site.worker.heartbeat.frequency.secs" : "1", > "site.storm-site.zmq.hwm" : "0", > "site.storm-site.zmq.linger.millis" : "5000", > "site.storm-site.zmq.threads" : "1" > > }, > > Also, it is working for some supervisors, which hints that parameter should > be fine. I am not a lot familiar with slider codebase, but do we rely that > nimbus should be up and running before we install supervisors > configuration, as we have to replace these markers with actual host and > port values ? Or supervisor agents poll these config parameters even later > from some service, where nimbus published it after being started? > > Thanks > Nitin > > > On Mon, Mar 30, 2015 at 10:28 AM, Jon Maron <jma...@hortonworks.com> > wrote: > > > We transitioned from “ALLOCATED_PORT” to “PER_CONTAINER” - I just can’t > > recall whether that was in the 0.60 timeframe? > > > > — Jon > > > > > On Mar 30, 2015, at 1:21 PM, Sumit Mohanty <sumit.moha...@gmail.com> > > wrote: > > > > > > Did you create the storm package yourself? Can you share the > > appConfig.json > > > you are using? > > > > > > On Mon, Mar 30, 2015 at 10:07 AM, Nitin Aggarwal < > > > nitin3588.aggar...@gmail.com> wrote: > > > > > >> It's a typo just in the mail. It is replaced correctly for some of the > > >> supervisors configuration. > > >> I am running slider version 0.60. > > >> > > >> On Mon, Mar 30, 2015 at 10:04 AM, Sumit Mohanty < > > smoha...@hortonworks.com> > > >> wrote: > > >> > > >>> If the exact text was "ALOOCATED_PORT" then replacing them with > > >>> "ALLOCATED_PORT" might solve it. > > >>> > > >>> Otherwise, whats the version of Slider are you using? > > >>> ________________________________________ > > >>> From: Nitin Aggarwal <nitin3588.aggar...@gmail.com> > > >>> Sent: Monday, March 30, 2015 9:51 AM > > >>> To: dev@slider.incubator.apache.org > > >>> Subject: Invalid port 0 for storm instances > > >>> > > >>> Hi, > > >>> > > >>> I am trying to run storm cluster with 200 instances, using Slider. > > While > > >>> submitting topologies to the cluster, some of the workers failed to > > start > > >>> due to error. 
> > >>>
> > >>> 2015-03-27T13:27:40.168-0400 b.s.event [ERROR] Error when processing event
> > >>> java.lang.IllegalArgumentException: invalid port: 0
> > >>> at backtype.storm.security.auth.ThriftClient.<init>(ThriftClient.java:54) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:47) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:43) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:36) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.utils.Utils.downloadFromMaster(Utils.java:253) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.daemon.supervisor$fn__6900.invoke(supervisor.clj:482) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at clojure.lang.MultiFn.invoke(MultiFn.java:241) ~[clojure-1.5.1.jar:na]
> > >>> at backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6820.invoke(supervisor.clj:371) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at backtype.storm.event$event_manager$fn__2825.invoke(event.clj:40) ~[storm-core-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
> > >>> at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
> > >>> at java.lang.Thread.run(Thread.java:724) [na:1.7.0_40]
> > >>>
> > >>> I found that some of the supervisors don't have the correct
> > >>> configurations. Their configurations still have markers like
> > >>> ${NIMBUS_HOST} and ${NIMBUS.ALOOCATED_PORT}.
> > >>>
> > >>> Are these markers expected in the supervisor storm configuration?
> > >>>
> > >>> Thanks
> > >>> Nitin
> > >>>
> > >>
> > >
> > >
> > > --
> > > thanks
> > > Sumit
> >
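
For anyone else chasing this error, a minimal standalone check along the following lines fails fast when the markers were not substituted. This is a sketch, not part of Slider or Storm: the class name NimbusPortCheck and the printed messages are illustrative, but Utils.readStormConfig() and NimbusClient.getConfiguredClient() are the storm-core 0.9.x calls visible in the stack trace above.

// Hypothetical diagnostic: load the rendered storm config the same way the
// supervisor does and verify the Slider markers were substituted before
// opening a Thrift connection to nimbus.
import java.util.Map;

import backtype.storm.utils.NimbusClient;
import backtype.storm.utils.Utils;

public class NimbusPortCheck {
    public static void main(String[] args) throws Exception {
        Map conf = Utils.readStormConfig();  // merges defaults.yaml with storm.yaml
        String host = String.valueOf(conf.get("nimbus.host"));
        String port = String.valueOf(conf.get("nimbus.thrift.port"));

        // An unsubstituted marker still reads like ${NIMBUS_HOST} or
        // ${NIMBUS.ALLOCATED_PORT}; a failed substitution can also render as 0,
        // which ThriftClient rejects with "invalid port: 0".
        if (host.startsWith("${") || port.startsWith("${") || "0".equals(port)) {
            System.err.println("Markers not substituted: nimbus = " + host + ":" + port);
            return;
        }

        // Same call the supervisor makes (see the stack trace above).
        NimbusClient client = NimbusClient.getConfiguredClient(conf);
        System.out.println("Connected to nimbus at " + host + ":" + port);
        client.close();
    }
}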