[jira] [Commented] (STORM-994) Connection leak between nimbus and supervisors
[ https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727018#comment-14727018 ]

Frantz Mazoyer commented on STORM-994:
--------------------------------------

My pleasure :-) Thanks for merging.

> Connection leak between nimbus and supervisors
> ----------------------------------------------
>
>                 Key: STORM-994
>                 URL: https://issues.apache.org/jira/browse/STORM-994
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>            Priority: Minor
>             Fix For: 0.10.0
>
> Successive deploys/undeploys of topology(ies) may result in a connection leak between nimbus and its supervisors

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (STORM-1023) Nimbus server hogs 100% CPU and clients are stuck
[ https://issues.apache.org/jira/browse/STORM-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frantz Mazoyer updated STORM-1023:
----------------------------------
    Labels:   (was: newbie)
    Description:

Testing environment is Storm 0.9.5 / thrift java 0.7.

Test scenario: deploy a storm topology in a loop. When the nimbus cleanup timeout is reached, the thrift server throws an error: "Exception while invoking ..." ... TException

Test result: the thrift java server in nimbus goes to 100% CPU in an infinite loop.

jstack:
{code}
"Thread-5" prio=10 tid=0x7fb134aab800 nid=0x6767 runnable [0x7fb129c9b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        ...
        at org.apache.thrift7.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:284)
{code}

strace:
{code}
epoll_wait(70, {{EPOLLIN, {u32=866, u64=866}}, {EPOLLIN, {u32=876, u64=876}}}, 4096, 4294967295) = 2
{code}

Investigation and tests show that any exception thrown during processor execution bypasses the call to {code}responseReady(){code}, so the decrement
{code}
readBufferBytesAllocated.addAndGet(-buffer_.array().length);
{code}
is never applied for the failed request's buffer. After a number of failed requests, this counter gets close to the maximum value MAX_READ_BUFFER_BYTES, causing every subsequent request to be delayed forever, because the following test in {code}read(){code}:
{code}
if (readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES)
{code}
is always true.

In the end, the server thread loops in select(), which immediately wakes up for read() since the content of the socket was never drained. It loops forever between the select() and read() methods above, pinning the server thread at 100% CPU. Moreover, all client requests are stuck forever.
Example of a failed request:
{code}
2015-09-01T12:19:35.954+0200 b.s.d.nimbus [WARN] Topology submission exception. (topology name='mytopology')
2015-09-01T12:19:35.955+0200 o.a.t.s.TNonblockingServer [ERROR] Unexpected exception while invoking!
java.lang.IllegalArgumentException: /opt/SPE/share/storm/storm/local/nimbus/inbox/stormjar-3f8f3ba7-5420-4773-af24-bfa294cceb79.jar to copy to /opt/SPE/share/storm/storm/local/nimbus/stormdist/mytopology-87-1441102775 does not exist!
        at backtype.storm.daemon.nimbus$fn__3827.invoke(nimbus.clj:1173) ~[storm-core-0.9.5.jar:0.9.5]
        at clojure.lang.MultiFn.invoke(MultiFn.java:236) ~[clojure-1.5.1.jar:na]
        at backtype.storm.daemon.nimbus$setup_storm_code.invoke(nimbus.clj:307) ~[storm-core-0.9.5.jar:0.9.5]
        at backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto__$reify__3737.submitTopologyWithOpts(nimbus.clj:953) ~[storm-core-0.9.5.jar:0.9.5]
        at backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto__$reify__3737.submitTopology(nimbus.clj:966) ~[storm-core-0.9.5.jar:0.9.5]
        at backtype.storm.generated.Nimbus$Processor$submitTopology.getResult(Nimbus.java:1240) ~[storm-core-0.9.5.jar:0.9.5]
        at backtype.storm.generated.Nimbus$Processor$submitTopology.getResult(Nimbus.java:1228) ~[storm-core-0.9.5.jar:0.9.5]
        at org.apache.thrift7.ProcessFunction.process(ProcessFunction.java:32) ~[storm-core-0.9.5.jar:0.9.5]
        at org.apache.thrift7.TBaseProcessor.process(TBaseProcessor.java:34) ~[storm-core-0.9.5.jar:0.9.5]
        at org.apache.thrift7.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:632) ~[storm-core-0.9.5.jar:0.9.5]
        at org.apache.thrift7.server.THsHaServer$Invocation.run(THsHaServer.java:201) [storm-core-0.9.5.jar:0.9.5]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{code}

  was:
Testing environment is Storm 0.9.5 / thrift java 0.7.

Test scenario: deploy a storm topology in a loop. When the nimbus cleanup timeout is reached, the thrift server throws an error: "Exception while invoking ..." ... TException

Test result: the thrift java server in nimbus goes to 100% CPU in an infinite loop.

jstack:
{code}
"Thread-5" prio=10 tid=0x7fb134aab800 nid=0x6767 runnable [0x7fb129c9b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at
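The accounting bug described above can be reduced to a minimal sketch. This is plain Java with illustrative names (ReadBufferAccounting, invokeLeaky, invokeSafe are hypothetical, not the actual Thrift 0.7 code): an exception thrown on the invoke path skips the counter decrement, and once the counter saturates, the admission test rejects every new frame.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal model of the readBufferBytesAllocated accounting described in the issue.
class ReadBufferAccounting {
    static final long MAX_READ_BUFFER_BYTES = 1024;
    static final AtomicLong readBufferBytesAllocated = new AtomicLong();

    // Leaky shape: an exception in process() bypasses the decrement,
    // so the counter only ever grows.
    static void invokeLeaky(int frameSize, Runnable process) {
        readBufferBytesAllocated.addAndGet(frameSize);
        process.run();                                  // may throw
        readBufferBytesAllocated.addAndGet(-frameSize); // skipped on throw
    }

    // Fixed shape: the decrement runs on every code path, including failures.
    static void invokeSafe(int frameSize, Runnable process) {
        readBufferBytesAllocated.addAndGet(frameSize);
        try {
            process.run();
        } finally {
            readBufferBytesAllocated.addAndGet(-frameSize);
        }
    }

    // Mirrors the admission test in read(): once the counter is saturated,
    // every new frame is delayed forever.
    static boolean frameAccepted(int frameSize) {
        return readBufferBytesAllocated.get() + frameSize <= MAX_READ_BUFFER_BYTES;
    }
}
```

The essence of the fix is the try/finally shape: whatever path the processor takes, the bytes reserved for the request buffer are handed back, so failed requests can no longer saturate the counter.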
[jira] [Created] (STORM-1023) Nimbus server hogs 100% CPU and clients are stuck
Frantz Mazoyer created STORM-1023:
-------------------------------------

             Summary: Nimbus server hogs 100% CPU and clients are stuck
                 Key: STORM-1023
                 URL: https://issues.apache.org/jira/browse/STORM-1023
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5, 0.9.6
         Environment: Storm 0.9.5 / thrift 0.7
            Reporter: Frantz Mazoyer

Testing environment is Storm 0.9.5 / thrift java 0.7.

Test scenario: deploy a storm topology in a loop. When the nimbus cleanup timeout is reached, the thrift server throws an error: "Exception while invoking ..." ... TException

Test result: the thrift java server in nimbus goes to 100% CPU in an infinite loop.

jstack:
{code}
"Thread-5" prio=10 tid=0x7fb134aab800 nid=0x6767 runnable [0x7fb129c9b000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        ...
        at org.apache.thrift7.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:284)
{code}

strace:
{code}
epoll_wait(70, {{EPOLLIN, {u32=866, u64=866}}, {EPOLLIN, {u32=876, u64=876}}}, 4096, 4294967295) = 2
{code}

Investigation and tests show that any exception thrown during processor execution bypasses the call to {code}responseReady(){code}, so the decrement
{code}
readBufferBytesAllocated.addAndGet(-buffer_.array().length);
{code}
is never applied for the failed request's buffer. After a number of failed requests, this counter gets close to the maximum value MAX_READ_BUFFER_BYTES, causing every subsequent request to be delayed forever, because the following test in {code}read(){code}:
{code}
if (readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES)
{code}
is always true.

In the end, the server thread loops in select(), which immediately wakes up for read() since the content of the socket was never drained. It loops forever between the select() and read() methods above, pinning the server thread at 100% CPU. Moreover, all client requests are stuck forever.
[jira] [Updated] (STORM-994) Connection leak between nimbus and supervisors
[ https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frantz Mazoyer updated STORM-994:
---------------------------------
    Affects Version/s: 0.11.0
        Fix Version/s: 0.11.0

> Connection leak between nimbus and supervisors
> ----------------------------------------------
>
>                 Key: STORM-994
>                 URL: https://issues.apache.org/jira/browse/STORM-994
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>            Priority: Minor
>
> Successive deploys/undeploys of topology(ies) may result in a connection leak between nimbus and its supervisors
[jira] [Updated] (STORM-994) Connection leak between nimbus and supervisors
[ https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frantz Mazoyer updated STORM-994:
---------------------------------
    Fix Version/s:     (was: 0.11.0)

> Connection leak between nimbus and supervisors
> ----------------------------------------------
>
>                 Key: STORM-994
>                 URL: https://issues.apache.org/jira/browse/STORM-994
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>            Priority: Minor
>
> Successive deploys/undeploys of topology(ies) may result in a connection leak between nimbus and its supervisors
[jira] [Commented] (STORM-994) Connection leak between nimbus and supervisors
[ https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697754#comment-14697754 ]

Frantz Mazoyer commented on STORM-994:
--------------------------------------

In Utils.java, the method downloadFromMaster leaks connections to supervisors when code is downloaded from nimbus:
{code}
NimbusClient client = NimbusClient.getConfiguredClient(conf);
{code}
creates a new client object every time, but the client is never closed.

> Connection leak between nimbus and supervisors
> ----------------------------------------------
>
>                 Key: STORM-994
>                 URL: https://issues.apache.org/jira/browse/STORM-994
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.9.5
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>
> Successive deploys/undeploys of topology(ies) may result in a connection leak between nimbus and its supervisors
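The leak pattern from the comment above, and the usual fix, can be sketched in isolation. This is a hedged illustration, not the actual Storm code: FakeNimbusClient and DownloadFromMaster are hypothetical stand-ins for backtype.storm.utils.NimbusClient and Utils.downloadFromMaster, with a counter added so the leak is observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for NimbusClient: each instance holds one connection.
class FakeNimbusClient implements AutoCloseable {
    static final AtomicInteger openConnections = new AtomicInteger();

    static FakeNimbusClient getConfiguredClient() {
        openConnections.incrementAndGet(); // connection established here
        return new FakeNimbusClient();
    }

    void downloadChunk() { /* simulate downloading a file chunk from nimbus */ }

    @Override
    public void close() { openConnections.decrementAndGet(); }
}

class DownloadFromMaster {
    // Leaky shape reported in the issue: the client is created but never closed,
    // so each call leaves one connection open.
    static void downloadLeaky() {
        FakeNimbusClient client = FakeNimbusClient.getConfiguredClient();
        client.downloadChunk();
    }

    // Fixed shape: try-with-resources guarantees close() on every path,
    // including when downloadChunk() throws.
    static void downloadSafe() {
        try (FakeNimbusClient client = FakeNimbusClient.getConfiguredClient()) {
            client.downloadChunk();
        }
    }
}
```

With the leaky shape, repeated deploys/undeploys accumulate open connections exactly as the issue describes; the try-with-resources (or an equivalent close() in a finally block) caps them at zero after each download.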
[jira] [Created] (STORM-994) Connection leak between nimbus and supervisors
Frantz Mazoyer created STORM-994:
------------------------------------

             Summary: Connection leak between nimbus and supervisors
                 Key: STORM-994
                 URL: https://issues.apache.org/jira/browse/STORM-994
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.9.5
            Reporter: Frantz Mazoyer
            Assignee: Frantz Mazoyer

Successive deploys/undeploys of topology(ies) may result in a connection leak between nimbus and its supervisors
[jira] [Commented] (STORM-130) [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does not exist
[ https://issues.apache.org/jira/browse/STORM-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586624#comment-14586624 ]

Frantz Mazoyer commented on STORM-130:
--------------------------------------

Hello, it looks like this issue was actually solved in 0.9.5? Could anybody confirm? If so, could the jira be updated accordingly? Thanks a lot for the help :-)
Kind regards, Frantz

> [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does not exist
> ------------------------------------------------------------------------------------
>
>                 Key: STORM-130
>                 URL: https://issues.apache.org/jira/browse/STORM-130
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: James Xu
>            Assignee: Sriharsha Chintalapani
>            Priority: Minor
>             Fix For: 0.10.0, 0.9.4
>
>         Attachments: README.txt, nimbus.log.gz, supervisor_logs.tar.gz, worker_logs.tar.gz, worker_logs_of_kafka_traffic.tar.gz, worker_logs_of_zookeeper_traffic_2015-04-11.tar.gz, worker_logs_of_zookeeper_traffic_2015-04-12.tar.gz, worker_logs_of_zookeeper_traffic_2015-04-13.tar.gz, workers_with_stormconf.ser.gz
>
> https://github.com/nathanmarz/storm/issues/438
> Hi developers,
> We met a critical issue when deploying a storm topology to our prod cluster.
> After deploying the topology we got this trace on the workers (Storm 0.8.2 / zookeeper-3.3.6):
> 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181 sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@254ba9a2
> 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server zookeeper1.company.com/10.72.209.112:2181
> 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to zookeeper1.company.com/10.72.209.112:2181, initiating session
> 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on server zookeeper1.company.com/10.72.209.112:2181, sessionid = 0x13b3e4b5c780239, negotiated timeout = 2
> 2013-01-14 10:57:39 zookeeper [INFO] Zookeeper state update: :connected:none
> 2013-01-14 10:57:39 ZooKeeper [INFO] Session: 0x13b3e4b5c780239 closed
> 2013-01-14 10:57:39 ClientCnxn [INFO] EventThread shut down
> 2013-01-14 10:57:39 CuratorFrameworkImpl [INFO] Starting
> 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181/storm sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@33a998c7
> 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server zookeeper1.company.com/10.72.209.112:2181
> 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to zookeeper1.company.com/10.72.209.112:2181, initiating session
> 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on server zookeeper1.company.com/10.72.209.112:2181, sessionid = 0x13b3e4b5c78023a, negotiated timeout = 2
> 2013-01-14 10:57:39 worker [ERROR] Error on initialization of server mk-worker
> java.io.FileNotFoundException: File '/tmp/storm/supervisor/stormdist/normalization-prod-1-1358161053/stormconf.ser' does not exist
>         at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137)
>         at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135)
>         at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:138)
>         at backtype.storm.daemon.worker$worker_data.invoke(worker.clj:146)
>         at backtype.storm.daemon.worker$fn__4348$exec_fn__1228__auto4349.invoke(worker.clj:332)
>         at clojure.lang.AFn.applyToHelper(AFn.java:185)
>         at clojure.lang.AFn.applyTo(AFn.java:151)
>         at clojure.core$apply.invoke(core.clj:601)
>         at backtype.storm.daemon.worker$fn__4348$mk_worker__4404.doInvoke(worker.clj:323)
>         at clojure.lang.RestFn.invoke(RestFn.java:512)
>         at backtype.storm.daemon.worker$_main.invoke(worker.clj:433)
>         at clojure.lang.AFn.applyToHelper(AFn.java:172)
>         at clojure.lang.AFn.applyTo(AFn.java:151)
>         at backtype.storm.daemon.worker.main(Unknown Source)
> 2013-01-14 10:57:39 util [INFO] Halting process: (Error on initialization)
> Supervisor trace:
> 2013-01-14 10:59:01 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 still hasn't started
> 2013-01-14 10:59:02 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 still hasn't started
> ...
> 2013-01-14 10:59:34 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 still hasn't started
> 2013-01-14 10:59:35 supervisor [INFO] Worker d6735377-f0d6-4247-9f35-c8620e2b0e26 failed to start
> 2013-01-14 10:59:35 supervisor [INFO] Worker 234264c6-d9d6-4e8a-ab0a-8926bdd6b536 failed to start
> 2013-01-14 10:59:35 supervisor [INFO] Shutting down and clearing state for id 234264c6-d9d6-4e8a-ab0a-8926bdd6b536. Current supervisor time: 1358161175. State: :disallowed, Heartbeat: nil
> 2013-01-14
[jira] [Commented] (STORM-585) Performance issue in none grouping
[ https://issues.apache.org/jira/browse/STORM-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281265#comment-14281265 ]

Frantz Mazoyer commented on STORM-585:
--------------------------------------

Hi, should I switch the Jira to Resolved / Fixed? Thanks :-)

> Performance issue in none grouping
> ----------------------------------
>
>                 Key: STORM-585
>                 URL: https://issues.apache.org/jira/browse/STORM-585
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>            Priority: Minor
>             Fix For: 0.10.0
>
> In function mk-grouper, target-tasks is originally a ^List. It then becomes a clojure vector:
> ...
> target-tasks (vec (sort target-tasks))]
> ...
> In the :none grouping case, the java method '.get' is called on the target-tasks object:
> ...
> (.get target-tasks i)
> ...
> At run time, clojure will use introspection to find a method with a matching name and signature, which is very costly. Using clojure's built-in vector 'get' function instead of the '.get' method made us gain 25% performance in our use-case.
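The cost difference behind STORM-585 can be illustrated by analogy in Java (the language of the underlying calls): Clojure's `(.get target-tasks i)` without a type hint resolves the method reflectively at run time, much like `Method.invoke` below, whereas the vector `get` is a direct, pre-resolved call. The class and method names here are illustrative, not Storm code.

```java
import java.lang.reflect.Method;
import java.util.List;

// Analogy for untyped Clojure interop vs. a direct call: both return the same
// element, but the reflective path pays a runtime method lookup on every call.
class ReflectionCost {
    // Roughly what unhinted (.get target-tasks i) does: find a method named
    // "get" with a matching signature at run time, then invoke it reflectively.
    static Object reflectiveGet(Object tasks, int i) throws Exception {
        Method m = tasks.getClass().getMethod("get", int.class); // runtime lookup
        return m.invoke(tasks, i);
    }

    // Roughly what the vector 'get' (or a type-hinted call) does: a direct,
    // statically resolved invocation with no per-call lookup.
    static Object directGet(List<Integer> tasks, int i) {
        return tasks.get(i);
    }
}
```

Both paths return the same value; the 25% gain reported in the issue comes from removing the per-tuple reflective lookup on the hot path of the grouper.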
[jira] [Commented] (STORM-322) Windows Scripts do not handle spaces in JAVA_HOME path
[ https://issues.apache.org/jira/browse/STORM-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280618#comment-14280618 ]

Frantz Mazoyer commented on STORM-322:
--------------------------------------

Tested on a windows box with https://github.com/apache/storm/pull/373
Should be ok :-)

> Windows Scripts do not handle spaces in JAVA_HOME path
> ------------------------------------------------------
>
>                 Key: STORM-322
>                 URL: https://issues.apache.org/jira/browse/STORM-322
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Derek Dagit
>            Priority: Minor
>              Labels: newbie, windows
>
> If Java is installed to a path that has spaces, the windows scripts will error.
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254585#comment-14254585 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Hello, I would just like to point out that [https://issues.apache.org/jira/browse/STORM-487] (removing storm.cmd) has just been +1ed. If this is merged, I will resubmit another pull request without the changes in the storm*.cmd files.
Kind regards :)

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253598#comment-14253598 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Hi, please check the following branch:
[https://github.com/fmazoyer/storm/commit/b2f8eb651d3e8dbd7939aae126d803da6b948663]
This fixes STORM-248 and STORM-322 (spaces in storm*.cmd). It was therefore tested on both Linux and Windows.
Thanks a lot for your help and comments :-)
Kind regards

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249851#comment-14249851 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Thanks a lot for answering my question and for your comments :-)

So this is my understanding of the behaviour of storm.py:
{code}
storm nimbus|supervisor|...
{code}
will eventually run:
{code}
java ... -Dlogback.configurationFile=path/cluster.xml ...
{code}
As far as the daemons (nimbus, supervisor, ...) are concerned, this {code}-Dlogback.configurationFile{code} is 'opaquely' passed on to the JVM and processed by the logback library directly.

The supervisor process itself is no exception to the rule, except that supervisor.clj forks the worker daemons with the following parameter:
{code}
(str "-Dlogback.configurationFile=" storm-home file-path-separator "logback" file-path-separator "worker.xml")
{code}
If we keep the same mechanism, we have to somehow provide the path to the logback configuration directory (that would be the same as the one where cluster.xml is, that's right).

In any case, I guess we will agree on the fact that:
- We need to get the storm.logback.conf.dir from storm.yaml in storm.py; let's call it {code}get_logback_conf_dir(){code}
- For each daemon, we can pass on to the JVM: {code}"-Dlogback.configurationFile=" + get_logback_conf_dir() + "/cluster.xml"{code}

Now we may disagree on that :-):
- For the supervisor daemon, pass on to the JVM: {code}"-Dstorm.logback.conf.dir=" + get_logback_conf_dir(), "-Dlogback.configurationFile=" + get_logback_conf_dir() + "/cluster.xml"{code}
- In launch-worker in supervisor.clj, add the following lines:
{code}
...
(let ...
  storm-logback-conf-dir (or (System/getProperty "storm.logback.conf.dir")
                             (str storm-home file-path-separator "logback"))
  ...
  command (concat
           ...
           (str "-Dlogback.configurationFile=" storm-logback-conf-dir file-path-separator "worker.xml")
           ...
{code}
Hope I'm not leaving anything out. What do you reckon? Thanks a lot for your time :-)

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
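The fallback logic proposed in the comment above can be sketched in Java (the property names come from the discussion; the WorkerLaunchArgs class and method names are illustrative, not Storm code): resolve the logback conf dir from -Dstorm.logback.conf.dir if set, otherwise fall back to the logback directory under storm-home, and build the worker's -Dlogback.configurationFile argument from it.

```java
import java.io.File;

// Sketch of the proposed supervisor-side resolution for the worker's
// logback configuration file.
class WorkerLaunchArgs {
    // Resolve the logback conf dir: explicit -Dstorm.logback.conf.dir wins,
    // otherwise fall back to the usual <storm-home>/logback directory.
    static String logbackConfDir(String stormHome) {
        String configured = System.getProperty("storm.logback.conf.dir");
        return (configured != null) ? configured
                                    : stormHome + File.separator + "logback";
    }

    // Build the JVM argument passed to the forked worker.
    static String workerLogbackArg(String stormHome) {
        return "-Dlogback.configurationFile="
                + logbackConfDir(stormHome) + File.separator + "worker.xml";
    }
}
```

This mirrors the `(or (System/getProperty "storm.logback.conf.dir") (str storm-home file-path-separator "logback"))` shape from the Clojure proposal: the override is optional, and the historical default is preserved when it is absent.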
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239347#comment-14239347 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Thanks for the comment :-)

I would propose this as a fix, then:
- Add storm.logback.conf.dir to storm.yaml.
- In the python script, for all daemons except the supervisor, determine -Dlogback.configurationFile from storm.logback.conf.dir and pass it on to the daemon as "-Dlogback.configurationFile=" + STORM_LOGBACK_CONF_DIR + "/cluster.xml". If not defined, fall back to the default/usual storm logback directory pointing to cluster.xml.
- In the python script, for the supervisor, if defined, pass on "-Dstorm.logback.conf.dir=" + STORM_LOGBACK_CONF_DIR and "-Dlogback.configurationFile=" + STORM_LOGBACK_CONF_DIR + "/cluster.xml". Otherwise, fall back to the default/usual storm logback directory pointing to cluster.xml.
- In supervisor.clj, if storm.logback.conf.dir is defined, use it as the root dir for worker.xml in the launch command line; otherwise, fall back to the default/usual storm logback directory pointing to worker.xml.

What do you think? Thanks again for the help :-)

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
[jira] [Created] (STORM-585) Performance issue in none grouping
Frantz Mazoyer created STORM-585:
------------------------------------

             Summary: Performance issue in none grouping
                 Key: STORM-585
                 URL: https://issues.apache.org/jira/browse/STORM-585
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.3-rc2
            Reporter: Frantz Mazoyer
            Assignee: Frantz Mazoyer
            Priority: Minor
             Fix For: 0.10.0
[jira] [Updated] (STORM-585) Performance issue in none grouping
[ https://issues.apache.org/jira/browse/STORM-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frantz Mazoyer updated STORM-585:
---------------------------------
    Description:
In function mk-grouper, target-tasks is originally a ^List. It then becomes a clojure vector:
...
target-tasks (vec (sort target-tasks))]
...
In the :none grouping case, the java method '.get' is called on the target-tasks object:
...
(.get target-tasks i)
...
At run time, clojure will use introspection to find a method with a matching name and signature, which is very costly. Using clojure's built-in vector 'get' function instead of the '.get' method made us gain 25% performance in our use-case.

    Affects Version/s: 0.10.0
                       0.9.3
                       0.9.2-incubating

> Performance issue in none grouping
> ----------------------------------
>
>                 Key: STORM-585
>                 URL: https://issues.apache.org/jira/browse/STORM-585
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2
>            Reporter: Frantz Mazoyer
>            Assignee: Frantz Mazoyer
>            Priority: Minor
>             Fix For: 0.10.0
>
> In function mk-grouper, target-tasks is originally a ^List. It then becomes a clojure vector:
> ...
> target-tasks (vec (sort target-tasks))]
> ...
> In the :none grouping case, the java method '.get' is called on the target-tasks object:
> ...
> (.get target-tasks i)
> ...
> At run time, clojure will use introspection to find a method with a matching name and signature, which is very costly. Using clojure's built-in vector 'get' function instead of the '.get' method made us gain 25% performance in our use-case.
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223101#comment-14223101 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Hello, I would like to lift this limitation to ease my deployments. Here's the fix I propose:

In file supervisor.clj, function launch-worker, in the 'let', I would add 'log-conf-file' like so:
{code}
...
(let [conf (:conf supervisor)
      storm-home (System/getProperty "storm.home")
      log-conf-file (or (System/getProperty "logback.configurationFile")
                        (str storm-home "/logback/cluster.xml"))
...
{code}
and substitute the current concatenation of logback.configurationFile with it:
{code}
...
(str "-Dlogback.configurationFile=" log-conf-file)
...
{code}
If it's ok with you, can I submit a pull request on master next? Thanks a lot for your help.

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
[jira] [Assigned] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frantz Mazoyer reassigned STORM-248:
------------------------------------
    Assignee: Frantz Mazoyer

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")
[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.
[ https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223116#comment-14223116 ]

Frantz Mazoyer commented on STORM-248:
--------------------------------------

Hello, I would like to lift this limitation to ease my deployments. Here's the fix I propose:

In file supervisor.clj, function launch-worker, in the 'let', I would add 'storm-log-conf-file' like so:
{code}
...
(let [conf (:conf supervisor)
      storm-home (System/getProperty "storm.home")
      ...
      storm-log-conf-file (or (System/getProperty "logback.configurationFile")
                              (str storm-home file-path-separator "logback" file-path-separator "cluster.xml"))
...
{code}
and substitute the current concatenation of logback.configurationFile with it:
{code}
...
(str "-Dlogback.configurationFile=" storm-log-conf-file)
...
{code}
If it's ok with you, can I submit a pull request on master next? Thanks a lot for your help

> cluster.xml location is hardcoded for workers.
> ----------------------------------------------
>
>                 Key: STORM-248
>                 URL: https://issues.apache.org/jira/browse/STORM-248
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Edward Goslin
>            Assignee: Frantz Mazoyer
>            Priority: Trivial
>
> When the supervisor spawns a worker process, it assumes cluster.xml is in -Dlogback.configurationFile=<storm-home>/logback/cluster.xml. It should take the VM argument for the supervisor and pass it down to the worker: System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")