[jira] [Commented] (STORM-994) Connection leak between nimbus and supervisors

2015-09-02 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14727018#comment-14727018
 ] 

Frantz Mazoyer commented on STORM-994:
--

My pleasure :-)
Thanks for merging.

> Connection leak between nimbus and supervisors
> --
>
> Key: STORM-994
> URL: https://issues.apache.org/jira/browse/STORM-994
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
>Reporter: Frantz Mazoyer
>Assignee: Frantz Mazoyer
>Priority: Minor
> Fix For: 0.10.0
>
>
> Successive deploys/undeploys of topology(ies) may result in a connection leak 
> between nimbus and its supervisors



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (STORM-1023) Nimbus server hogs 100% CPU and clients are stuck

2015-09-01 Thread Frantz Mazoyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-1023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantz Mazoyer updated STORM-1023:
--
 Labels:   (was: newbie)
Description: 
Testing environment is Storm 0.9.5 / thrift java 0.7.
Test scenario: 
  Deploy storm topology in loop.
  When nimbus cleanup timeout is reached, an error is thrown by thrift server: 
  "Exception while invoking ..." ... TException

Test result:
  The Thrift java server in nimbus goes to 100% CPU in an infinite loop in:

jstack:
{code}
"Thread-5" prio=10 tid=0x7fb134aab800 nid=0x6767 runnable 
[0x7fb129c9b000]
   java.lang.Thread.State: RUNNABLE
  at 
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
  at 
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
  at 
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
  at 
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
...
at 
org.apache.thrift7.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:284)
 
{code}

strace:
{code}
epoll_wait(70, {{EPOLLIN, {u32=866, u64=866}}, {EPOLLIN, {u32=876, u64=876}}}, 
4096, 4294967295) = 2
{code}

Investigation and tests show that:
Any exception thrown during processor execution bypasses the call to 
{code} responseReady() {code}, so the decrement {code} 
readBufferBytesAllocated.addAndGet(-buffer_.array().length); {code} is never 
executed, and the counter is not reduced by the size of the request buffer.

After a bunch of failed requests, this counter almost reaches the max value 
MAX_READ_BUFFER_BYTES causing any subsequent request to be delayed forever 
because the following test in {code} read() {code}:
{code}   if (readBufferBytesAllocated.get() + frameSize > 
MAX_READ_BUFFER_BYTES)  {code} is always true.

At that point, the server thread loops in select(), which immediately wakes up 
for read() since the content of the socket was never drained.
The thread then bounces forever between the select() and read() methods above, 
causing 100% CPU on the server thread.
Moreover, all client requests are stuck forever.
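The accounting failure described above can be sketched in isolation. This is a minimal illustration with a simulated counter, not Thrift's actual code: the class name, sizes, and Runnable-based processor are invented for the demo, and only the readBufferBytesAllocated / MAX_READ_BUFFER_BYTES names come from the report. Moving the decrement into a finally block is one way to keep the counter balanced when the processor throws:

```java
import java.util.concurrent.atomic.AtomicLong;

// Simulated read-buffer accounting: if the processor throws, the decrement on
// the success path is skipped and the counter creeps toward the cap, after
// which every new frame is deferred forever.
public class ReadBufferAccounting {
    static final long MAX_READ_BUFFER_BYTES = 1024;
    static final AtomicLong readBufferBytesAllocated = new AtomicLong();

    // Buggy shape: the decrement only runs on the success path.
    static void invokeLeaky(int frameSize, Runnable processor) {
        readBufferBytesAllocated.addAndGet(frameSize);
        processor.run();                                // may throw
        readBufferBytesAllocated.addAndGet(-frameSize); // skipped on exception
    }

    // Fixed shape: decrement in finally, so failed requests release bytes too.
    static void invokeSafe(int frameSize, Runnable processor) {
        readBufferBytesAllocated.addAndGet(frameSize);
        try {
            processor.run();
        } finally {
            readBufferBytesAllocated.addAndGet(-frameSize);
        }
    }

    // Equivalent of the read() gate quoted above.
    static boolean wouldDefer(int frameSize) {
        return readBufferBytesAllocated.get() + frameSize > MAX_READ_BUFFER_BYTES;
    }
}
```

With the leaky shape, a run of failing requests leaves the counter inflated and wouldDefer() permanently true; with the finally-based shape it returns to zero.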

Example of failed request:
{code}
2015-09-01T12:19:35.954+0200 b.s.d.nimbus [WARN] Topology submission exception. 
(topology name='mytopology') #
2015-09-01T12:19:35.955+0200 o.a.t.s.TNonblockingServer [ERROR] Unexpected 
exception while invoking!
java.lang.IllegalArgumentException: 
/opt/SPE/share/storm/storm/local/nimbus/inbox/stormjar-3f8f3ba7-5420-4773-af24-bfa294cceb79.jar
 to copy to /opt/SPE/share/storm/storm/local/nimbus/stormdis
t/mytopology-87-1441102775 does not exist!
at backtype.storm.daemon.nimbus$fn__3827.invoke(nimbus.clj:1173) 
~[storm-core-0.9.5.jar:0.9.5]
at clojure.lang.MultiFn.invoke(MultiFn.java:236) ~[clojure-1.5.1.jar:na]
at backtype.storm.daemon.nimbus$setup_storm_code.invoke(nimbus.clj:307) 
~[storm-core-0.9.5.jar:0.9.5]
at 
backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto__$reify__3737.submitTopologyWithOpts(nimbus.clj:953)
 ~[storm-core-0.9.5.jar:0.9.5]
at 
backtype.storm.daemon.nimbus$fn__3724$exec_fn__1103__auto__$reify__3737.submitTopology(nimbus.clj:966)
 ~[storm-core-0.9.5.jar:0.9.5]
at 
backtype.storm.generated.Nimbus$Processor$submitTopology.getResult(Nimbus.java:1240)
 ~[storm-core-0.9.5.jar:0.9.5]
at 
backtype.storm.generated.Nimbus$Processor$submitTopology.getResult(Nimbus.java:1228)
 ~[storm-core-0.9.5.jar:0.9.5]
at org.apache.thrift7.ProcessFunction.process(ProcessFunction.java:32) 
~[storm-core-0.9.5.jar:0.9.5]
at org.apache.thrift7.TBaseProcessor.process(TBaseProcessor.java:34) 
~[storm-core-0.9.5.jar:0.9.5]
at 
org.apache.thrift7.server.TNonblockingServer$FrameBuffer.invoke(TNonblockingServer.java:632)
 ~[storm-core-0.9.5.jar:0.9.5]
at 
org.apache.thrift7.server.THsHaServer$Invocation.run(THsHaServer.java:201) 
[storm-core-0.9.5.jar:0.9.5]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_75]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_75]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
{code} 

[jira] [Created] (STORM-1023) Nimbus server hogs 100% CPU and clients are stuck

2015-09-01 Thread Frantz Mazoyer (JIRA)
Frantz Mazoyer created STORM-1023:
-

 Summary: Nimbus server hogs 100% CPU and clients are stuck 
 Key: STORM-1023
 URL: https://issues.apache.org/jira/browse/STORM-1023
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5, 0.9.6
 Environment: Storm 0.9.5 / thrift 0.7
Reporter: Frantz Mazoyer


Testing environment is Storm 0.9.5 / thrift java 0.7.
Test scenario: 
  Deploy storm topology in loop.
  When nimbus cleanup timeout is reached, an error is thrown by thrift server: 
  "Exception while invoking ..." ... TException

Test result:
  The Thrift java server in nimbus goes to 100% CPU in an infinite loop in:

jstack:
{code}
"Thread-5" prio=10 tid=0x7fb134aab800 nid=0x6767 runnable 
[0x7fb129c9b000]
   java.lang.Thread.State: RUNNABLE
  at 
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
  at 
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
  at 
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
  at 
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
...
at 
org.apache.thrift7.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:284)
 
{code}

strace:
{code}
epoll_wait(70, {{EPOLLIN, {u32=866, u64=866}}, {EPOLLIN, {u32=876, u64=876}}}, 
4096, 4294967295) = 2
{code}

Investigation and tests show that:
Any exception thrown during processor execution bypasses the call to 
{code} responseReady() {code}, so the decrement {code} 
readBufferBytesAllocated.addAndGet(-buffer_.array().length); {code} is never 
executed, and the counter is not reduced by the size of the request buffer.

After a bunch of failed requests, this counter almost reaches the max value 
MAX_READ_BUFFER_BYTES causing any subsequent request to be delayed forever 
because the following test in {code} read() {code}:
{code}   if (readBufferBytesAllocated.get() + frameSize > 
MAX_READ_BUFFER_BYTES)  {code} is always true.

At that point, the server thread loops in select(), which immediately wakes up 
for read() since the content of the socket was never drained.
The thread then bounces forever between the select() and read() methods above, 
causing 100% CPU on the server thread.
Moreover, all client requests are stuck forever.





[jira] [Updated] (STORM-994) Connection leak between nimbus and supervisors

2015-08-27 Thread Frantz Mazoyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantz Mazoyer updated STORM-994:
-
Affects Version/s: 0.11.0
Fix Version/s: 0.11.0

 Connection leak between nimbus and supervisors
 --

 Key: STORM-994
 URL: https://issues.apache.org/jira/browse/STORM-994
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer
Priority: Minor

 Successive deploys/undeploys of topology(ies) may result in a connection leak 
 between nimbus and its supervisors





[jira] [Updated] (STORM-994) Connection leak between nimbus and supervisors

2015-08-27 Thread Frantz Mazoyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantz Mazoyer updated STORM-994:
-
Fix Version/s: (was: 0.11.0)

 Connection leak between nimbus and supervisors
 --

 Key: STORM-994
 URL: https://issues.apache.org/jira/browse/STORM-994
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.11.0, 0.9.5
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer
Priority: Minor

 Successive deploys/undeploys of topology(ies) may result in a connection leak 
 between nimbus and its supervisors





[jira] [Commented] (STORM-994) Connection leak between nimbus and supervisors

2015-08-14 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14697754#comment-14697754
 ] 

Frantz Mazoyer commented on STORM-994:
--

In Utils.java, the method downloadFromMaster leaks connections to supervisors 
when code is downloaded from nimbus:

{code}
NimbusClient client = NimbusClient.getConfiguredClient(conf);
{code}

creates a new client object every time, but the client is never closed.
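The fix pattern can be sketched with a stand-in Closeable client (this is an illustration of the shape of the leak and its repair, not the real NimbusClient API): release the client in a finally block or try-with-resources on every code path.

```java
// Stand-in client that counts open instances so the leak is observable.
public class ClientLeakDemo {
    static int openClients = 0;

    static class Client implements AutoCloseable {
        Client() { openClients++; }
        void download() { /* fetch topology code from nimbus */ }
        @Override public void close() { openClients--; }
    }

    // Leaky shape: the client is created but never closed.
    static void downloadLeaky() {
        Client client = new Client();
        client.download();
    }

    // Fixed shape: try-with-resources guarantees close() on all paths,
    // including when download() throws.
    static void downloadSafe() {
        try (Client client = new Client()) {
            client.download();
        }
    }
}
```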

 Connection leak between nimbus and supervisors
 --

 Key: STORM-994
 URL: https://issues.apache.org/jira/browse/STORM-994
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.9.5
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer

 Successive deploys/undeploys of topology(ies) may result in a connection leak 
 between nimbus and its supervisors





[jira] [Created] (STORM-994) Connection leak between nimbus and supervisors

2015-08-14 Thread Frantz Mazoyer (JIRA)
Frantz Mazoyer created STORM-994:


 Summary: Connection leak between nimbus and supervisors
 Key: STORM-994
 URL: https://issues.apache.org/jira/browse/STORM-994
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3, 0.10.0, 0.9.4, 0.9.5
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer


Successive deploys/undeploys of topology(ies) may result in a connection leak 
between nimbus and its supervisors





[jira] [Commented] (STORM-130) [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does not exist

2015-06-15 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586624#comment-14586624
 ] 

Frantz Mazoyer commented on STORM-130:
--

Hello,

It looks like this issue was actually solved in 0.9.5.
Could anybody confirm?

If so, could the JIRA be updated accordingly?

Thanks a lot for the help :-)

Kind regards,
Frantz

 [Storm 0.8.2]: java.io.FileNotFoundException: File '../stormconf.ser' does 
 not exist
 

 Key: STORM-130
 URL: https://issues.apache.org/jira/browse/STORM-130
 Project: Apache Storm
  Issue Type: Bug
Reporter: James Xu
Assignee: Sriharsha Chintalapani
Priority: Minor
 Fix For: 0.10.0, 0.9.4

 Attachments: README.txt, nimbus.log.gz, supervisor_logs.tar.gz, 
 worker_logs.tar.gz, worker_logs_of_kafka_traffic.tar.gz, 
 worker_logs_of_zookeeper_traffic_2015-04-11.tar.gz, 
 worker_logs_of_zookeeper_traffic_2015-04-12.tar.gz, 
 worker_logs_of_zookeeper_traffic_2015-04-13.tar.gz, 
 workers_with_stormconf.ser.gz


 https://github.com/nathanmarz/storm/issues/438
 Hi developers,
 We met critical issue with deploying storm topology to our prod cluster.
 After deploying topology we got trace on workers (Storm 
 0.8.2/zookeeper-3.3.6) :
 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, 
 connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181
  sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@254ba9a2
 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server 
 zookeeper1.company.com/10.72.209.112:2181
 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to 
 zookeeper1.company.com/10.72.209.112:2181, initiating session
 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on 
 server zookeeper1.company.com/10.72.209.112:2181, sessionid = 
 0x13b3e4b5c780239, negotiated timeout = 2
 2013-01-14 10:57:39 zookeeper [INFO] Zookeeper state update: :connected:none
 2013-01-14 10:57:39 ZooKeeper [INFO] Session: 0x13b3e4b5c780239 closed
 2013-01-14 10:57:39 ClientCnxn [INFO] EventThread shut down
 2013-01-14 10:57:39 CuratorFrameworkImpl [INFO] Starting
 2013-01-14 10:57:39 ZooKeeper [INFO] Initiating client connection, 
 connectString=zookeeper1.company.com:2181,zookeeper2.company.com:2181,zookeeper3.company.com:2181/storm
  sessionTimeout=2 watcher=com.netflix.curator.ConnectionState@33a998c7
 2013-01-14 10:57:39 ClientCnxn [INFO] Opening socket connection to server 
 zookeeper1.company.com/10.72.209.112:2181
 2013-01-14 10:57:39 ClientCnxn [INFO] Socket connection established to 
 zookeeper1.company.com/10.72.209.112:2181, initiating session
 2013-01-14 10:57:39 ClientCnxn [INFO] Session establishment complete on 
 server zookeeper1.company.com/10.72.209.112:2181, sessionid = 
 0x13b3e4b5c78023a, negotiated timeout = 2
 2013-01-14 10:57:39 worker [ERROR] Error on initialization of server mk-worker
 java.io.FileNotFoundException: File 
 '/tmp/storm/supervisor/stormdist/normalization-prod-1-1358161053/stormconf.ser'
  does not exist
 at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:137)
 at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1135)
 at backtype.storm.config$read_supervisor_storm_conf.invoke(config.clj:138)
 at backtype.storm.daemon.worker$worker_data.invoke(worker.clj:146)
 at 
 backtype.storm.daemon.worker$fn__4348$exec_fn__1228__auto4349.invoke(worker.clj:332)
 at clojure.lang.AFn.applyToHelper(AFn.java:185)
 at clojure.lang.AFn.applyTo(AFn.java:151)
 at clojure.core$apply.invoke(core.clj:601)
 at 
 backtype.storm.daemon.worker$fn__4348$mk_worker__4404.doInvoke(worker.clj:323)
 at clojure.lang.RestFn.invoke(RestFn.java:512)
 at backtype.storm.daemon.worker$_main.invoke(worker.clj:433)
 at clojure.lang.AFn.applyToHelper(AFn.java:172)
 at clojure.lang.AFn.applyTo(AFn.java:151)
 at backtype.storm.daemon.worker.main(Unknown Source)
 2013-01-14 10:57:39 util [INFO] Halting process: (Error on initialization)
 Supervisor trace:
 2013-01-14 10:59:01 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
 still hasn't started
 2013-01-14 10:59:02 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
 still hasn't started
 ...
 2013-01-14 10:59:34 supervisor [INFO] d6735377-f0d6-4247-9f35-c8620e2b0e26 
 still hasn't started
 2013-01-14 10:59:35 supervisor [INFO] Worker 
 d6735377-f0d6-4247-9f35-c8620e2b0e26 failed to start
 2013-01-14 10:59:35 supervisor [INFO] Worker 
 234264c6-d9d6-4e8a-ab0a-8926bdd6b536 failed to start
 2013-01-14 10:59:35 supervisor [INFO] Shutting down and clearing state for id 
 234264c6-d9d6-4e8a-ab0a-8926bdd6b536. Current supervisor time: 1358161175. 
 State: :disallowed, Heartbeat: nil
 2013-01-14 

[jira] [Commented] (STORM-585) Performance issue in none grouping

2015-01-17 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14281265#comment-14281265
 ] 

Frantz Mazoyer commented on STORM-585:
--

Hi, should I switch the JIRA to Resolved / Fixed?
Thanks :-)

 Performance issue in none grouping
 --

 Key: STORM-585
 URL: https://issues.apache.org/jira/browse/STORM-585
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer
Priority: Minor
 Fix For: 0.10.0


 In function mk-grouper, target-tasks is originally a ^List
 It then becomes a clojure vector:
 ...
 target-tasks (vec (sort target-tasks))]
 ...
 In :none grouping case, java method '.get' is called on target-tasks object:
 ...
 (.get target-tasks i)
 ...
 At run time, clojure will use introspection to find a method with a matching 
 name and signature, which is very costly.
 Using clojure built-in vector 'get' function instead of '.get' method made us 
 gain 25% performance in our use-case. 
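For readers unfamiliar with why the unhinted interop call is costly, here is a rough Java analogue of the two call paths. The class and method names are invented for illustration; the reflective path approximates what Clojure does at run time when it cannot resolve {code} (.get target-tasks i) {code} statically, versus the direct call that {code} (get target-tasks i) {code} on a vector avoids.

```java
import java.lang.reflect.Method;
import java.util.List;

public class ReflectionCost {
    // Slow path: look the method up reflectively on every call, then invoke
    // it with boxed arguments. This mimics unhinted Clojure interop.
    static Object getReflective(Object tasks, int i) {
        try {
            Method m = tasks.getClass().getMethod("get", int.class);
            return m.invoke(tasks, i);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // Fast path: the call site is resolved at compile time, no lookup.
    static Object getDirect(List<Integer> tasks, int i) {
        return tasks.get(i);
    }
}
```

Both paths return the same element; the reflective one simply pays a method lookup and invocation overhead per call, which is the cost the 25% gain above came from eliminating.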





[jira] [Commented] (STORM-322) Windows Scripts do not handle spaces in JAVA_HOME path

2015-01-16 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280618#comment-14280618
 ] 

Frantz Mazoyer commented on STORM-322:
--

Tested on a windows box with https://github.com/apache/storm/pull/373 
Should be ok :-)


 Windows Scripts do not handle spaces in JAVA_HOME path
 --

 Key: STORM-322
 URL: https://issues.apache.org/jira/browse/STORM-322
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.2-incubating
Reporter: Derek Dagit
Priority: Minor
  Labels: newbie, windows

 If Java is installed to a path that has spaces, the windows scripts will 
 error.





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-12-20 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254585#comment-14254585
 ] 

Frantz Mazoyer commented on STORM-248:
--

Hello,
I would just like to point out that 
[https://issues.apache.org/jira/browse/STORM-487] (removing storm.cmd) has just 
been +1ed.
If this is merged, I will resubmit another pull request without the changes in 
storm*.cmd files.
Kind regards :)

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-12-19 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253598#comment-14253598
 ] 

Frantz Mazoyer commented on STORM-248:
--

Hi, please check the following branch: 
[https://github.com/fmazoyer/storm/commit/b2f8eb651d3e8dbd7939aae126d803da6b948663]

This fixes STORM-248 and STORM-322 (spaces in storm.*cmd).

This was therefore tested on Linux and Windows.

Thanks a lot for your help and comments :-)
Kind regards

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-12-17 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249851#comment-14249851
 ] 

Frantz Mazoyer commented on STORM-248:
--

Thanks a lot for answering my question and for your comments :-)

So this is my understanding of the behaviour of storm.py:
{code}
  storm nimbus|supervisor|...
{code}
will eventually run :
{code}
  java ... -Dlogback.configurationFile=path/cluster.xml ...
{code}
As far as the daemons (nimbus, supervisor...) are concerned, this {code} 
-Dlogback.configurationFile {code} is 'opaquely' passed on to the JVM and 
processed by the logback library directly.
The supervisor process itself is no exception to the rule, except that 
supervisor.clj forks the worker daemons, with the following parameter:
{code}
(str "-Dlogback.configurationFile=" storm-home file-path-separator "logback" 
file-path-separator "worker.xml")
{code}
If we keep the same mechanism, we have to somehow provide the path to the 
logback configuration directory (that would be the same as the one where 
cluster.xml is, that's right).

In any case, I guess we will agree on the fact that:
  - We need to get the storm.logback.conf.dir from storm.yaml in storm.py, 
let's call it {code} get_logback_conf_dir() {code}
  - For each daemon, we can pass on to the JVM: {code} 
"-Dlogback.configurationFile=" + get_logback_conf_dir() + "/cluster.xml" {code}

Now we may disagree on that :-):
   - For the supervisor daemon, pass on to the JVM: 
   {code}  
   "-Dstorm.logback.conf.dir=" + get_logback_conf_dir(),
   "-Dlogback.configurationFile=" + get_logback_conf_dir() + "/cluster.xml"
   {code}
  - In launch-worker in supervisor.clj, add the following lines:
   {code}
  ...
  (let ...
  storm-logback-conf-dir (or (System/getProperty 
"storm.logback.conf.dir") (str storm-home file-path-separator "logback"))
  ...
  command (concat ...
  ...
  (str "-Dlogback.configurationFile=" storm-logback-conf-dir 
file-path-separator "worker.xml")
  ...
{code}
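The fallback rule sketched above can be written out in Java terms as well. The helper below is hypothetical (only the storm.logback.conf.dir property name and the storm-home/logback default come from the discussion): prefer the property when set, else fall back to the default directory under storm home.

```java
// Hypothetical helper mirroring the proposed resolution order for the
// logback configuration directory.
public class LogbackConfDir {
    static String resolve(String stormHome) {
        String dir = System.getProperty("storm.logback.conf.dir");
        // Fall back to <storm-home>/logback when the property is unset.
        return (dir != null) ? dir : stormHome + java.io.File.separator + "logback";
    }
}
```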

Hope I'm not leaving anything out.

What do you reckon?
Thanks a lot for your time :-)

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-12-09 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239347#comment-14239347
 ] 

Frantz Mazoyer commented on STORM-248:
--

Thanks for the comment :-)

I would propose this as a fix, then:
- Add storm.logback.conf.dir to storm.yaml.
- In the python script, for all daemons except the supervisor, determine 
-Dlogback.configurationFile from storm.logback.conf.dir and pass it on to the 
daemon as "-Dlogback.configurationFile=" + STORM_LOGBACK_CONF_DIR + 
"/cluster.xml".
  If not defined, fall back to the default/usual storm logback directory 
pointing to cluster.xml.
- In the python script, for the supervisor, if defined, pass on 
"-Dstorm.logback.conf.dir=" + STORM_LOGBACK_CONF_DIR and 
"-Dlogback.configurationFile=" + STORM_LOGBACK_CONF_DIR + "/cluster.xml".
  Otherwise, fall back to the default/usual storm logback directory pointing 
to cluster.xml.
- In supervisor.clj, if storm.logback.conf.dir is defined, use it as the root 
directory for worker.xml in the launch command line; otherwise, fall back to 
the default/usual storm logback directory pointing to worker.xml.

What do you think?
Thanks again for the help :-)

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM arguement for the supervisor and pass it down to the 
 worker.
 System.get(logback.configurationFile, storm-home + /logback/cluster.xml)





[jira] [Created] (STORM-585) Performance issue in none grouping

2014-12-05 Thread Frantz Mazoyer (JIRA)
Frantz Mazoyer created STORM-585:


 Summary: Performance issue in none grouping
 Key: STORM-585
 URL: https://issues.apache.org/jira/browse/STORM-585
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.3-rc2
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer
Priority: Minor
 Fix For: 0.10.0








[jira] [Updated] (STORM-585) Performance issue in none grouping

2014-12-05 Thread Frantz Mazoyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantz Mazoyer updated STORM-585:
-
  Description: 
In function mk-grouper, target-tasks is originally a ^List
It then becomes a clojure vector:
...
target-tasks (vec (sort target-tasks))]
...

In :none grouping case, java method '.get' is called on target-tasks object:
...
(.get target-tasks i)
...

At run time, clojure will use introspection to find a method with a matching 
name and signature, which is very costly.

Using clojure built-in vector 'get' function instead of '.get' method made us 
gain 25% performance in our use-case. 

Affects Version/s: 0.10.0
   0.9.3
   0.9.2-incubating

 Performance issue in none grouping
 --

 Key: STORM-585
 URL: https://issues.apache.org/jira/browse/STORM-585
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 0.9.2-incubating, 0.9.3, 0.10.0, 0.9.3-rc2
Reporter: Frantz Mazoyer
Assignee: Frantz Mazoyer
Priority: Minor
 Fix For: 0.10.0


 In function mk-grouper, target-tasks is originally a ^List
 It then becomes a clojure vector:
 ...
 target-tasks (vec (sort target-tasks))]
 ...
 In :none grouping case, java method '.get' is called on target-tasks object:
 ...
 (.get target-tasks i)
 ...
 At run time, clojure will use introspection to find a method with a matching 
 name and signature, which is very costly.
 Using clojure built-in vector 'get' function instead of '.get' method made us 
 gain 25% performance in our use-case. 





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-11-24 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223101#comment-14223101
 ] 

Frantz Mazoyer commented on STORM-248:
--

Hello,

I would like to lift this limitation to ease my deployments.

Here's the fix I propose:

In file supervisor.clj, function launch-worker, in the 'let', I would add 
'log-conf-file' like so:
...
(let [conf (:conf supervisor)
  storm-home (System/getProperty "storm.home")
  log-conf-file (or (System/getProperty "logback.configurationFile") 
(str storm-home "/logback/cluster.xml"))  
...
and substitute current concatenation of logback.configurationFile with it:
...
 (str "-Dlogback.configurationFile=" log-conf-file)
...

If it's ok with you, can I submit a pull request on the master, next?

Thanks a lot for your help.

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")





[jira] [Assigned] (STORM-248) cluster.xml location is hardcoded for workers.

2014-11-24 Thread Frantz Mazoyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frantz Mazoyer reassigned STORM-248:


Assignee: Frantz Mazoyer

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")





[jira] [Commented] (STORM-248) cluster.xml location is hardcoded for workers.

2014-11-24 Thread Frantz Mazoyer (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223116#comment-14223116
 ] 

Frantz Mazoyer commented on STORM-248:
--

 Hello, 

I would like to lift this limitation to ease my deployments. 

Here's the fix I propose: 

In file supervisor.clj, function launch-worker, in the 'let', I would add 
'storm-log-conf-file' like so: 
... 
(let [conf (:conf supervisor) 
  storm-home (System/getProperty "storm.home")
  ... 
  storm-log-conf-file (or (System/getProperty 
"logback.configurationFile") (str storm-home file-path-separator "logback" 
file-path-separator "cluster.xml")) 
... 
and substitute current concatenation of logback.configurationFile with it: 
... 
 (str "-Dlogback.configurationFile=" storm-log-conf-file) 
... 

If it's ok with you, can I submit a pull request on the master, next? 

Thanks a lot for your help

 cluster.xml location is hardcoded for workers.
 --

 Key: STORM-248
 URL: https://issues.apache.org/jira/browse/STORM-248
 Project: Apache Storm
  Issue Type: Bug
Reporter: Edward Goslin
Assignee: Frantz Mazoyer
Priority: Trivial

 when the supervisor spawns a worker process, it assumes the cluster.xml is in
 -Dlogback.configurationFile= storm-home /logback/cluster.xml
 It should take the VM argument for the supervisor and pass it down to the 
 worker.
 System.get("logback.configurationFile", storm-home + "/logback/cluster.xml")


