[jira] [Updated] (STORM-2754) Not killing on exceptions in other threads

2017-09-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2754:
--
Labels: pull-request-available  (was: )

> Not killing on exceptions in other threads
> --
>
> Key: STORM-2754
> URL: https://issues.apache.org/jira/browse/STORM-2754
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>  Labels: pull-request-available
>
> We probably don't want to kill the process if the exceptions are from other 
> threads



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (STORM-2754) Not killing on exceptions in other threads

2017-09-21 Thread Ethan Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Li updated STORM-2754:

Description: We probably don't want to kill the process if the exceptions 
are from other threads  (was: We probably don't want to kill the threads if the 
exceptions are from other threads)

> Not killing on exceptions in other threads
> --
>
> Key: STORM-2754
> URL: https://issues.apache.org/jira/browse/STORM-2754
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>
> We probably don't want to kill the process if the exceptions are from other 
> threads





[jira] [Created] (STORM-2754) Not killing on exceptions in other threads

2017-09-21 Thread Ethan Li (JIRA)
Ethan Li created STORM-2754:
---

 Summary: Not killing on exceptions in other threads
 Key: STORM-2754
 URL: https://issues.apache.org/jira/browse/STORM-2754
 Project: Apache Storm
  Issue Type: Improvement
Reporter: Ethan Li
Assignee: Ethan Li
Priority: Minor


We probably don't want to kill the threads if the exceptions are from other 
threads





[jira] [Updated] (STORM-2753) Avoid shutting down netty server on netty exception

2017-09-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2753:
--
Labels: pull-request-available  (was: )

> Avoid shutting down netty server on netty exception
> ---
>
> Key: STORM-2753
> URL: https://issues.apache.org/jira/browse/STORM-2753
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-client
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>  Labels: pull-request-available
>
> We should avoid shutting down netty server on netty exception





[jira] [Created] (STORM-2753) Avoid shutting down netty server on netty exception

2017-09-21 Thread Ethan Li (JIRA)
Ethan Li created STORM-2753:
---

 Summary: Avoid shutting down netty server on netty exception
 Key: STORM-2753
 URL: https://issues.apache.org/jira/browse/STORM-2753
 Project: Apache Storm
  Issue Type: Bug
  Components: storm-client
Reporter: Ethan Li
Assignee: Ethan Li
Priority: Minor


We should avoid shutting down netty server on netty exception





[jira] [Updated] (STORM-2084) after supervisor v2 merge async localizer and localizer

2017-09-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2084:
--
Labels: pull-request-available  (was: )

> after supervisor v2 merge async localizer and localizer
> ---
>
> Key: STORM-2084
> URL: https://issues.apache.org/jira/browse/STORM-2084
> Project: Apache Storm
>  Issue Type: Improvement
>  Components: storm-core
>Affects Versions: 2.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>  Labels: pull-request-available
>
> Once we merge in STORM-2018 
> https://github.com/apache/storm/pull/1642 
> we should look into merging the two localizers into a single class.





[jira] [Updated] (STORM-2744) Add in "restart timeout" for backpressure

2017-09-21 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2744:
--
Labels: pull-request-available  (was: )

> Add in "restart timeout" for backpressure
> -
>
> Key: STORM-2744
> URL: https://issues.apache.org/jira/browse/STORM-2744
> Project: Apache Storm
>  Issue Type: Improvement
>Reporter: Ethan Li
>Assignee: Ethan Li
>Priority: Minor
>  Labels: pull-request-available
>
> Instead of stopping indefinitely, we want to add a timeout value to the 
> backpressure mechanism so that spouts won't get stuck if bolts fail to switch 
> back on.
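
One possible reading of the "restart timeout" described above, as a minimal Java sketch. It is not Storm's backpressure code: the class BackpressureGate, its methods, and the use of caller-supplied millisecond timestamps are all assumptions made for illustration. The gate lets a spout emit again once the throttle has been on for longer than the timeout, even if no "off" signal ever arrives.

```java
// Hypothetical throttle gate with a restart timeout; names are illustrative, not Storm's API.
final class BackpressureGate {
    private final long timeoutMillis;
    private boolean throttled;
    private long throttledSinceMillis;

    BackpressureGate(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called when backpressure is switched on (true) or off (false).
    synchronized void setThrottle(boolean on, long nowMillis) {
        if (on && !throttled) {
            // Remember when this throttling period started.
            throttledSinceMillis = nowMillis;
        }
        throttled = on;
    }

    // Called by the spout before emitting.
    synchronized boolean mayEmit(long nowMillis) {
        if (throttled && nowMillis - throttledSinceMillis >= timeoutMillis) {
            // Timeout expired: resume even though no "off" signal arrived.
            throttled = false;
        }
        return !throttled;
    }
}
```

In this sketch, repeated "on" signals during one throttling period do not reset the timer, so a bolt that keeps signalling cannot hold the spout longer than the timeout. Whether that is the desired behavior is a design choice the issue does not settle.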





[jira] [Resolved] (STORM-2748) TickTupleTest is useless

2017-09-21 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans resolved STORM-2748.

   Resolution: Fixed
Fix Version/s: 2.0.0

> TickTupleTest is useless
> 
>
> Key: STORM-2748
> URL: https://issues.apache.org/jira/browse/STORM-2748
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-server
>Affects Versions: 2.0.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The test starts up a small topology on a simulated time cluster with 
> TOPOLOGY_TICK_TUPLE_FREQ_SECS set to 1.  Then it simulates 2 seconds of 
> cluster time.  This is not enough time to even launch the topology.  How do I 
> know this?  Because the Bolt and Spout in the topology override `writeObject` 
> so the resulting serialized bolt and spout are empty and trying to 
> deserialize them results in an exception.
> Just running a topology that does nothing and never verifies that the ticks 
> showed up is a really horrible test.  We should either delete it entirely or 
> actually verify that ticks are showing up once a second.  I am leaning 
> towards just removing it totally.





[jira] [Created] (STORM-2752) Nimbus crashes silently if scheduler is not found

2017-09-21 Thread Martin Burian (JIRA)
Martin Burian created STORM-2752:


 Summary: Nimbus crashes silently if scheduler is not found
 Key: STORM-2752
 URL: https://issues.apache.org/jira/browse/STORM-2752
 Project: Apache Storm
  Issue Type: Bug
Affects Versions: 1.0.5
Reporter: Martin Burian


When nimbus is started and the custom scheduler specified in storm.yaml is not 
in the classpath, nimbus hangs and exits with status 13 about 10s later. No 
errors are logged.

Affected versions: 1.0.3 through 1.0.5; I did not test any others. OpenJDK 8.
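
For illustration only, a Java sketch of how a missing scheduler class could be reported instead of failing silently. This is not the nimbus code path: SchedulerLoader and newInstance are hypothetical names, and printing to System.err stands in for whatever logger would really be used.

```java
// Hypothetical helper; not part of Storm. Loads a class by name and reports failures loudly.
final class SchedulerLoader {
    private SchedulerLoader() {}

    // Returns a new instance of className, or throws IllegalStateException naming the class.
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Covers ClassNotFoundException, NoSuchMethodException, InstantiationException, etc.
            String msg = "Could not load scheduler class '" + className
                    + "'; check storm.yaml and the classpath";
            System.err.println(msg + ": " + e);
            throw new IllegalStateException(msg, e);
        }
    }
}
```

The point of the sketch is only that the class name appears both in the log output and in the exception, so an operator can tell which configuration entry is wrong.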





[jira] [Assigned] (STORM-2750) fix double_checked locking

2017-09-21 Thread Stig Rohde Døssing (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stig Rohde Døssing reassigned STORM-2750:
-

Assignee: Huaiyong Fu

> fix double_checked locking
> --
>
> Key: STORM-2750
> URL: https://issues.apache.org/jira/browse/STORM-2750
> Project: Apache Storm
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Huaiyong Fu
>Assignee: Huaiyong Fu
>  Labels: pull-request-available
>
> Update the HBaseSecurityUtil singleton to fix double-checked locking.
> Double-Checked Locking is widely cited and used as an efficient method for 
> implementing lazy initialization in a multithreaded environment.
> Unfortunately, it will not work reliably in a platform independent way when 
> implemented in Java, without additional synchronization. When implemented in 
> other languages, such as C++, it depends on the memory model of the 
> processor, the reorderings performed by the compiler and the interaction 
> between the compiler and the synchronization library. Since none of these are 
> specified in a language such as C++, little can be said about the situations 
> in which it will work. Explicit memory barriers can be used to make it work 
> in C++, but these barriers are not available in Java.
> See the link for details: 
> http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
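
As a hedged illustration of the kind of fix described above (not the actual HBaseSecurityUtil patch), the Java sketch below shows two common ways to get safe lazy initialization. Under the Java 5 and later memory model, double-checked locking works if the field is declared volatile; the initialization-on-demand holder idiom avoids explicit locking altogether. The class names LazySingleton and HolderSingleton and the CREATED counter are hypothetical and exist only for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a lazily created singleton; not Storm's HBaseSecurityUtil.
final class LazySingleton {
    // Counts constructor calls so callers can check that only one instance is built.
    static final AtomicInteger CREATED = new AtomicInteger();

    // volatile makes the fully constructed object visible to other threads (Java 5+).
    private static volatile LazySingleton instance;

    private LazySingleton() {
        CREATED.incrementAndGet();
    }

    // Double-checked locking; correct only because 'instance' is volatile.
    static LazySingleton getInstance() {
        LazySingleton local = instance;      // first check, no lock
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;            // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;
                }
            }
        }
        return local;
    }
}

// Alternative: initialization-on-demand holder idiom, which relies on class initialization.
final class HolderSingleton {
    private HolderSingleton() {}

    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

The holder variant is usually the simpler choice when the singleton needs no constructor arguments; the volatile variant is useful when initialization depends on state that is only available at first use.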





[jira] [Comment Edited] (STORM-2083) Blacklist Scheduler

2017-09-21 Thread Jungtaek Lim (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174987#comment-16174987
 ] 

Jungtaek Lim edited comment on STORM-2083 at 9/21/17 3:51 PM:
--

[~howardhl]
Hi, I think GitHub mentions don't work for most of us, so I'm leaving a kind 
reminder from JIRA. Could you spend some time to continue working on the PR and 
finalize it? If you can't, or don't want to do it yourself, please let me know 
so that I can take over while preserving your credit. 
Thanks in advance!


was (Author: kabhwan):
[~howardhl]
Hi, I think GitHub mentions don't work for most of us, so I'm leaving a kind 
reminder. Could you spend some time to continue working on the PR and 
finalize it? If you can't, or don't want to do it yourself, please let me know 
so that I can take over while preserving your credit. 
Thanks in advance!

> Blacklist Scheduler
> ---
>
> Key: STORM-2083
> URL: https://issues.apache.org/jira/browse/STORM-2083
> Project: Apache Storm
>  Issue Type: New Feature
>  Components: storm-core
>Reporter: Howard Lee
>  Labels: blacklist, scheduling
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> My company has gone through a fault in production, in which a critical switch 
> caused an unstable network for a set of machines, with a packet loss rate of 
> 30%-50%. In such a fault, the supervisors and workers on the machines are not 
> definitively dead, which would be easy to handle. Instead they are still alive 
> but very unstable. They occasionally lose heartbeats to the nimbus. The nimbus, 
> in such circumstances, will still assign jobs to these machines, but will soon 
> find them invalid again, resulting in a very slow convergence to a stable state.
> To deal with such unstable cases, we intend to implement a blacklist 
> scheduler, which will add the unstable nodes (supervisors, slots) to the 
> blacklist temporarily, and resume them later. 
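
A minimal Java sketch of the temporary blacklist idea described above, under stated assumptions. It is not the scheduler from the pull request: the class TemporaryBlacklist, the tolerance of missed heartbeats, the fixed blacklist duration, and the caller-supplied millisecond timestamps are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical temporary blacklist keyed by node id; not Storm's implementation.
final class TemporaryBlacklist {
    private final int tolerance;          // missed heartbeats allowed before blacklisting
    private final long blacklistMillis;   // how long a node stays blacklisted
    private final Map<String, Integer> missed = new HashMap<>();
    private final Map<String, Long> blacklistedUntil = new HashMap<>();

    TemporaryBlacklist(int tolerance, long blacklistMillis) {
        this.tolerance = tolerance;
        this.blacklistMillis = blacklistMillis;
    }

    // Record one missed heartbeat; blacklist the node once the tolerance is reached.
    void recordMissedHeartbeat(String node, long nowMillis) {
        int count = missed.merge(node, 1, Integer::sum);
        if (count >= tolerance) {
            blacklistedUntil.put(node, nowMillis + blacklistMillis);
            missed.remove(node);
        }
    }

    // A node is skipped for scheduling until its blacklist period expires.
    boolean isBlacklisted(String node, long nowMillis) {
        Long until = blacklistedUntil.get(node);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            blacklistedUntil.remove(node); // period over: the node is resumed
            return false;
        }
        return true;
    }
}
```

A scheduler built on this would consult isBlacklisted before assigning work to a supervisor or slot, so a flapping node is left alone for a while instead of being reassigned on every heartbeat recovery.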





[jira] [Commented] (STORM-2083) Blacklist Scheduler

2017-09-21 Thread Jungtaek Lim (JIRA)

[ 
https://issues.apache.org/jira/browse/STORM-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174987#comment-16174987
 ] 

Jungtaek Lim commented on STORM-2083:
-

[~howardhl]
Hi, I think GitHub mentions don't work for most of us, so I'm leaving a kind 
reminder. Could you spend some time to continue working on the PR and 
finalize it? If you can't, or don't want to do it yourself, please let me know 
so that I can take over while preserving your credit. 
Thanks in advance!

> Blacklist Scheduler
> ---
>
> Key: STORM-2083
> URL: https://issues.apache.org/jira/browse/STORM-2083
> Project: Apache Storm
>  Issue Type: New Feature
>  Components: storm-core
>Reporter: Howard Lee
>  Labels: blacklist, scheduling
>  Time Spent: 15h 10m
>  Remaining Estimate: 0h
>
> My company has gone through a fault in production, in which a critical switch 
> caused an unstable network for a set of machines, with a packet loss rate of 
> 30%-50%. In such a fault, the supervisors and workers on the machines are not 
> definitively dead, which would be easy to handle. Instead they are still alive 
> but very unstable. They occasionally lose heartbeats to the nimbus. The nimbus, 
> in such circumstances, will still assign jobs to these machines, but will soon 
> find them invalid again, resulting in a very slow convergence to a stable state.
> To deal with such unstable cases, we intend to implement a blacklist 
> scheduler, which will add the unstable nodes (supervisors, slots) to the 
> blacklist temporarily, and resume them later. 





[jira] [Updated] (STORM-2751) Remove AsyncLoggingContext from Supervisor

2017-09-21 Thread Kishor Patil (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishor Patil updated STORM-2751:

Description: 
If the disk cannot keep up with all the logging, the {{AsyncLoggingContext}} 
causes a large amount of heap memory to be used, causing the JVM to churn CPU.


  was:If disk can not keep it up with all logging, the {{AsyncLoggingContext}} 
causes large heap memory be utilized causing the JVM.


> Remove AsyncLoggingContext from Supervisor
> --
>
> Key: STORM-2751
> URL: https://issues.apache.org/jira/browse/STORM-2751
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Reporter: Kishor Patil
>Assignee: Kishor Patil
>Priority: Minor
>
> If the disk cannot keep up with all the logging, the {{AsyncLoggingContext}} 
> causes a large amount of heap memory to be used, causing the JVM to churn CPU.





[jira] [Created] (STORM-2751) Remove AsyncLoggingContext from Supervisor

2017-09-21 Thread Kishor Patil (JIRA)
Kishor Patil created STORM-2751:
---

 Summary: Remove AsyncLoggingContext from Supervisor
 Key: STORM-2751
 URL: https://issues.apache.org/jira/browse/STORM-2751
 Project: Apache Storm
  Issue Type: Bug
  Components: storm-core
Reporter: Kishor Patil
Assignee: Kishor Patil
Priority: Minor


If disk can not keep it up with all logging, the {{AsyncLoggingContext}} causes 
large heap memory be utilized causing the JVM.


