[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2016-11-09 Thread Craig Hawco (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Hawco updated STORM-2194:
---
Description: 
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
`:report-error-and-die`/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down..."))
 ((:suicide-fn <>)))
{code}

After digging into the Java port of this code, it looks like a different bug 
was introduced while porting:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
suicideFn.run();
}
{code}

Was how this was initially ported, and STORM-2142 changed this to:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
} else {
suicideFn.run();
}
}
{code}

However, I believe the correct port is as described above:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
}
suicideFn.run();
}
{code}

I'll look into providing patches for the 1.x and 2.x branches shortly.

  was:
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
`:report-error-and-die`/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupt

[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2016-11-09 Thread Craig Hawco (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Hawco updated STORM-2194:
---
Description: 
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
`:report-error-and-die`/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down..."))
 ((:suicide-fn <>)))
{code}

After digging into the Java port of this code, it looks like a different bug 
was introduced while porting:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
suicideFn.run();
}
{code}

Was how this was initially ported, and STORM-2142 changed this to:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
} else {
suicideFn.run();
}
{code}

However, I believe the correct port is as described above:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
}
suicideFn.run();
{code}

I'll look into providing patches for the 1.x and 2.x branches shortly.

  was:
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
`:report-error-and-die`/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion

[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2016-11-09 Thread Craig Hawco (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Hawco updated STORM-2194:
---
Description: 
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
:report-error-and-die/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down..."))
 ((:suicide-fn <>)))
{code}

After digging into the Java port of this code, it looks like a different bug 
was introduced while porting:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
suicideFn.run();
}
{code}

Was how this was initially ported, and STORM-2142 changed this to:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
} else {
suicideFn.run();
}
{code}

However, I believe the correct port is as described above:

{code}
if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
|| 
Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
LOG.info("Got interrupted exception shutting thread down...");
}
suicideFn.run();
{code}

I'll look into providing patches for the 1.x and 2.x branches shortly.

  was:
I've been trying to track down a cause of some of our issues with some 
exceptions leaving Storm workers in a zombified state for some time. I believe 
I've isolated the bug to the behaviour in 
`:report-error-and-die`/reportErrorAndDie in the executor. Essentially:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion shutting 
thread down...")
   ((:suicide-fn <>
{code}

has the grouping for the if statement slightly wrong. It shouldn't log OR die 
from InterruptedException/InterruptedIOException, but it should log under that 
condition, and ALWAYS die. 

Basically:

{code}
 :report-error-and-die (fn [error]
 (try
   ((:report-error <>) error)
   (catch Exception e
 (log-message "Error while reporting error to 
cluster, proceeding with shutdown")))
 (if (or
(exception-cause? InterruptedException 
error)
(exception-cause? 
java.io.InterruptedIOException error))
   (log-message "Got interrupted excpetion s

[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2016-11-11 Thread Craig Hawco (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Hawco updated STORM-2194:
---
Attachment: scrubbed-thread-dump.txt

Showing 24 spout threads as expected, but only 23 b-0 threads 

> ReportErrorAndDie doesn't always die
> 
>
> Key: STORM-2194
> URL: https://issues.apache.org/jira/browse/STORM-2194
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0, 1.0.2
>Reporter: Craig Hawco
> Attachments: scrubbed-thread-dump.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I've been trying to track down a cause of some of our issues with some 
> exceptions leaving Storm workers in a zombified state for some time. I 
> believe I've isolated the bug to the behaviour in 
> :report-error-and-die/reportErrorAndDie in the executor. Essentially:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down...")
>((:suicide-fn <>
> {code}
> has the grouping for the if statement slightly wrong. It shouldn't log OR die 
> from InterruptedException/InterruptedIOException, but it should log under 
> that condition, and ALWAYS die. 
> Basically:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down..."))
>  ((:suicide-fn <>)))
> {code}
> After digging into the Java port of this code, it looks like a different bug 
> was introduced while porting:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> suicideFn.run();
> }
> {code}
> Was how this was initially ported, and STORM-2142 changed this to:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> } else {
> suicideFn.run();
> }
> {code}
> However, I believe the correct port is as described above:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> }
> suicideFn.run();
> {code}
> I'll look into providing patches for the 1.x and 2.x branches shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2017-04-04 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-2194:

Fix Version/s: 1.1.1
   1.0.4

> ReportErrorAndDie doesn't always die
> 
>
> Key: STORM-2194
> URL: https://issues.apache.org/jira/browse/STORM-2194
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0, 1.0.2
>Reporter: Craig Hawco
>Assignee: Paul Poulosky
> Fix For: 2.0.0, 1.0.4, 1.1.1
>
> Attachments: scrubbed-thread-dump.txt
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I've been trying to track down a cause of some of our issues with some 
> exceptions leaving Storm workers in a zombified state for some time. I 
> believe I've isolated the bug to the behaviour in 
> :report-error-and-die/reportErrorAndDie in the executor. Essentially:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down...")
>((:suicide-fn <>
> {code}
> has the grouping for the if statement slightly wrong. It shouldn't log OR die 
> from InterruptedException/InterruptedIOException, but it should log under 
> that condition, and ALWAYS die. 
> Basically:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down..."))
>  ((:suicide-fn <>)))
> {code}
> After digging into the Java port of this code, it looks like a different bug 
> was introduced while porting:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> suicideFn.run();
> }
> {code}
> Was how this was initially ported, and STORM-2142 changed this to:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> } else {
> suicideFn.run();
> }
> {code}
> However, I believe the correct port is as described above:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> }
> suicideFn.run();
> {code}
> I'll look into providing patches for the 1.x and 2.x branches shortly.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2017-07-03 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-2194:

Fix Version/s: (was: 1.1.1)
   1.2.0

> ReportErrorAndDie doesn't always die
> 
>
> Key: STORM-2194
> URL: https://issues.apache.org/jira/browse/STORM-2194
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0, 1.0.2
>Reporter: Craig Hawco
>Assignee: Paul Poulosky
> Fix For: 2.0.0, 1.0.4, 1.2.0
>
> Attachments: scrubbed-thread-dump.txt
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I've been trying to track down a cause of some of our issues with some 
> exceptions leaving Storm workers in a zombified state for some time. I 
> believe I've isolated the bug to the behaviour in 
> :report-error-and-die/reportErrorAndDie in the executor. Essentially:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down...")
>((:suicide-fn <>
> {code}
> has the grouping for the if statement slightly wrong. It shouldn't log OR die 
> from InterruptedException/InterruptedIOException, but it should log under 
> that condition, and ALWAYS die. 
> Basically:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down..."))
>  ((:suicide-fn <>)))
> {code}
> After digging into the Java port of this code, it looks like a different bug 
> was introduced while porting:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> suicideFn.run();
> }
> {code}
> Was how this was initially ported, and STORM-2142 changed this to:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> } else {
> suicideFn.run();
> }
> {code}
> However, I believe the correct port is as described above:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> }
> suicideFn.run();
> {code}
> I'll look into providing patches for the 1.x and 2.x branches shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2017-07-03 Thread Jungtaek Lim (JIRA)

 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim updated STORM-2194:

Fix Version/s: 1.1.1

> ReportErrorAndDie doesn't always die
> 
>
> Key: STORM-2194
> URL: https://issues.apache.org/jira/browse/STORM-2194
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0, 1.0.2
>Reporter: Craig Hawco
>Assignee: Paul Poulosky
> Fix For: 2.0.0, 1.0.4, 1.1.1, 1.2.0
>
> Attachments: scrubbed-thread-dump.txt
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I've been trying to track down a cause of some of our issues with some 
> exceptions leaving Storm workers in a zombified state for some time. I 
> believe I've isolated the bug to the behaviour in 
> :report-error-and-die/reportErrorAndDie in the executor. Essentially:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down...")
>((:suicide-fn <>
> {code}
> has the grouping for the if statement slightly wrong. It shouldn't log OR die 
> from InterruptedException/InterruptedIOException, but it should log under 
> that condition, and ALWAYS die. 
> Basically:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down..."))
>  ((:suicide-fn <>)))
> {code}
> After digging into the Java port of this code, it looks like a different bug 
> was introduced while porting:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> suicideFn.run();
> }
> {code}
> Was how this was initially ported, and STORM-2142 changed this to:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> } else {
> suicideFn.run();
> }
> {code}
> However, I believe the correct port is as described above:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> }
> suicideFn.run();
> {code}
> I'll look into providing patches for the 1.x and 2.x branches shortly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (STORM-2194) ReportErrorAndDie doesn't always die

2018-10-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/STORM-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated STORM-2194:
--
Labels: pull-request-available  (was: )

> ReportErrorAndDie doesn't always die
> 
>
> Key: STORM-2194
> URL: https://issues.apache.org/jira/browse/STORM-2194
> Project: Apache Storm
>  Issue Type: Bug
>  Components: storm-core
>Affects Versions: 2.0.0, 1.0.2
>Reporter: Craig Hawco
>Assignee: Paul Poulosky
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0, 1.0.4, 1.1.1, 1.2.0
>
> Attachments: scrubbed-thread-dump.txt
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I've been trying to track down a cause of some of our issues with some 
> exceptions leaving Storm workers in a zombified state for some time. I 
> believe I've isolated the bug to the behaviour in 
> :report-error-and-die/reportErrorAndDie in the executor. Essentially:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down...")
>((:suicide-fn <>
> {code}
> has the grouping for the if statement slightly wrong. It shouldn't log OR die 
> from InterruptedException/InterruptedIOException, but it should log under 
> that condition, and ALWAYS die. 
> Basically:
> {code}
>  :report-error-and-die (fn [error]
>  (try
>((:report-error <>) error)
>(catch Exception e
>  (log-message "Error while reporting error to 
> cluster, proceeding with shutdown")))
>  (if (or
> (exception-cause? InterruptedException 
> error)
> (exception-cause? 
> java.io.InterruptedIOException error))
>(log-message "Got interrupted excpetion 
> shutting thread down..."))
>  ((:suicide-fn <>)))
> {code}
> After digging into the Java port of this code, it looks like a different bug 
> was introduced while porting:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> suicideFn.run();
> }
> {code}
> Was how this was initially ported, and STORM-2142 changed this to:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> } else {
> suicideFn.run();
> }
> {code}
> However, I believe the correct port is as described above:
> {code}
> if (Utils.exceptionCauseIsInstanceOf(InterruptedException.class, e)
> || 
> Utils.exceptionCauseIsInstanceOf(java.io.InterruptedIOException.class, e)) {
> LOG.info("Got interrupted exception shutting thread down...");
> }
> suicideFn.run();
> {code}
> I'll look into providing patches for the 1.x and 2.x branches shortly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)