[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-10-14 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171522#comment-14171522
 ] 

Purshotam Shah commented on OOZIE-1813:
---

Thanks, Rohini, for the review. Committed to trunk.

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Fix For: trunk
>
> Attachments: OOZIE-1813-Amendment-V1.patch, 
> OOZIE-1813-Amendment-V1.patch, OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch, 
> OOZIE-1813-V7.patch, OOZIE-1813-V8.patch
>
>
> People leave their test coordinator and bundle jobs running without ever
> killing them, and the jobs consume significant resources. We should have a
> service that periodically checks for abandoned coords and reports/kills them.
> We can add multiple criteria for this (for example, the number of consecutive
> failed/timedout actions or the total number of failed/timedout actions).
> To start with, if the number of coord actions with failed/timedout status
> exceeds a defined value, the coord is considered to be rogue.
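
As a rough illustration of the initial rule in the description above (a coord is rogue when its count of failed/timedout actions exceeds a defined value), here is a minimal Java sketch. It is not the code from the attached patches: the class name, the `ActionStatus` enum, and the `threshold` parameter are all hypothetical and exist only for this example.

```java
import java.util.List;

// Hypothetical sketch, not taken from the OOZIE-1813 patches.
final class RogueCoordSketch {

    // Assumed, simplified set of coord action states for illustration.
    enum ActionStatus { SUCCEEDED, FAILED, TIMEDOUT, KILLED, RUNNING }

    // Counts the actions whose status is FAILED or TIMEDOUT.
    static long countFailedOrTimedOut(List<ActionStatus> actions) {
        return actions.stream()
                .filter(s -> s == ActionStatus.FAILED || s == ActionStatus.TIMEDOUT)
                .count();
    }

    // A coord is treated as rogue when the count is strictly greater
    // than the configured threshold (the "defined value").
    static boolean isRogue(List<ActionStatus> actions, int threshold) {
        return countFailedOrTimedOut(actions) > threshold;
    }
}
```

The comparison is strict, matching the "> defined value" wording, so a coord whose failure count equals the threshold is not yet flagged.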



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-10-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14171385#comment-14171385
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 2 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 1538
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/2036/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-18 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139775#comment-14139775
 ] 

Rohini Palaniswamy commented on OOZIE-1813:
---

+1. The longer lines are named queries.



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139688#comment-14139688
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 2 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color} - patch does not compile, cannot run testcases
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1994/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139638#comment-14139638
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 2 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color} - patch does not compile, cannot run testcases
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1993/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-18 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139486#comment-14139486
 ] 

Purshotam Shah commented on OOZIE-1813:
---

{quote}
Tests failed: 1
. Tests errors: 0
. The patch failed the following testcases:
. 
testMessage_withMixedStatus(org.apache.oozie.command.coord.TestAbandonedCoordChecker)
{quote}
Not sure why this failed in pre-commit. I ran the whole test case multiple 
times on my local box with no failures. Uploaded the patch again to retrigger 
the pre-commit build.



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137979#comment-14137979
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 2 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1531
.Tests failed: 1
.Tests errors: 0

.The patch failed the following testcases:

.  
testMessage_withMixedStatus(org.apache.oozie.command.coord.TestAbandonedCoordChecker)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1987/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Fix For: trunk
>
> Attachments: OOZIE-1813-Amendment-V1.patch, OOZIE-1813-V2.patch, 
> OOZIE-1813-V3.patch, OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, 
> OOZIE-1813-V6.patch, OOZIE-1813-V7.patch, OOZIE-1813-V8.patch


[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-17 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137779#comment-14137779
 ] 

Rohini Palaniswamy commented on OOZIE-1813:
---

From [~mchiang_4w...@yahoo.com]:
  Currently, the difference between the current time and the coord job start
time is used to check whether an abandoned job is "older_than" the limit.

If a coord job is in catch-up mode and its start time is earlier than the
current time, then it will be considered an "older" job to kill even though
it was created just now.

However, a coord job can be created now with a start time in the future, so
using the coord job created time as the base may not be accurate either.

Thanks for catching this, Michelle. So the buffer of 2 days should be counted 
from the max of (created time, start time), and OOZIE-1813-Amendment-V1.patch 
addresses that.

+1 Pending jenkins.
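
For readers following the thread, the max-of-(created time, start time) idea can be sketched in Java roughly as below. This is an illustrative assumption, not the contents of OOZIE-1813-Amendment-V1.patch: the class name, method name, and parameters are hypothetical, and the 2-day buffer is passed in rather than read from any Oozie configuration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the age check discussed above; not the patch code.
final class AbandonedAgeSketch {

    // Returns true when the later of (created time, start time) is more
    // than `buffer` before `now`. Using the later timestamp keeps both a
    // freshly created catch-up job (start time in the past) and a job
    // whose start time is still in the future from being treated as old.
    static boolean isOlderThanBuffer(Instant createdTime, Instant startTime,
                                     Instant now, Duration buffer) {
        Instant base = createdTime.isAfter(startTime) ? createdTime : startTime;
        return Duration.between(base, now).compareTo(buffer) > 0;
    }
}
```

With a buffer of `Duration.ofDays(2)`, a job created just now with a start time 30 days in the past is measured from its created time, so it is not considered old.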



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-17 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137767#comment-14137767
 ] 

Purshotam Shah commented on OOZIE-1813:
---

The 2-day buffer fails for catch-up jobs. Attaching an amendment patch.



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-10 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128933#comment-14128933
 ] 

Purshotam Shah commented on OOZIE-1813:
---

Thanks, Rohini and Robert, for the review. Committed to trunk.

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Fix For: trunk
>
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch, 
> OOZIE-1813-V7.patch, OOZIE-1813-V8.patch


[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-10 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128900#comment-14128900
 ] 

Rohini Palaniswamy commented on OOZIE-1813:
---

+1. The lines longer than 132 characters are named queries, so they are OK. 
The test failure is a known flaky test.

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch, 
> OOZIE-1813-V7.patch, OOZIE-1813-V8.patch


[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124264#comment-14124264
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1522
.Tests failed: 4
.Tests errors: 0

.The patch failed the following testcases:

.  
testBundleStatusTransitServiceKilled2(org.apache.oozie.service.TestStatusTransitService)
.  
testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService)
.  
testActionKillCommandDate(org.apache.oozie.command.coord.TestCoordActionsKillXCommand)
.  
testMemoryUsageAndSpeed(org.apache.oozie.service.TestPartitionDependencyManagerService)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1920/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124254#comment-14124254
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1522
.Tests failed: 0
.Tests errors: 1

.The patch failed the following testcases:

.  

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1918/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123866#comment-14123866
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 1522
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1910/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123842#comment-14123842
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1522
.Tests failed: 2
.Tests errors: 0

.The patch failed the following testcases:

.  testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService)
.  testPauseBundleAndCoordinator(org.apache.oozie.service.TestPauseTransitService)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1907/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123817#comment-14123817
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:red}-1 COMPILE{color}
.{color:red}-1{color} HEAD does not compile
.{color:red}-1{color} patch does not compile
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color} - patch does not compile, cannot run testcases
{color:red}-1 DISTRO{color}
.{color:red}-1{color} distro tarball fails with the patch


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1916/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123483#comment-14123483
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1521
.Tests failed: 1
.Tests errors: 0

.The patch failed the following testcases:

.  testCoordinatorActionEvent(org.apache.oozie.event.TestEventGeneration)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1902/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122763#comment-14122763
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 1521
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1901/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-04 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14122509#comment-14122509
 ] 

Purshotam Shah commented on OOZIE-1813:
---

New patch, which addresses the following review comments:

1.
{quote}
Actually, on the final patch, can you add the new config properties to 
oozie-default.xml?
{quote}
2.
{quote}
Also, can you add a check to kill the coord job only if it is older than 2 days? 
If a job was submitted and had a lot of failures initially, this would kill 
the coord job. We should give the user some time to correct any errors and 
rerun if needed.
{quote}
3.
{quote}
In HA, this service should run only on the primary server.
{quote}




[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-09-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119306#comment-14119306
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1515
.Tests failed: 3
.Tests errors: 0

.The patch failed the following testcases:

.  testBundleStatusTransitServiceKilled2(org.apache.oozie.service.TestStatusTransitService)
.  testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService)
.  testUnpauseBundleAndCoordinator(org.apache.oozie.service.TestPauseTransitService)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1850/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-08-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110207#comment-14110207
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1512
.Tests failed: 6
.Tests errors: 2

.The patch failed the following testcases:

.  testConcurrencyReachedAndChooseNextEligible(org.apache.oozie.service.TestCallableQueueService)
.  testMain(org.apache.oozie.action.hadoop.TestHiveMain)
.  testPigScript(org.apache.oozie.action.hadoop.TestPigMainWithOldAPI)
.  testPigScript(org.apache.oozie.action.hadoop.TestPigMain)
.  testEmbeddedPigWithinPython(org.apache.oozie.action.hadoop.TestPigMain)
.  testPig_withNullExternalID(org.apache.oozie.action.hadoop.TestPigMain)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1713/




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107546#comment-14107546
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:red}-1 COMPILE{color}
.{color:red}-1{color} HEAD does not compile
.{color:red}-1{color} patch does not compile
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color} - patch does not compile, cannot run testcases
{color:red}-1 DISTRO{color}
.{color:red}-1{color} distro tarball fails with the patch


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1645/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14106736#comment-14106736
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:green}+1 TESTS{color}
.Tests run: 1512
{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1580/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-08-21 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105110#comment-14105110
 ] 

Purshotam Shah commented on OOZIE-1813:
---

In HA, this service should run only on the primary server.
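
A minimal sketch of that gating, assuming some leadership check is available. The class name PrimaryOnlyTask and the isPrimary supplier are hypothetical and are not taken from the patch; in a real deployment the supplier would be backed by whatever the HA setup uses to elect a primary.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical wrapper (not from the patch): runs the wrapped check only
 * when this server currently reports itself as the primary.
 */
final class PrimaryOnlyTask implements Runnable {
    private final BooleanSupplier isPrimary; // assumed leadership check
    private final Runnable check;            // the periodic rogue-job check

    PrimaryOnlyTask(BooleanSupplier isPrimary, Runnable check) {
        this.isPrimary = isPrimary;
        this.check = check;
    }

    @Override
    public void run() {
        // Evaluate leadership on every run, so a failover is picked up
        // by the next scheduled cycle without restarting the service.
        if (isPrimary.getAsBoolean()) {
            check.run();
        }
    }
}
```

Checking leadership inside run() rather than once at startup means a server that becomes primary later starts doing the work on its next cycle, and one that loses the role stops.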



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083966#comment-14083966
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1509
.Tests failed: 2
.Tests errors: 0

.The patch failed the following testcases:

.  testActionKillCommandDate(org.apache.oozie.command.coord.TestCoordActionsKillXCommand)
.  testCoordActionInputCheckXCommandUniqueness(org.apache.oozie.command.coord.TestCoordActionInputCheckXCommand)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1446/



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-06-23 Thread Rohini Palaniswamy (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041461#comment-14041461
 ] 

Rohini Palaniswamy commented on OOZIE-1813:
---

Also, can you add a check to kill the coord job only if it is older than 2 days? 
If a job was submitted and had a lot of failures initially, this would kill 
the coord job. We should give the user some time to correct any errors and 
rerun if needed.
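
The check requested here could be sketched roughly as follows. This is an illustration only, not the code in the patch: RogueCoordRule, failureLimit and minimumAge are hypothetical names, and the failure limit of 20 in the usage note is an arbitrary example value. The rule combines the threshold from the issue description (failed/timedout actions > defined value) with the proposed grace period.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical rule for illustration; not Oozie's actual implementation. */
final class RogueCoordRule {
    private final int failureLimit;    // assumed limit on failed/timed-out actions
    private final Duration minimumAge; // grace period before a job may be killed

    RogueCoordRule(int failureLimit, Duration minimumAge) {
        this.failureLimit = failureLimit;
        this.minimumAge = minimumAge;
    }

    /** A coord counts as rogue once its failed/timed-out actions exceed the limit. */
    boolean isRogue(int failedOrTimedOutActions) {
        return failedOrTimedOutActions > failureLimit;
    }

    /** Kill only when the job is rogue AND strictly older than the grace period. */
    boolean shouldKill(int failedOrTimedOutActions, Instant createdTime, Instant now) {
        Duration age = Duration.between(createdTime, now);
        return isRogue(failedOrTimedOutActions) && age.compareTo(minimumAge) > 0;
    }
}
```

With new RogueCoordRule(20, Duration.ofDays(2)), a job with 50 failed actions that was submitted 12 hours ago would be flagged as rogue but not killed, which leaves the user time to fix the error and rerun.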



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-06-02 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016063#comment-14016063
 ] 

Robert Kanter commented on OOZIE-1813:
--

Actually, on the final patch, can you add the new config properties to 
oozie-default.xml?
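
As a rough illustration of why shipped defaults matter: a service can read its settings from the site configuration and fall back to the documented defaults when a key is absent. The sketch below uses plain java.util.Properties rather than Oozie's configuration classes, and the property keys and default values are made up for the example; they are not the names or values added by the patch.

```java
import java.util.Properties;

/** Hypothetical settings holder; keys and defaults are illustrative only. */
final class RogueCheckConfig {
    static final String FAILURE_LIMIT = "example.rogue.check.failure.limit";
    static final String KILL_JOBS = "example.rogue.check.kill.jobs";

    final int failureLimit;
    final boolean killJobs;

    RogueCheckConfig(Properties site) {
        // Shipped defaults, playing the role of a defaults file.
        Properties defaults = new Properties();
        defaults.setProperty(FAILURE_LIMIT, "25");
        defaults.setProperty(KILL_JOBS, "false");

        // Site values override; lookups fall back to the defaults above.
        Properties merged = new Properties(defaults);
        merged.putAll(site);

        this.failureLimit = Integer.parseInt(merged.getProperty(FAILURE_LIMIT));
        this.killJobs = Boolean.parseBoolean(merged.getProperty(KILL_JOBS));
    }
}
```

Defaulting the kill switch to false in this sketch means an unconfigured installation would only report rogue jobs, which is the more conservative behavior.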



[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-06-02 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016059#comment-14016059
 ] 

Robert Kanter commented on OOZIE-1813:
--

+1 on the latest patch on RB (pending Jenkins)

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch, 
> OOZIE-1813-V7.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014236#comment-14014236
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1454
.Tests failed: 4
.Tests errors: 0

.The patch failed the following testcases:

.  testConcurrencyReachedAndChooseNextEligible(org.apache.oozie.service.TestCallableQueueService)
.  testBundleEngineKill(org.apache.oozie.servlet.TestV1JobServletBundleEngine)
.  testActionInputCheckLatestActionCreationTime(org.apache.oozie.command.coord.TestCoordActionInputCheckXCommand)
.  testTimeOutWithUnresolvedMissingDependencies(org.apache.oozie.command.coord.TestCoordPushDependencyCheckXCommand)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1278/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch, 
> OOZIE-1813-V7.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007962#comment-14007962
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.{color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:red}-1 COMPILE{color}
.{color:red}-1{color} HEAD does not compile
.{color:green}+1{color} patch compiles
.{color:red}-1{color} the patch seems to introduce 472 new javac warning(s)
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1452
.Tests failed: 3
.Tests errors: 4

.The patch failed the following testcases:

.  testMessage_withTimedout(org.apache.oozie.command.coord.TestAbandonedCoordChecker)
.  testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)
.  testConcurrencyReachedAndChooseNextEligible(org.apache.oozie.service.TestCallableQueueService)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1257/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch, OOZIE-1813-V6.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004272#comment-14004272
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:red}-1 RAT{color}
.{color:red}-1{color} the patch seems to introduce 1 new RAT warning(s)
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1452
.Tests failed: 1
.Tests errors: 2

.The patch failed the following testcases:

.  testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1254/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch, OOZIE-1813-V5.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004186#comment-14004186
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:red}-1 RAT{color}
.{color:red}-1{color} the patch seems to introduce 1 new RAT warning(s)
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:red}-1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:red}-1{color} patch does not compile
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color} - patch does not compile, cannot run testcases
{color:red}-1 DISTRO{color}
.{color:red}-1{color} distro tarball fails with the patch


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1253/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch, 
> OOZIE-1813-V4.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004012#comment-14004012
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:red}-1 RAT{color}
.{color:red}-1{color} the patch seems to introduce 1 new RAT warning(s)
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1451
.Tests failed: 3
.Tests errors: 2

.The patch failed the following testcases:

.  testActionInputCheckLatestActionCreationTime(org.apache.oozie.command.coord.TestCoordActionInputCheckXCommandNonUTC)
.  testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)
.  testBundleEngineResume(org.apache.oozie.servlet.TestV1JobServletBundleEngine)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1250/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-16 Thread Purshotam Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000379#comment-14000379
 ] 

Purshotam Shah commented on OOZIE-1813:
---

. -1 the patch contains 2 line(s) longer than 132 characters
Those two lines are named queries (namedQuery).

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch, OOZIE-1813-V3.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-05-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999553#comment-13999553
 ] 

Hadoop QA commented on OOZIE-1813:
--

Testing JIRA OOZIE-1813

Cleaning local git workspace



{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.{color:green}+1{color} the patch does not introduce any @author tags
.{color:green}+1{color} the patch does not introduce any tabs
.{color:green}+1{color} the patch does not introduce any trailing spaces
.{color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.{color:green}+1{color} the patch does adds/modifies 1 testcase(s)
{color:red}-1 RAT{color}
.{color:red}-1{color} the patch seems to introduce 1 new RAT warning(s)
{color:green}+1 JAVADOC{color}
.{color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.{color:green}+1{color} HEAD compiles
.{color:green}+1{color} patch compiles
.{color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.{color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.{color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.Tests run: 1450
.Tests failed: 1
.Tests errors: 7

.The patch failed the following testcases:

.  testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)

{color:green}+1 DISTRO{color}
.{color:green}+1{color} distro tarball builds with the patch 


{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1237/

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Attachments: OOZIE-1813-V2.patch
>
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.





[jira] [Commented] (OOZIE-1813) Add service to report/kill rogue bundles and coordinator jobs

2014-04-29 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984579#comment-13984579
 ] 

Robert Kanter commented on OOZIE-1813:
--

That's a great idea!

> Add service to report/kill rogue bundles and coordinator jobs
> -
>
> Key: OOZIE-1813
> URL: https://issues.apache.org/jira/browse/OOZIE-1813
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
>
> People leave their test coordinator and bundle jobs without ever killing them
> and they just eat up resources heavily. We should have a service which 
> periodically check for abandoned coords and report/kill them.
> We can add multiple logic to this like ( number of consecutive 
> failed/timedout action, total number of failed/timedout action). 
> To start with if number of coord action with failed/timedout status > defined 
> value, then coord is considered to be rogue.
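
The starting rule in the description above (a coord is rogue once its count
of failed/timedout actions exceeds a defined value) can be sketched roughly
as follows. This is only an illustration, not code from any of the attached
patches: the class RogueCoordSketch, the Status enum, the method names, and
the failureLimit parameter are all hypothetical, and the sketch works on
in-memory lists rather than Oozie's job store.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the threshold rule; not the class from the patch.
public class RogueCoordSketch {

    // Simplified stand-in for coordinator action statuses.
    public enum Status { SUCCEEDED, RUNNING, FAILED, TIMEDOUT }

    // Counts the actions that ended in FAILED or TIMEDOUT.
    public static int countFailedOrTimedOut(List<Status> actions) {
        int count = 0;
        for (Status s : actions) {
            if (s == Status.FAILED || s == Status.TIMEDOUT) {
                count++;
            }
        }
        return count;
    }

    // A coord is rogue when the count is strictly greater than the limit,
    // matching "> defined value" in the issue description.
    public static boolean isRogue(List<Status> actions, int failureLimit) {
        return countFailedOrTimedOut(actions) > failureLimit;
    }

    // Returns the ids of all coords considered rogue, in map order.
    public static List<String> findRogue(Map<String, List<Status>> actionsByCoord,
                                         int failureLimit) {
        List<String> rogue = new ArrayList<>();
        for (Map.Entry<String, List<Status>> e : actionsByCoord.entrySet()) {
            if (isRogue(e.getValue(), failureLimit)) {
                rogue.add(e.getKey());
            }
        }
        return rogue;
    }

    public static void main(String[] args) {
        Map<String, List<Status>> jobs = new LinkedHashMap<>();
        jobs.put("coord-A", Arrays.asList(Status.FAILED, Status.TIMEDOUT, Status.FAILED));
        jobs.put("coord-B", Arrays.asList(Status.SUCCEEDED, Status.FAILED, Status.RUNNING));
        System.out.println(findRogue(jobs, 2)); // prints [coord-A]
    }
}
```

A periodic service would run a check like findRogue on a schedule and then
report or kill the returned jobs. The other criterion mentioned in the
description, consecutive failed/timedout actions, would need the actions in
order and is not covered by this sketch.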


