[jira] [Commented] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2015-10-20 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14966019#comment-14966019
 ] 

Mona Chitnis commented on OOZIE-1976:
-

Thanks [~puru] for your patch. I did a first pass as well and have a few 
comments. Waiting for your replies.

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
> Issue Type: New Feature
> Components: coordinator
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Purshotam Shah
> Fix For: trunk
>
> Attachments: Input-check.docx, OOZIE-1976-WIP.patch, 
> OOZIE-1976-rough-design-2.pdf, OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to a coordinator currently work on 
> AND logic, i.e. ALL of them should be available for the workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria, e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then review the implementation for 
> some or all of the above features.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 38474: OOZIE-1976- Specifying coordinator input datasets in more logical ways

2015-10-20 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38474/#review103325
---


Having a little trouble zero-ing in on the code that checks that logically the 
dependencies have been met. Can you point me to that class? Or is that 
abstracted by the JexlEngine? (I'm not familiar with it)


client/src/main/resources/oozie-coordinator-0.5.xsd (line 110)
<https://reviews.apache.org/r/38474/#comment161313>

There is, through 'maxOccurs', but I'm thinking this patch supports 
arbitrary depth.



client/src/main/resources/oozie-coordinator-0.5.xsd (line 126)
<https://reviews.apache.org/r/38474/#comment161314>

what is 'combine' used for?



core/src/main/java/org/apache/oozie/CoordinatorActionBean.java (line 172)
<https://reviews.apache.org/r/38474/#comment161315>

this is part of a different change



core/src/main/java/org/apache/oozie/CoordinatorActionBean.java (line 859)
<https://reviews.apache.org/r/38474/#comment161316>

Log the exception here



core/src/main/java/org/apache/oozie/CoordinatorActionBean.java (line 882)
<https://reviews.apache.org/r/38474/#comment161318>

same as above, log the exception here



core/src/main/java/org/apache/oozie/command/coord/CoordActionInputCheckXCommand.java (line 155)
<https://reviews.apache.org/r/38474/#comment161319>

Let's make "input-check" a private static final String and use it in the 
multiple places in the code, so there is a single place to change in case the 
name changes later.
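
A minimal sketch of what I mean, not the patch's code: the holder class `CoordInputCheckNames` and its two helper methods are made up for illustration, and only the tag name "input-check" comes from the patch.

```java
// Hypothetical holder class; only the tag name "input-check" comes from the patch under review.
final class CoordInputCheckNames {
    // Single definition of the element name, so a later rename touches one line.
    private static final String INPUT_CHECK_TAG = "input-check";

    private CoordInputCheckNames() {
    }

    // Returns true when the given XML element name is the input-check element.
    static boolean isInputCheck(String elementName) {
        return INPUT_CHECK_TAG.equals(elementName);
    }

    // Builds the opening tag from the same constant instead of repeating the literal.
    static String openTag() {
        return "<" + INPUT_CHECK_TAG + ">";
    }
}
```

If other classes need the name as well, the constant would have to be widened beyond private, which is a separate judgment call.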



core/src/main/java/org/apache/oozie/command/coord/CoordPushDependencyCheckXCommand.java (line 139)
<https://reviews.apache.org/r/38474/#comment161320>

typo availableList



core/src/main/java/org/apache/oozie/coord/dependency/CoordDependenciesInputCheck.java (line 75)
<https://reviews.apache.org/r/38474/#comment161322>

ditto



core/src/main/java/org/apache/oozie/coord/dependency/CoordDependency.java (line 242)
<https://reviews.apache.org/r/38474/#comment161323>

Doesn't calling close() and flush() on the enclosing OutputStream suffice, 
since that closes/flushes the enclosed stream too? Safeguard against closing a 
stream that's already closed.
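
To illustrate the point with plain JDK streams (a sketch, not the code in CoordDependency.java): the class `StreamCloseSketch` and its `CountingStream` are hypothetical and exist only to make the close call on the inner stream observable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper, not code from the patch.
final class StreamCloseSketch {
    // Inner stream that counts close() calls so the behaviour can be observed.
    static final class CountingStream extends ByteArrayOutputStream {
        int closeCalls = 0;

        @Override
        public void close() throws IOException {
            closeCalls++;
            super.close();
        }
    }

    // Serializes a value; only the enclosing stream is closed explicitly.
    static byte[] serialize(Serializable value, CountingStream sink) {
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the ObjectOutputStream has already flushed and closed the sink.
        return sink.toByteArray();
    }
}
```

With try-with-resources on the outer stream there is no separate close of the inner one, so the double-close question does not come up.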



core/src/main/java/org/apache/oozie/coord/dependency/CoordInputCheckerPhaseOne.java (line 97)
<https://reviews.apache.org/r/38474/#comment161324>

typo getFirst (not getFist)


- Mona Chitnis


On Sept. 18, 2015, 12:20 a.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/38474/
> ---
> 
> (Updated Sept. 18, 2015, 12:20 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1976
> https://issues.apache.org/jira/browse/OOZIE-1976
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> There are three components in this patch
> 
> 1. User interface
> A new tag is added to coordinator.xml
> ex. (the XML sample did not survive in this archived copy)
> 
> input-check can contain nested and/or/combine operations. min and wait can 
> be set on an operator or on a data-in.
> If the input-check tag is missing, the old approach applies, where all 
> data dependencies are required.
> 
> 2. Processing
> input-check is converted into a logical expression, e.g.
>   (a&&b)||(c&&d)
> We use JEXL to parse the logical expression.
> 
> There are three phases in parsing.
> phase 1 : only resolved datasets are parsed (only current).
> phase 2 : once all current are resolved, future/latest are parsed.
> phase 3 : doesn't do any file check; it just returns what was parsed by 
> phase 1 and phase 2. It is used for EL functions.
> 
> 
> 3. Storage.
> If input-check is enabled, push_missing_dependencies and missing_dependencies 
> are serialized and stored in the DB.
> If not, the old approach applies, where they are stored in plain text. This 
> is backward compatible.
> 
> 
> Diffs
> -
> 
>   client/src/main/resources/oozie-coordinator-0.5.xsd 
> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 
>   core/pom.xml ca40e2e22293a3df2841764ce725420857425139 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 
> 188b70e2e76858228b4d42e5798952383719a93d 
>   core/src/main/java/org/apache/oozie/action/ActionExecutor.java 
> ff83

[jira] [Commented] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2015-04-26 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513118#comment-14513118
 ] 

Mona Chitnis commented on OOZIE-1976:
-

Thanks for taking it up, Jaydeep. I will keep a watch on this JIRA for when 
it's ready for review.

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
> Issue Type: New Feature
> Components: coordinator
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Jaydeep Vishwakarma
> Fix For: trunk
>
> Attachments: OOZIE-1976-WIP.patch, OOZIE-1976-rough-design-2.pdf, 
> OOZIE-1976-rough-design.pdf





Re: VOTE Release Oozie 4.1.0 (candidate 1)

2014-11-17 Thread Mona Chitnis
Downloaded the tarball, built and installed Oozie (-DskipTests) and ran an 
example M-R Oozie job against my Hadoop 2.4 cluster. +1 for that.
Was unable to verify the md5 and gpg signatures. I don't find the key used for 
signing in the list of public keys on the KEYS page. Please let me 
know if I'm missing the right procedure.
Regards, Mona Chitnis
 

 On Monday, November 17, 2014 8:08 AM, Shwetha GS  
wrote:
   

 +1

On Fri, Nov 14, 2014 at 6:49 AM, bowen zhang <
bowenzhang...@yahoo.com.invalid> wrote:

> Hi,
>
> I have created a build for Oozie 4.1.0, candidate 1.
>
> Keys to verify the signature of the release artifact are available at
>
>  http://www.apache.org/dist/oozie/KEYS
>
> Please download, test, and try it out:
>
>  http://people.apache.org/~bzhang/oozie-4.1.0-rc1
>
> The release, md5 signature, gpg signature, and rat report can all
> be found at the above address.
>
> Vote closes on Monday EOD, the 17th.
>
> Bowen
>



   

[jira] [Commented] (OOZIE-1913) Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-11-03 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195508#comment-14195508
 ] 

Mona Chitnis commented on OOZIE-1913:
-

The Reviewboard revision is fairly up-to-date except for a couple of unit tests. 
I will be updating those and would then appreciate a review.

> Devise a way to turn off SLA alerts for bundle/coordinator flexibly
> ---
>
> Key: OOZIE-1913
> URL: https://issues.apache.org/jira/browse/OOZIE-1913
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Fix For: trunk
>
>
> From user:
> Need to turn off the SLA miss alerts in jobs when the bundle is suspended for
> grid upgrades and similar work so that when it's resumed we aren't flooded 
> with a bunch of alerts.





[jira] [Commented] (OOZIE-2034) Disable SSLv3 (POODLEbleed vulnerability)

2014-10-24 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183087#comment-14183087
 ] 

Mona Chitnis commented on OOZIE-2034:
-

+1. Pretty straightforward. Thanks for checking the bit about support of TLSv1, 
not TLSv1.1. Can you paste your doc references here for record? 
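
For the record, the same idea at the JSSE level can be sketched as below. This is only an illustration under assumptions: the class `ProtocolFilterSketch` is hypothetical, and the change in this issue is to the Tomcat connector configuration, not to Java code like this.

```java
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.SSLEngine;

// Hypothetical helper; the fix discussed in the issue is a Tomcat connector setting.
final class ProtocolFilterSketch {
    // Returns the given protocol names without SSLv3, preserving order.
    static String[] withoutSslv3(String[] protocols) {
        List<String> kept = new ArrayList<String>();
        for (String protocol : protocols) {
            if (!"SSLv3".equals(protocol)) {
                kept.add(protocol);
            }
        }
        return kept.toArray(new String[0]);
    }

    // Re-enables only the filtered list on an engine, dropping SSLv3 if it was enabled.
    static void disableSslv3(SSLEngine engine) {
        engine.setEnabledProtocols(withoutSslv3(engine.getEnabledProtocols()));
    }
}
```

Which TLS versions remain in the list depends on the JDK in use, which is why the doc references would be good to have here.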

> Disable SSLv3 (POODLEbleed vulnerability)
> -
>
> Key: OOZIE-2034
> URL: https://issues.apache.org/jira/browse/OOZIE-2034
> Project: Oozie
> Issue Type: Bug
> Components: security
> Affects Versions: 4.0.1
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Priority: Blocker
> Fix For: 4.1.0
>
> Attachments: OOZIE-2034.patch, OOZIE-2034.patch
>
>
> We should disable SSLv3 to protect against the POODLEbleed vulnerability.
> See 
> [CVE-2014-3566|http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-3566]
> We have {{sslProtocol="TLS"}} set to only allow TLS in ssl-server.xml, but 
> when I checked, I could still connect with SSLv3.  From what I can tell, 
> there's some ambiguity in the tomcat configs between {{sslProtocol}}, 
> {{sslProtocols}}, and {{sslEnabledProtocols}} so we probably have the wrong 
> thing here.





[jira] [Commented] (OOZIE-2034) Disable SSLv3 (POODLEbleed vulnerability)

2014-10-24 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183074#comment-14183074
 ] 

Mona Chitnis commented on OOZIE-2034:
-

starting to look at this now..






Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-10-20 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/
---

(Updated Oct. 20, 2014, 11:20 p.m.)


Review request for oozie.


Changes
---

Review comments addressed. Minor changes are still required in 2 unit tests; I 
will update those next.


Bugs: OOZIE-1913
https://issues.apache.org/jira/browse/OOZIE-1913


Repository: oozie-git


Description
---

See Jira


Diffs (updated)
-

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 9c2d14b 
  client/src/main/java/org/apache/oozie/client/OozieClient.java 5e53a18 
  
client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java 
801ad7e 
  client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 4cc6606 
  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 759e643 
  core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 2362084 
  core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
070cee5 
  core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
de78ab7 
  
core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
 05b7a62 
  core/src/main/java/org/apache/oozie/coord/CoordUtils.java 4643d73 
  
core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java 
e6ab09b 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
4bccef4 
  core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java c19839f 
  
core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java 
ee1085a 
  core/src/main/java/org/apache/oozie/service/EventHandlerService.java 244c048 
  core/src/main/java/org/apache/oozie/servlet/BaseJobServlet.java c94d1e2 
  core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 2578e41 
  core/src/main/java/org/apache/oozie/servlet/V0JobServlet.java b160b46 
  core/src/main/java/org/apache/oozie/servlet/V1JobServlet.java 8dc9608 
  core/src/main/java/org/apache/oozie/servlet/V2JobServlet.java da81b49 
  core/src/main/java/org/apache/oozie/sla/BundleChangeSlaXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/BundleDisableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/BundleEnableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/CoordChangeSlaXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/CoordDisableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/CoordEnableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 189d5ea 
  core/src/main/java/org/apache/oozie/sla/SLACalculator.java 20f93b5 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 188144e 
  core/src/main/java/org/apache/oozie/sla/SLAOperations.java f5fc826 
  core/src/main/java/org/apache/oozie/sla/service/SLAService.java 89615bc 
  core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 7c2620c 
  core/src/main/resources/oozie-default.xml 26eb7e0 
  
core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java 
f13e48f 
  core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java ae3f18d 
  core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 30fd151 
  core/src/test/java/org/apache/oozie/servlet/DagServletTestCase.java 48193c7 
  core/src/test/java/org/apache/oozie/servlet/TestV2JobServlet.java fb203a6 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java db3f6eb 
  core/src/test/java/org/apache/oozie/store/TestCoordinatorStore.java b8b2405 

Diff: https://reviews.apache.org/r/24487/diff/


Testing
---

unit tests added, e-2-e test with CLI command done


Thanks,

Mona Chitnis



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-10-20 Thread Mona Chitnis


> On Oct. 17, 2014, 7:38 p.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/coord/CoordUtils.java, lines 146-147
> > <https://reviews.apache.org/r/24487/diff/4/?file=692718#file692718line146>
> >
> > What happens to other commands?

The other commands calling this util method are kill and rerun. In both cases, 
we should allow a superset of actions and the ability to skip over missing ones 
if not all actions in the range are there.


> On Oct. 17, 2014, 7:38 p.m., Rohini Palaniswamy wrote:
> > core/src/main/java/org/apache/oozie/coord/CoordUtils.java, line 258
> > <https://reviews.apache.org/r/24487/diff/4/?file=692718#file692718line258>
> >
> > private

referenced in another class too


- Mona


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review57158
-------


On Sept. 17, 2014, 6:59 p.m., Mona Chitnis wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24487/
> ---
> 
> (Updated Sept. 17, 2014, 6:59 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1913
> https://issues.apache.org/jira/browse/OOZIE-1913
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> See Jira
> 
> 
> Diffs
> -
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java f3ffd1f 
>   client/src/main/java/org/apache/oozie/client/OozieClient.java d6ff2d0 
>   
> client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java
>  801ad7e 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 
> 4b393c8 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java cc5596b 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 71a9ab4 
>   core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
> 070cee5 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
> de78ab7 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
>  05b7a62 
>   core/src/main/java/org/apache/oozie/coord/CoordUtils.java 4643d73 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  0aee0e4 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 2c9e00e 
>   core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java c19839f 
>   
> core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java
>  ee1085a 
>   core/src/main/java/org/apache/oozie/service/EventHandlerService.java 
> 244c048 
>   core/src/main/java/org/apache/oozie/servlet/BaseJobServlet.java 11835ed 
>   core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 2578e41 
>   core/src/main/java/org/apache/oozie/servlet/V0JobServlet.java eb699e6 
>   core/src/main/java/org/apache/oozie/servlet/V1JobServlet.java 396661a 
>   core/src/main/java/org/apache/oozie/servlet/V2JobServlet.java de4f865 
>   core/src/main/java/org/apache/oozie/sla/BundleDisableSlaAlertsXCommand.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/oozie/sla/BundleEnableSlaAlertsXCommand.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/oozie/sla/CoordDisableSlaAlertsXCommand.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/oozie/sla/CoordEnableSlaAlertsXCommand.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 189d5ea 
>   core/src/main/java/org/apache/oozie/sla/SLACalculator.java 20f93b5 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java cdf8b73 
>   core/src/main/java/org/apache/oozie/sla/SLAOperations.java f5fc826 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 89615bc 
>   core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 
> 7c2620c 
>   core/src/main/resources/oozie-default.xml 6a91dc6 
>   
> core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java
>  f13e48f 
>   core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java ae3f18d 
>   core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 
> 30fd151 
>   core/src/test/java/org/apache/oozie/servlet/DagServletTestCase.java 48193c7 
>   core/src/test/java/org/apache/oozie/servlet/TestV2JobServlet.java db9c594 
>   core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 
> db3f6eb 
> 
> Diff: https://reviews.apache.org/r/24487/diff/
> 
> 
> Testing
> ---
> 
> unit tests added, e-2-e test with CLI command done
> 
> 
> Thanks,
> 
> Mona Chitnis
> 
>



[jira] [Commented] (OOZIE-1954) Add a way for the MapReduce action to be configured by Java code

2014-09-30 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153721#comment-14153721
 ] 

Mona Chitnis commented on OOZIE-1954:
-

Good work Robert!

> Add a way for the MapReduce action to be configured by Java code
> 
>
> Key: OOZIE-1954
> URL: https://issues.apache.org/jira/browse/OOZIE-1954
> Project: Oozie
> Issue Type: New Feature
> Affects Versions: trunk
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Fix For: trunk
>
> Attachments: OOZIE-1954.patch, OOZIE-1954.patch, OOZIE-1954.patch
>
>
> With certain other components (e.g. Avro, HFileOutputFormat (HBase), etc), it 
> becomes impractical to use the MapReduce action and users must instead use 
> the Java action. The problem is that these components require a lot of extra 
> configuration that is often hidden from the user in Java code (e.g. 
> HFileOutputFormat.configureIncrementalLoad(job, table);), which can also 
> include decision logic, serialization, and other things that we can't do in 
> an XML file directly.
> One way to solve this problem is to allow the user to give the MR action some 
> Java code that would do this configuration, similar to how we allow the 
> {{}} field to specify an external XML file of configuration 
> properties.
> In more detail, we could have an interface; something like this:
> {code}
> public interface OozieActionConfigurator {
>  public void updateOozieActionConfiguration(Configuration conf);
> }
> {code}
> that the user can implement, create a jar, and include with their MR action 
> (i.e. add a "{{}}" field that lets them specify the class 
> name). To protect the Oozie server from running user code (which could do 
> anything it wants really), it would have to be run in the Launcher Job. The 
> Launcher Job could call this method after it loads the configuration prepared 
> by the Oozie server.
> Another place this will be helpful is with users who use the Java action to 
> launch MR jobs and expect a bunch of things to be done for them that are not 
> (e.g. delegation token propagation, config loading, returning the hadoop job 
> to Oozie, etc). These are all done with the MR action, so the more users we 
> can move to the MR action from the Java action, the less they'll run into 
> these difficulties.
> Some of this may change slightly as I try to actually implement this (e.g. 
> have to handle throwing exceptions etc).  And one thing I may do is keep this 
> general enough that it should be compatible with all action types in case we 
> want to add this to any of them in the future; though for now, the schema 
> would only accept it for the MapReduce action.





Re: Java 7

2014-09-29 Thread Mona Chitnis
+1 Mona Chitnis

 

 On Tuesday, September 23, 2014 6:21 PM, Rohini Palaniswamy 
 wrote:
   

 +1 to drop support for Java 6 from Oozie trunk. I am just closing a vote on
that for Pig now.  From Hadoop 2.7, hadoop is planning to publish maven
artifacts in jdk1.7. So it is better we drop support. Can you also open up
a vote for dropping support for Hadoop 0.20 along with JDK 7 one?

On Tue, Sep 23, 2014 at 10:58 AM, Robert Kanter 
wrote:

> Hi all,
>
> I wanted to open a discussion about Java 7.  Hadoop is planning on dropping
> support for JDK 6 with Hadoop 2.7.  Should we switch Oozie trunk to do the
> same?
>
> On a related note, for OOZIE-1793, I'm trying to improve the findbugs
> reporting for Oozie, and the latest findbugs requires Java 7.  So there are
> 3 options:
> - Drop support for Java 6 for Oozie trunk (the overall question above)
> - Switch to an older version of findbugs that does support Java 6.  I'd
> have to double check, but this might make it more difficult to get html
> human-readable reports instead of XML
> - Leave it as is.  Findbugs only runs with the "verify" target, so all
> other maven commands still work with Java 6, including compiling.
>
> thoughts?
>
> thanks
> - Robert
>


   

[jira] [Commented] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-09-25 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147856#comment-14147856
 ] 

Mona Chitnis commented on OOZIE-1976:
-

Thanks [~rkanter] for comments. 
* We are thinking of using a serialize/deserialize technique (protobuf is 
one option) to convert back and forth from the object. I've created a class 
LogicalDependencySet for this object, which contains either of the subclass 
objects LogicalDependencyAndSet or LogicalDependencyOrSet; the leaf level is 
Dependency, which has the lists of resolved and unresolved instances. Yet to 
see what the cost of protobuf serde is here.
* Yes, it is possible to do nested combinations, but we will limit it to a 
depth of 2, i.e. both your examples are depth 2 and are the most common cases 
that we should satisfy in the first go. An important thing to note here is that 
the case of OR can have two 'strategies':
** 'Combined': In case of {{A || B}}, instances of A and B can be 
interleaved to give the final "combined" set of total instances. For this, the 
requirement is that the user considers both as equivalent, and that they have 
the same frequency, initial instance etc.
** 'Exclusive': In the same case as above, either A should be completely used 
or B completely used. No interleaving.
* Yes, a better API output would be to display which OR datasets' instances 
the action is waiting on.
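
A rough sketch of the object tree and the two OR strategies described above. The class names are borrowed from this comment, but every field, constructor and method here is an assumption for illustration and not the WIP patch; `OrStrategies` and the nominal-time string keys are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical shapes; fields and methods are assumptions, not the WIP patch.
interface LogicalDependency {
    boolean isSatisfied();
}

// Leaf level: satisfied once no instance is left unresolved.
final class Dependency implements LogicalDependency {
    final List<String> resolved;
    final List<String> unresolved;

    Dependency(List<String> resolved, List<String> unresolved) {
        this.resolved = resolved;
        this.unresolved = unresolved;
    }

    public boolean isSatisfied() {
        return unresolved.isEmpty();
    }
}

// AND node: every child must be satisfied.
final class LogicalDependencyAndSet implements LogicalDependency {
    private final List<LogicalDependency> children;

    LogicalDependencyAndSet(LogicalDependency... children) {
        this.children = Arrays.asList(children);
    }

    public boolean isSatisfied() {
        for (LogicalDependency child : children) {
            if (!child.isSatisfied()) {
                return false;
            }
        }
        return true;
    }
}

// OR node: one satisfied child is enough.
final class LogicalDependencyOrSet implements LogicalDependency {
    private final List<LogicalDependency> children;

    LogicalDependencyOrSet(LogicalDependency... children) {
        this.children = Arrays.asList(children);
    }

    public boolean isSatisfied() {
        for (LogicalDependency child : children) {
            if (child.isSatisfied()) {
                return true;
            }
        }
        return false;
    }
}

// The two OR strategies for A || B, keyed by instance (e.g. nominal time).
final class OrStrategies {
    // 'Exclusive': A is used completely or B is; returns "A", "B", or null if neither is complete.
    static String exclusive(List<String> required, Set<String> availableA, Set<String> availableB) {
        if (availableA.containsAll(required)) {
            return "A";
        }
        if (availableB.containsAll(required)) {
            return "B";
        }
        return null;
    }

    // 'Combined': picks A or B per instance; returns the chosen source per instance,
    // or null if some instance is available in neither dataset.
    static List<String> combined(List<String> required, Set<String> availableA, Set<String> availableB) {
        List<String> sources = new ArrayList<String>();
        for (String instance : required) {
            if (availableA.contains(instance)) {
                sources.add("A");
            } else if (availableB.contains(instance)) {
                sources.add("B");
            } else {
                return null;
            }
        }
        return sources;
    }
}
```

In this sketch an expression like {{(a && b) || (c && d)}} is an OR node over two AND nodes, which is the depth-2 case mentioned above.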

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-WIP.patch, OOZIE-1976-rough-design-2.pdf, 
> OOZIE-1976-rough-design.pdf





[jira] [Updated] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-09-24 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1976:

Attachment: OOZIE-1976-WIP.patch

Attaching the WIP patch for the record. I will upload the v-1 patch once I 
have a fairly working version ready, by tomorrow.

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
> Issue Type: New Feature
> Components: coordinator
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-WIP.patch, OOZIE-1976-rough-design-2.pdf, 
> OOZIE-1976-rough-design.pdf





[jira] [Updated] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-09-22 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1932:

Attachment: OOZIE-1932-4-amendment.patch

{code}
2014-09-22 22:02:34,148  INFO ShareLibService:539 [main] - USER[-] GROUP[-]
oozie-hadoop-utils-2.3.0.oozie-4.4.1.1.jar uploaded to
hdfs:/tmp/hdfs_shared_lib_path/launcher_2014090233/oozie
2014-09-22 22:02:34,198  INFO ShareLibService:539 [main] - USER[-] GROUP[-]
oozie-sharelib-hcatalog-4.4.1.1.jar uploaded to
hdfs:/tmp/hdfs_shared_lib_path/launcher_2014090233/oozie
2014-09-22 22:02:34,199 ERROR ShareLibService:536 [main] - USER[-] GROUP[-]
Sharelib initialization fails
java.lang.NullPointerException
at
org.apache.oozie.service.ShareLibService.setupLauncherLibPath(ShareLibService.java:178)
at
org.apache.oozie.service.ShareLibService.updateLauncherLib(ShareLibService.java:158)
at
org.apache.oozie.service.ShareLibService.init(ShareLibService.java:111)
at
org.apache.oozie.service.Services.setServiceInternal(Services.java:368)



ShareLibService is dependent on ActionService.

private void setupLauncherLibPath(FileSystem fs, Path tmpLauncherLibPath)
        throws IOException {

    ActionService actionService = Services.get().get(ActionService.class);
    List classes = JavaActionExecutor.getCommonLauncherClasses();
    Path baseDir = new Path(tmpLauncherLibPath,
            JavaActionExecutor.OOZIE_COMMON_LIBDIR);
    copyJarContainingClasses(classes, fs, baseDir,
            JavaActionExecutor.OOZIE_COMMON_LIBDIR);
    Set actionTypes = actionService.getActionTypes();
{code}

Attaching amendment patch
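
To illustrate why the load order matters here, a minimal sketch follows. It
assumes that looking up a service that has not been loaded yet returns null,
which would be consistent with the NPE in the trace above. The registry class
and its methods are hypothetical and are not the Oozie Services API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical service registry, populated in load order. */
final class ServiceLookupSketch {

    private final Map<String, Object> loaded = new LinkedHashMap<>();

    /** Records a service once it has been initialized. */
    void register(String name, Object service) {
        loaded.put(name, service);
    }

    /** Returns null when the service has not been loaded yet. */
    Object get(String name) {
        return loaded.get(name);
    }

    /** Fails fast with a descriptive message instead of a later NullPointerException. */
    Object require(String name) {
        Object service = loaded.get(name);
        if (service == null) {
            throw new IllegalStateException(name + " must be loaded before its dependents");
        }
        return service;
    }

    public static void main(String[] args) {
        ServiceLookupSketch registry = new ServiceLookupSketch();
        // A dependent asks for ActionService before it was registered.
        System.out.println("lookup before load: " + registry.get("ActionService"));
        registry.register("ActionService", new Object());
        System.out.println("lookup after load is null: " + (registry.get("ActionService") == null));
    }
}
```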

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-3.patch, 
> OOZIE-1932-4-amendment.patch, OOZIE-1932-4.patch, OOZIE-1932-addendum.patch, 
> OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.
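
The ordering argument in the description can be checked with a small sketch.
The service names are used only as labels, and the helper class is hypothetical;
it models nothing more than "destroy order is the reverse of load order".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the load and destroy order of services. */
final class ServiceOrderSketch {

    /** Services are destroyed in the reverse of the order in which they were loaded. */
    static List<String> destroyOrder(List<String> loadOrder) {
        List<String> reversed = new ArrayList<>(loadOrder);
        Collections.reverse(reversed);
        return reversed;
    }

    /** True when `first` is destroyed before `second` for the given load order. */
    static boolean destroyedBefore(List<String> loadOrder, String first, String second) {
        List<String> order = destroyOrder(loadOrder);
        return order.indexOf(first) < order.indexOf(second);
    }

    public static void main(String[] args) {
        // Loading MemoryLocksService first means it is destroyed last.
        List<String> loadOrder = Arrays.asList("MemoryLocksService", "CallableQueueService");
        System.out.println(destroyOrder(loadOrder));
    }
}
```

With MemoryLocksService loaded first, CallableQueueService is destroyed first, so
no queued command should still be running when the lock service goes away.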



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-09-22 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143595#comment-14143595
 ] 

Mona Chitnis commented on OOZIE-1932:
-

{quote}
. -1 the patch does not add/modify any testcase
{quote}
This is a simple config change in oozie-default.xml, and there is no applicable 
test case that would check only the relative order in which services are loaded.

{quote}
. The patch failed the following testcases:

. 
testBundleStatusNotTransitionFromKilled(org.apache.oozie.service.TestStatusTransitService)
. 
testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService)
{quote}
These test failures are unrelated to my patch. I reran the tests in my local 
env and they pass consistently.

Committed patch to trunk and branch-4.1. Thanks Puru for review!

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-3.patch, 
> OOZIE-1932-4.patch, OOZIE-1932-addendum.patch, OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: issue after OOZIE-1807

2014-09-18 Thread Mona Chitnis

bq. if a bundle with two actions - one FAILED due to coordinator submission 
error, other KILLED), bundle is supposed to be KILLED
bq. Bundle should be FAILED and not KILLED. Only when user has KILLED the 
bundle, should its status be KILLED.

Thanks for the minor correction. My point was that the bundle will _not_ be 
DONEWITHERROR, which is what Bowen said he is observing.
 Mona Chitnis
Software Engineer, Hadoop Team
Yahoo! 



 On Thursday, September 18, 2014 5:09 PM, Rohini Palaniswamy 
 wrote:
   

 bq. Shouldn't oozie be intelligent enough to do a no-op on a killed coord job? 
There are options now to resume a killed coord job. If a new end time was 
applied to the other coord jobs but not to that one, the user needs to know.

bq. if a bundle with two actions - one FAILED due to coordinator submission 
error, other KILLED), bundle is supposed to be KILLED
Bundle should be FAILED and not KILLED. Only when the user has KILLED the 
bundle should its status be KILLED.
-Rohini 
On Thu, Sep 18, 2014 at 4:09 PM, Purshotam Shah 
 wrote:

Bowen,
JIRA has explanation. Please update JIRA if you see any issue with
approach.

>Why is it a good idea to throw an exception if one of the coord jobs is
>in "killed" state? In the BundleJobChangeXCommand, the code doesn't even
>attempt to change the coord job. Shouldn't oozie be intelligent enough
>to do a no-op on a killed coord job?



To let the user know the list of coord jobs to which the change was not applied.
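
The behaviour discussed in this thread (apply the change where possible, then
report every killed coord job in one exception) could look roughly like the
sketch below. All names are hypothetical and statuses are plain strings; this
is not the BundleJobChangeXCommand code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a bundle change that reports skipped coord jobs. */
final class BundleChangeSketch {

    /** Carries the ids of the coord jobs the change was not applied to. */
    static final class ChangeNotAppliedException extends Exception {
        final List<String> skippedJobIds;

        ChangeNotAppliedException(List<String> skippedJobIds) {
            super("Change not applied to killed coord jobs: " + skippedJobIds);
            this.skippedJobIds = skippedJobIds;
        }
    }

    /**
     * Applies the change to every coord job that is not KILLED and returns their ids.
     * Killed jobs are collected and reported together after the loop.
     */
    static List<String> change(Map<String, String> statusByCoordId) throws ChangeNotAppliedException {
        List<String> changed = new ArrayList<>();
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, String> entry : statusByCoordId.entrySet()) {
            if ("KILLED".equals(entry.getValue())) {
                skipped.add(entry.getKey());
            } else {
                changed.add(entry.getKey()); // the real change would happen here
            }
        }
        if (!skipped.isEmpty()) {
            throw new ChangeNotAppliedException(skipped);
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> jobs = new LinkedHashMap<>();
        jobs.put("coord-1", "RUNNING");
        jobs.put("coord-2", "KILLED");
        try {
            change(jobs);
        } catch (ChangeNotAppliedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Collecting the skipped jobs and throwing once at the end is what lets the other
coord jobs still receive the change, similar to the chmod comparison made
earlier in the thread.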


Puru.

On 9/18/14, 2:11 PM, "bowen zhang"  wrote:

>Hi Purshotam,
>Why is it a good idea to throw an exception if one of the coord jobs is
>in "killed" state? In the BundleJobChangeXCommand, the code doesn't even
>attempt to change the coord job. Shouldn't oozie be intelligent enough to
>do a no-op on a killed coord job?
>Bowen
>
>
>
>
> From: Purshotam Shah 
>To: "dev@oozie.apache.org" ; Mona Chitnis
>; bowen zhang 
>Sent: Wednesday, September 17, 2014 6:17 PM
>Subject: Re: issue after OOZIE-1807
>
>
>Hi Bowen,
>   BundleJobChangeXCommand command will get applied to bundle and coord
>jobs. It will aggregate message for all killed coord jobs and throw them
>as exception.
>It is similar to chmod command.
>
>JIRA has more details. Let me know if you need any other information.
>
>Puru.
>
>
>
>
>
>On 9/17/14, 6:05 PM, "Mona Chitnis"  wrote:
>
>>
>>Puru,
>>Bowen just gave me a call regarding this issue. Can you answer his
>>question? That'll be faster than me digging through the code.
>> Mona Chitnis
>>Yahoo!
>>
>>     On Wednesday, September 17, 2014 5:51 PM, bowen zhang
>> wrote:
>>
>>
>> Hi guys,
>>
>>Purshatom, I see you checked oozie-1807 into the trunk. So, I have a
>>question, why does it need to throw an exception when someone wants to
>>change a bundle job where one of its coord job is in KILLED state? Due to
>>the change in BundleJobChangeXCommand, this is throwing exceptions when
>>trying to change a RUNNING bundle job where some of the coord jobs are
>>intentionally killed by the user.
>>Thanks,
>>Bowen
>>
>>
>>





   

Re: issue after OOZIE-1807

2014-09-17 Thread Mona Chitnis

Bowen,
Regarding the other issue (if a bundle with two actions - one FAILED due to 
coordinator submission error, other KILLED), the bundle is supposed to be KILLED. 
I see this is also taken care of as part of OOZIE-1940 (StatusTransitService), 
but that is not committed to Apache yet.
Puru can help track down if any of his other patches changed this desired 
behavior.
 Mona Chitnis
Yahoo! 

 On Wednesday, September 17, 2014 6:05 PM, Mona Chitnis 
 wrote:
   

 
Puru,
Bowen just gave me a call regarding this issue. Can you answer his question? 
That'll be faster than me digging through the code.
 Mona Chitnis
Yahoo! 

 On Wednesday, September 17, 2014 5:51 PM, bowen zhang 
 wrote:
   

 Hi guys, 

Purshatom, I see you checked oozie-1807 into the trunk. So, I have a question, 
why does it need to throw an exception when someone wants to change a bundle 
job where one of its coord job is in KILLED state? Due to the change in 
BundleJobChangeXCommand, this is throwing exceptions when trying to change a 
RUNNING bundle job where some of the coord jobs are intentionally killed by the 
user.
Thanks,
Bowen




   

Re: issue after OOZIE-1807

2014-09-17 Thread Mona Chitnis

Puru,
Bowen just gave me a call regarding this issue. Can you answer his question? 
That'll be faster than me digging through the code.
 Mona Chitnis
Yahoo! 

 On Wednesday, September 17, 2014 5:51 PM, bowen zhang 
 wrote:
   

 Hi guys, 

Purshatom, I see you checked oozie-1807 into the trunk. So, I have a question, 
why does it need to throw an exception when someone wants to change a bundle 
job where one of its coord job is in KILLED state? Due to the change in 
BundleJobChangeXCommand, this is throwing exceptions when trying to change a 
RUNNING bundle job where some of the coord jobs are intentionally killed by the 
user.
Thanks,
Bowen


   

Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-09-17 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/
---

(Updated Sept. 17, 2014, 6:59 p.m.)


Review request for oozie.


Changes
---

Addressed Puru's comment to make separate bundle/coord disable/enable commands.


Bugs: OOZIE-1913
https://issues.apache.org/jira/browse/OOZIE-1913


Repository: oozie-git


Description
---

See Jira


Diffs (updated)
-

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java f3ffd1f 
  client/src/main/java/org/apache/oozie/client/OozieClient.java d6ff2d0 
  
client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java 
801ad7e 
  client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 4b393c8 
  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java cc5596b 
  core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 71a9ab4 
  core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
070cee5 
  core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
de78ab7 
  
core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
 05b7a62 
  core/src/main/java/org/apache/oozie/coord/CoordUtils.java 4643d73 
  
core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java 
0aee0e4 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
2c9e00e 
  core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java c19839f 
  
core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java 
ee1085a 
  core/src/main/java/org/apache/oozie/service/EventHandlerService.java 244c048 
  core/src/main/java/org/apache/oozie/servlet/BaseJobServlet.java 11835ed 
  core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 2578e41 
  core/src/main/java/org/apache/oozie/servlet/V0JobServlet.java eb699e6 
  core/src/main/java/org/apache/oozie/servlet/V1JobServlet.java 396661a 
  core/src/main/java/org/apache/oozie/servlet/V2JobServlet.java de4f865 
  core/src/main/java/org/apache/oozie/sla/BundleDisableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/BundleEnableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/CoordDisableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/CoordEnableSlaAlertsXCommand.java 
PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 189d5ea 
  core/src/main/java/org/apache/oozie/sla/SLACalculator.java 20f93b5 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java cdf8b73 
  core/src/main/java/org/apache/oozie/sla/SLAOperations.java f5fc826 
  core/src/main/java/org/apache/oozie/sla/service/SLAService.java 89615bc 
  core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 7c2620c 
  core/src/main/resources/oozie-default.xml 6a91dc6 
  
core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java 
f13e48f 
  core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java ae3f18d 
  core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 30fd151 
  core/src/test/java/org/apache/oozie/servlet/DagServletTestCase.java 48193c7 
  core/src/test/java/org/apache/oozie/servlet/TestV2JobServlet.java db9c594 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java db3f6eb 

Diff: https://reviews.apache.org/r/24487/diff/


Testing
---

unit tests added, e-2-e test with CLI command done


Thanks,

Mona Chitnis



Re: oozie on oracle issue

2014-09-13 Thread Mona Chitnis
Yes, this sounds like a limitation of OpenJPA schema creation. We use Oracle 
SQL scripts directly for table creation. In HA mode, I think we use the same SID 
for multiple schemas, similar to the setup you described here, but we haven't 
faced this issue because we use SQL directly.

RE: oozie on oracle issue

2014-09-12 Thread Mona Chitnis
No, we have not faced this issue before. That might be because we have an Oracle 
instance dedicated to the Oozie database. Do you have multiple db_owners because 
of a shared instance between, say, Oozie and other projects' schemas?

Re: Review Request 24948: OOZIE-1940 StatusTransitService has race condition

2014-09-09 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24948/#review52789
---

Ship it!


Looks good now. Thanks for the 2 clarifications above.

- Mona Chitnis


On Sept. 8, 2014, 9:15 p.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24948/
> ---
> 
> (Updated Sept. 8, 2014, 9:15 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1940
> https://issues.apache.org/jira/browse/OOZIE-1940
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> StatusTransitService has race condition
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/BundleActionBean.java 5d85a4d 
>   core/src/main/java/org/apache/oozie/BundleJobBean.java 0f1670a 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 88a2c67 
>   core/src/main/java/org/apache/oozie/command/StatusTransitXCommand.java 
> e69de29 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/BundleJobQueryExecutor.java 
> 36cd968 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  3008393 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 04e6e29 
>   core/src/main/java/org/apache/oozie/service/StatusTransitService.java 
> 21ac25f 
>   core/src/test/java/org/apache/oozie/service/TestStatusTransitService.java 
> bb99138 
> 
> Diff: https://reviews.apache.org/r/24948/diff/
> 
> 
> Testing
> ---
> 
> UTC
> 
> 
> Thanks,
> 
> Purshotam Shah
> 
>



[jira] [Updated] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-09-09 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1932:

Attachment: OOZIE-1932-4.patch

addressed Puru's comment

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-3.patch, 
> OOZIE-1932-4.patch, OOZIE-1932-addendum.patch, OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-09-09 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1932:

Attachment: OOZIE-1932-3.patch

uploaded new patch OOZIE-1932-3.patch. 

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-3.patch, 
> OOZIE-1932-addendum.patch, OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 24948: OOZIE-1940 StatusTransitService has race condition

2014-09-09 Thread Mona Chitnis


> On Sept. 4, 2014, 8:55 p.m., Mona Chitnis wrote:
> > core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java,
> >  line 177
> > <https://reviews.apache.org/r/24948/diff/1/?file=668667#file668667line177>
> >
> > related question, is this situation possible? - job status is PAUSED || 
> > PWE, and bundle action status is RWE?
> 
> Purshotam Shah wrote:
> Yes, BundlePauseXCommand only set bundle status to pause. Bundle status 
> can still be in running state.

Ok. I just checked that BundlePauseXCommand and CoordPauseXCommand have empty 
implementations of pauseChildren():

    @Override
    public void pauseChildren() throws CommandException {
        // TODO - need revisit when revisiting coord job status redesign;
    }


- Mona


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24948/#review52346
---


On Sept. 8, 2014, 9:15 p.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24948/
> ---
> 
> (Updated Sept. 8, 2014, 9:15 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1940
> https://issues.apache.org/jira/browse/OOZIE-1940
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> StatusTransitService has race condition
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/BundleActionBean.java 5d85a4d 
>   core/src/main/java/org/apache/oozie/BundleJobBean.java 0f1670a 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 88a2c67 
>   core/src/main/java/org/apache/oozie/command/StatusTransitXCommand.java 
> e69de29 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/BundleJobQueryExecutor.java 
> 36cd968 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  3008393 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 04e6e29 
>   core/src/main/java/org/apache/oozie/service/StatusTransitService.java 
> 21ac25f 
>   core/src/test/java/org/apache/oozie/service/TestStatusTransitService.java 
> bb99138 
> 
> Diff: https://reviews.apache.org/r/24948/diff/
> 
> 
> Testing
> ---
> 
> UTC
> 
> 
> Thanks,
> 
> Purshotam Shah
> 
>



Re: Review Request 24948: OOZIE-1940 StatusTransitService has race condition

2014-09-09 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24948/#review52722
---



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91699>

But this is executed only if the condition bAction.getCoordId() == null holds, 
so the case you mention will not occur.



core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91703>

Then I think we should 'skip' loading SKIPPED actions from the DB. The idea 
of SKIPPED is that its outcome would not make any difference. So if all other 
actions are terminal, you will mark the parent coord/bundle as terminal; if some 
are non-terminal, it won't be. It is ok to optimize this from the earlier code.

- Mona Chitnis


On Sept. 8, 2014, 9:15 p.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24948/
> ---
> 
> (Updated Sept. 8, 2014, 9:15 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1940
> https://issues.apache.org/jira/browse/OOZIE-1940
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> StatusTransitService has race condition
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/BundleActionBean.java 5d85a4d 
>   core/src/main/java/org/apache/oozie/BundleJobBean.java 0f1670a 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 88a2c67 
>   core/src/main/java/org/apache/oozie/command/StatusTransitXCommand.java 
> e69de29 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/BundleJobQueryExecutor.java 
> 36cd968 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  3008393 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 04e6e29 
>   core/src/main/java/org/apache/oozie/service/StatusTransitService.java 
> 21ac25f 
>   core/src/test/java/org/apache/oozie/service/TestStatusTransitService.java 
> bb99138 
> 
> Diff: https://reviews.apache.org/r/24948/diff/
> 
> 
> Testing
> ---
> 
> UTC
> 
> 
> Thanks,
> 
> Purshotam Shah
> 
>



Re: Review Request 24948: OOZIE-1940 StatusTransitService has race condition

2014-09-04 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24948/#review52346
---



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91109>

Do we need to execute this synchronously? None of the counts here are 
affected by the outcome of this command. We can queue it to release the lock 
faster.



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91115>

related question, is this situation possible? - job status is PAUSED || 
PWE, and bundle action status is RWE?



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91116>

is this a change from current behavior? a mix of suspended, failed, killed, 
DWE, SPE = SPE? I'm not sure. Sounds reasonable to me but need to check if this 
could potentially be confusing to anyone



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91118>

I don't understand this. Are the other bundle actions in Running? Then why 
is the status Prep and not Running?



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91117>

why is getPrepStatus calling getRunningStatus?



core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91119>

same comment as BundleStatusTransitX, just make sure whether 
SuspendedWithError is the right status here



core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment91120>

Why don't we filter out the SKIPPED actions altogether for status transit 
processing?



core/src/main/java/org/apache/oozie/service/StatusTransitService.java
<https://reviews.apache.org/r/24948/#comment91122>

Some form of batching done right away would be good, so that we can run the 
bundle/coord update queries for multiple bundles/coords in one batched execute 
query. However, we then won't be able to use the per-job locking to get 
strong consistency. I guess it is ok to defer this until we have a better idea.


- Mona Chitnis


On Aug. 26, 2014, 12:30 a.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24948/
> ---
> 
> (Updated Aug. 26, 2014, 12:30 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1940
> https://issues.apache.org/jira/browse/OOZIE-1940
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> StatusTransitService has race condition
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/BundleActionBean.java 5d85a4d 
>   core/src/main/java/org/apache/oozie/BundleJobBean.java 0f1670a 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 88a2c67 
>   core/src/main/java/org/apache/oozie/command/StatusTransitXCommand.java 
> e69de29 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/BundleJobQueryExecutor.java 
> 36cd968 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  3008393 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 04e6e29 
>   core/src/main/java/org/apache/oozie/service/StatusTransitService.java 
> 21ac25f 
>   core/src/test/java/org/apache/oozie/service/TestStatusTransitService.java 
> bb99138 
> 
> Diff: https://reviews.apache.org/r/24948/diff/
> 
> 
> Testing
> ---
> 
> UTC
> 
> 
> Thanks,
> 
> Purshotam Shah
> 
>



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-09-03 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review52265
---


I'll remove the extra (spurious) line changes in BundleSubmitX and 
SubmitTransitionX in the next/final version.

- Mona Chitnis


On Sept. 4, 2014, 1:05 a.m., Mona Chitnis wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24487/
> ---
> 
> (Updated Sept. 4, 2014, 1:05 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1913
> https://issues.apache.org/jira/browse/OOZIE-1913
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> See Jira
> 
> 
> Diffs
> -
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 79a9b68 
>   client/src/main/java/org/apache/oozie/client/OozieClient.java 363ebd2 
>   
> client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java
>  801ad7e 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 
> 4b393c8 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java cc5596b 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 14fd74c 
>   core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
> 070cee5 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
> d479086 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
>  a13fe83 
>   core/src/main/java/org/apache/oozie/coord/CoordUtils.java 4643d73 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  0aee0e4 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 25953bf 
>   core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java c19839f 
>   
> core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java
>  7a688b1 
>   core/src/main/java/org/apache/oozie/service/EventHandlerService.java 
> 244c048 
>   core/src/main/java/org/apache/oozie/servlet/BaseJobServlet.java f651d5c 
>   core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 2578e41 
>   core/src/main/java/org/apache/oozie/servlet/V0JobServlet.java 508538d 
>   core/src/main/java/org/apache/oozie/servlet/V1JobServlet.java 6427989 
>   core/src/main/java/org/apache/oozie/servlet/V2JobServlet.java b7b9be9 
>   core/src/main/java/org/apache/oozie/sla/SLAAlertsXCommand.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 189d5ea 
>   core/src/main/java/org/apache/oozie/sla/SLACalculator.java 20f93b5 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java cdf8b73 
>   core/src/main/java/org/apache/oozie/sla/SLAOperations.java f5fc826 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 89615bc 
>   core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 
> 7c2620c 
>   core/src/main/resources/oozie-default.xml 3a957d0 
>   
> core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java
>  f13e48f 
>   core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java ae3f18d 
>   core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 
> 30fd151 
>   core/src/test/java/org/apache/oozie/servlet/DagServletTestCase.java 48193c7 
>   core/src/test/java/org/apache/oozie/servlet/TestV2JobServlet.java db9c594 
>   core/src/test/java/org/apache/oozie/servlet/TestV2SLAServlet.java 5f51b22 
>   core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 
> db3f6eb 
> 
> Diff: https://reviews.apache.org/r/24487/diff/
> 
> 
> Testing
> ---
> 
> unit tests added, e-2-e test with CLI command done
> 
> 
> Thanks,
> 
> Mona Chitnis
> 
>



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-09-03 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/
---

(Updated Sept. 4, 2014, 1:05 a.m.)


Review request for oozie.


Changes
---

updated patch with review comments


Bugs: OOZIE-1913
https://issues.apache.org/jira/browse/OOZIE-1913


Repository: oozie-git


Description
---

See Jira


Diffs (updated)
-

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 79a9b68 
  client/src/main/java/org/apache/oozie/client/OozieClient.java 363ebd2 
  
client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java 
801ad7e 
  client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 4b393c8 
  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java cc5596b 
  core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 14fd74c 
  core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
070cee5 
  core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
d479086 
  
core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
 a13fe83 
  core/src/main/java/org/apache/oozie/coord/CoordUtils.java 4643d73 
  
core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java 
0aee0e4 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
25953bf 
  core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java c19839f 
  
core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java 
7a688b1 
  core/src/main/java/org/apache/oozie/service/EventHandlerService.java 244c048 
  core/src/main/java/org/apache/oozie/servlet/BaseJobServlet.java f651d5c 
  core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 2578e41 
  core/src/main/java/org/apache/oozie/servlet/V0JobServlet.java 508538d 
  core/src/main/java/org/apache/oozie/servlet/V1JobServlet.java 6427989 
  core/src/main/java/org/apache/oozie/servlet/V2JobServlet.java b7b9be9 
  core/src/main/java/org/apache/oozie/sla/SLAAlertsXCommand.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 189d5ea 
  core/src/main/java/org/apache/oozie/sla/SLACalculator.java 20f93b5 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java cdf8b73 
  core/src/main/java/org/apache/oozie/sla/SLAOperations.java f5fc826 
  core/src/main/java/org/apache/oozie/sla/service/SLAService.java 89615bc 
  core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 7c2620c 
  core/src/main/resources/oozie-default.xml 3a957d0 
  
core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java 
f13e48f 
  core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java ae3f18d 
  core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 30fd151 
  core/src/test/java/org/apache/oozie/servlet/DagServletTestCase.java 48193c7 
  core/src/test/java/org/apache/oozie/servlet/TestV2JobServlet.java db9c594 
  core/src/test/java/org/apache/oozie/servlet/TestV2SLAServlet.java 5f51b22 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java db3f6eb 

Diff: https://reviews.apache.org/r/24487/diff/


Testing
---

unit tests added, e-2-e test with CLI command done


Thanks,

Mona Chitnis



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-09-02 Thread Mona Chitnis


> On Aug. 29, 2014, 4:52 p.m., Purshotam Shah wrote:
> > client/src/main/java/org/apache/oozie/client/OozieClient.java, line 152
> > <https://reviews.apache.org/r/24487/diff/2/?file=660982#file660982line152>
> >
> > Do you need to say new (newshouldend ) ? 
> > 
> > When we specify end time for coord, we just say endtime=<>, better to 
> > keep same convention.

It's consistent with the current SLA terminology: should-start, should-end. I 
don't see a major benefit in deviating from this terminology. Also, endtime=<> 
and should-end deal with different values: the former specifies a date and the 
latter specifies the number of minutes relative to a nominal time for SLA 
purposes.
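
To make the distinction concrete, here is a minimal sketch of how a relative should-end value maps to an absolute deadline. The class and method names are hypothetical and are not part of Oozie; the only thing taken from the discussion is that should-end is a number of minutes relative to the nominal time, while endtime is itself a date.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not an Oozie class: it only illustrates that should-end
// is an offset in minutes, whereas a coordinator endtime is already a date.
final class SlaDeadlineSketch {

    /** Absolute deadline implied by a should-end offset (minutes after nominal time). */
    static Instant shouldEndDeadline(Instant nominalTime, long shouldEndMinutes) {
        return nominalTime.plus(Duration.ofMinutes(shouldEndMinutes));
    }

    public static void main(String[] args) {
        Instant nominal = Instant.parse("2014-09-02T10:00:00Z");
        // should-end = 30 means "30 minutes after the nominal time"
        System.out.println(shouldEndDeadline(nominal, 30));
    }
}
```

Because the two values have different types (a date versus an offset), reusing the endtime=<> spelling for the SLA option could suggest that a date is expected.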


> On Aug. 29, 2014, 4:52 p.m., Purshotam Shah wrote:
> > core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java, 
> > line 91
> > <https://reviews.apache.org/r/24487/diff/2/?file=660987#file660987line91>
> >
> > We are reading from same conf and setting to same conf. why??

I don't remember the rationale now. Removing.


- Mona


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review51473
---


On Aug. 14, 2014, 11:13 p.m., Mona Chitnis wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24487/
> ---
> 
> (Updated Aug. 14, 2014, 11:13 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1913
> https://issues.apache.org/jira/browse/OOZIE-1913
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> See Jira
> 
> 
> Diffs
> -
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/java/org/apache/oozie/client/OozieClient.java b468186 
>   
> client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java
>  2f0a45c 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 
> 5d3fc62 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
> 5d3b6af 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
> ffb2d08 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
>  b4b2fef 
>   core/src/main/java/org/apache/oozie/command/coord/CoordSubmitXCommand.java 
> 02b30ef 
>   core/src/main/java/org/apache/oozie/coord/CoordUtils.java 26db068 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  cd26e07 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 42a0968 
>   core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java 8296a6c 
>   
> core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java
>  3fbd092 
>   core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 8ca2e81 
>   core/src/main/java/org/apache/oozie/servlet/V2SLAServlet.java 8620af5 
>   core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 67d6237 
>   core/src/main/java/org/apache/oozie/sla/SLACalculator.java 132d4df 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 
>   core/src/main/java/org/apache/oozie/sla/SLAOperations.java 0cad071 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2349329 
>   core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 
> fd21c45 
>   core/src/main/resources/oozie-default.xml ebceaa7 
>   core/src/test/java/org/apache/oozie/client/TestWorkflowClient.java e2e0f11 
>   
> core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java
>  fedf4a8 
>   core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java a39efe3 
>   core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 
> fa26935 
>   core/src/test/java/org/apache/oozie/servlet/TestV2SLAServlet.java 5a35fdb 
>   core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 
> 210c99e 
> 
> Diff: https://reviews.apache.org/r/24487/diff/
> 
> 
> Testing
> ---
> 
> unit tests added, e-2-e test with CLI command done
> 
> 
> Thanks,
> 
> Mona Chitnis
> 
>



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-09-02 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review52110
---



client/src/main/java/org/apache/oozie/cli/OozieCLI.java
<https://reviews.apache.org/r/24487/#comment90856>

Changing this to hasArgs=false as it's not mandatory.



client/src/main/java/org/apache/oozie/cli/OozieCLI.java
<https://reviews.apache.org/r/24487/#comment90853>

this got left behind. thanks


- Mona Chitnis


On Aug. 14, 2014, 11:13 p.m., Mona Chitnis wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24487/
> ---
> 
> (Updated Aug. 14, 2014, 11:13 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1913
> https://issues.apache.org/jira/browse/OOZIE-1913
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> See Jira
> 
> 
> Diffs
> -
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/java/org/apache/oozie/client/OozieClient.java b468186 
>   
> client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java
>  2f0a45c 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 
> 5d3fc62 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
> 5d3b6af 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
> ffb2d08 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
>  b4b2fef 
>   core/src/main/java/org/apache/oozie/command/coord/CoordSubmitXCommand.java 
> 02b30ef 
>   core/src/main/java/org/apache/oozie/coord/CoordUtils.java 26db068 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  cd26e07 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 42a0968 
>   core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java 8296a6c 
>   
> core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java
>  3fbd092 
>   core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 8ca2e81 
>   core/src/main/java/org/apache/oozie/servlet/V2SLAServlet.java 8620af5 
>   core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 67d6237 
>   core/src/main/java/org/apache/oozie/sla/SLACalculator.java 132d4df 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 
>   core/src/main/java/org/apache/oozie/sla/SLAOperations.java 0cad071 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2349329 
>   core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java 
> fd21c45 
>   core/src/main/resources/oozie-default.xml ebceaa7 
>   core/src/test/java/org/apache/oozie/client/TestWorkflowClient.java e2e0f11 
>   
> core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java
>  fedf4a8 
>   core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java a39efe3 
>   core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java 
> fa26935 
>   core/src/test/java/org/apache/oozie/servlet/TestV2SLAServlet.java 5a35fdb 
>   core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 
> 210c99e 
> 
> Diff: https://reviews.apache.org/r/24487/diff/
> 
> 
> Testing
> ---
> 
> unit tests added, e-2-e test with CLI command done
> 
> 
> Thanks,
> 
> Mona Chitnis
> 
>



[jira] [Resolved] (OOZIE-1984) SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis resolved OOZIE-1984.
-

Resolution: Fixed

Committed to trunk and 4.1.0. Thanks for the review, Ryota.

> SLACalculator in HA mode performs duplicate operations on records with 
> completed jobs
> -
>
> Key: OOZIE-1984
> URL: https://issues.apache.org/jira/browse/OOZIE-1984
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk, 4.1.0
>
> Attachments: OOZIE-1984-1.patch, OOZIE-1984.patch
>
>
> Scenario:
> SLA periodic run has already processed start, duration and end for a job's 
> SLA entry. The job notification for that job arrives after this and triggers 
> the SLA listener.
> Buggy part:
> {code}
> SLACalculatorMemory.java
> else if 
> (Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode()) {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> if (slaRegBean != null) { // filter out jobs picked by SLA 
> job event listener
>   // but not actually configured for 
> SLA
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(
> SLASummaryQuery.GET_SLA_SUMMARY, jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> if (slaCalc.getEventProcessed() < 7) {
> slaMap.put(jobId, slaCalc);
> }
> }
> }
> }
> if (slaCalc != null) {
> ..
> Object eventProcObj = ((SLASummaryQueryExecutor) 
> SLASummaryQueryExecutor.getInstance())
> 
> .getSingleValue(SLASummaryQuery.GET_SLA_SUMMARY_EVENTPROCESSED, jobId);
> byte eventProc = ((Byte) eventProcObj).byteValue();
> ..
> processJobEndSuccessSLA(slaCalc, startTime, endTime);
> {code}
> The method processJobEndSuccessSLA goes ahead and checks the second LSB of 
> eventProc and sends the duration event _again_. So the bug here is two-fold:
>  * if all events are already processed, this function is still invoked
>  * event processed is 8 (1000), so the second LSB is unset and hence duration 
> is processed again.
> Fix: do not invoke the function when eventProc = 1000



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1984) SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1984:


Attachment: OOZIE-1984-1.patch

updated patch v-1

> SLACalculator in HA mode performs duplicate operations on records with 
> completed jobs
> -
>
> Key: OOZIE-1984
> URL: https://issues.apache.org/jira/browse/OOZIE-1984
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk, 4.1.0
>
> Attachments: OOZIE-1984-1.patch, OOZIE-1984.patch
>
>
> Scenario:
> SLA periodic run has already processed start, duration and end for a job's 
> SLA entry. The job notification for that job arrives after this and triggers 
> the SLA listener.
> Buggy part:
> {code}
> SLACalculatorMemory.java
> else if 
> (Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode()) {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> if (slaRegBean != null) { // filter out jobs picked by SLA 
> job event listener
>   // but not actually configured for 
> SLA
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(
> SLASummaryQuery.GET_SLA_SUMMARY, jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> if (slaCalc.getEventProcessed() < 7) {
> slaMap.put(jobId, slaCalc);
> }
> }
> }
> }
> if (slaCalc != null) {
> ..
> Object eventProcObj = ((SLASummaryQueryExecutor) 
> SLASummaryQueryExecutor.getInstance())
> 
> .getSingleValue(SLASummaryQuery.GET_SLA_SUMMARY_EVENTPROCESSED, jobId);
> byte eventProc = ((Byte) eventProcObj).byteValue();
> ..
> processJobEndSuccessSLA(slaCalc, startTime, endTime);
> {code}
> The method processJobEndSuccessSLA goes ahead and checks the second LSB of 
> eventProc and sends the duration event _again_. So the bug here is two-fold:
>  * if all events are already processed, this function is still invoked
>  * event processed is 8 (1000), so the second LSB is unset and hence duration 
> is processed again.
> Fix: do not invoke the function when eventProc = 1000



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 25166: OOZIE-1984 SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25166/
---

(Updated Aug. 28, 2014, 10:55 p.m.)


Review request for oozie.


Changes
---

Ryota's comment addressed and after offline clarification


Bugs: OOZIE-1984
https://issues.apache.org/jira/browse/OOZIE-1984


Repository: oozie-git


Description
---

see jira


Diffs (updated)
-

  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 

Diff: https://reviews.apache.org/r/25166/diff/


Testing
---

existing test pass. e-2-e test will follow in QA


Thanks,

Mona Chitnis



[jira] [Updated] (OOZIE-1984) SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1984:


Attachment: OOZIE-1984.patch

> SLACalculator in HA mode performs duplicate operations on records with 
> completed jobs
> -
>
> Key: OOZIE-1984
> URL: https://issues.apache.org/jira/browse/OOZIE-1984
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk, 4.1.0
>
> Attachments: OOZIE-1984.patch
>
>
> Scenario:
> SLA periodic run has already processed start, duration and end for a job's 
> SLA entry. The job notification for that job arrives after this and triggers 
> the SLA listener.
> Buggy part:
> {code}
> SLACalculatorMemory.java
> else if 
> (Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode()) {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> if (slaRegBean != null) { // filter out jobs picked by SLA 
> job event listener
>   // but not actually configured for 
> SLA
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(
> SLASummaryQuery.GET_SLA_SUMMARY, jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> if (slaCalc.getEventProcessed() < 7) {
> slaMap.put(jobId, slaCalc);
> }
> }
> }
> }
> if (slaCalc != null) {
> ..
> Object eventProcObj = ((SLASummaryQueryExecutor) 
> SLASummaryQueryExecutor.getInstance())
> 
> .getSingleValue(SLASummaryQuery.GET_SLA_SUMMARY_EVENTPROCESSED, jobId);
> byte eventProc = ((Byte) eventProcObj).byteValue();
> ..
> processJobEndSuccessSLA(slaCalc, startTime, endTime);
> {code}
> The method processJobEndSuccessSLA goes ahead and checks the second LSB of 
> eventProc and sends the duration event _again_. So the bug here is two-fold:
>  * if all events are already processed, this function is still invoked
>  * event processed is 8 (1000), so the second LSB is unset and hence duration 
> is processed again.
> Fix: do not invoke the function when eventProc = 1000



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 25166: OOZIE-1984 SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25166/
---

Review request for oozie.


Bugs: OOZIE-1984
https://issues.apache.org/jira/browse/OOZIE-1984


Repository: oozie-git


Description
---

see jira


Diffs
-

  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 

Diff: https://reviews.apache.org/r/25166/diff/


Testing
---

existing test pass. e-2-e test will follow in QA


Thanks,

Mona Chitnis



[jira] [Created] (OOZIE-1984) SLACalculator in HA mode performs duplicate operations on records with completed jobs

2014-08-28 Thread Mona Chitnis (JIRA)
Mona Chitnis created OOZIE-1984:
---

 Summary: SLACalculator in HA mode performs duplicate operations on 
records with completed jobs
 Key: OOZIE-1984
 URL: https://issues.apache.org/jira/browse/OOZIE-1984
 Project: Oozie
  Issue Type: Bug
Affects Versions: trunk
Reporter: Mona Chitnis
 Fix For: trunk, 4.1.0


Scenario:

SLA periodic run has already processed start, duration and end for a job's SLA 
entry. The job notification for that job arrives after this and triggers the 
SLA listener.

Buggy part:
{code}
SLACalculatorMemory.java

        else if (Services.get().get(JobsConcurrencyService.class).isHighlyAvailableMode()) {
            // jobid might not exist in slaMap in HA Setting
            SLARegistrationBean slaRegBean = SLARegistrationQueryExecutor.getInstance().get(
                    SLARegQuery.GET_SLA_REG_ALL, jobId);
            if (slaRegBean != null) { // filter out jobs picked by SLA job event listener
                                      // but not actually configured for SLA
                SLASummaryBean slaSummaryBean = SLASummaryQueryExecutor.getInstance().get(
                        SLASummaryQuery.GET_SLA_SUMMARY, jobId);
                slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
                if (slaCalc.getEventProcessed() < 7) {
                    slaMap.put(jobId, slaCalc);
                }
            }
        }
    }
    if (slaCalc != null) {
        ..
        Object eventProcObj = ((SLASummaryQueryExecutor) SLASummaryQueryExecutor.getInstance())
                .getSingleValue(SLASummaryQuery.GET_SLA_SUMMARY_EVENTPROCESSED, jobId);
        byte eventProc = ((Byte) eventProcObj).byteValue();
        ..
        processJobEndSuccessSLA(slaCalc, startTime, endTime);
{code}

The method processJobEndSuccessSLA goes ahead and checks the second LSB of 
eventProc and sends the duration event _again_. So the bug here is two-fold:
 * if all events are already processed, this function is still invoked
 * event processed is 8 (1000), so the second LSB is unset and hence duration 
is processed again.

Fix: do not invoke the function when eventProc = 1000
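
The bit arithmetic behind this can be sketched in isolation. This is a hypothetical stand-alone class, not the code in SLACalculatorMemory; the meaning of bit 2 (duration) and of the value 8 (everything processed) is taken from the description above, while the meanings assigned to bits 1 and 4 are assumptions made for illustration.

```java
// Hypothetical sketch. Only "second LSB = duration" and "8 (1000) = all events
// processed" come from the issue text; the other constants are assumptions.
final class SlaEventGuardSketch {
    static final int START_PROCESSED = 1;     // 0001 (assumed)
    static final int DURATION_PROCESSED = 2;  // 0010, the "second LSB"
    static final int END_PROCESSED = 4;       // 0100 (assumed)
    static final int ALL_PROCESSED = 8;       // 1000, all events already handled

    /** Naive check: with eventProc = 8 the duration bit is unset, so this says "pending". */
    static boolean durationLooksPending(byte eventProc) {
        return (eventProc & DURATION_PROCESSED) == 0;
    }

    /** Guard along the lines of the proposed fix: skip job-end processing for 1000. */
    static boolean shouldProcessJobEnd(byte eventProc) {
        return eventProc != ALL_PROCESSED;
    }

    public static void main(String[] args) {
        byte eventProc = 8;
        System.out.println(durationLooksPending(eventProc)); // true: the misleading result
        System.out.println(shouldProcessJobEnd(eventProc));  // false: the call is skipped
    }
}
```

The point of the guard is that 8 is a terminal marker rather than one more flag OR-ed into the mask, so testing an individual bit of it gives the wrong answer.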



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Re: Review Request 24948: OOZIE-1940 StatusTransitService has race condition

2014-08-27 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24948/#review51661
---


This is good cleanup and refactoring. I did a cursory review to understand the 
structural changes and will review more carefully later today for any bugs 
introduced.


core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment90178>

if bundleActionStatus map has some action as RUNNINGWITHERROR, why are we 
setting bundle job to PAUSED?



core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment90177>

typo in bottom



core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
<https://reviews.apache.org/r/24948/#comment90179>

same typo


- Mona Chitnis


On Aug. 26, 2014, 12:30 a.m., Purshotam Shah wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24948/
> ---
> 
> (Updated Aug. 26, 2014, 12:30 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1940
> https://issues.apache.org/jira/browse/OOZIE-1940
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> StatusTransitService has race condition
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/BundleActionBean.java 5d85a4d 
>   core/src/main/java/org/apache/oozie/BundleJobBean.java 0f1670a 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 88a2c67 
>   core/src/main/java/org/apache/oozie/command/StatusTransitXCommand.java 
> e69de29 
>   
> core/src/main/java/org/apache/oozie/command/bundle/BundleStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordStatusTransitXCommand.java
>  e69de29 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/BundleJobQueryExecutor.java 
> 36cd968 
>   
> core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java
>  3008393 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
> 04e6e29 
>   core/src/main/java/org/apache/oozie/service/StatusTransitService.java 
> 21ac25f 
>   core/src/test/java/org/apache/oozie/service/TestStatusTransitService.java 
> bb99138 
> 
> Diff: https://reviews.apache.org/r/24948/diff/
> 
> 
> Testing
> ---
> 
> UTC
> 
> 
> Thanks,
> 
> Purshotam Shah
> 
>



[jira] [Commented] (OOZIE-1940) StatusTransitService has race condition

2014-08-27 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112362#comment-14112362
 ] 

Mona Chitnis commented on OOZIE-1940:
-

Agree with the approach. Currently each run of Status Transit Service takes 
multiple seconds, I believe. If it is going to hold the lock for that long, we 
have to assess the consequences for the other commands waiting for the lock, 
e.g. a Change command appearing to "hang" on the user-facing CLI because it's 
synchronously trying to acquire the lock held by STS. OOZIE-1885 should ideally 
reduce the overall time the lock is held by STS.
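
One way to picture the "hang" concern is a caller that waits for the lock only for a bounded time. The sketch below uses a plain in-JVM ReentrantLock as a stand-in; it is not how Oozie acquires its locks, and the class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: an in-JVM lock stands in for whatever lock STS holds.
final class BoundedLockWaitSketch {

    /** Runs the task if the lock is obtained within timeoutMs; returns whether it ran. */
    static boolean runWithLock(ReentrantLock lock, long timeoutMs, Runnable task) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        if (!acquired) {
            return false; // the caller can report "busy, retry" instead of appearing to hang
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantLock lock = new ReentrantLock();
        System.out.println(runWithLock(lock, 100, () -> System.out.println("change applied")));
    }
}
```

A bounded wait only changes how contention surfaces to the user; shortening the time STS holds the lock still addresses the cause.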

> StatusTransitService has race condition
> ---
>
> Key: OOZIE-1940
> URL: https://issues.apache.org/jira/browse/OOZIE-1940
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>
> StatusTransitService doesn't acquire lock while updating DB. 
> We noticed one such issue while doing HA testing, thanks to [~mchiang]
> We issue a change command to change pause time, which got executed on one 
> server. While change command was running on one server, other server started 
> executing StatusTransitService.
> Server 1 log
> {code}
> 2014-07-16 17:28:05,268  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Acquired lock for 
> [org.apache.oozie.service.StatusTransitService]
> 2014-07-16 17:28:09,694  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Set coordinator job 
> [0011385-140716042555-oozie-oozi-C] status to 'SUCCEEDED' from 'RUNNING' 
> 2014-07-16 17:28:15,416  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Released lock for 
> [org.apache.oozie.service.StatusTransitService]
> {code}
> Server 2 log
> {code}
> 2014-07-16 17:28:06,499 DEBUG CoordChangeXCommand:545 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB180] 
> JOB[0011385-140716042555-oozie-oozi-C] ACTION[-] New pause/end date is : Wed 
> Jul 16 17:30:00 UTC 2014 and last action number is : 3
> 2014-07-16 17:28:06,508  INFO CoordChangeXCommand:539 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB180] 
> JOB[0011385-140716042555-oozie-oozi-C] ACTION[-] ENDED CoordChangeXCommand 
> for jobId=0011385-140716042555-oozie-oozi-C
> {code}
> CoordMaterializeTransitionXCommand has created all actions (a few were in 
> waiting and a few were in running state) and set doneMaterialization to true.
> Change command deletes all waiting coord actions, except the 3 
> running/SUCCEEDED actions, and resets doneMaterialization.
> StatusTransitService first loads a set of pending jobs and for each job it 
> makes DB calls to check coord action status. Coord jobs are loaded only once 
> in the beginning.
> This is what happened.
> 1. StatusTransitService loads the coord job, whose doneMaterialization is set 
> to true, at 17:28:05,268 (server 1)
> 2. Change command deletes the waiting actions and resets doneMaterialization 
> at 17:28:06,508 (server 2)
> 3. StatusTransitService loads the actions for the job: only 3, all in 
> SUCCEEDED status. It never reloads doneMaterialization, at 17:28:09,694 
> (server 1)
> StatusTransitService sets the job status to SUCCEEDED, because 
> doneMaterialization is true and all actions are SUCCEEDED.
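
The stale read in the three steps above can be sketched as a decision function that re-reads doneMaterialization at decision time instead of reusing the value loaded at the start of the run. The class, method and parameter names are hypothetical, and this only illustrates the stale flag; it is not the locking change under review.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the decision only; statuses are plain strings here.
final class StaleFlagSketch {

    /**
     * Returns the next job status. The flag is fetched through the supplier when
     * the decision is made, so a reset by a concurrent change command is seen.
     */
    static String decide(String currentStatus, List<String> actionStatuses,
                         Supplier<Boolean> reloadDoneMaterialization) {
        boolean allSucceeded = !actionStatuses.isEmpty()
                && actionStatuses.stream().allMatch("SUCCEEDED"::equals);
        if (allSucceeded && reloadDoneMaterialization.get()) {
            return "SUCCEEDED";
        }
        return currentStatus;
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("SUCCEEDED", "SUCCEEDED", "SUCCEEDED");
        // The change command has reset the flag in the meantime, so the job stays RUNNING.
        System.out.println(decide("RUNNING", actions, () -> false));
    }
}
```

Without a lock there is still a window between the re-read and the status update, which is why serializing the two commands is the more complete fix.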



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1940) StatusTransitService has race condition

2014-08-27 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112359#comment-14112359
 ] 

Mona Chitnis commented on OOZIE-1940:
-

Linking this as dependent on OOZIE-1885.

> StatusTransitService has race condition
> ---
>
> Key: OOZIE-1940
> URL: https://issues.apache.org/jira/browse/OOZIE-1940
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>
> StatusTransitService doesn't acquire lock while updating DB. 
> We noticed one such issue while doing HA testing, thanks to [~mchiang]
> We issue a change command to change pause time, which got executed on one 
> server. While change command was running on one server, other server started 
> executing StatusTransitService.
> Server 1 log
> {code}
> 2014-07-16 17:28:05,268  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Acquired lock for 
> [org.apache.oozie.service.StatusTransitService]
> 2014-07-16 17:28:09,694  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Set coordinator job 
> [0011385-140716042555-oozie-oozi-C] status to 'SUCCEEDED' from 'RUNNING' 
> 2014-07-16 17:28:15,416  INFO StatusTransitService$StatusTransitRunnable:539 
> [pool-1-thread-13] - USER[-] GROUP[-] Released lock for 
> [org.apache.oozie.service.StatusTransitService]
> {code}
> Server 2 log
> {code}
> 2014-07-16 17:28:06,499 DEBUG CoordChangeXCommand:545 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB180] 
> JOB[0011385-140716042555-oozie-oozi-C] ACTION[-] New pause/end date is : Wed 
> Jul 16 17:30:00 UTC 2014 and last action number is : 3
> 2014-07-16 17:28:06,508  INFO CoordChangeXCommand:539 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB180] 
> JOB[0011385-140716042555-oozie-oozi-C] ACTION[-] ENDED CoordChangeXCommand 
> for jobId=0011385-140716042555-oozie-oozi-C
> {code}
> CoordMaterializeTransitionXCommand has created all actions (a few were in 
> waiting and a few were in running state) and set doneMaterialization to true.
> Change command deletes all waiting coord actions, except the 3 
> running/SUCCEEDED actions, and resets doneMaterialization.
> StatusTransitService first loads a set of pending jobs and for each job it 
> makes DB calls to check coord action status. Coord jobs are loaded only once 
> in the beginning.
> This is what happened.
> 1. StatusTransitService loads the coord job, whose doneMaterialization is set 
> to true, at 17:28:05,268 (server 1)
> 2. Change command deletes the waiting actions and resets doneMaterialization 
> at 17:28:06,508 (server 2)
> 3. StatusTransitService loads the actions for the job: only 3, all in 
> SUCCEEDED status. It never reloads doneMaterialization, at 17:28:09,694 
> (server 1)
> StatusTransitService sets the job status to SUCCEEDED, because 
> doneMaterialization is true and all actions are SUCCEEDED.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1885) Query optimization for StatusTransitService

2014-08-27 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112357#comment-14112357
 ] 

Mona Chitnis commented on OOZIE-1885:
-

A join query is always more CPU and memory intensive. But it will probably cut 
down on the overall time it takes, because of the multiple queries in loop 
right now. Approach is fine but we should vet it with end-to-end test 
performance gains

> Query optimization for StatusTransitService
> ---
>
> Key: OOZIE-1885
> URL: https://issues.apache.org/jira/browse/OOZIE-1885
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>
> {code}
>     private void coordTransit() throws JPAExecutorException, CommandException {
>         List<CoordinatorJobBean> pendingJobCheckList = null;
>         if (lastInstanceStartTime == null) {
>             LOG.info("Running coordinator status service first instance");
>             // this is the first instance, we need to check for all pending jobs;
>             pendingJobCheckList = jpaService.execute(new CoordJobsGetPendingJPAExecutor(limit));
>         }
>         else {
>             LOG.info("Running coordinator status service from last instance time =  "
>                     + DateUtils.formatDateOozieTZ(lastInstanceStartTime));
>             // this is not the first instance, we should only check jobs
>             // that have actions or jobs been
>             // updated >= start time of last service run;
>             List<CoordinatorActionBean> actionsList = CoordActionQueryExecutor.getInstance().getList(
>                     CoordActionQuery.GET_COORD_ACTIONS_BY_LAST_MODIFIED_TIME, lastInstanceStartTime);
>             Set<String> coordIds = new HashSet<String>();
>             for (CoordinatorActionBean action : actionsList) {
>                 coordIds.add(action.getJobId());
>             }
>             pendingJobCheckList = new ArrayList<CoordinatorJobBean>();
>             for (String coordId : coordIds.toArray(new String[coordIds.size()])) {
>                 CoordinatorJobBean coordJob;
>                 try {
>                     coordJob = CoordJobQueryExecutor.getInstance().get(CoordJobQuery.GET_COORD_JOB, coordId);
>                 }
>                 catch (JPAExecutorException jpaee) {
>                     if (jpaee.getErrorCode().equals(ErrorCode.E0604)) {
>                         LOG.warn("Exception happened during StatusTransitRunnable; Coordinator Job doesn't exist", jpaee);
>                         continue;
>                     } else {
>                         throw jpaee;
>                     }
>                 }
>                 // Running coord job might have pending false
>                 Job.Status coordJobStatus = coordJob.getStatus();
>                 if ((coordJob.isPending() || coordJobStatus.equals(Job.Status.PAUSED)
>                         || coordJobStatus.equals(Job.Status.RUNNING)
>                         || coordJobStatus.equals(Job.Status.RUNNINGWITHERROR)
>                         || coordJobStatus.equals(Job.Status.PAUSEDWITHERROR))
>                         && !coordJobStatus.equals(Job.Status.IGNORED)) {
>                     pendingJobCheckList.add(coordJob);
>                 }
>             }
>             pendingJobCheckList.addAll(CoordJobQueryExecutor.getInstance().getList(
>                     CoordJobQuery.GET_COORD_JOBS_CHANGED, lastInstanceStartTime));
>         }
>         aggregateCoordJobsStatus(pendingJobCheckList);
>     }
> }
> {code}
> This could be done in one SQL query, something like:
> select w.id, w.status, w.pending from CoordinatorJobBean w where 
> w.startTimestamp <= :matTime AND (w.statusStr = 'PREP' OR w.statusStr = 
> 'RUNNING' OR w.statusStr = 'RUNNINGWITHERROR' OR w.statusStr = 
> 'PAUSEDWITHERROR') AND w.statusStr <> 'IGNORED' AND w.id in (select a.jobId 
> from CoordinatorActionBean a where a.lastModifiedTimestamp >= 
> :lastModifiedTime group by a.jobId)
> Same for bundleTransit().



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1847) HA - Oozie servers should shutdown (or go in safe mode) in case of ZK failure

2014-08-27 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112333#comment-14112333
 ] 

Mona Chitnis commented on OOZIE-1847:
-

^^ in case of timeout > 3 seconds resulting in server shutdown and job failure

> HA - Oozie servers should shutdown (or go in safe mode) in case of ZK failure
> -
>
> Key: OOZIE-1847
> URL: https://issues.apache.org/jira/browse/OOZIE-1847
> Project: Oozie
>  Issue Type: Bug
>  Components: HA
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Attachments: OOZIE-1847-V1.patch
>
>






[jira] [Commented] (OOZIE-1847) HA - Oozie servers should shutdown (or go in safe mode) in case of ZK failure

2014-08-27 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112331#comment-14112331
 ] 

Mona Chitnis commented on OOZIE-1847:
-

Pretty straightforward patch, and I agree it's needed. But in addition to 
printing it in the logs, should we bubble it up to the action error message too? 
That way the reason for a workflow failing can be pulled up from any of the 
client-facing APIs too, e.g. job-info, web-console, RESTful API etc.

> HA - Oozie servers should shutdown (or go in safe mode) in case of ZK failure
> -
>
> Key: OOZIE-1847
> URL: https://issues.apache.org/jira/browse/OOZIE-1847
> Project: Oozie
>  Issue Type: Bug
>  Components: HA
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Attachments: OOZIE-1847-V1.patch
>
>






[jira] [Updated] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-20 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1976:


Attachment: OOZIE-1976-rough-design-2.pdf

New design spec uploaded (rough-design-2) with additions about
 * Wait-for in action
 * EL functions initial thoughts - implementation details will follow in code 
patch
 * HCatDependencyCache changes (for the in-memory push-based hcat dependencies)
 * Job info API (coord-action) changes for displaying Missing Dependency. It 
runs the risk of being verbose if an optional dataset has a lot of instances. 
Needs thought about how to possibly truncate there.
 

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>Reporter: Mona Chitnis
>    Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-rough-design-2.pdf, 
> OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Commented] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-20 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104618#comment-14104618
 ] 

Mona Chitnis commented on OOZIE-1976:
-

For Ryota's comment about priority, I think it complicates the missing 
dependencies field, now we require a structure to indicate something like 
{{P0=dep1,dep2#P1=dep3,dep4}} which in turn is nested under the AND/OR 
structure. So when dependencies are checked and found to exist, action will 
start only when all P0's are satisfied etc. I think this is essentially same as 
putting them in the  block instead of optional  block. For the N out 
of M case, it will start when _any_ instances >=n are available, using all M if 
all there, and not limit to N there.
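
To make the nested structure concrete, here is a rough sketch of how a string like {{P0=dep1,dep2#P1=dep3,dep4}} could be split into priority groups. The class and method names are hypothetical and this is not taken from any patch; it only assumes '#' separates priority groups and ',' separates dependencies, as in the example string above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PriorityDependencies {

    // Hypothetical parser for the illustrative format "P0=dep1,dep2#P1=dep3,dep4".
    // Keeps the groups in the order they appear in the string.
    public static Map<String, List<String>> parse(String missingDeps) {
        Map<String, List<String>> byPriority = new LinkedHashMap<>();
        for (String group : missingDeps.split("#")) {
            String[] keyAndDeps = group.split("=", 2);
            if (keyAndDeps.length != 2) {
                throw new IllegalArgumentException("Malformed priority group: " + group);
            }
            byPriority.put(keyAndDeps[0], new ArrayList<>(Arrays.asList(keyAndDeps[1].split(","))));
        }
        return byPriority;
    }

    public static void main(String[] args) {
        // Prints {P0=[dep1, dep2], P1=[dep3, dep4]}
        System.out.println(parse("P0=dep1,dep2#P1=dep3,dep4"));
    }
}
```

Every dependency check would have to walk a map like this inside each AND/OR node, which is the extra complexity mentioned above.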

Good pointer about EL functions, that one's going to be important and will 
probably need a few new ones.

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>    Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Commented] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-20 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104227#comment-14104227
 ] 

Mona Chitnis commented on OOZIE-1976:
-

Thanks Puru and Ryota. Will incorporate your comments and come up with new 
design specification. As for the 'explain', this can be done as part of 'info' 
command displaying missing dependency itself, rather than introducing another 
command

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Updated] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-18 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1976:


Attachment: OOZIE-1976-rough-design.pdf

Attaching rough design doc (pdf)

> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>Reporter: Mona Chitnis
>    Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1976-rough-design.pdf
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Updated] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-18 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1976:


Description: 
All dataset instances specified as input to coordinator, currently work on AND 
logic i.e. ALL of them should be available for workflow to start. We should 
enhance this to include more logical ways of specifying availability criteria 
e.g.
 * OR between instances
 * minimum N out of K instances
 * delta datasets (process data incrementally)

Use-cases for this:
 * Different datasets are BCP, and workflow can run with either, whichever 
arrives earlier.
 * Data is not guaranteed, and while $coord:latest allows skipping to available 
ones, workflow will never trigger unless mentioned number of instances are 
found.
 * Workflow is like a ‘refining’ algorithm which should run after minimum 
required datasets are ready, and should only process the delta for efficiency.

This JIRA is to discuss the design and then the review the implementation for 
some or all of the above features.

  was:
All dataset instances specified as input to coordinator, currently work on AND 
logic i.e. ALL of them should be available for workflow to start. We should 
enhance this to include more logical ways of specifying availability criteria 
e.g.
 * OR between instances
 * minimum N out of K instances
 * delta datasets (process data incrementally)

Use-cases for this:
Different datasets are BCP, and workflow can run with either, whichever arrives 
earlier.
Data is not guaranteed, and while $coord:latest allows skipping to available 
ones, workflow will never trigger unless mentioned number of instances are 
found.
Workflow is like a ‘refining’ algorithm which should run after minimum required 
datasets are ready, and should only process the delta for efficiency.

This JIRA is to discuss the design and then the review the implementation for 
some or all of the above features.


> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>Reporter: Mona Chitnis
>    Assignee: Mona Chitnis
> Fix For: trunk
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
>  * Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
>  * Data is not guaranteed, and while $coord:latest allows skipping to 
> available ones, workflow will never trigger unless mentioned number of 
> instances are found.
>  * Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Updated] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-18 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1976:


Description: 
All dataset instances specified as input to coordinator, currently work on AND 
logic i.e. ALL of them should be available for workflow to start. We should 
enhance this to include more logical ways of specifying availability criteria 
e.g.
 * OR between instances
 * minimum N out of K instances
 * delta datasets (process data incrementally)

Use-cases for this:
Different datasets are BCP, and workflow can run with either, whichever arrives 
earlier.
Data is not guaranteed, and while $coord:latest allows skipping to available 
ones, workflow will never trigger unless mentioned number of instances are 
found.
Workflow is like a ‘refining’ algorithm which should run after minimum required 
datasets are ready, and should only process the delta for efficiency.

This JIRA is to discuss the design and then the review the implementation for 
some or all of the above features.

  was:
All dataset instances specified as input to coordinator, currently work on AND 
logic i.e. ALL of them should be available for workflow to start. We should 
enhance this to include more logical ways of specifying availability criteria 
e.g.
 * OR between instances
 * minimum N out of K instances
 * delta datasets (process data incrementally)

This JIRA is to discuss the design and then the review the implementation for 
some or all of the above features.


> Specifying coordinator input datasets in more logical ways
> --
>
> Key: OOZIE-1976
> URL: https://issues.apache.org/jira/browse/OOZIE-1976
> Project: Oozie
>  Issue Type: New Feature
>  Components: coordinator
>Affects Versions: trunk
>Reporter: Mona Chitnis
>    Assignee: Mona Chitnis
> Fix For: trunk
>
>
> All dataset instances specified as input to coordinator, currently work on 
> AND logic i.e. ALL of them should be available for workflow to start. We 
> should enhance this to include more logical ways of specifying availability 
> criteria e.g.
>  * OR between instances
>  * minimum N out of K instances
>  * delta datasets (process data incrementally)
> Use-cases for this:
> Different datasets are BCP, and workflow can run with either, whichever 
> arrives earlier.
> Data is not guaranteed, and while $coord:latest allows skipping to available 
> ones, workflow will never trigger unless mentioned number of instances are 
> found.
> Workflow is like a ‘refining’ algorithm which should run after minimum 
> required datasets are ready, and should only process the delta for efficiency.
> This JIRA is to discuss the design and then the review the implementation for 
> some or all of the above features.





[jira] [Created] (OOZIE-1976) Specifying coordinator input datasets in more logical ways

2014-08-18 Thread Mona Chitnis (JIRA)
Mona Chitnis created OOZIE-1976:
---

 Summary: Specifying coordinator input datasets in more logical ways
 Key: OOZIE-1976
 URL: https://issues.apache.org/jira/browse/OOZIE-1976
 Project: Oozie
  Issue Type: New Feature
  Components: coordinator
Affects Versions: trunk
Reporter: Mona Chitnis
Assignee: Mona Chitnis
 Fix For: trunk


All dataset instances specified as input to coordinator, currently work on AND 
logic i.e. ALL of them should be available for workflow to start. We should 
enhance this to include more logical ways of specifying availability criteria 
e.g.
 * OR between instances
 * minimum N out of K instances
 * delta datasets (process data incrementally)

This JIRA is to discuss the design and then the review the implementation for 
some or all of the above features.
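
As a starting point for the discussion, the first two criteria could be sketched roughly as below, next to today's AND behavior. This is only an illustration with hypothetical names (InputAvailability, all, any, atLeast), where each boolean stands for "this dataset instance is available"; it is not the proposed implementation, and delta datasets are not covered.

```java
import java.util.Arrays;
import java.util.List;

public class InputAvailability {

    /** AND (today's behavior): every listed instance must be available. */
    public static boolean all(List<Boolean> available) {
        return !available.contains(Boolean.FALSE);
    }

    /** OR between instances: at least one listed instance is available. */
    public static boolean any(List<Boolean> available) {
        return available.contains(Boolean.TRUE);
    }

    /** Minimum N out of K: at least n of the listed instances are available. */
    public static boolean atLeast(int n, List<Boolean> available) {
        int count = 0;
        for (boolean isAvailable : available) {
            if (isAvailable) {
                count++;
            }
        }
        return count >= n;
    }

    public static void main(String[] args) {
        List<Boolean> instances = Arrays.asList(true, false, true, false);
        System.out.println("AND: " + all(instances));           // false
        System.out.println("OR: " + any(instances));            // true
        System.out.println("2 of 4: " + atLeast(2, instances)); // true
        System.out.println("3 of 4: " + atLeast(3, instances)); // false
    }
}
```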





Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-14 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/
---

(Updated Aug. 14, 2014, 11:13 p.m.)


Review request for oozie.


Changes
---

updated patch to include unit tests, and fixes uncovered in the process


Bugs: OOZIE-1913
https://issues.apache.org/jira/browse/OOZIE-1913


Repository: oozie-git


Description
---

See Jira


Diffs (updated)
-

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
  client/src/main/java/org/apache/oozie/client/OozieClient.java b468186 
  client/src/main/java/org/apache/oozie/client/event/jms/JMSHeaderConstants.java 2f0a45c 
  client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 5d3fc62 
  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
  core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
  core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 5d3b6af 
  core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java ffb2d08 
  core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java b4b2fef 
  core/src/main/java/org/apache/oozie/command/coord/CoordSubmitXCommand.java 02b30ef 
  core/src/main/java/org/apache/oozie/coord/CoordUtils.java 26db068 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java cd26e07 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 42a0968 
  core/src/main/java/org/apache/oozie/jms/JMSSLAEventListener.java 8296a6c 
  core/src/main/java/org/apache/oozie/service/CoordMaterializeTriggerService.java 3fbd092 
  core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 8ca2e81 
  core/src/main/java/org/apache/oozie/servlet/V2SLAServlet.java 8620af5 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 67d6237 
  core/src/main/java/org/apache/oozie/sla/SLACalculator.java 132d4df 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 
  core/src/main/java/org/apache/oozie/sla/SLAOperations.java 0cad071 
  core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2349329 
  core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java fd21c45 
  core/src/main/resources/oozie-default.xml ebceaa7 
  core/src/test/java/org/apache/oozie/client/TestWorkflowClient.java e2e0f11 
  core/src/test/java/org/apache/oozie/command/coord/TestCoordSubmitXCommand.java fedf4a8 
  core/src/test/java/org/apache/oozie/coord/TestCoordUtils.java a39efe3 
  core/src/test/java/org/apache/oozie/jms/TestJMSSLAEventListener.java fa26935 
  core/src/test/java/org/apache/oozie/servlet/TestV2SLAServlet.java 5a35fdb 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 210c99e 

Diff: https://reviews.apache.org/r/24487/diff/


Testing (updated)
---

unit tests added, e-2-e test with CLI command done


Thanks,

Mona Chitnis



[jira] [Commented] (OOZIE-1913) Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-13 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096118#comment-14096118
 ] 

Mona Chitnis commented on OOZIE-1913:
-

Want to mention another point:

This API also allows disabling alerts for "ALL" SLA instances of a coordinator 
or bundle. For a bundle, that would mean all actions of all its coordinators. 
SLARegistrationBean stores 'parentId' if the SLA object pertains to a 
coord-action/wf-action/bundle-action. To avoid a heavy DB query in the 
suspend-ALL case for bundle(s), I want to change this 'parentId' to point to 
the bundle jobId directly, if the coordinator is part of a bundle. If not, it 
will be the coord job id as it is now.

The impact of this is in JMSSLAEventListener, where topicName is set to this 
parentId. So topicName will get set to the top-level bundle id, and the user 
will have to change the topic name being listened to. Please give feedback on 
whether this is a reasonable approach. I will make sure appropriate JMS selector 
options are available if the user gives this bundle id as topicName but still 
wants to limit per coordinator job id. 
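
A minimal sketch of the proposed parentId rule, assuming a hypothetical helper resolveParentId (not an existing Oozie method) and made-up job ids:

```java
public class SlaParentId {

    // Hypothetical helper illustrating the proposal: use the bundle job id as
    // parentId when the coordinator is part of a bundle, else the coord job id.
    public static String resolveParentId(String coordJobId, String bundleJobId) {
        if (bundleJobId != null && !bundleJobId.isEmpty()) {
            return bundleJobId;
        }
        return coordJobId;
    }

    public static void main(String[] args) {
        // The ids below are made up for illustration only.
        System.out.println(resolveParentId("0000002-140813000000000-oozie-C",
                "0000001-140813000000000-oozie-B"));
        System.out.println(resolveParentId("0000002-140813000000000-oozie-C", null));
    }
}
```

Since topicName follows parentId, a listener for a coordinator inside a bundle would then subscribe with the bundle id rather than the coordinator id.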

> Devise a way to turn off SLA alerts for bundle/coordinator flexibly
> ---
>
> Key: OOZIE-1913
> URL: https://issues.apache.org/jira/browse/OOZIE-1913
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
>
> From user:
> Need to turn off the SLA miss alerts in jobs when the bundle is suspended for
> grid upgrades and similar work so that when it's resumed we aren't flooded 
> with a bunch of alerts.





Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-11 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review50269
---



core/src/main/resources/oozie-default.xml
<https://reviews.apache.org/r/24487/#comment87956>

This change is part of OOZIE-1932; I will remove it in the next patch version.


- Mona Chitnis


On Aug. 8, 2014, 2:20 a.m., Mona Chitnis wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24487/
> ---
> 
> (Updated Aug. 8, 2014, 2:20 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1913
> https://issues.apache.org/jira/browse/OOZIE-1913
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> See Jira
> 
> 
> Diffs
> -
> 
>   client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
>   client/src/main/java/org/apache/oozie/client/OozieClient.java b468186 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 5d3fc62 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
>   core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
>   core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 5d3b6af 
>   core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java ffb2d08 
>   core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java b4b2fef 
>   core/src/main/java/org/apache/oozie/command/coord/CoordSubmitXCommand.java 02b30ef 
>   core/src/main/java/org/apache/oozie/coord/CoordUtils.java 26db068 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java cd26e07 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 42a0968 
>   core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 8ca2e81 
>   core/src/main/java/org/apache/oozie/servlet/V2SLAServlet.java 8620af5 
>   core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 67d6237 
>   core/src/main/java/org/apache/oozie/sla/SLACalculator.java 132d4df 
>   core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 
>   core/src/main/java/org/apache/oozie/sla/SLAOperations.java 0cad071 
>   core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2349329 
>   core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java fd21c45 
>   core/src/main/resources/oozie-default.xml ebceaa7 
>   core/src/test/java/org/apache/oozie/client/TestWorkflowClient.java e2e0f11 
> 
> Diff: https://reviews.apache.org/r/24487/diff/
> 
> 
> Testing
> ---
> 
> ongoing
> 
> 
> Thanks,
> 
> Mona Chitnis
> 
>



Re: Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-08 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/#review50101
---



client/src/main/java/org/apache/oozie/cli/OozieCLI.java
<https://reviews.apache.org/r/24487/#comment87652>

this has been removed



client/src/main/java/org/apache/oozie/cli/OozieCLI.java
<https://reviews.apache.org/r/24487/#comment87653>

this has been removed. The action ids/dates range is read as argument for 
option -suspendalerts itself


- Mona Chitnis





Re: Problem with Oozie DeadLock

2014-08-08 Thread Mona Chitnis
Hi Fabiano,

You should definitely be able to execute multiple Oozie jobs simultaneously. 
The issue comes up when you have a small Hadoop cluster setup, and thus a very 
small number of queue slots for submitting jobs to the ResourceManager.

Can you look into adding an additional queue by configuring your Hadoop cluster 
through capacity-scheduler.xml? Then you can use the approach mentioned in 
OOZIE-1673, to specify in your workflow's properties

oozie.launcher.mapreduce.job.queuename=queue1
mapreduce.job.queuename=queue2 

and avoid deadlock situation.
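
If you build the job properties programmatically, a rough sketch could look like the following. The class name QueueSplit and the check that the two queues differ are my own assumptions for illustration; only the two property keys come from the lines above, and queue1/queue2 must exist in your scheduler configuration.

```java
import java.util.Properties;

public class QueueSplit {

    // Builds job properties that send the Oozie launcher job and the child
    // job to different queues, so launchers cannot occupy every slot.
    public static Properties jobProperties(String launcherQueue, String childQueue) {
        if (launcherQueue.equals(childQueue)) {
            // Assumption for this sketch: the point is to keep the queues separate.
            throw new IllegalArgumentException("Launcher and child queues should differ");
        }
        Properties props = new Properties();
        props.setProperty("oozie.launcher.mapreduce.job.queuename", launcherQueue);
        props.setProperty("mapreduce.job.queuename", childQueue);
        return props;
    }

    public static void main(String[] args) {
        Properties props = jobProperties("queue1", "queue2");
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```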

You can also avoid deadlocks by tuning the memory requirements of your Oozie 
launcher and child jobs to request smaller memory containers, which increases 
the number of jobs you can submit and execute. See 
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.9.1/bk_installing_manually_book/content/rpm-chap1-11.html

 
Mona Chitnis
Yahoo!


On Friday, August 8, 2014 6:58 AM, Gaetano Fabiano wrote:
 


Hi All,
this is my first post on this mailing list; I hope to stay here for a long
time. I saw the project and I love it.
But recently I'm going crazy over some issues.
Our problem is an Oozie deadlock: we would like to execute more than one
Oozie job at the same time, but when we try to execute them the entire
environment becomes locked.
We read a lot of posts about this issue but found no solution.

How is it possible to set up two different queues? We read a lot of posts where
people suggest using different queues for different executions.
Where can we set these queue settings?
Is this the correct way to have different executions at the same time?

Our bewilderment comes from reading this issue,
https://issues.apache.org/jira/browse/OOZIE-1673; if it is as described,
our big doubt is about the usefulness of the entire Oozie framework.
I hope someone can help us resolve it and clear our minds about it.
Any suggestion is welcome.

Regards
Gaetano

Dott. Gaetano Fabiano

via Timpone n° 79
87055 San  Giovanni in Fiore (Cs) ITALY
mobile: +39 328 9469919
phone: +39 0984 991980
email: fabiano.gaet...@gmail.com
skype: deepyoudeep
skype:: gaetano.fab
msn: deep...@hotmail.it
twitter: @gaetanofabiano
gtalk/hangoout/Google+ fg.pa...@gmail.com

[jira] [Commented] (OOZIE-1913) Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-08 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090999#comment-14090999
 ] 

Mona Chitnis commented on OOZIE-1913:
-

Okay, let me remove the "-id" requirement. Regarding treating it as a job 
operation, I think it becomes ambiguous what type of alerts it means, so it is 
better to be clear with the 'sla' command. Also, it removes the need to add an 
additional param 'actions'.

But I can rework this if there's a consensus about which API usage is more 
intuitive. Asking for feedback from users too.

> Devise a way to turn off SLA alerts for bundle/coordinator flexibly
> ---
>
> Key: OOZIE-1913
> URL: https://issues.apache.org/jira/browse/OOZIE-1913
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
>
> From user:
> Need to turn off the SLA miss alerts in jobs when the bundle is suspended for
> grid upgrades and similar work so that when it's resumed we aren't flooded 
> with a bunch of alerts.





[jira] [Updated] (OOZIE-1913) Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-07 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1913:


Summary: Devise a way to turn off SLA alerts for bundle/coordinator 
flexibly  (was: Devise a way to turn off SLA alerts when bundle/coordinator 
suspended)

> Devise a way to turn off SLA alerts for bundle/coordinator flexibly
> ---
>
> Key: OOZIE-1913
> URL: https://issues.apache.org/jira/browse/OOZIE-1913
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
>
> From user:
> Need to turn off the SLA miss alerts in jobs when the bundle is suspended for
> grid upgrades and similar work so that when it's resumed we aren't flooded 
> with a bunch of alerts.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Review Request 24487: OOZIE-1913 Devise a way to turn off SLA alerts for bundle/coordinator flexibly

2014-08-07 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24487/
---

Review request for oozie.


Bugs: OOZIE-1913
https://issues.apache.org/jira/browse/OOZIE-1913


Repository: oozie-git


Description
---

See Jira


Diffs
-

  client/src/main/java/org/apache/oozie/cli/OozieCLI.java 33935d3 
  client/src/main/java/org/apache/oozie/client/OozieClient.java b468186 
  client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 5d3fc62 
  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 795bf63 
  core/src/main/java/org/apache/oozie/CoordinatorJobBean.java 8fd53f1 
  core/src/main/java/org/apache/oozie/command/SubmitTransitionXCommand.java 
5d3b6af 
  core/src/main/java/org/apache/oozie/command/bundle/BundleSubmitXCommand.java 
ffb2d08 
  
core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
 b4b2fef 
  core/src/main/java/org/apache/oozie/command/coord/CoordSubmitXCommand.java 
02b30ef 
  core/src/main/java/org/apache/oozie/coord/CoordUtils.java 26db068 
  
core/src/main/java/org/apache/oozie/executor/jpa/CoordActionQueryExecutor.java 
cd26e07 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordJobQueryExecutor.java 
42a0968 
  core/src/main/java/org/apache/oozie/servlet/SLAServlet.java 8ca2e81 
  core/src/main/java/org/apache/oozie/servlet/V2SLAServlet.java 8620af5 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 67d6237 
  core/src/main/java/org/apache/oozie/sla/SLACalculator.java 132d4df 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 3801325 
  core/src/main/java/org/apache/oozie/sla/SLAOperations.java 0cad071 
  core/src/main/java/org/apache/oozie/sla/service/SLAService.java 2349329 
  core/src/main/java/org/apache/oozie/util/CoordActionsInDateRange.java fd21c45 
  core/src/main/resources/oozie-default.xml ebceaa7 
  core/src/test/java/org/apache/oozie/client/TestWorkflowClient.java e2e0f11 

Diff: https://reviews.apache.org/r/24487/diff/


Testing
---

ongoing


Thanks,

Mona Chitnis



Re: 4.1 release

2014-08-07 Thread Mona Chitnis
OOZIE-1932 I will fix in a couple of days.

Thanks,

 
Mona Chitnis



On Thursday, August 7, 2014 10:25 AM, bowen zhang 
 wrote:
 


Hi guys,
The following link shows all the unresolved issues that will go into 4.1 
release. If anyone has another ticket that needs to go into 4.1, please make 
the fix version "4.1.0".
Issue Navigator - ASF JIRA
Thanks,
Bowen

  
             

Re: Review Request 24187: OOZIE-1958 address duplication of env variables in oozie.launcher.yarn.app.mapreduce.am.env when running with uber mode

2014-08-05 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24187/#review49628
---


+1 pending minor comment about naming and checking via e-2-e test


core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/24187/#comment86870>

keep naming consistent i.e.
launcherEnvMap and launcherEnvMapStr



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/24187/#comment86865>

unrelated formatting change.. but it's ok to include



core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
<https://reviews.apache.org/r/24187/#comment86866>

same as above


- Mona Chitnis


On Aug. 1, 2014, 5:56 p.m., Ryota Egashira wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24187/
> ---
> 
> (Updated Aug. 1, 2014, 5:56 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1958
> https://issues.apache.org/jira/browse/OOZIE-1958
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/OOZIE-1958?filter=-1
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 
> 94b55cf 
>   
> core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java 
> 72a137c 
> 
> Diff: https://reviews.apache.org/r/24187/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Ryota Egashira
> 
>



Re: Thanks for fixing UTC.

2014-08-02 Thread Mona Chitnis
No problem. OOZIE-1811 did fix the root cause of non-flaky tests failing randomly, 
but there's one which is timing-sensitive and still needs to be fixed:
org.apache.oozie.service.TestCallableQueueService.testConcurrencyReachedAndChooseNextEligible


and OOZIE-1952 will fix TestPurgeXCommand which uses old StoreService code.

And then we're done! :)

 
Mona Chitnis
Software Engineer, Hadoop Team
Yahoo!


On Friday, August 1, 2014 4:25 PM, Purshotam Shah 
 wrote:
 


Thanks Mona for fixing the testcases. It feels so good to see no UTC failures.


On 8/1/14, 4:13 PM, "Hadoop QA (JIRA)"  wrote:

>
>    [ 
>https://issues.apache.org/jira/browse/OOZIE-1939?page=com.atlassian.jira.p
>lugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14083151#com
>ment-14083151 ] 
>
>Hadoop QA commented on OOZIE-1939:
>--
>
>Testing JIRA OOZIE-1939
>
>Cleaning local git workspace
>
>
>
>{color:green}+1 PATCH_APPLIES{color}
>{color:green}+1 CLEAN{color}
>{color:green}+1 RAW_PATCH_ANALYSIS{color}
>.    {color:green}+1{color} the patch does not introduce any @author tags
>.    {color:green}+1{color} the patch does not introduce any tabs
>.    {color:green}+1{color} the patch does not introduce any trailing
>spaces
>.    {color:green}+1{color} the patch does not introduce any line longer
>than 132
>.    {color:green}+1{color} the patch does adds/modifies 1 testcase(s)
>{color:green}+1 RAT{color}
>.    {color:green}+1{color} the patch does not seem to introduce new RAT
>warnings
>{color:green}+1 JAVADOC{color}
>.    {color:green}+1{color} the patch does not seem to introduce new
>Javadoc warnings
>{color:green}+1 COMPILE{color}
>.    {color:green}+1{color} HEAD compiles
>.    {color:green}+1{color} patch compiles
>.    {color:green}+1{color} the patch does not seem to introduce new
>javac warnings
>{color:green}+1 BACKWARDS_COMPATIBILITY{color}
>.    {color:green}+1{color} the patch does not change any JPA
>Entity/Colum/Basic/Lob/Transient annotations
>.    {color:green}+1{color} the patch does not modify JPA files
>{color:green}+1 TESTS{color}
>.    Tests run: 1506
>{color:green}+1 DISTRO{color}
>.    {color:green}+1{color} distro tarball builds with the patch
>
>
>{color:green}*+1 Overall result, good!, no -1s*{color}
>
>
>The full output of the test-patch run is available at
>
>.  https://builds.apache.org/job/oozie-trunk-precommit-build/1377/
>
>> Incorrect job information is set while logging
>> --
>>
>>                 Key: OOZIE-1939
>>                 URL: https://issues.apache.org/jira/browse/OOZIE-1939
>>             Project: Oozie
>>          Issue Type: Bug
>>            Reporter: Purshotam Shah
>>            Assignee: Azrael
>>         Attachments: OOZIE-1939.1.patch, OOZIE-1939.2.patch
>>
>>
>> {code}
>> 2014-07-16 17:28:06,422 DEBUG CoordChangeXCommand:545
>>[http-0.0.0.0-4443-5] - USER[hadoopqa] GROUP[users] TOKEN[]
>>APP[coordB236] JOB[0011514-140716042555-oozie-oozi-C] ACTION[-] Acquired
>>lock for [0011385-140716042555-oozie-oozi-C] in [coord_change]
>> 2014-07-16 17:28:06,422 TRACE CoordChangeXCommand:548
>>[http-0.0.0.0-4443-5] - USER[hadoopqa] GROUP[users] TOKEN[]
>>APP[coordB236] JOB[0011514-140716042555-oozie-oozi-C] ACTION[-] Load
>>state for [0011385-140716042555-oozie-oozi-C]
>> {code}
>> {code}
>>     protected void loadState() throws CommandException {
>>         jpaService = Services.get().get(JPAService.class);
>>         if (jpaService == null) {
>>             LOG.error(ErrorCode.E0610);
>>         }
>>         try {
>>             coordJob =
>>CoordJobQueryExecutor.getInstance().get(CoordJobQuery.GET_COORD_JOB_MATER
>>IALIZE, jobId);
>>             prevStatus = coordJob.getStatus();
>>         }
>>         catch (JPAExecutorException jex) {
>>             throw new CommandException(jex);
>>         }
>>         // calculate start materialize and end materialize time
>>         calcMatdTime();
>>         LogUtils.setLogInfo(coordJob, logInfo);
>>     }
>> {code}
>> Most of the commands set jobinfo after loadstate, because of that few
>>log statements ( like acquiring lock, load state) logs with previous
>>jobinfo. 
>
>
>
>--
>This message was sent by Atlassian JIRA
>(v6.2#6252)

[jira] [Commented] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-08-01 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082590#comment-14082590
 ] 

Mona Chitnis commented on OOZIE-1932:
-

okay thanks. will revise the order 

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-addendum.patch, 
> OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-08-01 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: (was: OOZIE-1811-3.patch)

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch, 
> OOZIE-1811-3.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-08-01 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: OOZIE-1811-3.patch

good catch! uploaded new patch

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch, 
> OOZIE-1811-3.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-08-01 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: OOZIE-1811-3.patch

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch, 
> OOZIE-1811-3.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-08-01 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: (was: OOZIE-1811-3.patch)

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch, 
> OOZIE-1811-3.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-08-01 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: OOZIE-1811-3.patch

addressed review comments and fixed couple of classes missed in earlier patch - 
BatchQueryExecutor, SLA*QueryExecutors to be consistent

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch, 
> OOZIE-1811-3.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1939) Incorrect job information is set while logging

2014-08-01 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082317#comment-14082317
 ] 

Mona Chitnis commented on OOZIE-1939:
-

Yes, it will work with threadlocal params too. The fix was done to minimize the overall 
change: it just clears the prefix and sets it to the object the thread is handling 
now. The same will apply with threadlocal params too.
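
The ordering problem described in this issue can be sketched as follows. This is a minimal illustration, not the actual patch: the class LogPrefixSketch, its methods, and the shortened job ids are all hypothetical, and a plain ThreadLocal string stands in for Oozie's log info. It only shows why a pooled thread that logs before its prefix is reset reports the previous job.

```java
// Hypothetical sketch; none of these names come from the Oozie code base.
class LogPrefixSketch {

    // Per-thread log prefix, standing in for the job info attached to log lines.
    private static final ThreadLocal<String> PREFIX =
            ThreadLocal.withInitial(() -> "JOB[-]");

    // Replaces whatever prefix the thread carried from earlier work.
    static void setJob(String jobId) {
        PREFIX.set("JOB[" + jobId + "]");
    }

    static String format(String message) {
        return PREFIX.get() + " " + message;
    }

    // Problem order: the message is formatted before the prefix is updated,
    // so it still carries the job the thread handled previously.
    static String lateSet(String jobId) {
        String line = format("Acquired lock for [" + jobId + "]");
        setJob(jobId);
        return line;
    }

    // Fixed order: reset the prefix to the current job first, then log.
    static String earlySet(String jobId) {
        setJob(jobId);
        return format("Acquired lock for [" + jobId + "]");
    }

    public static void main(String[] args) {
        setJob("0011514-C");                      // leftover from a previous command
        System.out.println(lateSet("0011385-C")); // prefix and message disagree
        setJob("0011514-C");
        System.out.println(earlySet("0011385-C")); // prefix and message agree
    }
}
```

The same reasoning holds whether the prefix lives in a field or in a thread-local: what matters is that it is reset before the first log statement of the new command.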

> Incorrect job information is set while logging
> --
>
> Key: OOZIE-1939
> URL: https://issues.apache.org/jira/browse/OOZIE-1939
> Project: Oozie
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Azrael
> Attachments: OOZIE-1939.1.patch, OOZIE-1939.2.patch
>
>
> {code}
> 2014-07-16 17:28:06,422 DEBUG CoordChangeXCommand:545 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB236] 
> JOB[0011514-140716042555-oozie-oozi-C] ACTION[-] Acquired lock for 
> [0011385-140716042555-oozie-oozi-C] in [coord_change]
> 2014-07-16 17:28:06,422 TRACE CoordChangeXCommand:548 [http-0.0.0.0-4443-5] - 
> USER[hadoopqa] GROUP[users] TOKEN[] APP[coordB236] 
> JOB[0011514-140716042555-oozie-oozi-C] ACTION[-] Load state for 
> [0011385-140716042555-oozie-oozi-C]
> {code}
> {code}
> protected void loadState() throws CommandException {
>     jpaService = Services.get().get(JPAService.class);
>     if (jpaService == null) {
>         LOG.error(ErrorCode.E0610);
>     }
>     try {
>         coordJob = CoordJobQueryExecutor.getInstance().get(CoordJobQuery.GET_COORD_JOB_MATERIALIZE, jobId);
>         prevStatus = coordJob.getStatus();
>     }
>     catch (JPAExecutorException jex) {
>         throw new CommandException(jex);
>     }
>     // calculate start materialize and end materialize time
>     calcMatdTime();
>     LogUtils.setLogInfo(coordJob, logInfo);
> }
> {code}
> Most of the commands set jobinfo after loadstate, because of that few log 
> statements ( like acquiring lock, load state) logs with previous jobinfo. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1811) Current test failures in trunk

2014-07-31 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081820#comment-14081820
 ] 

Mona Chitnis commented on OOZIE-1811:
-

{{. -1 the patch contains 2 line(s) with trailing spaces}} located and fixed in 
the xml file - {{coord-action-sla.xml}}

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-07-31 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1932:


Attachment: OOZIE-1932-addendum.patch

attaching a simple change to initialize CallableQueueService at the very end so 
that it's destroyed first. Unit test TestBulkMonitorWebServiceAPI 
failed on my local machine, but I could not determine whether the cause is related to this 
change. I will let the pre-commit build test run decide.
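
The reasoning behind this ordering can be sketched in a few lines. This is only an illustration under stated assumptions: ServiceOrderSketch and its run method are hypothetical and are not Oozie's Services class. It assumes, as the issue description says, that services are destroyed in reverse order of initialization, so whichever service is loaded last is torn down first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the Oozie Services implementation.
class ServiceOrderSketch {

    // Returns the lifecycle events produced for a given load order.
    static List<String> run(List<String> loadOrder) {
        List<String> events = new ArrayList<>();
        List<String> started = new ArrayList<>();
        for (String name : loadOrder) {
            events.add("init " + name);
            started.add(name);
        }
        // Tear down in the reverse of the initialization order.
        for (int i = started.size() - 1; i >= 0; i--) {
            events.add("destroy " + started.get(i));
        }
        return events;
    }

    public static void main(String[] args) {
        // The queue service is listed last, so it stops before the
        // services its running commands still depend on.
        List<String> order = Arrays.asList(
                "MemoryLocksService", "URIHandlerService", "CallableQueueService");
        for (String event : run(order)) {
            System.out.println(event);
        }
    }
}
```

With the queue service loaded last, no command can still be executing when the lock service is set to null, which is the NPE scenario the issue describes.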

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932-addendum.patch, 
> OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1811) Current test failures in trunk

2014-07-29 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078158#comment-14078158
 ] 

Mona Chitnis commented on OOZIE-1811:
-

The above failures are due to a strange network error on the host. This happened before at 
https://builds.apache.org/job/oozie-trunk-precommit-build/1363/ too.

Ran the whole suite locally and only 1 test failed, which I've mentioned is going to 
be part of OOZIE-1952.

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Reopened] (OOZIE-1932) Services should load CallableQueueService after MemoryLocksService

2014-07-29 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis reopened OOZIE-1932:
-


Reopening to fix a similar issue: URIHandlerService should be loaded 
before CallableQueueService, so that CallableQueueService is closed before it. This JIRA's scope now 
includes a permanent fix to the services ordering that works for all cases and 
avoids NPEs and other issues with the services during server shutdown/startup.

> Services should load CallableQueueService after MemoryLocksService
> --
>
> Key: OOZIE-1932
> URL: https://issues.apache.org/jira/browse/OOZIE-1932
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1932-2.patch, OOZIE-1932.patch
>
>
> This is not a problem during startup but is during shutdown, as services are 
> destroyed in reverse order of initialization. Hence, when MemoryLocksService 
> destroy sets it to null, and commands are still executing due to 
> CallableQueueService still active, they all encounter NPEs during locking. 
> This is a simple fix in oozie-default.xml to set MemoryLocksService before in 
> the order of services loading.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-07-29 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: OOZIE-1811-2.patch

updated patch to apply cleanly to trunk HEAD

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch, OOZIE-1811-2.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBundleChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (OOZIE-1811) Current test failures in trunk

2014-07-28 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1811:


Attachment: OOZIE-1811-1.patch

attaching patch which fixes the QueryExecutors and TestSLAEventGeneration. 
Errors related to StoreService usage in tests can be fixed as part of overall 
StoreService fix in OOZIE-1952

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
> Attachments: OOZIE-1811-1.patch
>
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBunldeChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}





[jira] [Created] (OOZIE-1952) Cleanup duplicate/obsolete code - Command, StoreService

2014-07-28 Thread Mona Chitnis (JIRA)
Mona Chitnis created OOZIE-1952:
---

 Summary: Cleanup duplicate/obsolete code - Command, StoreService
 Key: OOZIE-1952
 URL: https://issues.apache.org/jira/browse/OOZIE-1952
 Project: Oozie
  Issue Type: Task
Reporter: Mona Chitnis


StoreService has been superseded by JPAService, and Command has been superseded 
by XCommand. These old classes have been lying around long enough and are probably 
only referenced through unit tests, creating some confusion when tests have to 
be fixed for flaky failures.





[jira] [Commented] (OOZIE-1811) Current test failures in trunk

2014-07-24 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074117#comment-14074117
 ] 

Mona Chitnis commented on OOZIE-1811:
-

I'd suggest getting rid of the static reference to JPAService in each of the 
Query Executors. We can always get the reference to it from the Services 
singleton, while executing the query. By keeping another static reference and 
manipulating it through the constructor and destroy(), we run the risk of 
nullifying it inadvertently. This is why suddenly so many tests are becoming 
flaky and it is very tough to detect exact patterns or even fix tests in a 
foolproof way. 

I ran the whole suite with the static reference removed and only 2 tests failed 
- which is quite an improvement!
{code}
Results :

Failed tests:   
testBundleId(org.apache.oozie.servlet.TestBulkMonitorWebServiceAPI): 
expected: but was:

Tests in error: 
  testSucCoordPurgeXCommand(org.apache.oozie.command.TestPurgeXCommand): E0604: 
Job does not exist [000-140724213655573-oozie-chit-C]
{code}

Test #2 here is failing with the error "StoreService cannot work without JPAService". 
We can replace usage of StoreService completely, as it is superseded by 
JPAService anyway.
Test #1 doesn't really have any error except a random assert failure, and this test 
is not usually flaky, so it can be ignored.
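
To make the risk described above concrete, here is a minimal sketch. All class names (FakeJpaService, FakeServices, CachingExecutor, LookupExecutor) are hypothetical stand-ins and not the real Oozie classes; the sketch only assumes that the query executors keep their own static copy of the service and clear it in destroy(), as the comment describes.

```java
// Hypothetical stand-ins for illustration; these are not the real Oozie classes.
final class FakeJpaService {
    String execute(String query) {
        return "ok:" + query;
    }
}

final class FakeServices {
    private static final FakeServices INSTANCE = new FakeServices();
    private final FakeJpaService jpa = new FakeJpaService();

    static FakeServices get() {
        return INSTANCE;
    }

    FakeJpaService getJpaService() {
        return jpa;
    }
}

// Pattern being criticized: an extra static copy that destroy() can null out.
final class CachingExecutor {
    private static FakeJpaService jpaService;

    CachingExecutor() {
        jpaService = FakeServices.get().getJpaService();
    }

    static void destroy() {
        jpaService = null; // affects every instance, including live ones
    }

    String run(String query) {
        return jpaService.execute(query); // NullPointerException after destroy()
    }
}

// Suggested pattern: look the service up from the singleton at query time.
final class LookupExecutor {
    String run(String query) {
        return FakeServices.get().getJpaService().execute(query);
    }
}

final class StaticReferenceDemo {
    public static void main(String[] args) {
        CachingExecutor caching = new CachingExecutor();
        CachingExecutor.destroy(); // e.g. triggered by another test's teardown
        try {
            caching.run("q");
        } catch (NullPointerException e) {
            System.out.println("cached reference was nullified");
        }
        System.out.println(new LookupExecutor().run("q"));
    }
}
```

The lookup variant has no second reference to keep in sync, so the order in which tests construct and destroy executors no longer matters in this simplified model.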


> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBunldeChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}





[jira] [Updated] (OOZIE-1944) Recursive variable resolution broken when same parameter name in config-default and action conf

2014-07-24 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1944:


Attachment: OOZIE-1944-2.patch

Adding a null check for configDefault; the missing check was causing 
TestWorkflowAppParser tests to fail.

> Recursive variable resolution broken when same parameter name in 
> config-default and action conf
> ---
>
> Key: OOZIE-1944
> URL: https://issues.apache.org/jira/browse/OOZIE-1944
> Project: Oozie
>  Issue Type: Bug
>  Components: workflow
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1944-1.patch, OOZIE-1944-2.patch
>
>
> Hitting error
> {code}
> can not create DagEngine for submitting jobs
> org.apache.oozie.DagEngineException: E0803: IO error, Variable
> substitution depth too large: 20 ${param}/000
> {code}
> when config-default.xml has
> {{param=default}}
> and action conf has
> {code}
> 
> ...
> <configuration>
> <property>
> <name>param</name>
> <value>${param}/000</value>
> </property>
> </configuration>
> {code}





[jira] [Updated] (OOZIE-1944) Recursive variable resolution broken when same parameter name in config-default and action conf

2014-07-23 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1944:


Attachment: OOZIE-1944-1.patch

Attaching patch. The approach is to switch from the 
XConfiguration.injectDefaults() method to copy(), since the former does a 
Configuration.get(), which tries to recursively resolve params. So we simply copy 
over defaults, global, and finally the action configuration, in this order of 
precedence.
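
As a rough illustration of the copy-based approach, the sketch below layers plain maps instead of real configuration objects. LayeredConf and merge() are hypothetical names, and the sketch assumes that "order of precedence" means later layers override earlier ones. The point it shows is that copying raw values never expands ${...}, so a self-referencing value cannot trigger recursive resolution at merge time.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration with plain maps; this is not XConfiguration.
final class LayeredConf {
    // Copies raw values layer by layer; later layers override earlier ones.
    // No ${...} expansion happens here, so a value such as "${param}/000"
    // is stored verbatim instead of being resolved against itself.
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... layers) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = Collections.singletonMap("param", "default");
        Map<String, String> global = Collections.emptyMap();
        Map<String, String> action = Collections.singletonMap("param", "${param}/000");
        // prints ${param}/000 (the action value wins and stays unresolved)
        System.out.println(merge(defaults, global, action).get("param"));
    }
}
```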

> Recursive variable resolution broken when same parameter name in 
> config-default and action conf
> ---
>
> Key: OOZIE-1944
> URL: https://issues.apache.org/jira/browse/OOZIE-1944
> Project: Oozie
>  Issue Type: Bug
>  Components: workflow
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
> Attachments: OOZIE-1944-1.patch
>
>
> Hitting error
> {code}
> can not create DagEngine for submitting jobs
> org.apache.oozie.DagEngineException: E0803: IO error, Variable
> substitution depth too large: 20 ${param}/000
> {code}
> when config-default.xml has
> {{param=default}}
> and action conf has
> {code}
> 
> ...
> <configuration>
> <property>
> <name>param</name>
> <value>${param}/000</value>
> </property>
> </configuration>
> {code}





[jira] [Updated] (OOZIE-1944) Recursive variable resolution broken when same parameter name in config-default and action conf

2014-07-23 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1944:


Fix Version/s: (was: trunk)

> Recursive variable resolution broken when same parameter name in 
> config-default and action conf
> ---
>
> Key: OOZIE-1944
> URL: https://issues.apache.org/jira/browse/OOZIE-1944
> Project: Oozie
>  Issue Type: Bug
>  Components: workflow
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: 4.1.0
>
>
> Hitting error
> {code}
> can not create DagEngine for submitting jobs
> org.apache.oozie.DagEngineException: E0803: IO error, Variable
> substitution depth too large: 20 ${param}/000
> {code}
> when config-default.xml has
> {{param=default}}
> and action conf has
> {code}
> 
> ...
> <configuration>
> <property>
> <name>param</name>
> <value>${param}/000</value>
> </property>
> </configuration>
> {code}





[jira] [Updated] (OOZIE-1872) TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime is failing for past couple of builds

2014-07-23 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1872:


Fix Version/s: (was: trunk)

> TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime
>  is failing for past couple of builds
> ---
>
> Key: OOZIE-1872
> URL: https://issues.apache.org/jira/browse/OOZIE-1872
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk, 4.1.0
>Reporter: Rohini Palaniswamy
> Fix For: 4.1.0
>
> Attachments: OOZIE-1872-1.patch
>
>
> https://builds.apache.org/job/oozie-trunk-precommit-build/1291/testReport/junit/org.apache.oozie.command.coord/TestCoordActionInputCheckXCommand/testActionInputCheckLatestActionCreationTime/





[jira] [Updated] (OOZIE-1872) TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime is failing for past couple of builds

2014-07-23 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1872:


Component/s: tests

> TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime
>  is failing for past couple of builds
> ---
>
> Key: OOZIE-1872
> URL: https://issues.apache.org/jira/browse/OOZIE-1872
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk, 4.1.0
>Reporter: Rohini Palaniswamy
> Fix For: trunk, 4.1.0
>
> Attachments: OOZIE-1872-1.patch
>
>
> https://builds.apache.org/job/oozie-trunk-precommit-build/1291/testReport/junit/org.apache.oozie.command.coord/TestCoordActionInputCheckXCommand/testActionInputCheckLatestActionCreationTime/





[jira] [Updated] (OOZIE-1872) TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime is failing for past couple of builds

2014-07-23 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1872:


Attachment: OOZIE-1872-1.patch

attaching fix to test case. Root causes of failure were

 * materialize command was directly queuing input-check command with zero 
delay, with no changes in action-actual-time taking effect. Hence actual time 
was not updated to desired value, instead remaining at 'current time' and 
giving wrong dependency results
 * explicitly invoked input-check command was failing precondition verification 
due to earlier command transitioning action to FAILED.
 * flakiness was due to timing issues of the direct vs explicit input-check 
commands

> TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime
>  is failing for past couple of builds
> ---
>
> Key: OOZIE-1872
> URL: https://issues.apache.org/jira/browse/OOZIE-1872
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk, 4.1.0
>Reporter: Rohini Palaniswamy
> Fix For: trunk, 4.1.0
>
> Attachments: OOZIE-1872-1.patch
>
>
> https://builds.apache.org/job/oozie-trunk-precommit-build/1291/testReport/junit/org.apache.oozie.command.coord/TestCoordActionInputCheckXCommand/testActionInputCheckLatestActionCreationTime/





[jira] [Created] (OOZIE-1945) NPE in JaveActionExecutor#check()

2014-07-22 Thread Mona Chitnis (JIRA)
Mona Chitnis created OOZIE-1945:
---

 Summary: NPE in JaveActionExecutor#check()
 Key: OOZIE-1945
 URL: https://issues.apache.org/jira/browse/OOZIE-1945
 Project: Oozie
  Issue Type: Bug
Affects Versions: trunk
Reporter: Mona Chitnis
Priority: Trivial
 Fix For: trunk, 4.1.0


in method check()
{code}
String errorCode = props.getProperty("error.code");
if (errorCode.equals("0")) {
    errorCode = "JA018";
}
if (errorCode.equals("-1")) {
    errorCode = "JA019";
}
errorReason = props.getProperty("error.reason");
{code}
If error.code is null, this leads to an NPE.
Easy fix:
{code}
if ("0".equals(errorCode))
...
{code}
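
A self-contained sketch of the null-safe comparison suggested above. ErrorCodeMapper and mapErrorCode() are hypothetical names used only for illustration; this is not the actual executor code, just the same mapping written so that a missing error.code property cannot throw.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not the actual executor code.
final class ErrorCodeMapper {
    // Maps the raw "error.code" property to the codes used in the snippet above.
    // Calling equals() on the literal avoids dereferencing a null property value.
    static String mapErrorCode(Properties props) {
        String errorCode = props.getProperty("error.code"); // may be null
        if ("0".equals(errorCode)) {
            return "JA018";
        }
        if ("-1".equals(errorCode)) {
            return "JA019";
        }
        return errorCode; // unchanged, possibly null
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(mapErrorCode(props)); // no exception when the key is absent
        props.setProperty("error.code", "0");
        System.out.println(mapErrorCode(props));
    }
}
```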





[jira] [Updated] (OOZIE-1536) Coordinator action reruns start a new workflow

2014-07-22 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1536:


Assignee: (was: Mona Chitnis)

> Coordinator action reruns start a new workflow
> --
>
> Key: OOZIE-1536
> URL: https://issues.apache.org/jira/browse/OOZIE-1536
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Srikanth Sundarrajan
>
> Coordinator action reruns start a new workflow and if existing workflow for 
> the action is in running state, the same is not checked. Coord rerun can 
> possibly do a workflow re-run to prevent this.





[jira] [Updated] (OOZIE-1944) Recursive variable resolution broken when same parameter name in config-default and action conf

2014-07-21 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1944:


Fix Version/s: 4.1.0

> Recursive variable resolution broken when same parameter name in 
> config-default and action conf
> ---
>
> Key: OOZIE-1944
> URL: https://issues.apache.org/jira/browse/OOZIE-1944
> Project: Oozie
>  Issue Type: Bug
>  Components: workflow
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk, 4.1.0
>
>
> Hitting error
> {code}
> can not create DagEngine for submitting jobs
> org.apache.oozie.DagEngineException: E0803: IO error, Variable
> substitution depth too large: 20 ${param}/000
> {code}
> when config-default.xml has
> {{param=default}}
> and action conf has
> {code}
> 
> ...
> <configuration>
> <property>
> <name>param</name>
> <value>${param}/000</value>
> </property>
> </configuration>
> {code}





[jira] [Created] (OOZIE-1944) Recursive variable resolution broken when same parameter name in config-default and action conf

2014-07-21 Thread Mona Chitnis (JIRA)
Mona Chitnis created OOZIE-1944:
---

 Summary: Recursive variable resolution broken when same parameter 
name in config-default and action conf
 Key: OOZIE-1944
 URL: https://issues.apache.org/jira/browse/OOZIE-1944
 Project: Oozie
  Issue Type: Bug
  Components: workflow
Affects Versions: trunk
Reporter: Mona Chitnis
Assignee: Mona Chitnis
 Fix For: trunk


Hitting error
{code}
can not create DagEngine for submitting jobs
org.apache.oozie.DagEngineException: E0803: IO error, Variable
substitution depth too large: 20 ${param}/000
{code}

when config-default.xml has
{{param=default}}
and action conf has
{code}

...
<configuration>
<property>
<name>param</name>
<value>${param}/000</value>
</property>
</configuration>
{code}





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-21 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: OOZIE-1933-unit-tests-fix.patch

Updated patch to apply cleanly to trunk.

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> OOZIE-1933-unit-tests-fix.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-21 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: (was: sla_unit_tests-1.patch)

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> OOZIE-1933-unit-tests-fix.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Resolved] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-21 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis resolved OOZIE-1933.
-

Resolution: Fixed

failing unit tests fix committed to trunk after review. Thanks!

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> OOZIE-1933-unit-tests-fix.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-21 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: (was: sla_unit_tests.patch)

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> OOZIE-1933-unit-tests-fix.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Commented] (OOZIE-1811) Current test failures in trunk

2014-07-18 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066771#comment-14066771
 ] 

Mona Chitnis commented on OOZIE-1811:
-

{{org.apache.oozie.command.coord.TestCoordActionInputCheckXCommand.testActionInputCheckLatestCurrentTime}}
 is also failing because JPAService is null. The test in the same class that uses the 
latest calculation with respect to action creation time (old behavior), 
{{org.apache.oozie.command.coord.TestCoordActionInputCheckXCommand.testActionInputCheckLatestActionCreationTime}},
 is however failing with a dependency mismatch problem - OOZIE-1872

> Current test failures in trunk
> --
>
> Key: OOZIE-1811
> URL: https://issues.apache.org/jira/browse/OOZIE-1811
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Mona Chitnis
>Priority: Critical
>
> There's a bunch of test failures currently in trunk; I'm not sure what 
> commit(s) is the cause, but I think it was somewhat recent.
> e.g. https://builds.apache.org/job/oozie-trunk-precommit-build/1199/
> Reproducible by running these tests, instead of having to run them all, which 
> takes a lot longer :)
> {noformat}
> mvn clean test 
> -Dtest=TestSubWorkflowActionExecutor,TestBunldeChangeXCommand,TestCoordUpdateXCommand,TestCoordJobQueryExecutor,TestStatusTransitService,TestSLAEventGeneration
> {noformat}
> {noformat}
> Results :
> Failed tests:   
> testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration):
>  expected:<...11921-oozie-rkan-C@1[]> but was:<...11921-oozie-rkan-C@1[2]>
>   
> testCoordStatusTransitServiceDoneWithError(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
>   
> testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService):
>  expected: but was:
> Tests in error: 
>   testGetList(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
>   testInsert(org.apache.oozie.executor.jpa.TestCoordJobQueryExecutor)
> Tests run: 62, Failures: 3, Errors: 2, Skipped: 0
> {noformat}





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-17 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: sla_unit_tests-1.patch

updated patch to include another broken testcase. All other failed tests pass 
locally and are known to be flaky

Test run:
{code}
Results :

Failed tests:   
testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)
  testRecovery(org.apache.oozie.action.hadoop.TestJavaActionExecutor): 
expected:<[SUCCEED]ED> but was:<[FAILED/KILL]ED>
  
testCoordStatusTransitServiceBackwardSupport(org.apache.oozie.service.TestStatusTransitService)

Tests in error: 
  testOnJobEvent(org.apache.oozie.sla.TestSLAJobEventListener): invalid child 
id [wa1]
  
testActionReuseWfJobAppPath(org.apache.oozie.command.wf.TestActionStartXCommand):
 E0607: Other error in operation [action.start], null
  testWorkflowRun(org.apache.oozie.command.wf.TestLastModified): 
org.apache.oozie.DagEngineException: E0607: Other error in operation [start], 
null
  testSucJobPurgeXCommand(org.apache.oozie.command.TestPurgeXCommand): E0604: 
Job does not exist [001-140717193440158-oozie-chit-W]
  testSucCoordPurgeXCommand(org.apache.oozie.command.TestPurgeXCommand): E0604: 
Job does not exist [000-140717193442386-oozie-chit-C]
{code}


> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> sla_unit_tests-1.patch, sla_unit_tests.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-17 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: sla_unit_tests.patch

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch, 
> sla_unit_tests.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Reopened] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-17 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis reopened OOZIE-1933:
-


adding test cases broken by the patch

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Resolved] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-17 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis resolved OOZIE-1933.
-

Resolution: Fixed

committed to trunk. thanks for review!

> SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs
> -
>
> Key: OOZIE-1933
> URL: https://issues.apache.org/jira/browse/OOZIE-1933
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>    Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
> Attachments: OOZIE-1933-3.patch, OOZIE-1933-4-1.patch
>
>
> SLACalculatorMemory.addJobStatus()
> {code}
> else {
> // jobid might not exist in slaMap in HA Setting
> SLARegistrationBean slaRegBean = 
> SLARegistrationQueryExecutor.getInstance().get(
> SLARegQuery.GET_SLA_REG_ALL, jobId);
> SLASummaryBean slaSummaryBean = 
> SLASummaryQueryExecutor.getInstance().get(SLASummaryQuery.GET_SLA_SUMMARY,
> jobId);
> slaCalc = new SLACalcStatus(slaSummaryBean, slaRegBean);
> {code}
> Because of SLA Listener, job notification event triggers this even for jobs 
> with no SLA configured - leading to NPE in the SLACalcStatus constructor and 
> annoying exception stacktraces in logs
> Patch to also include log prefix addition to some SLACalculator log line





[jira] [Commented] (OOZIE-1938) Fork-join job does not execute join node sometimes during HA failover

2014-07-16 Thread Mona Chitnis (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064458#comment-14064458
 ] 

Mona Chitnis commented on OOZIE-1938:
-

More context - all actions are completed, some via server 1 others via server 
2. 

1) Checking the SignalXCommand code against the WF_ACTIONS table for all 
actions of this job, all of them have pending=0. This probably explains why 
they weren't recovered by ActionCheckerRunnable.

2) As each forked action finishes, two signals are sent - signal value OK and 
signal value :sync:. The 'sync' is needed to maintain the fork-join count: it is 
incremented when the initial forks send signal :sync:, and decremented when the 
joins send signal :sync:. I think that because one of the servers was down at the 
time, these :sync: signals were lost or failed to get processed. We don't see this 
problem in a different scenario where both servers were up before actions 
finished and started signaling :sync:.

I am not very confident about changing the way we handle :sync:, so I would 
like to discuss the best approach here. The easier approach would be to set 
the action's pending flag in this process so that recovery will pick up the 
action and help restore the correct :sync: count.

Feedback/corrections?
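
To make the proposal above concrete, here is a simplified sketch of the idea, assuming each forked path stays flagged as pending until its :sync: signal has been processed, so that a recovery pass can replay signals that were lost. ForkJoinSyncSketch and its methods are hypothetical; this models the counting only and is not how SignalXCommand or the recovery code is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ForkJoinSyncSketch {
    // Number of forked paths whose :sync: has not been processed yet.
    private int openPaths = 0;
    // Hypothetical pending flag per forked action: true until its :sync: is processed.
    private final Map<String, Boolean> pending = new LinkedHashMap<>();

    /** Fork: one open path per action, each flagged pending. */
    public void fork(String... actions) {
        for (String action : actions) {
            openPaths++;
            pending.put(action, Boolean.TRUE);
        }
    }

    /** Process a :sync: for one action; duplicates and unknown actions are ignored. */
    public void onSync(String action) {
        if (Boolean.TRUE.equals(pending.get(action))) {
            pending.put(action, Boolean.FALSE);
            openPaths--;
        }
    }

    /** The join may run only after at least one fork and once every path has synced. */
    public boolean joinReady() {
        return !pending.isEmpty() && openPaths == 0;
    }

    /** Recovery pass: replay :sync: for actions still flagged pending. */
    public int recover() {
        List<String> stuck = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : pending.entrySet()) {
            if (entry.getValue()) {
                stuck.add(entry.getKey());
            }
        }
        for (String action : stuck) {
            onSync(action);
        }
        return stuck.size();
    }

    public static void main(String[] args) {
        ForkJoinSyncSketch wf = new ForkJoinSyncSketch();
        wf.fork("a", "b", "c");
        wf.onSync("a");
        wf.onSync("b");
        // The :sync: for "c" is assumed lost while a server was down.
        System.out.println("join ready before recovery: " + wf.joinReady());
        System.out.println("signals replayed: " + wf.recover());
        System.out.println("join ready after recovery: " + wf.joinReady());
    }
}
```

Because onSync ignores actions that are no longer pending, replaying a signal that was in fact delivered does not double-decrement the count. The real handling would need the same property.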

> Fork-join job does not execute join node sometimes during HA failover
> -
>
> Key: OOZIE-1938
> URL: https://issues.apache.org/jira/browse/OOZIE-1938
> Project: Oozie
>  Issue Type: Bug
> Components: HA
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Fix For: trunk
>
>
> Reported by [~mchiang].
> Scenario: (2 Oozie HA servers)
> 21:38:56 submit job at oozie client
> 21:41:42 shut down server1
> 21:46:52 shut down server2
> 21:47:30 start server1
> 22:15:05 start server2
> the last fork path end time is 21:52:53.
> 22:36:48 the job is still RUNNING, not moving to join node.
> Digging into the logs, the locking part seems to work fine, with forked 
> action processing distributed between the two servers when both are running 
> or when one of them is down. The open question is why even RecoveryService 
> fails to pick up the job after all the forks have completed.





[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-16 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: OOZIE-1933-4-1.patch






Re: Review Request 23524: Logging improvements (amendment to OOZIE-1911) + OOZIE-1933

2014-07-16 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23524/
---

(Updated July 16, 2014, 10:47 p.m.)


Review request for oozie.


Changes
---

addressed comments


Bugs: OOZIE-1933
https://issues.apache.org/jira/browse/OOZIE-1933


Repository: oozie-git


Description
---

See JIRA


Diffs (updated)
-

  core/src/main/java/org/apache/oozie/service/EventHandlerService.java 6c075ab 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 5349b33 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 5b30fc0 
  core/src/main/java/org/apache/oozie/util/LogUtils.java 723ac36 
  core/src/test/java/org/apache/oozie/service/TestEventHandlerService.java ffb25e7 
  core/src/test/java/org/apache/oozie/service/TestHASLAService.java 419e98b 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 438f2c2 
  core/src/test/java/org/apache/oozie/sla/TestSLAService.java 205bcd1 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 6bf0a8f 

Diff: https://reviews.apache.org/r/23524/diff/


Testing
---

Added new tests and checked that existing ones pass.


Thanks,

Mona Chitnis



[jira] [Updated] (OOZIE-1933) SLACalculatorMemory HA changes assume SLARegistrationBean exists for all jobs

2014-07-16 Thread Mona Chitnis (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mona Chitnis updated OOZIE-1933:


Attachment: OOZIE-1933-3.patch

Attaching the patch, reviewed and updated from ReviewBoard.






Re: Review Request 23524: Logging improvements (amendment to OOZIE-1911) + OOZIE-1933

2014-07-16 Thread Mona Chitnis

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23524/
---

(Updated July 16, 2014, 8:09 p.m.)


Review request for oozie.


Changes
---

Addressed review comments. Checked that all tests (new and existing) pass.


Bugs: OOZIE-1933
https://issues.apache.org/jira/browse/OOZIE-1933


Repository: oozie-git


Description
---

See JIRA


Diffs (updated)
-

  core/src/main/java/org/apache/oozie/service/EventHandlerService.java 6c075ab 
  core/src/main/java/org/apache/oozie/sla/SLACalcStatus.java 5349b33 
  core/src/main/java/org/apache/oozie/sla/SLACalculatorMemory.java 5b30fc0 
  core/src/main/java/org/apache/oozie/util/LogUtils.java 723ac36 
  core/src/test/java/org/apache/oozie/service/TestEventHandlerService.java ffb25e7 
  core/src/test/java/org/apache/oozie/sla/TestSLACalculatorMemory.java 438f2c2 
  core/src/test/java/org/apache/oozie/sla/TestSLAService.java 205bcd1 
  core/src/test/java/org/apache/oozie/test/XTestCase.java 6bf0a8f 

Diff: https://reviews.apache.org/r/23524/diff/


Testing
---

Added new tests and checked that existing ones pass.


Thanks,

Mona Chitnis


