[jira] [Commented] (OOZIE-1966) Re-format oozie codebase

2014-08-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090913#comment-14090913
 ] 

Alejandro Abdelnur commented on OOZIE-1966:
---

[~shwethags], the headers are OK. We can improve test-patch. BTW, you can run 
each one of the test-patch reports independently:

{code}
Usage: bin/test-patch 
  (--jira= | --patch=)
  (--reset-scm | --dirty-scm)
  [--tasks=]
  [--skip-tasks=]
  [--jira-cli=]
  [--jira-user=]
  [--jira-password=]
  [-D...]
  [-P...]
  [--list-tasks]
  [--verbose]

$ bin/test-patch --list-tasks

Available Tasks:

  CLEAN
  RAW_PATCH_ANALYSIS
  RAT
  JAVADOC
  COMPILE
  BACKWARDS_COMPATIBILITY
  TESTS
  DISTRO
{code}

> Re-format oozie codebase
> 
>
> Key: OOZIE-1966
> URL: https://issues.apache.org/jira/browse/OOZIE-1966
> Project: Oozie
>  Issue Type: Sub-task
>Reporter: Shwetha G S
>Assignee: Shwetha G S
> Attachments: OOZIE-1966-v2.patch, OOZIE-1966.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (OOZIE-1966) Re-format oozie codebase

2014-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090390#comment-14090390
 ] 

Alejandro Abdelnur commented on OOZIE-1966:
---

Please do not do this. It will make it very difficult to track real changes in 
commits.

> Re-format oozie codebase
> 
>
> Key: OOZIE-1966
> URL: https://issues.apache.org/jira/browse/OOZIE-1966
> Project: Oozie
>  Issue Type: Sub-task
>Reporter: Shwetha G S
>Assignee: Shwetha G S
> Attachments: OOZIE-1966.patch
>
>






[jira] [Commented] (OOZIE-1961) Remove requireJavaVersion from enforcer rules

2014-08-05 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086586#comment-14086586
 ] 

Alejandro Abdelnur commented on OOZIE-1961:
---

We don't need it anymore. BTW, I think we could move to JDK 1.7 as the minimum 
requirement for Oozie.

> Remove requireJavaVersion from enforcer rules
> -
>
> Key: OOZIE-1961
> URL: https://issues.apache.org/jira/browse/OOZIE-1961
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Lars Francke
>Priority: Minor
> Attachments: OOZIE-1961.1.patch, OOZIE-1961.2.patch
>
>
> Currently the Oozie build fails with Java 1.7 due to this enforcer rule in 
> {{pom.xml}}:
> {code:xml}
> <requireJavaVersion>
>   <version>[${javaVersion}.0,${javaVersion}.1000}]</version>
> </requireJavaVersion>
> {code}
> And {{javaVersion}} is set to {{1.6}}.
> Maybe I'm missing something but I don't see why Oozie wouldn't compile/work 
> with 1.7 or 1.8. This patch just removes this enforcer rule.





[jira] [Commented] (OOZIE-1876) use pom properties rather than specific version numbers in the pom files of hbaselibs, hcataloglibs, sharelib, etc

2014-07-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057799#comment-14057799
 ] 

Alejandro Abdelnur commented on OOZIE-1876:
---

I'm afraid this will break things nastily when consuming Oozie artifacts from a 
Maven repo as profiles don't have an effect on dependencies.

> use pom properties rather than specific version numbers in the pom files of 
> hbaselibs, hcataloglibs, sharelib, etc 
> ---
>
> Key: OOZIE-1876
> URL: https://issues.apache.org/jira/browse/OOZIE-1876
> Project: Oozie
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 4.0.1
>Reporter: John
>Assignee: Shwetha G S
> Fix For: trunk
>
> Attachments: OOZIE-1876-v2.patch, OOZIE-1876.patch
>
>
> version numbers (hbase, hive, hcatalog, sqoop, etc) are hard coded in the pom 
> files.





[jira] [Commented] (OOZIE-1917) Authentication secret should be random by default and needs to coordinate with HA

2014-07-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053906#comment-14053906
 ] 

Alejandro Abdelnur commented on OOZIE-1917:
---

I think this should be taken care of in hadoop-auth itself; I just opened 
HADOOP-10791 for it.

> Authentication secret should be random by default and needs to coordinate 
> with HA
> -
>
> Key: OOZIE-1917
> URL: https://issues.apache.org/jira/browse/OOZIE-1917
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA, security
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
>
> {{oozie.authentication.signature.secret}} is currently set to {{oozie}} by 
> default, which is a pretty poor value for this.  We should set it to be 
> random by default (i.e. blank in oozie-site/default).  
> We should also make it so that with Oozie HA, we store this value in 
> ZooKeeper so all Oozie servers can use the same secret.  This may get a 
> little tricky because hadoop-auth's AuthenticationFilter doesn't make it 
> easy/practical to change how the Signer and secret are set.  We'll likely 
> have to have Oozie's AuthFilter compute its own random secret and do all the 
> ZK stuff and set the value of {{oozie.authentication.signature.secret}} 
> before calling AuthenticationFilter#init
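
The "random by default" part of the description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Oozie or hadoop-auth code: the class `SignatureSecret`, the method `resolve`, and the 32-byte length are all hypothetical choices.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper, not part of Oozie or hadoop-auth.
class SignatureSecret {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns the configured secret, or a freshly generated random one
    // when the configured value is null or blank.
    static String resolve(String configured) {
        if (configured != null && !configured.trim().isEmpty()) {
            return configured;
        }
        byte[] bytes = new byte[32]; // 256 bits; the length is an assumption
        RANDOM.nextBytes(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

A secret generated this way is local to one JVM, so each server would end up with a different value; that is why the description wants the value shared (for example through ZooKeeper) when HA is enabled.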





[jira] [Commented] (OOZIE-1879) Workflow Rerun causes error depending on the order of forked nodes

2014-06-16 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033104#comment-14033104
 ] 

Alejandro Abdelnur commented on OOZIE-1879:
---

+1 LGTM. One minor NIT in LiteWorkflowInstance.java, {{if (numActionEndTimes >= 
0) {}} will work, but strictly speaking it should be {{>}}.

> Workflow Rerun causes error depending on the order of forked nodes
> --
>
> Key: OOZIE-1879
> URL: https://issues.apache.org/jira/browse/OOZIE-1879
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: OOZIE-1879.patch
>
>
> Suppose you have a workflow like this:
> {noformat}
> start --> fork
> fork --> shell1, shell2
> shell1 --> join
> shell2 --> join
> join --> shell3
> shell3 --> end
> {noformat}
> And all but shell3 are successful.  
> Assuming you fix the problem with shell3, if you do a rerun, the following 
> two outcomes can happen:
> # If shell1 finished before shell2, then the rerun succeeds
> # If shell2 finished before shell1, then the rerun fails
> The error in the second outcome is simply this log message:
> {noformat}
> 2014-05-29 17:17:03,735 ERROR 
> org.apache.oozie.workflow.lite.LiteWorkflowInstance: 
> SERVER[cdh5-1.cloudera.local] USER[pdvorak] GROUP[-] TOKEN[] 
> APP[test-rerun-wf] JOB[004-140521220856264-oozie-oozi-W] 
> ACTION[004-140521220856264-oozie-oozi-W@join] invalid execution path 
> [/shell1/]
> {noformat}
> After a bunch of digging, I discovered that during a rerun with the above 
> workflow or similar workflows, LiteWorkflowInstance#signal gets called for 
> each action in the fork node in the order that they are listed in the fork 
> node's XML; however, during the original run, LiteWorkflowInstance#signal 
> gets called for each action in the order that they complete (i.e. endTime).  
> When these don't match, you get the above error.  The general fix for this is 
> therefore to ensure that during a rerun, LiteWorkflowInstance#signal gets 
> called for each action in the fork node in the order that they originally ran 
> in.  And if you think about it, that is more correct than the current 
> behavior anyway.  
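
The general fix described above (signal the forked actions in the order they originally finished rather than in XML order) might look something like this sketch. The `Action` class and `signalOrder` method are hypothetical stand-ins for illustration; they are not the types or the logic used in the actual patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not code from OOZIE-1879.patch.
class ForkRerunOrder {
    // Minimal stand-in for a workflow action: just a name and an end time.
    static final class Action {
        final String name;
        final long endTimeMillis;

        Action(String name, long endTimeMillis) {
            this.name = name;
            this.endTimeMillis = endTimeMillis;
        }
    }

    // Takes the fork's actions in XML order and returns their names in the
    // order they originally completed (ascending end time). List.sort is
    // stable, so actions with equal end times keep their XML order.
    static List<String> signalOrder(List<Action> actionsInXmlOrder) {
        List<Action> sorted = new ArrayList<>(actionsInXmlOrder);
        sorted.sort(Comparator.comparingLong((Action a) -> a.endTimeMillis));
        List<String> names = new ArrayList<>();
        for (Action a : sorted) {
            names.add(a.name);
        }
        return names;
    }
}
```

With the example workflow, if shell2 finished before shell1, this ordering puts shell2 first even though the fork lists shell1 first.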





[jira] [Commented] (OOZIE-1877) Setting to fail oozie server startup in case of sharelib misconfiguration

2014-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030934#comment-14030934
 ] 

Alejandro Abdelnur commented on OOZIE-1877:
---

IMO, Oozie should keep spinning in safe mode if HDFS is not available; it should 
not die.

> Setting to fail oozie server startup in case of sharelib misconfiguration
> -
>
> Key: OOZIE-1877
> URL: https://issues.apache.org/jira/browse/OOZIE-1877
> Project: Oozie
>  Issue Type: Sub-task
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
> Attachments: OOZIE-1877-V1.patch
>
>
> "OOZIE-1584 Setup sharelib using script and pickup latest(honor 
> ship.launcher) and remove DFS dependency at startup" has removed sharelib 
> dependency at startup.
> If DFS is down or sharelib is misconfigured, the server will start without 
> loading sharelib; an admin can then issue the sharelibupdate command to load it.
> This is good, but may not be acceptable in production. If sharelib is 
> misconfigured, the Oozie server will come up without loading sharelib and all 
> submitted Hadoop jobs will fail.
> It would be better to have a property to shut down the Oozie server if sharelib 
> initialization fails.





[jira] [Commented] (OOZIE-1879) Workflow Rerun causes error depending on the order of forked nodes

2014-06-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028856#comment-14028856
 ] 

Alejandro Abdelnur commented on OOZIE-1879:
---

Oofff, this is an obscure one, nice digging. 

Do we have to serialize the actions' end times in the WF? I'm worried about 
backwards compatibility with WFs started before the upgrade to a release with 
this patch. Or does the serialization/deserialization handle that? 

> Workflow Rerun causes error depending on the order of forked nodes
> --
>
> Key: OOZIE-1879
> URL: https://issues.apache.org/jira/browse/OOZIE-1879
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: OOZIE-1879.patch
>
>
> Suppose you have a workflow like this:
> {noformat}
> start --> fork
> fork --> shell1, shell2
> shell1 --> join
> shell2 --> join
> join --> shell3
> shell3 --> end
> {noformat}
> And all but shell3 are successful.  
> Assuming you fix the problem with shell3, if you do a rerun, the following 
> two outcomes can happen:
> # If shell1 finished before shell2, then the rerun succeeds
> # If shell2 finished before shell1, then the rerun fails
> The error in the second outcome is simply this log message:
> {noformat}
> 2014-05-29 17:17:03,735 ERROR 
> org.apache.oozie.workflow.lite.LiteWorkflowInstance: 
> SERVER[cdh5-1.cloudera.local] USER[pdvorak] GROUP[-] TOKEN[] 
> APP[test-rerun-wf] JOB[004-140521220856264-oozie-oozi-W] 
> ACTION[004-140521220856264-oozie-oozi-W@join] invalid execution path 
> [/shell1/]
> {noformat}
> After a bunch of digging, I discovered that during a rerun with the above 
> workflow or similar workflows, LiteWorkflowInstance#signal gets called for 
> each action in the fork node in the order that they are listed in the fork 
> node's XML; however, during the original run, LiteWorkflowInstance#signal 
> gets called for each action in the order that they complete (i.e. endTime).  
> When these don't match, you get the above error.  The general fix for this is 
> therefore to ensure that during a rerun, LiteWorkflowInstance#signal gets 
> called for each action in the fork node in the order that they originally ran 
> in.  And if you think about it, that is more correct than the current 
> behavior anyway.  





[jira] [Commented] (OOZIE-1388) Add a admin servlet to show thread stack trace and CPU usage per thread

2014-06-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028501#comment-14028501
 ] 

Alejandro Abdelnur commented on OOZIE-1388:
---

That is OK. Now, is this servlet protected so only 'admins' can do that? It 
should be.

> Add a admin servlet to show thread stack trace and CPU usage per thread
> ---
>
> Key: OOZIE-1388
> URL: https://issues.apache.org/jira/browse/OOZIE-1388
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: trunk
>
> Attachments: JVM_Info_Snapshot.html, OOZIE-1388-1.patch, 
> OOZIE-1388-2.patch, OOZIE-1388-3.patch, OOZIE-1388-4.patch
>
>
> ThreadMXBean can be used to display the stack trace and also CPU usage per 
> thread. The servlet will be very useful for debugging.
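
For reference, the kind of data ThreadMXBean exposes can be gathered along these lines. This is a minimal standalone sketch, not the servlet from the attached patches: the class `ThreadDump`, the method `report`, and the output layout are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical illustration, not the servlet from the OOZIE-1388 patches.
class ThreadDump {
    // Builds a plain-text report: one header line per live thread
    // (name, id, state, CPU time in ms) followed by up to maxDepth stack frames.
    static String report(int maxDepth) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // CPU time is optional in the JVM spec, so check before asking for it.
        boolean cpu = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        StringBuilder sb = new StringBuilder();
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id, maxDepth);
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            long cpuNanos = cpu ? mx.getThreadCpuTime(id) : -1L; // -1 means unavailable
            sb.append('"').append(info.getThreadName()).append("\" id=").append(id)
              .append(" state=").append(info.getThreadState())
              .append(" cpuMs=").append(cpuNanos < 0 ? -1 : cpuNanos / 1_000_000)
              .append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report(20));
    }
}
```

Because a report like this reveals internal class names and activity, it is the sort of output that would sit behind an admin-only check.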





[jira] [Commented] (OOZIE-1388) Add a admin servlet to show thread stack trace and CPU usage per thread

2014-06-11 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027909#comment-14027909
 ] 

Alejandro Abdelnur commented on OOZIE-1388:
---

Codahale Metrics has some servlets that provide similar info; please check 
whether they satisfy fully or partially what this JIRA is trying to do.

> Add a admin servlet to show thread stack trace and CPU usage per thread
> ---
>
> Key: OOZIE-1388
> URL: https://issues.apache.org/jira/browse/OOZIE-1388
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: trunk
>
> Attachments: JVM_Info_Snapshot.html, OOZIE-1388-1.patch, 
> OOZIE-1388-2.patch, OOZIE-1388-3.patch
>
>
> ThreadMXBean can be used to display the stack trace and also CPU usage per 
> thread. The servlet will be very useful for debugging.





[jira] [Commented] (OOZIE-1608) Update Curator to 2.4.0 when its available to fix security hole

2014-02-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13896974#comment-13896974
 ] 

Alejandro Abdelnur commented on OOZIE-1608:
---

+1

> Update Curator to 2.4.0 when its available to fix security hole
> ---
>
> Key: OOZIE-1608
> URL: https://issues.apache.org/jira/browse/OOZIE-1608
> Project: Oozie
>  Issue Type: Bug
>  Components: HA, security
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Attachments: OOZIE-1608.patch
>
>
> As I discovered when working on OOZIE-1491, there is a Curator bug 
> (CURATOR-58) without which the ZooKeeper locks will always have world ACLs 
> even with Kerberos enabled.  This could allow a malicious user to acquire one 
> of the locks and never release it, thus preventing Oozie from continuing to 
> process the job associated with that lock.  
> I've verified that CURATOR-58 fixes the problem, and the locks have the 
> correct "sasl" ACLs, but it won't be available until Curator 2.4.0 is 
> released.  We should make sure to update to Curator 2.4.0 as soon as possible 
> to fix this security hole.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (OOZIE-1651) Oozie should mask the signature secret in the configuration output

2014-01-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860627#comment-13860627
 ] 

Alejandro Abdelnur commented on OOZIE-1651:
---

ok, +1.

> Oozie should mask the signature secret in the configuration output
> --
>
> Key: OOZIE-1651
> URL: https://issues.apache.org/jira/browse/OOZIE-1651
> Project: Oozie
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: OOZIE-1651.patch, OOZIE-1651.patch, OOZIE-1651.patch, 
> OOZIE-1651.patch
>
>
> The value of {{oozie.authentication.signature.secret}} is the secret that's 
> used to sign the cookies/tokens created by Oozie for authentication after 
> Kerberos.  If a malicious user were to find out this secret, they could forge 
> counterfeit cookies/tokens as any user with any expiration date.  
> Oozie exposes the configuration properties via its REST API.  It currently 
> only masks properties that end with ".password" (e.g. 
> {{oozie.service.JPAService.jdbc.password}}).  We should expand this to also 
> mask the signature secret.  
> In fact, it would be useful to generalize this by adding a property that 
> lets the user configure what gets masked.





[jira] [Commented] (OOZIE-1651) Oozie should mask the signature secret in the configuration output

2014-01-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860613#comment-13860613
 ] 

Alejandro Abdelnur commented on OOZIE-1651:
---

Sorry, I meant a set with the postfixes (.secret & .password) to mask, not 
hardcoded properties.
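
The suffix-set idea could be sketched roughly like this. It is an illustration only, not the code in the attached patch: the class `ConfigMasker`, its methods, and the `*****` placeholder are hypothetical; only the two postfixes come from the comment above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Oozie.
class ConfigMasker {
    // Hardcoded set of postfixes whose values must never be shown.
    private static final Set<String> MASKED_POSTFIXES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(".secret", ".password")));

    private static final String MASK = "*****"; // assumed placeholder text

    // True if the property name ends with any of the masked postfixes.
    static boolean isSensitive(String name) {
        for (String postfix : MASKED_POSTFIXES) {
            if (name.endsWith(postfix)) {
                return true;
            }
        }
        return false;
    }

    // Returns a copy of the properties with sensitive values replaced by MASK;
    // the input map is left untouched.
    static Map<String, String> mask(Map<String, String> props) {
        Map<String, String> masked = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            masked.put(e.getKey(), isSensitive(e.getKey()) ? MASK : e.getValue());
        }
        return masked;
    }
}
```

Matching on the postfix rather than on full property names means a newly added `*.secret` or `*.password` property would be masked without further changes.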

> Oozie should mask the signature secret in the configuration output
> --
>
> Key: OOZIE-1651
> URL: https://issues.apache.org/jira/browse/OOZIE-1651
> Project: Oozie
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: OOZIE-1651.patch, OOZIE-1651.patch, OOZIE-1651.patch, 
> OOZIE-1651.patch
>
>
> The value of {{oozie.authentication.signature.secret}} is the secret that's 
> used to sign the cookies/tokens created by Oozie for authentication after 
> Kerberos.  If a malicious user were to find out this secret, they could forge 
> counterfeit cookies/tokens as any user with any expiration date.  
> Oozie exposes the configuration properties via its REST API.  It currently 
> only masks properties that end with ".password" (e.g. 
> {{oozie.service.JPAService.jdbc.password}}).  We should expand this to also 
> mask the signature secret.  
> In fact, it would be useful to generalize this by adding a property that 
> lets the user configure what gets masked.





[jira] [Commented] (OOZIE-1651) Oozie should mask the signature secret in the configuration output

2014-01-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860402#comment-13860402
 ] 

Alejandro Abdelnur commented on OOZIE-1651:
---

Making the masking postfix configurable seems overkill; I would have them 
hardcoded.

> Oozie should mask the signature secret in the configuration output
> --
>
> Key: OOZIE-1651
> URL: https://issues.apache.org/jira/browse/OOZIE-1651
> Project: Oozie
>  Issue Type: Bug
>  Components: security
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
> Attachments: OOZIE-1651.patch, OOZIE-1651.patch, OOZIE-1651.patch
>
>
> The value of {{oozie.authentication.signature.secret}} is the secret that's 
> used to sign the cookies/tokens created by Oozie for authentication after 
> Kerberos.  If a malicious user were to find out this secret, they could forge 
> counterfeit cookies/tokens as any user with any expiration date.  
> Oozie exposes the configuration properties via its REST API.  It currently 
> only masks properties that end with ".password" (e.g. 
> {{oozie.service.JPAService.jdbc.password}}).  We should expand this to also 
> mask the signature secret.  
> In fact, it would be useful to generalize this by adding a property that 
> lets the user configure what gets masked.





[jira] [Commented] (OOZIE-1643) Oozie doesn't parse Hadoop Job Id from the Hive action

2014-01-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860403#comment-13860403
 ] 

Alejandro Abdelnur commented on OOZIE-1643:
---

+1

> Oozie doesn't parse Hadoop Job Id from the Hive action
> --
>
> Key: OOZIE-1643
> URL: https://issues.apache.org/jira/browse/OOZIE-1643
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1643.patch
>
>
> I'm not sure how long this has been going on (possibly for quite a while), 
> but the Hive action isn't able to parse the Hadoop Job Ids launched by Hive.  
> The way it's supposed to work is that {{HiveMain}} creates a 
> {{hive-log4j.properties}} file which redirects the output from {{HiveCLI}} to 
> the console (for easy viewing in the launcher), and creates a 
> {{hive-exec-log4j.properties}} to redirect the output from one of the 
> {{hive-exec}} classes to a log file; Oozie would then parse that log file for 
> the Hadoop Job Ids.  
> What's happening instead is that {{HiveCLI}} is picking up a 
> {{hive-log4j.properties}} file from {{hive-common.jar}}.  This 
> makes it log everything to {{stderr}}, so Oozie can't parse the Hadoop 
> Job Ids.
> {noformat:title=stdout}
> ...
> <<< Invocation of Hive command completed <<<
>  Hadoop Job IDs executed by Hive: 
> <<< Invocation of Main class completed <<<
> Oozie Launcher, capturing output data:
> ===
> #
> #Mon Dec 16 16:01:34 PST 2013
> hadoopJobs=
> ===
> {noformat}
> {noformat:title=stderr}
> Picked up _JAVA_OPTIONS: -Djava.awt.headless=true
> 2013-12-16 16:01:20.884 java[59363:1703] Unable to load realm info from 
> SCDynamicStore
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
> org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> Logging initialized using configuration in 
> jar:file:/Users/rkanter/dev/hadoop-1.2.0/dirs/mapred/taskTracker/distcache/-4202506229388278450_-1489127056_2111515407/localhost/user/rkanter/share/lib/lib_20131216160106/hive/hive-common-0.10.0.jar!/hive-log4j.properties
> Hive history file=/tmp/rkanter/hive_job_log_rkanter_201312161601_851054619.txt
> OK
> Time taken: 5.444 seconds
> Total MapReduce jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201312161418_0008, Tracking URL = 
> http://localhost:50030/jobdetails.jsp?jobid=job_201312161418_0008
> Kill Command = /Users/rkanter/dev/hadoop-1.2.0/libexec/../bin/hadoop job  
> -kill job_201312161418_0008
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2013-12-16 16:01:33,409 Stage-1 map = 0%,  reduce = 0%
> 2013-12-16 16:01:34,415 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201312161418_0008
> Ended Job = 1084818925, job is filtered out (removed at runtime).
> Ended Job = -956386500, job is filtered out (removed at runtime).
> Moving data to: 
> hdfs://localhost:8020/tmp/hive-rkanter/hive_2013-12-16_16-01-28_168_4802779111653057155/-ext-1
> Moving data to: /user/rkanter/examples/output-data/hive
> MapReduce Jobs Launched: 
> Job 0:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 6.284 seconds
> Log file: 
> /Users/rkanter/dev/hadoop-1.2.0/dirs/mapred/taskTracker/rkanter/jobcache/job_201312161418_0007/attempt_201312161418_0007_m_00_0/work/hive-oozie-job_201312161418_0007.log
>   not present. Therefore no Hadoop jobids found
> {noformat}
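
To make the log-parsing step concrete, here is a rough sketch of pulling job ids out of Hive output like the stderr sample above. The regular expression and the class `HadoopJobIdParser` are assumptions for illustration, based only on the "Starting Job = job_..." line in that sample; this is not the pattern or code Oozie itself uses.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not Oozie's actual parsing code.
class HadoopJobIdParser {
    // Assumed pattern: anchored on the "Starting Job = " prefix so that job ids
    // appearing elsewhere (e.g. the launcher's own id in a file path) are ignored.
    private static final Pattern STARTING_JOB =
            Pattern.compile("Starting Job = (job_\\d+_\\d+)");

    // Returns the distinct job ids found in the log text, in order of appearance.
    static Set<String> parse(String log) {
        Set<String> ids = new LinkedHashSet<>();
        Matcher m = STARTING_JOB.matcher(log);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }
}
```

Whatever the exact pattern, parsing only works if the Hive output actually reaches the file or stream being read, which is the problem this issue describes.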





[jira] [Commented] (OOZIE-1655) Change oozie.service.JPAService.validate.db.connection to true

2014-01-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860401#comment-13860401
 ] 

Alejandro Abdelnur commented on OOZIE-1655:
---

+1

> Change oozie.service.JPAService.validate.db.connection to true
> --
>
> Key: OOZIE-1655
> URL: https://issues.apache.org/jira/browse/OOZIE-1655
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1655.patch, OOZIE-1655.patch
>
>
> We've seen many database-related issues solved by simply setting 
> {{oozie.service.JPAService.validate.db.connection}} to {{true}} (default is 
> {{false}}).  My understanding of this property is that it makes sure that the 
> database is properly connected, which is helpful when the connection isn't so 
> good; and it only adds a minor overhead.  
> It would be useful to change the default for this property to {{true}}.
> While we're at it, we should change 
> {{oozie.service.JPAService.create.db.schema}} in oozie-default to {{false}} 
> because it's {{false}} in oozie-site and it's a little confusing that they're 
> not the same.





[jira] [Commented] (OOZIE-1633) Test failures related to sharelib when running against Hadoop 2

2013-12-04 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839572#comment-13839572
 ] 

Alejandro Abdelnur commented on OOZIE-1633:
---

+1 pending jenkins

> Test failures related to sharelib when running against Hadoop 2
> ---
>
> Key: OOZIE-1633
> URL: https://issues.apache.org/jira/browse/OOZIE-1633
> Project: Oozie
>  Issue Type: Sub-task
>  Components: tests
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1633.patch
>
>
> When run against Hadoop 2, {{TestJavaActionExecutor}} has this test failure:
> {noformat}
> testAddShareLibSchemeAndAuthority(org.apache.oozie.action.hadoop.TestJavaActionExecutor)
>   Time elapsed: 0.002 sec  <<< ERROR!
> org.apache.oozie.action.ActionExecutorException: File /user/rkanter/share 
> does not exist.
>   at 
> org.apache.oozie.action.hadoop.JavaActionExecutor.addShareLib(JavaActionExecutor.java:521)
>   at 
> org.apache.oozie.action.hadoop.TestJavaActionExecutor.testAddShareLibSchemeAndAuthority(TestJavaActionExecutor.java:1257)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at junit.framework.TestCase.runTest(TestCase.java:168)
>   at junit.framework.TestCase.runBare(TestCase.java:134)
>   at junit.framework.TestResult$1.protect(TestResult.java:110)
>   at junit.framework.TestResult.runProtected(TestResult.java:128)
>   at junit.framework.TestResult.run(TestResult.java:113)
>   at junit.framework.TestCase.run(TestCase.java:124)
>   at junit.framework.TestSuite.runTest(TestSuite.java:243)
>   at junit.framework.TestSuite.run(TestSuite.java:238)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
>   at 
> org.apache.maven.surefire.junitcore.ClassDemarcatingRunner.run(ClassDemarcatingRunner.java:58)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:695)
> {noformat}
> And {{TestShareLibService}} has these three failures:
> {noformat}
> testAddShareLib_pig(org.apache.oozie.service.TestShareLibService)  Time 
> elapsed: 0.004 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: expected:<2> but was:<4>
>   at junit.framework.Assert.fail(Assert.java:50)
>   at junit.framework.Assert.failNotEquals(Assert.java:287)
>   at junit.framework.Assert.assertEquals(Assert.java:67)
>   at junit.framework.Assert.assertEquals(Assert.java:199)
>   at junit.framework.Assert.assertEquals(Assert.java:205)
>   at 
> org.apache.oozie.service.TestShareLibService.testAddShareLib_pig(TestShareLibService.java:176)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at junit.framework.TestCase.runTest(TestCase.java:168)
>   at junit.framework.TestCase.runBare(TestCase.java:134)
>   at junit.framework.TestResult$1.protect(TestResult.java:110)
>   at junit.framework.TestResult.runProtected(TestResult.java:128)
>   at junit.framework.TestResult.run(TestResult.java:113)
>   at junit.framework.TestCase.run(TestCase.java:124)
>   at junit.framework.TestSuite.runTest(TestSuite.java:243)
>   at junit.framework.TestSuite.run(TestSuite.java:238)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
>   at 
> org.apache.maven.surefire.junitcore.ClassDemarcatingRunner.run(ClassDemarcatingRunner.java:58)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Execu

[jira] [Commented] (OOZIE-1631) Tools module should have a direct dependency on mockito

2013-12-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838403#comment-13838403
 ] 

Alejandro Abdelnur commented on OOZIE-1631:
---

+1

> Tools module should have a direct dependency on mockito
> ---
>
> Key: OOZIE-1631
> URL: https://issues.apache.org/jira/browse/OOZIE-1631
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Minor
> Attachments: OOZIE-1631.patch
>
>
> Mockito is used by some of the tests in the tools module; however, it is only 
> included because it's a test dependency of hive-serde, which is included with 
> the hcatalog libs.  We should make it a direct dependency in case hcat 
> removes mockito, because then Oozie won't compile anymore.
> dependency tree for reference:
> {noformat}
> [INFO] org.apache.oozie:oozie-tools:jar:4.1.0-SNAPSHOT
> [INFO] +- org.apache.derby:derby:jar:10.6.1.0:compile
> [INFO] +- org.apache.oozie:oozie-hcatalog:jar:0.5.0.oozie-4.1.0-SNAPSHOT:test 
> (scope not updated to compile)
> ...
> [INFO] |  +- org.apache.hive:hive-serde:jar:0.10.0:test
> [INFO] |  |  +- org.mockito:mockito-all:jar:1.8.5:test (version managed from 
> 1.8.2)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (OOZIE-1491) Make sure HA works with a secure ZooKeeper

2013-12-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837091#comment-13837091
 ] 

Alejandro Abdelnur commented on OOZIE-1491:
---

LGTM +1

> Make sure HA works with a secure ZooKeeper
> --
>
> Key: OOZIE-1491
> URL: https://issues.apache.org/jira/browse/OOZIE-1491
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1491.patch, OOZIE-1491.patch, OOZIE-1491.patch, 
> OOZIE-1491.patch, OOZIE-1491.patch
>
>
> We need to make sure that HA works with a secure ZooKeeper.  This includes 
> the SASL ACL setting that will prevent someone else from deleting the oozie 
> znodes.





[jira] [Assigned] (OOZIE-1552) Bring Windows shell script functionality and structure in line with trunk

2013-12-02 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned OOZIE-1552:
-

Assignee: Ostap  (was: David Wannemacher)

done

> Bring Windows shell script functionality and structure in line with trunk
> -
>
> Key: OOZIE-1552
> URL: https://issues.apache.org/jira/browse/OOZIE-1552
> Project: Oozie
>  Issue Type: Sub-task
>  Components: core, tools
>Affects Versions: trunk
>Reporter: David Wannemacher
>Assignee: Ostap
> Fix For: trunk
>
>
> OOZIE-1523 implemented the shell scripts for windows at roughly a 3.2.0 
> version level. There are some additional options and restructuring that need 
> to be implemented to bring the shell scripts in line with trunk.





[jira] [Commented] (OOZIE-1575) Add functionality to submit sqoop jobs through http on oozie server side

2013-11-25 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832157#comment-13832157
 ] 

Alejandro Abdelnur commented on OOZIE-1575:
---

The {{oozie}} script does the following already:
{code}
${JAVA_BIN} ${OOZIE_CLIENT_OPTS} -cp ${OOZIECPPATH} org.apache.oozie.cli.OozieCLI "${@}"
{code}
This means that if you run {{oozie sqoop ... -query "select * from T"}}, 
{{"select * from T"}} will be passed to java as a single argument. So nothing 
needs to be done here other than documenting that the query must be enclosed in 
double quotes.

Aren't you seeing that behavior?

> Add functionality to submit sqoop jobs through http on oozie server side
> 
>
> Key: OOZIE-1575
> URL: https://issues.apache.org/jira/browse/OOZIE-1575
> Project: Oozie
>  Issue Type: Sub-task
>  Components: client
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Fix For: trunk
>
> Attachments: oozie-1575.patch, oozie-1575.patch, oozie-1575.patch
>
>






[jira] [Updated] (OOZIE-1575) Add functionality to submit sqoop jobs through http on oozie server side

2013-11-25 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1575:
--

Summary: Add functionality to submit sqoop jobs through http on oozie 
server side  (was: Add functionality to summit sqoop jobs through http on oozie 
server side)

> Add functionality to submit sqoop jobs through http on oozie server side
> 
>
> Key: OOZIE-1575
> URL: https://issues.apache.org/jira/browse/OOZIE-1575
> Project: Oozie
>  Issue Type: Sub-task
>  Components: client
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Fix For: trunk
>
> Attachments: oozie-1575.patch, oozie-1575.patch, oozie-1575.patch
>
>






[jira] [Updated] (OOZIE-1575) Add functionality to summit sqoop jobs through http on oozie server side

2013-11-25 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1575:
--

Summary: Add functionality to summit sqoop jobs through http on oozie 
server side  (was: Add functionality to sumbit sqoop job through http on oozie 
server side)

> Add functionality to summit sqoop jobs through http on oozie server side
> 
>
> Key: OOZIE-1575
> URL: https://issues.apache.org/jira/browse/OOZIE-1575
> Project: Oozie
>  Issue Type: Sub-task
>  Components: client
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Fix For: trunk
>
> Attachments: oozie-1575.patch, oozie-1575.patch, oozie-1575.patch
>
>






[jira] [Commented] (OOZIE-1550) Create a safeguard to kill errant recursive workflows before they bring down oozie

2013-11-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822831#comment-13822831
 ] 

Alejandro Abdelnur commented on OOZIE-1550:
---

+1 after jenkins

> Create a safeguard to kill errant recursive workflows before they bring down 
> oozie
> --
>
> Key: OOZIE-1550
> URL: https://issues.apache.org/jira/browse/OOZIE-1550
> Project: Oozie
>  Issue Type: Improvement
>  Components: workflow
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Justice
>Assignee: Robert Kanter
>  Labels: features
> Attachments: OOZIE-1550.patch, OOZIE-1550.patch
>
>
> If a user creates an errant workflow with a sub-workflow that calls the 
> workflow again, without a proper decision node to exit the workflow, it will 
> continue to create numerous jobs until the oozie server is saturated.  A user 
> recently had 400,000 running jobs and oozie was non-responsive.  I would 
> suggest we have some method of preventing a user from taking out oozie, such 
> as a max running jobs parameter.





[jira] [Commented] (OOZIE-1550) Create a safeguard to kill errant recursive workflows before they bring down oozie

2013-11-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821783#comment-13821783
 ] 

Alejandro Abdelnur commented on OOZIE-1550:
---

Looks good, +1 after the following comments are addressed and jenkins passes 
again:

* The method {{injectSubworkflowDepth}} should be renamed to 
{{verifyAndInjectSubworkflowDepth}}
* The default depth of 10 seems too conservative and may break complex 
apps; I'd use something like 50
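
For illustration only, a minimal sketch of what a verify-then-inject depth check could look like. This is not the code from the patch: the property key, the class name, and the use of plain maps in place of the job configuration are all assumptions made for the example; only the method name and the suggested limit of 50 come from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

final class SubworkflowDepthCheck {
    // Hypothetical property key, used only in this sketch.
    static final String DEPTH_KEY = "example.subworkflow.depth";
    // The review comment suggests "something like 50" as the default.
    static final int DEFAULT_MAX_DEPTH = 50;

    /**
     * Verifies that the parent has not reached the limit, then records
     * (parent depth + 1) in the child configuration and returns it.
     */
    static int verifyAndInjectSubworkflowDepth(Map<String, String> parentConf,
                                               Map<String, String> childConf,
                                               int maxDepth) {
        int parentDepth = Integer.parseInt(parentConf.getOrDefault(DEPTH_KEY, "0"));
        if (parentDepth >= maxDepth) {
            throw new IllegalStateException(
                    "Sub-workflow depth limit of " + maxDepth + " reached");
        }
        int childDepth = parentDepth + 1;
        childConf.put(DEPTH_KEY, Integer.toString(childDepth));
        return childDepth;
    }

    public static void main(String[] args) {
        // Simulate a workflow that keeps launching itself as a sub-workflow.
        Map<String, String> conf = new HashMap<>();
        try {
            while (true) {
                Map<String, String> child = new HashMap<>();
                verifyAndInjectSubworkflowDepth(conf, child, DEFAULT_MAX_DEPTH);
                conf = child;
            }
        } catch (IllegalStateException e) {
            System.out.println("Stopped at depth " + conf.get(DEPTH_KEY) + ": " + e.getMessage());
        }
    }
}
```

Doing the check and the injection in one method is what motivates the longer name: a caller cannot record a new depth without also being subject to the limit.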


> Create a safeguard to kill errant recursive workflows before they bring down 
> oozie
> --
>
> Key: OOZIE-1550
> URL: https://issues.apache.org/jira/browse/OOZIE-1550
> Project: Oozie
>  Issue Type: Improvement
>  Components: workflow
>Affects Versions: 3.3.2, 4.0.0
>Reporter: Robert Justice
>Assignee: Robert Kanter
>  Labels: features
> Attachments: OOZIE-1550.patch
>
>
> If a user creates an errant workflow with a sub-workflow that calls the 
> workflow again, without a proper decision node to exit the workflow, it will 
> continue to create numerous jobs until the oozie server is saturated.  A user 
> recently had 400,000 running jobs and oozie was non-responsive.  I would 
> suggest we have some method of preventing a user from taking out oozie, such 
> as a max running jobs parameter.





[jira] [Commented] (OOZIE-1606) Update Curator to 2.3.0 and fix some misc minor ZK related things

2013-11-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821779#comment-13821779
 ] 

Alejandro Abdelnur commented on OOZIE-1606:
---

+1

> Update Curator to 2.3.0 and fix some misc minor ZK related things
> -
>
> Key: OOZIE-1606
> URL: https://issues.apache.org/jira/browse/OOZIE-1606
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1606.patch
>
>
> Curator 2.3.0 has recently been released and while I don't think there's 
> anything specific in there that affects us, we may as well upgrade it.
> There are also some miscellaneous minor ZK-related things that I've noticed 
> while working on other JIRAs that I want to fix; they don't really deserve 
> their own JIRA, so this seems like a good opportunity.





[jira] [Commented] (OOZIE-1565) OOZIE-1481 should only affect v2 of the API, not v1

2013-10-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806996#comment-13806996
 ] 

Alejandro Abdelnur commented on OOZIE-1565:
---

+1

> OOZIE-1481 should only affect v2 of the API, not v1
> ---
>
> Key: OOZIE-1565
> URL: https://issues.apache.org/jira/browse/OOZIE-1565
> Project: Oozie
>  Issue Type: Bug
>  Components: coordinator
>Affects Versions: trunk, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: trunk, 4.0.1
>
> Attachments: OOZIE-1565.patch
>
>
> OOZIE-1481 changed the behavior of the v1 API such that when getting coord 
> info, specifying {{len=0}} now returns 0 actions instead of all actions.  
> Also, on the REST call, not specifying any {{len}} parameter is interpreted 
> by the Oozie server as {{len=0}}.  
> This is a logically backwards incompatible change.  We should keep this 
> change in the v2 API, but change the v1 API back to the original (incorrect) 
> behavior.  
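
As a rough illustration of the version-dependent behavior described above, the sketch below resolves how many actions to return for a given {{len}}. The class and method names are hypothetical, and treating every API version below 2 like v1 is an assumption of this example, not something stated in the issue.

```java
final class CoordActionLength {
    /**
     * Returns how many actions to include in the coordinator info.
     * Assumption for this sketch: versions below 2 keep the original
     * behavior, where len=0 means "all actions".
     */
    static int effectiveLength(int apiVersion, int len, int totalActions) {
        if (apiVersion < 2 && len == 0) {
            return totalActions; // original v1 behavior: 0 means everything
        }
        // v2 behavior: len is taken literally, capped at what exists.
        return Math.max(0, Math.min(len, totalActions));
    }

    public static void main(String[] args) {
        System.out.println("v1, len=0: " + effectiveLength(1, 0, 7));
        System.out.println("v2, len=0: " + effectiveLength(2, 0, 7));
        System.out.println("v2, len=3: " + effectiveLength(2, 3, 7));
    }
}
```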





[jira] [Commented] (OOZIE-1597) Cleanup database before every test

2013-10-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806990#comment-13806990
 ] 

Alejandro Abdelnur commented on OOZIE-1597:
---

+1 LGTM

> Cleanup database before every test
> --
>
> Key: OOZIE-1597
> URL: https://issues.apache.org/jira/browse/OOZIE-1597
> Project: Oozie
>  Issue Type: Improvement
>  Components: tests
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1597.patch
>
>
> While investigating a flakey test 
> ({{org.apache.oozie.sla.TestSLAJobEventListener.testOnJobEvent}}) I realized 
> that some of the flakey SLA tests that I've seen lately are the same issue: 
> The database has some leftover stuff from a previous test that it's not 
> expecting.  
> Normally this is easy to fix because we can simply call 
> {{cleanUpDBTables()}}.  However, {{cleanUpDBTables}} requires some of the 
> {{Services}} to be running, so you have to call it after starting 
> {{Services}}; but, some of the failures were occurring during Services 
> initialization (specifically when {{SLAService}} initializes the 
> {{SLACalculatorMemory}}, which tries to load some data from the database, 
> which may be incomplete (e.g. SLA registration for a job that doesn't 
> exist)).  So, in this case, we can't call {{cleanUpDBTables()}} before or 
> after starting {{Services}}.
> This brings the larger issue that we should be cleaning up the database 
> before every test anyway to make sure that the tests are truly independent 
> and to prevent harmful leaking (just like we did a while back with the 
> {{Services}}).  I think we should have {{XTestCase.setup()}} call 
> {{cleanUpDBTables()}} so that every test automatically does it (and handle the 
> {{Services}} dependency appropriately).





[jira] [Commented] (OOZIE-1589) TestZKLocksService is flakey

2013-10-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806986#comment-13806986
 ] 

Alejandro Abdelnur commented on OOZIE-1589:
---

+1

> TestZKLocksService is flakey
> 
>
> Key: OOZIE-1589
> URL: https://issues.apache.org/jira/browse/OOZIE-1589
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1589.patch
>
>
> TestZKLocksService is highly dependent on the order of things happening 
> because it's testing locks.  I've seen tests in this class fail a number of 
> times with messages like this:
> {noformat}
> expected: but was:
> {noformat}
> which is because things happened in a slightly different order than it was 
> expecting (though everything is happening correctly)
> When I created these tests, I just took the TestLockService and made it use 
> ZKLocks instead of MemoryLocks.  The ZKLocks take longer to lock than the 
> MemoryLocks, so the timings are sometimes too fast.  I think we just need to 
> increase the sleep calls, and use the {{sleep()}} method instead of 
> {{Thread.sleep()}} so it will scale with the "waitfor ratio" on slower 
> machines.
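
A small sketch of the idea of a sleep helper that scales with a wait ratio, as opposed to a fixed {{Thread.sleep()}}. The property name {{example.waitfor.ratio}} and the class name are placeholders invented for this example; they are not taken from the Oozie test code.

```java
final class ScaledSleep {
    // Placeholder property name, assumed for this sketch only.
    static final String RATIO_PROPERTY = "example.waitfor.ratio";

    /** Scales a base duration by the given ratio (1.0 means unchanged). */
    static long scaledMillis(long baseMillis, float ratio) {
        return (long) (baseMillis * ratio);
    }

    /** Sleeps for the base duration multiplied by the configured ratio. */
    static void sleep(long baseMillis) throws InterruptedException {
        float ratio = Float.parseFloat(System.getProperty(RATIO_PROPERTY, "1"));
        Thread.sleep(scaledMillis(baseMillis, ratio));
    }

    public static void main(String[] args) throws InterruptedException {
        // On a slow machine one would run with -Dexample.waitfor.ratio=2.5
        long start = System.currentTimeMillis();
        sleep(100);
        System.out.println("Slept about " + (System.currentTimeMillis() - start) + " ms");
    }
}
```

With a helper like this, a test written against a fast machine can be slowed down uniformly by changing one setting instead of editing every sleep call.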





[jira] [Commented] (OOZIE-1596) TestOozieMySqlDBCLI.testCreateMysql fails when tests are executed in a different order

2013-10-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806984#comment-13806984
 ] 

Alejandro Abdelnur commented on OOZIE-1596:
---

+1

> TestOozieMySqlDBCLI.testCreateMysql fails when tests are executed in a 
> different order
> --
>
> Key: OOZIE-1596
> URL: https://issues.apache.org/jira/browse/OOZIE-1596
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Minor
> Attachments: OOZIE-1596.patch
>
>
> {{TestOozieMySqlDBCLI.testCreateMysql}} will fail if the tests are executed 
> in a different order.
> TestOozieMySqlDBCLI.testCreateMysql relies on the default setting of 
> {{FakeConnection.CREATE}} (which is {{true}}), but if 
> {{TestOozieMySqlDBCLI.testUpgradeMysql}} is executed first, it will change 
> {{FakeConnection.CREATE}} to {{false}}, and 
> {{TestOozieMySqlDBCLI.testCreateMysql}} will fail.
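
To illustrate the kind of order dependency described above, here is a self-contained sketch in which a shared static flag is restored to its default before each test. The class, the flag, and the test methods are stand-ins written for this example; they are not the actual {{FakeConnection}} or {{TestOozieMySqlDBCLI}} code, and the real fix in the patch may differ.

```java
final class StaticFlagIsolation {
    // Stand-in for a shared static flag such as FakeConnection.CREATE.
    static boolean create = true;

    /** Restores the default so no test depends on what ran before it. */
    static void setUp() {
        create = true;
    }

    /** Relies on the default value of the flag. */
    static boolean testCreate() {
        return create;
    }

    /** Changes the flag as a side effect, like the upgrade test does. */
    static boolean testUpgrade() {
        create = false;
        return !create;
    }

    public static void main(String[] args) {
        // Run the tests in the "bad" order; setUp() keeps them independent.
        setUp();
        boolean upgradeOk = testUpgrade();
        setUp();
        boolean createOk = testCreate();
        System.out.println("upgrade passed: " + upgradeOk + ", create passed: " + createOk);
    }
}
```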





[jira] [Commented] (OOZIE-1541) Typo in Oozie HA admin -servers command in documentation

2013-10-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806983#comment-13806983
 ] 

Alejandro Abdelnur commented on OOZIE-1541:
---

+1

> Typo in Oozie HA admin -servers command in documentation
> 
>
> Key: OOZIE-1541
> URL: https://issues.apache.org/jira/browse/OOZIE-1541
> Project: Oozie
>  Issue Type: Bug
>  Components: docs, HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Trivial
> Fix For: trunk
>
> Attachments: OOZIE-1541.patch
>
>
> The CLI {{admin -servers}} command gives an example in the documentation:
> {noformat}
> $ oozie admin http://localhost:11000/oozie -servers
> OozieA : http://localhost:11000/oozie
> OozieB : http://localhost:12000/oozie
> OozieC : http://localhost:13000/oozie
> {noformat}
> The command is missing the {{-oozie}} before the address.  
> It would also probably be clearer if they were different hosts instead of 
> the same host with different ports (same with the REST docs).





[jira] [Commented] (OOZIE-1582) Bump up Tomcat version to 6.0.37

2013-10-18 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799348#comment-13799348
 ] 

Alejandro Abdelnur commented on OOZIE-1582:
---

+1

> Bump up Tomcat version to 6.0.37
> 
>
> Key: OOZIE-1582
> URL: https://issues.apache.org/jira/browse/OOZIE-1582
> Project: Oozie
>  Issue Type: Bug
>  Components: security
>Affects Versions: trunk, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1582.patch
>
>
> Tomcat 6.0.37 fixes two security issues 
> (http://tomcat.apache.org/security-6.html).  We should upgrade from 6.0.36 to 
> incorporate them.





[jira] [Commented] (OOZIE-1460) Implement and Document security for HA

2013-10-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792107#comment-13792107
 ] 

Alejandro Abdelnur commented on OOZIE-1460:
---

+1 pending jenkins.

> Implement and Document security for HA
> --
>
> Key: OOZIE-1460
> URL: https://issues.apache.org/jira/browse/OOZIE-1460
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1460.patch, OOZIE-1460.patch, OOZIE-1460.patch, 
> OOZIE-1460.patch
>
>
> Implement and document anything that needs to be done to add support for 
> security (i.e. kerberos) for High Availability.  





[jira] [Commented] (OOZIE-1460) Implement and Document security for HA

2013-10-10 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792026#comment-13792026
 ] 

Alejandro Abdelnur commented on OOZIE-1460:
---

A few minor things:

After checking that it is not NULL, we should trim authHandlerName in getAuthentication()

The getAuthentication() class selection should be done at service startup as 
that does not change during runtime.

In getConnection(), if there is a failure with one of the hosts, don't we want 
to proceed with incomplete logs instead of throwing an exception?
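
A hedged sketch of the "proceed with incomplete logs" behavior asked about above: collect what is reachable and record which hosts failed, rather than aborting on the first error. All names here ({{PartialLogCollector}}, {{LogSource}}) are invented for the example, and the sources are simple stand-ins for the per-host HTTP connections.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartialLogCollector {
    /** Stand-in for fetching the logs of one server. */
    interface LogSource {
        String fetch() throws IOException;
    }

    static final class Result {
        final List<String> logs = new ArrayList<>();
        final List<String> failedHosts = new ArrayList<>();
    }

    /** Collects logs from every host, skipping (and recording) failures. */
    static Result collect(Map<String, LogSource> sources) {
        Result result = new Result();
        for (Map.Entry<String, LogSource> entry : sources.entrySet()) {
            try {
                result.logs.add(entry.getValue().fetch());
            } catch (IOException e) {
                // Keep going: the caller can warn that logs may be partial.
                result.failedHosts.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, LogSource> sources = new LinkedHashMap<>();
        sources.put("oozieA", () -> "log lines from A");
        sources.put("oozieB", () -> { throw new IOException("connection refused"); });
        Result result = collect(sources);
        System.out.println("logs: " + result.logs);
        System.out.println("unreachable: " + result.failedHosts);
    }
}
```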


> Implement and Document security for HA
> --
>
> Key: OOZIE-1460
> URL: https://issues.apache.org/jira/browse/OOZIE-1460
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1460.patch, OOZIE-1460.patch, OOZIE-1460.patch
>
>
> Implement and document anything that needs to be done to add support for 
> security (i.e. kerberos) for High Availability.  





[jira] [Commented] (OOZIE-1560) Log messages should have a way of identifying which server they came from when using HA

2013-10-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790666#comment-13790666
 ] 

Alejandro Abdelnur commented on OOZIE-1560:
---

One nit: the patch uses the "oozie.instance.id" literal in several places; we 
should have a constant for it. +1 after that.
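
For example, something along these lines; the class name, the constant name, and the helper are only illustrative and not taken from the patch:

```java
import java.util.Properties;

final class InstanceIdConfig {
    // Single definition of the key; the constant name is hypothetical.
    static final String OOZIE_INSTANCE_ID = "oozie.instance.id";

    /** Looks up the instance id, falling back to the given default. */
    static String instanceId(Properties props, String fallback) {
        return props.getProperty(OOZIE_INSTANCE_ID, fallback);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(OOZIE_INSTANCE_ID, "oozieA");
        System.out.println(instanceId(props, "localhost"));
    }
}
```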

> Log messages should have a way of identifying which server they came from 
> when using HA
> ---
>
> Key: OOZIE-1560
> URL: https://issues.apache.org/jira/browse/OOZIE-1560
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1560.patch, OOZIE-1560.patch, OOZIE-1560.patch
>
>
> When using HA, the only way to know which server is processing a specific job 
> is to go into each server's log file and look for log messages about that 
> job; when looking at the logs from the API, there is no way to know.  
> This information can be useful, so it would be good to add the server name as 
> part of the log message.  This can either be done for logs permanently or can 
> be done when a server is aggregating/collating logs from the other servers.  
> The former is probably more efficient than the latter.  If we go with the 
> former, I'd say that we should always do it, regardless of HA so the log 
> formatting is consistent and in case that server is added to an HA group 
> later.
> For example, instead of this:
> {noformat}
> 2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: USER[root] GROUP[-] TOKEN[] APP[demo-wf] JOB[000-130925230553293-oozie-oozi-W] ACTION[000-130925230553293-oozie-oozi-W@streaming-node] [***000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
> {noformat}
> we can have this:
> {noformat}
> 2013-09-29 16:46:20,182 WARN org.apache.oozie.command.wf.ActionStartXCommand: USER[root] GROUP[-] SERVER[oozieA] TOKEN[] APP[demo-wf] JOB[000-130925230553293-oozie-oozi-W] ACTION[000-130925230553293-oozie-oozi-W@streaming-node] [***000-130925230553293-oozie-oozi-W@streaming-node***]Action status=RUNNING
> {noformat}





[jira] [Commented] (OOZIE-1460) Implement and Document security for HA

2013-10-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790658#comment-13790658
 ] 

Alejandro Abdelnur commented on OOZIE-1460:
---

The following code already existed before the patch; the patch just moves it.

{code}
+FileOutputStream fos = new FileOutputStream(tempFile);
+IOUtils.copyStream(is, fos);
+is.close();
+fos.close();
+reader = new BufferedReader(new FileReader(tempFile));
{code}

Still, why do we need to copy it locally? Why not just return a {{new 
BufferedReader(new InputStreamReader(conn.getInputStream()))}}?
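
Roughly what I mean, as a standalone sketch. A {{ByteArrayInputStream}} stands in for {{conn.getInputStream()}} so the example runs without a server, and the explicit UTF-8 charset is an assumption of this sketch:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class DirectStreamReader {
    /** Wraps the stream directly, with no temporary file in between. */
    static BufferedReader open(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for conn.getInputStream().
        InputStream in = new ByteArrayInputStream(
                "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
        try (BufferedReader reader = open(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

The trade-off is that the connection stays open while the caller reads, so whoever consumes the reader has to close it.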


The getConnection() method seems to be redoing much of the stuff the 
AuthenticatedURL does. The AuthenticatedURL should work without security. We 
should just use the AuthenticatedURL:

{code}
AuthenticatedURL.Token token = new AuthenticatedURL.Token();
Authenticator authenticator = getAuthenticator();
HttpURLConnection conn = new 
AuthenticatedURL(authenticator).openConnection(url, token);
{code}


> Implement and Document security for HA
> --
>
> Key: OOZIE-1460
> URL: https://issues.apache.org/jira/browse/OOZIE-1460
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1460.patch
>
>
> Implement and document anything that needs to be done to add support for 
> security (i.e. kerberos) for High Availability.  





[jira] [Commented] (OOZIE-1563) colt jar includes GPL licence

2013-10-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789858#comment-13789858
 ] 

Alejandro Abdelnur commented on OOZIE-1563:
---

+1 pending jenkins

> colt jar includes GPL licence
> -
>
> Key: OOZIE-1563
> URL: https://issues.apache.org/jira/browse/OOZIE-1563
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk, 4.0.1
>Reporter: Bowen Zhang
>Assignee: Robert Kanter
> Attachments: OOZIE-1563.patch
>
>
> I believe the colt jar was introduced with the SLA feature. The "Hep" class 
> inside the jar has a GPL licence, which restricts the usage and distribution of 
> oozie. 





[jira] [Commented] (OOZIE-1564) Metadata for dead Oozie servers should remain in ZK

2013-10-03 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785620#comment-13785620
 ] 

Alejandro Abdelnur commented on OOZIE-1564:
---

If we solve OOZIE-1561, we won't need to worry about missing logs, so this 
logic would no longer be needed.

> Metadata for dead Oozie servers should remain in ZK
> ---
>
> Key: OOZIE-1564
> URL: https://issues.apache.org/jira/browse/OOZIE-1564
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>
> As per the discussion in OOZIE-1561, we should have two services in the 
> service discovery on ZK:
> # Contains the metadata for any Oozie server that has ever connected, even 
> ones that are currently dead
> #- This would be used by log streaming so that dead servers are always 
> identified
> #- I think there is an easy way to do this with a setting in Curator, but I 
> have to test it.  If not, then we'll have to manually create/manage the 
> ZNodes.
> # Contains the metadata for any Oozie server that is still alive (the current 
> behavior)
> #- This would be used by the mod calculation so only alive servers are 
> included
> Maintaining two lists will avoid the need to do any pinging or tricky 
> heartbeating.  
> We'd have to add a new admin command to cleanup the dead servers.  It would 
> look at which servers are not in the alive list and remove those.
> It would also be good to update the admin command that lists servers to 
> specify which ones are alive and which are dead; and possibly have filtering 
> (i.e. show only alive servers, show only dead servers, show all servers)





[jira] [Commented] (OOZIE-1561) When using Oozie HA, the logs should also be HA

2013-10-02 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784321#comment-13784321
 ] 

Alejandro Abdelnur commented on OOZIE-1561:
---

I was thinking that we should remember in ZK all the oozie instances that ever 
came up (remember always). Then the log streaming will report the missing 
instance when that list does not match the current instances, and an admin 
command would allow an admin to purge an instance that is gone forever.
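
In other words, the missing instances are just the difference between the remembered set and the live set. A tiny sketch, with plain in-memory sets standing in for whatever would be stored in ZK (the class and method names are made up for the example):

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class MissingInstances {
    /** Returns the instances that were seen at some point but are not alive now. */
    static Set<String> missing(Set<String> everSeen, Set<String> alive) {
        Set<String> result = new TreeSet<>(everSeen);
        result.removeAll(alive);
        return result;
    }

    public static void main(String[] args) {
        Set<String> everSeen = new TreeSet<>(Arrays.asList("oozieA", "oozieB", "oozieC"));
        Set<String> alive = new TreeSet<>(Arrays.asList("oozieA", "oozieC"));
        // Log streaming could warn about these; an admin purge would drop them.
        System.out.println("missing: " + missing(everSeen, alive));
    }
}
```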

If we want to go the HDFS logging route, then we should create an 
HDFSLogAppender and ensure it does hflush() on every message. Also, it should be 
async on the writing side (Oozie server) so as not to slow down Oozie processing, 
though the latter means some logs could be lost if an Oozie instance crashes.
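
A rough sketch of the async-write-plus-flush idea, not an actual appender: callers enqueue messages and return immediately, while a background thread writes and flushes each one. A plain {{java.io.Writer}} stands in for the HDFS output stream and {{flush()}} stands in for {{hflush()}}; the class name and structure are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncFlushingLogWriter implements AutoCloseable {
    // Sentinel compared by identity to tell the worker to stop.
    private static final String POISON = new String("__STOP__");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    AsyncFlushingLogWriter(Writer sink) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take();
                    if (message == POISON) {
                        return;
                    }
                    sink.write(message);
                    sink.write('\n');
                    sink.flush(); // stand-in for hflush() on every message
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (IOException e) {
                // Sketch only: a real appender would report this somewhere.
            }
        }, "async-log-writer");
        worker.setDaemon(true);
        worker.start();
    }

    /** Returns immediately; the write happens on the worker thread. */
    void log(String message) {
        queue.add(message);
    }

    /** Drains the messages queued so far and stops the worker. */
    @Override
    public void close() throws InterruptedException {
        queue.add(POISON);
        worker.join();
    }

    public static void main(String[] args) throws Exception {
        StringWriter sink = new StringWriter();
        try (AsyncFlushingLogWriter writer = new AsyncFlushingLogWriter(sink)) {
            writer.log("job started");
            writer.log("job finished");
        }
        System.out.print(sink);
    }
}
```

Anything still sitting in the queue when the process dies is lost, which is the trade-off mentioned above.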

> When using Oozie HA, the logs should also be HA
> ---
>
> Key: OOZIE-1561
> URL: https://issues.apache.org/jira/browse/OOZIE-1561
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
>
> Currently, if an Oozie server goes down, the logs from that server become 
> unavailable until the server comes back up.  In the meantime, the user may or 
> may not be aware that log messages could be missing when Oozie streams logs 
> to the user.  
> We should come up with a way to make the logs HA.  
> Some ideas:
> # When rolling the logs, copy them into HDFS; Oozie servers can then read the 
> log files directly from HDFS instead of each other
> #- The downside to this is that there will be a window where logs could still 
> be missing as they only show up in HDFS after rolling over (default = 1hr) 
> and Oozie servers would still have to contact each other for the last hour of 
> logs
> #- The upside is that it minimizes the amount of logs that could be missing 
> and would be fairly straightforward to implement
> # Log directly to HDFS
> #- The downside is that this may be complicated or tricky to get working 
> properly
> #-- This also introduces a strict dependency on HDFS
> #- The upside is that this would completely solve the issue and Oozie servers 
> would simply get all logs directly from HDFS
> # Log to ZooKeeper or a database
> #- I think the log files will be too big to do this
> I've assigned this to myself, but if someone wants to tackle this, feel free 
> to reassign it.  I think idea 2 is the most practical, but I'm also open to 
> other ideas on how to do this.  





[jira] [Commented] (OOZIE-1561) When using Oozie HA, the logs should also be HA

2013-10-01 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783491#comment-13783491
 ] 

Alejandro Abdelnur commented on OOZIE-1561:
---

I wonder if making the logs HA is not a bit too much. Wouldn't it be acceptable 
if the streamed logs start and end with a "#AN OOZIE INSTANCE IS 
UNAVAILABLE AT THE MOMENT, LOGS MAY BE PARTIAL##" ?

> When using Oozie HA, the logs should also be HA
> ---
>
> Key: OOZIE-1561
> URL: https://issues.apache.org/jira/browse/OOZIE-1561
> Project: Oozie
>  Issue Type: Improvement
>  Components: HA
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
>
> Currently, if an Oozie server goes down, the logs from that server become 
> unavailable until the server comes back up.  In the meantime, the user may or 
> may not be aware that log messages could be missing when Oozie streams logs 
> to the user.  
> We should come up with a way to make the logs HA.  
> Some ideas:
> # When rolling the logs, copy them into HDFS; Oozie servers can then read the 
> log files directly from HDFS instead of each other
> #- The downside to this is that there will be a window where logs could still 
> be missing as they only show up in HDFS after rolling over (default = 1hr) 
> and Oozie servers would still have to contact each other for the last hour of 
> logs
> #- The upside is that it minimizes the amount of logs that could be missing 
> and would be fairly straightforward to implement
> # Log directly to HDFS
> #- The downside is that this may be complicated or tricky to get working 
> properly
> #-- This also introduces a strict dependency on HDFS
> #- The upside is that this would completely solve the issue and Oozie servers 
> would simply get all logs directly from HDFS
> # Log to ZooKeeper or a database
> #- I think the log files will be too big to do this
> I've assigned this to myself, but if someone wants to tackle this, feel free 
> to reassign it.  I think idea 2 is the most practical, but I'm also open to 
> other ideas on how to do this.  





[jira] [Commented] (OOZIE-1557) TestFsActionExecutor.testChmodWithGlob fails against Hadoop 2.1.x-beta

2013-09-27 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780085#comment-13780085
 ] 

Alejandro Abdelnur commented on OOZIE-1557:
---

+1

> TestFsActionExecutor.testChmodWithGlob fails against Hadoop 2.1.x-beta
> --
>
> Key: OOZIE-1557
> URL: https://issues.apache.org/jira/browse/OOZIE-1557
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk, 4.0.1
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1557.patch
>
>
> When running against Hadoop 2.1.x-beta, 
> {{TestFsActionExecutor.testChmodWithGlob}} fails because of an incompatible 
> change introduced by HDFS-4659 with how file permissions are set.
> {noformat}
> ---
> Test set: org.apache.oozie.action.hadoop.TestFsActionExecutor
> ---
> Tests run: 24, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 35.813 sec 
> <<< FAILURE!
> testChmodWithGlob(org.apache.oozie.action.hadoop.TestFsActionExecutor)  Time 
> elapsed: 0.002 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected: but 
> was:
>   at junit.framework.Assert.assertEquals(Assert.java:85)
>   at junit.framework.Assert.assertEquals(Assert.java:91)
>   at 
> org.apache.oozie.action.hadoop.TestFsActionExecutor.testChmodWithGlob(TestFsActionExecutor.java:468)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at junit.framework.TestCase.runTest(TestCase.java:168)
>   at junit.framework.TestCase.runBare(TestCase.java:134)
>   at junit.framework.TestResult$1.protect(TestResult.java:110)
>   at junit.framework.TestResult.runProtected(TestResult.java:128)
>   at junit.framework.TestResult.run(TestResult.java:113)
>   at junit.framework.TestCase.run(TestCase.java:124)
>   at junit.framework.TestSuite.runTest(TestSuite.java:243)
>   at junit.framework.TestSuite.run(TestSuite.java:238)
>   at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
>   at 
> org.apache.maven.surefire.junitcore.ClassDemarcatingRunner.run(ClassDemarcatingRunner.java:58)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:24)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:680)
> {noformat}
> As this behavior is different between Hadoop 1 and 2, I think we should 
> simply have the test not check for this case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (OOZIE-1546) TestMapReduceActionExecutorUberJar.testMapReduceWithUberJarEnabled fails

2013-09-23 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13774815#comment-13774815
 ] 

Alejandro Abdelnur commented on OOZIE-1546:
---

+1

> TestMapReduceActionExecutorUberJar.testMapReduceWithUberJarEnabled fails
> 
>
> Key: OOZIE-1546
> URL: https://issues.apache.org/jira/browse/OOZIE-1546
> Project: Oozie
>  Issue Type: Bug
>  Components: tests
>Affects Versions: trunk, 4.0.0
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1546.patch, OOZIE-1546.patch
>
>
> org.apache.oozie.action.hadoop.TestMapReduceActionExecutorUberJar.testMapReduceWithUberJarEnabled
>  fails because OOZIE-1501 added a check that the Hadoop counter 
> "MAP_OUTPUT_RECORDS" is 2. In the non-UberJar version of the test, this is 2 
> because the map outputs the 2 dummy inputs. In the UberJar version, we check 
> what's in the classpath by outputting it from the map task as separate 
> records; this means that the "MAP_OUTPUT_RECORDS" can vary depending on 
> what's in the classpath.
> This wasn't caught by Jenkins because this test is normally skipped because 
> the Uber Jar feature doesn't work with the current versions of Hadoop that 
> we're using.



[jira] [Commented] (OOZIE-1540) When oozie.zookeeper.oozie.id is not specified, its using a space instead of the hostname

2013-09-17 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769706#comment-13769706
 ] 

Alejandro Abdelnur commented on OOZIE-1540:
---

+1 pending jenkins.

> When oozie.zookeeper.oozie.id is not specified, its using a space instead of 
> the hostname
> -
>
> Key: OOZIE-1540
> URL: https://issues.apache.org/jira/browse/OOZIE-1540
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
> Fix For: trunk
>
> Attachments: OOZIE-1540.patch, OOZIE-1540.patch
>
>
> If you don't specify {{oozie.zookeeper.oozie.id}}, it is supposed to default to 
> the hostname.  oozie-default.xml has this set to " " (space) because I 
> misremembered how Configuration handles this; so currently, if you leave it unset, 
> all Oozie servers in the HA namespace will have the same id (i.e. " ") and 
> will overwrite each other's entry in the service discovery in ZK, leaving 
> only one.  This breaks certain HA functionality, including anything 
> that relies on the service discovery, such as log streaming.  
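
The fallback described above can be sketched as follows. This is only an illustration of treating a whitespace-only value as "not set"; the class and method names (`OozieIdSketch`, `resolveId`) are hypothetical and are not taken from the attached patch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not Oozie code: shows one way to treat " " as unset.
class OozieIdSketch {
    // Returns the configured id, or the fallback when the value is null,
    // empty, or whitespace-only (such as the " " default described above).
    static String resolveId(String configured, String fallbackHostname) {
        if (configured == null || configured.trim().isEmpty()) {
            return fallbackHostname;
        }
        return configured.trim();
    }

    public static void main(String[] args) throws UnknownHostException {
        String hostname = InetAddress.getLocalHost().getHostName();
        // A whitespace-only value falls back to the local hostname.
        System.out.println(resolveId(" ", hostname));
        // An explicit value ("oozie-1" is a made-up id) is used as-is.
        System.out.println(resolveId("oozie-1", hostname));
    }
}
```

With a check like this, each server in the HA namespace would register under its own hostname unless an id is explicitly configured.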



[jira] [Commented] (OOZIE-1518) Copy action sharelib jars from hdfs

2013-09-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762104#comment-13762104
 ] 

Alejandro Abdelnur commented on OOZIE-1518:
---

bq. Admin puts sharelib dirs of pig,hive,etc under here 
/user/oozie/share/lib-staging

How is this different from the current behavior, where the admin must run 
oozie-setup.sh sharelib -update?

> Copy action sharelib jars from hdfs 
> 
>
> Key: OOZIE-1518
> URL: https://issues.apache.org/jira/browse/OOZIE-1518
> Project: Oozie
>  Issue Type: Bug
>Reporter: Virag Kothari
>
> OOZIE-1461 will copy the launcher related sharelib jars from Oozie's 
> classpath to hdfs.
> This JIRA aims to copy the jars from an hdfs staging dir. This staging dir 
> will typically contain action related jars. 
> OOZIE-1461 always creates a sharelib with a new timestamp. This JIRA will make 
> sure that a new sharelib is created only if either the action jars or the 
> launcher jars are modified.   
>  



[jira] [Updated] (OOZIE-1490) Remove unix OS enforcement from build

2013-09-04 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1490:
--

Summary: Remove unix OS enforcement from build  (was: Remove unix OS 
enforcement)

> Remove unix OS enforcement from build
> -
>
> Key: OOZIE-1490
> URL: https://issues.apache.org/jira/browse/OOZIE-1490
> Project: Oozie
>  Issue Type: Sub-task
>  Components: build
> Environment: Windows
>Reporter: David Wannemacher
>Assignee: David Wannemacher
> Fix For: trunk
>
> Attachments: OOZIE-1490.2.patch, OOZIE-1490.3.patch, OOZIE-1490.patch
>
>
> Windows builds do not run because there is a requireOS restriction for unix.
> Also, building and running unit tests on windows requires that a specific 
> hadoop version is used. Right now this requires editing multiple pom.xml 
> files because the version is hard-coded. After discussion, we will continue 
> to require this rather than refactor the hadoop versions, because doing so 
> will break maven publishing.



[jira] [Commented] (OOZIE-1490) Remove unix OS enforcement

2013-09-04 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758075#comment-13758075
 ] 

Alejandro Abdelnur commented on OOZIE-1490:
---

+1

> Remove unix OS enforcement
> --
>
> Key: OOZIE-1490
> URL: https://issues.apache.org/jira/browse/OOZIE-1490
> Project: Oozie
>  Issue Type: Sub-task
>  Components: build
> Environment: Windows
>Reporter: David Wannemacher
>Assignee: David Wannemacher
> Fix For: trunk
>
> Attachments: OOZIE-1490.2.patch, OOZIE-1490.3.patch, OOZIE-1490.patch
>
>
> Windows builds do not run because there is a requireOS restriction for unix.
> Also, building and running unit tests on windows requires that a specific 
> hadoop version is used. Right now this requires editing multiple pom.xml 
> files because the version is hard-coded. After discussion, we will continue 
> to require this rather than refactor the hadoop versions, because doing so 
> will break maven publishing.



[jira] [Commented] (OOZIE-1490) Remove unix OS enforcement and parametrize hadoop versions

2013-09-04 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13757561#comment-13757561
 ] 

Alejandro Abdelnur commented on OOZIE-1490:
---

David, a POM with ${hadoopOne.version}.oozie-4.1.0-SNAPSHOT 
is unusable when deployed to a Maven repo. We cannot do that. I know it works 
when you are building locally, but the artifact cannot be used from a Maven 
repo.


> Remove unix OS enforcement and parametrize hadoop versions
> --
>
> Key: OOZIE-1490
> URL: https://issues.apache.org/jira/browse/OOZIE-1490
> Project: Oozie
>  Issue Type: Sub-task
>  Components: build
> Environment: Windows
>Reporter: David Wannemacher
>Assignee: David Wannemacher
> Fix For: trunk
>
> Attachments: OOZIE-1490.2.patch, OOZIE-1490.patch
>
>
> Windows builds do not run because there is a requireOS restriction for unix.
> Also, building and running unit tests on windows requires that a specific 
> hadoop version is used. Right now this requires editing multiple pom.xml 
> files because the version is hard-coded. This should be fixed to allow 
> passing specific hadoop  versions to the build on the command-line.



[jira] [Comment Edited] (OOZIE-1490) Remove unix OS enforcement and parametrize hadoop versions

2013-09-04 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13757561#comment-13757561
 ] 

Alejandro Abdelnur edited comment on OOZIE-1490 at 9/4/13 8:37 AM:
---

David, a POM with 
$\{hadoopOne.version\}.oozie-4.1.0-SNAPSHOT is unusable when 
deployed to a Maven repo. We cannot do that. I know it works when you are 
building locally, but the artifact cannot be used from a Maven repo.


  was (Author: tucu00):
David, a POM with 
${hadoopOne.version}.oozie-4.1.0-SNAPSHOT is unusable when 
deployed to a maven repo. we cannot do that. I know it works when you are 
building locally, but the artifact cannot be used from a maven repo.

  
> Remove unix OS enforcement and parametrize hadoop versions
> --
>
> Key: OOZIE-1490
> URL: https://issues.apache.org/jira/browse/OOZIE-1490
> Project: Oozie
>  Issue Type: Sub-task
>  Components: build
> Environment: Windows
>Reporter: David Wannemacher
>Assignee: David Wannemacher
> Fix For: trunk
>
> Attachments: OOZIE-1490.2.patch, OOZIE-1490.patch
>
>
> Windows builds do not run because there is a requireOS restriction for unix.
> Also, building and running unit tests on windows requires that a specific 
> hadoop version is used. Right now this requires editing multiple pom.xml 
> files because the version is hard-coded. This should be fixed to allow 
> passing specific hadoop  versions to the build on the command-line.



[jira] [Assigned] (OOZIE-1489) Support building and running Oozie on Windows

2013-09-03 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned OOZIE-1489:
-

Assignee: David Wannemacher

Assigning to [~dwann] (just added you as a contributor).

> Support building and running Oozie on Windows
> -
>
> Key: OOZIE-1489
> URL: https://issues.apache.org/jira/browse/OOZIE-1489
> Project: Oozie
>  Issue Type: Bug
>  Components: build, core, examples, scripts, tests
> Environment: Windows
>Reporter: David Wannemacher
>Assignee: David Wannemacher
>
> There are several issues that prevent Oozie from working on Windows natively 
> (without Cygwin):
> # Unix OS is enforced in the build.
> # Platform-specific assumptions for file path formats. For example, assuming 
> that forward slash is a valid path separator.
> # Workarounds required for windows-specific issues. For example, environment 
> variables referring to file paths are enclosed in double quotes, which need 
> to be removed prior to using them.
> # Unit tests make many platform-specific assumptions that break on windows.
> # .sh scripts need to be ported to .cmd/.ps1.
> Microsoft and Hortonworks have addressed these issues in a series of patches, 
> which will now be ported to Apache Oozie trunk and submitted to the community 
> for review. These patches will be attached in subtasks of this top-level Jira.
> I will also document the assumptions and steps that are needed to build on 
> Windows. One such assumption is that running unit tests requires hadoop core 
> built from branch-1-win.



[jira] [Commented] (OOZIE-1448) A CoordActionUpdateXCommand gets queued for all workflows even if they were not launched by a coordinator

2013-08-09 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735007#comment-13735007
 ] 

Alejandro Abdelnur commented on OOZIE-1448:
---

+1 LGTM. There is still some weirdness here, though: why do we need to retry updating 
the parent?

{code}
new CoordActionUpdateXCommand(wfjob, maxRetries).call();
{code}

[~virag], it seems you did this as part of OOZIE-1424; can you please explain 
the reason for this retry?

> A CoordActionUpdateXCommand gets queued for all workflows even if they were 
> not launched by a coordinator
> -
>
> Key: OOZIE-1448
> URL: https://issues.apache.org/jira/browse/OOZIE-1448
> Project: Oozie
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1448.patch, OOZIE-1448.patch, OOZIE-1448.patch, 
> OOZIE-1448.patch, OOZIE-1448.patch
>
>
> Once a workflow (that wasn't started by a coordinator) ends, there's almost 
> always a warning/error logged that looks like this:
> {noformat}
> 2013-07-09 16:16:54,711  WARN CoordActionUpdateXCommand:542 - USER[rkanter] 
> GROUP[-] TOKEN[] APP[pig-wf] JOB[000-130709161625948-oozie-rkan-W] 
> ACTION[-] E1100: Command precondition does not hold before execution, [, 
> coord action is null], Error Code: E1100
> {noformat}
> The error is harmless, but it tends to confuse users who think that something 
> went wrong.  It also means that we have an extra unnecessary command in the 
> queue for every workflow that wasn't started by a coordinator.
> In SignalXCommand, there is a line like this:
> {code:java}
> new CoordActionUpdateXCommand(wfJob).call();//Note: Called even if wf is 
> not necessarily instantiated by coordinator
> {code}
> The comment is part of the original code, and makes me think that this was 
> done on purpose or perhaps when there wasn't a good way to check if a 
> workflow was started by a coordinator?
> I think we can fix this by simply checking if the parent of {{wfJob}} is a 
> coordinator.  



[jira] [Commented] (OOZIE-1448) A CoordActionUpdateXCommand gets queued for all workflows even if they were not launched by a coordinator

2013-08-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733937#comment-13733937
 ] 

Alejandro Abdelnur commented on OOZIE-1448:
---

Wouldn't it make sense to have an {{updateParentIfNecessary(wfJob)}} method and 
use it in all places instead of duplicating the following code multiple times?

{code}
+// update coordinator action if the wf was actually started by a coord
+if (wfJob.getParentId() != null && wfJob.getParentId().contains("-C@")) {
+    new CoordActionUpdateXCommand(wfJob, 3).call();
+}
{code}
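
A minimal sketch of what such a helper could look like. To keep the example self-contained, `WorkflowJob` is a hypothetical one-method stand-in for the real job bean, and the helper records the parent id in a list instead of calling `new CoordActionUpdateXCommand(wfJob, 3).call()` as the patch does; the job ids in `main` are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the workflow job bean; only the parent id matters here.
interface WorkflowJob {
    String getParentId();
}

class ParentUpdateSketch {
    // Records the parent ids that would have triggered a coordinator update
    // (a stand-in for running CoordActionUpdateXCommand in the real code).
    static final List<String> updatedParents = new ArrayList<>();

    // True when the parent id looks like a coordinator action id, using the
    // same "-C@" check as the patch under review.
    static boolean startedByCoordinator(WorkflowJob wfJob) {
        String parentId = wfJob.getParentId();
        return parentId != null && parentId.contains("-C@");
    }

    // Single place for the check, so callers no longer duplicate the condition.
    static void updateParentIfNecessary(WorkflowJob wfJob) {
        if (startedByCoordinator(wfJob)) {
            updatedParents.add(wfJob.getParentId());
        }
    }

    public static void main(String[] args) {
        // Workflow started by a coordinator action: recorded.
        updateParentIfNecessary(() -> "0000001-130709161625948-oozie-rkan-C@1");
        // Standalone workflow with no parent: ignored.
        updateParentIfNecessary(() -> null);
        System.out.println(updatedParents);
    }
}
```

Keeping the condition in one method means a later change to how coordinator parents are detected only has to be made once.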


> A CoordActionUpdateXCommand gets queued for all workflows even if they were 
> not launched by a coordinator
> -
>
> Key: OOZIE-1448
> URL: https://issues.apache.org/jira/browse/OOZIE-1448
> Project: Oozie
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1448.patch, OOZIE-1448.patch, OOZIE-1448.patch, 
> OOZIE-1448.patch
>
>
> Once a workflow (that wasn't started by a coordinator) ends, there's almost 
> always a warning/error logged that looks like this:
> {noformat}
> 2013-07-09 16:16:54,711  WARN CoordActionUpdateXCommand:542 - USER[rkanter] 
> GROUP[-] TOKEN[] APP[pig-wf] JOB[000-130709161625948-oozie-rkan-W] 
> ACTION[-] E1100: Command precondition does not hold before execution, [, 
> coord action is null], Error Code: E1100
> {noformat}
> The error is harmless, but it tends to confuse users who think that something 
> went wrong.  It also means that we have an extra unnecessary command in the 
> queue for every workflow that wasn't started by a coordinator.
> In SignalXCommand, there is a line like this:
> {code:java}
> new CoordActionUpdateXCommand(wfJob).call();//Note: Called even if wf is 
> not necessarily instantiated by coordinator
> {code}
> The comment is part of the original code, and makes me think that this was 
> done on purpose or perhaps when there wasn't a good way to check if a 
> workflow was started by a coordinator?
> I think we can fix this by simply checking if the parent of {{wfJob}} is a 
> coordinator.  



[jira] [Commented] (OOZIE-1443) forkjoin validation should not allow a fork to go to the same node multiple times

2013-08-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733929#comment-13733929
 ] 

Alejandro Abdelnur commented on OOZIE-1443:
---

+1

> forkjoin validation should not allow a fork to go to the same node multiple 
> times
> -
>
> Key: OOZIE-1443
> URL: https://issues.apache.org/jira/browse/OOZIE-1443
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Duc Anh Le
>Assignee: Robert Kanter
>Priority: Minor
> Fix For: trunk
>
> Attachments: dangling-node.png, OOZIE-1443.patch, OOZIE-1443.patch, 
> workflow.xml
>
>
> The forkjoin validation code should not allow a fork to go to the same node 
> multiple times.  For example, this should not be allowed:
> Start -> ForkNode
> ForkNode -> (MR-1, MR-2, MR-2)
> MR-1 -> JoinNode
> MR-2 -> JoinNode
> JoinNode -> End
> We don't normally allow the same node to be executed multiple times because 
> it will have one entry in the DB and the state will get overwritten and cause 
> problems.  In this case, the workflow gets stuck RUNNING.  
> This should be a fairly trivial update to the forkjoin validation code.
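
The check described above can be sketched as a scan over a fork's transition list. This is an illustration only and is not taken from the attached patch; `ForkDuplicateCheckSketch` and `duplicateTransitions` are hypothetical names, and the real validation code works on the parsed workflow graph rather than on plain strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Oozie code.
class ForkDuplicateCheckSketch {
    // Returns the node names that a fork lists more than once, in first-seen order.
    static Set<String> duplicateTransitions(List<String> forkTransitions) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String node : forkTransitions) {
            // Set.add returns false when the node was already present.
            if (!seen.add(node)) {
                duplicates.add(node);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        // The fork from the example above: ForkNode -> (MR-1, MR-2, MR-2)
        System.out.println(duplicateTransitions(Arrays.asList("MR-1", "MR-2", "MR-2")));
    }
}
```

A validator could reject the workflow at submission time whenever the returned set is non-empty, instead of letting it get stuck RUNNING.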



[jira] [Commented] (OOZIE-1403) forkjoin validation blocks some valid cases involving decision nodes

2013-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732706#comment-13732706
 ] 

Alejandro Abdelnur commented on OOZIE-1403:
---

LGTM, +1

> forkjoin validation blocks some valid cases involving decision nodes
> 
>
> Key: OOZIE-1403
> URL: https://issues.apache.org/jira/browse/OOZIE-1403
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: trunk
>
> Attachments: OOZIE-1403.patch
>
>
> As described 
> [here|https://issues.apache.org/jira/browse/OOZIE-1035?focusedCommentId=13676534&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13676534]
>  in OOZIE-1035, the new forkjoin checker code is blocking some valid cases 
> involving decision nodes where the decision nodes are inside the forkjoin; 
> when they are outside, it's not a problem.
> 1) This uses a decision node to "insert" an action based on {{foo}}:
> {noformat}
> 
>
>
> 
> 
>
>
> 
> 
>
> 
> 
>
> 
> 
> {noformat}
> 2) This uses a decision node to "replace" an action based on {{foo}}:
> {noformat}
> 
>
>
> 
> 
>
>
> 
> 
>
> 
> 
>
> 
> 
> {noformat}



[jira] [Commented] (OOZIE-1448) A CoordActionUpdateXCommand gets queued for all workflows even if they were not launched by a coordinator

2013-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732704#comment-13732704
 ] 

Alejandro Abdelnur commented on OOZIE-1448:
---

+1 after rerunning tests once OOZIE-1449 is in.

> A CoordActionUpdateXCommand gets queued for all workflows even if they were 
> not launched by a coordinator
> -
>
> Key: OOZIE-1448
> URL: https://issues.apache.org/jira/browse/OOZIE-1448
> Project: Oozie
>  Issue Type: Bug
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1448.patch, OOZIE-1448.patch
>
>
> Once a workflow (that wasn't started by a coordinator) ends, there's almost 
> always a warning/error logged that looks like this:
> {noformat}
> 2013-07-09 16:16:54,711  WARN CoordActionUpdateXCommand:542 - USER[rkanter] 
> GROUP[-] TOKEN[] APP[pig-wf] JOB[000-130709161625948-oozie-rkan-W] 
> ACTION[-] E1100: Command precondition does not hold before execution, [, 
> coord action is null], Error Code: E1100
> {noformat}
> The error is harmless, but it tends to confuse users who think that something 
> went wrong.  It also means that we have an extra unnecessary command in the 
> queue for every workflow that wasn't started by a coordinator.
> In SignalXCommand, there is a line like this:
> {code:java}
> new CoordActionUpdateXCommand(wfJob).call();//Note: Called even if wf is 
> not necessarily instantiated by coordinator
> {code}
> The comment is part of the original code, and makes me think that this was 
> done on purpose or perhaps when there wasn't a good way to check if a 
> workflow was started by a coordinator?
> I think we can fix this by simply checking if the parent of {{wfJob}} is a 
> coordinator.  



[jira] [Commented] (OOZIE-1449) Coordinator Workflow parent relationship is broken for purge service

2013-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732702#comment-13732702
 ] 

Alejandro Abdelnur commented on OOZIE-1449:
---

LGTM, +1

> Coordinator Workflow parent relationship is broken for purge service
> 
>
> Key: OOZIE-1449
> URL: https://issues.apache.org/jira/browse/OOZIE-1449
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Critical
> Fix For: trunk
>
> Attachments: OOZIE-1449.patch, OOZIE-1449.patch, OOZIE-1449.patch, 
> OOZIE-1449.patch, OOZIE-1449.patch
>
>
> OOZIE-1118 improved the logic of the purge service to take into account the 
> parent-child relationships of bundle, coordinator, workflow, and subworkflow 
> jobs.  However, in queries (and test code) dealing with the coordinator 
> parent of a workflow, it is incorrectly using the coordinator job id instead 
> of the coordinator action id.  
> This means that the purging logic won't properly associate workflows with 
> their parent coordinators; all jobs should eventually be purged, so it's not 
> completely broken, but it is currently possible for child workflows to be 
> purged before their coordinator parents (the correct behavior is for no child 
> or parent to be purged until the parent and all children are ready).  
> This doesn't affect the coordinator-bundle or subworkflow-workflow 
> relationships because their parent id is the job id, not the action id; 
> only the workflow-coordinator relationship uses the coordinator action id as 
> the parent id for the workflow job.
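
To make the job id versus action id distinction concrete, here is a small sketch. It assumes that a coordinator action id has the form "<coordinator job id>@<n>", which is consistent with the "-C@" check used in OOZIE-1448 but is stated here as an assumption; the class name, method names, and ids are made up for illustration and this is not the purge query from the patch.

```java
// Hypothetical helper, not Oozie code.
class CoordParentIdSketch {
    // Assumption: a coordinator action id has the form "<coordJobId>@<n>".
    static String coordJobIdOf(String coordActionId) {
        int at = coordActionId.indexOf('@');
        if (at < 0) {
            throw new IllegalArgumentException("not a coordinator action id: " + coordActionId);
        }
        return coordActionId.substring(0, at);
    }

    // True when the workflow's parent id (a coordinator action id) belongs to
    // the given coordinator job. Comparing the raw parent id against the job
    // id directly would never match, which is the mismatch described above.
    static boolean isChildOf(String workflowParentId, String coordJobId) {
        return workflowParentId != null
                && workflowParentId.contains("@")
                && coordJobIdOf(workflowParentId).equals(coordJobId);
    }

    public static void main(String[] args) {
        String parentId = "0000001-130709161625948-oozie-rkan-C@3"; // made-up id
        System.out.println(coordJobIdOf(parentId));
        System.out.println(isChildOf(parentId, "0000001-130709161625948-oozie-rkan-C"));
    }
}
```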



[jira] [Commented] (OOZIE-1458) If a Credentials type is not defined, Oozie should say something

2013-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732696#comment-13732696
 ] 

Alejandro Abdelnur commented on OOZIE-1458:
---

+1

> If a Credentials type is not defined, Oozie should say something
> 
>
> Key: OOZIE-1458
> URL: https://issues.apache.org/jira/browse/OOZIE-1458
> Project: Oozie
>  Issue Type: Improvement
>  Components: security
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: OOZIE-1458.patch
>
>
> If you use the Credentials Module and define a {{<credentials>}} section like 
> this:
> {code:xml}
> 
>  
>   
>...
>   
>  
> 
> {code}
> but you didn't add a credentials type for {{hcat}} in oozie-site.xml; that 
> is, you *did not* add this:
> {code:xml}
> <property>
>  <name>oozie.credentials.credentialclasses</name>
>  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
> </property>
> {code}
> then Oozie will silently not use the credentials class (because it doesn't 
> know about it), so the action trying to use it will fail.  It's pretty easy to 
> forget to add the property to oozie-site.xml, so instead 
> of silently skipping the credential for an action, the 
> workflow should fail either on that action when it tries to use the 
> credential and can't find it, or (probably better) when submitting the 
> workflow as part of the initial checking (e.g. after we do the forkjoin 
> checking or something).  
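
A minimal sketch of the "fail loudly" behavior requested above. It parses a value in the `type=class` form shown for `oozie.credentials.credentialclasses` and rejects an unknown type. The class and method names are hypothetical, and treating multiple entries as comma-separated is an assumption made for this sketch rather than something stated in the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Oozie code.
class CredentialTypeCheckSketch {
    // Parses a value such as "hcat=org.apache.oozie.action.hadoop.HCatCredentials".
    // Assumption: multiple entries are comma-separated "type=class" pairs.
    static Map<String, String> parseCredentialClasses(String propertyValue) {
        Map<String, String> classesByType = new HashMap<>();
        if (propertyValue == null) {
            return classesByType;
        }
        for (String entry : propertyValue.split(",")) {
            String[] parts = entry.trim().split("=", 2);
            if (parts.length == 2 && !parts[0].trim().isEmpty()) {
                classesByType.put(parts[0].trim(), parts[1].trim());
            }
        }
        return classesByType;
    }

    // Fails loudly instead of silently skipping an unknown credential type.
    static void requireKnownType(String type, Map<String, String> classesByType) {
        if (!classesByType.containsKey(type)) {
            throw new IllegalStateException(
                    "No credentials class configured for type [" + type + "]");
        }
    }

    public static void main(String[] args) {
        Map<String, String> configured =
                parseCredentialClasses("hcat=org.apache.oozie.action.hadoop.HCatCredentials");
        requireKnownType("hcat", configured);
        System.out.println("hcat -> " + configured.get("hcat"));
    }
}
```

Running a check like `requireKnownType` for every credential type referenced by a workflow at submission time would surface the missing oozie-site.xml property before any action runs.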



[jira] [Commented] (OOZIE-1490) Remove unix OS enforcement and parametrize hadoop versions

2013-08-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732609#comment-13732609
 ] 

Alejandro Abdelnur commented on OOZIE-1490:
---

David, parameterizing the versions of Hadoop in the POMs and requiring them to 
be passed during the build makes the POMs unusable from a Maven repo. The POMs 
must have a fixed version. If you want a new version of Hadoop, you have 
to create the set of submodules in hadooplibs. 

There are a few places in Oozie that assume a Unix filesystem; changing only the 
POMs is not enough.

> Remove unix OS enforcement and parametrize hadoop versions
> --
>
> Key: OOZIE-1490
> URL: https://issues.apache.org/jira/browse/OOZIE-1490
> Project: Oozie
>  Issue Type: Sub-task
>  Components: build
> Environment: Windows
>Reporter: David Wannemacher
> Fix For: trunk
>
> Attachments: OOZIE-1490.patch
>
>
> Windows builds do not run because there is a requireOS restriction for unix.
> Also, building and running unit tests on windows requires that a specific 
> hadoop version is used. Right now this requires editing multiple pom.xml 
> files because the version is hard-coded. This should be fixed to allow 
> passing specific hadoop  versions to the build on the command-line.



[jira] [Commented] (OOZIE-1471) Support glob in FS action and prepare blocks

2013-07-29 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723183#comment-13723183
 ] 

Alejandro Abdelnur commented on OOZIE-1471:
---

Sounds good.

> Support glob in FS action and prepare blocks
> 
>
> Key: OOZIE-1471
> URL: https://issues.apache.org/jira/browse/OOZIE-1471
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Ryota Egashira
> Fix For: trunk
>
>




[jira] [Commented] (OOZIE-1471) Support glob in FS action and prepare blocks

2013-07-29 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723031#comment-13723031
 ] 

Alejandro Abdelnur commented on OOZIE-1471:
---

Not very thrilled about this (as it can easily hog the NN if using a broad 
wildcard), but if we do it, we should support chgroup and chmod as well.

> Support glob in FS action and prepare blocks
> 
>
> Key: OOZIE-1471
> URL: https://issues.apache.org/jira/browse/OOZIE-1471
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Ryota Egashira
> Fix For: trunk
>
>




[jira] [Commented] (OOZIE-1461) provide an option to auto-deploy launcher jar onto HDFS system libpath

2013-07-23 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717733#comment-13717733
 ] 

Alejandro Abdelnur commented on OOZIE-1461:
---

Doing an upgrade is a series of manual steps and verifications, for upgrades of 
the Oozie server, the sharelibs or both.

By using a different directory every time a sharelib is updated (regardless of 
the Oozie server being upgraded) ...

* push new sharelib to a new HDFS dir
* update oozie-site.xml with new sharelib location
* restart oozie server

... you don't disrupt any running job in the cluster, and the new sharelib dir 
gets picked up after the Oozie server restart.

And these steps can be automated if necessary to avoid human errors.
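As a rough illustration of automating the first two steps, here is a minimal Java sketch. It only computes a fresh directory name and renders the matching oozie-site.xml entry; the base path, the "lib_" prefix and the class name are hypothetical, and the actual HDFS upload and server restart are left out.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class SharelibRollout {
    // Property name taken from the OOZIE-1461 description; everything else is hypothetical.
    static final String LIBPATH_PROP = "oozie.service.WorkflowAppService.system.libpath";

    // Builds a fresh directory name so the previous sharelib dir is never touched.
    static String newSharelibDir(String baseDir, LocalDateTime now) {
        String stamp = now.format(DateTimeFormatter.ofPattern("yyyyMMddHHmmss"));
        return baseDir + "/lib_" + stamp;
    }

    // Renders the oozie-site.xml entry that points at the new directory.
    static String siteProperty(String sharelibDir) {
        return "<property>\n"
             + "  <name>" + LIBPATH_PROP + "</name>\n"
             + "  <value>" + sharelibDir + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        String dir = newSharelibDir("/user/oozie/share", LocalDateTime.now());
        System.out.println("1. upload sharelib to: " + dir);
        System.out.println("2. set in oozie-site.xml:\n" + siteProperty(dir));
        System.out.println("3. restart the Oozie server");
    }
}
```

Because every rollout gets its own directory, jobs already running against the old directory are not disturbed.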

Option #3 seems to me unnecessary complexity that will require checks to ensure 
dangling old dirs are not left behind.

Also, with Oozie HA coming, we'll have to have a switch to enable/disable this 
per Oozie instance (making the config of the instances different), plus we'll 
have to use ZK to communicate to the other Oozie instance where the temp dir is 
for the current run. And if the Oozie instance responsible for pushing the 
launcher jar to the temp dir is restarted, the other Oozie instance will have 
to pick up the new temp dir on the fly.

IMO, this seems to be too much complexity for what it is worth.


> provide an option to auto-deploy launcher jar onto HDFS system libpath
> --
>
> Key: OOZIE-1461
> URL: https://issues.apache.org/jira/browse/OOZIE-1461
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Ryota Egashira
> Fix For: trunk
>
>
> After OOZIE-1311, 1315, when oozie.action.ship.launcher.jar is false, the 
> launcher jar is shipped from the sharelib, but this requires a manual step 
> for admins to upload the jar files onto the sharelib at server-start time, 
> before actually running workflow actions. This JIRA is to provide an option 
> to remove the manual process and make the Oozie server (ActionService) 
> automatically create and upload launcher jar files onto HDFS (a tmp directory 
> under oozie.service.WorkflowAppService.system.libpath), and allow workflow 
> actions to consume them from there. Every time the Oozie server starts, it 
> automatically creates a new directory to upload launcher jars to, and also 
> purges stale directories previously created (older than 7 days, 
> configurable). 
> If the option is false (which is the current default), the behavior is the 
> same as before: launcher jars are provided from the sharelib (when 
> oozie.action.ship.launcher.jar=false) or each workflow action creates and 
> ships its launcher jar (when oozie.action.ship.launcher.jar=true).



[jira] [Assigned] (OOZIE-1461) provide an option to auto-deploy launcher jar onto HDFS system libpath

2013-07-22 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned OOZIE-1461:
-

Assignee: Ryota Egashira  (was: Alejandro Abdelnur)

> provide an option to auto-deploy launcher jar onto HDFS system libpath
> --
>
> Key: OOZIE-1461
> URL: https://issues.apache.org/jira/browse/OOZIE-1461
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Ryota Egashira
> Fix For: trunk
>
>
> After OOZIE-1311, 1315, when oozie.action.ship.launcher.jar is false, the 
> launcher jar is shipped from the sharelib, but this requires a manual step 
> for admins to upload the jar files onto the sharelib at server-start time, 
> before actually running workflow actions. This JIRA is to provide an option 
> to remove the manual process and make the Oozie server (ActionService) 
> automatically create and upload launcher jar files onto HDFS (a tmp directory 
> under oozie.service.WorkflowAppService.system.libpath), and allow workflow 
> actions to consume them from there. Every time the Oozie server starts, it 
> automatically creates a new directory to upload launcher jars to, and also 
> purges stale directories previously created (older than 7 days, 
> configurable). 
> If the option is false (which is the current default), the behavior is the 
> same as before: launcher jars are provided from the sharelib (when 
> oozie.action.ship.launcher.jar=false) or each workflow action creates and 
> ships its launcher jar (when oozie.action.ship.launcher.jar=true).



[jira] [Assigned] (OOZIE-1461) provide an option to auto-deploy launcher jar onto HDFS system libpath

2013-07-22 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur reassigned OOZIE-1461:
-

Assignee: Alejandro Abdelnur  (was: Ryota Egashira)

> provide an option to auto-deploy launcher jar onto HDFS system libpath
> --
>
> Key: OOZIE-1461
> URL: https://issues.apache.org/jira/browse/OOZIE-1461
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Alejandro Abdelnur
> Fix For: trunk
>
>
> After OOZIE-1311, 1315, when oozie.action.ship.launcher.jar is false, the 
> launcher jar is shipped from the sharelib, but this requires a manual step 
> for admins to upload the jar files onto the sharelib at server-start time, 
> before actually running workflow actions. This JIRA is to provide an option 
> to remove the manual process and make the Oozie server (ActionService) 
> automatically create and upload launcher jar files onto HDFS (a tmp directory 
> under oozie.service.WorkflowAppService.system.libpath), and allow workflow 
> actions to consume them from there. Every time the Oozie server starts, it 
> automatically creates a new directory to upload launcher jars to, and also 
> purges stale directories previously created (older than 7 days, 
> configurable). 
> If the option is false (which is the current default), the behavior is the 
> same as before: launcher jars are provided from the sharelib (when 
> oozie.action.ship.launcher.jar=false) or each workflow action creates and 
> ships its launcher jar (when oozie.action.ship.launcher.jar=true).



[jira] [Commented] (OOZIE-1461) provide an option to auto-deploy launcher jar onto HDFS system libpath

2013-07-22 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715820#comment-13715820
 ] 

Alejandro Abdelnur commented on OOZIE-1461:
---

Why do we need #3? I think this is not necessary. If you want a performance 
boost, just use #2. If you are using sharelibs, you can just set the config for 
#2 and it will work; there is nothing additional to do.

> provide an option to auto-deploy launcher jar onto HDFS system libpath
> --
>
> Key: OOZIE-1461
> URL: https://issues.apache.org/jira/browse/OOZIE-1461
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Ryota Egashira
>Assignee: Ryota Egashira
> Fix For: trunk
>
>
> After OOZIE-1311, 1315, when oozie.action.ship.launcher.jar is false, the 
> launcher jar is shipped from the sharelib, but this requires a manual step 
> for admins to upload the jar files onto the sharelib at server-start time, 
> before actually running workflow actions. This JIRA is to provide an option 
> to remove the manual process and make the Oozie server (ActionService) 
> automatically create and upload launcher jar files onto HDFS (a tmp directory 
> under oozie.service.WorkflowAppService.system.libpath), and allow workflow 
> actions to consume them from there. Every time the Oozie server starts, it 
> automatically creates a new directory to upload launcher jars to, and also 
> purges stale directories previously created (older than 7 days, 
> configurable). 
> If the option is false (which is the current default), the behavior is the 
> same as before: launcher jars are provided from the sharelib (when 
> oozie.action.ship.launcher.jar=false) or each workflow action creates and 
> ships its launcher jar (when oozie.action.ship.launcher.jar=true).



[jira] [Commented] (OOZIE-1440) Build fails in certain environments due to xerces OpenJPA issue

2013-07-01 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697235#comment-13697235
 ] 

Alejandro Abdelnur commented on OOZIE-1440:
---

+1

> Build fails in certain environments due to xerces OpenJPA issue
> ---
>
> Key: OOZIE-1440
> URL: https://issues.apache.org/jira/browse/OOZIE-1440
> Project: Oozie
>  Issue Type: Bug
>  Components: build
>Affects Versions: trunk
>Reporter: Robert Kanter
>Assignee: Sean Mackrory
> Fix For: trunk
>
> Attachments: 
> 0001-OOZIE-1440.-Build-fails-in-certain-environments-due-.patch, 
> 0001-OOZIE-1440.-Build-fails-in-certain-environments-due-.patch, 
> OOZIE-1440_amendment.patch, OOZIE-1440.patch
>
>
> We've been seeing some build failures on some machines due to an issue with 
> OpenJPA (specifically, the changes made in OOZIE-1377).  
> We see these two cryptic errors in the output from the build:
> {{[INFO] --- openjpa-maven-plugin:2.2.2:enhance (enhancer) @ oozie-core ---}}
> {{An error occurred while attempting to determine the version of 
> "file:/var/lib/jenkins/workspace/build/oozie/4.1.0-SNAPSHOT/source/core/target/classes/META-INF/persistence.xml".}}
> {{[ERROR] Failed to execute goal 
> org.apache.openjpa:openjpa-maven-plugin:2.2.2:enhance (enhancer) on project 
> oozie-core: Execution enhancer of goal 
> org.apache.openjpa:openjpa-maven-plugin:2.2.2:enhance failed: 
> org.apache.openjpa.persistence.PersistenceProductDerivation:java.lang.ClassCastException:
>  org.apache.xerces.parsers.XIncludeAwareParserConfiguration cannot be cast to 
> org.apache.xerces.xni.parser.XMLParserConfiguration -> [Help 1]}}
> I saw that RAVE-245 had a similar error and their solution was to add Xerces 
> as a dependency on the OpenJPA plugin.  This prevents a classpath env issue 
> from sometimes choosing the built-in buggy version of Xerces that Java is 
> using by default (see OOZIE-1017), and instead to use the better version that 
> we've been using elsewhere.  



[jira] [Commented] (OOZIE-1434) Commands not releasing memory lock

2013-06-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695808#comment-13695808
 ] 

Alejandro Abdelnur commented on OOZIE-1434:
---

If the JDBC driver does not support it, then it should be something to 
configure on the SQLServer itself; if it is not there either, then we have no 
control from our side.

> Commands not releasing memory lock
> --
>
> Key: OOZIE-1434
> URL: https://issues.apache.org/jira/browse/OOZIE-1434
> Project: Oozie
>  Issue Type: Bug
>Reporter: Virag Kothari
>Assignee: Virag Kothari
>
> If the command gets stuck in making a db call during loadState() after 
> acquiring the lock, the lock will never be released and the job will be stuck 
> forever. 
> Need to introduce a timeout for the database call.
> There might also be an issue with the implementation of MemoryLocks.



[jira] [Commented] (OOZIE-1434) Commands not releasing memory lock

2013-06-27 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694822#comment-13694822
 ] 

Alejandro Abdelnur commented on OOZIE-1434:
---

BTW, if OpenJPA or the JDBC driver does not time out the query, and the command 
does not complete, the lock should be held. I don't think we should change that 
behavior.

> Commands not releasing memory lock
> --
>
> Key: OOZIE-1434
> URL: https://issues.apache.org/jira/browse/OOZIE-1434
> Project: Oozie
>  Issue Type: Bug
>Reporter: Virag Kothari
>Assignee: Virag Kothari
>
> If the command gets stuck in making a db call during loadState() after 
> acquiring the lock, the lock will never be released and the job will be stuck 
> forever. 
> Need to introduce a timeout for the database call.
> There might also be an issue with the implementation of MemoryLocks.



[jira] [Commented] (OOZIE-1434) Commands not releasing memory lock

2013-06-27 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694819#comment-13694819
 ] 

Alejandro Abdelnur commented on OOZIE-1434:
---

On the locking issue: it was not with memory locks but with the XCommand 
locking logic; this was fixed in OOZIE-1051.

From OpenJPA docs:


1.8.3.  Query Timeout Hint

To specify a query timeout hint in milliseconds to those database drivers that 
support it, specify a hint name of "javax.persistence.query.timeout" with an 
integer value greater than zero, or zero for no timeout which is the default 
behavior.


It seems the JDBC driver must support the timeout. From quickly looking at the 
OpenJPA dictionaries for the DBs we support, postgres does not support this 
option.

This property should be set in the properties used to create the EMF.

As we chatted after the meetup, I think we should modify the JPAService 
so all oozie-site properties starting with:

{code}
oozie.service.JPAService.jpaconf.NAME=VALUE
{code}

are seeded into the properties object used to create the EMF as NAME=VALUE.

By doing this we can fine tune openjpa without having to modify Oozie.
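A minimal sketch of that seeding logic, assuming the oozie-site values are already available as a java.util.Properties object. The class and method names are hypothetical and this is not the actual JPAService code; it only shows the prefix stripping.

```java
import java.util.Properties;

final class JpaConfSeeder {
    // Prefix proposed in the comment above.
    static final String PREFIX = "oozie.service.JPAService.jpaconf.";

    // Copies every PREFIX + NAME = VALUE entry into a new Properties as NAME = VALUE.
    static Properties extract(Properties oozieSite) {
        Properties jpaProps = new Properties();
        for (String key : oozieSite.stringPropertyNames()) {
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                jpaProps.setProperty(key.substring(PREFIX.length()),
                                     oozieSite.getProperty(key));
            }
        }
        return jpaProps;
    }

    public static void main(String[] args) {
        Properties site = new Properties();
        // Example entry: the query timeout hint quoted from the OpenJPA docs.
        site.setProperty(PREFIX + "javax.persistence.query.timeout", "30000");
        // An unrelated entry that must not be copied.
        site.setProperty("oozie.service.JPAService.pool.max.active.conn", "10");
        System.out.println(extract(site));
    }
}
```

The resulting object would then be merged into the properties used to create the EMF, so a query timeout could be configured from oozie-site alone, for drivers that honor it.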

 


> Commands not releasing memory lock
> --
>
> Key: OOZIE-1434
> URL: https://issues.apache.org/jira/browse/OOZIE-1434
> Project: Oozie
>  Issue Type: Bug
>Reporter: Virag Kothari
>Assignee: Virag Kothari
>
> If the command gets stuck in making a db call during loadState() after 
> acquiring the lock, the lock will never be released and the job will be stuck 
> forever. 
> Need to introduce a timeout for the database call.
> There might also be an issue with the implementation of MemoryLocks.



[jira] [Commented] (OOZIE-1415) Reassess Oozie instrumentation for performance analysis

2013-06-26 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694472#comment-13694472
 ] 

Alejandro Abdelnur commented on OOZIE-1415:
---

As I mentioned after the meetup, we could instead use 
http://metrics.codahale.com/ for all instrumentation. I think it is worth the 
refactoring.

> Reassess Oozie instrumentation for performance analysis
> ---
>
> Key: OOZIE-1415
> URL: https://issues.apache.org/jira/browse/OOZIE-1415
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Oozie instrumentation needs some reassessment to figure out what information 
> is still relevant and what needs to be added because of the multitude of new 
> features added. This will help in performance analysis and lead the way to 
> figure out performance optimizations



[jira] [Commented] (OOZIE-1319) "LAST_ONLY" in execution control for coordinator job still runs all the actions

2013-06-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684042#comment-13684042
 ] 

Alejandro Abdelnur commented on OOZIE-1319:
---

Bowen, could you please check if the failure is related or not?

> "LAST_ONLY" in execution control for coordinator job still runs all the 
> actions
> ---
>
> Key: OOZIE-1319
> URL: https://issues.apache.org/jira/browse/OOZIE-1319
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1319.patch
>
>
> In execute() of CoordJobGetReadyActionsJPAExecutor.java, once we retrieve the 
> top item from a "LIFO" query result, we do not discard or delete the 
> remaining items from the result list. As a result, the next time execute() is 
> invoked, we will be retrieving the next item in line. Consequently, LAST_ONLY 
> strategy will also execute all ready actions for a given coordinator job, 
> making it no different than LIFO.
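To make the intended difference concrete, here is a minimal Java sketch of a LAST_ONLY selection over a list of ready actions ordered newest first. The class, the use of plain String ids and the idea of returning the discarded ids are assumptions for illustration; this is not the CoordJobGetReadyActionsJPAExecutor code, and how Oozie should mark the discarded actions is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

final class LastOnlySketch {
    // Result of one selection pass: the action to run and the ones to drop.
    static final class Selection {
        final String toRun;
        final List<String> toDiscard;

        Selection(String toRun, List<String> toDiscard) {
            this.toRun = toRun;
            this.toDiscard = toDiscard;
        }
    }

    // readyNewestFirst: ready action ids ordered newest first (the "LIFO" query order).
    static Selection selectLastOnly(List<String> readyNewestFirst) {
        if (readyNewestFirst.isEmpty()) {
            return new Selection(null, new ArrayList<>());
        }
        String newest = readyNewestFirst.get(0);
        // Everything older than the newest action must not be picked up by a later call.
        List<String> older =
                new ArrayList<>(readyNewestFirst.subList(1, readyNewestFirst.size()));
        return new Selection(newest, older);
    }
}
```

With LIFO the older ids would stay ready and be returned by later calls; with LAST_ONLY they are handed back for discarding in the same pass.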



[jira] [Commented] (OOZIE-1418) Bug fix

2013-06-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683668#comment-13683668
 ] 

Alejandro Abdelnur commented on OOZIE-1418:
---

please update summary to the bug itself (sadly this is not the only bug we 
found ;) )

> Bug fix
> ---
>
> Key: OOZIE-1418
> URL: https://issues.apache.org/jira/browse/OOZIE-1418
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Mona Chitnis
>Assignee: Mona Chitnis
> Fix For: trunk
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Bug fixes for the following: 
> 1. When a workflow action is killed via ActionKillXCommand, the end time of 
> the action is not set. You can see it even now for KILLED actions on the 
> web-console. This is leading to NPE in Sla calculation
> 2. V2SLAServlet constructor had wrong argument - failing AuthFilter init() 
> 3. Handle case on Coordinator rerun, where sla job is rerun with a different 
> conf without sla, thus having no SLA registration record to update
> 4. Exception handling for worker threads spawned from EventHandlerService and 
> SLAService, to avoid thread from dying and quitting processing any events in 
> its queue.



[jira] [Commented] (OOZIE-1414) Configuring Oozie for HTTPS still allows HTTP connections to all resources

2013-06-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683530#comment-13683530
 ] 

Alejandro Abdelnur commented on OOZIE-1414:
---

+1, LGTM

> Configuring Oozie for HTTPS still allows HTTP connections to all resources
> --
>
> Key: OOZIE-1414
> URL: https://issues.apache.org/jira/browse/OOZIE-1414
> Project: Oozie
>  Issue Type: Bug
>  Components: security
>Reporter: Robert Kanter
>Assignee: Robert Kanter
>Priority: Blocker
> Fix For: trunk, 4.0.0
>
> Attachments: OOZIE-1414.patch
>
>
> When you run {{oozie-setup.sh prepare-war -secure}} it is supposed to replace 
> server.xml with ssl-server.xml (in the oozie-server/conf/ dir) and web.xml 
> with ssl-web.xml (in the WAR file).
> OOZIE-670 changed oozie-setup.sh to prepare the war file without calling 
> addtowar.sh.  However, the code added by OOZIE-1233 and OOZIE-1268 still 
> delegates replacing web.xml with ssl-web.xml to addtowar.sh, which 
> oozie-setup.sh no longer calls.
> Therefore, when you try to configure Oozie for HTTPS, it will use the 
> original web.xml file; which means that {color:red}all resources are 
> accessible from both HTTPS and *HTTP*.{color}
> This isn't an issue in Oozie 3.3.2 because it didn't include OOZIE-670, so 
> addtowar.sh was still called.



[jira] [Commented] (OOZIE-1416) verify if MAPREDUCE-5199 impacts launcher jobs

2013-06-14 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13683431#comment-13683431
 ] 

Alejandro Abdelnur commented on OOZIE-1416:
---

Sounds good. Do we then have to (in the future) remove the logic in the 
MR/Pig/Hive main classes that sets the ENV var to the token file?

> verify if MAPREDUCE-5199 impacts launcher jobs
> --
>
> Key: OOZIE-1416
> URL: https://issues.apache.org/jira/browse/OOZIE-1416
> Project: Oozie
>  Issue Type: Bug
>  Components: workflow
>Affects Versions: trunk
>Reporter: Alejandro Abdelnur
>Priority: Blocker
>
> MAPREDUCE-5199 does away with the tokens file.
> We should verify that this does not affect how Oozie works and, if it does, 
> fix things.



[jira] [Created] (OOZIE-1416) verify if MAPREDUCE-5199 impacts launcher jobs

2013-06-14 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created OOZIE-1416:
-

 Summary: verify if MAPREDUCE-5199 impacts launcher jobs
 Key: OOZIE-1416
 URL: https://issues.apache.org/jira/browse/OOZIE-1416
 Project: Oozie
  Issue Type: Bug
  Components: workflow
Affects Versions: trunk
Reporter: Alejandro Abdelnur
Priority: Blocker


MAPREDUCE-5199 does away with the tokens file.

We should verify that this does not affect how Oozie works and, if it does, fix 
things.



[jira] [Commented] (OOZIE-1415) Reassess Oozie instrumentation for performance analysis

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682870#comment-13682870
 ] 

Alejandro Abdelnur commented on OOZIE-1415:
---

yeah!!!

And we have to fix the timers. They compute the avg/stddev from the start time 
of the Oozie server. That is useless: the longer the server runs, the more the 
values converge to a point and never move, even if things are getting funny now.

I have some code that computes the avg time of the last N crons. By looking at 
the instrumentation log it will then be possible to see trends.
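A minimal sketch of that idea, assuming durations are reported in milliseconds. This is not the code mentioned above; the class name and API are hypothetical. It keeps only the last N samples, so the average follows recent behavior instead of the whole server lifetime.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class WindowedTimer {
    private final int window;
    private final Deque<Long> samples = new ArrayDeque<>();
    private long sum;

    WindowedTimer(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.window = window;
    }

    // Records one duration (ms), evicting the oldest sample once the window is full.
    synchronized void add(long durationMs) {
        samples.addLast(durationMs);
        sum += durationMs;
        if (samples.size() > window) {
            sum -= samples.removeFirst();
        }
    }

    // Average over at most the last `window` samples; 0 if nothing was recorded yet.
    synchronized double average() {
        return samples.isEmpty() ? 0.0 : (double) sum / samples.size();
    }
}
```

With a window of 3, adding 10, 20, 30 and then 100 moves the average from 20 to 50, whereas an average since server start would move far less after a long uptime.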

I can either create a JIRA for this or upload a patch with this change here, 
and you can take it from there.

> Reassess Oozie instrumentation for performance analysis
> ---
>
> Key: OOZIE-1415
> URL: https://issues.apache.org/jira/browse/OOZIE-1415
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk
>Reporter: Mona Chitnis
> Fix For: trunk
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Oozie instrumentation needs some reassessment to figure out what information 
> is still relevant and what needs to be added because of the multitude of new 
> features added. This will help in performance analysis and lead the way to 
> figure out performance optimizations



[jira] [Commented] (OOZIE-1319) "LAST_ONLY" in execution control for coordinator job still runs all the actions

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682869#comment-13682869
 ] 

Alejandro Abdelnur commented on OOZIE-1319:
---

+1, pending jenkins, kicking off jenkins manually.

> "LAST_ONLY" in execution control for coordinator job still runs all the 
> actions
> ---
>
> Key: OOZIE-1319
> URL: https://issues.apache.org/jira/browse/OOZIE-1319
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1319.patch
>
>
> In execute() of CoordJobGetReadyActionsJPAExecutor.java, once we retrieve the 
> top item from a "LIFO" query result, we do not discard or delete the 
> remaining items from the result list. As a result, the next time execute() is 
> invoked, we will be retrieving the next item in line. Consequently, LAST_ONLY 
> strategy will also execute all ready actions for a given coordinator job, 
> making it no different than LIFO.



[jira] [Commented] (OOZIE-1349) oozieCLI -Doozie.auth.token.cache doesn't work

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682852#comment-13682852
 ] 

Alejandro Abdelnur commented on OOZIE-1349:
---

+1. Jenkins is not working. I'll commit this; the change is trivial.

> oozieCLI -Doozie.auth.token.cache doesn't work
> --
>
> Key: OOZIE-1349
> URL: https://issues.apache.org/jira/browse/OOZIE-1349
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1349.patch, oozie-1349.patch
>
>
> under main method in OozieCLI.java, instead of calling 
> System.getProperties().contains(AuthOozieClient.USE_AUTH_TOKEN_CACHE_SYS_PROP),
>  we should call containsKey since we are checking if the key is set or not, 
> not if the value is set or not.
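The difference can be shown with a small Java sketch. java.util.Properties extends Hashtable, whose contains(Object) searches the values, while containsKey(Object) searches the keys. The class and method names are hypothetical, and the property name is assumed from the issue title.

```java
import java.util.Properties;

final class ContainsVsContainsKey {
    // Property name assumed from the issue title (-Doozie.auth.token.cache).
    static final String KEY = "oozie.auth.token.cache";

    // Mirrors the reported bug: contains() looks for KEY among the VALUES.
    static boolean buggyCheck(Properties props) {
        return props.contains(KEY);
    }

    // Mirrors the proposed fix: containsKey() looks for KEY among the KEYS.
    static boolean fixedCheck(Properties props) {
        return props.containsKey(KEY);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "true");
        System.out.println("contains:    " + buggyCheck(props));
        System.out.println("containsKey: " + fixedCheck(props));
    }
}
```

With the property set to "true", the first check reports false and the second reports true, which matches the symptom described in the summary.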



[jira] [Commented] (OOZIE-1349) oozieCLI -Doozie.auth.token.cache doesn't work

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682354#comment-13682354
 ] 

Alejandro Abdelnur commented on OOZIE-1349:
---

Bowen, can you please verify the patch still applies and rebase the patch if 
necessary?

> oozieCLI -Doozie.auth.token.cache doesn't work
> --
>
> Key: OOZIE-1349
> URL: https://issues.apache.org/jira/browse/OOZIE-1349
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1349.patch
>
>
> under main method in OozieCLI.java, instead of calling 
> System.getProperties().contains(AuthOozieClient.USE_AUTH_TOKEN_CACHE_SYS_PROP),
>  we should call containsKey since we are checking if the key is set or not, 
> not if the value is set or not.



[jira] [Commented] (OOZIE-1398) [Scale] Reduce the number of CLOB columns used

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682328#comment-13682328
 ] 

Alejandro Abdelnur commented on OOZIE-1398:
---

Got it. So we should take care of the 3rd one then, maybe by introducing a 
constant in JPAService that defines the DB version and that the DB tool 
checks/sets. So when we do a DB change we change the constant value and the DB 
tool will just do the check for the new version. In the JPAService we should 
check the value of that DB version using direct JDBC, so as not to have a 
mapping conflict if the DB is of an earlier version (I'm not sure, but JPA 
could fail to initialize).
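A minimal sketch of the two checks, assuming the DB version is a plain integer. The class name, the constant value and the way the stored version is obtained are all hypothetical; the direct JDBC lookup itself is left out.

```java
final class DbVersionCheck {
    // Hypothetical constant; the comment proposes keeping it in JPAService.
    static final int EXPECTED_DB_VERSION = 2;

    // DB tool side: upgrade only when the stored version is older than the expected one.
    static boolean needsUpgrade(int storedVersion) {
        return storedVersion < EXPECTED_DB_VERSION;
    }

    // Server side: refuse to start unless the stored version matches exactly.
    static void verifyAtStartup(int storedVersion) {
        if (storedVersion != EXPECTED_DB_VERSION) {
            throw new IllegalStateException("DB version " + storedVersion
                    + " does not match expected version " + EXPECTED_DB_VERSION);
        }
    }
}
```

Bumping the constant on each schema change would then be the only edit needed for both the DB tool and the startup check to pick up the new version.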

> [Scale] Reduce the number of CLOB columns used
> --
>
> Key: OOZIE-1398
> URL: https://issues.apache.org/jira/browse/OOZIE-1398
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk, 3.3.2
>Reporter: Rohini Palaniswamy
>Assignee: Ryota Egashira
> Fix For: trunk
>
> Attachments: OOZIE-1398-v4.patch, OOZIE-1398-v5.patch, 
> OOZIE-1398-v7.patch, OOZIE-1398-v8.patch, OOZIE-1398-v8.patch
>
>
>   When the number of concurrent submissions on Oozie increased to 100-200 per 
> minute, it was not able to scale and we hit Oracle issues as there were a lot 
> of CLOB columns and the DB became a bottleneck.



[jira] [Commented] (OOZIE-1398) [Scale] Reduce the number of CLOB columns used

2013-06-13 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682236#comment-13682236
 ] 

Alejandro Abdelnur commented on OOZIE-1398:
---

There are a few things missed in the committed patch.

* The DBTOOL should set a new DB version after doing the upgrade.
* The DBTOOL should check, before doing the upgrade, that the DB version is 
lower than the current one.
* Also, we should add in the JPAService a check to fail startup if the DB 
version is not the correct one.

The first 2 should be a follow-up JIRA to this one; the 3rd one should be a new 
JIRA.

> [Scale] Reduce the number of CLOB columns used
> --
>
> Key: OOZIE-1398
> URL: https://issues.apache.org/jira/browse/OOZIE-1398
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: trunk, 3.3.2
>Reporter: Rohini Palaniswamy
>Assignee: Ryota Egashira
> Fix For: trunk
>
> Attachments: OOZIE-1398-v4.patch, OOZIE-1398-v5.patch, 
> OOZIE-1398-v7.patch, OOZIE-1398-v8.patch, OOZIE-1398-v8.patch
>
>
>   When the number of concurrent submissions on Oozie increased to 100-200 per 
> minute, it was not able to scale and we hit Oracle issues as there were a lot 
> of CLOB columns and the DB became a bottleneck.



[jira] [Resolved] (OOZIE-1411) upgrade to OpenJPA 2.2.1

2013-06-12 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur resolved OOZIE-1411.
---

Resolution: Invalid

Misread version. The previous patch took OpenJPA to 2.2.2, which is not affected.

> upgrade to OpenJPA 2.2.1
> 
>
> Key: OOZIE-1411
> URL: https://issues.apache.org/jira/browse/OOZIE-1411
> Project: Oozie
>  Issue Type: Bug
>  Components: build
>Affects Versions: trunk
>Reporter: Alejandro Abdelnur
>
> This just came up in the openjpa alias:
> CVE-2013-1768: Apache OpenJPA security vulnerability
> Severity: Important
> Vendor: The Apache Software Foundation
> Versions Affected:
> OpenJPA 1.0.0 to 1.0.4
> OpenJPA 1.1.0
> OpenJPA 1.3.0
> OpenJPA 1.2.0 to 1.2.2
> OpenJPA 2.0.0 to 2.0.1
> OpenJPA 2.1.0 to 2.1.1
> OpenJPA 2.2.0 to 2.2.1
> Description: Deserialization of a maliciously crafted OpenJPA object can
> result in an executable file being written to the file system. An
> attacker needs to discover an unprotected server program to exploit the
> vulnerability.  It then needs to exploit another unprotected server
> program to execute the file and gain access to the system.  OpenJPA
> usage by itself does not introduce the vulnerability.



[jira] [Created] (OOZIE-1411) upgrade to OpenJPA 2.2.1

2013-06-12 Thread Alejandro Abdelnur (JIRA)
Alejandro Abdelnur created OOZIE-1411:
-

 Summary: upgrade to OpenJPA 2.2.1
 Key: OOZIE-1411
 URL: https://issues.apache.org/jira/browse/OOZIE-1411
 Project: Oozie
  Issue Type: Bug
  Components: build
Affects Versions: trunk
Reporter: Alejandro Abdelnur


This just came up in the openjpa alias:

CVE-2013-1768: Apache OpenJPA security vulnerability

Severity: Important

Vendor: The Apache Software Foundation

Versions Affected:

OpenJPA 1.0.0 to 1.0.4
OpenJPA 1.1.0
OpenJPA 1.3.0
OpenJPA 1.2.0 to 1.2.2
OpenJPA 2.0.0 to 2.0.1
OpenJPA 2.1.0 to 2.1.1
OpenJPA 2.2.0 to 2.2.1

Description: Deserialization of a maliciously crafted OpenJPA object can
result in an executable file being written to the file system. An
attacker needs to discover an unprotected server program to exploit the
vulnerability.  It then needs to exploit another unprotected server
program to execute the file and gain access to the system.  OpenJPA
usage by itself does not introduce the vulnerability.





[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-08 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: OOZIE-1377.patch

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch, 
> OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-08 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: (was: OOZIE-1377.patch)

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch, 
> OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-08 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: OOZIE-1377.patch

and now getting rid of the trailing spaces.

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch, 
> OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Commented] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-08 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678837#comment-13678837
 ] 

Alejandro Abdelnur commented on OOZIE-1377:
---

test failures unrelated, i'll remove trailing space on commit.

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch, 
> OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: OOZIE-1377.patch

making the openjpa plugin use the version variable and trimming trailing spaces.

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch, 
> OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: OOZIE-1377.patch

reupdating patch as I did not save the core/pom.xml after final changes.

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Commented] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-07 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678609#comment-13678609
 ] 

Alejandro Abdelnur commented on OOZIE-1377:
---

not strictly necessary, the plugin can be used for different versions, but 
will do it to get free plugin fixes.

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2

2013-06-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Summary: OpenJPA runtime enhancement should be disabled and update OpenJPA 
to 2.2.2  (was: OpenJPA runtime enhancement should be disabled)

> OpenJPA runtime enhancement should be disabled and update OpenJPA to 2.2.2
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Updated] (OOZIE-1377) OpenJPA runtime enhancement should be disabled

2013-06-07 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated OOZIE-1377:
--

Attachment: OOZIE-1377.patch

updated patch that bumps up the version of openjpa to 2.2.2. With openjpa 2.1.0 
we were still seeing OOM issues, with 2.2.0 not anymore.

> OpenJPA runtime enhancement should be disabled
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch, OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Commented] (OOZIE-1398) [Scale] Reduce the number of CLOB columns used

2013-06-06 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677655#comment-13677655
 ] 

Alejandro Abdelnur commented on OOZIE-1398:
---

Sounds good, we can drop the AUTH_TOKEN columns altogether. They were thought 
up very early on, when the Hadoop DT token story was still not fully baked and 
we had the idea that the client would have to give the DT token to oozie on 
submission.

We have to see if OpenJPA supports making the type change; otherwise the 
ooziedb tool will have to handle the change for each DB type using the 
corresponding SQL.

> [Scale] Reduce the number of CLOB columns used
> --
>
> Key: OOZIE-1398
> URL: https://issues.apache.org/jira/browse/OOZIE-1398
> Project: Oozie
>  Issue Type: Improvement
>Affects Versions: 3.3.2
>Reporter: Rohini Palaniswamy
> Fix For: trunk
>
>
>   When the number of concurrent submissions on Oozie increased to 100-200 per 
> minute, it was not able to scale and we hit Oracle issues, as there were a lot 
> of CLOB columns and the DB became a bottleneck.



[jira] [Commented] (OOZIE-1305) coordinator job should have an option to recover none of actions after downtime

2013-05-31 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671840#comment-13671840
 ] 

Alejandro Abdelnur commented on OOZIE-1305:
---

we may need to do that in the dbtool when updating from a previous version, 
right?

> coordinator job should have an option to recover none of actions after 
> downtime
> ---
>
> Key: OOZIE-1305
> URL: https://issues.apache.org/jira/browse/OOZIE-1305
> Project: Oozie
>  Issue Type: New Feature
>Reporter: Bowen Zhang
> Attachments: oozierecoverydesigndoc.pdf, oozierecoverydesigndoc.pdf
>
>
> after oozie server is back up after some down time, coordinator job should 
> have the option of recovering no actions. We will add a "NONE" option in 
> execution strategy of controls section of a coordinator job



[jira] [Commented] (OOZIE-1305) coordinator job should have an option to recover none of actions after downtime

2013-05-30 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13670660#comment-13670660
 ] 

Alejandro Abdelnur commented on OOZIE-1305:
---

The idea seems OK. Instead of creating a new HEARTBEAT table, why not use the 
OOZIE_SYS table (created by OozieDBCLI)? We would just have to define the 
proper Bean for it, then we can have a row there to keep the heartbeat.

> coordinator job should have an option to recover none of actions after 
> downtime
> ---
>
> Key: OOZIE-1305
> URL: https://issues.apache.org/jira/browse/OOZIE-1305
> Project: Oozie
>  Issue Type: New Feature
>Reporter: Bowen Zhang
> Attachments: oozierecoverydesigndoc.pdf
>
>
> after oozie server is back up after some down time, coordinator job should 
> have the option of recovering no actions. We will add a "NONE" option in 
> execution strategy of controls section of a coordinator job



[jira] [Commented] (OOZIE-1319) "LAST_ONLY" in execution control for coordinator job still runs all the actions

2013-05-29 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669883#comment-13669883
 ] 

Alejandro Abdelnur commented on OOZIE-1319:
---

no need to reupload, i've kicked the build for this patch manually.

> "LAST_ONLY" in execution control for coordinator job still runs all the 
> actions
> ---
>
> Key: OOZIE-1319
> URL: https://issues.apache.org/jira/browse/OOZIE-1319
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1319.patch
>
>
> In execute() of CoordJobGetReadyActionsJPAExecutor.java, once we retrieve the 
> top item from a "LIFO" query result, we do not discard or delete the 
> remaining items from the result list. As a result, the next time execute() is 
> invoked, we will be retrieving the next item in line. Consequently, LAST_ONLY 
> strategy will also execute all ready actions for a given coordinator job, 
> making it no different than LIFO.
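
The difference described in the issue can be shown with a small in-memory sketch. It assumes the ready actions arrive newest first, and it models "discarding" as dropping entries from a queue; how the real executor would discard them is not stated in the issue. All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; not the CoordJobGetReadyActionsJPAExecutor code.
class LastOnlySketch {
    // LIFO: each call hands out the newest ready action and leaves the rest queued.
    static String nextLifo(Deque<String> readyNewestFirst) {
        return readyNewestFirst.pollFirst();
    }

    // LAST_ONLY as intended: hand out the newest action and drop all older ones.
    static String nextLastOnly(Deque<String> readyNewestFirst) {
        String newest = readyNewestFirst.pollFirst();
        readyNewestFirst.clear();
        return newest;
    }

    // Keep asking for the next action until nothing is ready, recording what ran.
    static List<String> drain(Deque<String> ready, boolean lastOnly) {
        List<String> ran = new ArrayList<>();
        while (!ready.isEmpty()) {
            ran.add(lastOnly ? nextLastOnly(ready) : nextLifo(ready));
        }
        return ran;
    }

    // Three ready actions, newest first.
    static Deque<String> sample() {
        Deque<String> ready = new ArrayDeque<>();
        ready.addLast("action-3");
        ready.addLast("action-2");
        ready.addLast("action-1");
        return ready;
    }

    public static void main(String[] args) {
        System.out.println("LIFO ran:      " + drain(sample(), false));
        System.out.println("LAST_ONLY ran: " + drain(sample(), true));
    }
}
```

Without the clear() step the two strategies run the same three actions, which is the behavior the issue reports.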



[jira] [Commented] (OOZIE-1349) oozieCLI -Doozie.auth.token.cache doesn't work

2013-05-29 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669872#comment-13669872
 ] 

Alejandro Abdelnur commented on OOZIE-1349:
---

+1, pending jenkins.

> oozieCLI -Doozie.auth.token.cache doesn't work
> --
>
> Key: OOZIE-1349
> URL: https://issues.apache.org/jira/browse/OOZIE-1349
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1349.patch
>
>
> under main method in OozieCLI.java, instead of calling 
> System.getProperties().contains(AuthOozieClient.USE_AUTH_TOKEN_CACHE_SYS_PROP),
>  we should call containsKey since we are checking if the key is set or not, 
> not if the value is set or not.
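
The distinction in the description can be reproduced with java.util.Properties alone. The sketch below uses a made-up property name rather than the real AuthOozieClient constant, so it only demonstrates the contains() versus containsKey() behavior, not the OozieCLI code path.

```java
import java.util.Properties;

class ContainsVsContainsKey {
    // Hypothetical property name used only for this demo.
    static final String KEY = "demo.auth.token.cache";

    // A Properties object with the demo key set to "true".
    static Properties sample() {
        Properties props = new Properties();
        props.setProperty(KEY, "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = sample();
        // contains() is inherited from Hashtable and searches the values.
        System.out.println(props.contains(KEY));
        // containsKey() searches the keys, which is what a "-Dkey" check needs.
        System.out.println(props.containsKey(KEY));
        // The value "true" is found by contains().
        System.out.println(props.contains("true"));
    }
}
```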



[jira] [Commented] (OOZIE-1319) "LAST_ONLY" in execution control for coordinator job still runs all the actions

2013-05-28 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13668822#comment-13668822
 ] 

Alejandro Abdelnur commented on OOZIE-1319:
---

Apologies for the delay. Patch LGTM. One more thing we may have to take care 
of is the purge logic, can you please check this?

Any reason why it has not been marked as 'submit patch' for test-patch to run?


> "LAST_ONLY" in execution control for coordinator job still runs all the 
> actions
> ---
>
> Key: OOZIE-1319
> URL: https://issues.apache.org/jira/browse/OOZIE-1319
> Project: Oozie
>  Issue Type: Bug
>Reporter: Bowen Zhang
>Assignee: Bowen Zhang
> Attachments: oozie-1319.patch
>
>
> In execute() of CoordJobGetReadyActionsJPAExecutor.java, once we retrieve the 
> top item from a "LIFO" query result, we do not discard or delete the 
> remaining items from the result list. As a result, the next time execute() is 
> invoked, we will be retrieving the next item in line. Consequently, LAST_ONLY 
> strategy will also execute all ready actions for a given coordinator job, 
> making it no different than LIFO.



[jira] [Commented] (OOZIE-1387) Proxysubmission from the Oozie client doesn't allow the mapreduce API

2013-05-21 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663394#comment-13663394
 ] 

Alejandro Abdelnur commented on OOZIE-1387:
---

+1 pending jenkins.

> Proxysubmission from the Oozie client doesn't allow the mapreduce API
> -
>
> Key: OOZIE-1387
> URL: https://issues.apache.org/jira/browse/OOZIE-1387
> Project: Oozie
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.3.2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Fix For: trunk
>
> Attachments: OOZIE-1387.patch
>
>
> The OozieClient looks for {{mapred.mapper.class}} and 
> {{mapred.reducer.class}} so they have to be specified. If a user wants to use 
> a mapper/reducer that inherits from mapreduce.Mapper/Reducer (i.e. mapreduce 
> API), Oozie will force them to use mapred.mapper/reducer.class to specify 
> their mapper/reducer, which will make Hadoop throw an exception during the 
> job.
> We should update the checking to allow either {{mapred.mapper/reducer.class}} 
> or {{mapreduce.mapper/reducer.class}} as long as one or the other is 
> specified.
> This isn't a problem for a regular MR action or the REST API because its 
> purely an Oozie client check (OozieCLI#mrCommand).
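
A relaxed client-side check along the lines proposed in the description might look like the sketch below. It is not the OozieCLI#mrCommand code. The mapreduce key names are expanded literally from the shorthand in the description and should be verified against the Hadoop version in use; the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch of an "either key is acceptable" check.
class MapperReducerCheck {
    // True if at least one of the given keys is set to a non-empty value.
    static boolean hasAny(Properties conf, String... keys) {
        for (String key : keys) {
            String value = conf.getProperty(key);
            if (value != null && !value.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Accept either the mapred or the mapreduce key for mapper and reducer.
    static void validate(Properties conf) {
        if (!hasAny(conf, "mapred.mapper.class", "mapreduce.mapper.class")) {
            throw new IllegalArgumentException("mapper class not specified");
        }
        if (!hasAny(conf, "mapred.reducer.class", "mapreduce.reducer.class")) {
            throw new IllegalArgumentException("reducer class not specified");
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapreduce.mapper.class", "example.MyMapper");
        conf.setProperty("mapreduce.reducer.class", "example.MyReducer");
        validate(conf);
        System.out.println("configuration accepted");
    }
}
```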



[jira] [Commented] (OOZIE-1373) Oozie compilation fails with jdk7

2013-05-16 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659931#comment-13659931
 ] 

Alejandro Abdelnur commented on OOZIE-1373:
---

+1 pending jenkins.

> Oozie compilation fails with jdk7
> -
>
> Key: OOZIE-1373
> URL: https://issues.apache.org/jira/browse/OOZIE-1373
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: 3.3.2
>Reporter: Rohini Palaniswamy
>Assignee: Rohini Palaniswamy
> Fix For: trunk
>
> Attachments: enabling-jdk7-build-support.patch, OOZIE-1373.patch
>
>
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile 
> (default-testCompile) on project oozie-tools: Compilation failure: 
> Compilation failure:
> [ERROR] 
> /projects/apache/trunk/oozie/tools/src/test/java/org/apache/oozie/tools/FakeDriver.java:[27,7]
>  error: FakeDriver is not abstract and does not override abstract method 
> getParentLogger() in Driver
> [ERROR] 
> /projects/apache/trunk/oozie/tools/src/test/java/org/apache/oozie/tools/FakeConnection.java:[53,7]
>  error: FakeConnection is not abstract and does not override abstract method 
> getNetworkTimeout() in Connection
> [ERROR] 
> /projects/apache/trunk/oozie/tools/src/test/java/org/apache/oozie/tools/FakeConnection.java:[330,19]
>  error: FakeResultSet is not abstract and does not override abstract method 
> getObject(String,Class) in ResultSet
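
The errors above come from methods that Java 7 added to the JDBC interfaces: Driver.getParentLogger(), Connection.getNetworkTimeout() and ResultSet.getObject(String, Class), among others. A test double has to implement them to compile on jdk7. The stub below is a minimal illustration for the Driver case only; it is not the attached patch, and StubDriver is a made-up name.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

// Hypothetical stand-in for a test driver; every method is a no-op stub.
class StubDriver implements Driver {
    public Connection connect(String url, Properties info) throws SQLException {
        return null; // this stub never opens a connection
    }

    public boolean acceptsURL(String url) throws SQLException {
        return false;
    }

    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info)
            throws SQLException {
        return new DriverPropertyInfo[0];
    }

    public int getMajorVersion() {
        return 1;
    }

    public int getMinorVersion() {
        return 0;
    }

    public boolean jdbcCompliant() {
        return false;
    }

    // Added to java.sql.Driver in Java 7; without it the class fails to compile.
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("logging not supported by this stub");
    }

    public static void main(String[] args) throws SQLException {
        Driver driver = new StubDriver();
        System.out.println("acceptsURL: " + driver.acceptsURL("jdbc:stub:"));
    }
}
```

Throwing SQLFeatureNotSupportedException is the behavior the JDBC API allows for drivers that do not use java.util.logging, so it is a reasonable default for a fake.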



[jira] [Commented] (OOZIE-1377) OpenJPA runtime enhancement should be disabled

2013-05-15 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658799#comment-13658799
 ] 

Alejandro Abdelnur commented on OOZIE-1377:
---

sure, will do on commit. thx

> OpenJPA runtime enhancement should be disabled
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.



[jira] [Commented] (OOZIE-1377) OpenJPA runtime enhancement should be disabled

2013-05-15 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/OOZIE-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658757#comment-13658757
 ] 

Alejandro Abdelnur commented on OOZIE-1377:
---

* test failure is unrelated.
* patch does not add tests, but the new configuration is verified by all tests 
running.
* this does not break backwards compatibility, the warning is a false positive.

> OpenJPA runtime enhancement should be disabled
> --
>
> Key: OOZIE-1377
> URL: https://issues.apache.org/jira/browse/OOZIE-1377
> Project: Oozie
>  Issue Type: Bug
>  Components: core
>Affects Versions: 3.3.2
>Reporter: Alejandro Abdelnur
>Assignee: Alejandro Abdelnur
> Attachments: OOZIE-1377.patch
>
>
> The persistence.xml has runtime enhancement enabled.
> We are running into OOM (due to a leak in OpenJPA) under certain usage 
> patterns.
> While we are enhancing the classes at build time, the problem still persists.
> After checking with the openjpa folks 
> (http://mail-archives.apache.org/mod_mbox/openjpa-users/201305.mbox/%3CCABH8ernmKrQMf_2JMdOdnd6Xh7%3DDRFwSf6s4NL8tuE0ZHfZunA%40mail.gmail.com%3E)
>  we disabled the runtime enhancement and the OOM issue went away.


