[jira] [Created] (OOZIE-3610) Need to improve Oozie job handling if oozie launcher Map is killed

2020-09-15 Thread Shubham (Jira)
Shubham created OOZIE-3610:
--

 Summary: Need to improve Oozie job handling if oozie launcher Map 
is killed
 Key: OOZIE-3610
 URL: https://issues.apache.org/jira/browse/OOZIE-3610
 Project: Oozie
  Issue Type: Improvement
Reporter: Shubham


 
 - Due to YARN preemption, YARN kills the launcher Map task, which results in 
a new Spark job being launched, so we end up with a duplicate job.

 - We were able to reproduce the issue by killing a running Oozie launcher Map 
task.

 - To fix this, we applied OOZIE-2194, which kills the running Spark job when 
the launcher job's Map task is killed and then launches a new one.

 - However, killing the existing job wastes the progress it has already made, 
because the new job starts again from the beginning.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (OOZIE-3434) Filtering for invalid jobtype should give error message

2020-03-02 Thread Shubham (Jira)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham updated OOZIE-3434:
---
Attachment: OOZIE-3434-001.patch

> Filtering for invalid jobtype should give error message
> ---
>
> Key: OOZIE-3434
> URL: https://issues.apache.org/jira/browse/OOZIE-3434
> Project: Oozie
>  Issue Type: Bug
>Reporter: Andras Salamon
>Assignee: Shubham
>Priority: Minor
> Attachments: OOZIE-3434-001.patch
>
>
> It is possible to specify the jobtype when filtering jobs in the REST API 
> and on the command line as well. Supported jobtypes are: {{wf}}, {{coord}}, 
> {{bundle}}
> If we filter for an invalid jobtype, the command line version does not 
> print anything:
> {noformat}$ oozie jobs -jobtype xxx
> {noformat}
> The REST API returns a JSON response containing a NullPointerException:
> {noformat}// 20190219121114
> // http://localhost:11000/oozie/v1/jobs?jobtype=xxx
> {
>   "errorMessage": "java.lang.NullPointerException",
>   "httpStatusCode": 500
> }
> {noformat}
> We should give a meaningful error message in this case like {{unrecognized 
> jobtype: xxx}}
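
One way to produce the suggested message is to validate the jobtype up front and reject anything outside the supported set. The sketch below is only an illustration: the JobTypeValidator class, the JobType enum, and the parse method are hypothetical names and are not taken from the Oozie code base.

```java
import java.util.Locale;

/** Hypothetical helper; names are illustrative and not taken from Oozie. */
final class JobTypeValidator {

    /** The three jobtypes the issue lists as supported. */
    enum JobType { WF, COORD, BUNDLE }

    /**
     * Maps a user-supplied jobtype to a JobType, or throws an exception
     * with a readable message for anything else (including null).
     */
    static JobType parse(String jobtype) {
        if (jobtype != null) {
            switch (jobtype.toLowerCase(Locale.ROOT)) {
                case "wf":     return JobType.WF;
                case "coord":  return JobType.COORD;
                case "bundle": return JobType.BUNDLE;
                default:       break;
            }
        }
        throw new IllegalArgumentException("unrecognized jobtype: " + jobtype);
    }

    public static void main(String[] args) {
        System.out.println(parse("coord"));
        try {
            parse("xxx");
        } catch (IllegalArgumentException e) {
            // The message can be shown by the CLI or returned by the REST API.
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, both the CLI and the REST API could report the same message instead of printing nothing or returning an HTTP 500.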





[jira] [Assigned] (OOZIE-3391) Fix TestSsh tests

2019-08-06 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3391:
--

Assignee: Shubham

> Fix TestSsh tests
> -
>
> Key: OOZIE-3391
> URL: https://issues.apache.org/jira/browse/OOZIE-3391
> Project: Oozie
>  Issue Type: Improvement
>Reporter: Andras Salamon
>Assignee: Shubham
>Priority: Major
>
> The root {{pom.xml}} turns off {{TestSsh*}} tests:
> {noformat}
> <exclude>**/${test.exclude}.java</exclude>
> <exclude>${test.exclude.pattern}</exclude>
> 
> <exclude>**/TestSsh*.java</exclude>
> 
> {noformat}
> Although this was originally meant to turn off only the {{TestSshActionExecutor}} and 
> {{TestSshActionExecutorExtension}} tests, it also turns off the newer Fluent 
> Job tests:
> {noformat}TestSshAction.java
> TestSshActionMapping.java
> TestSshActionBuilder.java
> {noformat}
> We should fix the tests. 
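
The effect of the wildcard can be checked with a small sketch. It uses Java NIO glob matching as a stand-in for the Ant-style patterns that the build uses, which is an assumption: the two syntaxes behave alike for these patterns but are not identical in general. The package path org/example and the narrower pattern are hypothetical and only serve the illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Illustrative check of which test files a wildcard exclude would cover. */
final class SshExcludeGlobDemo {

    /** Returns true if the relative path matches the given glob pattern. */
    static boolean matches(String glob, String relativePath) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(relativePath));
    }

    public static void main(String[] args) {
        String[] files = {
            "org/example/TestSshActionExecutor.java",
            "org/example/TestSshActionExecutorExtension.java",
            "org/example/TestSshAction.java",
            "org/example/TestSshActionMapping.java",
            "org/example/TestSshActionBuilder.java"
        };
        String broad = "**/TestSsh*.java";                // pattern from the issue
        String narrow = "**/TestSshActionExecutor*.java"; // hypothetical alternative
        for (String f : files) {
            System.out.println(f + " broad=" + matches(broad, f)
                    + " narrow=" + matches(narrow, f));
        }
    }
}
```

In this sketch the broad pattern covers all five files, while the narrower one covers only the two executor tests. Whether narrowing the pattern is the right fix is a separate question that the issue leaves open.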



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (OOZIE-3391) Fix TestSsh tests

2019-07-05 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16879026#comment-16879026
 ] 

Shubham commented on OOZIE-3391:


[~asalamon74], please go ahead.

I have not been able to take a look at it.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OOZIE-3435) Inconsistent jobtype filtering

2019-03-27 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803602#comment-16803602
 ] 

Shubham commented on OOZIE-3435:


Thanks for the input.

Since the documentation mentions 'coordinator', my only concern was the jobtype 
'coordinator' if we replace contains()/startsWith() with equals().

As you mentioned, we can handle both 'coord' and 'coordinator', so it should be 
fine.

> Inconsistent jobtype filtering
> --
>
> Key: OOZIE-3435
> URL: https://issues.apache.org/jira/browse/OOZIE-3435
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: trunk
>Reporter: Andras Salamon
>Assignee: Shubham
>Priority: Minor
>
> It is possible to specify the jobtype when filtering jobs in the REST API 
> and on the command line as well. Supported jobtypes are: {{wf}}, {{coord}}, 
> {{bundle}}
> For some reason Oozie checks the jobtype using {{contains}} and 
> {{startsWith}}, both in 
> [OozieCLI|https://github.com/apache/oozie/blob/master/client/src/main/java/org/apache/oozie/cli/OozieCLI.java#L1757-L1765]:
> {noformat}
> else if (jobtype.toLowerCase().contains("wf")) {
> ...
> }
> else if (jobtype.toLowerCase().startsWith("coord")) {
> ...
> }
> else if (jobtype.toLowerCase().startsWith("bundle")) {
> ...
> }
> {noformat}
> and in 
> [V1JobsServlet|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/servlet/V1JobsServlet.java#L302-L310]:
> {noformat}
> if (jobtype.contains("wf")) {
> json = getWorkflowJobs(request);
> }
> else if (jobtype.contains("coord")) {
> json = getCoordinatorJobs(request);
> }
> else if (jobtype.contains("bundle")) {
> json = getBundleJobs(request);
> }
> {noformat}
> This has several strange side effects:
>  * It is possible to filter using jobtypes like: {{wfxxx}}, {{coord}}, 
> {{bundle}}, {{xxxwfxxx}}.
>  * It is possible to filter for coordinators using {{xxxcoordxxx}} in the REST 
> API, but this does not work in the CLI.
>  * It is possible to filter for {{coordwf}}; this will filter for 
> workflows.
>  * Filtering for {{bundlecoord}} will list the bundles in the CLI, but it will 
> list the coordinators in the REST API.
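
The side effects listed in the description can be reproduced with a small sketch that re-creates the two quoted matching chains side by side. The class and method names are hypothetical, the branch bodies are reduced to returning a label, and the branch that precedes the first "else if" in OozieCLI is not shown in the quote and is ignored here.

```java
import java.util.Locale;

/** Re-creates the two quoted matching chains to compare their results. */
final class JobTypeMatchDemo {

    /** Mirrors the quoted OozieCLI chain: contains("wf"), then startsWith(). */
    static String cliMatch(String jobtype) {
        String t = jobtype.toLowerCase(Locale.ROOT);
        if (t.contains("wf")) return "wf";
        if (t.startsWith("coord")) return "coord";
        if (t.startsWith("bundle")) return "bundle";
        return "none";
    }

    /** Mirrors the quoted V1JobsServlet chain: contains() for every type. */
    static String restMatch(String jobtype) {
        if (jobtype.contains("wf")) return "wf";
        if (jobtype.contains("coord")) return "coord";
        if (jobtype.contains("bundle")) return "bundle";
        return "none";
    }

    public static void main(String[] args) {
        for (String t : new String[] {"xxxcoordxxx", "coordwf", "bundlecoord"}) {
            System.out.println(t + ": CLI=" + cliMatch(t) + ", REST=" + restMatch(t));
        }
    }
}
```

For the three inputs in main, the sketch resolves xxxcoordxxx to no type in the CLI chain but to coordinators in the REST chain, coordwf to workflows in both, and bundlecoord to bundles in the CLI chain but coordinators in the REST chain, which matches the list above.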





[jira] [Assigned] (OOZIE-3337) Improve the documentation of Web Services API / Standard Job Submission

2019-03-27 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3337:
--

Assignee: Shubham

> Improve the documentation of Web Services API / Standard Job Submission
> ---
>
> Key: OOZIE-3337
> URL: https://issues.apache.org/jira/browse/OOZIE-3337
> Project: Oozie
>  Issue Type: Improvement
>  Components: docs
>Affects Versions: trunk
>Reporter: Andras Salamon
>Assignee: Shubham
>Priority: Minor
>  Labels: newbie
>
> The Web Services API documentation shows an example where two properties are 
> defined ({{user.name}} and {{oozie.wf.application.path}}), but the first 
> property is not used in the value tag of the second property.
>   
> {noformat}
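> <property>
>     <name>user.name</name>
>     <value>bansalm</value>
> </property>
> <property>
>     <name>oozie.wf.application.path</name>
>     <value>hdfs://foo:8020/user/bansalm/myapp/</value>
> </property>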
> {noformat}
> We should use {{$\{user.name}}} instead of {{bansalm}} in the second value.
> We should also show an example where the {{workflow.xml}} contains some 
> {{$\{variable}}} and the XML specifies the value for the {{variable}}. This 
> will show that it is possible to submit parametrized jobs using the Web 
> Services REST API.
>   
>  
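
To illustrate what referencing the first property from the second one means, here is a simplified substitution sketch. It is not the resolver used by Oozie or Hadoop; the class and method names are hypothetical, and it performs a single, non-recursive pass over ${name} references.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified, hypothetical ${name} substitution; not Oozie's implementation. */
final class PropertySubstitutionSketch {

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with props.get(name); unknown names are kept as-is. */
    static String resolve(String value, Map<String, String> props) {
        Matcher m = VAR.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = props.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("user.name", "bansalm");
        // The second property refers to the first instead of repeating its value.
        props.put("oozie.wf.application.path", "hdfs://foo:8020/user/${user.name}/myapp/");
        System.out.println(resolve(props.get("oozie.wf.application.path"), props));
    }
}
```

Under these assumptions, the parametrized path resolves to the same value that the current documentation example hard-codes.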





[jira] [Assigned] (OOZIE-3391) Fix TestSsh tests

2019-03-27 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3391:
--

Assignee: Shubham



[jira] [Assigned] (OOZIE-3402) [client] Error message for missing workflow definition contains error code duplication

2019-03-27 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3402:
--

Assignee: Shubham

> [client] Error message for missing workflow definition contains error code 
> duplication
> --
>
> Key: OOZIE-3402
> URL: https://issues.apache.org/jira/browse/OOZIE-3402
> Project: Oozie
>  Issue Type: Bug
>  Components: client
>Affects Versions: 5.1.0
>Reporter: Andras Piros
>Assignee: Shubham
>Priority: Minor
>
> When a workflow definition is not present on HDFS, and the property 
> {{oozie.jobs.api.generated.xml}} is not present in the HTTP request sent to 
> {{V2JobServlet}} by {{OozieCLI}}, the following error message is displayed:
> {noformat}
> Error: E0307 : E0307: Runtime error [App directory
> [hdfs://localhost:9000/user/root/examples/apps/subwf] does not exist and app 
> definition cannot be created because of missing config value 
> [oozie.jobs.api.generated.xml]
> {noformat}
> The error code {{E0307}} is displayed twice; one occurrence needs to be removed.
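
A minimal sketch of one way to avoid the duplication follows. It assumes that the client prefixes the error code to a server message that already starts with the same code; the issue does not say where the duplication arises, and the class and method names are hypothetical.

```java
/** Hypothetical formatter; where Oozie builds this message is an assumption. */
final class ErrorMessageFormatter {

    /**
     * Builds "Error: CODE : message", dropping a leading "CODE:" from the
     * message so that the code appears only once.
     */
    static String format(String errorCode, String message) {
        String prefix = errorCode + ":";
        String body = message.startsWith(prefix)
                ? message.substring(prefix.length()).trim()
                : message;
        return "Error: " + errorCode + " : " + body;
    }

    public static void main(String[] args) {
        System.out.println(format("E0307", "E0307: Runtime error [example]"));
    }
}
```

The alternative of dropping the client-side prefix instead would work equally well; which copy to remove is left open by the issue.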





[jira] [Comment Edited] (OOZIE-3435) Inconsistent jobtype filtering

2019-03-27 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802803#comment-16802803
 ] 

Shubham edited comment on OOZIE-3435 at 3/27/19 1:51 PM:
-

[~asalamon74], should we replace contains()/startsWith() with equals() instead, 
to make it more consistent and clear? Right now it is somewhat misleading, 
as you mentioned in your description.

I believe that if we change this to equals(), it may require documentation 
changes as well.

Let me know your thoughts.


was (Author: shubham.chhabra):
[~asalamon74], Should we replace contains/startsWith() with equals() instead? 

I believe if we change this to equals, then it may require document changes as 
well.

Let me know your thoughts.



[jira] [Assigned] (OOZIE-3435) Inconsistent jobtype filtering

2019-03-27 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3435:
--

Assignee: Shubham



[jira] [Assigned] (OOZIE-3434) Filtering for invalid jobtype should give error message

2019-03-04 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham reassigned OOZIE-3434:
--

Assignee: Shubham

> Filtering for invalid jobtype should give error message
> ---
>
> Key: OOZIE-3434
> URL: https://issues.apache.org/jira/browse/OOZIE-3434
> Project: Oozie
>  Issue Type: Bug
>Reporter: Andras Salamon
>Assignee: Shubham
>Priority: Minor
>
> It is possible to specify the jobtype when filtering jobs in the REST API 
> and on the command line as well. Supported jobtypes are: {{wf}}, {{coord}}, 
> {{bundle}}
> If we filter for an invalid jobtype, the command line version does not 
> print anything:
> {noformat}$ oozie jobs -jobtype xxx
> {noformat}
> The REST API returns a JSON response containing a NullPointerException:
> {noformat}// 20190219121114
> // http://localhost:11000/oozie/v1/jobs?jobtype=xxx
> {
>   "errorMessage": "java.lang.NullPointerException",
>   "httpStatusCode": 500
> }
> {noformat}
> We should give a meaningful error message in this case like {{unrecognized 
> jobtype: xxx}}





[jira] [Commented] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774070#comment-16774070
 ] 

Shubham commented on OOZIE-3439:


[~kmarton], thanks a lot for committing this to master.

> Hive2 action is not parsing application ID for TEZ from log file properly
> -
>
> Key: OOZIE-3439
> URL: https://issues.apache.org/jira/browse/OOZIE-3439
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: trunk
>Reporter: Shubham
>Assignee: Shubham
>Priority: Major
> Fix For: 5.2.0
>
> Attachments: OOZIE-3439-001.patch
>
>
> The Oozie workflow does not populate ChildJobUrl for the Hive2 action, while the 
> Hive1 action is able to find child job IDs.
> I looked at the code and found that the pattern is not correct for the Hive2 action 
> logs generated in the usercache.
> {code:java}
> static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {
>     Pattern.compile("Ended Job = (job_\\S*)"),
>     Pattern.compile("Submitted application (application[0-9_]*)"),
>     Pattern.compile("Running with YARN Application = (application[0-9_]*)")
> };
> {code}
> Adding the pattern below should help in getting the Hive2 action's Tez application ID:
> {code:java}
> Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),
> {code}





[jira] [Commented] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774016#comment-16774016
 ] 

Shubham commented on OOZIE-3439:


Thanks for the quick response, [~kmarton].

> Hive2 action is not parsing application ID for TEZ from log file properly
> -
>
> Key: OOZIE-3439
> URL: https://issues.apache.org/jira/browse/OOZIE-3439
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: trunk
>Reporter: Shubham
>Assignee: Shubham
>Priority: Major
> Attachments: OOZIE-3439-001.patch
>
>
> Oozie workflow does not populate ChildJobUrl for Hive2 Action while Hive1 is 
> able to find child job ids.
> I looked at the code and found that pattern is not correct for hive2 action 
> logs generated in usercache.
> {code:java}
> static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {
> Pattern.compile("Ended Job = (job_\\S*)"),
>  Pattern.compile("Submitted application (application[0-9_]*)"),
>  Pattern.compile("Running with YARN Application = (application[0-9_]*)")
> }
> {code}
> Adding below pattern should help in getting Hive 2 action Tez application id
> {code:java}
> Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),
> {code}





[jira] [Commented] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774007#comment-16774007
 ] 

Shubham commented on OOZIE-3439:


[~kmarton],

Hive1 action log files have different log entries, so the pattern is different for 
Hive1. (https://issues.apache.org/jira/browse/OOZIE-2112)

 

Hive1 :

{code}

2019-02-20 14:01:36,055 [main] INFO 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
application_1550671202870_0002
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - The current user: 
hive, session user: hive
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - Current queue name 
is default incoming queue name  

{code}

But for Hive2, we do not have the same log entries for the YARN application.

Hive2:

{code}

INFO : Status: Running (Executing on YARN cluster with App id 
application_1550671202870_0004)

ESC[2K
ESC[2KESC[36;1m VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
ESC[22;0mESC[2K
ESC[2KMap 1 .. SUCCEEDED 1 1 0 0 0 0
ESC[2K
ESC[2KESC[31;1mVERTICES: 01/01 [==>>] 100% ELAPSED 
TIME: 5.10 s 
ESC[22;0mESC[2K

{code}
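
The difference between the two log formats can be checked with a short sketch that applies the three existing patterns from the issue description, plus the proposed fourth one, to the sample lines above. The class and method names are hypothetical, and the wrapped log lines are treated as single lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Applies the patterns from the issue to sample log lines. */
final class Hive2JobIdPatternDemo {

    /** The three existing patterns followed by the proposed fourth one. */
    static final Pattern[] PATTERNS = {
        Pattern.compile("Ended Job = (job_\\S*)"),
        Pattern.compile("Submitted application (application[0-9_]*)"),
        Pattern.compile("Running with YARN Application = (application[0-9_]*)"),
        Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)")
    };

    /** Returns the first captured ID found in the line, or null if none matches. */
    static String extractId(String logLine) {
        for (Pattern p : PATTERNS) {
            Matcher m = p.matcher(logLine);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String hive1 = "org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - "
                + "Submitted application application_1550671202870_0002";
        String hive2 = "INFO : Status: Running (Executing on YARN cluster with App id "
                + "application_1550671202870_0004)";
        System.out.println(extractId(hive1));
        System.out.println(extractId(hive2));
    }
}
```

Without the fourth pattern, none of the three existing patterns matches the Hive2 line, which is consistent with the missing child job URL.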

 

 



[jira] [Comment Edited] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774007#comment-16774007
 ] 

Shubham edited comment on OOZIE-3439 at 2/21/19 11:52 AM:
--

[~kmarton],

Hive1 action log files have different log entries, so the pattern is different for 
Hive1. (https://issues.apache.org/jira/browse/OOZIE-2112)

 Hive1 :
{code:java}
2019-02-20 14:01:36,055 [main] INFO 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
application_1550671202870_0002
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - The current user: 
hive, session user: hive
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - Current queue name 
is default incoming queue name  

{code}
But for Hive2, we do not have the same log entries for the YARN application.

Hive2:
{code:java}
INFO : Status: Running (Executing on YARN cluster with App id 
application_1550671202870_0004)

ESC[2K
ESC[2KESC[36;1m VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
ESC[22;0mESC[2K
ESC[2KMap 1 .. SUCCEEDED 1 1 0 0 0 0
ESC[2K
ESC[2KESC[31;1mVERTICES: 01/01 [==>>] 100% ELAPSED 
TIME: 5.10 s 
ESC[22;0mESC[2K

{code}
 

 


was (Author: shubham.chhabra):
[~kmarton],

Hive1 action log files has different log entries, so pattern is different for 
Hive1. (https://issues.apache.org/jira/browse/OOZIE-2112)

 

Hive1 :

{code}

2019-02-20 14:01:36,055 [main] INFO 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
application_1550671202870_0002
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - The current user: 
hive, session user: hive
2019-02-20 14:01:46,498 [main] INFO 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager - Current queue name 
is default incoming queue name  

{code}

But for Hive2, we do not have same log entries for yarn application.

Hive2:

{code}

INFO : Status: Running (Executing on YARN cluster with App id 
application_1550671202870_0004)

ESC[2K
ESC[2KESC[36;1m VERTICES STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
ESC[22;0mESC[2K
ESC[2KMap 1 .. SUCCEEDED 1 1 0 0 0 0
ESC[2K
ESC[2KESC[31;1mVERTICES: 01/01 [==>>] 100% ELAPSED 
TIME: 5.10 s 
ESC[22;0mESC[2K

{code}

 

 



[jira] [Updated] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham updated OOZIE-3439:
---
Attachment: OOZIE-3439-001.patch

> Hive2 action is not parsing application ID for TEZ from log file properly
> -
>
> Key: OOZIE-3439
> URL: https://issues.apache.org/jira/browse/OOZIE-3439
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: trunk
>Reporter: Shubham
>Priority: Major
> Attachments: OOZIE-3439-001.patch
>
>
> Oozie workflow does not populate ChildJobUrl for Hive2 Action while Hive1 is 
> able to find child job ids.
> I looked at the code and found that pattern is not correct for hive2 action 
> logs generated in usercache.
> {code:java}
> static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {
> Pattern.compile("Ended Job = (job_\\S*)"),
>  Pattern.compile("Submitted application (application[0-9_]*)"),
>  Pattern.compile("Running with YARN Application = (application[0-9_]*)")
> }
> {code}
> Adding below pattern should help in getting Hive 2 action Tez application id
> {code:java}
> Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),
> {code}





[jira] [Updated] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)


 [ 
https://issues.apache.org/jira/browse/OOZIE-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham updated OOZIE-3439:
---
Description: 
Oozie workflow does not populate ChildJobUrl for Hive2 Action while Hive1 is 
able to find child job ids.

I looked at the code and found that pattern is not correct for hive2 action 
logs generated in usercache.
{code:java}
static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {

Pattern.compile("Ended Job = (job_\\S*)"),
 Pattern.compile("Submitted application (application[0-9_]*)"),
 Pattern.compile("Running with YARN Application = (application[0-9_]*)")
}

{code}
Adding below pattern should help in getting Hive 2 action Tez application id
{code:java}
Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),

{code}

  was:
Oozie workflow does not populate ChildJobUrl for Hive2 Action while Hive1 is 
able to find child job ids.

I looked at the code and found that pattern is not correct for hive2 action 
logs generated in usercache.

 

{code}static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {

Pattern.compile("Ended Job = (job_\\S*)"),
 Pattern.compile("Submitted application (application[0-9_]*)"),
 Pattern.compile("Running with YARN Application = (application[0-9_]*)")
}

{code}

 

Adding below pattern should help in getting Hive 2 action Tez application id

{code}

Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),

{code}


> Hive2 action is not parsing application ID for TEZ from log file properly
> -
>
> Key: OOZIE-3439
> URL: https://issues.apache.org/jira/browse/OOZIE-3439
> Project: Oozie
>  Issue Type: Bug
>  Components: action
>Affects Versions: trunk
>Reporter: Shubham
>Priority: Major
>
> Oozie workflow does not populate ChildJobUrl for Hive2 Action while Hive1 is 
> able to find child job ids.
> I looked at the code and found that pattern is not correct for hive2 action 
> logs generated in usercache.
> {code:java}
> static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {
> Pattern.compile("Ended Job = (job_\\S*)"),
>  Pattern.compile("Submitted application (application[0-9_]*)"),
>  Pattern.compile("Running with YARN Application = (application[0-9_]*)")
> }
> {code}
> Adding below pattern should help in getting Hive 2 action Tez application id
> {code:java}
> Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),
> {code}





[jira] [Created] (OOZIE-3439) Hive2 action is not parsing application ID for TEZ from log file properly

2019-02-21 Thread Shubham (JIRA)
Shubham created OOZIE-3439:
--

 Summary: Hive2 action is not parsing application ID for TEZ from 
log file properly
 Key: OOZIE-3439
 URL: https://issues.apache.org/jira/browse/OOZIE-3439
 Project: Oozie
  Issue Type: Bug
  Components: action
Affects Versions: trunk
Reporter: Shubham


The Oozie workflow does not populate ChildJobUrl for the Hive2 action, while the 
Hive1 action is able to find child job IDs.

I looked at the code and found that the pattern is not correct for the Hive2 action 
logs generated in the usercache.

 

{code}static final Pattern[] HIVE2_JOB_IDS_PATTERNS = {

Pattern.compile("Ended Job = (job_\\S*)"),
 Pattern.compile("Submitted application (application[0-9_]*)"),
 Pattern.compile("Running with YARN Application = (application[0-9_]*)")
}

{code}

 

Adding the pattern below should help in getting the Hive2 action's Tez application ID:

{code}

Pattern.compile("Executing on YARN cluster with App id (application[0-9_]*)"),

{code}





[jira] [Commented] (OOZIE-3273) FsAction should fail on retry if destination path exists

2018-07-20 Thread Shubham (JIRA)


[ 
https://issues.apache.org/jira/browse/OOZIE-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550779#comment-16550779
 ] 

Shubham commented on OOZIE-3273:


Right, [~matijhs]. If the action is not able to perform the move on the second 
attempt, it should ideally fail.
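
The expected behavior can be sketched as a retry loop that reports failure when every attempt still raises the error. This is not Oozie's implementation: the class, enum, and method names are hypothetical, and the retry interval is left out to keep the sketch short.

```java
import java.util.concurrent.Callable;

/** Hypothetical retry loop illustrating the expected outcome; not Oozie code. */
final class RetrySketch {

    enum Status { OK, FAILED }

    /**
     * Runs the action once and then up to maxRetries more times.
     * Returns OK on the first successful attempt and FAILED if all attempts throw.
     */
    static Status runWithRetries(Callable<Void> action, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                action.call();
                return Status.OK;
            } catch (Exception e) {
                // The error persists; fall through to the next attempt, if any.
            }
        }
        return Status.FAILED;
    }

    public static void main(String[] args) {
        // Simulates a move that keeps failing because the destination exists.
        Status status = runWithRetries(() -> {
            throw new IllegalStateException("FS008: move, could not move");
        }, 2);
        System.out.println(status);
    }
}
```

The key point is that a retry must re-check the outcome of the operation itself; a later attempt that cannot perform the move should not be reported as a success.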

> FsAction should fail on retry if destination path exists
> 
>
> Key: OOZIE-3273
> URL: https://issues.apache.org/jira/browse/OOZIE-3273
> Project: Oozie
>  Issue Type: Bug
>Affects Versions: 4.2.0
>Reporter: Shubham
>Priority: Major
>
> This FsAction fails with error code FS008 if the source files already exist 
> in the target folder.
> The expected behavior is that Oozie retries the action 
> after 1 minute and then marks it as failed, because the error is still 
> there.
> However, Oozie marks the action as successful on retry (we did not clean up the 
> target folder).
> Logs:
> {code}
> 2018-05-15 00:08:05,187 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] Start action 
> [061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
> userRetryCount [0], userRetryMax [2], userRetryInterval [1]
> 2018-05-15 00:08:05,201 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] Error starting action 
> [load-staging]. ErrorType [ERROR], ErrorCode [FS008], Message [FS008: move, 
> could not move 
> [hdfs://nn:8020/user/hive/audit/data/ingestion/USER_ACCOUNT_AF_A/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
>  to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]]
> org.apache.oozie.action.ActionExecutorException: FS008: move, could not move 
> [hdfs://nn:8020/user/hive/audit/data/ingestion/SAMPLE_WF/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
>  to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.move(FsActionExecutor.java:509)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
>  at 
> org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:609)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
>  at 
> org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
>  at org.apache.oozie.command.XCommand.call(XCommand.java:287)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
>  at 
> org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> 2018-05-15 00:08:05,202 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] Setting Action Status to 
> [DONE]
> 2018-05-15 00:08:05,202 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] Preparing retry this 
> action [061-180514024838863-oozie-oozi-W@loading], errorCode [FS008], 
> userRetryCount [0], userRetryMax [2], userRetryInterval [1]
> 2018-05-15 00:09:05,254 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] Start action 
> [061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
> userRetryCount [1], userRetryMax [2], userRetryInterval [1]
> 2018-05-15 00:09:05,276 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> ACTION[061-180514024838863-oozie-oozi-W@loading] 
> [***061-180514024838863-oozie-oozi-W@loading***]Action status=DONE
> 2018-05-15 00:09:05,277 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
> USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
> JOB[061-180514024838863-oozie-oozi-W] 
> 

[jira] [Created] (OOZIE-3273) FsAction should fail on retry if destination path exists

2018-06-04 Thread Shubham (JIRA)
Shubham created OOZIE-3273:
--

 Summary: FsAction should fail on retry if destination path exists
 Key: OOZIE-3273
 URL: https://issues.apache.org/jira/browse/OOZIE-3273
 Project: Oozie
  Issue Type: Bug
Reporter: Shubham


An FsAction move fails with error code FS008 if the source files already exist 
in the target folder.

The expected behavior is that Oozie retries the action after 1 minute and then 
marks it as failed, because the error is still present.

However, Oozie marks the action as successful on retry, even though we did not 
clean up the target folder.
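
To make the expected behavior concrete, here is a minimal sketch of a move that 
is re-checked on every attempt. It is not Oozie's FsActionExecutor code: the 
class and method names (MoveRetrySketch, moveInto, runWithRetries, 
demoCollision) are hypothetical, it uses java.nio.file on a local temp 
directory instead of the Hadoop FileSystem API, and it leaves out the wait 
between attempts.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration only; not taken from the Oozie code base. */
final class MoveRetrySketch {

    enum Status { OK, FAILED }

    /** Moves source into targetDir and refuses if a file of that name is already there. */
    static void moveInto(Path source, Path targetDir) throws IOException {
        Path destination = targetDir.resolve(source.getFileName());
        // Without REPLACE_EXISTING, Files.move throws FileAlreadyExistsException
        // when the destination exists.
        Files.move(source, destination);
    }

    /** Runs the move once plus up to retryMax retries, applying the same check each time. */
    static Status runWithRetries(Path source, Path targetDir, int retryMax) {
        for (int attempt = 0; attempt <= retryMax; attempt++) {
            try {
                moveInto(source, targetDir);
                return Status.OK;
            } catch (IOException e) {
                // The retry must not skip the move; an unchanged target fails again.
            }
        }
        return Status.FAILED;
    }

    /** Builds a colliding source/target pair in a temp directory and runs the retries. */
    static Status demoCollision() {
        try {
            Path root = Files.createTempDirectory("fs-move-sketch");
            Path ingestion = Files.createDirectory(root.resolve("ingestion"));
            Path staging = Files.createDirectory(root.resolve("staging"));
            Path source = Files.createFile(ingestion.resolve("data.avro"));
            Files.createFile(staging.resolve("data.avro")); // target was not cleaned up
            return runWithRetries(source, staging, 2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the target left in place, demoCollision() ends in FAILED after all 
attempts, which is the outcome we expected from the workflow action instead of 
the DONE status shown in the logs.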

Logs:

{code}
2018-05-15 00:08:05,187 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] Start action 
[061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:08:05,201 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] Error starting action 
[load-staging]. ErrorType [ERROR], ErrorCode [FS008], Message [FS008: move, 
could not move 
[hdfs://nn:8020/user/hive/audit/data/ingestion/USER_ACCOUNT_AF_A/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
 to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]]
org.apache.oozie.action.ActionExecutorException: FS008: move, could not move 
[hdfs://nn:8020/user/hive/audit/data/ingestion/SAMPLE_WF/1522284431816-2018-03-28_1747_11.816-PT2M10.096S-TEST.0-19462_24325-67401946-8fcf-4940-91ec-063016a5da48.avro]
 to [hdfs://nn:8020/user/hive/audit/data/staging/USER_ACCOUNT_AF_A]
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.move(FsActionExecutor.java:509)
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.doOperations(FsActionExecutor.java:199)
 at 
org.apache.oozie.action.hadoop.FsActionExecutor.start(FsActionExecutor.java:609)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:234)
 at 
org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:65)
 at org.apache.oozie.command.XCommand.call(XCommand.java:287)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:331)
 at 
org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:260)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:178)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2018-05-15 00:08:05,202 WARN ActionStartXCommand:523 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] Setting Action Status to 
[DONE]
2018-05-15 00:08:05,202 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] Preparing retry this 
action [061-180514024838863-oozie-oozi-W@loading], errorCode [FS008], 
userRetryCount [0], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,254 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] Start action 
[061-180514024838863-oozie-oozi-W@loading] with user-retry state : 
userRetryCount [1], userRetryMax [2], userRetryInterval [1]
2018-05-15 00:09:05,276 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] 
[***061-180514024838863-oozie-oozi-W@loading***]Action status=DONE
2018-05-15 00:09:05,277 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@loading] 
[***061-180514024838863-oozie-oozi-W@loading***]Action updated in DB!
2018-05-15 00:09:05,314 INFO ActionStartXCommand:520 - SERVER[mn2.sf.priv] 
USER[hive] GROUP[-] TOKEN[] APP[sample-wf] 
JOB[061-180514024838863-oozie-oozi-W] 
ACTION[061-180514024838863-oozie-oozi-W@end] Start action 
[061-180514024838863-oozie-oozi-W@end] with user-retry state : 
userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2018-05-15 00:09:05,314 INFO