[jira] [Updated] (OOZIE-2457) Oozie log parsing regex consume more than 90% cpu
[ https://issues.apache.org/jira/browse/OOZIE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2457: --- Attachment: OOZIE-2457-1.patch > Oozie log parsing regex consume more than 90% cpu > - > > Key: OOZIE-2457 > URL: https://issues.apache.org/jira/browse/OOZIE-2457 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2457-1.patch > > > http-0.0.0.0-4080-26 TID=62215 STATE=RUNNABLE CPU_TIME=1992 (92.59%) > USER_TIME=1990 (92.46%) Allocted: 269156584 > java.util.regex.Pattern$Curly.match0(Pattern.java:4170) > java.util.regex.Pattern$Curly.match(Pattern.java:4132) > java.util.regex.Pattern$GroupHead.match(Pattern.java:4556) > java.util.regex.Matcher.match(Matcher.java:1221) > java.util.regex.Matcher.matches(Matcher.java:559) > org.apache.oozie.util.XLogFilter.matches(XLogFilter.java:136) > > org.apache.oozie.util.TimestampedMessageParser.parseNextLine(TimestampedMessageParser.java:145) > > org.apache.oozie.util.TimestampedMessageParser.increment(TimestampedMessageParser.java:92) > Regex > {code} > (.* USER\[[^\]]*\] GROUP\[[^\]]*\] TOKEN\[[^\]]*\] APP\[[^\]]*\] > JOB\[000-150625114739728-oozie-puru-W\] ACTION\[[^\]]*\] .*) > {code} > For single line parsing we use two regex. > 1. > {code} > public ArrayList splitLogMessage(String logLine) { > Matcher splitter = SPLITTER_PATTERN.matcher(logLine); > if (splitter.matches()) { > ArrayList logParts = new ArrayList(); > logParts.add(splitter.group(1));// timestamp > logParts.add(splitter.group(2));// log level > logParts.add(splitter.group(3));// Log Message > return logParts; > } > else { > return null; > } > } > {code} > 2. 
> {code} > public boolean matches(ArrayList logParts) { > if (getStartDate() != null) { > if (logParts.get(0).substring(0, > 19).compareTo(getFormattedStartDate()) < 0) { > return false; > } > } > String logLevel = logParts.get(1); > String logMessage = logParts.get(2); > if (this.logLevels == null || > this.logLevels.containsKey(logLevel.toUpperCase())) { > Matcher logMatcher = filterPattern.matcher(logMessage); > return logMatcher.matches(); > } > else { > return false; > } > } > {code} > Also, there is repetitive parsing of the same log message in > {code} > private String parseTimestamp(String line) { > String timestamp = null; > ArrayList logParts = filter.splitLogMessage(line); > if (logParts != null) { > timestamp = logParts.get(0); > } > return timestamp; > } > {code} > where the {{line}} has already been parsed using the regex and we already know the > {{logParts}}, if any. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
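The backtracking cost comes from bracketing the `JOB\[...\]` literal with two `.*` groups. A sketch of one cheaper direction (hypothetical names and a simplified splitter pattern, not the actual OOZIE-2457 patch): split the line once with the anchored splitter, then filter with a plain substring check instead of a second regex.

```java
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: simplified splitter pattern and hypothetical names,
// not the actual Oozie classes or the OOZIE-2457 patch.
public class LogLineFilter {
    // Anchored splitter: timestamp, log level, message.
    private static final Pattern SPLITTER_PATTERN = Pattern.compile(
            "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+(.*)");

    // Run the splitter regex exactly once per line and cache the parts.
    public static ArrayList<String> splitLogMessage(String logLine) {
        Matcher splitter = SPLITTER_PATTERN.matcher(logLine);
        if (!splitter.matches()) {
            return null;
        }
        ArrayList<String> logParts = new ArrayList<>();
        logParts.add(splitter.group(1)); // timestamp
        logParts.add(splitter.group(2)); // log level
        logParts.add(splitter.group(3)); // log message
        return logParts;
    }

    // Plain substring scan: linear time, no backtracking, unlike a
    // filter regex of the form (.* ... JOB\[...\] ... .*).
    public static boolean matchesJob(ArrayList<String> logParts, String jobId) {
        return logParts != null && logParts.get(2).contains("JOB[" + jobId + "]");
    }
}
```

A single `contains` scan over the already-extracted message also sidesteps the repeated `splitLogMessage` call noted above, since the cached `logParts` can be passed around instead of the raw line.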
[jira] [Updated] (OOZIE-2457) Oozie log parsing regex consume more than 90% cpu
[ https://issues.apache.org/jira/browse/OOZIE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2457: --- Attachment: OOZIE-2457-2.patch > Oozie log parsing regex consume more than 90% cpu > - > > Key: OOZIE-2457 > URL: https://issues.apache.org/jira/browse/OOZIE-2457 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2457-1.patch, OOZIE-2457-2.patch > > > http-0.0.0.0-4080-26 TID=62215 STATE=RUNNABLE CPU_TIME=1992 (92.59%) > USER_TIME=1990 (92.46%) Allocted: 269156584 > java.util.regex.Pattern$Curly.match0(Pattern.java:4170) > java.util.regex.Pattern$Curly.match(Pattern.java:4132) > java.util.regex.Pattern$GroupHead.match(Pattern.java:4556) > java.util.regex.Matcher.match(Matcher.java:1221) > java.util.regex.Matcher.matches(Matcher.java:559) > org.apache.oozie.util.XLogFilter.matches(XLogFilter.java:136) > > org.apache.oozie.util.TimestampedMessageParser.parseNextLine(TimestampedMessageParser.java:145) > > org.apache.oozie.util.TimestampedMessageParser.increment(TimestampedMessageParser.java:92) > Regex > {code} > (.* USER\[[^\]]*\] GROUP\[[^\]]*\] TOKEN\[[^\]]*\] APP\[[^\]]*\] > JOB\[000-150625114739728-oozie-puru-W\] ACTION\[[^\]]*\] .*) > {code} > For single line parsing we use two regex. > 1. > {code} > public ArrayList splitLogMessage(String logLine) { > Matcher splitter = SPLITTER_PATTERN.matcher(logLine); > if (splitter.matches()) { > ArrayList logParts = new ArrayList(); > logParts.add(splitter.group(1));// timestamp > logParts.add(splitter.group(2));// log level > logParts.add(splitter.group(3));// Log Message > return logParts; > } > else { > return null; > } > } > {code} > 2. 
> {code} > public boolean matches(ArrayList logParts) { > if (getStartDate() != null) { > if (logParts.get(0).substring(0, > 19).compareTo(getFormattedStartDate()) < 0) { > return false; > } > } > String logLevel = logParts.get(1); > String logMessage = logParts.get(2); > if (this.logLevels == null || > this.logLevels.containsKey(logLevel.toUpperCase())) { > Matcher logMatcher = filterPattern.matcher(logMessage); > return logMatcher.matches(); > } > else { > return false; > } > } > {code} > Also, there is repetitive parsing of the same log message in > {code} > private String parseTimestamp(String line) { > String timestamp = null; > ArrayList logParts = filter.splitLogMessage(line); > if (logParts != null) { > timestamp = logParts.get(0); > } > return timestamp; > } > {code} > where the {{line}} has already been parsed using the regex and we already know the > {{logParts}}, if any. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2461) Workflow, Coordinator and Bundle job querying should have last modified filter
Satish Subhashrao Saley created OOZIE-2461: -- Summary: Workflow, Coordinator and Bundle job querying should have last modified filter Key: OOZIE-2461 URL: https://issues.apache.org/jira/browse/OOZIE-2461 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley To get the currently running coordinators and their ids, one user had to do http://localhost:11000/oozie/v1/jobs?jobtype=coord=user%3satish_1.0%3B=1=3000 They could not use names in the filter, as the names include a version and keep changing. For example: urs_satish_filter-0.1-daily-coord urs_puru_service-0.4-hourly-coord It would be good to have a last-modified filter to get recently active coordinators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2447) Illegal character 0x0 oozie client
[ https://issues.apache.org/jira/browse/OOZIE-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2447: --- Attachment: OOZIE-2447-2.patch Updated patch. Using a different approach after discussing with puru. > Illegal character 0x0 oozie client > -- > > Key: OOZIE-2447 > URL: https://issues.apache.org/jira/browse/OOZIE-2447 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2447-1.patch, OOZIE-2447-2.patch > > > Sometimes an oozie client query fails with the below message: > Error: HTTP error code: 400 : Illegal character 0x0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2246) CoordinatorInputCheckCommand does not behave properly when har file is one of data dependency and doesn't exist
[ https://issues.apache.org/jira/browse/OOZIE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2246: --- Attachment: OOZIE-2246-v4.patch > CoordinatorInputCheckCommand does not behave properly when har file is one of > data dependency and doesn't exist > --- > > Key: OOZIE-2246 > URL: https://issues.apache.org/jira/browse/OOZIE-2246 > Project: Oozie > Issue Type: Bug >Reporter: Ryota Egashira >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2246-v2.patch, OOZIE-2246-v3.patch, > OOZIE-2246-v4.patch, OOZIE-2246.patch > > > When the har file doesn't exist, FileSystem.createFileSystem in > HadoopAccessorService throws an exception, failing CoordActionInputCheck. > Thus, even though there are other data dependencies which already exist, this > is not reflected in the DB. > The coordinator job cannot start until the har file becomes available anyway, and > once available, this error doesn't happen, so the basic functionality is fine, > but it's misleading. > {code} > 2014-03-13 22:00:00,051 WARN CallableQueueService$CallableWrapper:542 > [pool-2-thread-288] - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] > exception callable [coord_action_input], E1021: Coord Action Input Check > Error: org.apache.oozie.service.HadoopAccessorException: E0902: Exception > occured: [Invalid path for the Har Filesystem. No index file in > har://:8020/data/2014031322/archive.har] > org.apache.oozie.command.CommandException: E1021: Coord Action Input Check > Error: org.apache.oozie.service.HadoopAccessorException: E0902: Exception > occured: [Invalid path for the Har Filesystem. 
No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:182) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:67) > at org.apache.oozie.command.XCommand.call(XCommand.java:280) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.io.IOException: > org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: > [Invalid path for the Har Filesystem. No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.pathExists(CoordActionInputCheckXCommand.java:493) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkListOfPaths(CoordActionInputCheckXCommand.java:459) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkResolvedUris(CoordActionInputCheckXCommand.java:429) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkInput(CoordActionInputCheckXCommand.java:259) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:132) > ... 6 more > Caused by: org.apache.oozie.service.HadoopAccessorException: E0902: Exception > occured: [Invalid path for the Har Filesystem. 
No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:430) > at > org.apache.oozie.dependency.FSURIHandler.getFileSystem(FSURIHandler.java:134) > at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:99) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.pathExists(CoordActionInputCheckXCommand.java:488) > ... 10 more > Caused by: java.io.IOException: Invalid path for the Har Filesystem. No index > file in har://:8020/data/2014031322/archive.har > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:139) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2160) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:303) > at > org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:422) > at > org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:420) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1284) > at > org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:420) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
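A sketch of the behavior the fix needs (with a hypothetical `PathProbe` interface, not the actual OOZIE-2246 patch): a probe failure on one URI, such as a har:// path whose index file is missing, is treated as "dependency not yet available" rather than aborting the whole input check, so availability of the other dependencies can still be recorded.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the intended behavior; hypothetical names, not the
// actual OOZIE-2246 patch.
public class DependencyCheck {
    public interface PathProbe {
        boolean exists(String uri) throws IOException;
    }

    // Returns the URIs that are available. A probe error on one URI
    // (e.g. "Invalid path for the Har Filesystem. No index file in ...")
    // only marks that URI as missing instead of failing the whole check,
    // so the availability of the other dependencies can still be persisted.
    public static List<String> checkListOfPaths(List<String> uris, PathProbe probe) {
        List<String> available = new ArrayList<>();
        for (String uri : uris) {
            try {
                if (probe.exists(uri)) {
                    available.add(uri);
                }
            } catch (IOException e) {
                // Treat as "not yet available"; the coordinator action
                // re-checks this URI on its next input check.
            }
        }
        return available;
    }
}
```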
[jira] [Updated] (OOZIE-2446) Job does not fail during submission if non existent credential is specified
[ https://issues.apache.org/jira/browse/OOZIE-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2446: --- Attachment: OOZIE-2446-2.patch Thank you [~jaydeepvishwakarma] for review. I have made the changes you suggested. > Job does not fail during submission if non existent credential is specified > --- > > Key: OOZIE-2446 > URL: https://issues.apache.org/jira/browse/OOZIE-2446 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2446-1.patch, OOZIE-2446-2.patch > > > {noformat} > > > .. > > {noformat} > User had specified howlauth instead of hcatauth. Job was launched and failed > because it could not connect to hcat. It should have failed in the server > itself when no definition for howlauth credential was found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2444) Need conditional logic in bundles
[ https://issues.apache.org/jira/browse/OOZIE-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2444: --- Attachment: OOZIE-2444-5.patch correction in the test case > Need conditional logic in bundles > - > > Key: OOZIE-2444 > URL: https://issues.apache.org/jira/browse/OOZIE-2444 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-2444-1.patch, OOZIE-2444-2.patch, > OOZIE-2444-3.patch, OOZIE-2444-4-Doc.patch, OOZIE-2444-5.patch > > > Sometimes, the users have a semi-complicated pipeline that needs to run in > slightly different ways depending on whether they are running against live > data, reprocessing recent data, or reprocessing historical data from another > cluster. Instead of having to create multiple different bundles to capture > these various cases, it would be good to have some sort of conditional logic > in the bundle XML file that users can use to enable or disable specific > coordinators within the bundle based on the properties passed in. That way, > we can control, either from the properties file or from oozie command line > options, the coordinators that get run and the mode that overall pipeline is > processing in. > Ideally, this would be supported by extending the tag with a > new "enabled" attribute that takes a boolean expression and supports standard > expression syntax and functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
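Under the proposal above, a bundle definition might look like this (a sketch only; the `enabled` attribute and the `processingMode` property are the proposed extension, not an existing Oozie feature, and the coordinator names and paths are illustrative):

```xml
<bundle-app name="pipeline-bundle" xmlns="uri:oozie:bundle:0.2">
    <!-- Runs only when the job properties set processingMode=live -->
    <coordinator name="live-coord" enabled="${processingMode eq 'live'}">
        <app-path>${nameNode}/apps/live-coord</app-path>
    </coordinator>
    <!-- Runs only when reprocessing recent or historical data -->
    <coordinator name="reprocess-coord" enabled="${processingMode eq 'reprocess'}">
        <app-path>${nameNode}/apps/reprocess-coord</app-path>
    </coordinator>
</bundle-app>
```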
[jira] [Updated] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2473: --- Summary: Connection pool for SMTP connection (was: Connection pool for SMPT connection) > Connection pool for SMTP connection > --- > > Key: OOZIE-2473 > URL: https://issues.apache.org/jira/browse/OOZIE-2473 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > > Currently, we set up a new connection every time we send an email, which seems > costly. It would be good to have a connection pool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2473) Connection pool for SMPT connection
Satish Subhashrao Saley created OOZIE-2473: -- Summary: Connection pool for SMPT connection Key: OOZIE-2473 URL: https://issues.apache.org/jira/browse/OOZIE-2473 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley Currently, we set up a new connection every time we send an email, which seems costly. It would be good to have a connection pool. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
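A minimal generic pool sketch along these lines (hypothetical, not the Oozie patch; a real SMTP pool would hold mail transport objects and validate or reopen stale connections before reuse):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Minimal object-pool sketch (hypothetical names, not the Oozie patch).
public class ConnectionPool<T> {
    private final BlockingQueue<T> idle;

    public ConnectionPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pay the connection setup cost once, up front
        }
    }

    // Blocks until a pooled connection is free, so at most `size`
    // connections ever exist.
    public T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    // Return the connection to the pool for the next sender.
    public void release(T conn) {
        idle.offer(conn);
    }
}
```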
[jira] [Updated] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2471: --- Attachment: OOZIE-2471-2.patch > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch, OOZIE-2471-2.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2471: --- Attachment: OOZIE-2471-1.patch > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2475) Oozie does not cleanup failed actions; fills up namespace quota
[ https://issues.apache.org/jira/browse/OOZIE-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2475: --- Attachment: OOZIE-2475-1.patch > Oozie does not cleanup failed actions; fills up namespace quota > --- > > Key: OOZIE-2475 > URL: https://issues.apache.org/jira/browse/OOZIE-2475 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2475-1.patch > > > There is a corner case where leaking happens. > When the workflow KillCommand is issued, WfEndXCommand is invoked in the > finally block at the end, and WfEndXCommand.deleteWFDir() deletes the action dir (e.g., > /user/satish/oozie_satish/123450-15-oozie_satish-W). > But when this happens right before the launcher mapper uploads actionData to > HDFS, the previously deleted actionDir is created again, and the actionDir will not > be cleaned up afterwards. > The solution is to clean up the action dir in ActionKillXCommand. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2475) Oozie does not cleanup failed actions; fills up namespace quota
Satish Subhashrao Saley created OOZIE-2475: -- Summary: Oozie does not cleanup failed actions; fills up namespace quota Key: OOZIE-2475 URL: https://issues.apache.org/jira/browse/OOZIE-2475 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley There is a corner case where leaking happens. When the workflow KillCommand is issued, WfEndXCommand is invoked in the finally block at the end, and WfEndXCommand.deleteWFDir() deletes the action dir (e.g., /user/satish/oozie_satish/123450-15-oozie_satish-W). But when this happens right before the launcher mapper uploads actionData to HDFS, the previously deleted actionDir is created again, and the actionDir will not be cleaned up afterwards. The solution is to clean up the action dir in ActionKillXCommand. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1402) Increase retry interval for non-progressing coordinator action using exponential backoff concept
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1402: --- Attachment: OOZIE-1402-1.patch > Increase retry interval for non-progressing coordinator action using > exponential backoff concept > - > > Key: OOZIE-1402 > URL: https://issues.apache.org/jira/browse/OOZIE-1402 > Project: Oozie > Issue Type: Improvement >Affects Versions: trunk >Reporter: Mona Chitnis >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-1402-1.patch > > > Currently, every coordinator action retries its data-directory check in the > next minute. > We could do better by waiting longer for a coordinator action that is not > progressing (i.e., finds no new directory) across repeated retries. > The waiting time should start at 1 minute for X retries. Then the action > should wait 2 minutes for the next X retries, then 3 minutes, and so on, > until it reaches some max-wait-time and stays there until timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
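As described, the scheme is a stepped, capped backoff rather than a true exponential one; the interval computation could be sketched like this (hypothetical names, not the OOZIE-1402 patch):

```java
// Sketch of the stepped wait-time computation described above:
// 1 minute for the first X retries, then 2 minutes for the next X,
// and so on, capped at max-wait-time. Hypothetical names, not the
// actual OOZIE-1402 patch.
public class BackoffInterval {
    public static int nextWaitMinutes(int retryCount, int retriesPerStep, int maxWaitMinutes) {
        int step = 1 + retryCount / retriesPerStep; // retryCount is 0-based
        return Math.min(step, maxWaitMinutes);
    }
}
```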
[jira] [Updated] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2471: --- Attachment: OOZIE-2471-4.patch minor change in logger > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch, OOZIE-2471-2.patch, > OOZIE-2471-3.patch, OOZIE-2471-4.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2503) show ChildJobURLs to spark action
[ https://issues.apache.org/jira/browse/OOZIE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2503: --- Attachment: OOZIE-2503-1.patch > show ChildJobURLs to spark action > - > > Key: OOZIE-2503 > URL: https://issues.apache.org/jira/browse/OOZIE-2503 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2503-1.patch > > > To support the spark action in Oozie, please add "ChildJobURLs" to the spark > action, so that the actual Spark job info is accessible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1735) Support resuming of failed coordinator job and rerun of a failed coordinator action
[ https://issues.apache.org/jira/browse/OOZIE-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1735: --- Attachment: OOZIE-1735-Doc-V4.patch [Oozie documentation|https://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html#Pre-Conditions] says {quote} Rerun coordinator action must be in TIMEDOUT/SUCCEEDED/KILLED/FAILED. Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state. {quote} This is no longer true after support for resuming a failed coordinator job and rerunning a failed coordinator action was added. Uploading a documentation change patch. > Support resuming of failed coordinator job and rerun of a failed coordinator > action > --- > > Key: OOZIE-1735 > URL: https://issues.apache.org/jira/browse/OOZIE-1735 > Project: Oozie > Issue Type: Bug >Reporter: Purshotam Shah >Assignee: Purshotam Shah > Fix For: 4.1.0 > > Attachments: OOZIE-1735-Doc-V4.patch, OOZIE-1735-V2.patch, > OOZIE-1735-V2.patch, OOZIE-1735-V3.patch, OOZIE-1735_v1.patch > > > We should support resuming of a failed coordinator job. Jobs are set to failed > if there is a runtime error (like an SQL timeout). > In the current scenario there is no way to recover besides running SQL. > Resuming a failed coordinator job should also set pending to 1, reset > doneMaterialization, and set last modified to the current time, so that > materialization continues. > We should also provide an option of resuming a failed action. The behavior will > be the same as the killed option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2506) Add logs into RecoveryService for logging information about queued commands
[ https://issues.apache.org/jira/browse/OOZIE-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234472#comment-15234472 ] Satish Subhashrao Saley commented on OOZIE-2506: {{non-binding}} It would be good to have the wording of the logs in a similar fashion for all three - bundle, coord and workflow. For bundle and workflow, we have {code} log.debug("Recover a bundle action from [KILLED] status and resubmit CoordKillXCommand :[{0}]", baction.getCoordId()); log.debug("Recover a workflow action from [{0}] status and resubmit ActionEndXCommand :[{1}]", action.getStatus(), action.getId()); {code} But for coord {code} log.debug("Recover a [KILLED] coord action and resubmit KillXCommand :[{0}]", caction.getId()); {code} It would be nice to be consistent: {code} log.debug("Recover a coord action from [KILLED] status and resubmit KillXCommand :[{0}]", caction.getId()); {code} > Add logs into RecoveryService for logging information about queued commands > -- > > Key: OOZIE-2506 > URL: https://issues.apache.org/jira/browse/OOZIE-2506 > Project: Oozie > Issue Type: Bug > Components: core >Reporter: abhishek bafna >Assignee: abhishek bafna > Fix For: 4.3.0 > > Attachments: OOZIE-2506-01.patch > > > Currently, RecoveryService does not log information about workflow action > commands and coordinator commands. It just logs the counter for the different > commands that got queued. Logging the different commands after queuing can be > helpful in debugging a system which has a lot of work load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2503) show ChildJobURLs to spark action
Satish Subhashrao Saley created OOZIE-2503: -- Summary: show ChildJobURLs to spark action Key: OOZIE-2503 URL: https://issues.apache.org/jira/browse/OOZIE-2503 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley Priority: Minor To support the spark action in Oozie, please add "ChildJobURLs" to the spark action, so that the actual Spark job info is accessible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2479) SparkContext Not Using Yarn Config
[ https://issues.apache.org/jira/browse/OOZIE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232673#comment-15232673 ] Satish Subhashrao Saley commented on OOZIE-2479: You may need to set up {{oozie.service.HadoopAccessorService.hadoop.configurations}} property which will point to the *-site.xmls of hadoop. [https://oozie.apache.org/docs/4.2.0/oozie-default.xml#oozie.service.HadoopAccessorService.hadoop.configurations] > SparkContext Not Using Yarn Config > -- > > Key: OOZIE-2479 > URL: https://issues.apache.org/jira/browse/OOZIE-2479 > Project: Oozie > Issue Type: Bug > Components: workflow >Affects Versions: 4.2.0 > Environment: Oozie 4.2.0.2.3.4.0-3485 > Spark 1.4.1 > Scala 2.10.5 > HDP 2.3 >Reporter: Breandán Mac Parland >Assignee: Satish Subhashrao Saley > > The spark action does not appear to use the jobTracker setting in > job.properties (or in the yarn config) when creating the SparkContext. When > jobTracker property is set to use myDomain:8050 (to match the > yarn.resourcemanager.address setting), I can see in the oozie UI (click on > job > action > action configuration) that myDomain:8050 is being submitted > but when I drill down into the hadoop job history logs I see the error > indicating that a default 0.0.0.0:8032 is being used: > *job.properties* > {code} > nameNode=hdfs://myDomain:8020 > jobTracker=myOtherDomain:8050 > queueName=default > master=yarn # have also tried yarn-cluster and yarn-client > > oozie.use.system.libpath=true > oozie.wf.application.path=${nameNode}/bmp/ > oozie.action.sharelib.for.spark=spark2 # I've added the updated spark libs I > need in here > {code} > > *workflow* > {code} > > > > > ${jobTracker} > ${nameNode} > > > > ${master} > My Workflow > uk.co.bmp.drivers.MyDriver > ${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar > --conf > spark.yarn.historyServer.address=http://myDomain:18088 --conf > spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf > spark.eventLog.enabled=true 
> ${nameNode}/bmp/input/input_file.csv > > > > > > Workflow failed, error > message[${wf:errorMessage(wf:lastErrorNode())}] > > > > > {code} > *Error* > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From > myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: > java.net.ConnectException: Connection refused. For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > ... > at org.apache.spark.SparkContext.(SparkContext.scala:497) > ... > {code} > Where is it pulling 8032 from? Why does it not use the port configured in the > job.properties? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
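The property suggested in the comment above lives in oozie-site.xml; a sketch of the entry (the `/etc/hadoop/conf` directory is an assumed location for the cluster's *-site.xml files, and the `*` authority applies it to all clusters):

```xml
<!-- oozie-site.xml: map cluster authorities to a local directory holding
     that cluster's core-site.xml, yarn-site.xml, etc., so the Spark action
     picks up the real ResourceManager address instead of 0.0.0.0:8032. -->
<property>
    <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
    <value>*=/etc/hadoop/conf</value>
</property>
```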
[jira] [Assigned] (OOZIE-2479) SparkContext Not Using Yarn Config
[ https://issues.apache.org/jira/browse/OOZIE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley reassigned OOZIE-2479: -- Assignee: Satish Subhashrao Saley > SparkContext Not Using Yarn Config > -- > > Key: OOZIE-2479 > URL: https://issues.apache.org/jira/browse/OOZIE-2479 > Project: Oozie > Issue Type: Bug > Components: workflow >Affects Versions: 4.2.0 > Environment: Oozie 4.2.0.2.3.4.0-3485 > Spark 1.4.1 > Scala 2.10.5 > HDP 2.3 >Reporter: Breandán Mac Parland >Assignee: Satish Subhashrao Saley > > The spark action does not appear to use the jobTracker setting in > job.properties (or in the yarn config) when creating the SparkContext. When > jobTracker property is set to use myDomain:8050 (to match the > yarn.resourcemanager.address setting), I can see in the oozie UI (click on > job > action > action configuration) that myDomain:8050 is being submitted > but when I drill down into the hadoop job history logs I see the error > indicating that a default 0.0.0.0:8032 is being used: > *job.properties* > {code} > nameNode=hdfs://myDomain:8020 > jobTracker=myOtherDomain:8050 > queueName=default > master=yarn # have also tried yarn-cluster and yarn-client > > oozie.use.system.libpath=true > oozie.wf.application.path=${nameNode}/bmp/ > oozie.action.sharelib.for.spark=spark2 # I've added the updated spark libs I > need in here > {code} > > *workflow* > {code} > > > > > ${jobTracker} > ${nameNode} > > > > ${master} > My Workflow > uk.co.bmp.drivers.MyDriver > ${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar > --conf > spark.yarn.historyServer.address=http://myDomain:18088 --conf > spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf > spark.eventLog.enabled=true > ${nameNode}/bmp/input/input_file.csv > > > > > > Workflow failed, error > message[${wf:errorMessage(wf:lastErrorNode())}] > > > > > {code} > *Error* > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() 
threw exception,Call From > myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: > java.net.ConnectException: Connection refused. For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > ... > at org.apache.spark.SparkContext.(SparkContext.scala:497) > ... > {code} > Where is it pulling 8032 from? Why does it not use the port configured in the > job.properties? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2508) Documentation change for Coord action rerun
Satish Subhashrao Saley created OOZIE-2508: -- Summary: Documentation change for Coord action rerun Key: OOZIE-2508 URL: https://issues.apache.org/jira/browse/OOZIE-2508 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley One user 1) Killed a coordinator job. 2) Reran one of the actions of the killed coordinator. According to the [Oozie Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], the rerun should not work: {quote} Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state. {quote} But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added support for resuming a failed coordinator job and rerunning a failed coordinator action. Therefore, the documentation needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1735) Support resuming of failed coordinator job and rerun of a failed coordinator action
[ https://issues.apache.org/jira/browse/OOZIE-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1735:
---
Attachment: (was: OOZIE-1735-Doc-V4.patch)

> Support resuming of failed coordinator job and rerun of a failed coordinator action
> ---
>
> Key: OOZIE-1735
> URL: https://issues.apache.org/jira/browse/OOZIE-1735
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Fix For: 4.1.0
>
> Attachments: OOZIE-1735-V2.patch, OOZIE-1735-V2.patch, OOZIE-1735-V3.patch, OOZIE-1735_v1.patch
>
> We should support resuming a failed coordinator job. Jobs are set to failed if there is a runtime error (like a SQL timeout).
> In the current scenario there is no way to recover besides running SQL.
> Resuming a failed coordinator job should also set pending to 1, reset doneMaterialization, and set last modified to the current time, so that materialization continues.
> We should also provide an option to resume a failed action. The behavior will be the same as for the killed option.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2508:
---
Attachment: OOZIE-2508-1.patch

> Documentation change for Coord action rerun [OOZIE-1735]
> ---
>
> Key: OOZIE-2508
> URL: https://issues.apache.org/jira/browse/OOZIE-2508
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2508-1.patch
>
> One user
> 1) Killed a coordinator job.
> 2) Reran one of the actions of the killed coordinator.
> According to the [Oozie Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], rerun should not work:
> {quote}
> Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state.
> {quote}
> But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added support for resuming a failed coordinator job and rerunning a failed coordinator action.
> Therefore, the documentation needs to be updated.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2503) show ChildJobURLs to spark action
[ https://issues.apache.org/jira/browse/OOZIE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2503:
---
Description: Add "ChildJobURLs" to the Spark action so that the actual Spark job info is easily accessible. (was: To support spark action in oozie, please add "ChildJobURLs" to spark action. So the actual SPARK job info is accessible.)

> show ChildJobURLs to spark action
> ---
>
> Key: OOZIE-2503
> URL: https://issues.apache.org/jira/browse/OOZIE-2503
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Minor
> Attachments: OOZIE-2503-1.patch
>
> Add "ChildJobURLs" to the Spark action so that the actual Spark job info is easily accessible.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2508:
---
Summary: Documentation change for Coord action rerun [OOZIE-1735] (was: Documentation change for Coord action rerun )

> Documentation change for Coord action rerun [OOZIE-1735]
> ---
>
> Key: OOZIE-2508
> URL: https://issues.apache.org/jira/browse/OOZIE-2508
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
>
> One user
> 1) Killed a coordinator job.
> 2) Reran one of the actions of the killed coordinator.
> According to the [Oozie Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], rerun should not work:
> {quote}
> Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state.
> {quote}
> But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added support for resuming a failed coordinator job and rerunning a failed coordinator action.
> Therefore, the documentation needs to be updated.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2508) Documentation change for Coord action rerun
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2508:
---
Summary: Documentation change for Coord action rerun (was: Documentation change for Coord action rerun)

> Documentation change for Coord action rerun
> ---
>
> Key: OOZIE-2508
> URL: https://issues.apache.org/jira/browse/OOZIE-2508
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
>
> One user
> 1) Killed a coordinator job.
> 2) Reran one of the actions of the killed coordinator.
> According to the [Oozie Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], rerun should not work:
> {quote}
> Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state.
> {quote}
> But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added support for resuming a failed coordinator job and rerunning a failed coordinator action.
> Therefore, the documentation needs to be updated.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2503) show ChildJobURLs to spark action
[ https://issues.apache.org/jira/browse/OOZIE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235589#comment-15235589 ] Satish Subhashrao Saley commented on OOZIE-2503:

Test failures are flaky and not related.

> show ChildJobURLs to spark action
> ---
>
> Key: OOZIE-2503
> URL: https://issues.apache.org/jira/browse/OOZIE-2503
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Minor
> Attachments: OOZIE-2503-1.patch
>
> Add "ChildJobURLs" to the Spark action so that the actual Spark job info is easily accessible.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-1735) Support resuming of failed coordinator job and rerun of a failed coordinator action
[ https://issues.apache.org/jira/browse/OOZIE-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235691#comment-15235691 ] Satish Subhashrao Saley commented on OOZIE-1735:

Created a new jira to update the documentation - [OOZIE-2508|https://issues.apache.org/jira/browse/OOZIE-2508]

> Support resuming of failed coordinator job and rerun of a failed coordinator action
> ---
>
> Key: OOZIE-1735
> URL: https://issues.apache.org/jira/browse/OOZIE-1735
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Fix For: 4.1.0
>
> Attachments: OOZIE-1735-V2.patch, OOZIE-1735-V2.patch, OOZIE-1735-V3.patch, OOZIE-1735_v1.patch
>
> We should support resuming a failed coordinator job. Jobs are set to failed if there is a runtime error (like a SQL timeout).
> In the current scenario there is no way to recover besides running SQL.
> Resuming a failed coordinator job should also set pending to 1, reset doneMaterialization, and set last modified to the current time, so that materialization continues.
> We should also provide an option to resume a failed action. The behavior will be the same as for the killed option.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2473:
---
Attachment: OOZIE-2473-1.patch

> Connection pool for SMTP connection
> ---
>
> Key: OOZIE-2473
> URL: https://issues.apache.org/jira/browse/OOZIE-2473
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2473-1.patch
>
> Currently, to send an email we set up a new connection every time, which is costly. It would be good to have a connection pool.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185262#comment-15185262 ] Satish Subhashrao Saley commented on OOZIE-2473:

[Review board link|https://reviews.apache.org/r/44516]

> Connection pool for SMTP connection
> ---
>
> Key: OOZIE-2473
> URL: https://issues.apache.org/jira/browse/OOZIE-2473
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2473-1.patch
>
> Currently, to send an email we set up a new connection every time, which is costly. It would be good to have a connection pool.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2473:
---
Description: Currently, to send an email we setup new connection every time which seems costly. It would be good to have a connection pool.
was: Currently, to send an email we setup new connection every time which seems costly. It would be good to have a connection pool.

> Connection pool for SMTP connection
> ---
>
> Key: OOZIE-2473
> URL: https://issues.apache.org/jira/browse/OOZIE-2473
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2473-1.patch
>
> Currently, to send an email we set up a new connection every time, which is costly. It would be good to have a connection pool.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185272#comment-15185272 ] Satish Subhashrao Saley commented on OOZIE-2473:

Following are the highlights:
1. Currently, we use the static Transport.send() method to send an email. This creates a new connection every time.
2. To avoid this, we can use the sendMessage() method of the Transport class and save the transport object in a pool.
3. Our current javax.mail package is quite old (http://mvnrepository.com/artifact/javax.mail/mail/1.4); it dates from when Sun Microsystems owned Java. I was getting "javax.mail.NoSuchProviderException: Invalid protocol: null" because the provider names changed.
new - javax.mail.Provider[TRANSPORT,smtp,com.sun.mail.smtp.SMTPTransport,Oracle]
old - javax.mail.Provider[TRANSPORT,smtp,com.sun.mail.smtp.SMTPTransport,Sun Microsystems, Inc]
Therefore, we switched to version 1.5.5 (https://java.net/projects/javamail/pages/Home).
4. We used the Apache Commons Pool library (https://commons.apache.org/proper/commons-pool/) to implement the pool.
5. A PoolService has been introduced so that in the future anyone can use this (or any other) library to implement their own pool and access it from anywhere through PoolService.
6. oozie-default introduces some name-value pairs for the SMTP connection pool.

> Connection pool for SMTP connection
> ---
>
> Key: OOZIE-2473
> URL: https://issues.apache.org/jira/browse/OOZIE-2473
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2473-1.patch
>
> Currently, to send an email we set up a new connection every time, which is costly. It would be good to have a connection pool.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
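The pooling idea in points 1-5 above can be sketched as a minimal bounded pool of reusable transports. The class and method names here are illustrative stand-ins, not the actual classes in the OOZIE-2473 patch (which uses Apache Commons Pool and javax.mail 1.5.5):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of an SMTP connection pool; names are illustrative.
final class SmtpConnectionPool {

    // Stand-in for a javax.mail Transport kept connected between sends.
    static final class PooledTransport {
        boolean connected = true;
        void sendMessage(String message) {
            // Transport.sendMessage(...) would go here; reusing the object
            // avoids the reconnect that the static Transport.send() incurs.
        }
    }

    private final Deque<PooledTransport> idle = new ArrayDeque<>();
    private final int maxIdle;

    SmtpConnectionPool(int maxIdle) {
        this.maxIdle = maxIdle;
    }

    // Reuse an idle connection if one exists, otherwise open a new one.
    synchronized PooledTransport borrow() {
        PooledTransport t = idle.poll();
        return t != null ? t : new PooledTransport();
    }

    // Keep the connection for reuse, up to maxIdle; close the rest.
    synchronized void release(PooledTransport t) {
        if (idle.size() < maxIdle) {
            idle.push(t);
        } else {
            t.connected = false; // would call Transport.close()
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

The patch's PoolService would sit in front of such a pool, so callers borrow and release transports instead of calling the static Transport.send().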
[jira] [Commented] (OOZIE-2246) CoordinatorInputCheckCommand does not behave properly when har file is one of data dependency and doesn't exist
[ https://issues.apache.org/jira/browse/OOZIE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178251#comment-15178251 ] Satish Subhashrao Saley commented on OOZIE-2246:

Thank you [~puru]

> CoordinatorInputCheckCommand does not behave properly when har file is one of data dependency and doesn't exist
> ---
>
> Key: OOZIE-2246
> URL: https://issues.apache.org/jira/browse/OOZIE-2246
> Project: Oozie
> Issue Type: Bug
> Reporter: Ryota Egashira
> Assignee: Satish Subhashrao Saley
> Fix For: trunk
>
> Attachments: OOZIE-2246-v2.patch, OOZIE-2246-v3.patch, OOZIE-2246-v4.patch, OOZIE-2246.patch
>
> When the har file doesn't exist, FileSystem.createFileSystem in HadoopAccessorService throws an exception, failing CoordActionInputCheck.
> Thus, even though there are other data dependencies which already exist, that is not reflected in the DB.
> The coordinator job cannot start until the har file becomes available anyway, and once it is available this error doesn't happen, so basic functionality is fine, but it's misleading.
> {code}
> 2014-03-13 22:00:00,051 WARN CallableQueueService$CallableWrapper:542
> [pool-2-thread-288] - USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-]
> exception callable [coord_action_input], E1021: Coord Action Input Check
> Error: org.apache.oozie.service.HadoopAccessorException: E0902: Exception
> occured: [Invalid path for the Har Filesystem. No index file in
> har://:8020/data/2014031322/archive.har]
> org.apache.oozie.command.CommandException: E1021: Coord Action Input Check
> Error: org.apache.oozie.service.HadoopAccessorException: E0902: Exception
> occured: [Invalid path for the Har Filesystem.
No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:182) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:67) > at org.apache.oozie.command.XCommand.call(XCommand.java:280) > at > org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.io.IOException: > org.apache.oozie.service.HadoopAccessorException: E0902: Exception occured: > [Invalid path for the Har Filesystem. No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.pathExists(CoordActionInputCheckXCommand.java:493) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkListOfPaths(CoordActionInputCheckXCommand.java:459) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkResolvedUris(CoordActionInputCheckXCommand.java:429) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.checkInput(CoordActionInputCheckXCommand.java:259) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.execute(CoordActionInputCheckXCommand.java:132) > ... 6 more > Caused by: org.apache.oozie.service.HadoopAccessorException: E0902: Exception > occured: [Invalid path for the Har Filesystem. 
No index file in > har://:8020/data/2014031322/archive.har] > at > org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:430) > at > org.apache.oozie.dependency.FSURIHandler.getFileSystem(FSURIHandler.java:134) > at org.apache.oozie.dependency.FSURIHandler.exists(FSURIHandler.java:99) > at > org.apache.oozie.command.coord.CoordActionInputCheckXCommand.pathExists(CoordActionInputCheckXCommand.java:488) > ... 10 more > Caused by: java.io.IOException: Invalid path for the Har Filesystem. No index > file in har://:8020/data/2014031322/archive.har > at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:139) > at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2160) > at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:303) > at > org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:422) > at > org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:420) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1284) > at > org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:420) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
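The fix the report implies can be sketched as an input check that treats a throwing existence check on one URI (e.g. a har: path with no index yet) as "not yet available" instead of aborting the whole command, so the paths that do exist still get recorded. The PathChecker interface and method names are illustrative, not Oozie's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: tolerate a failing per-path check instead of
// failing the whole CoordActionInputCheck.
final class InputCheck {
    interface PathChecker {
        boolean exists(String uri) throws Exception;
    }

    // Returns the URIs confirmed available; a thrown check is treated the
    // same as a missing path rather than failing the command.
    static List<String> availablePaths(List<String> uris, PathChecker checker) {
        List<String> available = new ArrayList<>();
        for (String uri : uris) {
            try {
                if (checker.exists(uri)) {
                    available.add(uri);
                }
            } catch (Exception e) {
                // e.g. HadoopAccessorException for an incomplete har archive:
                // log it and keep checking the remaining dependencies.
            }
        }
        return available;
    }
}
```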
[jira] [Updated] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2471: --- Attachment: OOZIE-2471-3.patch Minor change in logging > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch, OOZIE-2471-2.patch, > OOZIE-2471-3.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2446) Job does not fail during submission if non existent credential is specified
[ https://issues.apache.org/jira/browse/OOZIE-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178320#comment-15178320 ] Satish Subhashrao Saley commented on OOZIE-2446: Thank you [~puru] > Job does not fail during submission if non existent credential is specified > --- > > Key: OOZIE-2446 > URL: https://issues.apache.org/jira/browse/OOZIE-2446 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-2446-1.patch, OOZIE-2446-2.patch, > OOZIE-2446-3.patch, OOZIE-2446-4.patch, OOZIE-2446-5.patch > > > {noformat} > > > .. > > {noformat} > User had specified howlauth instead of hcatauth. Job was launched and failed > because it could not connect to hcat. It should have failed in the server > itself when no definition for howlauth credential was found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2430) Add root logger for hive,sqoop action
[ https://issues.apache.org/jira/browse/OOZIE-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2430:
---
Attachment: OOZIE-2430-5-amend.patch

Amend patch to fix the layout of log statements.

> Add root logger for hive,sqoop action
> ---
>
> Key: OOZIE-2430
> URL: https://issues.apache.org/jira/browse/OOZIE-2430
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Fix For: trunk
>
> Attachments: OOZIE-2430-1.patch, OOZIE-2430-2.patch, OOZIE-2430-3.patch, OOZIE-2430-4.patch, OOZIE-2430-5-amend.patch
>
> There is no root logger for the Hive and Sqoop actions like there is for Pig, so only statements from Hive are logged, missing the Hadoop and Tez log statements. We need to add a root logger to the Oozie server configuration and use that value as the root logger in Pig, Sqoop, and Hive.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (OOZIE-2430) Add root logger for hive,sqoop action
[ https://issues.apache.org/jira/browse/OOZIE-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley reopened OOZIE-2430:

Reopening to test the amend patch.

> Add root logger for hive,sqoop action
> ---
>
> Key: OOZIE-2430
> URL: https://issues.apache.org/jira/browse/OOZIE-2430
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Fix For: trunk
>
> Attachments: OOZIE-2430-1.patch, OOZIE-2430-2.patch, OOZIE-2430-3.patch, OOZIE-2430-4.patch, OOZIE-2430-5-amend.patch
>
> There is no root logger for the Hive and Sqoop actions like there is for Pig, so only statements from Hive are logged, missing the Hadoop and Tez log statements. We need to add a root logger to the Oozie server configuration and use that value as the root logger in Pig, Sqoop, and Hive.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2440) Exponential re-try policy for workflow action
[ https://issues.apache.org/jira/browse/OOZIE-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178623#comment-15178623 ] Satish Subhashrao Saley commented on OOZIE-2440:

Test failures are unrelated.

> Exponential re-try policy for workflow action
> ---
>
> Key: OOZIE-2440
> URL: https://issues.apache.org/jira/browse/OOZIE-2440
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Priority: Minor
> Attachments: OOZIE-2440-2.patch, OOZIE-2440-3.patch, OOZIE-2440-4.patch
>
> Currently the user can specify the retry interval and the maximum number of retries. We will add another element in the action tag through which the user can specify the retry policy. The policy could be exponential or periodic; periodic will remain the default, as it is the current policy.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
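The two policies the description mentions can be sketched as follows; the method names and the doubling rule for the exponential case are assumptions for illustration, not the patch's exact implementation:

```java
// Sketch of the two retry policies the new action-tag element would select
// between; names and the doubling rule are illustrative assumptions.
final class RetryPolicy {
    // Periodic (current default): every attempt waits the same interval.
    static int periodicDelayMinutes(int attempt, int intervalMinutes) {
        return intervalMinutes;
    }

    // Exponential: the interval doubles per attempt, e.g. 1, 2, 4, 8 minutes.
    static int exponentialDelayMinutes(int attempt, int intervalMinutes) {
        return intervalMinutes * (1 << attempt);
    }
}
```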
[jira] [Updated] (OOZIE-2430) Add root logger for hive,sqoop action
[ https://issues.apache.org/jira/browse/OOZIE-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2430:
---
Attachment: OOZIE-2430-6-amend.patch

Sorry, I had the "Organize imports" option checked under "Save Actions" in my Eclipse, and I must have chosen the wrong HadoopShims when it asked me which HadoopShims I wanted. Uploading the amend patch.

> Add root logger for hive,sqoop action
> ---
>
> Key: OOZIE-2430
> URL: https://issues.apache.org/jira/browse/OOZIE-2430
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Fix For: trunk
>
> Attachments: OOZIE-2430-1.patch, OOZIE-2430-2.patch, OOZIE-2430-3.patch, OOZIE-2430-4.patch, OOZIE-2430-5-amend.patch, OOZIE-2430-6-amend.patch
>
> There is no root logger for the Hive and Sqoop actions like there is for Pig, so only statements from Hive are logged, missing the Hadoop and Tez log statements. We need to add a root logger to the Oozie server configuration and use that value as the root logger in Pig, Sqoop, and Hive.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2430) Add root logger for hive,sqoop action
[ https://issues.apache.org/jira/browse/OOZIE-2430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2430:
---
Attachment: OOZIE-2430-7-amend.patch

> Add root logger for hive,sqoop action
> ---
>
> Key: OOZIE-2430
> URL: https://issues.apache.org/jira/browse/OOZIE-2430
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Fix For: trunk
>
> Attachments: OOZIE-2430-1.patch, OOZIE-2430-2.patch, OOZIE-2430-3.patch, OOZIE-2430-4.patch, OOZIE-2430-5-amend.patch, OOZIE-2430-6-amend.patch, OOZIE-2430-7-amend.patch
>
> There is no root logger for the Hive and Sqoop actions like there is for Pig, so only statements from Hive are logged, missing the Hadoop and Tez log statements. We need to add a root logger to the Oozie server configuration and use that value as the root logger in Pig, Sqoop, and Hive.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2492) JSON security issue in js code
[ https://issues.apache.org/jira/browse/OOZIE-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15216728#comment-15216728 ] Satish Subhashrao Saley commented on OOZIE-2492:

+1

> JSON security issue in js code
> ---
>
> Key: OOZIE-2492
> URL: https://issues.apache.org/jira/browse/OOZIE-2492
> Project: Oozie
> Issue Type: Bug
> Components: client, security
> Affects Versions: 4.1.0
> Reporter: Ferenc Denes
> Assignee: Ferenc Denes
> Labels: security, web-console
> Fix For: trunk
>
> Attachments: OOZIE-2492-1.patch
>
> JSON parsing is done using the eval js method in several places in oozie-console.js, which allows code injection.
> The project already contains a JSON parser library, which should be used throughout the code.
> We are aware that most of the JSON documents parsed come from the Oozie server, and not directly from the user. However, fixing it all will make the code more robust and consistent.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1402) Increase retry interval for non-progressing coordinator action using exponential backoff concept
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1402:
---
Attachment: OOZIE-1402-2.patch

> Increase retry interval for non-progressing coordinator action using exponential backoff concept
> ---
>
> Key: OOZIE-1402
> URL: https://issues.apache.org/jira/browse/OOZIE-1402
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Satish Subhashrao Saley
> Priority: Minor
> Fix For: trunk
>
> Attachments: OOZIE-1402-1.patch, OOZIE-1402-2.patch
>
> Currently every coordinator action retries its data-directory check in the next minute.
> We could make this better by waiting longer between repeated retries for a coordinator action that is not progressing (i.e. finds no new directory).
> The waiting time should start at 1 minute for X retries, then the action should wait 2 minutes, and after another X retries it should wait 3. In the same way it will grow to some max-wait-time and stay there until timeout.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
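The waiting-time progression in the description (1 minute for X retries, then 2, then 3, capped at a max wait) reduces to a small formula; the class and parameter names are illustrative, not the patch's actual code:

```java
// Sketch of the stepwise backoff the description outlines: the wait stays at
// 1 minute for the first stepSize checks, 2 minutes for the next stepSize,
// and so on, capped at maxWaitMinutes. stepSize is the configurable "X".
final class CoordCheckBackoff {
    static int waitMinutes(int checksSoFar, int stepSize, int maxWaitMinutes) {
        int wait = 1 + checksSoFar / stepSize;
        return Math.min(wait, maxWaitMinutes);
    }
}
```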
[jira] [Assigned] (OOZIE-1402) Increase retry interval for non-progressing coordinator action using exponential backoff concept
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley reassigned OOZIE-1402:
---
Assignee: Satish Subhashrao Saley

> Increase retry interval for non-progressing coordinator action using exponential backoff concept
> ---
>
> Key: OOZIE-1402
> URL: https://issues.apache.org/jira/browse/OOZIE-1402
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Mona Chitnis
> Assignee: Satish Subhashrao Saley
> Priority: Minor
> Fix For: trunk
>
> Currently every coordinator action retries its data-directory check in the next minute.
> We could make this better by waiting longer between repeated retries for a coordinator action that is not progressing (i.e. finds no new directory).
> The waiting time should start at 1 minute for X retries, then the action should wait 2 minutes, and after another X retries it should wait 3. In the same way it will grow to some max-wait-time and stay there until timeout.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2473) Connection pool for SMTP connection
[ https://issues.apache.org/jira/browse/OOZIE-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2473:
---
Attachment: OOZIE-2473-1.patch

> Connection pool for SMTP connection
> ---
>
> Key: OOZIE-2473
> URL: https://issues.apache.org/jira/browse/OOZIE-2473
> Project: Oozie
> Issue Type: Bug
> Reporter: Satish Subhashrao Saley
> Assignee: Satish Subhashrao Saley
> Attachments: OOZIE-2473-1.patch, OOZIE-2473-1.patch
>
> Currently, to send an email we set up a new connection every time, which is costly. It would be good to have a connection pool.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2474) is not being applied to the launcher job
[ https://issues.apache.org/jira/browse/OOZIE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15224579#comment-15224579 ] Satish Subhashrao Saley commented on OOZIE-2474:

+1. Users need to be notified, because their {{oozie.launcher}} properties, which previously sat idle in job.xml, will take effect once this fix ships.

> is not being applied to the launcher job
> ---
>
> Key: OOZIE-2474
> URL: https://issues.apache.org/jira/browse/OOZIE-2474
> Project: Oozie
> Issue Type: Bug
> Affects Versions: trunk
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: OOZIE-2474.001.patch
>
> Properties included via {{}} don't get applied to the Oozie Launcher MR job, only to the action child jobs. This includes {{oozie.launcher.\*}} properties. The {{oozie.launcher.\*}} properties should end up in the launcher job as {{oozie.launcher.foo}} and {{foo}}, just like if you had put them in {{}}.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
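The propagation described above can be sketched as follows: each {{oozie.launcher.*}} property is copied into the launcher configuration both with and without its prefix. The class and method names are illustrative, not the actual code in OOZIE-2474.001.patch:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of launcher-property propagation: oozie.launcher.foo
// must land in the launcher conf as both oozie.launcher.foo and foo.
final class LauncherConfCopier {
    static final String PREFIX = "oozie.launcher.";

    static Map<String, String> toLauncherConf(Map<String, String> actionConf) {
        Map<String, String> launcher = new HashMap<>();
        for (Map.Entry<String, String> e : actionConf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                // keep the prefixed form...
                launcher.put(e.getKey(), e.getValue());
                // ...and the stripped form the launcher job actually reads
                launcher.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return launcher;
    }
}
```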
[jira] [Created] (OOZIE-2500) -DtestJarSimple option mentioned in minioozie doc does not work
Satish Subhashrao Saley created OOZIE-2500: -- Summary: -DtestJarSimple option mentioned in minioozie doc does not work Key: OOZIE-2500 URL: https://issues.apache.org/jira/browse/OOZIE-2500 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley [Minioozie doc|https://oozie.apache.org/docs/4.2.0/ENG_MiniOozie.html] says to run {{$ mvn clean install -DskipTests -DtestJarSimple}} to populate local maven repo with required jars. But the command fails with following message: {code} [INFO] 100 errors [INFO] - [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Oozie Main .. SUCCESS [ 1.339 s] [INFO] Apache Oozie Hadoop Utils .. SUCCESS [ 0.999 s] [INFO] Apache Oozie Hadoop Distcp hadoop-1-4.3.0-SNAPSHOT . SUCCESS [ 0.138 s] [INFO] Apache Oozie Hadoop Auth hadoop-1-4.3.0-SNAPSHOT ... SUCCESS [ 0.244 s] [INFO] Apache Oozie Hadoop Libs ... SUCCESS [ 0.052 s] [INFO] Apache Oozie Client SUCCESS [ 3.282 s] [INFO] Apache Oozie Share Lib Oozie ... SUCCESS [ 2.257 s] [INFO] Apache Oozie Share Lib HCatalog SUCCESS [ 2.088 s] [INFO] Apache Oozie Share Lib Distcp .. SUCCESS [ 0.693 s] [INFO] Apache Oozie Core .. SUCCESS [ 20.951 s] [INFO] Apache Oozie Share Lib Streaming ... FAILURE [ 2.688 s] [INFO] Apache Oozie Share Lib Pig . SKIPPED [INFO] Apache Oozie Share Lib Hive SKIPPED [INFO] Apache Oozie Share Lib Hive 2 .. SKIPPED [INFO] Apache Oozie Share Lib Sqoop ... SKIPPED [INFO] Apache Oozie Examples .. SKIPPED [INFO] Apache Oozie Share Lib Spark ... SKIPPED [INFO] Apache Oozie Share Lib . SKIPPED [INFO] Apache Oozie Docs .. SKIPPED [INFO] Apache Oozie WebApp SKIPPED [INFO] Apache Oozie Tools . SKIPPED [INFO] Apache Oozie MiniOozie . SKIPPED [INFO] Apache Oozie Distro SKIPPED [INFO] Apache Oozie ZooKeeper Security Tests .. 
SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 35.474 s [INFO] Finished at: 2016-04-04T16:04:41-07:00 [INFO] Final Memory: 161M/1262M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile (default-testCompile) on project oozie-sharelib-streaming: Compilation failure: Compilation failure: [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestStreamingMain.java:[31,39] error: cannot find symbol [ERROR] class MainTestCase [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[76,49] error: cannot find symbol [ERROR] class ActionExecutorTestCase [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[342,14] error: cannot find symbol [ERROR] class TestMapReduceActionExecutor [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[365,14] error: cannot find symbol [ERROR] class TestMapReduceActionExecutor [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[389,38] error: cannot find symbol [ERROR] class TestMapReduceActionExecutor [ERROR] /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestStreamingMain.java:[34,24] error: cannot find symbol [ERROR] class TestStreamingMain {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2500) -DtestJarSimple option mentioned in minioozie doc does not work
[ https://issues.apache.org/jira/browse/OOZIE-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2500: --- Attachment: out.txt > -DtestJarSimple option mentioned in minioozie doc does not work > --- > > Key: OOZIE-2500 > URL: https://issues.apache.org/jira/browse/OOZIE-2500 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley > Attachments: out.txt > > > [Minioozie doc|https://oozie.apache.org/docs/4.2.0/ENG_MiniOozie.html] says > to run {{$ mvn clean install -DskipTests -DtestJarSimple}} to populate local > maven repo with required jars. But the command fails with following message: > {code} > [INFO] 100 errors > [INFO] - > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Apache Oozie Main .. SUCCESS [ 1.339 > s] > [INFO] Apache Oozie Hadoop Utils .. SUCCESS [ 0.999 > s] > [INFO] Apache Oozie Hadoop Distcp hadoop-1-4.3.0-SNAPSHOT . SUCCESS [ 0.138 > s] > [INFO] Apache Oozie Hadoop Auth hadoop-1-4.3.0-SNAPSHOT ... SUCCESS [ 0.244 > s] > [INFO] Apache Oozie Hadoop Libs ... SUCCESS [ 0.052 > s] > [INFO] Apache Oozie Client SUCCESS [ 3.282 > s] > [INFO] Apache Oozie Share Lib Oozie ... SUCCESS [ 2.257 > s] > [INFO] Apache Oozie Share Lib HCatalog SUCCESS [ 2.088 > s] > [INFO] Apache Oozie Share Lib Distcp .. SUCCESS [ 0.693 > s] > [INFO] Apache Oozie Core .. SUCCESS [ 20.951 > s] > [INFO] Apache Oozie Share Lib Streaming ... FAILURE [ 2.688 > s] > [INFO] Apache Oozie Share Lib Pig . SKIPPED > [INFO] Apache Oozie Share Lib Hive SKIPPED > [INFO] Apache Oozie Share Lib Hive 2 .. SKIPPED > [INFO] Apache Oozie Share Lib Sqoop ... SKIPPED > [INFO] Apache Oozie Examples .. SKIPPED > [INFO] Apache Oozie Share Lib Spark ... SKIPPED > [INFO] Apache Oozie Share Lib . SKIPPED > [INFO] Apache Oozie Docs .. SKIPPED > [INFO] Apache Oozie WebApp SKIPPED > [INFO] Apache Oozie Tools . SKIPPED > [INFO] Apache Oozie MiniOozie . 
SKIPPED > [INFO] Apache Oozie Distro SKIPPED > [INFO] Apache Oozie ZooKeeper Security Tests .. SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 35.474 s > [INFO] Finished at: 2016-04-04T16:04:41-07:00 > [INFO] Final Memory: 161M/1262M > [INFO] > > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:2.3.2:testCompile > (default-testCompile) on project oozie-sharelib-streaming: Compilation > failure: Compilation failure: > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestStreamingMain.java:[31,39] > error: cannot find symbol > [ERROR] class MainTestCase > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[76,49] > error: cannot find symbol > [ERROR] class ActionExecutorTestCase > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[342,14] > error: cannot find symbol > [ERROR] class TestMapReduceActionExecutor > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[365,14] > error: cannot find symbol > [ERROR] class TestMapReduceActionExecutor > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestMapReduceActionExecutor.java:[389,38] > error: cannot find symbol > [ERROR] class TestMapReduceActionExecutor > [ERROR] > /Users/saley/src/oozie/sharelib/streaming/src/test/java/org/apache/oozie/action/hadoop/TestStreamingMain.java:[34,24] > error: cannot find symbol > [ERROR] class TestStreamingMain > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2471: --- Attachment: OOZIE-2471-5.patch > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch, OOZIE-2471-2.patch, > OOZIE-2471-3.patch, OOZIE-2471-4.patch, OOZIE-2471-5.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1402) Increase retry interval for non-progressing coordinator action with fix value
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1402: --- Attachment: OOZIE-1402-3.patch > Increase retry interval for non-progressing coordinator action with fix value > - > > Key: OOZIE-1402 > URL: https://issues.apache.org/jira/browse/OOZIE-1402 > Project: Oozie > Issue Type: Improvement >Affects Versions: trunk >Reporter: Mona Chitnis >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-1402-1.patch, OOZIE-1402-2.patch, > OOZIE-1402-3.patch > > > Currently, every coordinator action retries checking the data directory in the > next minute. > We could make it better by waiting longer between repeated retries for a coordinator action that is not > progressing (i.e. finds no new directory). > The waiting time should start at 1 minute for X retries. Then the action > should wait for 2 minutes. After another X retries it should wait for 3 minutes. In the same way > it will grow to some max-wait-time and stay there until timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
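The proposed retry scheme (1 minute for the first X retries, then 2 minutes, and so on up to max-wait-time) can be sketched as a simple step function. This is an illustration only; `step` (the X in the description) and `max_wait` are hypothetical values, not Oozie defaults:

```python
def next_wait_minutes(retries: int, step: int = 3, max_wait: int = 10) -> int:
    """Step-wise backoff for a non-progressing coordinator action.

    Wait 1 minute for the first `step` retries, 2 minutes for the next
    `step`, and so on, capped at `max_wait` minutes until timeout.
    `step` and `max_wait` are illustrative placeholders.
    """
    return min(retries // step + 1, max_wait)

# The wait grows in fixed increments (the "fix value" in the summary),
# not exponentially: retries 0-2 wait 1 minute, 3-5 wait 2 minutes, ...
```

Unlike the exponential backoff in the issue's original title, the wait here increases by a fixed step, which keeps the worst-case delay predictable.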
[jira] [Commented] (OOZIE-2330) Spark action should take the global jobTracker and nameNode configs by default
[ https://issues.apache.org/jira/browse/OOZIE-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252279#comment-15252279 ] Satish Subhashrao Saley commented on OOZIE-2330: Thank you [~rkanter]. I will verify and update accordingly. > Spark action should take the global jobTracker and nameNode configs by default > -- > > Key: OOZIE-2330 > URL: https://issues.apache.org/jira/browse/OOZIE-2330 > Project: Oozie > Issue Type: Improvement > Components: action >Reporter: Wei Yan >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2330-1.patch > > > In Spark Action 0.1 schema, the job-tracker and name-node are required. > {code} > > > {code} > It would be better if the Spark action could take default values from the > global configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
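For context, the global defaults mentioned in the issue come from a workflow's {{<global>}} section. A minimal sketch, assuming the standard Oozie workflow schema; the workflow name and property names are placeholders:

```xml
<workflow-app name="spark-wf" xmlns="uri:oozie:workflow:0.5">
    <global>
        <!-- Actions that omit job-tracker/name-node inherit these values -->
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
    </global>
    <!-- spark action and control nodes elided -->
</workflow-app>
```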
[jira] [Created] (OOZIE-2512) ShareLibservice returns incorrect path for jar
Satish Subhashrao Saley created OOZIE-2512: -- Summary: ShareLibservice returns incorrect path for jar Key: OOZIE-2512 URL: https://issues.apache.org/jira/browse/OOZIE-2512 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley If we have the {{oozie.service.ShareLibService.mapping.file}} setting pointing to a metafile, then ShareLibService loads paths from that file. We can specify either a path to a directory or a direct path to a file. When a path to a directory is specified, the paths are populated correctly, but not when we specify a direct path to a file. Consider the following paths: * /sharelib/pig/ ** pig.jar ** some.jar * /sharelib/spark ** spark-assembly.jar In the metafile, we have {code} oozie.pig=/sharelib/pig/ oozie.spark=/sharelib/spark/spark-assembly.jar {code} Now, ShareLibService calculates the paths as pig - hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar spark - /sharelib/spark/spark-assembly.jar The spark path does not have the hdfs scheme prefixed. Later on, when we run a spark action, it fails with {code} Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File file:/sharelib/spark/spark-assembly.jar does not exist java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar does not exist {code} Note that if we specify hdfs-prefixed paths in the metafile, it works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
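This is not the actual patch, but the missing step amounts to qualifying bare paths with the default filesystem URI before they are used. A sketch of that logic; the cluster address is a placeholder taken from the example above:

```python
from urllib.parse import urlparse

def qualify(path: str, default_fs: str = "hdfs://clustername.com:8020") -> str:
    """Prefix a sharelib mapping-file entry with the default filesystem
    URI when it has no scheme, so direct file entries behave like
    directory entries, which are already qualified correctly.
    `default_fs` is a placeholder, not a real cluster address."""
    if urlparse(path).scheme:  # e.g. "hdfs" -> already fully qualified
        return path
    return default_fs + path
```

Without this step a scheme-less entry like {{/sharelib/spark/spark-assembly.jar}} is later resolved against the local filesystem, producing the {{file:/...}} FileNotFoundException shown above.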
[jira] [Commented] (OOZIE-1402) Increase retry interval for non-progressing coordinator action with fix value
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254045#comment-15254045 ] Satish Subhashrao Saley commented on OOZIE-1402: Thank you [~puru] for the review. I have made the changes accordingly. Test failures are not related. > Increase retry interval for non-progressing coordinator action with fix value > - > > Key: OOZIE-1402 > URL: https://issues.apache.org/jira/browse/OOZIE-1402 > Project: Oozie > Issue Type: Improvement > Components: core >Affects Versions: trunk >Reporter: Mona Chitnis >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-1402-1.patch, OOZIE-1402-2.patch, > OOZIE-1402-3.patch > > > Currently, every coordinator action retries checking the data directory in the > next minute. > We could make it better by waiting longer between repeated retries for a coordinator action that is not > progressing (i.e. finds no new directory). > The waiting time should start at 1 minute for X retries. Then the action > should wait for 2 minutes. After another X retries it should wait for 3 minutes. In the same way > it will grow to some max-wait-time and stay there until timeout. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2508: --- Attachment: OOZIE-2508-1.patch > Documentation change for Coord action rerun [OOZIE-1735] > > > Key: OOZIE-2508 > URL: https://issues.apache.org/jira/browse/OOZIE-2508 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2508-1.patch, OOZIE-2508-1.patch > > > One user > > 1) Killed a coordinator job. > 2) Reran one of the actions of the killed coordinator. > According to the [Oozie > Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], > rerun should not work. > {quote} > Coordinator actions cannot be rerun if the coordinator job is in the KILLED > or FAILED state. > {quote} > But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added > support for resuming a failed coordinator job and rerunning a failed > coordinator action. > Therefore, the documentation needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15254301#comment-15254301 ] Satish Subhashrao Saley commented on OOZIE-2508: Uploaded a patch with the suggested changes. > Documentation change for Coord action rerun [OOZIE-1735] > > > Key: OOZIE-2508 > URL: https://issues.apache.org/jira/browse/OOZIE-2508 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2508-1.patch, OOZIE-2508-2.patch > > > One user > > 1) Killed a coordinator job. > 2) Reran one of the actions of the killed coordinator. > According to the [Oozie > Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], > rerun should not work. > {quote} > Coordinator actions cannot be rerun if the coordinator job is in the KILLED > or FAILED state. > {quote} > But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added > support for resuming a failed coordinator job and rerunning a failed > coordinator action. > Therefore, the documentation needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2508: --- Attachment: (was: OOZIE-2508-1.patch) > Documentation change for Coord action rerun [OOZIE-1735] > > > Key: OOZIE-2508 > URL: https://issues.apache.org/jira/browse/OOZIE-2508 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2508-1.patch, OOZIE-2508-2.patch > > > One user > > 1) Killed a coordinator job. > 2) Reran one of the actions of the killed coordinator. > According to the [Oozie > Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], > rerun should not work. > {quote} > Coordinator actions cannot be rerun if the coordinator job is in the KILLED > or FAILED state. > {quote} > But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added > support for resuming a failed coordinator job and rerunning a failed > coordinator action. > Therefore, the documentation needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2514) Checkstyle violation while doing mvn verify or mvn install
Satish Subhashrao Saley created OOZIE-2514: -- Summary: Checkstyle violation while doing mvn verify or mvn install Key: OOZIE-2514 URL: https://issues.apache.org/jira/browse/OOZIE-2514 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley Priority: Minor {code} $ mvn verify -DskipTests -X [INFO] Starting audit... /home/saley/tmp/oozie/sharelib/spark/src/test/java/org/apache/oozie/action/hadoop/TestSparkOptionsSplitter.java:9: Line does not match expected header line of ' *'. Audit done. [INFO] There are 1 checkstyle errors. [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Oozie Main . SUCCESS [1.559s] [INFO] Apache Oozie Hadoop Utils . SUCCESS [0.825s] [INFO] Apache Oozie Hadoop Distcp hadoop-1-4.3.0-SNAPSHOT SUCCESS [0.193s] [INFO] Apache Oozie Hadoop Auth hadoop-1-4.3.0-SNAPSHOT .. SUCCESS [0.237s] [INFO] Apache Oozie Hadoop Libs .. SUCCESS [0.058s] [INFO] Apache Oozie Client ... SUCCESS [1.096s] [INFO] Apache Oozie Share Lib Oozie .. SUCCESS [0.957s] [INFO] Apache Oozie Share Lib HCatalog ... SUCCESS [0.625s] [INFO] Apache Oozie Share Lib Distcp . SUCCESS [0.229s] [INFO] Apache Oozie Core . SUCCESS [8.575s] [INFO] Apache Oozie Share Lib Streaming .. SUCCESS [2.885s] [INFO] Apache Oozie Share Lib Pig SUCCESS [0.501s] [INFO] Apache Oozie Share Lib Hive ... SUCCESS [0.806s] [INFO] Apache Oozie Share Lib Hive 2 . SUCCESS [0.911s] [INFO] Apache Oozie Share Lib Sqoop .. SUCCESS [0.486s] [INFO] Apache Oozie Examples . SUCCESS [0.840s] [INFO] Apache Oozie Share Lib Spark .. FAILURE [1.552s] [INFO] Apache Oozie Share Lib SKIPPED [INFO] Apache Oozie Docs . SKIPPED [INFO] Apache Oozie WebApp ... SKIPPED [INFO] Apache Oozie Tools SKIPPED [INFO] Apache Oozie MiniOozie SKIPPED [INFO] Apache Oozie Distro ... SKIPPED [INFO] Apache Oozie ZooKeeper Security Tests . 
SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 23.249s [INFO] Finished at: Thu Apr 21 22:47:09 UTC 2016 [INFO] Final Memory: 40M/322M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.9.1:check (default) on project oozie-sharelib-spark: You have 1 Checkstyle violation. -> [Help 1] {code} The Checkstyle plugin does not like the structure of some lines in the Apache License statement. It's giving an error on {code} * * http://www.apache.org/licenses/LICENSE-2.0 * {code} We need to remove the tag. I will ship it with [OOZIE-2503|https://issues.apache.org/jira/browse/OOZIE-2503] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2508) Documentation change for Coord action rerun [OOZIE-1735]
[ https://issues.apache.org/jira/browse/OOZIE-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252964#comment-15252964 ] Satish Subhashrao Saley commented on OOZIE-2508: Hi [~puru], Moving the comment here. I think you mistakenly put it under OOZIE-2503. {code} - * Coordinator actions cannot be rerun if the coordinator job is in the KILLED or FAILED state. + * Coordinator actions can be rerun even if the coordinator job is in the KILLED or FAILED state. {code} Maybe we should say that "Coordinator actions cannot be rerun if the coordinator job is in the PREP or IGNORED state". Users should know for which coordinator job statuses they can't rerun a coordinator action. I will put it that way. > Documentation change for Coord action rerun [OOZIE-1735] > > > Key: OOZIE-2508 > URL: https://issues.apache.org/jira/browse/OOZIE-2508 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2508-1.patch > > > One user > > 1) Killed a coordinator job. > 2) Reran one of the actions of the killed coordinator. > According to the [Oozie > Documentation|http://oozie.apache.org/docs/4.2.0/DG_CoordinatorRerun.html], > rerun should not work. > {quote} > Coordinator actions cannot be rerun if the coordinator job is in the KILLED > or FAILED state. > {quote} > But [OOZIE-1735|https://issues.apache.org/jira/browse/OOZIE-1735] added > support for resuming a failed coordinator job and rerunning a failed > coordinator action. > Therefore, the documentation needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2471) Show child job url tab for distcp
[ https://issues.apache.org/jira/browse/OOZIE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15252971#comment-15252971 ] Satish Subhashrao Saley commented on OOZIE-2471: Thank you for your comment. I will try to do the same for OOZIE-2503 > Show child job url tab for distcp > - > > Key: OOZIE-2471 > URL: https://issues.apache.org/jira/browse/OOZIE-2471 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2471-1.patch, OOZIE-2471-2.patch, > OOZIE-2471-3.patch, OOZIE-2471-4.patch > > > The actual distcp job url is not displayed in Child Jobs tab and one has to > go to the launcher job URL to find it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-1402) Increase retry interval for non-progressing coordinator action with fix value
[ https://issues.apache.org/jira/browse/OOZIE-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-1402: --- Summary: Increase retry interval for non-progressing coordinator action with fix value (was: Increase retry interval for non-progressing coordinator action using exponential backoff concept ) > Increase retry interval for non-progressing coordinator action with fix value > - > > Key: OOZIE-1402 > URL: https://issues.apache.org/jira/browse/OOZIE-1402 > Project: Oozie > Issue Type: Improvement >Affects Versions: trunk >Reporter: Mona Chitnis >Assignee: Satish Subhashrao Saley >Priority: Minor > Fix For: trunk > > Attachments: OOZIE-1402-1.patch, OOZIE-1402-2.patch > > > Currently every coordinator action is retried to check data directory in the > next minute. > We could make it better by waiting longer for coordinator action that is not > progressing (i.e. find no new directory) for repeated retries > The waiting time should start from 1 minute for X retries. Then the action > should wait for 2 minutes. After X retries it should wait for 3. The same way > it will go to some max-wait-time and stay there until timeout -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2512) ShareLibservice returns incorrect path for jar
[ https://issues.apache.org/jira/browse/OOZIE-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2512: --- Attachment: OOZIE-2512-1.patch > ShareLibservice returns incorrect path for jar > -- > > Key: OOZIE-2512 > URL: https://issues.apache.org/jira/browse/OOZIE-2512 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2512-1.patch > > > If we have the {{oozie.service.ShareLibService.mapping.file}} setting pointing to > a metafile, then ShareLibService loads paths from that file. > We can specify either a path to a directory or a direct path to a file. > When a path to a directory is specified, the paths are populated correctly, but > not when we specify a direct path to a file. > Consider the following paths: > * /sharelib/pig/ > ** pig.jar > ** some.jar > * /sharelib/spark > ** spark-assembly.jar > In the metafile, we have > {code} > oozie.pig=/sharelib/pig/ > oozie.spark=/sharelib/spark/spark-assembly.jar > {code} > Now, ShareLibService calculates the paths as > pig - > hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar > spark - /sharelib/spark/spark-assembly.jar > The spark path does not have the hdfs scheme prefixed. > Later on, when we run a spark action, it fails with > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File > file:/sharelib/spark/spark-assembly.jar does not exist > java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar > does not exist > {code} > Note that if we specify hdfs-prefixed paths in the metafile, it > works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2330) Spark action should take the global jobTracker and nameNode configs by default
[ https://issues.apache.org/jira/browse/OOZIE-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2330: --- Attachment: OOZIE-2330-3.patch > Spark action should take the global jobTracker and nameNode configs by default > -- > > Key: OOZIE-2330 > URL: https://issues.apache.org/jira/browse/OOZIE-2330 > Project: Oozie > Issue Type: Improvement > Components: action >Reporter: Wei Yan >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2330-1.patch, OOZIE-2330-2.patch, > OOZIE-2330-3.patch > > > In Spark Action 0.1 schema, the job-tracker and name-node are required. > {code} > > > {code} > It would be better that the spark action can take default values from the > global configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292628#comment-15292628 ] Satish Subhashrao Saley commented on OOZIE-2482: Thank you for review Robert. I set spark.executorEnv.PYTHONPATH=pyspark.zip:py4j-0.9-src.zip and it started working. I am checking whether we should have it by default as well. > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2482-1.patch, OOZIE-2482-2.patch, > OOZIE-2482-zip.patch, py4j-0.9-src.zip, pyspark.zip > > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. 
> java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:221) > at java.io.FileOutputStream.(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547) > at > org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483) > at org.apache.log4j.LogManager.(LogManager.java:127) > at > org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275) > at > org.apache.hadoop.service.AbstractService.(AbstractService.java:43) > Using properties file: null > Parsed arguments: > master yarn-master > deployMode cluster > executorMemory null > 
executorCores null > totalExecutorCores null > propertiesFile null > driverMemorynull > driverCores null > driverExtraClassPathnull > driverExtraLibraryPath null > driverExtraJavaOptions null > supervise false > queue null > numExecutorsnull > files null > pyFiles null > archivesnull > mainClass null > primaryResource > hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py > namePysparkpi example > childArgs [100] > jarsnull > packagesnull >
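The workaround mentioned in the comment above (setting {{spark.executorEnv.PYTHONPATH}}) can be passed through the action's {{<spark-opts>}} element. A minimal sketch, assuming the pyspark.zip and py4j-0.9-src.zip files attached to this issue are distributed alongside the action; element names follow the Spark action schema and the remaining configuration is elided:

```xml
<spark xmlns="uri:oozie:spark-action:0.1">
    <!-- job-tracker, name-node, master, mode, name elided -->
    <spark-opts>--conf spark.executorEnv.PYTHONPATH=pyspark.zip:py4j-0.9-src.zip</spark-opts>
    <jar>pi.py</jar>
</spark>
```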
[jira] [Updated] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2482: --- Attachment: OOZIE-2482-3.patch - referring to hdfs location for pyspark dependencies in --py-files option - setting PYTHONPATH in case of local mode - documentation > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2482-1.patch, OOZIE-2482-2.patch, > OOZIE-2482-3.patch, OOZIE-2482-zip.patch, py4j-0.9-src.zip, pyspark.zip > > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. 
> java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:221) > at java.io.FileOutputStream.(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547) > at > org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483) > at org.apache.log4j.LogManager.(LogManager.java:127) > at > org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275) > at > org.apache.hadoop.service.AbstractService.(AbstractService.java:43) > Using properties file: null > Parsed arguments: > master yarn-master > deployMode cluster > executorMemory null > 
executorCores null > totalExecutorCores null > propertiesFile null > driverMemorynull > driverCores null > driverExtraClassPathnull > driverExtraLibraryPath null > driverExtraJavaOptions null > supervise false > queue null > numExecutorsnull > files null > pyFiles null > archivesnull > mainClass null > primaryResource > hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py > namePysparkpi example > childArgs [100] > jarsnull > packagesnull > packagesExclusions null >
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291537#comment-15291537 ] Satish Subhashrao Saley commented on OOZIE-2482: Could you please share the logs for application_1461692698792_19525? > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2482-1.patch, OOZIE-2482-2.patch, > OOZIE-2482-zip.patch, py4j-0.9-src.zip, pyspark.zip > > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. 
> java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.&lt;init&gt;(FileOutputStream.java:221) > at java.io.FileOutputStream.&lt;init&gt;(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547) > at > org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483) > at org.apache.log4j.LogManager.&lt;clinit&gt;(LogManager.java:127) > at > org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275) > at > org.apache.hadoop.service.AbstractService.&lt;clinit&gt;(AbstractService.java:43) > Using properties file: null > Parsed arguments: > master yarn-master > deployMode cluster > executorMemory null > 
executorCores null > totalExecutorCores null > propertiesFile null > driverMemory null > driverCores null > driverExtraClassPath null > driverExtraLibraryPath null > driverExtraJavaOptions null > supervise false > queue null > numExecutors null > files null > pyFiles null > archives null > mainClass null > primaryResource > hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py > name Pysparkpi example > childArgs [100] > jars null > packages null > packagesExclusions null > repositories null > verbose true > Spark properties
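For reference, the parsed arguments above correspond to a spark action of roughly this shape. This is a reconstruction for illustration only: the enclosing workflow and action names are omitted, and the `<master>` element is assumed to be `yarn-cluster` (which yields master yarn with deployMode cluster); only the name, primary resource, and child argument are taken from the dump.

```xml
<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>yarn-cluster</master>
    <name>Pysparkpi example</name>
    <jar>hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py</jar>
    <arg>100</arg>
</spark>
```

When the primary resource is a `.py` file, the Spark-on-YARN launcher additionally needs SPARK_HOME (and the pyspark/py4j zips) visible in the container, which is exactly what is missing here.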
[jira] [Updated] (OOZIE-2529) Pass on the secret keys from Credentials to launcher job conf
[ https://issues.apache.org/jira/browse/OOZIE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2529: --- Attachment: OOZIE-2529-1.patch > Pass on the secret keys from Credentials to launcher job conf > - > > Key: OOZIE-2529 > URL: https://issues.apache.org/jira/browse/OOZIE-2529 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2529-1.patch > > > Currently, we insert credential tokens into the launcher job conf. We should also > insert credential secret keys into the launcher job conf. That way, a user can pass > on secret information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
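The idea behind the patch can be sketched without Hadoop on the classpath. `SimpleCredentials` below is only a stand-in for `org.apache.hadoop.security.Credentials` (which carries both delegation tokens and named secret keys); the class, method, and alias names are illustrative, not Oozie's actual code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for org.apache.hadoop.security.Credentials:
// it carries both tokens and named secret keys.
class SimpleCredentials {
    final Map<String, String> tokens = new HashMap<>();
    final Map<String, byte[]> secretKeys = new HashMap<>();
}

public class LauncherCredsSketch {
    static void mergeIntoLauncher(SimpleCredentials action, SimpleCredentials launcher) {
        // Previously only tokens were propagated to the launcher job conf;
        // the change proposed here also propagates the secret keys.
        launcher.tokens.putAll(action.tokens);
        launcher.secretKeys.putAll(action.secretKeys); // the added step
    }

    public static void main(String[] args) {
        SimpleCredentials action = new SimpleCredentials();
        action.tokens.put("RM_DELEGATION_TOKEN", "opaque-token");
        action.secretKeys.put("my.secret", "s3cr3t".getBytes(StandardCharsets.UTF_8));

        SimpleCredentials launcher = new SimpleCredentials();
        mergeIntoLauncher(action, launcher);
        System.out.println(launcher.tokens.containsKey("RM_DELEGATION_TOKEN")
                + " " + launcher.secretKeys.containsKey("my.secret"));
    }
}
```

With only the first `putAll`, the second flag printed would be false: the launched job would see the tokens but lose any secret material a custom Credentials implementation attached.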
[jira] [Updated] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2482: --- Attachment: py4j-0.9-src.zip
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285023#comment-15285023 ] Satish Subhashrao Saley commented on OOZIE-2482: Tests are failing because jenkins is unable to find the {{py4j}} and {{pyspark}} zips in test resources. Attaching them here as per discussion with [~rohini]. I have added those in {{sharelib/spark/src/test/resources}}
[jira] [Updated] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2482: --- Attachment: pyspark.zip
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285702#comment-15285702 ] Satish Subhashrao Saley commented on OOZIE-2482: Setting SPARK_HOME=. will work as well, but we need to make sure that the pyspark and py4j zip files are under the $SPARK_HOME/python/lib/ directory, since Spark looks for them there in [this code|https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1049-L1051]. The main reason for moving to Spark 1.6.1 is the version mismatch errors I faced while writing the tests. {code} Exception: Python in worker has different version 2.7 than that in driver /Users/saley/src/oozie/sharelib/spark/target/test-data/minicluster/mapred/local/1_0/taskTracker/test/jobcache/job_0001/attempt_0001_m_00_0/work/tmp/spark-f71bd1cd-72f6-458d-b3c2-930c5a0eeb00, PySpark cannot run with different minor versions {code} [~rkanter] I agree with you regarding documenting the change and appropriate error messages. Also, if users are already using {{oozie.service.ShareLibService.mapping.file}} for the spark sharelib, we can encourage them to add the paths for the pyspark and py4j zip files there. That way an individual user does not need to copy the zip files into the workflow lib/ directory.
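The sharelib suggestion above can be sketched as a mapping-file entry. This is illustrative only: the HDFS paths are assumptions, not taken from the issue; the real file is whatever {{oozie.service.ShareLibService.mapping.file}} points at, and each key names a sharelib with a comma-separated list of paths to ship.

```properties
# Hypothetical entry in the file named by oozie.service.ShareLibService.mapping.file:
# ship the spark jars plus the two Python zips with every spark action.
oozie.spark=hdfs:///share/lib/spark,hdfs:///share/lib/spark/python/lib/pyspark.zip,hdfs:///share/lib/spark/python/lib/py4j-0.9-src.zip
```

Listed this way, the zips are localized into every spark action's container, so individual workflows do not need to carry them in their own lib/ directory.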
[jira] [Updated] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2482: --- Attachment: OOZIE-2482-1.patch
[jira] [Updated] (OOZIE-2531) Prevent Spark trying for token which is already available
[ https://issues.apache.org/jira/browse/OOZIE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2531: --- Attachment: OOZIE-2531-2.patch > Prevent Spark trying for token which is already available > -- > > Key: OOZIE-2531 > URL: https://issues.apache.org/jira/browse/OOZIE-2531 > Project: Oozie > Issue Type: Improvement >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2531-1.patch, OOZIE-2531-2.patch > > > As per the [Apache Spark > documentation|http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties]: > The property {{spark.yarn.security.tokens.service.enabled}} in Apache Spark controls whether to retrieve > delegation tokens for non-HDFS services when security is enabled. By default, delegation tokens for all supported > services are retrieved when those services are configured, but it's possible > to disable that behavior if it somehow conflicts with the application being > run. > Currently supported services are: hive, hbase > It would be good to have > {{spark.yarn.security.tokens.hive.enabled=false}} as the default, to avoid having Spark > redundantly retry obtaining a token when it already has one (passed > originally from Oozie's hcat credentials). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
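Until such a default lands in Oozie, the same effect can be had per workflow by passing the property through the spark action's spark-opts. A minimal sketch; the hbase flag is shown only as the analogous case for the other supported service:

```xml
<spark-opts>--conf spark.yarn.security.tokens.hive.enabled=false --conf spark.yarn.security.tokens.hbase.enabled=false</spark-opts>
```

With the flag off, Spark relies on the delegation token Oozie already obtained via its hcat credentials instead of requesting a fresh one.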
[jira] [Updated] (OOZIE-2529) Pass on the secret keys from Credentials to launcher job conf
[ https://issues.apache.org/jira/browse/OOZIE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2529: --- Attachment: (was: OOZIE-2529-1.patch) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2531) Prevent Spark trying for token which is already available
Satish Subhashrao Saley created OOZIE-2531: -- Summary: Prevent Spark trying for token which is already available Key: OOZIE-2531 URL: https://issues.apache.org/jira/browse/OOZIE-2531 Project: Oozie Issue Type: Improvement Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2531) Prevent Spark trying for token which is already available
[ https://issues.apache.org/jira/browse/OOZIE-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2531: --- Attachment: OOZIE-2531-1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2529) Support adding secret keys to Credentials of Launcher
[ https://issues.apache.org/jira/browse/OOZIE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283256#comment-15283256 ] Satish Subhashrao Saley commented on OOZIE-2529: Testing JIRA OOZIE-2529 Cleaning local git workspace {color:green}+1 PATCH_APPLIES{color} {color:green}+1 CLEAN{color} {color:green}+1 RAW_PATCH_ANALYSIS{color} .{color:green}+1{color} the patch does not introduce any @author tags .{color:green}+1{color} the patch does not introduce any tabs .{color:green}+1{color} the patch does not introduce any trailing spaces .{color:green}+1{color} the patch does not introduce any line longer than 132 .{color:green}+1{color} the patch does adds/modifies 2 testcase(s) {color:green}+1 RAT{color} .{color:green}+1{color} the patch does not seem to introduce new RAT warnings {color:green}+1 JAVADOC{color} .{color:green}+1{color} the patch does not seem to introduce new Javadoc warnings {color:green}+1 COMPILE{color} .{color:green}+1{color} HEAD compiles .{color:green}+1{color} patch compiles .{color:green}+1{color} the patch does not seem to introduce new javac warnings {color:green}+1 BACKWARDS_COMPATIBILITY{color} .{color:green}+1{color} the patch does not change any JPA Entity/Colum/Basic/Lob/Transient annotations .{color:green}+1{color} the patch does not modify JPA files {color:green}+1 TESTS{color} .Tests run: 1777 {color:green}+1 DISTRO{color} .{color:green}+1{color} distro tarball builds with the patch {color:green}*+1 Overall result, good!, no -1s*{color} The full output of the test-patch run is available at . 
https://builds.apache.org/job/oozie-trunk-precommit-build/2876/ > Support adding secret keys to Credentials of Launcher > - > > Key: OOZIE-2529 > URL: https://issues.apache.org/jira/browse/OOZIE-2529 > Project: Oozie > Issue Type: Improvement >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2529-1.patch > > > Currently, we only add credentials tokens to launcher job conf. We should > also add credential secret keys to launcher job conf. That way, custom > Credentials implementations can pass on secret keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OOZIE-2529) Pass on the secret keys from Credentials to launcher job conf
Satish Subhashrao Saley created OOZIE-2529: -- Summary: Pass on the secret keys from Credentials to launcher job conf Key: OOZIE-2529 URL: https://issues.apache.org/jira/browse/OOZIE-2529 Project: Oozie Issue Type: Bug Reporter: Satish Subhashrao Saley Assignee: Satish Subhashrao Saley Priority: Minor Currently, we insert credentials tokens to launcher job conf. We should also insert credential secret keys to launcher job conf. That way, user can pass on some secret information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2479) SparkContext Not Using Yarn Config
[ https://issues.apache.org/jira/browse/OOZIE-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15269579#comment-15269579 ] Satish Subhashrao Saley commented on OOZIE-2479: [~OneDeadEar] Did it work for you? > SparkContext Not Using Yarn Config > -- > > Key: OOZIE-2479 > URL: https://issues.apache.org/jira/browse/OOZIE-2479 > Project: Oozie > Issue Type: Bug > Components: workflow >Affects Versions: 4.2.0 > Environment: Oozie 4.2.0.2.3.4.0-3485 > Spark 1.4.1 > Scala 2.10.5 > HDP 2.3 >Reporter: Breandán Mac Parland >Assignee: Satish Subhashrao Saley > > The spark action does not appear to use the jobTracker setting in > job.properties (or in the yarn config) when creating the SparkContext. When > jobTracker property is set to use myDomain:8050 (to match the > yarn.resourcemanager.address setting), I can see in the oozie UI (click on > job > action > action configuration) that myDomain:8050 is being submitted > but when I drill down into the hadoop job history logs I see the error > indicating that a default 0.0.0.0:8032 is being used: > *job.properties* > {code} > nameNode=hdfs://myDomain:8020 > jobTracker=myOtherDomain:8050 > queueName=default > master=yarn # have also tried yarn-cluster and yarn-client > > oozie.use.system.libpath=true > oozie.wf.application.path=${nameNode}/bmp/ > oozie.action.sharelib.for.spark=spark2 # I've added the updated spark libs I > need in here > {code} > > *workflow* > {code} > > > > > ${jobTracker} > ${nameNode} > > > > ${master} > My Workflow > uk.co.bmp.drivers.MyDriver > ${nameNode}/bmp/lib/bmp.spark-assembly-1.0.jar > --conf > spark.yarn.historyServer.address=http://myDomain:18088 --conf > spark.eventLog.dir=hdfs://myDomain/user/spark/applicationHistory --conf > spark.eventLog.enabled=true > ${nameNode}/bmp/input/input_file.csv > > > > > > Workflow failed, error > message[${wf:errorMessage(wf:lastErrorNode())}] > > > > > {code} > *Error* > {code} > Failing Oozie Launcher, Main class > 
[org.apache.oozie.action.hadoop.SparkMain], main() threw exception,Call From > myDomain/ipAddress to 0.0.0.0:8032 failed on connection exception: > java.net.ConnectException: Connection refused. For more details see: > http://wiki.apache.org/hadoop/ConnectionRefused > ... > at org.apache.spark.SparkContext.(SparkContext.scala:497) > ... > {code} > Where is it pulling 8032 from? Why does it not use the port configured in the > job.properties? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
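The 0.0.0.0:8032 in the error is the hard-coded default of {{yarn.resourcemanager.address}} (default hostname 0.0.0.0, default port 8032): when no yarn-site.xml reaches the classpath of the process that builds the SparkContext, Hadoop's Configuration falls back to that default regardless of what job.properties says. A minimal Python sketch of that lookup order (an illustration only, not Hadoop's actual implementation; the dicts stand in for parsed *-site.xml files):

```python
# Hypothetical sketch of Hadoop-style configuration resolution: values
# loaded from *-site.xml files override the hard-coded default. If
# yarn-site.xml never reaches the classpath, the default 0.0.0.0:8032
# is returned, which is exactly what shows up in the error above.
DEFAULT_RM_ADDRESS = "0.0.0.0:8032"  # yarn.resourcemanager.address default

def resolve_rm_address(loaded_site_files):
    merged = {}
    for site in loaded_site_files:  # e.g. parsed core-site.xml, yarn-site.xml
        merged.update(site)
    return merged.get("yarn.resourcemanager.address", DEFAULT_RM_ADDRESS)

# With yarn-site.xml present, the configured address wins:
with_yarn = resolve_rm_address([{"yarn.resourcemanager.address": "myDomain:8050"}])
# Without it, the SparkContext sees only the fallback:
without_yarn = resolve_rm_address([])
```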
[jira] [Commented] (OOZIE-2477) Oozie Spark Node to support Standalone and Mesos Deployment modes.
[ https://issues.apache.org/jira/browse/OOZIE-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272731#comment-15272731 ] Satish Subhashrao Saley commented on OOZIE-2477: The [org.apache.spark.launcher.SparkLauncher|https://spark.apache.org/docs/latest/api/java/org/apache/spark/launcher/SparkLauncher.html] Java API has been available since Spark 1.4.0. It also lets us control the child process in which the job gets launched. > Oozie Spark Node to support Standalone and Mesos Deployment modes. > -- > > Key: OOZIE-2477 > URL: https://issues.apache.org/jira/browse/OOZIE-2477 > Project: Oozie > Issue Type: Improvement >Reporter: Ahmed Kamal > Labels: Spark > > I'm interested in extending the current spark node to support them and > contributing this to the project. An initial design document is proposed here > https://docs.google.com/document/d/12uf3B6VMgp_sI4sUiOwcmiMLTgLb5kL2weM7T0cgZNk/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley reassigned OOZIE-2482: -- Assignee: Satish Subhashrao Saley (was: Ferenc Denes) > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. 
> java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:221) > at java.io.FileOutputStream.(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547) > at > org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483) > at org.apache.log4j.LogManager.(LogManager.java:127) > at > org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275) > at > org.apache.hadoop.service.AbstractService.(AbstractService.java:43) > Using properties file: null > Parsed arguments: > master yarn-master > deployMode cluster > executorMemory null > 
executorCores null > totalExecutorCores null > propertiesFile null > driverMemory null > driverCores null > driverExtraClassPath null > driverExtraLibraryPath null > driverExtraJavaOptions null > supervise false > queue null > numExecutors null > files null > pyFiles null > archives null > mainClass null > primaryResource > hdfs://hadoopsandbox/User/toto/WORK/Oozie/pyspark/lib/pi.py > name Pysparkpi example > childArgs [100] > jars null > packages null > packagesExclusions null > repositories null > verbose true > Spark properties used, including those specified through > --conf and those from the properties file null: > spark.executorEnv.SPARK_HOME -> /opt/application/Spark/current >
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15267198#comment-15267198 ] Satish Subhashrao Saley commented on OOZIE-2482: Earlier I tried pyspark with {{yarn-cluster}} on a single-node cluster on my mac and it was very easy. But running pyspark in {{yarn-cluster}} mode on a multinode cluster needs a few more things. 1. When we submit a spark job, [Spark code | https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1047] checks for {{PYSPARK_ARCHIVES_PATH}}. If {{PYSPARK_ARCHIVES_PATH}} is not present, it looks for {{SPARK_HOME}}. Therefore, we must have at least one of them set up correctly. We can set this environment variable using the {{oozie.launcher.mapred.child.env}} property. 2. The py4j-0.9-src.zip and pyspark.zip (versions may vary based on the spark version) are necessary to run a python script in spark. Therefore, we need both of them present in the classpath while executing the script. A simple way is to put them under the lib/ directory of our workflow. 3. The [--py-files option | https://github.com/apache/spark/blob/30e980ad8e644354f3c2d48b3904499545cf/docs/submitting-applications.md#bundling-your-applications-dependencies] must be configured and passed in {{}} Settings would look like - {code} . . 
oozie.launcher.mapred.child.env PYSPARK_ARCHIVES_PATH=pyspark.zip yarn-cluster pyspark example /hdfs/path/to/pi.py --queue satishq --conf spark.yarn.historyServer.address=http://spark.yarn.hsaddress.com:#port --conf spark.ui.view.acls=* --conf spark.eventLog.dir=hdfs://hdfspath/mapred/sparkhistory --py-files pyspark.zip,py4j-0.9-src.zip {code} Oozie can do some extra work to make user's life easy by setting {{PYSPARK_ARCHIVES_PATH}}, adding --py-files option automatically by figuring out location of pyspark.zip and py4j-0.9-src.zip based on the mapping file provided by user in {{oozie.service.ShareLibService.mapping.file}} or from default sharelib location if user has not provided any mapping file. > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. 
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. > java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:221) > at java.io.FileOutputStream.(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at >
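The lookup order described in the comment above (PYSPARK_ARCHIVES_PATH first, then SPARK_HOME) can be sketched as follows. This mirrors the linked Client.scala logic in Python for illustration only; the function name is made up, and the archive file names match Spark 1.6:

```python
import os

# Hedged sketch of Spark's archive resolution: prefer the explicit
# PYSPARK_ARCHIVES_PATH, else derive the two required zips from
# SPARK_HOME. If neither is set, we hit the same "key not found:
# SPARK_HOME" failure reported in this issue.
def resolve_pyspark_archives(env):
    archives = env.get("PYSPARK_ARCHIVES_PATH")
    if archives:
        return archives.split(",")
    spark_home = env.get("SPARK_HOME")
    if spark_home is None:
        raise KeyError("SPARK_HOME")  # the failure seen in the executor logs
    lib = os.path.join(spark_home, "python", "lib")
    return [os.path.join(lib, "pyspark.zip"),
            os.path.join(lib, "py4j-0.9-src.zip")]
```

Either environment variable can be passed to the launcher via the {{oozie.launcher.mapred.child.env}} property mentioned above.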
[jira] [Commented] (OOZIE-2512) ShareLibservice returns incorrect path for jar
[ https://issues.apache.org/jira/browse/OOZIE-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278316#comment-15278316 ] Satish Subhashrao Saley commented on OOZIE-2512: Test failures are not related. > ShareLibservice returns incorrect path for jar > -- > > Key: OOZIE-2512 > URL: https://issues.apache.org/jira/browse/OOZIE-2512 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2512-1.patch, OOZIE-2512-2.patch, > OOZIE-2512-3.patch > > > If we have {{oozie.service.ShareLibService.mapping.file}} setting pointing to > a metafile, then ShareLibServe loads paths from that file. > We can mention path to a directory or path to a direct file. > When path to a directory is mentioned, the paths are populated correctly, but > not when we mentioned direct path for a file. > Consider following paths: > * /sharelib/pig/ > ** pig.jar > ** some.jar > * /sharelib/spark > ** spark-assembly.jar > In metafile, we have > {code} > oozie.pig=/sharelib/pig/ > oozie.spark=/sharelib/spark/spark-assembly.jar > {code} > Now, the SharelibService calculates the paths as > pig - > hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar > spark - /sharelib/spark/spark-assembly.jar > The spark path does not have hdfs prefixed. > Later on, when we run a spark action, it fails with > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File > file:/sharelib/spark/spark-assembly.jar does not exist > java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar > does not exist > {code} > Remember, if we already mentioned hdfs prefixed paths in metafile, then it > works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2475) Oozie does not cleanup failed actions; fills up namespace quota
[ https://issues.apache.org/jira/browse/OOZIE-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278320#comment-15278320 ] Satish Subhashrao Saley commented on OOZIE-2475: Test failures are not related. > Oozie does not cleanup failed actions; fills up namespace quota > --- > > Key: OOZIE-2475 > URL: https://issues.apache.org/jira/browse/OOZIE-2475 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2475-1.patch > > > There is a corner case where leaking happens. > When the workflow KillCommand is issued, WfEndXCommand is invoked in the > finally block, and WfEndXCommand.deleteWFDir() deletes the action dir (e.g., > /user/satish/oozie_satish/123450-15-oozie_satish-W). > But when this happens right before the launcher mapper uploads actionData to > HDFS, the previously deleted actionDir is created again and will not be > cleaned up afterwards. > The solution is to clean up the action dir in ActionKillXCommand. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2503) show ChildJobURLs to spark action
[ https://issues.apache.org/jira/browse/OOZIE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278331#comment-15278331 ] Satish Subhashrao Saley commented on OOZIE-2503: Test failures are unrelated. > show ChildJobURLs to spark action > - > > Key: OOZIE-2503 > URL: https://issues.apache.org/jira/browse/OOZIE-2503 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2503-1.patch, OOZIE-2503-2.patch > > > Adding "ChildJobURLs" to the spark action so that the actual Spark job info > is easily accessible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2512) ShareLibservice returns incorrect path for jar
[ https://issues.apache.org/jira/browse/OOZIE-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2512: --- Attachment: OOZIE-2512-2.patch > ShareLibservice returns incorrect path for jar > -- > > Key: OOZIE-2512 > URL: https://issues.apache.org/jira/browse/OOZIE-2512 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2512-1.patch, OOZIE-2512-2.patch > > > If we have {{oozie.service.ShareLibService.mapping.file}} setting pointing to > a metafile, then ShareLibServe loads paths from that file. > We can mention path to a directory or path to a direct file. > When path to a directory is mentioned, the paths are populated correctly, but > not when we mentioned direct path for a file. > Consider following paths: > * /sharelib/pig/ > ** pig.jar > ** some.jar > * /sharelib/spark > ** spark-assembly.jar > In metafile, we have > {code} > oozie.pig=/sharelib/pig/ > oozie.spark=/sharelib/spark/spark-assembly.jar > {code} > Now, the SharelibService calculates the paths as > pig - > hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar > spark - /sharelib/spark/spark-assembly.jar > The spark path does not have hdfs prefixed. > Later on, when we run a spark action, it fails with > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File > file:/sharelib/spark/spark-assembly.jar does not exist > java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar > does not exist > {code} > Remember, if we already mentioned hdfs prefixed paths in metafile, then it > works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2512) ShareLibservice returns incorrect path for jar
[ https://issues.apache.org/jira/browse/OOZIE-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15271400#comment-15271400 ] Satish Subhashrao Saley commented on OOZIE-2512: Thank you for review [~puru]. I have updated the patch accordingly. It was not possible to construct {{org.apache.hadoop.fs.Path}} out of "/some/dir/hive-site.xml#hive-site.xml" using {{org.apache.hadoop.fs.FileSystem.getFileStatus}} since the file {{hive-site.xml#hive-site.xml}} does not exist. I manually constructed the path. > ShareLibservice returns incorrect path for jar > -- > > Key: OOZIE-2512 > URL: https://issues.apache.org/jira/browse/OOZIE-2512 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2512-1.patch, OOZIE-2512-2.patch > > > If we have {{oozie.service.ShareLibService.mapping.file}} setting pointing to > a metafile, then ShareLibServe loads paths from that file. > We can mention path to a directory or path to a direct file. > When path to a directory is mentioned, the paths are populated correctly, but > not when we mentioned direct path for a file. > Consider following paths: > * /sharelib/pig/ > ** pig.jar > ** some.jar > * /sharelib/spark > ** spark-assembly.jar > In metafile, we have > {code} > oozie.pig=/sharelib/pig/ > oozie.spark=/sharelib/spark/spark-assembly.jar > {code} > Now, the SharelibService calculates the paths as > pig - > hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar > spark - /sharelib/spark/spark-assembly.jar > The spark path does not have hdfs prefixed. 
> Later on, when we run a spark action, it fails with > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File > file:/sharelib/spark/spark-assembly.jar does not exist > java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar > does not exist > {code} > Remember, if we already mentioned hdfs prefixed paths in metafile, then it > works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
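The fix discussed above amounts to qualifying a scheme-less sharelib path against the default filesystem URI, and splitting off a {{#fragment}} (as in hive-site.xml#hive-site.xml) before resolving the path, since the fragment form cannot be looked up directly. A hedged Python sketch (the function and the default URI are illustrative, not Oozie's actual code):

```python
# Illustrative sketch: prefix a bare path with the default filesystem
# URI so that "/sharelib/spark/spark-assembly.jar" no longer resolves
# against the local file:/ filesystem, and carry any '#fragment'
# through unchanged.
def qualify(path, default_fs="hdfs://clustername.com:8020"):
    path, _, fragment = path.partition("#")
    if "://" not in path:          # no scheme, so prefix the default FS
        path = default_fs + path
    return (path + "#" + fragment) if fragment else path
```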
[jira] [Updated] (OOZIE-2330) Spark action should take the global jobTracker and nameNode configs by default
[ https://issues.apache.org/jira/browse/OOZIE-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2330: --- Attachment: OOZIE-2330-2.patch > Spark action should take the global jobTracker and nameNode configs by default > -- > > Key: OOZIE-2330 > URL: https://issues.apache.org/jira/browse/OOZIE-2330 > Project: Oozie > Issue Type: Improvement > Components: action >Reporter: Wei Yan >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2330-1.patch, OOZIE-2330-2.patch > > > In Spark Action 0.1 schema, the job-tracker and name-node are required. > {code} > > > {code} > It would be better that the spark action can take default values from the > global configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2526) Spark action have no way to specify spark driver jvm settings for yarn-client mode
[ https://issues.apache.org/jira/browse/OOZIE-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276606#comment-15276606 ] Satish Subhashrao Saley commented on OOZIE-2526: {quote} The configuration element, if present, contains configuration properties that are passed to the Spark job. This is shouldn't be spark configuration. It should be mapreduce configuration for launcher job. I tried following but it doesn't gets applied to launcher mapreduce job which indicates it's being passed to spark instead of launcher mapreduce. mapreduce.map.memory.mb 8192 mapreduce.map.java.opts -Xmx7000m {quote} You can prefix any property with {{oozie.launcher.}} and it will be applied to launcher mapreduce. > Spark action have no way to specify spark driver jvm settings for yarn-client > mode > -- > > Key: OOZIE-2526 > URL: https://issues.apache.org/jira/browse/OOZIE-2526 > Project: Oozie > Issue Type: Bug >Affects Versions: 4.1.0, 4.2.0 >Reporter: nirav patel > > Currently oozie spark action has spark-opts elements which are basically > passed on to `org.apache.spark.deploy.SparkSubmit` as spark configuration. > In yarn-client mode this is too late and driver JVM is infact started when > calling > `org.apache.oozie.action.hadoop.SparkMain` class. Because oozie bypasses > spark-submit.sh script and directly calls > org.apache.spark.deploy.SparkSubmit. Hence even user specify > --driver-memory=3g it has no effect on running jvm as it's already too late. > I think oozie:launcher task which is a parent map-reduce job itself should > launch its map task (spark driver) with some user specified JVM arguments. > Oozie spark action doc says: > The configuration element, if present, contains configuration properties that > are passed to the Spark job. This is shouldn't be spark configuration. It > should be mapreduce configuration for launcher job. 
I tried following but it > doesn't gets applied to launcher mapreduce job which indicates it's being > passed to spark instead of launcher mapreduce. > > > mapreduce.map.memory.mb > 8192 > > > mapreduce.map.java.opts > -Xmx7000m > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
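The {{oozie.launcher.}} convention mentioned in the comment can be sketched as follows: properties carrying the prefix are stripped of it and applied to the launcher job's configuration, while plain properties go to the action itself. This is an illustrative Python sketch of that routing, not Oozie's actual implementation:

```python
LAUNCHER_PREFIX = "oozie.launcher."

# Illustrative sketch: collect oozie.launcher.-prefixed properties,
# strip the prefix, and return what would be set on the launcher
# MapReduce job's configuration.
def launcher_conf(action_conf):
    return {k[len(LAUNCHER_PREFIX):]: v
            for k, v in action_conf.items()
            if k.startswith(LAUNCHER_PREFIX)}
```

So the reporter's settings would be written as {{oozie.launcher.mapreduce.map.memory.mb}} and {{oozie.launcher.mapreduce.map.java.opts}} to reach the launcher JVM.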
[jira] [Updated] (OOZIE-2512) ShareLibservice returns incorrect path for jar
[ https://issues.apache.org/jira/browse/OOZIE-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2512: --- Attachment: OOZIE-2512-3.patch > ShareLibservice returns incorrect path for jar > -- > > Key: OOZIE-2512 > URL: https://issues.apache.org/jira/browse/OOZIE-2512 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2512-1.patch, OOZIE-2512-2.patch, > OOZIE-2512-3.patch > > > If we have {{oozie.service.ShareLibService.mapping.file}} setting pointing to > a metafile, then ShareLibServe loads paths from that file. > We can mention path to a directory or path to a direct file. > When path to a directory is mentioned, the paths are populated correctly, but > not when we mentioned direct path for a file. > Consider following paths: > * /sharelib/pig/ > ** pig.jar > ** some.jar > * /sharelib/spark > ** spark-assembly.jar > In metafile, we have > {code} > oozie.pig=/sharelib/pig/ > oozie.spark=/sharelib/spark/spark-assembly.jar > {code} > Now, the SharelibService calculates the paths as > pig - > hdfs://clustername.com:8020/sharelib/pig/pig.jar,hdfs://clustername.com:8020/sharelib/pig/some.jar > spark - /sharelib/spark/spark-assembly.jar > The spark path does not have hdfs prefixed. > Later on, when we run a spark action, it fails with > {code} > Failing Oozie Launcher, Main class > [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, File > file:/sharelib/spark/spark-assembly.jar does not exist > java.io.FileNotFoundException: File file:/sharelib/spark/spark-assembly.jar > does not exist > {code} > Remember, if we already mentioned hdfs prefixed paths in metafile, then it > works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2461) Workflow, Coordinator and Bundle job querying should have last modified filter
[ https://issues.apache.org/jira/browse/OOZIE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276818#comment-15276818 ] Satish Subhashrao Saley commented on OOZIE-2461: [~abhishekbafna] Thank you for review. I have updated the patch. > Workflow, Coordinator and Bundle job querying should have last modified filter > -- > > Key: OOZIE-2461 > URL: https://issues.apache.org/jira/browse/OOZIE-2461 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2461-1.patch, OOZIE-2461-6.patch, > OOZIE-2461-7.patch, OOZIE-2461-8.patch > > > To get currently running coordinator and id, one user had to do > http://localhost:11000/oozie/v1/jobs?jobtype=coord=user%3satish_1.0%3B=1=3000 > They could not use name in the filter as they include a version and keep > changing. For eg: > urs_satish_filter-0.1-daily-coord > urs_puru_service-0.4-hourly-coord > It would be good to have last modified filter to get recently active > coordinators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2461) Workflow, Coordinator and Bundle job querying should have last modified filter
[ https://issues.apache.org/jira/browse/OOZIE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2461: --- Attachment: OOZIE-2461-9.patch > Workflow, Coordinator and Bundle job querying should have last modified filter > -- > > Key: OOZIE-2461 > URL: https://issues.apache.org/jira/browse/OOZIE-2461 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > Attachments: OOZIE-2461-1.patch, OOZIE-2461-6.patch, > OOZIE-2461-7.patch, OOZIE-2461-8.patch, OOZIE-2461-9.patch > > > To get currently running coordinator and id, one user had to do > http://localhost:11000/oozie/v1/jobs?jobtype=coord=user%3satish_1.0%3B=1=3000 > They could not use name in the filter as they include a version and keep > changing. For eg: > urs_satish_filter-0.1-daily-coord > urs_puru_service-0.4-hourly-coord > It would be good to have last modified filter to get recently active > coordinators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
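The requested behavior, selecting jobs by last-modified time instead of by versioned names, can be sketched as a simple cutoff filter. The record fields below are hypothetical, chosen only to mirror the example coordinator names in the issue:

```python
from datetime import datetime

# Hypothetical sketch of a lastModified filter: keep only jobs whose
# lastModifiedTime falls on or after a cutoff, sidestepping the
# versioned names (…-0.1-…, …-0.4-…) that make name filters brittle.
def recently_modified(jobs, since):
    return [j["id"] for j in jobs if j["lastModifiedTime"] >= since]

cutoff = datetime(2016, 5, 1)
jobs = [
    {"id": "0000001-C", "name": "urs_satish_filter-0.1-daily-coord",
     "lastModifiedTime": datetime(2016, 5, 9)},
    {"id": "0000002-C", "name": "urs_puru_service-0.4-hourly-coord",
     "lastModifiedTime": datetime(2016, 3, 2)},
]
```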
[jira] [Updated] (OOZIE-2503) show ChildJobURLs to spark action
[ https://issues.apache.org/jira/browse/OOZIE-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2503: --- Attachment: OOZIE-2503-2.patch > show ChildJobURLs to spark action > - > > Key: OOZIE-2503 > URL: https://issues.apache.org/jira/browse/OOZIE-2503 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2503-1.patch, OOZIE-2503-2.patch > > > Adding "ChildJobURLs" to spark action. So the actual SPARK job info is easily > accessible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242102#comment-15242102 ] Satish Subhashrao Saley commented on OOZIE-2482: Hi [~BigDataOrange], Could you please check if you are facing https://issues.apache.org/jira/browse/SPARK-10795? [This|https://issues.apache.org/jira/browse/SPARK-10795?focusedCommentId=15180011=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15180011] and [this|https://issues.apache.org/jira/browse/SPARK-10795?focusedCommentId=15157683=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15157683] comment mentioned when the issue was seen. Internally, Oozie uses {{spark-submit}} to submit spark job. {quote} Parsed arguments: master yarn-master {quote} yarn-master is not a valid argument for master. [Spark doc|http://spark.apache.org/docs/latest/submitting-applications.html#master-urls] does not mention it. > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie. > The logs on the executors are: > {noformat} > SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in > [jar:file:/mnt/hd4/hadoop/yarn/local/filecache/145/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/mnt/hd2/hadoop/yarn/local/filecache/155/spark-assembly-1.6.0-hadoop2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/opt/application/Hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > log4j:ERROR setFile(null,true) call failed. > java.io.FileNotFoundException: > /mnt/hd7/hadoop/yarn/log/application_1454673025841_13136/container_1454673025841_13136_01_01 > (Is a directory) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:221) > at java.io.FileOutputStream.(FileOutputStream.java:142) > at org.apache.log4j.FileAppender.setFile(FileAppender.java:294) > at > org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165) > at > org.apache.hadoop.yarn.ContainerLogAppender.activateOptions(ContainerLogAppender.java:55) > at > org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172) > at > org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104) > at > org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:809) > at > org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735) > at > org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502) > at > org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547) > at > 
org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483) > at org.apache.log4j.LogManager.(LogManager.java:127) > at > org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155) > at > org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132) > at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:275) > at > org.apache.hadoop.service.AbstractService.(AbstractService.java:43) > Using properties file: null > Parsed arguments: > master yarn-master > deployMode cluster > executorMemory null > executorCores null > totalExecutorCores null > propertiesFile null > driverMemorynull > driverCores null > driverExtraClassPathnull > driverExtraLibraryPath null >
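The comment above points out that {{yarn-master}} matches none of the master URL forms Spark 1.x accepts. As an illustration only (not Oozie or Spark code; the class and method names are hypothetical), a small validator over the master values listed in the Spark submitting-applications docs shows why {{yarn-master}} is rejected:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class MasterUrlCheck {
    // Master values accepted by Spark 1.x per the submitting-applications docs;
    // "yarn-master" matches none of these, which is why spark-submit rejects it.
    private static final List<Pattern> VALID = Arrays.asList(
            Pattern.compile("local(\\[(\\d+|\\*)(,\\d+)?\\])?"),
            Pattern.compile("yarn(-client|-cluster)?"),
            Pattern.compile("spark://.+"),
            Pattern.compile("mesos://.+"));

    static boolean isValidMaster(String master) {
        for (Pattern p : VALID) {
            if (p.matcher(master).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValidMaster("yarn-cluster")); // true
        System.out.println(isValidMaster("yarn-master"));  // false
    }
}
```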
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243898#comment-15243898 ] Satish Subhashrao Saley commented on OOZIE-2482: Hi [~BigDataOrange], I had set SPARK_HOME incorrectly in hadoop-env.sh and faced the same issue. After setting it correctly, I was able to execute pi.py. {{export SPARK_HOME=/Users/saley/hadoop-stuff/spark-1.6.1-bin-hadoop2.6}} Try setting {{export PYSPARK_ARCHIVES_PATH=$SPARK_HOME/python/lib/pyspark.zip,$SPARK_HOME/python/lib/py4j-0.9-src.zip}}. It should work even if you don't set the {{PYSPARK_ARCHIVES_PATH}} variable; in that case the [else block|https://github.com/apache/spark/blob/branch-1.6/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L1049-L1058] in the code will get executed. > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Ferenc Denes > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie.
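The fallback the comment refers to can be sketched as follows. This is an illustration in Java (the actual else block is Scala in Spark's yarn Client), and the class name, method, and the py4j version in the path are assumptions: when {{PYSPARK_ARCHIVES_PATH}} is unset, the archives are resolved under {{$SPARK_HOME/python/lib}}, which is why a missing SPARK_HOME surfaces as "key not found: SPARK_HOME".

```java
public class PysparkArchives {
    // Illustrative fallback: prefer an explicit PYSPARK_ARCHIVES_PATH,
    // otherwise derive the archives from SPARK_HOME. If neither is set,
    // fail the same way the reporter observed.
    static String archivesPath(String pysparkArchivesPath, String sparkHome) {
        if (pysparkArchivesPath != null && !pysparkArchivesPath.isEmpty()) {
            return pysparkArchivesPath;
        }
        if (sparkHome == null) {
            throw new IllegalStateException("key not found: SPARK_HOME");
        }
        return sparkHome + "/python/lib/pyspark.zip,"
                + sparkHome + "/python/lib/py4j-0.9-src.zip";
    }

    public static void main(String[] args) {
        System.out.println(archivesPath(null, "/opt/application/Spark/current"));
    }
}
```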
[jira] [Updated] (OOZIE-2330) Spark action should take the global jobTracker and nameNode configs by default
[ https://issues.apache.org/jira/browse/OOZIE-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2330: --- Attachment: OOZIE-2330-1.patch > Spark action should take the global jobTracker and nameNode configs by default > -- > > Key: OOZIE-2330 > URL: https://issues.apache.org/jira/browse/OOZIE-2330 > Project: Oozie > Issue Type: Improvement > Components: action >Reporter: Wei Yan >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2330-1.patch > > > In Spark Action 0.1 schema, the job-tracker and name-node are required. > {code} > > > {code} > It would be better if the Spark action could take default values from the > global configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
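The improvement requested above is a plain fall-through: use the action-level value if present, otherwise the value from the workflow's global section. A minimal sketch (class and method names are hypothetical, not Oozie's actual resolver):

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalConfigFallback {
    // Hypothetical resolver: an action-level value wins; otherwise fall
    // back to the value from the workflow's <global> section.
    static String resolve(Map<String, String> action,
                          Map<String, String> global, String key) {
        String v = action.get(key);
        return (v != null) ? v : global.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> global = new HashMap<>();
        global.put("job-tracker", "jt.example.com:8032");
        global.put("name-node", "hdfs://nn.example.com:8020");

        Map<String, String> action = new HashMap<>();
        action.put("name-node", "hdfs://other-nn:8020");

        System.out.println(resolve(action, global, "job-tracker")); // falls back to global
        System.out.println(resolve(action, global, "name-node"));   // action-level wins
    }
}
```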
[jira] [Assigned] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley reassigned OOZIE-2482: -- Assignee: Satish Subhashrao Saley > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Satish Subhashrao Saley > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie.
[jira] [Commented] (OOZIE-2330) Spark action should take the global jobTracker and nameNode configs by default
[ https://issues.apache.org/jira/browse/OOZIE-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240185#comment-15240185 ] Satish Subhashrao Saley commented on OOZIE-2330: Hi [~ywskycn], are you planning to put up a patch, or shall I reassign the ticket to myself? > Spark action should take the global jobTracker and nameNode configs by default > -- > > Key: OOZIE-2330 > URL: https://issues.apache.org/jira/browse/OOZIE-2330 > Project: Oozie > Issue Type: Improvement > Components: action >Reporter: Wei Yan >Assignee: Wei Yan >Priority: Minor > > In Spark Action 0.1 schema, the job-tracker and name-node are required. > {code} > > > {code} > It would be better if the Spark action could take default values from the > global configs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2482) Pyspark job fails with Oozie
[ https://issues.apache.org/jira/browse/OOZIE-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242873#comment-15242873 ] Satish Subhashrao Saley commented on OOZIE-2482: [~fdenes] Have you already resolved the issue (I saw the ticket was reassigned)? If not, I am willing to work on it. > Pyspark job fails with Oozie > > > Key: OOZIE-2482 > URL: https://issues.apache.org/jira/browse/OOZIE-2482 > Project: Oozie > Issue Type: Bug > Components: core, workflow >Affects Versions: 4.2.0 > Environment: Hadoop 2.7.2, Spark 1.6.0 on Yarn, Oozie 4.2.0 > Cluster secured with Kerberos >Reporter: Alexandre Linte >Assignee: Ferenc Denes > > Hello, > I'm trying to run pi.py example in a pyspark job with Oozie. Every try I made > failed for the same reason: key not found: SPARK_HOME. > Note: A scala job works well in the environment with Oozie.
[jira] [Commented] (OOZIE-2536) Shell action got stuck for 6 hours even after Exit status is 0
[ https://issues.apache.org/jira/browse/OOZIE-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299031#comment-15299031 ] Satish Subhashrao Saley commented on OOZIE-2536: My initial guess is that sometimes {{propagation-conf.xml}} gets deleted and the AsyncDispatcher event handler is unable to find it. Following is the analysis so far:
1. [OOZIE-2129|https://issues.apache.org/jira/browse/OOZIE-2129] added {{propagation-conf.xml}} to the configuration in the Mapper phase of LauncherMapper.
{code}
Configuration.addDefaultResource(PROPAGATION_CONF_XML);
{code}
2. An event handler tries to relocalize (delete unnecessary) files in the current directory:
{code}
private void relocalize() {
    File[] curLocalFiles = curDir.listFiles();
    for (int j = 0; j < curLocalFiles.length; ++j) {
        if (!localizedFiles.contains(curLocalFiles[j])) {
            // found one that wasn't there before: delete it
            boolean deleted = false;
            try {
                if (curFC != null) {
                    // this is recursive, unlike File delete():
                    deleted = curFC.delete(new Path(curLocalFiles[j].getName()), true);
                }
            } catch (IOException e) {
                deleted = false;
            }
            if (!deleted) {
                LOG.warn("Unable to delete unexpected local file/dir "
                        + curLocalFiles[j].getName() + ": insufficient permissions?");
            }
        }
    }
}
{code}
If we follow the code from [here|https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java#L230], the call chain is:
{code}
runTask(launchEv, localMapFiles);
-> runSubtask(remoteTask, ytask.getType(), attemptID, numMapTasks, (numReduceTasks > 0), localMapFiles);
-> relocalize();
{code}
I suspect that sometimes the hash set named {{localizedFiles}} does not contain {{propagation-conf.xml}}. The reason would be:
3.
{{localizedFiles}} gets populated in the [constructor of LocalContainerLauncher|https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java#L89-L115]:
{code}
// Save list of files/dirs that are supposed to be present so can delete
// any extras created by one task before starting subsequent task. Note
// that there's no protection against deleted or renamed localization;
// users who do that get what they deserve (and will have to disable
// uberization in order to run correctly).
File[] curLocalFiles = curDir.listFiles();
localizedFiles = new HashSet(curLocalFiles.length);
for (int j = 0; j < curLocalFiles.length; ++j) {
    localizedFiles.add(curLocalFiles[j]);
}
{code}
In the ApplicationMaster's [serviceInit() method|https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java#L438], we instantiate {{ContainerLauncherRouter}}, which contains {{LocalContainerLauncher}}. Its comment mentions:
{code}
/**
 * By the time life-cycle of this router starts, job-init would have already
 * happened.
 */
private final class ContainerLauncherRouter extends AbstractService
{code}
It makes me think that sometimes {{propagation-conf.xml}} gets added to the current working directory after {{localizedFiles}} gets populated. If this is true, then in the {{relocalize()}} method, {{propagation-conf.xml}} would get deleted. And when the AsyncDispatcher event handler is in the [process of committing the job|https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java#L1700-L1718], it fails because it does not find {{propagation-conf.xml}}, which was part of the conf.
> Shell action got stuck for 6 hours even after Exit status is 0 > -- > > Key: OOZIE-2536 > URL: https://issues.apache.org/jira/browse/OOZIE-2536 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley > > In our environment, we faced an issue where an uberized Shell action was getting > stuck even though the shell action completed with status 0. Please refer to the > attached syslog and stdout of the launcher job; here I quote part of the stdout: > {quote} > >>> Invoking Shell command line now >> > Stdoutput myshellType=qmyshellUpdate > Exit code of the Shell command 0 > <<< Invocation of Shell command completed <<< > <<< Invocation of Main class completed <<< > {quote} > syslog > {quote} > 2016-05-23 11:15:52,587 WARN [uber-SubtaskRunner] >
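The suspected race in the analysis above can be simulated in isolation: a snapshot of the localized files is taken first, and any file that materializes afterwards ({{propagation-conf.xml}} here) is not in the snapshot, so relocalization treats it as an extra and removes it. This is a standalone sketch, not the actual Hadoop code:

```java
import java.util.HashSet;
import java.util.Set;

public class RelocalizeRace {
    // Keep only files that were present in the snapshot; anything else is
    // "one that wasn't there before" and gets deleted by relocalize().
    static Set<String> relocalize(Set<String> snapshot, Set<String> currentDir) {
        Set<String> kept = new HashSet<>();
        for (String f : currentDir) {
            if (snapshot.contains(f)) {
                kept.add(f);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> snapshot = new HashSet<>();
        snapshot.add("job.xml");

        Set<String> current = new HashSet<>(snapshot);
        current.add("propagation-conf.xml"); // materialized after the snapshot

        // propagation-conf.xml is absent from the result
        System.out.println(relocalize(snapshot, current));
    }
}
```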
[jira] [Updated] (OOZIE-2440) Exponential re-try policy for workflow action
[ https://issues.apache.org/jira/browse/OOZIE-2440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Satish Subhashrao Saley updated OOZIE-2440: --- Attachment: OOZIE-2440-5.patch Thank you [~jaydeepvishwakarma] for the review. > Exponential re-try policy for workflow action > - > > Key: OOZIE-2440 > URL: https://issues.apache.org/jira/browse/OOZIE-2440 > Project: Oozie > Issue Type: Bug >Reporter: Satish Subhashrao Saley >Assignee: Satish Subhashrao Saley >Priority: Minor > Attachments: OOZIE-2440-2.patch, OOZIE-2440-3.patch, > OOZIE-2440-4.patch, OOZIE-2440-5.patch > > > Currently the user can specify the retry interval and the maximum number of > retries. We will add another element in the action tag through which the user > can specify the retry policy. The policy can be exponential or periodic, with > periodic as the default since it is the current behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
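The difference between the two policies can be sketched as follows. The ticket does not state the exact formula the patch uses; this sketch assumes the common doubling scheme (interval * 2^(attempt-1)) for the exponential policy, and the class and method names are hypothetical:

```java
public class RetryPolicy {
    // Next retry interval in minutes for a given attempt (1-based).
    // "periodic" repeats the same interval; "exponential" is assumed
    // here to double the base interval on each attempt.
    static long nextIntervalMinutes(String policy, long baseInterval, int attempt) {
        if ("exponential".equals(policy)) {
            return baseInterval * (1L << (attempt - 1));
        }
        return baseInterval; // periodic: same interval every time
    }

    public static void main(String[] args) {
        for (int attempt = 1; attempt <= 4; attempt++) {
            System.out.println(nextIntervalMinutes("exponential", 1, attempt));
        }
        // prints 1, 2, 4, 8
    }
}
```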