[jira] [Commented] (OOZIE-2243) Kill Command does not kill the child job for java action
[ https://issues.apache.org/jira/browse/OOZIE-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138124#comment-15138124 ]

Mohammad Kamrul Islam commented on OOZIE-2243:
----------------------------------------------

Any update on this? Also, does it cover all action types (such as pig, hive, spark, etc.)?

> Kill Command does not kill the child job for java action
> --------------------------------------------------------
>
>                 Key: OOZIE-2243
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2243
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Narayan Periwal
>            Assignee: Narayan Periwal
>            Priority: Minor
>         Attachments: OOZIE-2243-v0.patch, OOZIE-2243-v1.patch,
>                      OOZIE-2243-v2.patch, OOZIE-2243-v3.patch, OOZIE-2243-v4.patch,
>                      OOZIE-2243-v5.patch
>
> Let's say there is a launcher job that launches another map-reduce job through
> a java action. When we kill the launcher job, the child job it launched is not
> killed; only the launcher job is.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (OOZIE-2339) Use JAXB to provide a Java interface for writing Jobs based on the XSD schemas
[ https://issues.apache.org/jira/browse/OOZIE-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730368#comment-14730368 ] Mohammad Kamrul Islam commented on OOZIE-2339: -- Strong +1. Actually Falcon is utilizing this. > Use JAXB to provide a Java interface for writing Jobs based on the XSD schemas > -- > > Key: OOZIE-2339 > URL: https://issues.apache.org/jira/browse/OOZIE-2339 > Project: Oozie > Issue Type: New Feature > Components: client >Affects Versions: trunk >Reporter: Robert Kanter > > Users often complain about the XML they have to write for Oozie jobs. It > would be nice if they could write them in something like Java, but we don't > want to have to maintain a separate Java API for this. I was looking around > and saw that JAXB might be the right thing here. From what I can tell, it > lets you create Java classes from XSD schemas. So, we should be able to > auto-generate a Java API for writing Oozie jobs, without having to really > maintain it. > We should investigate if this is feasible and, if so, implement it. > Some useful looking links: > https://en.wikipedia.org/wiki/Java_Architecture_for_XML_Binding > https://jaxb.java.net/2.2.11/docs/ch03.html > https://java.net/projects/maven-jaxb2-plugin/pages/Home -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2259) Create a callback action
[ https://issues.apache.org/jira/browse/OOZIE-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660216#comment-14660216 ]

Mohammad Kamrul Islam commented on OOZIE-2259:
----------------------------------------------

We do not need launcher. The implementation would be same as FsActionExecutor/EmailActionExecutor.

In general, we prefer any new action to run through the launcher. The proposed action connects to an external system, and the behavior of that system may impact the performance of the Oozie server itself. The same logic applies to EmailAction; I think we should have done that in the launcher as well. For example, if the external system takes a long time to reply, one thread in the Oozie server will wait for that long. Since Oozie is multi-tenant, we should isolate these types of behavior from the Oozie core service. Is there any problem if we follow the launcher approach?

Create a callback action
------------------------

                Key: OOZIE-2259
                URL: https://issues.apache.org/jira/browse/OOZIE-2259
            Project: Oozie
         Issue Type: New Feature
         Components: action
           Reporter: Jaydeep Vishwakarma
           Assignee: Jaydeep Vishwakarma
        Attachments: OOZIE-2259-v1.patch, OOZIE-2259-v3.patch

We need an action that lets Oozie send a notification to an external server. It should support multiple types of callback; currently I know of JMS and HTTP calls. It should be able to call different types of methods with any number of arguments. A sample workflow with a callback action:

{code:xml}
<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.3">
    ...
    <action name="[NODE-NAME]">
        <callback>
            <host>[HOST]</host>
            <method>[METHOD]</method>
            <arg>
                <key>[KEY]</key>
                <value>[VALUE]</value>
            </arg>
            ...
        </callback>
        ...
    </action>
    ...
</workflow-app>
{code}

HOST: from the host, the system can figure out whether it is an HTTP or a JMS callback action. The system sends the notification to that host.
METHOD: it can be POST/GET/QUEUE/TOPIC.
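The concern raised in the comment (a slow external system tying up a server thread) can be illustrated with a toy model. This is a sketch in Python, not Oozie code: the pool stands in for a shared server thread pool, and `slow_callback`, `quick_job`, and `run_on_shared_pool` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_callback() -> str:
    """Stand-in for a call to an external system that is slow to reply."""
    time.sleep(0.2)
    return "callback"


def quick_job() -> str:
    """Stand-in for unrelated work that belongs to another tenant."""
    return "quick"


def run_on_shared_pool(workers: int) -> list:
    """Submit both tasks to one pool and return their names in completion order."""
    finished = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for task in (slow_callback, quick_job):
            pool.submit(task).add_done_callback(lambda f: finished.append(f.result()))
    return finished


# With a single shared worker, the quick job is stuck behind the slow callback.
print(run_on_shared_pool(workers=1))
# With a spare worker, the quick job finishes first.
print(run_on_shared_pool(workers=2))
```

Running the action in a launcher job moves the waiting out of the shared pool entirely, which is the isolation the comment argues for.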
[jira] [Commented] (OOZIE-2187) Add a way to specify a default JT/RM and NN
[ https://issues.apache.org/jira/browse/OOZIE-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14384321#comment-14384321 ] Mohammad Kamrul Islam commented on OOZIE-2187: -- +1 for this long awaited feature. We should also default the hive-site.xml for hive action. Add a way to specify a default JT/RM and NN --- Key: OOZIE-2187 URL: https://issues.apache.org/jira/browse/OOZIE-2187 Project: Oozie Issue Type: New Feature Components: core Reporter: Robert Kanter Assignee: Robert Kanter Oozie is cluster agnostic, which is why we require an RM/JT and NN per action in your workflow (or once via the global section). In practice, many users use one Oozie server per cluster, so it's an extra burden for them to have to specify this all the time. It would be convenient if we added configuration properties to oozie-site that would let you specify a default RM/JT and NN to use. This way, these users could completely omit the {{job-tracker}} and {{name-node}} fields from their workflows; as an added benefit, they can easily update these values if they ever rename/move their RM/JT or NN. We'd of course still allow specifying {{job-tracker}} and {{name-node}} in each action and {{global}} to allow individual workflows or actions to override the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
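The override order described in this issue (a value on the action, then the `global` section, then a server-wide default) could be resolved roughly as follows. This is a minimal sketch in Python, not Oozie code; `resolve_setting` and the `SERVER_DEFAULTS` mapping are hypothetical, and the issue does not name the actual oozie-site properties.

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    action: Mapping[str, str],
    global_section: Mapping[str, str],
    server_defaults: Mapping[str, str],
) -> Optional[str]:
    """Return the first value found for `name`, most specific source first."""
    for source in (action, global_section, server_defaults):
        value = source.get(name)
        if value:  # a missing or empty value counts as "not specified"
            return value
    return None


# Hypothetical server-wide defaults, standing in for values set in oozie-site.
SERVER_DEFAULTS = {
    "job-tracker": "rm.example.com:8032",
    "name-node": "hdfs://nn.example.com:8020",
}

# An action that omits the field falls back to the server default.
print(resolve_setting("name-node", {}, {}, SERVER_DEFAULTS))
# A value set on the action still overrides the default.
print(resolve_setting("name-node", {"name-node": "hdfs://other:8020"}, {}, SERVER_DEFAULTS))
```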
[jira] [Created] (OOZIE-2106) Make tomcat download url configurable in the pom file
Mohammad Kamrul Islam created OOZIE-2106:
--------------------------------------------

             Summary: Make tomcat download url configurable in the pom file
                 Key: OOZIE-2106
                 URL: https://issues.apache.org/jira/browse/OOZIE-2106
             Project: Oozie
          Issue Type: Bug
            Reporter: Mohammad Kamrul Islam
            Assignee: Mohammad Kamrul Islam

Currently the pom file hard-codes a specific Tomcat download URL. Anyone who wants to use a different version has to edit the pom file. If the URL were a property instead, users could override it from the command line with something like mvn -Dtomcat.download.url=myurl.
[jira] [Updated] (OOZIE-2106) Make tomcat download url configurable in the pom file
[ https://issues.apache.org/jira/browse/OOZIE-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated OOZIE-2106: - Attachment: OOZIE-6258.1.patch Make tomcat download url configurable in the pom file - Key: OOZIE-2106 URL: https://issues.apache.org/jira/browse/OOZIE-2106 Project: Oozie Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: OOZIE-6258.1.patch Currently it is hard-coded to a specific tomcat download URL. If anyone wants to override this with other version, he needs to change the pom file. Instead, if it is a variable, user can override this from the command line using something like mvn -Dtomcat.download.url=myurl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2105) Make version of submodules configurable with parent version
[ https://issues.apache.org/jira/browse/OOZIE-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265610#comment-14265610 ]

Mohammad Kamrul Islam commented on OOZIE-2105:
----------------------------------------------

Thanks [~rkanter] for the quick comments.
So if the version is 4.2.0-SNAPSHOT, mvn will not publish some modules, right?
Overall, do you think we need this patch? If not, how do we handle something like mvn versions:set -DnewVersion=4.1.0.48?

Make version of submodules configurable with parent version
------------------------------------------------------------

                Key: OOZIE-2105
                URL: https://issues.apache.org/jira/browse/OOZIE-2105
            Project: Oozie
         Issue Type: Bug
           Reporter: Mohammad Kamrul Islam
           Assignee: Mohammad Kamrul Islam
        Attachments: OOZIE-6252.1.patch

Currently the versions of the Oozie sub-modules are hard-coded to include the parent version. If someone changes the parent version, every sub-module version has to be updated explicitly.
For example, in hadooplibs/hadoop-1/pom.xml we use

  <groupId>org.apache.oozie</groupId>
  <artifactId>oozie-hadoop</artifactId>
  <version>1.1.1.oozie-4.1.0</version>
  ...

If you want to change the Oozie version to 4.1.1 (say), you need to go through all the pom files and manually replace 4.1.0 with 4.1.1.
This JIRA is to use parent.version instead of hard-coding. For example, use this:

  <version>1.1.1.oozie-${parent.version}</version>

This way only the parent version in the root pom.xml has to change.
[jira] [Updated] (OOZIE-2106) Make tomcat download url configurable in the pom file
[ https://issues.apache.org/jira/browse/OOZIE-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated OOZIE-2106: - Attachment: OOZIE-2106.1.patch Make tomcat download url configurable in the pom file - Key: OOZIE-2106 URL: https://issues.apache.org/jira/browse/OOZIE-2106 Project: Oozie Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: OOZIE-2106.1.patch, OOZIE-6258.1.patch Currently it is hard-coded to a specific tomcat download URL. If anyone wants to override this with other version, he needs to change the pom file. Instead, if it is a variable, user can override this from the command line using something like mvn -Dtomcat.download.url=myurl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OOZIE-2105) Make version of submodules configurable with parent version
[ https://issues.apache.org/jira/browse/OOZIE-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265623#comment-14265623 ] Mohammad Kamrul Islam edited comment on OOZIE-2105 at 1/6/15 3:19 AM: -- Sorry I forgot to upload the latest code. Looks like Swetha already removed those files:) So I can close this JIRA. was (Author: kamrul): Sorry I forgot to upload the latest code. Looks like Swetha already removed those files:) So I close this JIRA. Make version of submodules configurable with parent version Key: OOZIE-2105 URL: https://issues.apache.org/jira/browse/OOZIE-2105 Project: Oozie Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: OOZIE-6252.1.patch Currently the versions of the Oozie sub-modules are hard-coded with parent version. If someone changes the parent version, all sub-modules versions will need to be explicitly updated. For example, in hadooplibs/hadoop-1/pom.xml we use groupIdorg.apache.oozie/groupId artifactIdoozie-hadoop/artifactId version1.1.1.oozie-4.1.0/version ... If you want to modify the Oozie version to 4.1.1 (say), you need to go to all pom files and manually change replace 4.1.0 to 4.1.1. This JIRA is to use parent.version instead of hard-coding. For example, use this : version1.1.1.oozie-${parent.version}/version This will allow to change only the parent version in root pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2105) Make version of submodules configurable with parent version
[ https://issues.apache.org/jira/browse/OOZIE-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14265623#comment-14265623 ] Mohammad Kamrul Islam commented on OOZIE-2105: -- Sorry I forgot to upload the latest code. Looks like Swetha already removed those files:) So I close this JIRA. Make version of submodules configurable with parent version Key: OOZIE-2105 URL: https://issues.apache.org/jira/browse/OOZIE-2105 Project: Oozie Issue Type: Bug Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Attachments: OOZIE-6252.1.patch Currently the versions of the Oozie sub-modules are hard-coded with parent version. If someone changes the parent version, all sub-modules versions will need to be explicitly updated. For example, in hadooplibs/hadoop-1/pom.xml we use groupIdorg.apache.oozie/groupId artifactIdoozie-hadoop/artifactId version1.1.1.oozie-4.1.0/version ... If you want to modify the Oozie version to 4.1.1 (say), you need to go to all pom files and manually change replace 4.1.0 to 4.1.1. This JIRA is to use parent.version instead of hard-coding. For example, use this : version1.1.1.oozie-${parent.version}/version This will allow to change only the parent version in root pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2098) Add Apache parent POM to oozie
[ https://issues.apache.org/jira/browse/OOZIE-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14253870#comment-14253870 ] Mohammad Kamrul Islam commented on OOZIE-2098: -- Thanks [~sureshms] for the patch. For my education purpose, what do we need to do for publishing to maven central or nexus after this patch. Is there any wiki? +1 for the patch and for branch 4.1. Add Apache parent POM to oozie -- Key: OOZIE-2098 URL: https://issues.apache.org/jira/browse/OOZIE-2098 Project: Oozie Issue Type: Bug Affects Versions: 4.1.0 Reporter: Suresh Srinivas Attachments: OOZIE-2098.patch This jira proposes adding Apache parent POM to oozie to help publishing the release artifacts to nexus repo and to maven central. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2079) Notify when a coordinator action status becomes RUNNING
[ https://issues.apache.org/jira/browse/OOZIE-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated OOZIE-2079: - Attachment: OOZIE-2079.1.patch initial patch uploaded. Notify when a coordinator action status becomes RUNNING --- Key: OOZIE-2079 URL: https://issues.apache.org/jira/browse/OOZIE-2079 Project: Oozie Issue Type: Bug Components: core Affects Versions: trunk, 4.0.1 Reporter: Venkat Ramachandran Assignee: Mohammad Kamrul Islam Attachments: OOZIE-2079.1.patch When a coordinator-action is materialized with WAITING status, Oozie calls the oozie.coord.action.notification.url as expected, but it does not when the coordinator-action status changes to RUNNING. It appears that CoordActionStartXCommand.execute() (CoordActionStartXCommand.java) method does not invoke a CoordActionNotificationXCommand(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-2086) Duplicate coord action notification at suspend, resume
[ https://issues.apache.org/jira/browse/OOZIE-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14232424#comment-14232424 ] Mohammad Kamrul Islam commented on OOZIE-2086: -- is it JMS message? or web service based notification? Duplicate coord action notification at suspend, resume Key: OOZIE-2086 URL: https://issues.apache.org/jira/browse/OOZIE-2086 Project: Oozie Issue Type: Bug Reporter: Purshotam Shah Reported by [~mchiang_4w...@yahoo.com]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2079) Notify when a coordinator action status becomes RUNNING
[ https://issues.apache.org/jira/browse/OOZIE-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated OOZIE-2079: - Assignee: Mohammad Kamrul Islam Notify when a coordinator action status becomes RUNNING --- Key: OOZIE-2079 URL: https://issues.apache.org/jira/browse/OOZIE-2079 Project: Oozie Issue Type: Bug Components: core Affects Versions: trunk, 4.0.1 Reporter: Venkat Ramachandran Assignee: Mohammad Kamrul Islam When a coordinator-action is materialized with WAITING status, Oozie calls the oozie.coord.action.notification.url as expected, but it does not when the coordinator-action status changes to RUNNING. It appears that CoordActionStartXCommand.execute() (CoordActionStartXCommand.java) method does not invoke a CoordActionNotificationXCommand(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OOZIE-2080) Include WorkflowJob is in the coordinator Action START notification
[ https://issues.apache.org/jira/browse/OOZIE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated OOZIE-2080:
-----------------------------------------

    Assignee: Mohammad Kamrul Islam

Include WorkflowJob is in the coordinator Action START notification
--------------------------------------------------------------------

                Key: OOZIE-2080
                URL: https://issues.apache.org/jira/browse/OOZIE-2080
            Project: Oozie
         Issue Type: Bug
         Components: core
   Affects Versions: trunk, 4.1.0, 4.0.1
           Reporter: Venkat Ramachandran
           Assignee: Mohammad Kamrul Islam

Oozie calls 'oozie.coord.action.notification.url' when a coordinator action is ready to run (see JIRA OOZIE-2079).
The callback URL should include the external_id (the workflow job) in addition to the coordinator action id and status.
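To make the request concrete, here is a sketch of how a notification URL template could be expanded. It assumes the URL carries `$actionId` and `$status` placeholders that are replaced before the call; `$externalId` is a hypothetical placeholder for the workflow job id this issue asks for, and `expand_notification_url` is an illustrative helper, not an Oozie function. The ids in the example are made up.

```python
from string import Template
from urllib.parse import quote


def expand_notification_url(url_template: str, action_id: str, status: str, external_id: str = "") -> str:
    """Fill the placeholders in a notification URL, URL-encoding each value."""
    values = {
        "actionId": quote(action_id, safe=""),
        "status": quote(status, safe=""),
        "externalId": quote(external_id, safe=""),  # hypothetical token
    }
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(url_template).safe_substitute(values)


url = expand_notification_url(
    "http://example.com/notify?id=$actionId&status=$status&wf=$externalId",
    action_id="0000001-141201-oozie-C@1",
    status="RUNNING",
    external_id="0000002-141201-oozie-W",
)
print(url)
```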
[jira] [Commented] (OOZIE-1795) Please create a DOAP file for your TLP
[ https://issues.apache.org/jira/browse/OOZIE-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14135034#comment-14135034 ] Mohammad Kamrul Islam commented on OOZIE-1795: -- I will do it shortly. Please create a DOAP file for your TLP -- Key: OOZIE-1795 URL: https://issues.apache.org/jira/browse/OOZIE-1795 Project: Oozie Issue Type: Task Reporter: Sebb Assignee: Rohini Palaniswamy Attachments: OOZIE-1795-1.patch As per my recent e-mail to your dev list, please can you set up a DOAP for your project and get it added to files.xml? Please see http://projects.apache.org/create.html Once you have created the DOAP and committed it to your source code repository, please submit it for inclusion in the Apache projects listing as per: http://projects.apache.org/create.html#submit Remember, if you ever move or rename the doap file in future, please ensure that files.xml is updated to point to the new location. Thanks! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OOZIE-1855) TestPriorityDelayQueue#testPoll failed intermittently in Jenkins
[ https://issues.apache.org/jira/browse/OOZIE-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006744#comment-14006744 ]

Mohammad Kamrul Islam commented on OOZIE-1855:
----------------------------------------------

[~omaliuvanchuk]: Do you think setting it to 15 would be sufficient? If yes, we can save another 5 seconds of runtime.

TestPriorityDelayQueue#testPoll failed intermittently in Jenkins
-----------------------------------------------------------------

                Key: OOZIE-1855
                URL: https://issues.apache.org/jira/browse/OOZIE-1855
            Project: Oozie
         Issue Type: Bug
        Environment: Windows
           Reporter: Ostap
           Assignee: Ostap
        Attachments: BUG-18115.patch

Test failed with the following error:
{noformat}
java.lang.NullPointerException
	at org.apache.oozie.util.TestPriorityDelayQueue.testPoll(TestPriorityDelayQueue.java:167)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
	at org.apache.maven.surefire.junitcore.ClassDemarcatingRunner.run(ClassDemarcatingRunner.java:58)
	at org.junit.runners.Suite.runChild(Suite.java:128)
	at org.junit.runners.Suite.runChild(Suite.java:24)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
	at java.lang.Thread.run(Thread.java:722)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (OOZIE-1756) hadoop-auth version is wrong if profile isn't selected
[ https://issues.apache.org/jira/browse/OOZIE-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945967#comment-13945967 ]

Mohammad Kamrul Islam commented on OOZIE-1756:
----------------------------------------------

+1

hadoop-auth version is wrong if profile isn't selected
------------------------------------------------------

                Key: OOZIE-1756
                URL: https://issues.apache.org/jira/browse/OOZIE-1756
            Project: Oozie
         Issue Type: Bug
         Components: build
   Affects Versions: trunk, 4.0.0
           Reporter: Robert Kanter
           Assignee: Robert Kanter
            Fix For: trunk, 4.0.1
        Attachments: OOZIE-1756.patch, OOZIE-1756.patch

The hadooplibs for the non-specified version of Hadoop get the wrong version of the hadoop-auth jar. For example, if you build with no profile (i.e. Hadoop 1) and look at the hadoop-2 sharelib, it will have the wrong hadoop-auth jar.
[jira] [Commented] (OOZIE-1462) Compress lob columns before storing in database
[ https://issues.apache.org/jira/browse/OOZIE-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715935#comment-13715935 ]

Mohammad Kamrul Islam commented on OOZIE-1462:
----------------------------------------------

Adding to Robert's comment:
* Overall it is a good step forward.
* How much performance degradation do compression and decompression (the most frequent operation) cause?
* Will compression/decompression always be active, or could it be made configurable in oozie-site.xml?
* Which other fields will be compressed? Or all *lob fields?
* Please consider the MySQL case as well.

Compress lob columns before storing in database
-----------------------------------------------

                Key: OOZIE-1462
                URL: https://issues.apache.org/jira/browse/OOZIE-1462
            Project: Oozie
         Issue Type: Improvement
           Reporter: Virag Kothari
           Assignee: Virag Kothari

Storing huge data in lobs is very inefficient. Making Oozie compress the data before storing it will reduce the size of the data stored in lobs and help reduce query time. Also, most databases, such as Oracle and MySQL, support storing lob data in the table row (inline) if the data is small enough. Inline storage performs much better than outline storage (storage outside of the table row).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
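The idea in this issue, together with the question about making it configurable, can be sketched as follows. This is an illustration in Python using zlib, not the scheme proposed in the JIRA: the one-byte marker format and the names `encode_lob` and `decode_lob` are assumptions made for the example.

```python
import zlib


def encode_lob(text: str, compress: bool = True) -> bytes:
    """Encode text for storage; a leading marker byte records whether it is compressed."""
    raw = text.encode("utf-8")
    if compress:
        return b"\x01" + zlib.compress(raw)
    return b"\x00" + raw


def decode_lob(blob: bytes) -> str:
    """Reverse encode_lob, whichever mode was used when the value was written."""
    marker, payload = blob[:1], blob[1:]
    raw = zlib.decompress(payload) if marker == b"\x01" else payload
    return raw.decode("utf-8")


# Repetitive XML, similar in spirit to a job configuration, compresses well.
conf_xml = "<property><name>key</name><value>value</value></property>" * 200
stored = encode_lob(conf_xml)
print(len(conf_xml), len(stored))
```

Because the marker travels with each value, rows written with compression disabled stay readable after the setting is switched on, which is one way to handle existing data.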
[jira] [Commented] (OOZIE-1455) Can't stop oozie to create actions in WAITING state
[ https://issues.apache.org/jira/browse/OOZIE-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13710139#comment-13710139 ]

Mohammad Kamrul Islam commented on OOZIE-1455:
----------------------------------------------

Overriding the throttle value to 1 should solve this, right?

Can't stop oozie to create actions in WAITING state
---------------------------------------------------

                Key: OOZIE-1455
                URL: https://issues.apache.org/jira/browse/OOZIE-1455
            Project: Oozie
         Issue Type: Bug
         Components: coordinator
   Affects Versions: 3.3.2
        Environment: CDH 4.3
           Reporter: Sergey
           Priority: Critical

I've created a coordinator with these controls settings:
{code}
<controls>
    <timeout>15</timeout>
    <concurrency>1</concurrency>
    <throttle>-1</throttle>
</controls>
{code}
My data is partitioned by hour and must be processed sequentially. When the first partition is processed, a derivative dataset is generated that is used in processing the next hour.
Oozie starts actions (transition to RUNNING state) one by one, which is good.
The bad thing is that Oozie creates future actions in READY/WAITING state: READY when the previous action has prepared the derivative dataset, WAITING when it has not.
Sometimes WAITING changes to TIMEOUT and the whole process breaks.
Is there any possibility to prevent Oozie from creating future actions while an action is running?
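The suggestion in the comment relies on throttle capping how many actions may sit in WAITING state at once. Under that assumption, the number of actions that can be created in one pass can be sketched like this; `actions_to_materialize` is a hypothetical helper, and the sketch deliberately does not model how Oozie interprets a throttle of -1.

```python
def actions_to_materialize(due: int, waiting: int, throttle: int) -> int:
    """How many new actions may be created without exceeding `throttle` WAITING actions."""
    if throttle < 1:
        # Non-positive values such as -1 are outside what this sketch models.
        raise ValueError("this sketch only models a positive throttle")
    return max(0, min(due, throttle - waiting))


# With throttle=1, only one future action is created, however many are due.
print(actions_to_materialize(due=5, waiting=0, throttle=1))
# While that action is still WAITING, nothing further is created.
print(actions_to_materialize(due=5, waiting=1, throttle=1))
```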
[jira] [Created] (OOZIE-1378) EL function coord:tzOffset doesn't consider DST correctly
Mohammad Kamrul Islam created OOZIE-1378:
--------------------------------------------

             Summary: EL function coord:tzOffset doesn't consider DST correctly
                 Key: OOZIE-1378
                 URL: https://issues.apache.org/jira/browse/OOZIE-1378
             Project: Oozie
          Issue Type: Bug
            Reporter: Mohammad Kamrul Islam

Email communications from Paul Chavez.

From: Paul Chavez pcha...@verticalsearchworks.com
To: u...@oozie.apache.org u...@oozie.apache.org
Sent: Tuesday, May 14, 2013 10:23 AM
Subject: tzOffset() not working as expected

It seems the tzOffset() function is not accounting for DST. I am in Pacific Daylight Saving Time right now, which is a -7 offset from UTC. However, a coordinator configured to use tzOffset to calculate localtime paths and date strings is returning offsets of -8, which is the Standard Time offset. Below is a test coordinator I put together for debugging purposes along with dryrun output showing that it's offsetting by -8 hours.

Can someone please verify if the tzOffset() function is supposed to be accounting for time changes?

Thank you,
Paul Chavez

Relevant part of coordinator XML, I'm trying to get both the Current hour (local) when the coordinator triggers, as well as the Previous hour. The coordinator was submitted with a start date of 5pm UTC which is currently 10am Pacific Daylight Time.

<coordinator-app xmlns="uri:oozie:coordinator:0.1" name="Test tzOffset" frequency="60" start="2013-05-14T17:00Z" end="2013-11-01T08:30Z" timezone="UTC" freq_timeunit="MINUTE" end_of_duration="NONE">
  <input-events>
    <data-in name="CurrentHourLogs" dataset="LogPath">
      <dataset name="LogPath" frequency="60" initial-instance="2013-04-27T07:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>hdfs://nameservice1/logs/datekey=${YEAR}${MONTH}${DAY}/hour=${HOUR}</uri-template>
        <done-flag />
      </dataset>
      <instance>${coord:current(coord:tzOffset()/60)}</instance>
    </data-in>
  </input-events>
  <output-events>
    <data-out name="CurrentHour" dataset="IntHour">
      <dataset name="IntHour" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${HOUR}</uri-template>
      </dataset>
      <instance>${coord:current(coord:tzOffset()/60)}</instance>
    </data-out>
    <data-out name="CurrentDay" dataset="IntDay">
      <dataset name="IntDay" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${YEAR}${MONTH}${DAY}</uri-template>
      </dataset>
      <instance>${coord:current(coord:tzOffset()/60)}</instance>
    </data-out>
    <data-out name="PreviousHour" dataset="IntHour">
      <dataset name="IntHour" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${HOUR}</uri-template>
      </dataset>
      <instance>${coord:current((coord:tzOffset()/60)-1)}</instance>
    </data-out>
    <data-out name="PreviousDay" dataset="IntDay">
      <dataset name="IntDay" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${YEAR}${MONTH}${DAY}</uri-template>
      </dataset>
      <instance>${coord:current((coord:tzOffset()/60)-1)}</instance>
    </data-out>
  </output-events>

Dryrun output, expecting CurrentHour to be 10 and PreviousHour to be 09:

coordAction instance: 1:
<coordinator-app xmlns="uri:oozie:coordinator:0.1" name="Test tzOffset" frequency="60" timezone="UTC" freq_timeunit="MINUTE" end_of_duration="NONE" instance-number="1" action-nominal-time="2013-05-14T17:00Z" action-actual-time="2013-05-14T17:14Z">
  <input-events>
    <data-in name="CurrentHourLogs" dataset="LogPath">
      <uris>hdfs://nameservice1/logs/datekey=20130514/hour=09</uris>
      <dataset name="LogPath" frequency="60" initial-instance="2013-04-27T07:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>hdfs://nameservice1/logs/datekey=${YEAR}${MONTH}${DAY}/hour=${HOUR}</uri-template>
        <done-flag />
      </dataset>
    </data-in>
  </input-events>
  <output-events>
    <data-out name="CurrentHour" dataset="IntHour">
      <uris>09</uris>
      <dataset name="IntHour" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${HOUR}</uri-template>
      </dataset>
    </data-out>
    <data-out name="CurrentDay" dataset="IntDay">
      <uris>20130514</uris>
      <dataset name="IntDay" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE">
        <uri-template>${YEAR}${MONTH}${DAY}</uri-template>
      </dataset>
    </data-out>
    <data-out name="PreviousHour" dataset="IntHour">
      <uris>08</uris>
      <dataset name="IntHour" frequency="60" initial-instance="2013-05-09T00:00Z" timezone="America/Los_Angeles" freq_timeunit="MINUTE" end_of_duration="NONE"
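The one-hour discrepancy in the report can be reproduced outside Oozie. The sketch below uses Python's zoneinfo (Python 3.9 or later, with time zone data available on the system) to compare the DST-aware offset with the standard-time offset for the nominal time in the dryrun output. It only illustrates the arithmetic and makes no claim about how coord:tzOffset is implemented; `offsets_in_minutes` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def offsets_in_minutes(nominal_utc: datetime, tz_name: str) -> tuple:
    """Return (dst_aware_offset, standard_offset) from UTC, in minutes, at the given instant."""
    local = nominal_utc.astimezone(ZoneInfo(tz_name))
    dst_aware = local.utcoffset()
    standard = dst_aware - local.dst()  # remove the daylight-saving adjustment
    return int(dst_aware.total_seconds() // 60), int(standard.total_seconds() // 60)


nominal = datetime(2013, 5, 14, 17, 0, tzinfo=timezone.utc)
dst_aware, standard = offsets_in_minutes(nominal, "America/Los_Angeles")
print(dst_aware, standard)

# Local hour obtained with each offset.
print((nominal + timedelta(minutes=dst_aware)).hour)
print((nominal + timedelta(minutes=standard)).hour)
```

The DST-aware offset is -420 minutes and gives hour 10, the value the reporter expected; the standard offset is -480 minutes and gives hour 9, which matches the hour=09 seen in the dryrun output.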
[jira] [Updated] (OOZIE-1378) EL function coord:tzOffset doesn't consider DST correctly
[ https://issues.apache.org/jira/browse/OOZIE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam updated OOZIE-1378: - Attachment: OOZIE-1378_v1.patch EL function coord:tzOffset doesn't consider DST correctly -- Key: OOZIE-1378 URL: https://issues.apache.org/jira/browse/OOZIE-1378 Project: Oozie Issue Type: Bug Reporter: Mohammad Kamrul Islam Attachments: OOZIE-1378_v1.patch Email communications from Paul Chavez. From: Paul Chavez pcha...@verticalsearchworks.com To: u...@oozie.apache.org u...@oozie.apache.org Sent: Tuesday, May 14, 2013 10:23 AM Subject: tzOffset() not working as expected It seems the tzOffset() function is not accounting for DST. I am in Pacific Daylight Saving Time right now, which is a -7 offset from UTC. However, a coordinator configured to use tzOffset to calculate localtime paths and date strings is returning offsets of -8, which is the Standard Time offset. Below is a test coordinator I put together for debugging purposes along with dryrun output showing that it's offsetting by -8 hours. Can someone please verify if the tzOffset() function is supposed to be accounting for time changes? Thank you, Paul Chavez Relevant part of coordinator XML, I'm trying to get both the Current hour (local) when the coordinator triggers, as well as the Previous hour. The coordinator was submitted with a start date of 5pm UTC which is currently 10am Pacific Daylight Time. 
coordinator-app xmlns=uri:oozie:coordinator:0.1 name=Test tzOffset frequency=60 start=2013-05-14T17:00Z end=2013-11-01T08:30Z timezone=UTC freq_timeunit=MINUTE end_of_duration=NONE input-events data-in name=CurrentHourLogs dataset=LogPath dataset name=LogPath frequency=60 initial-instance=2013-04-27T07:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-templatehdfs://nameservice1/logs/datekey=${YEAR}${MONTH}${DAY}/hour=${HOUR}/uri-template done-flag / /dataset instance${coord:current(coord:tzOffset()/60)}/instance /data-in /input-events output-events data-out name=CurrentHour dataset=IntHour dataset name=IntHour frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-template${HOUR}/uri-template /dataset instance${coord:current(coord:tzOffset()/60)}/instance /data-out data-out name=CurrentDay dataset=IntDay dataset name=IntDay frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-template${YEAR}${MONTH}${DAY}/uri-template /dataset instance${coord:current(coord:tzOffset()/60)}/instance /data-out data-out name=PreviousHour dataset=IntHour dataset name=IntHour frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-template${HOUR}/uri-template /dataset instance${coord:current((coord:tzOffset()/60)-1)}/instance /data-out data-out name=PreviousDay dataset=IntDay dataset name=IntDay frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-template${YEAR}${MONTH}${DAY}/uri-template /dataset instance${coord:current((coord:tzOffset()/60)-1)}/instance /data-out /output-events Dryrun output, expecting CurrentHour to be 10 and PreviousHour to be 09: coordAction instance: 1: coordinator-app xmlns=uri:oozie:coordinator:0.1 name=Test tzOffset frequency=60 timezone=UTC freq_timeunit=MINUTE 
end_of_duration=NONE instance-number=1 action-nominal-time=2013-05-14T17:00Z action-actual-time=2013-05-14T17:14Z input-events data-in name=CurrentHourLogs dataset=LogPath urishdfs://nameservice1/logs/datekey=20130514/hour=09/uris dataset name=LogPath frequency=60 initial-instance=2013-04-27T07:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-templatehdfs://nameservice1/logs/datekey=${YEAR}${MONTH}${DAY}/hour=${HOUR}/uri-template done-flag / /dataset /data-in /input-events output-events data-out name=CurrentHour dataset=IntHour uris09/uris dataset name=IntHour frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles freq_timeunit=MINUTE end_of_duration=NONE uri-template${HOUR}/uri-template /dataset /data-out data-out name=CurrentDay dataset=IntDay uris20130514/uris dataset name=IntDay frequency=60 initial-instance=2013-05-09T00:00Z timezone=America/Los_Angeles
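The expectation stated in the report above (CurrentHour 10 and PreviousHour 09 for the nominal time 2013-05-14T17:00Z) can be checked with plain `java.time` arithmetic. This is a minimal sketch and not Oozie's EL implementation: the class and method names are hypothetical, and it assumes that `coord:tzOffset()/60` is meant to yield the dataset timezone's hour offset from the UTC coordinator timezone, and that `coord:current(n)` steps `n` hourly instances from the nominal time.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper; not part of Oozie.
final class TzOffsetSketch {
    /** Offset of the zone from UTC in minutes at the given instant, daylight time included. */
    static int offsetMinutes(String zone, Instant at) {
        return ZoneId.of(zone).getRules().getOffset(at).getTotalSeconds() / 60;
    }

    /** Offset of the zone from UTC in minutes at the given instant, ignoring daylight time. */
    static int standardOffsetMinutes(String zone, Instant at) {
        return ZoneId.of(zone).getRules().getStandardOffset(at).getTotalSeconds() / 60;
    }

    /** UTC hour of the instance reached by stepping n hourly instances from the nominal time. */
    static int hourOfInstance(Instant nominal, int n) {
        return nominal.plusSeconds(n * 3600L).atOffset(ZoneOffset.UTC).getHour();
    }

    public static void main(String[] args) {
        Instant nominal = Instant.parse("2013-05-14T17:00:00Z");
        int withDst = offsetMinutes("America/Los_Angeles", nominal) / 60;
        int standard = standardOffsetMinutes("America/Los_Angeles", nominal) / 60;
        System.out.println("current(tzOffset/60), daylight time:   " + hourOfInstance(nominal, withDst));
        System.out.println("current(tzOffset/60 - 1), daylight time: " + hourOfInstance(nominal, withDst - 1));
        System.out.println("current(tzOffset/60), standard time:   " + hourOfInstance(nominal, standard));
    }
}
```

With the daylight-time offset the hours come out as 10 and 9, matching the reporter's expectation. The 09 shown for CurrentHour in the dryrun output would be consistent with a standard-time offset being applied, although the report itself does not confirm the cause.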
[jira] [Commented] (OOZIE-1231) Provide access to launcher job URL from web console when using Map Reduce action
[ https://issues.apache.org/jira/browse/OOZIE-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633507#comment-13633507 ]

Mohammad Kamrul Islam commented on OOZIE-1231:

Overall, I think this is a very confusing and long-discussed issue. No single approach will solve it.

UI: I don't like the option of adding another tab only for this. A new item after Console URL in Action Info would be better.

Backend: There is more confusion here. Let's solve the original inconsistency issue. Robert's comments are better, but that approach will also break backward compatibility. One workaround: for callers of an older REST API version (such as v1), Oozie returns what it returns now. For newer REST API versions (v2+), it returns the newer fields such as LauncherId and ChildIDs.

Provide access to launcher job URL from web console when using Map Reduce action
Key: OOZIE-1231
URL: https://issues.apache.org/jira/browse/OOZIE-1231
Project: Oozie
Issue Type: Bug
Affects Versions: trunk
Reporter: Ryota Egashira
Assignee: Ryota Egashira
Labels: oozie
Attachments: Screen Shot Launcher_Job_URL_tab.png

There are applications where a custom InputFormat is used in the MR action, and log messages from the InputFormat are written to the launcher task log. For debugging purposes, users need to check the launcher task log, but currently in the MR action Oozie automatically swaps the external ID and does not expose the launcher ID in the web console (the only way now is to grep oozie.log). This JIRA is to show the launcher job URL on the web console when using the Map Reduce action.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (OOZIE-1292) Add Hadoop 0.23 Poms in hadooplibs to enable a build with testcases against 0.23
[ https://issues.apache.org/jira/browse/OOZIE-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626065#comment-13626065 ]

Mohammad Kamrul Islam commented on OOZIE-1292:

Is it adding new pom files like hadoop-2? Wondering if the same (hadoop-2) can be used for 0.23 as well.

Add Hadoop 0.23 Poms in hadooplibs to enable a build with testcases against 0.23
Key: OOZIE-1292
URL: https://issues.apache.org/jira/browse/OOZIE-1292
Project: Oozie
Issue Type: Task
Components: tests
Affects Versions: trunk, 4.0.0
Reporter: Mona Chitnis
Assignee: Mona Chitnis
Fix For: trunk, 4.0.0
Attachments: OOZIE-1292.patch
Original Estimate: 2h
Remaining Estimate: 2h

Adding provision for hadooplibs to also have Pom files for a Hadoop 0.23.x version (latest available on maven is 0.23.6), to enable an 'mvn test -Dhadoop.version=0.23.6' build. An initial run resulted in a few test failures - will investigate them more.
[jira] [Commented] (OOZIE-1111) change HCatURI to comply with templeton URI format
[ https://issues.apache.org/jira/browse/OOZIE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510822#comment-13510822 ]

Mohammad Kamrul Islam commented on OOZIE-:

So I can see there are multiple options:
1. Keep the current shortened format.
2. Extend the current format by only adding /v#/.
3. Follow the pattern used in (HCatalog/Templeton).
4. Ask the HCatalog team to provide a standard URI that could be used in Pig script (HCat Storer/Loader), HCat MR API.

I have always preferred option #4, but this might be a long shot. In that case, we could also go ahead with option #2. Any comments? We need to close it quickly.

change HCatURI to comply with templeton URI format
Key: OOZIE-
URL: https://issues.apache.org/jira/browse/OOZIE-
Project: Oozie
Issue Type: Sub-task
Reporter: Ryota Egashira
Assignee: Ryota Egashira
[jira] [Commented] (OOZIE-1111) change HCatURI to comply with templeton URI format
[ https://issues.apache.org/jira/browse/OOZIE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510226#comment-13510226 ]

Mohammad Kamrul Islam commented on OOZIE-:

Comment copied and pasted from review:

Mohammad Islam:
Why are you changing back to adding database, table etc? I think we talked about adding partition. Although it was our original plan, we changed it after getting feedback from Alejandro. As stated database table, why not assuming the database and table names are the first and second elements of the path, then instead hcat:host:port/db/DB/table/TABLE/... it would just be #hcat:host:port/DB/TABLE/

Ryota Egashira 6 hours, 7 minutes from now (Dec. 5, 2012, 12:39 a.m.)
Yeah, we removed it in the original design, but to follow the templeton URI format, which requires these, we are adding them back.
https://issues.apache.org/jira/browse/OOZIE-561?focusedCommentId=13509053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13509053

Rohini Palaniswamy 6 hours, 12 minutes from now (Dec. 5, 2012, 12:44 a.m.)
That's the templeton format (http://people.apache.org/~thejas/templeton_doc_latest/descpartition.html). If we are trying to comply with it, we should totally comply with it (https://issues.apache.org/jira/browse/OOZIE-561?focusedCommentId=13509053&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13509053). Adding partition without database and table does not make sense. Else we can go with hcat:host:port/DB/TABLE/PARTITION, assuming that the third one will be the partition. Both have their pros and cons. One would be more standard, complying with the hcatalog folks. The other is simplified (less to type for the user) and takes less space for the string when stored in the database. I had not thought about the templeton uri taking more space in the oozie database before. We need an agreement on the final format of the uri. I am fine with either one of the two approaches, except the former one where partition was part of the query params.

Ryota Egashira 7 hours, 3 minutes from now (Dec. 5, 2012, 1:35 a.m.)
One alternative coming from the discussion is to use a URI encoder/decoder. We could have 2 formats, one exposed to the user as uri-template and one as the oozie internal format, and use an encoder/decoder class to convert one to the other. The advantage is that we can isolate the two, and we can also make the internal format tiny so that it does not eat up space in the DB.

Mohammad Islam 7 hours, 59 minutes from now (Dec. 5, 2012, 2:31 a.m.)
I'm also not 100% convinced one way or the other. But we have to consider a few things:
1. Users should not have to type extra things.
2. The design should be extendable.
3. For oozie internal use, we should store as little as possible.
The problem is that HCatalog didn't define anything. If they did, our life would be much easier. HCatalog doesn't want to support any URI-like format. As far as I know, Templeton is a separate/independent project. Since it is an interface issue (which we shouldn't modify often), I would ask for Alejandro's comment. I will ping him and upload this discussion to the JIRA as well.

change HCatURI to comply with templeton URI format
Key: OOZIE-
URL: https://issues.apache.org/jira/browse/OOZIE-
Project: Oozie
Issue Type: Sub-task
Reporter: Ryota Egashira
Assignee: Ryota Egashira
[jira] [Commented] (OOZIE-1111) change HCatURI to comply with templeton URI format
[ https://issues.apache.org/jira/browse/OOZIE-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510298#comment-13510298 ]

Mohammad Kamrul Islam commented on OOZIE-:

Very good link. I was told templeton will be a separate product! I didn't know that it was added as part of HCatalog. Is anyone using that feature? Any idea?

One other option for Oozie could be to add a version entry in the URI, following a REST-API-like format (something like hcat://server:port/v0/db/table/?ki=vi). This will give us a flexible option for future adoption. For example, if we support any newer URI, we can bump up the version with that format.

Anyway, let's see what the other comments are.

change HCatURI to comply with templeton URI format
Key: OOZIE-
URL: https://issues.apache.org/jira/browse/OOZIE-
Project: Oozie
Issue Type: Sub-task
Reporter: Ryota Egashira
Assignee: Ryota Egashira
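The versioned format floated in the comment above (hcat://server:port/v0/db/table/?ki=vi) could be parsed roughly as follows. This is only a sketch of the idea, not Oozie's HCatURI class: the class name, the field names, and the use of `&` to separate multiple key=value pairs are assumptions made for illustration.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for the proposed hcat://server:port/v#/db/table/?k=v form.
final class VersionedHCatUriSketch {
    final String server;
    final int port;
    final int version;
    final String db;
    final String table;
    final Map<String, String> partition = new LinkedHashMap<>();

    private VersionedHCatUriSketch(String server, int port, int version, String db, String table) {
        this.server = server;
        this.port = port;
        this.version = version;
        this.db = db;
        this.table = table;
    }

    static VersionedHCatUriSketch parse(String text) {
        URI uri = URI.create(text);
        if (!"hcat".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not an hcat URI: " + text);
        }
        // "/v0/db/table/" splits into ["", "v0", "db", "table"]; trailing empty parts are dropped.
        String[] parts = uri.getPath().split("/");
        if (parts.length != 4 || !parts[1].matches("v\\d+")) {
            throw new IllegalArgumentException("expected /v#/db/table/ in: " + text);
        }
        VersionedHCatUriSketch result = new VersionedHCatUriSketch(
                uri.getHost(), uri.getPort(), Integer.parseInt(parts[1].substring(1)), parts[2], parts[3]);
        String query = uri.getQuery();
        if (query != null && !query.isEmpty()) {
            // Assumption: multiple partition keys are joined with '&'.
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("bad key=value pair: " + pair);
                }
                result.partition.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return result;
    }
}
```

Because the version is an explicit path segment, a later format change could be handled by dispatching on `version` instead of guessing from the shape of the path, which is the flexibility the comment argues for.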
[jira] [Created] (OOZIE-1105) Resolve issues found in integration
Mohammad Kamrul Islam created OOZIE-1105:

Summary: Resolve issues found in integration
Key: OOZIE-1105
URL: https://issues.apache.org/jira/browse/OOZIE-1105
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam

This JIRA is to fix the issues found during integration.
[jira] [Commented] (OOZIE-1095) Add HCatalog jar as resource for building
[ https://issues.apache.org/jira/browse/OOZIE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507029#comment-13507029 ]

Mohammad Kamrul Islam commented on OOZIE-1095:

+1 as a stop gap. We will need to revert this when HCatalog is released.

Add HCatalog jar as resource for building
Key: OOZIE-1095
URL: https://issues.apache.org/jira/browse/OOZIE-1095
Project: Oozie
Issue Type: Sub-task
Reporter: Mona Chitnis
Assignee: Mona Chitnis
Fix For: trunk
Attachments: OOZIE-1095.patch

OOZIE-1050 depends on a patched HCatalog jar (version 0.4.1) for the notification-related classes, e.g. HCatEventMessage. Therefore, this JAR is made available as a resource along with build instructions, as a temporary solution until the new HCatalog jar is available via the maven repo.
[jira] [Commented] (OOZIE-1089) DistributedCache workaround for Hadoop 2.0.2-alpha
[ https://issues.apache.org/jira/browse/OOZIE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1350#comment-1350 ]

Mohammad Kamrul Islam commented on OOZIE-1089:

Is there any way for Oozie to enforce the uniqueness of a jar in the DC? Currently, if the same jar file is included in multiple places (such as wf/lib, user/lib, share/lib), there is no issue. With this new hadoop DC behavior, it will be an issue. That will break Oozie backward compatibility.

From the above perspective, can you please comment on the two proposed options that I mentioned above?

DistributedCache workaround for Hadoop 2.0.2-alpha
Key: OOZIE-1089
URL: https://issues.apache.org/jira/browse/OOZIE-1089
Project: Oozie
Issue Type: Bug
Components: workflow
Affects Versions: 3.3.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 3.3.0
Attachments: OOZIE-1089.patch, OOZIE-1089-trunk.patch

As explained in MAPREDUCE-4820, Hadoop 2.0.2-alpha introduced a duplicate check that exposes a change of behavior in how the distributed-cache works in Hadoop 2 (as opposed to Hadoop-1).
[jira] [Commented] (OOZIE-1089) DistributedCache workaround for Hadoop 2.0.2-alpha
[ https://issues.apache.org/jira/browse/OOZIE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504910#comment-13504910 ]

Mohammad Kamrul Islam commented on OOZIE-1089:

This was the root cause, and that's why we asked the hadoop team to retract the patch from 0.23. End users can't enforce the file name at the share/lib. We need to find a resolution at the Oozie level too. Which one of the two options (mentioned above) would be good for this? (By the way, we could fix it in Oozie in a later release.)

About the patch: My concern was about supporting hadoop 2.x for 3.3. The reasons are: Oozie was not tested against 2.x. I heard pig streaming will also not work because of some unrelated issue. Moreover, I am not sure how stable hadoop 2.x alpha is. Having said that, if it is a must from your side, I'm good to go. +1.

One coding clarification: the following line of code is only for writing a log in Launcher Main, right?
launcherConf.setBoolean("oozie.hadoop-2.0.2-alpha.workaround.for.distributed.cache", true);

DistributedCache workaround for Hadoop 2.0.2-alpha
Key: OOZIE-1089
URL: https://issues.apache.org/jira/browse/OOZIE-1089
Project: Oozie
Issue Type: Bug
Components: workflow
Affects Versions: 3.3.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 3.3.0
Attachments: OOZIE-1089.patch, OOZIE-1089-trunk.patch

As explained in MAPREDUCE-4820, Hadoop 2.0.2-alpha introduced a duplicate check that exposes a change of behavior in how the distributed-cache works in Hadoop 2 (as opposed to Hadoop-1).
[jira] [Commented] (OOZIE-1078) Help - Documentation and Help - Online Help should link to oozie.apache.org/
[ https://issues.apache.org/jira/browse/OOZIE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504270#comment-13504270 ]

Mohammad Kamrul Islam commented on OOZIE-1078:

+1 committing

Help - Documentation and Help - Online Help should link to oozie.apache.org/
Key: OOZIE-1078
URL: https://issues.apache.org/jira/browse/OOZIE-1078
Project: Oozie
Issue Type: Task
Components: workflow
Affects Versions: trunk
Reporter: jun aoki
Priority: Trivial
Labels: documentation
Fix For: trunk
Attachments: help_link_opens_new_tab.png, help_menu_alert.png, OOZIE-1078-2.patch, OOZIE-1078.patch
Original Estimate: 1h
Remaining Estimate: 1h

Help - Documentation and Help - Online Help currently show an alert saying to be implemented soon. They should link to a workflowgenerator's maven site, which does not exist in a public place (not even jenkins' precommit job's workspace). The ideal place is somewhere under http://oozie.apache.org/docs/ once it is generated, but it can at least link to the oozie top page.
[jira] [Commented] (OOZIE-1089) DistributedCache workaround for Hadoop 2.0.2-alpha
[ https://issues.apache.org/jira/browse/OOZIE-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504410#comment-13504410 ]

Mohammad Kamrul Islam commented on OOZIE-1089:

I was considering two alternative options:

Option 1: Before adding any jar file into the DC, we can check if the jar filename is already in the DC. If yes, we can skip the addition to the DC. This way we can avoid duplicate files in the DC.

Option 2: Oozie can store all the jars in a local data structure (say a HashSet). At the end, Oozie can add those jars (from the HashSet) to the class path.

Comments?

DistributedCache workaround for Hadoop 2.0.2-alpha
Key: OOZIE-1089
URL: https://issues.apache.org/jira/browse/OOZIE-1089
Project: Oozie
Issue Type: Bug
Components: workflow
Affects Versions: 3.3.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 3.3.0
Attachments: OOZIE-1089.patch

As explained in MAPREDUCE-4820, Hadoop 2.0.2-alpha introduced a duplicate check that exposes a change of behavior in how the distributed-cache works in Hadoop 2 (as opposed to Hadoop-1).
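The two options proposed in the OOZIE-1089 comment above might look roughly like this. It is a plain-Java sketch that leaves the Hadoop DistributedCache API out entirely: the class and method names are hypothetical, jar locations are modelled as strings, and the ":" classpath separator and the choice to key Option 2 on the full path are assumptions, since the comment does not say what the set would hold.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; models jar locations as plain strings.
final class JarDedupSketch {
    /** Last path segment, e.g. "/share/lib/a.jar" yields "a.jar". */
    static String fileName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Option 1: skip a jar when a jar with the same file name was already added. */
    static List<String> skipDuplicateFileNames(List<String> jarPaths) {
        Set<String> seenNames = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String path : jarPaths) {
            if (seenNames.add(fileName(path))) { // add() returns false for a repeated name
                kept.add(path);
            }
        }
        return kept;
    }

    /** Option 2: collect every jar in a set first, then build the classpath once at the end. */
    static String classpathFromSet(List<String> jarPaths) {
        // LinkedHashSet instead of HashSet so the classpath order stays deterministic.
        Set<String> jars = new LinkedHashSet<>(jarPaths);
        return String.join(":", jars);
    }
}
```

In this sketch Option 1 keeps the first location seen for each file name, so the order in which wf/lib, user/lib and share/lib are scanned decides which copy wins. Option 2 as written only collapses identical paths. It would need to key the set on the file name to cover the same-name, different-directory case raised in the thread.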
[jira] [Commented] (OOZIE-1050) Implement logic to update dependencies via push JMS message
[ https://issues.apache.org/jira/browse/OOZIE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504425#comment-13504425 ]

Mohammad Kamrul Islam commented on OOZIE-1050:

+1 committed

Implement logic to update dependencies via push JMS message
Key: OOZIE-1050
URL: https://issues.apache.org/jira/browse/OOZIE-1050
Project: Oozie
Issue Type: Sub-task
Affects Versions: trunk
Reporter: Mona Chitnis
Assignee: Mona Chitnis
Fix For: trunk
Attachments: OOZIE-1050v2.patch
Original Estimate: 72h
Remaining Estimate: 72h

Implementation of the 'handler' that will read the push JMS messages received on the message bus about partition availability, and subsequently update the cache. May also trigger readiness for starting action
[jira] [Commented] (OOZIE-1067) Support Amazon EMR action executor in oozie installed on EC2
[ https://issues.apache.org/jira/browse/OOZIE-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496968#comment-13496968 ]

Mohammad Kamrul Islam commented on OOZIE-1067:

Very good effort. Please let us know if we can be helpful in any way.

Support Amazon EMR action executor in oozie installed on EC2
Key: OOZIE-1067
URL: https://issues.apache.org/jira/browse/OOZIE-1067
Project: Oozie
Issue Type: New Feature
Components: action, coordinator, workflow
Affects Versions: trunk
Environment: Oozie, Amazon EMR availability, EC2 instance, access to Amazon S3 or S3N filesystem.
Reporter: Shaik Idris Ali
Labels: Amazon, EC2, EMR, s3
Fix For: trunk
Original Estimate: 506h
Remaining Estimate: 506h

Oozie is being adopted as the default workflow/scheduling engine for BigData. Currently, small organizations prefer on-demand clusters like Amazon's EMR instead of a full-fledged Hadoop setup. However, there is currently no support for a powerful workflow engine like Oozie that seamlessly schedules and executes user jobs on EMR.

Oozie can provide a new ActionExecutor class like EMRActionExecutor, which can take all the required credentials for EMR. Oozie can be installed on an Amazon EC2 instance, which can then talk to any dynamic EMR cluster. Though Oozie has support for filesystems other than HDFS, we might need to tweak it a bit to support filesystems like S3.
[jira] [Created] (OOZIE-1069) Update dataIn and dataOut EL functions to support partitions
Mohammad Kamrul Islam created OOZIE-1069: Summary: Update dataIn and dataOut EL functions to support partitions Key: OOZIE-1069 URL: https://issues.apache.org/jira/browse/OOZIE-1069 Project: Oozie Issue Type: Sub-task Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Update dataIn() and dataOut() el functions.
[jira] [Commented] (OOZIE-1069) Update dataIn and dataOut EL functions to support partitions
[ https://issues.apache.org/jira/browse/OOZIE-1069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497617#comment-13497617 ]

Mohammad Kamrul Islam commented on OOZIE-1069:

Review at RB: https://reviews.apache.org/r/8060/

Update dataIn and dataOut EL functions to support partitions
Key: OOZIE-1069
URL: https://issues.apache.org/jira/browse/OOZIE-1069
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: trunk

Update dataIn() and dataOut() el functions.
[jira] [Commented] (OOZIE-1063) Provide documentation on configuring Oozie to use new Hadoop API
[ https://issues.apache.org/jira/browse/OOZIE-1063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496799#comment-13496799 ]

Mohammad Kamrul Islam commented on OOZIE-1063:

There are some comments in the old github wiki: https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
Look for "How to run Map-reduce job written using new Hadoop API?" We might need to integrate this into the latest doc.

Provide documentation on configuring Oozie to use new Hadoop API
Key: OOZIE-1063
URL: https://issues.apache.org/jira/browse/OOZIE-1063
Project: Oozie
Issue Type: Improvement
Components: docs
Reporter: Michael Katzenellenbogen

To use the newer mapreduce.* Hadoop API, the Oozie workflow XML configuration should have the mapred.mapper|reducer.new-api property set to true. This is currently undocumented (on the Oozie side), and throws mysterious RuntimeExceptions when using the newer API.

Example RuntimeExceptions:
java.lang.RuntimeException: class com.foo.bar.SomeMapper not org.apache.hadoop.mapred.Mapper
java.lang.RuntimeException: class com.foo.bar.SomeOutputFormat not org.apache.hadoop.mapred.OutputFormat
[jira] [Commented] (OOZIE-1059) Add static method to create URI String in HCatURI
[ https://issues.apache.org/jira/browse/OOZIE-1059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494598#comment-13494598 ]

Mohammad Kamrul Islam commented on OOZIE-1059:

Please upload the patch here.

Add static method to create URI String in HCatURI
Key: OOZIE-1059
URL: https://issues.apache.org/jira/browse/OOZIE-1059
Project: Oozie
Issue Type: Sub-task
Reporter: Ryota Egashira
Assignee: Ryota Egashira
Fix For: trunk
[jira] [Commented] (OOZIE-1056) Command to update push-based dependency
[ https://issues.apache.org/jira/browse/OOZIE-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494355#comment-13494355 ]

Mohammad Kamrul Islam commented on OOZIE-1056:

Review at https://reviews.apache.org/r/7992/

Command to update push-based dependency
Key: OOZIE-1056
URL: https://issues.apache.org/jira/browse/OOZIE-1056
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: trunk

Get partition information from available map and update the action table. At last remove the entry from the available map.
[jira] [Created] (OOZIE-1042) Coordinator action table schema change
Mohammad Kamrul Islam created OOZIE-1042: Summary: Coordinator action table schema change Key: OOZIE-1042 URL: https://issues.apache.org/jira/browse/OOZIE-1042 Project: Oozie Issue Type: Sub-task Reporter: Mohammad Kamrul Islam
[jira] [Assigned] (OOZIE-1042) Coordinator action table schema change
[ https://issues.apache.org/jira/browse/OOZIE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Kamrul Islam reassigned OOZIE-1042: Assignee: Mohammad Kamrul Islam Coordinator action table schema change -- Key: OOZIE-1042 URL: https://issues.apache.org/jira/browse/OOZIE-1042 Project: Oozie Issue Type: Sub-task Reporter: Mohammad Kamrul Islam Assignee: Mohammad Kamrul Islam Fix For: trunk
[jira] [Commented] (OOZIE-1042) Coordinator action table schema change
[ https://issues.apache.org/jira/browse/OOZIE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486713#comment-13486713 ]

Mohammad Kamrul Islam commented on OOZIE-1042:

Create a new column called Push_missingDependency of type BLOB in table COORD_ACTIONS. This requires modifying the CoordinatorAction and JsonCoordinatorAction classes.

There will be two columns to identify two different kinds of dependencies:
1. Missing dependencies (existing): Only those dependencies that need polling, such as directory-based dependencies.
2. Push missing dependencies: Dependencies that rely on the push model, where some messaging system will inform when new data is available. For example, partition-based dependencies provided by hcatalog.

Coordinator action table schema change
Key: OOZIE-1042
URL: https://issues.apache.org/jira/browse/OOZIE-1042
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: trunk
[jira] [Updated] (OOZIE-1042) Coordinator action table schema change
[ https://issues.apache.org/jira/browse/OOZIE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated OOZIE-1042:

Comment: was deleted

(was: Create a new column called Push_missingDependency of type BLOB in table COORD_ACTIONS. This requires modifying the CoordinatorAction and JsonCoordinatorAction classes.

There will be two columns to identify two different kinds of dependencies:
1. Missing dependencies (existing): Only those dependencies that need polling, such as directory-based dependencies.
2. Push missing dependencies: Dependencies that rely on the push model, where some messaging system will inform when new data is available. For example, partition-based dependencies provided by hcatalog.
)

Coordinator action table schema change
Key: OOZIE-1042
URL: https://issues.apache.org/jira/browse/OOZIE-1042
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: trunk

Create a new column called Push_missingDependency of type BLOB in table COORD_ACTIONS. This requires modifying the CoordinatorAction and JsonCoordinatorAction classes.

There will be two columns to identify two different kinds of dependencies:
1. Missing dependencies (existing): Only those dependencies that need polling, such as directory-based dependencies.
2. Push missing dependencies: Dependencies that rely on the push model, where some messaging system will inform when new data is available. For example, partition-based dependencies provided by hcatalog.
[jira] [Updated] (OOZIE-1042) Coordinator action table schema change
[ https://issues.apache.org/jira/browse/OOZIE-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated OOZIE-1042:

Description:
Create a new column called Push_missingDependency of type BLOB in table COORD_ACTIONS. This requires modifying the CoordinatorAction and JsonCoordinatorAction classes.

There will be two columns to identify two different kinds of dependencies:
1. Missing dependencies (existing): Only those dependencies that need polling, such as directory-based dependencies.
2. Push missing dependencies: Dependencies that rely on the push model, where some messaging system will inform when new data is available. For example, partition-based dependencies provided by hcatalog.

Coordinator action table schema change
Key: OOZIE-1042
URL: https://issues.apache.org/jira/browse/OOZIE-1042
Project: Oozie
Issue Type: Sub-task
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Fix For: trunk

Create a new column called Push_missingDependency of type BLOB in table COORD_ACTIONS. This requires modifying the CoordinatorAction and JsonCoordinatorAction classes.

There will be two columns to identify two different kinds of dependencies:
1. Missing dependencies (existing): Only those dependencies that need polling, such as directory-based dependencies.
2. Push missing dependencies: Dependencies that rely on the push model, where some messaging system will inform when new data is available. For example, partition-based dependencies provided by hcatalog.
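The two-column split described for OOZIE-1042 amounts to partitioning an action's unresolved dependencies into a polled group and a pushed group. The sketch below illustrates that idea only. The class name is hypothetical, and treating every URI with an "hcat://" prefix as push-based is an assumption borrowed from the HCat URI discussion elsewhere in this thread, not something the schema proposal states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the two dependency groups; not an Oozie class.
final class DependencySplitSketch {
    /** Would back the existing missing-dependencies column (resolved by polling). */
    final List<String> pullMissing = new ArrayList<>();
    /** Would back the proposed Push_missingDependency column (resolved by notifications). */
    final List<String> pushMissing = new ArrayList<>();

    /** Assumption: partition-based HCatalog dependencies are recognisable by their scheme. */
    static boolean isPushBased(String uri) {
        return uri.startsWith("hcat://");
    }

    static DependencySplitSketch split(List<String> dependencies) {
        DependencySplitSketch result = new DependencySplitSketch();
        for (String uri : dependencies) {
            if (isPushBased(uri)) {
                result.pushMissing.add(uri);
            } else {
                result.pullMissing.add(uri);
            }
        }
        return result;
    }
}
```

Keeping the groups separate means the polling path never has to inspect partition-based entries, and a push notification only needs to touch the second list.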
[jira] [Commented] (OOZIE-1026) Add missing SLA documentation
[ https://issues.apache.org/jira/browse/OOZIE-1026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479647#comment-13479647 ]

Mohammad Kamrul Islam commented on OOZIE-1026:

Sure.

Add missing SLA documentation
Key: OOZIE-1026
URL: https://issues.apache.org/jira/browse/OOZIE-1026
Project: Oozie
Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Assignee: Mohammad Kamrul Islam
Priority: Minor
Attachments: OOZIE-1026.patch

Oozie stores the SLA information of a job, but there is no documentation for it.
[jira] [Created] (OOZIE-1021) Update build doc to set umask to 0022
Mohammad Kamrul Islam created OOZIE-1021:

Summary: Update build doc to set umask to 0022
Key: OOZIE-1021
URL: https://issues.apache.org/jira/browse/OOZIE-1021
Project: Oozie
Issue Type: Bug
Reporter: Mohammad Kamrul Islam
Priority: Minor

The Oozie Getting Started and How To Contribute pages could be updated with this requirement. More from the email thread:

Hi Oozie developers,

I've recently worked on OOZIE-1012 and noticed that on my box (Linux, Ubuntu 12.04) the Oozie tests seem to require umask 022, which actually seems to be imposed by HDFS rather than by Oozie. If I run the tests with umask 002 (my default), most of them fail with the following message:

Cannot lock storage build/test/data/dfs/name1. The directory is already locked.

I believe this error message is just a consequence of the following NullPointerException, which appears in the setUp method of the affected test cases:

java.lang.NullPointerException
    at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:280)
    at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:124)
    at org.apache.oozie.test.XTestCase.setUpEmbeddedHadoop(XTestCase.java:708)
    at org.apache.oozie.test.XTestCase.setUp(XTestCase.java:281)
    at org.apache.oozie.test.XFsTestCase.setUp(XFsTestCase.java:58)
    at org.apache.oozie.action.hadoop.ActionExecutorTestCase.setUp(ActionExecutorTestCase.java:62)

I believe the reason for this NullPointerException is the absence of usable data directories, as the *-output logs contain the following fragments:

12/10/12 11:16:12 WARN datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for build/test/data/dfs/data/data1, expected: rwxr-xr-x, while actual: rwxrwxr-x
12/10/12 11:16:12 WARN datanode.DataNode: Invalid directory in dfs.data.dir: Incorrect permission for build/test/data/dfs/data/data2, expected: rwxr-xr-x, while actual: rwxrwxr-x
12/10/12 11:16:12 ERROR datanode.DataNode: All directories in dfs.data.dir are invalid.

Please note that this output was generated with umask 002; changing it to 022 fixes the issue. Has anyone else noticed this behaviour (failures) as well? If so, it might make sense to document this requirement on the HowToContribute page [1] and maybe improve XTestCase.setUpEmbeddedHadoop by catching the NPE and informing the developer that we're not able to bootstrap MiniDFSCluster.

Jarcec

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
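The thread above suggests telling the developer up front when MiniDFSCluster cannot be bootstrapped. A minimal sketch of such a pre-flight check follows, assuming a POSIX filesystem. The class UmaskProbe and its methods are hypothetical and not part of Oozie or Hadoop. It creates a scratch directory with default attributes and compares the resulting permissions with the rwxr-xr-x value the DataNode log reports as expected.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper for illustration only; not an Oozie or Hadoop class.
final class UmaskProbe {
    // Permissions the DataNode log reports as "expected" for dfs.data.dir.
    static final String EXPECTED = "rwxr-xr-x";

    /** Returns true if the permission set is exactly rwxr-xr-x. */
    static boolean isExpected(Set<PosixFilePermission> perms) {
        return PosixFilePermissions.fromString(EXPECTED).equals(perms);
    }

    /**
     * Creates a scratch directory without explicit attributes, so the
     * process umask decides its mode, and returns the resulting permissions.
     * Throws UnsupportedOperationException on non-POSIX filesystems.
     */
    static Set<PosixFilePermission> probe(Path parent) throws IOException {
        Path dir = Files.createDirectory(parent.resolve("umask-probe-" + System.nanoTime()));
        try {
            return Files.getPosixFilePermissions(dir);
        } finally {
            Files.delete(dir);
        }
    }

    public static void main(String[] args) throws IOException {
        Set<PosixFilePermission> actual = probe(Paths.get(System.getProperty("java.io.tmpdir")));
        if (isExpected(actual)) {
            System.out.println("New directories are created as " + EXPECTED + ".");
        } else {
            System.err.println("New directories are created as "
                    + PosixFilePermissions.toString(actual) + ", expected " + EXPECTED
                    + "; consider running the tests with umask 022.");
        }
    }
}
```

A check along these lines could run before the embedded cluster starts, so that a wrong umask produces a readable message instead of the NullPointerException quoted in the thread.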
[jira] [Commented] (OOZIE-561) Integrate Oozie with HCatalog
[ https://issues.apache.org/jira/browse/OOZIE-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476611#comment-13476611 ]

Mohammad Kamrul Islam commented on OOZIE-561:

High-level Requirements:
• Allow users to specify table/partition-based data dependencies used in HCatalog.
• Users should be able to specify the existing directory-based data dependency as well.
• Users can use both types of dataset (directory-based and table-partition-based) in the same coordinator.
• A single data source should be of one type (either table-partition or directory).
• For table-partition-based datasets, enable users to utilize the existing concepts defined through EL functions such as current(), latest(), future().
• Oozie should allow utilizing metadata from multiple HCatalog servers.
• Users can optionally provide the HCatalog server end-points. Oozie should have a default HCatalog server.
• Oozie will allow passing the input data destinations as db/table/partition(filter) that could easily be used in an MR job and a Pig script.
• Oozie will allow passing the output data destination as db/table/partition(filter) that could easily be used in an MR job and a Pig script.
• Include a “remove partition”-like statement in the prepare block.
• Include an HCatalog action. (Future)
• In condition expressions (e.g. case statement), support the same functionality provided for the directory-based system. Provide equivalent new EL functions with table-partition-based logic. (Future)
• Support non-time-based data dependency (asynchronous data processing). (Future)

The future items will not be implemented as part of this JIRA. However, they should be taken into account in our design.
Integrate Oozie with HCatalog

Key: OOZIE-561
URL: https://issues.apache.org/jira/browse/OOZIE-561
Project: Oozie
Issue Type: New Feature
Reporter: Santhosh Srinivasan
Assignee: Mona Chitnis

With the incubation of HCatalog, we have a mechanism to abstract data and storage on HDFS. A natural progression for Oozie is to interact with HCatalog to facilitate the interplay between MapReduce, Pig and Hive. In addition, the support for notification in HCatalog will alleviate (and not eliminate) the need to poll HDFS for data sets represented as tables and partitions.
[jira] [Updated] (OOZIE-561) Integrate Oozie with HCatalog
[ https://issues.apache.org/jira/browse/OOZIE-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Kamrul Islam updated OOZIE-561:

Attachment: Oozie-HCatHighLevel.pptx

This presentation is relevant to the design of this feature.

Integrate Oozie with HCatalog

Key: OOZIE-561
URL: https://issues.apache.org/jira/browse/OOZIE-561
Project: Oozie
Issue Type: New Feature
Reporter: Santhosh Srinivasan
Assignee: Mona Chitnis
Attachments: Oozie-HCatHighLevel.pptx

With the incubation of HCatalog, we have a mechanism to abstract data and storage on HDFS. A natural progression for Oozie is to interact with HCatalog to facilitate the interplay between MapReduce, Pig and Hive. In addition, the support for notification in HCatalog will alleviate (and not eliminate) the need to poll HDFS for data sets represented as tables and partitions.
[jira] [Commented] (OOZIE-993) Hadoop 23 doesn't accept user defined jobtracker
[ https://issues.apache.org/jira/browse/OOZIE-993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468734#comment-13468734 ]

Mohammad Kamrul Islam commented on OOZIE-993:

+1

Hadoop 23 doesn't accept user defined jobtracker

Key: OOZIE-993
URL: https://issues.apache.org/jira/browse/OOZIE-993
Project: Oozie
Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Virag Kothari
Assignee: Virag Kothari
Fix For: 3.3.0
Attachments: OOZIE-993.patch

As the deprecation of mapred.job.tracker will not be handled in 23 (MAPREDUCE-4044), Oozie needs to set yarn.resourcemanager.address explicitly, similar to OOZIE-779.
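The OOZIE-993 description above amounts to mirroring the user-defined job tracker address into yarn.resourcemanager.address. A rough sketch of that idea follows. It uses a plain Map in place of a Hadoop Configuration so that it stays self-contained. The class JobTrackerCompat and its method are hypothetical, and overwriting any existing value is an assumption of this sketch, not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map stands in for Hadoop's Configuration.
final class JobTrackerCompat {
    static final String MAPRED_JOB_TRACKER = "mapred.job.tracker";
    static final String YARN_RM_ADDRESS = "yarn.resourcemanager.address";

    /**
     * Returns a copy of conf in which yarn.resourcemanager.address is set
     * to the value of mapred.job.tracker, if the latter is present.
     * Assumption: the user-defined job tracker wins over any existing value.
     */
    static Map<String, String> withResourceManagerAddress(Map<String, String> conf) {
        Map<String, String> copy = new HashMap<>(conf);
        String jobTracker = copy.get(MAPRED_JOB_TRACKER);
        if (jobTracker != null) {
            copy.put(YARN_RM_ADDRESS, jobTracker);
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(MAPRED_JOB_TRACKER, "rm.example.com:8032"); // example address, made up
        System.out.println(withResourceManagerAddress(conf).get(YARN_RM_ADDRESS));
    }
}
```

Returning a copy leaves the caller's configuration untouched, which makes the behaviour easy to test in isolation.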
[jira] [Commented] (OOZIE-991) action prepare executions work only with HDFS filesystems
[ https://issues.apache.org/jira/browse/OOZIE-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453561#comment-13453561 ]

Mohammad Kamrul Islam commented on OOZIE-991:

Found that the default in oozie-site.xml is:

oozie.service.HadoopAccessorService.supported.filesystems=hdfs

+1

action prepare executions work only with HDFS filesystems

Key: OOZIE-991
URL: https://issues.apache.org/jira/browse/OOZIE-991
Project: Oozie
Issue Type: Bug
Components: core
Affects Versions: 3.3.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 3.3.0
Attachments: OOZIE-991.patch, OOZIE-991.patch, OOZIE-991.patch

The supported filesystems from HadoopAccessorService are not propagated to the FileSystemActions, and that class enforces the HDFS filesystem.
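To illustrate the kind of check OOZIE-991 is about, the sketch below validates a path's URI scheme against the configured list instead of hard-coding hdfs. The class SupportedFileSystems and its methods are hypothetical. Treating the property value as a comma-separated list of schemes is an assumption; the comment above only shows the single value hdfs.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration; not the HadoopAccessorService implementation.
final class SupportedFileSystems {
    /** Parses a value such as "hdfs" or "hdfs, viewfs" into lower-case schemes. */
    static Set<String> parse(String value) {
        Set<String> schemes = new HashSet<>();
        for (String part : value.split(",")) {
            String scheme = part.trim().toLowerCase(Locale.ROOT);
            if (!scheme.isEmpty()) {
                schemes.add(scheme);
            }
        }
        return schemes;
    }

    /** Returns true if the URI has a scheme and that scheme is in the supported set. */
    static boolean isSupported(URI uri, Set<String> supported) {
        String scheme = uri.getScheme();
        return scheme != null && supported.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Set<String> supported = parse("hdfs"); // the default value quoted in the comment
        System.out.println(isSupported(URI.create("hdfs://namenode:8020/tmp/out"), supported));
        System.out.println(isSupported(URI.create("file:///tmp/out"), supported));
    }
}
```

With the list passed in as a parameter, the same check can serve both the accessor service and the prepare actions, which is the propagation the issue asks for.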