[
https://issues.apache.org/jira/browse/OOZIE-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
purshotam shah updated OOZIE-1504:
----------------------------------
Description:
Use case:
The input range should start on the same fixed date for every run, so that each
day the coordinator processes one more dataset instance than the previous day
(the datasets are daily). Something like the following:
<input-events>
  <data-in name="event_input_path_format1" dataset="EVENT_INPUT_FORMAT1">
    <start-instance>${coord:absolute(2013-03-15T00:00Z)}</start-instance>
    <end-instance>${coord:current(-1)}</end-instance>
  </data-in>
  <data-in name="event_input_path_format2" dataset="EVENT_INPUT_FORMAT2">
    <start-instance>${coord:absolute(2013-03-15T00:00Z)}</start-instance>
    <end-instance>${coord:current(-1)}</end-instance>
  </data-in>
</input-events>
---------------------------------
Specifying a fixed date as the start instance is useful when a job needs to
process all dataset instances from a specific instance up to the current instance.
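
To make the growing range concrete, here is a minimal sketch (plain Python, not Oozie code) of which daily instances such a range would cover for a given nominal time. The helper name resolve_instances and its parameters are hypothetical. It assumes a daily dataset whose instances line up with the fixed start date and that current(0) falls exactly on the nominal time; actual Oozie resolution may differ.

```python
from datetime import datetime, timedelta, timezone


def resolve_instances(absolute_start, nominal_time, end_offset=-1,
                      frequency=timedelta(days=1)):
    """Hypothetical helper: list instance times from a fixed start
    through current(end_offset), relative to the nominal time.

    Assumes current(0) coincides with nominal_time and that instances
    are spaced exactly `frequency` apart starting at absolute_start.
    """
    end = nominal_time + end_offset * frequency
    instances = []
    t = absolute_start
    while t <= end:
        instances.append(t)
        t += frequency
    return instances


start = datetime(2013, 3, 15, tzinfo=timezone.utc)

# Each later nominal day picks up exactly one more instance.
for day in (20, 21):
    nominal = datetime(2013, 3, day, tzinfo=timezone.utc)
    print(nominal.date(), len(resolve_instances(start, nominal)))
# prints:
# 2013-03-20 5
# 2013-03-21 6
```

Under these assumptions, the run with nominal time 2013-03-20 would cover the five instances from 2013-03-15 through 2013-03-19, and the next day's run would cover six.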
was:
For example, in coordinator.xml:
-----
<input-events>
  <data-in name="foo" dataset="bar">
    <start-instance>${coord:latest(-365)}</start-instance>
    <end-instance>${coord:latest(0)}</end-instance>
  </data-in>
</input-events>
-----
There are use cases for reusing the same coordinator.xml with a varying number
of data instances (not always 365). However, the argument to a coord EL function
cannot be parameterized through the job configuration, so the customer has to
copy coordinator.xml with a different number just to change the argument of
coord:latest(), which is hard to maintain.
> Allow specifying a fixed instance as the start instance of a data-in
> --------------------------------------------------------------------
>
> Key: OOZIE-1504
> URL: https://issues.apache.org/jira/browse/OOZIE-1504
> Project: Oozie
> Issue Type: Improvement
> Affects Versions: trunk
> Reporter: Ryota Egashira
> Assignee: purshotam shah
> Fix For: trunk
>
> Attachments: OOZIE-1504_v4.patch
>