Robert Kanter created OOZIE-2018:
------------------------------------

             Summary: Coordinator materialization problems with cron syntax
                 Key: OOZIE-2018
                 URL: https://issues.apache.org/jira/browse/OOZIE-2018
             Project: Oozie
          Issue Type: Bug
          Components: coordinator
    Affects Versions: 4.0.0, trunk
            Reporter: Robert Kanter


Suppose you submit the following coordinator job:
{code:xml}
<coordinator-app name="DailySleep"
  frequency="*/2 * * * *"
  start="2013-06-01T00:00Z" end="2013-06-05T00:00Z"
  timezone="America/Los_Angeles"
  xmlns="uri:oozie:coordinator:0.2"
  >
  <controls>
    <timeout>-1</timeout>
    <concurrency>1</concurrency>
    <execution>FIFO</execution>
    <throttle>2</throttle>
  </controls>
  <datasets>
    <dataset name="sleep_time" frequency="${coord:days(1)}"
             initial-instance="2012-05-31T00:00Z"
             timezone="America/Los_Angeles">
      <uri-template>${DAY}</uri-template>
      <done-flag></done-flag>
    </dataset>
  </datasets>
  <action>
    <workflow>
      <app-path>${wf_application_path}</app-path>
      <configuration>
        <property>
          <name>REDUCER_SLEEP_TIME</name>
          <value>120000</value>
        </property>
        <property>
          <name>oozie.use.system.libpath</name>
          <value>true</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
{code}
Here {{$\{wf_application_path}}} points to a workflow that simply runs a sleep 
MR job for 2 minutes.
Notice that the above coordinator job is set to run with a frequency of 
{{*/2 * * * *}}, which means every 2 minutes, and the throttle is 2.

When you run this job, you’ll see a few anomalies:
# Other than the first action, each action is materialized twice.  The action 
numbering works fine, but you’ll see two actions for each Nominal Time.  You 
can see this in the job info below.
# You can’t see this in the job info below, but while it’s running, there are 
actually 3 jobs READY at the same time, when there should be only 2 (because 
throttle was set to 2).
# OOZIE-1680 added an oozie-site config property 
{{oozie.service.coord.check.maximum.frequency=true}}, which is supposed to block 
jobs with frequencies faster than 5 minutes; it didn’t stop this coordinator.

Points 1 and 2 above are likely the same problem.  Point 3 is somewhat trivial.
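
For reference, a sketch of how the check from point 3 would typically be enabled in oozie-site.xml. This assumes the standard Hadoop-style property layout; the property name and value are the ones quoted in point 3:
{code:xml}
<!-- oozie-site.xml fragment; assumes the usual enclosing <configuration> element -->
<property>
  <name>oozie.service.coord.check.maximum.frequency</name>
  <value>true</value>
</property>
{code}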

Here’s the job info (I killed the job before it finished, and I cut out 
irrelevant info to make it easier to read):
{noformat}
---------------------------------------------------------------------------------------------------------------------------------------
ID                                      External ID                             Created                         Nominal Time
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@1  0000006-140922161548481-oozie-oozi-W    2014-09-22 23:34:38 GMT         2013-06-01 00:00:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@2  0000007-140922161548481-oozie-oozi-W    2014-09-22 23:34:38 GMT         2013-06-01 00:02:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@3  0000008-140922161548481-oozie-oozi-W    2014-09-22 23:36:11 GMT         2013-06-01 00:02:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@4  0000009-140922161548481-oozie-oozi-W    2014-09-22 23:36:11 GMT         2013-06-01 00:04:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@5  0000005-140922161548481-oozie-oozi-C    2014-09-22 23:41:11 GMT         2013-06-01 00:04:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
0000005-140922161548481-oozie-oozi-C@6  0000005-140922161548481-oozie-oozi-C    2014-09-22 23:41:11 GMT         2013-06-01 00:06:00 GMT
---------------------------------------------------------------------------------------------------------------------------------------
{noformat}

I tried the same coordinator job with the old frequency syntax 
({{$\{coord:minutes(2)}}}), and even though we don’t recommend a 2 min 
frequency, it worked correctly (once I set 
{{oozie.service.coord.check.maximum.frequency=false}}, of course).  So this 
appears to be a problem with the cron syntax.  If {{$\{coord:minutes(2)}}} 
didn’t work either, then I’d say it’s just one of the quirks of too high a 
frequency, but that’s not the case here.
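
A minimal sketch of the variant that behaved correctly, assuming only the {{frequency}} attribute changes and everything else in the coordinator stays as submitted:
{code:xml}
<coordinator-app name="DailySleep"
  frequency="${coord:minutes(2)}"
  start="2013-06-01T00:00Z" end="2013-06-05T00:00Z"
  timezone="America/Los_Angeles"
  xmlns="uri:oozie:coordinator:0.2">
  <!-- controls, datasets, and action are unchanged from the original coordinator -->
</coordinator-app>
{code}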



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
