The behavior where ActionCheckXCommand calls handleNonTransient() with START_MANUAL when the JT can't be reached after the retries, so that a RESUME command then resubmits the job, was something I did for OOZIE-994. In hindsight, we shouldn't have done it that way.
Yes, it will fail if job recovery is not enabled in the JT/RM; but I think this is the more correct behavior, as this is something that the external system should be taking care of.

- Robert

On Wed, Aug 7, 2013 at 5:05 PM, Virag Kothari <[email protected]> wrote:

> Alejandro, I agree that functionality would be preserved if the action is left in RUNNING during a transient error.
>
> Few questions
>
> 1) START_MANUAL seems to be set only by handleNonTransient(). If this is a bug, do you know for what purpose it was introduced? I thought having START_MANUAL is a way to distinguish between Oozie suspending a job due to a transient error and a user manually suspending the job.
>
> 2) With no Oozie retry on 'RESUME', jobs will fail if JT/RM recovery is not enabled. And it seems that YARN recovery is still not there, as YARN-128 is not yet committed (not sure if I'm looking at the right JIRA). It's a concern for us, as we ask users to RESUME their jobs after a Hadoop upgrade. Now they have to resume the wf and rerun the failed actions.
>
> Thanks,
> Virag
>
> On 8/7/13 2:48 PM, "Alejandro Abdelnur" <[email protected]> wrote:
>
>> [joining the party a bit late]
>>
>> I just had an offline call with RobertK, who brought me up to speed.
>>
>> By design, Oozie will retry starting a workflow action ONLY if it couldn't start the WF action before. If Oozie started the WF action successfully, the WF action state goes into RUNNING, and from then on it is the responsibility of the external system running the action to recover it. Oozie will not attempt any recovery after that point.
>>
>> This means that with Hadoop (JT or YARN) job recovery, the launcher job will be recovered by Hadoop without any intervention from Oozie.
>>
>> It is clear that to have recovery for the MR action we need to get rid of the swap and just hold onto the MR launcher job as we do for the other actions.
>>
>> Now, on the whole discussion on the ActionCheckXCommand retries.
>> We have a bug in the ActionCheckXCommand: on handleNonTransient() we should not change the status of the WF action to START_MANUAL, we should leave it in RUNNING. handleNonTransient() will suspend the WF job, thus switching off action checks. On WF job resume, the action checks will start working again, and if Hadoop has job recovery, things will work fine. Else the WF action will fail because the launcher job is not known (the external system does not know how to recover jobs). Because we are resetting the status to START_MANUAL we are dialing back on the lifecycle of the action; that is incorrect, and that creates the race condition that introduces 2 jobs.
>>
>> So again, Oozie is not responsible for recovering actions. With that assumption, fixing handleNonTransient() to leave the status in RUNNING and getting rid of the MR swap logic, we should be good.
>>
>> Thoughts?
>>
>> On Wed, Aug 7, 2013 at 12:27 AM, Virag Kothari <[email protected]> wrote:
>>
>>> Robert,
>>>
>>> I have been thinking on this for a while and have a few more concerns if the job retries are not streamlined through Oozie.
>>>
>>> 1) Till the JT finishes recovering the job, the wf job/wf action status will be SUSPENDED/START_MANUAL. Isn't it misleading, as the hadoop job is RUNNING while oozie incorrectly shows it as SUSPENDED? Even if we allow this, after the job completes, what if the callback is lost or oozie is down? To prevent the job being in SUSPENDED forever, we need to hack our services to pull SUSPENDED/START_MANUAL jobs from the db and update their status.
>>>
>>> 2) Should we allow failing of the user RESUME command if the action is in START_MANUAL to prevent the race condition we were discussing? This would mean changing the semantics of the states.
>>>
>>> 3) Confused on mapred.job.restart.recover.
>>> Reading http://archive.cloudera.com/cdh4/cdh/4/mr1/mapred-default.html, it says that the default value of this is true. So, if mapred.jobtracker.restart.recover (system config) is already enabled, is job recovery on by default? Also, does recover mean the job will start where it left off, or is it just a plain restart?
>>>
>>> In summary, IMO allowing hadoop to recover jobs independently, bypassing Oozie, isn't trivial. It would have helped if the JT produced a notification when it comes online, so Oozie could retry after consuming those. But currently, notification only happens when a task completes.
>>>
>>> An alternate approach is to modify the semantics of START_MANUAL. Currently Oozie puts the action/job in START_MANUAL/SUSPENDED and expects the user to resume it. We can change this and make Oozie retry the START_MANUAL actions at a configurable interval (~30 mins or some scheme like exponential backoff). Of course, this is bad as oozie will keep polling hadoop at some interval, but manual resume of jobs that have faced transient errors will no longer be mandatory.
>>>
>>> --Virag
>>>
>>> On 8/6/13 4:38 PM, "Robert Kanter" <[email protected]> wrote:
>>>
>>>> If ActionCheckX is trying to retry, and the JT recovers the job, that should be fine. The "retry" is to simply try connecting to the JT to get the status for the job. If the user issues a "RESUME" for a START_MANUAL job, then yes, Oozie will try to resubmit a new job for that action and we'd have two of them if the JT also recovers it.
>>>>
>>>> What if we modified the ActionStartXCommand/ResumeActionXCommand precondition to check if the action already has a Job ID that is valid (i.e. not unknown to the JT), then it fails the precondition check or something similar?
>>>>
>>>> - Robert
>>>>
>>>> On Tue, Aug 6, 2013 at 4:23 PM, Virag Kothari <[email protected]> wrote:
>>>>
>>>>> ActionCheckX first retries for a configurable amount of time and then sets the status to START_MANUAL. So, the problem might happen when the JT recovers the job at the same time that 1) ActionCheckX is trying to retry or 2) the user issues a "RESUME" for a START_MANUAL job. We have to ensure that this doesn't happen, otherwise we will have two hadoop jobs for the same action. The callback happens only when the task is completed, which might be too late. During that time, Oozie might have already submitted a new hadoop job for that wf action. So it doesn't seem straightforward to prevent Oozie from submitting a new job if the JT is already recovering the older one.
>>>>>
>>>>> On 8/6/13 4:01 PM, "Robert Kanter" <[email protected]> wrote:
>>>>>
>>>>>> Yes, if JT recovers the job, it uses the same ID. If the JT comes up quickly and recovers the job, Oozie continues working just fine (without the ID swap issues discussed earlier). When the JT takes longer than the 10min ActionCheck interval, and the action is START_MANUAL, that still needs to be figured out.
>>>>>>
>>>>>> I haven't tested on Hadoop 2.x yet, but I've been told that it should have the same behavior. The only differences are that the name of the property to enable recoverability on the server (not the job-level one) is different, obviously, because it doesn't have "jobtracker" in it, and it can also recover the completed tasks, which shouldn't be a problem because the launcher jar has the one task. I'll of course double check this though.
>>>>>>
>>>>>> - Robert
>>>>>>
>>>>>> On Tue, Aug 6, 2013 at 3:23 PM, Rohini Palaniswamy <[email protected]> wrote:
>>>>>>
>>>>>>> Robert,
>>>>>>> You will not get an unknown hadoop job if the JT has retry configured, right? What happens in that case? Especially, what happens when the Oozie retry happens when the JT comes up quickly? Also, do you know what the behaviour is with Hadoop 2.x?
>>>>>>>
>>>>>>> Mayank,
>>>>>>> OOZIE-1231 already has the changes to show the MapReduce job id in the Child job page, to be consistent with other job types. The v1 API has the older behaviour with the map job url in externalId, while the v2 API has it in childjobids. So there is a UI change, but the v1 REST API has not changed. But OOZIE-1231 has not changed any code with respect to the id swap.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Rohini
>>>>>>>
>>>>>>> On Tue, Aug 6, 2013 at 2:39 PM, Robert Kanter <[email protected]> wrote:
>>>>>>>
>>>>>>>> Ya, I saw a precondition failed message.
>>>>>>>>
>>>>>>>> I just tried out what happens when the job is SUSPENDED, the action is START_MANUAL, and the JT recovers the hadoop job: it doesn't continue the workflow. It fails the eagerVerifyPrecondition from CompletedActionXCommand because the action isn't RUNNING. Perhaps we should make the CallbackService change the status in this situation?
>>>>>>>>
>>>>>>>> Just to clarify, the above only happens when the JT has been down long enough that the ActionCheckXCommand (every 10min by default) + the retries (3 x 1min) happen. If it comes back sooner than that, everything works fine.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>> - Robert
>>>>>>>>
>>>>>>>> On Tue, Aug 6, 2013 at 1:43 PM, Virag Kothari <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Oh..okay. Seems like RecoveryService queues the StartX command but the verifyPrecondition() fails as the wf job is Suspended (please verify this from the logs).
>>>>>>>>>
>>>>>>>>> In that case, if Oozie is not auto-retrying and resubmitting, then it seems fair to have the JT recover the job. But if the JT recovers the job, can we make sure that the workflow job transitions from SUSPENDED to RUNNING and the wf action from START_MANUAL to RUNNING? It should not happen that the user resumes the job, which makes Oozie submit a new hadoop job, while the JT is also recovering the same job. Also, I think the error can still be considered transient from Oozie's perspective, as it is temporary depending on the state of the JT.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Virag
>>>>>>>>>
>>>>>>>>> On 8/6/13 1:12 PM, "Robert Kanter" <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Virag,
>>>>>>>>>> I just tested out killing the JT and waiting for the Checker service to retry and give up: the action goes to START_MANUAL and the job gets SUSPENDED. I waited around long enough, but the RecoveryService didn't do anything. Does it kick in for you?
>>>>>>>>>> As a side note, looking at the code, the RecoveryService looks like it can handle START_MANUAL, END_MANUAL, and USER_RETRY, which all sound like things the user should be doing; is it correct that RecoveryService is handling these?
>>>>>>>>>> The Unknown Hadoop Job error happens when the JT comes back in time, because it won't know about the old ID if it's not recovering jobs. So, Oozie tries to ask it about a job that no longer exists. I'm not sure that this should be a transient error, because there's no way to determine if it's because the JT restarted and Oozie should resubmit the job or if something else happened.
>>>>>>>>>>
>>>>>>>>>> Mayank,
>>>>>>>>>> That is a good point. We could either make a v3 API or add an oozie-site config to turn on/off the id swap behavior and keep the v2 API.
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>> - Robert
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 6, 2013 at 10:48 AM, Mayank Bansal <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Robert,
>>>>>>>>>>>
>>>>>>>>>>> That's a break in backward compatibility. Until now, users have been used to clicking on the link to go to the MR page.
>>>>>>>>>>>
>>>>>>>>>>> Is there a better way to handle this?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Mayank
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 6, 2013 at 10:42 AM, Robert Kanter <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Mona,
>>>>>>>>>>>> As far as I'm aware, the "retry" that Oozie is doing is just retrying to connect to the JT (which is why, when the JT comes back up, Oozie can continue monitoring the hadoop job if it still has the same ID); it doesn't try to submit the job again as part of the "retry".
>>>>>>>>>>>>
>>>>>>>>>>>> Mayank,
>>>>>>>>>>>> We can put the ID for the actual job in the Child IDs tab (like with Pig).
>>>>>>>>>>>>
>>>>>>>>>>>> - Robert
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 6, 2013 at 10:41 AM, Mayank Bansal <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I agree, we should handle these two scenarios. I am ok with changing the launcher behavior for MR; however, if we remove the id swap, then how do we navigate to MR jobs from the UI as we do right now?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Mayank
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 6, 2013 at 10:24 AM, Robert Kanter <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Suppose we leave the MR ID swap thing as is but set the launcher recover to 0 and job to 1; then consider these two scenarios:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. JT gets restarted during the launcher job but before the launcher job actually launches the real job:
>>>>>>>>>>>>>>    - The launcher job won't be recovered because we told it not to
>>>>>>>>>>>>>>    - The real job was never launched
>>>>>>>>>>>>>>    ---> Action never completes and Oozie marks it as failed
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Launcher job submits the real job, but JT gets restarted before the Oozie server has a chance to swap IDs (it's not an atomic operation):
>>>>>>>>>>>>>>    - The launcher job won't be recovered because we told it not to
>>>>>>>>>>>>>>    - The real job will be recovered and finish successfully
>>>>>>>>>>>>>>    ---> Oozie marks the action as failed even though the actual job succeeded because it didn't know about the ID swap
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would only work for the case where the JT gets restarted after the ID swap occurs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 6, 2013 at 10:17 AM, Mayank Bansal <[email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1 for oozie to set launcher to 1 and 0 to jobs for recovery in all the cases except MR.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After the id is swapped, Oozie only knows about the MR job, doesn't it? Then there should not be any problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we set MR launcher recover to 0 and job to 1, then the job will succeed in case of a JT restart.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Am I missing something?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Mayank
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 6, 2013 at 9:59 AM, Robert Kanter <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think you usually just get the "Unknown Hadoop Job" error message because Oozie tries to look up the Hadoop Job ID it already has, but the JT no longer has that ID because it was restarted.
>>>>>>>>>>>>>>>> With JT Recoverability turned on, it will restart the job using the same ID, so Oozie continues just fine.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Aug 5, 2013 at 5:27 PM, Rohini Palaniswamy <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Wouldn't oozie poll for the job status and decide that it has failed, and when the JT comes up launch another one if retry is configured?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Aug 5, 2013 at 3:11 PM, Robert Kanter <[email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> We looked into how to support Job Recoverability (i.e. the JT is restarted and it wants to restart the jobs that were running; similarly for YARN) and have a pretty simple solution for all of the action types except for MapReduce.
>>>>>>>>>>>>>>>>>> If we set mapreduce.job.restart.recover=true for the launcher job and mapreduce.job.restart.recover=false for the jobs launched by the launcher, then when the JT restarts, it will recover the launcher job but not the child jobs -- the launcher job will then take care of relaunching the child jobs.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For MapReduce, because of the optimization with the id swap, this won't work. It would be very tricky, if it's even practical, to do something similar for the MR action. Instead, we think it would be best if we simply remove the MR optimization and make it just like the other action types.
>>>>>>>>>>>>>>>>>> I know we normally don't want to remove optimizations, but there are many advantages in this case, and it's only saving a single Map slot, for MR jobs only.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've created OOZIE-1483 <https://issues.apache.org/jira/browse/OOZIE-1483> with more details and should have a patch soon.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>>>> - Robert
>>
>> --
>> Alejandro
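
For reference, the race discussed in this thread (resetting the action to START_MANUAL versus leaving it in RUNNING) can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is Python rather than Oozie's Java, and Cluster, Action, handle_non_transient, resume and scenario are hypothetical names made up for illustration, not Oozie or Hadoop APIs. The only behavior it models is what the thread describes: a restarted JT keeps a job only if recovery is enabled, and a RESUME on a START_MANUAL action resubmits it.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for the JT/RM (hypothetical, not a Hadoop API)."""
    recovery_enabled: bool
    jobs: set = field(default_factory=set)
    submitted: int = 0

    def submit(self) -> str:
        self.submitted += 1
        job_id = f"job_{self.submitted}"
        self.jobs.add(job_id)
        return job_id

    def restart(self) -> None:
        # Without job recovery, a restarted JT forgets its running jobs.
        if not self.recovery_enabled:
            self.jobs.clear()


@dataclass
class Action:
    status: str
    external_id: str


def handle_non_transient(action: Action, reset_to_start_manual: bool) -> None:
    """Check retries exhausted: the WF job is suspended either way."""
    if reset_to_start_manual:
        action.status = "START_MANUAL"  # behavior described as the bug
    # otherwise the action stays in RUNNING (the proposed fix)


def resume(action: Action, cluster: Cluster) -> None:
    """User RESUME followed by the first action check."""
    if action.status == "START_MANUAL":
        action.external_id = cluster.submit()  # the action is resubmitted
        action.status = "RUNNING"
    elif action.external_id not in cluster.jobs:
        action.status = "FAILED"  # launcher unknown to the external system


def scenario(recovery_enabled: bool, reset_to_start_manual: bool):
    """Run one JT restart and return (action status, live hadoop jobs)."""
    cluster = Cluster(recovery_enabled)
    action = Action("RUNNING", cluster.submit())
    cluster.restart()
    handle_non_transient(action, reset_to_start_manual)
    resume(action, cluster)
    return action.status, len(cluster.jobs)


if __name__ == "__main__":
    for recovery in (True, False):
        for reset in (True, False):
            print(recovery, reset, scenario(recovery, reset))
```

In this toy model, recovery plus the START_MANUAL reset ends with two live jobs for one action (the duplicate described above), recovery plus RUNNING ends with one, and RUNNING without recovery ends with the action failed, which matches the trade-off raised about RESUME after a Hadoop upgrade.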
