Hi Andras and Attila:
Thanks for your advice.
I will check the cluster utilization the next time this job runs, but I found
some warnings in oozie.log:

2017-06-05 02:18:18,952  WARN CallableQueueService:523 - SERVER[363748lpp2mn006.geicoddc.net] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] max concurrency for callable [switch] exceeded, requeueing with [500]ms delay

2017-06-05 02:18:38,433  WARN CallableQueueService:523 - SERVER[363748lpp2mn006.geicoddc.net] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] max concurrency for callable [#composite#job.notification] exceeded, requeueing with [500]ms delay

Does it mean I should increase
oozie.service.CallableQueueService.callable.concurrency?
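
If so, I assume the override would go into oozie-site.xml roughly as below.
The value 10 is only a placeholder of mine, not a tested or recommended
setting:

```xml
<!-- oozie-site.xml: hypothetical override; the value 10 is an illustrative guess -->
<property>
  <name>oozie.service.CallableQueueService.callable.concurrency</name>
  <value>10</value>
</property>
```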

BTW, I am using Oozie 4.2.0.

Thanks


2017-06-06 21:04 GMT+08:00 Attila Sasvari <asasv...@cloudera.com>:

> Hi Dong Ying,
>
> Many thanks Andras, these are good ideas.
>
> In addition, can you confirm that you have enough vcores / memory in your
> cluster for containers?
>
> You can check and try to adjust the following YARN settings:
> - yarn.nodemanager.resource.cpu-vcores
> - yarn.nodemanager.resource.memory-mb
>  (look at your yarn-site.xml / yarn-default.xml)
>
> I would also recommend checking overall cluster utilization when Oozie
> jobs get into PREP state. Are there a lot of running jobs using a lot of
> resources (vcores, memory) at the time when your coordinator tries to
> submit the job? You can look at resource manager and history server. Hope
> this helps.
>
> Best,
> - Attila
>
> * yarn settings -
> https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
>
>
>
>
> On Tue, Jun 6, 2017 at 2:26 PM, Andras Piros <andras.pi...@cloudera.com>
> wrote:
>
> > Hi Dong Ying,
> >
> > do you see any log entries containing the snippet "queue is full" in the
> > Oozie webapp logs?
> >
> > What are the values of these parameters:
> >
> >    - oozie.service.CallableQueueService.queue.size
> >    - oozie.service.CallableQueueService.threads
> >    - oozie.service.CallableQueueService.callable.concurrency
> >
> >
> > Regards,
> >
> > Andras
> >
> > On Tue, Jun 6, 2017 at 9:04 AM, Dongying Jiao <pineapple...@gmail.com>
> > wrote:
> >
> > > Hi:
> > > I have an Oozie coordinator job that runs at 02:00 every day. Sometimes
> > > the job runs smoothly, but sometimes it is stuck in PREP state for a
> > > long time.
> > >
> > > This is part of my coordinator.xml:
> > > <coordinator-app name="CoordinatorForETL"
> > >   frequency="${coordinatorFrequency}"
> > >   start="${startTime}" end="${endTime}" timezone="America/New_York"
> > >   xmlns="uri:oozie:coordinator:0.2">
> > >   <controls>
> > >     <timeout>10</timeout>
> > >     <concurrency>1</concurrency>
> > >   </controls>
> > >   <action>
> > >     <workflow>
> > > .............
> > > This is part of the workflow.xml:
> > > ......
> > > <start to="flowDecision"/>
> > > <decision name="flowDecision">
> > >   <switch>
> > >     <case to="q1">${workflowType eq "etl" || workflowType eq "all"}</case>
> > >     <case to="prediction">${workflowType eq "prediction"}</case>
> > >     <case to="errorOnDecision">${workflowType eq "cleaning"}</case>
> > >     <default to="errorOnDecision"/>
> > >   </switch>
> > > </decision>
> > > .......
> > >
> > > In my latest run, the job stayed in PREP state for about 30 minutes. From
> > > the Oozie log, the "start" node of the job finished at 02:00, but the
> > > "flowDecision" node did not start executing until 02:32. During that
> > > period, the log shows other Oozie jobs running, but I didn't find any
> > > error or exception in the log.
> > >
> > > From my understanding, an Oozie job in PREP state has not been submitted
> > > to YARN yet, so there is no application ID to find on YARN.
> > > I wonder if this relates to Oozie's queue mechanism or concurrency control.
> > > If yes, do you have experience on how to tune them?
> > >
> > > Thanks a lot.
> > >
> > > Best Regards,
> > > Dong Ying
> > >
> >
>
