Hi,

DUCC writes logfiles with more detail about the machine and the job, which
would allow us to answer your questions about the machine's physical
resources. These are located in $DUCC_HOME/logs, and the agent log in
particular would be very helpful. The logfile name is {machine
name}.{domain}.agent.log
Please restart DUCC so we can see the log from agent startup through
running the job once.
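
For example, something like this (a sketch assuming a default install
layout; substitute your own host and domain in the log name, and check the
DuccBook if the admin script options differ in your version):

    # restart DUCC so the agent log captures startup
    $DUCC_HOME/admin/stop_ducc -a
    $DUCC_HOME/admin/start_ducc

    # run the job once, then inspect the agent log
    tail -f $DUCC_HOME/logs/myhost.mydomain.agent.log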

As for the JD memory requirement: the Job Driver should not contain any of
the analytic pipeline. Its purpose is normally to send references to the
input data to the Job Processes, which read the input data, process it,
and write the results. (This is described at
http://uima.apache.org/d/uima-ducc-2.0.0/duccbook.html#x1-1600008.1 )

It should be possible for you to take *just* the collection reader
component from the cTAKES pipeline and use that for the JobDriver.
Hopefully this would need much less than -Xmx400m.
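
For example, a minimal job specification along these lines (the descriptor
paths are placeholders; point them at cTAKES's actual collection reader
and aggregate descriptors, and size process_memory_size for your models):

    # Job Driver runs only the collection reader, with the default heap
    driver_descriptor_CR     /path/to/ctakes/CollectionReader.xml
    driver_jvm_args          -Xmx400m

    # Job Processes load the full cTAKES analytic pipeline
    process_descriptor_AE    /path/to/ctakes/AggregatePipeline.xml
    process_jvm_args         -Xmx4g
    process_memory_size      8

You could then submit it with something like
$DUCC_HOME/bin/ducc_submit -f myjob.job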

Regards,
Eddie


On Thu, Mar 10, 2016 at 12:07 PM, Selina Chu <[email protected]> wrote:

> Hi Eddie,
>
> Thanks so much for taking the time to look at my issue and for your reply.
>
> The reason I had to increase the heap size for the JD is that I'm
> running cTAKES (http://ctakes.apache.org/) with DUCC.  The increased heap
> size is to accommodate loading all of the cTAKES models into memory.
> Before I increased it, DUCC would cancel the driver and the job would
> end, with cTAKES reporting "java.lang.OutOfMemoryError: Java heap
> space".
>
> Would you say that this problem is mainly a limitation of my machine's
> physical memory and the other processes running on it, or can it be
> adjusted in DUCC, e.g., parameter adjustments that let me use a larger
> heap, or perhaps a way to pre-allocate enough memory for DUCC?
>
> Thanks again,
> Selina
>
>
> On Wed, Mar 9, 2016 at 7:35 PM, Eddie Epstein <[email protected]> wrote:
>
> > Hi Selina,
> >
> > I suspect that the problem is due to the following job parameter:
> >       driver_jvm_args                -Xmx4g
> >
> > This would certainly be true if cgroups have been configured for DUCC.
> > The default cgroup size for a JD is 450MB, so specifying an Xmx of 4GB
> > can cause the JVM to spill into swap space and cause erratic behavior.
> >
> > Comparing a "fast" job (96) vs "slow" job (97), the time to process the
> > single work item was 8 sec vs 9 sec:
> >    09 Mar 2016 08:46:08,556  INFO JobDriverHelper - T[20] summarize
> > workitem  statistics  [sec]  avg=8.14 min=8.14 max=8.14 stddev=.00
> > vs
> >    09 Mar 2016 08:56:46,583  INFO JobDriverHelper - T[19] summarize
> > workitem  statistics  [sec]  avg=9.41 min=9.41 max=9.41 stddev=.00
> >
> > The extra delays between the two jobs appear to be associated with the
> > Job Driver.
> >
> > Was there some reason you specified a heap size for the JD? The default
> > JD heap size is -Xmx400m.
> >
> > Regards,
> > Eddie
> >
> >
> >
> > On Wed, Mar 9, 2016 at 2:41 PM, Selina Chu <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I'm kind of new to DUCC and this forum.  I was hoping someone could
> > > give me some insight into why DUCC is behaving strangely and is a bit
> > > unstable.
> > >
> > > What I'm trying to do: I'm using DUCC to process a cTAKES job.
> > > Currently DUCC is using just a single node.  DUCC seems to behave
> > > randomly in processing the jobs, with times varying between 4.5 and
> > > 23 minutes, and I wasn't running anything else that is CPU intensive.
> > > When I don't use DUCC and run cTAKES alone, the processing times are
> > > pretty consistent.
> > >
> > > To demonstrate this strange behavior in DUCC, I submitted the exact
> > > same job 10 times in a row (job IDs 95-104), without modifying the
> > > settings.
> > > The durations of the jobs were: 4:41, 4:43, 12:48, 8:41, 5:24, 4:38,
> > > 7:07, 23:08, 8:08, and 20:37 (canceled by the system). The first 9
> > > jobs completed and the last one was canceled.  Even before that last
> > > job, the durations of the first 9 varied widely.
> > > After restarting and resetting DUCC a couple of times, I submitted the
> > > same job again (job ID 110); that job completed without a problem,
> > > though with a long processing time.
> > >
> > > I noticed that when a job takes a long time to finish, past 5 minutes,
> > > it seems to be stuck in the "initializing" and "completing" states the
> > > longest.
> > >
> > > It seems like DUCC is doing something random.  I tried examining the
> > > log files, but they are all similar, except for the time spent between
> > > each state.
> > > (I've also placed the related logs and job file in a repo,
> > > https://github.com/selinachu/Templogs, in case anyone is interested in
> > > examining them.)
> > >
> > > I'm baffled by the random behavior from DUCC. I was hoping someone
> > > could clarify this for me.
> > >
> > > After completing a job, what does DUCC do? Does it keep something in
> > > memory that carries over to the next job, which might relate to the
> > > initialization process?  Are there parameter settings that might
> > > alleviate this type of behavior?
> > >
> > > I would appreciate any insight.  Thanks in advance for your help.
> > >
> > >
> > > Cheers,
> > > Selina Chu
> > >
> >
>
