Yi-Wen,

A privileged ducc_ling is for situations where both resource sharing and data
privacy are desired.  For example, user "ducc" would use ducc_ling to
assume the identity of the job submitter, so that the job can read and
write that user's files.  This is useful on large compute clusters with many
users.  When testing on my small simulated cluster I do not use a privileged
ducc_ling.

Your second question is harder to answer without more information.  Was
there any contention for resources (CPU, disk)?  Look at the Work Items tab
for the longest and shortest jobs and see if you notice any pattern.  Were
work items slow on a particular node?  Were delivery times longer?  Were
process times longer for a few or for all of the work items?  Was there a lot
of pre-emption (two or more jobs running at once)?  Was an equal amount
of resources (e.g. number of processes) allocated to each job?
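
If it helps, here is a minimal sketch of one way to look for such patterns.
It assumes you copy a few rows from the Work Items tab by hand into a list;
the record layout, the job labels, and every number below are made up for
illustration and are not a DUCC export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records transcribed by hand from the Work Items tab:
# (job label, node, delivery time in seconds, process time in seconds)
work_items = [
    ("fast-run", "node1", 0.4, 11.0),
    ("fast-run", "node2", 0.5, 12.0),
    ("slow-run", "node1", 0.6, 12.0),
    ("slow-run", "node2", 4.5, 40.0),
]


def summarize(items):
    """Return {(job, node): (mean delivery time, mean process time)}."""
    grouped = defaultdict(list)
    for job, node, delivery, processing in items:
        grouped[(job, node)].append((delivery, processing))
    return {
        key: (mean(d for d, _ in vals), mean(p for _, p in vals))
        for key, vals in grouped.items()
    }


if __name__ == "__main__":
    # Print one line per (job, node) pair so slow nodes stand out.
    for (job, node), (delivery, processing) in sorted(summarize(work_items).items()):
        print(f"{job:10s} {node:8s} delivery={delivery:6.2f}s process={processing:6.2f}s")
```

If one node shows up with much larger averages only in the slow runs, that
points at the node rather than at the job itself.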

Lou.

On Tue, Nov 24, 2015 at 5:35 PM, Yi-Wen Liu <yiwen...@usc.edu> wrote:

> Hello,
>
> I have some small questions about DUCC, most of them are not technical,
> hope somebody can help me out, thanks!
>
> From https://cwiki.apache.org/confluence/display/UIMA/DUCC, it includes a
> step "Privileged ducc_ling".
> But if I ignore this step, DUCC still works well, so I am wondering what
> this step is for?
>
> The second question is, I submitted the same input files many times, and the
> completion times were very different, ranging from 2 min to 7 min.
> While I was running DUCC I kept the computer otherwise idle, with one job
> running at a time.
> Is there a reason why it sometimes finished early and sometimes took such a
> long time to complete the job?
>
> Thanks,
> Yi-Wen
>
