Hi Richard,
Our last DUCC cluster was retired earlier this year. I would vote for
retirement.
Regards,
Eddie
On Fri, Nov 11, 2022 at 10:38 AM Richard Eckart de Castilho
wrote:
> Hi all,
>
> is anybody using UIMA DUCC?
>
> If yes, it would be great if you could lend us a hand in preparing a new
On Fri, Aug 26, 2022 at 10:01 AM Daniel Cosio wrote:
> Any chance you could point me to where this is defined in the docs?
> Daniel Cosio
> dcco...@gmail.com
>
>
>
> > On Aug 26, 2022, at 8:52 AM, Eddie Epstein wrote:
> >
> > UIMA-AS supports timeouts for remot
e connection that communicates the CAS releases..I was
> >> wondering if there was any way of getting the temp queue connection and
> >> sending the message back to return the CAS.. Possible in a shutdown
> hook.
> >>
> >>
> >> Daniel Cosio
> >>
Daniel, is this again a uima-as deployment? If so, since the OS kills
processes, is it some remote AE being killed?
Eddie
On Wed, Aug 24, 2022 at 10:04 AM Daniel Cosio wrote:
> Hi, I have some instances where the OS has killed a pipeline to recover
> resources.. When this happens the pipeline n
Richard,
Looks promising! I put a few comments in the drive document.
Regards, Eddie
On Fri, Aug 20, 2021 at 5:27 AM Richard Eckart de Castilho
wrote:
> Hi all,
>
> to facilitate working with UIMA CAS data and to promote interoperability
> between different UIMA implementations, a new UIMA JSON
r, asynchronous calls weren't registering and the
>CasConsumer would return without writing anything in the Elasticsearch
>index. I checked the job logs and couldn't find any error messages.
>
> I'm sorry for another long message and I truly am grateful to you for
/tutorials_and_users_guides.html#ugr.tug.cpe
On Sun, Jun 14, 2020 at 7:06 PM Eddie Epstein wrote:
> In this case the problem is not DUCC, rather it is the high overhead of
> opening small files and sending them to a remote computer individually. I/O
> works much more efficiently with larger
> Hello,
>
> Thank you very much for your response and even more so for the detailed
> explanation.
>
> So, if I understand it correctly, DUCC is more suited for scenarios where
> we have large input documents rather than many small ones?
>
> Thank you once again.
>
> O
's no heavy
> I/O processing is happening in the code.
>
> Any ideas please?
>
> Thank you.
>
> On 2020/05/18 12:47:41, Eddie Epstein wrote:
> > Hi,
> >
> > Removing the AE from the pipeline was a good idea to help isolate the
> > bottleneck. The othe
Hi,
Removing the AE from the pipeline was a good idea to help isolate the
bottleneck. The other two most likely possibilities are the collection
reader pulling from elastic search or the CAS consumer writing the
processing output.
DUCC Jobs are a simple way to scale out compute bottlenecks across
Besides very large documents and remote data, another major motivation was
for non-text data, such as audio or video.
Eddie
On Fri, Oct 25, 2019 at 1:33 PM Marshall Schor wrote:
> Hi,
>
> Here's what I vaguely remember was the driving use-cases for the sofa as a
> URI.
>
> 1. The main use case
Unless all CLI/API submissions are done from the head node, DUCC still has
a dependency on a shared filesystem to authenticate such requests for
configurations where user processes run with user credentials.
On Wed, Sep 4, 2019 at 9:41 AM Lou DeGenaro wrote:
> The DUCC Book for the Apache-UIMA D
> Florian
>
> On Wed, May 1, 2019 at 12:13 AM, Eddie Epstein
> wrote:
> > Hi Florian,
> >
> > Interesting questions. First, yes the intended behavior is to leave 1
> > instance running. Services are either started by having
> > autostart=true, or
> > by
Hi Florian,
Interesting questions. First, yes the intended behavior is to leave 1
instance running. Services are either started by having autostart=true, or
by a job or another service having a dependency on the service. Logically
it could be possible to let a pinger stop all instances and have th
Hi Rohit,
Hopefully this is something fairly easy to fix. Thanks for the information.
Eddie
On Thu, Aug 2, 2018 at 2:46 AM, Rohit Yadav wrote:
> Hi,
>
> I've tried running DUCC Job for various languages but all the content is
> replaced by (Question Mark)
>
> But for English it works fine.
Hi Erik,
Your user ID has hit the limit for "max user processes" on the machine.
Note that processes and threads are the same in Linux, and a single JVM may
spawn many threads (for example many GC threads :) This parameter used to
be unlimited for users, but there was a change in Red Hat distros t
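For anyone hitting this, one way to check the limit and the current per-user thread count from a shell (the procps flags below are one common form; adjust for your distro):

```shell
# Per-user limit on processes/threads (nproc); JVM threads count against it
ulimit -u

# Number of threads currently owned by this user
ps -L -u "$(id -un)" -o tid= | wc -l
```

On Red Hat family systems the default nproc cap is typically set in a file under /etc/security/limits.d/, which is where it can be raised.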
Hi,
As you may see, the default DUCC pinger for UIMA-AS services scrapes JMX
stats from the service broker to report the number of producers and
consumers, the queue depth and a few other things. This pinger also does
stat reset operations on each call to the pinger, I think every 2 minutes.
A cust
Hi Rohit,
What is the collection reader running in the job driver doing? Look at the
memory use (RSS) value for the job driver on the job details page. If
nothing is logged (be sure to check ducc.log file) my guess would be that
the JD ran out of RAM in its cgroup and was killed. The JD cgroup siz
I think the UIMA code uses the annotator context to map the _InitialView
and the context remains static for the life of the annotator. Replicating
annotators to handle different views has been used here too, but agree it
is ugly.
If the annotator code can be changed, then one approach would be to
ere the cas is generated
> by DUCC.
> This can also be an issue of the environment (language) of DUCC because the
> default language is English.
>
> Best Regards
> Rohit
>
> On 2018/07/03 13:11:50, Eddie Epstein wrote:
> > Rohit,
> >
> > Before sending the d
Rohit,
Before sending the data into jcas if i force encode it :-
>
> String content2 = null;
> content2 = new String(content.getBytes("UTF-8"), "ISO-8859-1");
> jcas.setDocumentText(content2);
>
Where is this code, in the job CR?
>
> And when i go in my first annotator i force decode it:-
>
>
Hi Rohit,
In a DUCC job the CAS created by the user's CR in the Job Driver is serialized
into cas.xmi format, transported to the Job Process where it is
deserialized and given to the user's analytics. Likely the problem is in CAS
serialization or deserialization, perhaps due to the active LANG
environme
but in the processing of refactoring my
> CollectionReader I was trying to slim it down and just have it pass
> document identifiers to the aggregate analysis engine. I'm fuzzy on whether
> 2) is an option and if so how to implement.
>
> -John
>
>
> ______
I may not understand the scenario.
For meta-data that would modify the behavior of the analysis, for example
changing what analysis is run for a CAS, putting it into the CAS itself is
definitely recommended.
The example above is for the UIMA service to access the artifact itself
from a remote so
he pipeline and finally cached by the CC.
> Then, I can somehow (have to read this up) have the work item CAS sent to
> the CC as the effective “batch processing complete” signal.
>
> Is that correct?
>
> > On 15. May 2018, at 20:50, Eddie Epstein wrote:
> >
> > Hi Erik
Hi Erik,
There is a brief discussion of this in the duccbook in section 9.3 ...
https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1880009.3
In particular, the 3rd option, "Flushing cached data". This assumes that
the batch of work to be flushed is represented by each workitem CAS.
Regar
DUCC is designed for multi-user environments, and in particular tries to
balance resources fairly quickly across users in a fair-share allocation.
The default mechanism used is preemption. To eliminate preemption specify a
"non-preemptable" scheduling class for jobs such as "fixed".
Other options
Hi,
Are you specifying to DUCC all three component descriptors: CM, AE and CC?
I'm guessing not, but rather your CM is included in the AAE aggregate given
to DUCC as the AE_Descriptor. Can you confirm?
Eddie
On Fri, Apr 13, 2018 at 8:21 AM, Erik Fäßler
wrote:
> Hi Eddie, thanks for the reply!
Hi Erik,
DUCC jobs can scale out user's components in two ways, horizontally by
running multiple processes (process_deployments_max) and vertically by
running the pipeline defined by the CM, AE and CC components in multiple
threads (process_pipeline_count). Since the constructed top AAE is
desig
Hi,
An annotation feature structure can only be added to the index of the view
it was created in.
It looks like the application at
edu.cmu.lti.oaqa.baseqa.evidence.concept.PassageConceptRecognizer.process(
PassageConceptRecognizer.java:96)*
is trying to add an annotation created in one view to th
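A minimal sketch of the fix, assuming uimaj-core on the classpath (the method name and span are illustrative, not from the original application): create the feature structure against the JCas of the view whose index it should live in, then index it there.

```java
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

public class ViewIndexing {
  /**
   * An annotation belongs to the view it was created in, so create it
   * from the target view's JCas rather than another view's, and call
   * addToIndexes() so it is indexed in that same view.
   */
  public static Annotation markSpan(JCas targetView, int begin, int end) {
    Annotation a = new Annotation(targetView, begin, end); // created in targetView
    a.addToIndexes(); // indexed in the view it was created in
    return a;
  }
}
```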
There will be a new mechanism to help do this in the upcoming
uima-as-2.10.2 version. This version includes an additional listener on
every service that can be addressed individually. A uima-as client could
then iterate thru all service instances calling CPC, assuming the client
knew about all exis
Hi,
Annotator class "org.orkash.annotator.AnalysisEngine.TreebankChunkerAnnotator"
was not found ... means that this class is not in the classpath specified
by the registration.
Eddie
On Mon, Nov 20, 2017 at 9:17 AM, priyank sharma
wrote:
> Hi!
>
> When i am registering the service on the ducc
because of the Java Heap Space?
>
> Please suggest something as there are nothing in the logs regarding to my
> problem.
>
> Thanks and Regards
> Priyank Sharma
>
> On Friday 10 November 2017 09:00 PM, Eddie Epstein wrote:
>
>> Hi Priyank,
>>
>> Looks like
Hi Priyank,
Looks like you are running DUCC v2.0.x. There are so many bugs fixed in
subsequent versions, the latest being v2.2.1. Newer versions have a
ducc_update command that will upgrade an existing install, but given all
the changes since v2.0.x I suggest a clean install.
Eddie
On Fri, Nov 1
> computing host, and it seems like Hadoop/Spark are much more likely to be
> supported there.
>
> David Fox
>
> On 9/15/17, 1:57 PM, "Eddie Epstein" wrote:
>
> >There are a few DUCC features that might be of particular interest for
> >scaling out UIMA
There are a few DUCC features that might be of particular interest for
scaling out UIMA analytics.
- all user code for batch processing continues to use the existing UIMA
component model: collection readers, CAS multipliers, analysis engines, and
CAS consumers.
- DUCC supports assembling and d
How long does the job run before stopping? Cancelled by user could come if
the job is submitted with cancel_on_interrupt and the client submitting the
job were stopped.
Eddie
On Tue, May 16, 2017 at 8:31 AM, Lou DeGenaro
wrote:
> Dunno why the connection would be refused. Are the JD and JP on
Hi Erik,
A few words about DUCC and your application. DUCC is a cluster controller
that includes a resource manager and 3 applications: batch processing, long
running services and singleton processes.
The batch processing application consists of a users CollectionReader which
defines work items a
quired.
>
> 2016-11-09 9:40 GMT-05:00, Eddie Epstein :
> > Is behavior the same for single-threaded AnalysisEngine instantiation?
> >
> > On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera >
> > wrote:
> >
> >> I have an aggregate analysis engine that conta
Is behavior the same for single-threaded AnalysisEngine instantiation?
On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera
wrote:
> I have an aggregate analysis engine that contains a CasMultiplier
> annotator. I instantiate this aggregate with the interface
> UIMAFramework.produceAnalysisEngine(speci
ith xmiCas serialization everything works fine. The client and
> the input Cas have identical type system definitions, because i get
> the cas from UimaAsynchronousEngine with the line
> "asynchronousEngine.getCAS()", any idea of problem
>
> 2016-11-03 16:49 GMT-0
Hi,
Binary serialization for a service call only works if the client and
service have identical type system definitions. Have you confirmed
everything works with the default XmiCas serialization?
Eddie
On Thu, Nov 3, 2016 at 3:51 PM, nelson rivera
wrote:
> I want to consume a service uima-as a
"Comment from
> NodeMemInfoCollector.java: if running ducc in simulation mode skip memory
> adjustment. Report free memory = fakeMemorySize". But I am not sure if we
> can use this safely since it is for testing.
>
> So we basically want to give ducc an upper limit of usable memory.
>
Hi Daniel,
For each node Ducc sums RSS for all "system" user processes and excludes
that from Ducc usable memory on the node. System users are defined by a
ducc.properties setting with default value:
ducc.agent.node.metrics.sys.gid.max = 500
Ducc's simulation mode is intended for creating a scale
Hi Wahed,
One approach would be to configure the service itself to self-destruct if
processing exceeds a processing threshold. UIMA-AS error configuration does
support timeouts for remote delegates, but not for in-process delegates. So
this would require starting a timer thread in the annotator th
Hi Sean,
There are example .mak files for compiling and creating shared libraries
from C++ annotator code. A couple of env parameters need to be set for the
build. It should be straightforward to configure eclipse CDT to build an
annotator and a C++ application calling annotators from a makefile.
>
> --
> Benjamin De Boe | Product Manager
> M: +32 495 19 19 27 | T: +32 2 464 97 33
> InterSystems Corporation | http://www.intersystems.com
>
> -Original Message-
> From: Eddie Epstein [mailto:eaepst...@gmail.com]
> Sent: Tuesday, April 26, 2016 4:58 AM
> To: user@
> Thanks,
> benjamin
>
> --
> Benjamin De Boe | Product Manager
> M: +32 495 19 19 27 | T: +32 2 464 97 33
> InterSystems Corporation | http://www.intersystems.com
>
> -Original Message-
> From: Eddie Epstein [mailto:eaepst...@gmail.com]
> Sent: Thursday, April 7
ed error
>
> (5002)
>
> at org.apache.uima.uimacpp.UimacppEngine.destroyJNI(Native Method)
>
> at
> org.apache.uima.uimacpp.UimacppEngine.destroy(UimacppEngine.java:304)
>
> at
> org.apache.uima.uimacpp.UimacppAnalysisComponent.destroy(UimacppAnalysisComponent.java
Hi Benjamin,
UIMACPP is thread safe, as is the JNI interface. To confirm, I just created
a UIMA-AS service with 10 instances of DaveDetector, and fed the service
800 CASes with up to 10 concurrent CASes at any time.
It is not the case with DaveDetector, but at annotator initialization some
analyt
Hi Reshu,
Reserve type allows users to allocate an unconstrained resource. Because
reserve allocations are not constrained by cgroup containers, in v2.x these
allocations were restricted to be an entire machine.
Fixed type allocations, which are always associated with a specific user
process, hav
Hi Reshu,
This is caused by the CollectionReader running in the JobDriver putting
character data in the work item CAS that cannot be XML serialized. DUCC
needs to do better in making this problem clear.
Two choices to fix this: 1) have the CR screen for illegal characters and
not put them in the
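A sketch of option 1): the XML 1.0 legal character ranges are fixed by the spec, and the helper below (the class and method names are mine, not DUCC's) shows one way a CR could screen work-item text before putting it in the CAS:

```java
public class XmlTextScreen {
  /**
   * Drop characters that are illegal in XML 1.0, which is what XMI
   * serialization of the work item CAS requires. Legal ranges:
   * #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF.
   */
  public static String stripInvalidXml(String s) {
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      i += Character.charCount(cp); // step by code point, not char
      boolean legal = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (legal) sb.appendCodePoint(cp);
    }
    return sb.toString();
  }
}
```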
Hi Hemati,
If all the components listed are delegates of a single AAE, and the AAE is
deployed by UIMA-AS as "async", then by default only a single instance of
each delegate will be instantiated. Does the UIMA-AS deployment descriptor
specify more than one instance of any of the delegates?
Regard
Hi Reshu,
Each DUCC machine has an agent responsible for starting and killing
processes.
There was a bug ( https://issues.apache.org/jira/browse/UIMA-4194 ) where
the
agent failed to issue a kill -9 against "hung" JPs when a job was stopping.
The fix is in v2.0.
Regards,
Eddie
On Tue, Jan 5, 20
Hi,
Did you confirm that user ducc@ducc-head can do passwordless ssh to
ducc-node-1? If so, running ./check_ducc from the admin folder should give
some useful feedback about ducc-node-1.
Eddie
On Mon, Nov 30, 2015 at 5:14 AM, Sylvain Surcin
wrote:
> Hello,
>
> Despite experimenting for a few
There are several remote AE samples in the UIMA-AS sdk, currently "Apache
UIMA Version 2.6.0" download link at http://uima.apache.org/downloads.cgi.
$UIMA_HOME/examples/deploy/as includes
Deploy_MeetingDetectorTAE.xml
Deploy_MeetingFinder.xml
Deploy_RoomNumberAnnotator.xml
After unpackin
> After installing DUCC 2.1 from binary when I click onDuccBook on
> webserver,I got following error.
>
> HTTP ERROR: 404
>
> Problem accessing /doc/duccbook.html. Reason:
>
> Not Found
>
>
> Thanks and Regards,
> Satya Nand Kanodia
>
> On 10/05/2015 07:01 PM, Eddie Epstein wrote:
>
>> 0 made a small change in cgconfig.conf, adding two lines to enable
>>
>
>
_in_bytes memory.usage_in_bytes
> memory.limit_in_bytes memory.move_charge_at_immigrate memory.use_hierarchy
>
> these are the permissions on /cgroup/ducc
>
> drwxr-xr-x 2 ducc root 0 Oct 5 09:29 .
>
>
> Thanks and Regards,
> Satya Nand Kanodia
>
> On 10/01/2015 07:49 PM,
memory.soft_limit_in_bytes
release_agent
~$ ls -ld /cgroup/ducc/
drwxr-xr-x 2 ducc root 0 Sep 5 11:31 /cgroup/ducc/
On Thu, Oct 1, 2015 at 8:20 AM, Eddie Epstein wrote:
> Well, please list the contents of /cgroups to confirm that the custom
> cgconfig.conf is operating.
> Eddie
>
> On Thu, Oct 1, 2
so written
> in documentation.)
>
> anything else ?
>
> Thanks and Regards,
> Satya Nand Kanodia
>
> On 09/30/2015 05:28 PM, Eddie Epstein wrote:
>
>> Hi Satya,
>>
>> There is a custom cgconfig.conf that has to be installed in /etc/ before
>> starti
's owner or permissions. It currently has 644
> permissions.
>
> Thanks and Regards,
> Satya Nand Kanodia
>
> On 09/29/2015 06:46 PM, Eddie Epstein wrote:
>
>> DUCC's /etc/cgconfig.conf specifies user=ducc to create cgroups.
>> Is DUCC running as user=d
DUCC's /etc/cgconfig.conf specifies user=ducc to create cgroups.
Is DUCC running as user=ducc?
Using sudo for cgreate testing suggests that the ducc userid is not being
used.
Eddie
On Tue, Sep 29, 2015 at 3:12 AM, Satya Nand Kanodia <
satya.kano...@orkash.com> wrote:
> Hi,
>
> I am using CentOS
unless UIMA
> would by default drop any CAS that has its only remaining view removed.
>
> Dropping the whole unit-of-work (the CAS) instead of stripping its content
> appears to me a cleaner solution.
>
> -- Richard
>
> On 07.09.2015, at 17:45, Eddie Epstein wrote:
>
>
the document text, but
> as far as I know the document text cannot be changed once it is set.
>
> On 07/09/15 17:14, "Eddie Epstein" wrote:
>
> >Can the filter in the INNER_AAE modify such CASes, perhaps
> >by deleting data, that would result in the existing
Petr,
> > > (I'm somewhat tempted to cut my losses short (much too late) and
> > > abandon UIMA flow control altogether, using only simple pipelines and
> > > having custom glue code to connect these together, as it seems like
> > > getting the flow to work in interesting cases is a huge time s
Can the filter in the INNER_AAE modify such CASes, perhaps
by deleting data, that would result in the existing consumer
effectively ignoring them?
On Mon, Sep 7, 2015 at 11:08 AM, Zesch, Torsten
wrote:
> >The consumer does not have to be modified if the flow controller
> >drops CASes marked to b
ion with a special FeatureStructure, but
> this has the disadvantage that the consumer needs to be aware of that.
> It would be easier if some CASes could simply be dropped.
> I guess this could even be useful for flat workflows.
>
> -Torsten
>
>
> Am 06/09/15 17:31 schrie
't get why.
>
> Cheers,
>
> -- Richard
>
> On 06.09.2015, at 17:14, Eddie Epstein wrote:
>
> > How about the filter adds a FeatureStructure indicating that the CAS
> should
> > be dropped.
> > Then when the INNER_AAE returns the CAS, the flow controller in the
>
e do explicitly not want certain CASes to continue the processing path.
>
> -- Richard
>
> On 06.09.2015, at 17:04, Eddie Epstein wrote:
>
> > Richard,
> >
> > In general the input CAS must continue down some processing path.
> > Where is it stored and what trigg
returned by processAndOutputNewCASes does not contain the input CAS?
>
> Cheers,
>
> -- Richard
>
> On 06.09.2015, at 16:21, Eddie Epstein wrote:
>
> > Hi Richard,
> >
> > FinalStep() in a CasMultiplier aggregate means to stop further flow
> > in
Hi Petr
On Sun, Sep 6, 2015 at 10:11 AM, Petr Baudis wrote:
> Hi!
>
> I'm currently struggling to perform a complex flow transformation with
> UIMA. I have multiple (N) CASes with some fulltext search results.
> I chop these search results to sentences and would like to pick the top
> M sen
not be dropped?
>
> Cheers,
>
> -- Richard
>
> On 06.09.2015, at 15:58, Eddie Epstein wrote:
>
> > Hi Torsten,
> >
> > The documentation says ...
> >
> > public FinalStep(boolean aForceCasToBeDropped)
> >
> > Creates a new FinalStep, and may
Hi Torsten,
The documentation says ...
public FinalStep(boolean aForceCasToBeDropped)
Creates a new FinalStep, and may indicate that a CAS should be dropped.
This can only be used for CASes that are produced internally to the
aggregate.
It is an error to attempt to drop a CAS that was passed as input to the aggregate.
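As a sketch of how FinalStep(true) is used from a custom flow controller (assuming uimaj-core on the classpath; the delegate key and the drop test are illustrative, standing in for the drop-marker FeatureStructure discussed in this thread):

```java
import org.apache.uima.cas.CAS;
import org.apache.uima.flow.CasFlowController_ImplBase;
import org.apache.uima.flow.CasFlow_ImplBase;
import org.apache.uima.flow.FinalStep;
import org.apache.uima.flow.Flow;
import org.apache.uima.flow.SimpleStep;
import org.apache.uima.flow.Step;

public class DroppingFlowController extends CasFlowController_ImplBase {
  @Override
  public Flow computeFlow(CAS cas) {
    return new DroppingFlow();
  }

  static class DroppingFlow extends CasFlow_ImplBase {
    private boolean routed = false;

    @Override
    public Step next() {
      if (shouldDrop(getCas())) {
        // true forces the CAS to be dropped; legal only for CASes
        // produced inside the aggregate (e.g. by a CasMultiplier)
        return new FinalStep(true);
      }
      if (!routed) {
        routed = true;
        return new SimpleStep("MyConsumer"); // delegate key is illustrative
      }
      return new FinalStep();
    }

    private static boolean shouldDrop(CAS cas) {
      // illustrative: a real flow would look for the marker
      // FeatureStructure the filter added
      return false;
    }
  }
}
```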
l the
> other
> nodes.
> Maybe it will save someone like me a couple of hours.
>
> Thanks again and cheers,
> Sergii
>
> On Wed, Jul 22, 2015 at 3:02 PM, Eddie Epstein
> wrote:
>
> > Hi Sergii,
> >
> > The ducc_runtime tree needs to be installed on a
Hi Sergii,
The ducc_runtime tree needs to be installed on a shared filesystem
that all DUCC nodes have mounted in the same location. Just install
the ducc runtime once from the DUCC head node. All other DUCC
nodes simply need to have the mounted filesystem and common user
accounts with identical u
Hi Petr,
Good comments which will likely generate lots of responses.
For now please see comments on scaleout below.
On Thu, Jul 9, 2015 at 6:52 PM, Petr Baudis wrote:
> * UIMAfit is not part of core UIMA and UIMA-AS is not part of core
> UIMA. It seems to me that UIMA-AS is doing things
eline instances in separate threads? UIMA-AS would do
this by specifying N instances of a synchronous top-level aggregate.
Eddie
On Wed, May 20, 2015 at 8:49 AM, Petr Baudis wrote:
> Hi!
>
> On Wed, May 20, 2015 at 07:56:33AM -0400, Eddie Epstein wrote:
> > Parallel-step curre
Parallel-step currently only works with remote delegates. The other
approach, using CasMultipliers, allows an arbitrary amount of parallel
processing in-process. A CM would create a separate CAS for each delegate
intended to run in parallel, and use a feature structure to hold a unique
identifier
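A sketch of that CM pattern (assuming uimaj-core on the classpath; the delegate names are illustrative, and the comment marks where the identifier FeatureStructure from a custom type system would go):

```java
import org.apache.uima.analysis_component.JCasMultiplier_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;

public class FanOutMultiplier extends JCasMultiplier_ImplBase {
  // Delegate identifiers are illustrative
  private static final String[] TARGETS = { "DelegateA", "DelegateB" };
  private JCas input;
  private int next = TARGETS.length; // nothing to emit until process()

  @Override
  public void process(JCas jcas) {
    input = jcas;
    next = 0;
  }

  @Override
  public boolean hasNext() {
    return next < TARGETS.length;
  }

  @Override
  public JCas next() throws AnalysisEngineProcessException {
    JCas child = getEmptyJCas();
    child.setDocumentText(input.getDocumentText());
    // In a real CM, also add a FeatureStructure (custom type) holding
    // TARGETS[next] so a custom flow controller can route each CAS to
    // exactly one delegate, and a downstream component can merge results.
    next++;
    return child;
  }
}
```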
process_dd. But
> How??
>
> Thanks in advanced.
>
> Reshu.
>
>
> On 05/01/2015 03:28 AM, Eddie Epstein wrote:
>
>> The simplest way of vertically scaling a Job process is to specify the
>> analysis pipeline using core UIMA descriptors and then using
>> --pro
The simplest way of vertically scaling a Job process is to specify the
analysis pipeline using core UIMA descriptors and then using
--process_thread_count to specify how many copies of the pipeline to
deploy, each in a different thread. No use of UIMA-AS at all. Please check
out the "Raw Text Proce
UIMA-AS has example deployment descriptors using placeholders for the
broker: ${defaultBrokerURL}
If these placeholders are used and the user doesn't specify a value for the
Java property "defaultBrokerURL" then some code in UIMA-AS will use a
default value of tcp://localhost:61616. That is the onl
e me
> consider implementing a custom multithreaded collection processor but I
> wanted to avoid this.
>
> Hope this clarifies what I am trying to do. Cheers :)
>
> > On 24 Apr 2015, at 16:50 , Eddie Epstein wrote:
> >
> > Can you give more details on the overall pipel
Can you give more details on the overall pipeline deployment? The initial
description mentions a CPE and it mentions services. The CPE was created
before flow controllers or CasMutipliers existed and has no support of
them. Services could be Vinci services for the CPE or UIMA-AS services or
???
On
The CPE runs pipeline threads in parallel, not necessarily CAS processors.
In a CPE descriptor, generally all non-CasConsumer components make up the
pipeline.
Change the following line to indicate how many pipeline threads to run, and
make sure the casPoolSize is the number of threads + 2.
Eddie
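As a sketch, the setting lives on the CPE descriptor's casProcessors element (attribute values here are illustrative, following the threads + 2 rule above):

```xml
<!-- 8 pipeline threads; CAS pool sized at threads + 2 -->
<casProcessors casPoolSize="10" processingUnitThreadCount="8">
  <!-- CAS processor entries unchanged -->
</casProcessors>
```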
On
destroy()
>>>> methods
>>>> in UIMA-AS are not called.
>>>> There should be some evidence in JP logs at the very end. Look for
>>>> something like this:
>>>>
>>>> Process Received a Message. Is Process target for message:tr
Hi Silvestre,
An aggregate deployed with UIMA-AS can be used to run delegate annotators
in parallel, with a few restrictions.
- the aggregate must be deployed as async=true
- the parallel delegates must each be running in remote processes
- the delegates must not modify preexisting FS
As Jens
Hi Reshu,
On Fri, Dec 19, 2014 at 12:26 AM, reshu.agarwal
wrote:
>
> Hi,
>
> Is there any problem if one Agent node is on Physical(Master) and one
> agent node is on virtual?
>
> I am running a job which is having avg processing timing of 20 min when I
> have configured a single machine DUCC (ph
I think that is not supported directly. One could use the CasCopier to copy
the view(s) of interest to a new, empty CAS and serialize to an XMI file from
that.
Eddie
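A sketch of that suggestion (assuming uimaj-core on the classpath; the type system description parameter and file handling are illustrative):

```java
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasSerializer;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.util.CasCopier;
import org.apache.uima.util.CasCreationUtils;

public class SingleViewXmiWriter {
  /**
   * Copy one view of srcCas into a fresh CAS built from the same
   * type system description, then serialize that CAS to an XMI file.
   */
  public static void write(CAS srcCas, TypeSystemDescription tsd,
                           String viewName, String outFile) throws Exception {
    CAS target = CasCreationUtils.createCas(tsd, null, null);
    CasCopier copier = new CasCopier(srcCas, target);
    // Copies the named view (including its sofa/document text) into
    // the same-named view of the target CAS
    copier.copyCasView(srcCas.getView(viewName), true);
    try (OutputStream os = new FileOutputStream(outFile)) {
      XmiCasSerializer.serialize(target, os);
    }
  }
}
```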
On Wed, Dec 3, 2014 at 9:04 AM, Jakob Sahlström
wrote:
> Hi,
>
> I'm dealing with a CAS with multiple views, namely a Gold View and
On Sun, Nov 30, 2014 at 11:48 AM, Simon Hafner
wrote:
> 2014-11-30 7:25 GMT-06:00 Eddie Epstein :
> > On Sat, Nov 29, 2014 at 4:46 PM, Simon Hafner
> wrote:
> >
> >> I've thrown some numbers at it (doubling each) and it's running at
> >> comfort
On Sat, Nov 29, 2014 at 4:46 PM, Simon Hafner wrote:
> I've thrown some numbers at it (doubling each) and it's running at
> comfortable 125 procs. However, at about 6.1k of 6.5k items, the procs
> drop down to 30.
>
125 processes at 8 threads each = 1000 active pipelines. How many CPU cores
are these
on - the BaseCap -
> > so a max of 16 will be scheduled for it, subject to fair-share
> constraint.
> >
> > 17 Nov 2014 15:07:38,880 INFO RM.RmJob - */getPrjCap/* 208927 bobuser
> O 2
> > T 343171 NTh 128 TI 143171 TR 6748.601431980907 R 1.8967e-02 QR 5043 P
> 6509
>
-28 10:45 GMT-06:00 Eddie Epstein :
> > DuccCasCC component has presumably created
> > /home/ducc/analysis/txt.processed/5911.txt_0_processed.zip_temp and
> written
> > to it?
> I don't know, the _temp file doesn't exist anymore.
>
> > Did you run this s
DuccCasCC component has presumably created
/home/ducc/analysis/txt.processed/5911.txt_0_processed.zip_temp and written
to it?
Did you run this sample job in something other than cluster mode?
On Fri, Nov 28, 2014 at 10:23 AM, Simon Hafner
wrote:
> When running DUCC in cluster mode, I get "Re
Those are the only two log files? Should be a ducc.log (probably with no
more info than on the console), and either one or both of the job driver
logfiles: jd.out.log and jobid-JD-jdnode-jdpid.log. If for some reason the
job driver failed to start, check the job driver agent log (the agent
managing
On Thu, Nov 20, 2014 at 10:01 PM, D. Heinze wrote:
> Eddie... thanks. Yes, that sounds like I would not have the advantage of
> DUCC managing the UIMA pipeline.
>
Depends on the definition of "managing". DUCC manages the lifecycle of
analytic pipelines running as job processes and as services.
Ooops, in this case the web server would be feeding the service directly.
On Thu, Nov 20, 2014 at 9:04 PM, Eddie Epstein wrote:
> The preferred approach is to run the analytics as a DUCC service, and have
> an application driver that feeds the service instances with incoming data.
The preferred approach is to run the analytics as a DUCC service, and have
an application driver that feeds the service instances with incoming data.
This service would be a scalable UIMA-AS service, which could have as
many instances as are needed to keep up with the load. The driver would
use the
On Tue, Nov 18, 2014 at 1:05 AM, reshu.agarwal
wrote:
>
> Hi,
>
> I am a bit confused. Why do we need an un-managed reservation? Suppose we give 5GB
> Memory size to this reservation. Can this RAM be consumed by any process if
> required?
>
Basically yes. See more info about "Rogue Process" in the duccb
DuccRawTextSpec.job specifies that each job process (JP)
run 8 analytic pipeline threads. So for this job with 100 work
items, no more than 13 JPs would ever be started.
After successful initialization of the first JP, DUCC begins scaling
up the number of JPs using doubling. During JP scale up the
On Fri, Nov 14, 2014 at 8:11 PM, Simon Hafner wrote:
> So to run effectively, I would need more memory, because the job wants
> two shares? ... Yes. With a larger node it works. What would be a
> reasonable memory size for a ducc node?
>
> Really depends on the application code. Quoting from the