Re: Anybody using UIMA DUCC? Care to give a hand?

2022-11-11 Thread Eddie Epstein
Hi Richard, Our last DUCC cluster was retired earlier this year. I would vote for retirement. Regards, Eddie On Fri, Nov 11, 2022 at 10:38 AM Richard Eckart de Castilho wrote: > Hi all, > > is anybody using UIMA DUCC? > > If yes, it would be great if you could lend us a hand in preparing a new

Re: Recover or invalidate Collection Reader CAS

2022-08-26 Thread Eddie Epstein
Fri, Aug 26, 2022 at 10:01 AM Daniel Cosio wrote: > Any chance you could point me to where this is defined in the docs? > Daniel Cosio > dcco...@gmail.com > > > > > On Aug 26, 2022, at 8:52 AM, Eddie Epstein wrote: > > > > UIMA-AS supports timeouts for remot

Re: Recover or invalidate Collection Reader CAS

2022-08-26 Thread Eddie Epstein
e connection that communicates the CAS releases..I was > >> wonder if there was any way of getting the temp queue connection and > >> sending the message back to return the CAS.. Possible in a shutdown > hook. > >> > >> > >> Daniel Cosio > >>

Re: Recover or invalidate Collection Reader CAS

2022-08-25 Thread Eddie Epstein
Daniel, is this again a uima-as deployment? If so, since the OS kills processes, is it some remote AE being killed? Eddie On Wed, Aug 24, 2022 at 10:04 AM Daniel Cosio wrote: > Hi, I have some instances where the OS has killed a pipeline to recover > resources.. When this happens the pipeline n

Re: Towards a (new) UIMA CAS JSON format - feedback welcome!

2021-08-26 Thread Eddie Epstein
Richard, Looks promising! I put a few comments in the drive document. Regards, Eddie On Fri, Aug 20, 2021 at 5:27 AM Richard Eckart de Castilho wrote: > Hi all, > > to facilitate working with UIMA CAS data and to promote interoperability > between different UIMA implementations, a new UIMA JSON

Re: UIMA DUCC slow processing

2020-06-15 Thread Eddie Epstein
r, asynchronous calls weren't registering and the >CasConsumer would return without writing anything in the Elasticsearch >index. I checked the job logs and couldn't find any error messages. > > I'm sorry for another long message and I truly am grateful to you for

Re: UIMA DUCC slow processing

2020-06-14 Thread Eddie Epstein
/tutorials_and_users_guides.html#ugr.tug.cpe On Sun, Jun 14, 2020 at 7:06 PM Eddie Epstein wrote: > In this case the problem is not DUCC, rather it is the high overhead of > opening small files and sending them to a remote computer individually. I/O > works much more efficiently with larger

Re: UIMA DUCC slow processing

2020-06-14 Thread Eddie Epstein
> Hello, > > Thank you very much for your response and even more so for the detailed > explanation. > > So, if I understand it correctly, DUCC is more suited for scenarios where > we have large input documents rather than many small ones? > > Thank you once again. > > O

Re: UIMA DUCC slow processing

2020-06-12 Thread Eddie Epstein
's no heavy > I/O processing is happening in the code. > > Any ideas please? > > Thank you. > > On 2020/05/18 12:47:41, Eddie Epstein wrote: > > Hi, > > > > Removing the AE from the pipeline was a good idea to help isolate the > > bottleneck. The othe

Re: UIMA DUCC slow processing

2020-05-18 Thread Eddie Epstein
Hi, Removing the AE from the pipeline was a good idea to help isolate the bottleneck. The other two most likely possibilities are the collection reader pulling from elastic search or the CAS consumer writing the processing output. DUCC Jobs are a simple way to scale out compute bottlenecks across

Re: Use of CASes with sofaURI?

2019-10-25 Thread Eddie Epstein
Besides very large documents and remote data, another major motivation was for non-text data, such as audio or video. Eddie On Fri, Oct 25, 2019 at 1:33 PM Marshall Schor wrote: > Hi, > > Here's what I vaguely remember was the driving use-cases for the sofa as a > URI. > > 1. The main use case

Re: DUCC without shared file system

2019-09-05 Thread Eddie Epstein
Unless all CLI/API submissions are done from the head node, DUCC still has a dependency on a shared filesystem to authenticate such requests for configurations where user processes run with user credentials. On Wed, Sep 4, 2019 at 9:41 AM Lou DeGenaro wrote: > The DUCC Book for the Apache-UIMA D

Re: Customizing Sample Pinger of Uima

2019-05-10 Thread Eddie Epstein
> Florian > > On Mi, Mai 1, 2019 at 12:13 AM, Eddie Epstein > wrote: > > Hi Florian, > > > > Interesting questions. First, yes the intended behavior is to leave 1 > > instance running. Services are either started by having > > autostart=true, or > > by

Re: Customizing Sample Pinger of Uima

2019-04-30 Thread Eddie Epstein
Hi Florian, Interesting questions. First, yes the intended behavior is to leave 1 instance running. Services are either started by having autostart=true, or by a job or another service having a dependency on the service. Logically it could be possible to let a pinger stop all instances and have th

Re: DUCC Job does not work on any other language except English

2018-08-04 Thread Eddie Epstein
Hi Rohit, Hopefully this is something fairly easy to fix. Thanks for the information. Eddie On Thu, Aug 2, 2018 at 2:46 AM, Rohit Yadav wrote: > Hi, > > I've tried running DUCC Job for various languages but all the content is > replaced by (Question Mark) > > But for english it works fine.

Re: Restrict resource of a DUCC node

2018-07-25 Thread Eddie Epstein
Hi Erik, Your user ID has hit the limit for "max user processes" on the machine. Note that processes and threads are the same in Linux, and a single JVM may spawn many threads (for example many GC threads :) This parameter used to be unlimited for users, but there was a change in Red Hat distros t

Re: DUCC services statistics

2018-07-19 Thread Eddie Epstein
Hi, As you may see, the default DUCC pinger for UIMA-AS services scrapes JMX stats from the service broker to report the number of producers and consumers, the queue depth and a few other things. This pinger also does stat reset operations on each call to the pinger, I think every 2 minutes. A cust

Re: High CPU Load on Job Driver

2018-07-19 Thread Eddie Epstein
Hi Rohit, What is the collection reader running in the job driver doing? Look at the memory use (RSS) value for the job driver on the job details page. If nothing is logged (be sure to check ducc.log file) my guess would be that the JD ran out of RAM in its cgroup and was killed. The JD cgroup siz

Re: run existing AE instance on different view

2018-07-10 Thread Eddie Epstein
I think the UIMA code uses the annotator context to map the _InitialView and the context remains static for the life of the annotator. Replicating annotators to handle different views has been used here too, but agree it is ugly. If the annotator code can be changed, then one approach would be to

Re: Problem in running DUCC Job for Arabic Language

2018-07-05 Thread Eddie Epstein
ere the cas is generated > by DUCC. > This can also be an issue of the environment (Language) of DUCC because the > default language is English. > > Best Regards > Rohit > > On 2018/07/03 13:11:50, Eddie Epstein wrote: > > Rohit, > > > > Before sending the d

Re: Problem in running DUCC Job for Arabic Language

2018-07-03 Thread Eddie Epstein
Rohit, Before sending the data into jcas if i force encode it :- > > String content2 = null; > content2 = new String(content.getBytes("UTF-8"), "ISO-8859-1"); > jcas.setDocumentText(content2); > Where is this code, in the job CR? > > And when i go in my first annotator i force decode it:- > >
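The force-encode and force-decode pair quoted in this message can be checked in isolation. The following is a minimal sketch, assuming plain Java strings instead of a real JCas; the class and method names are hypothetical and not part of UIMA or DUCC. It shows that the pair is lossless only when both halves are applied exactly once, so it hides a charset problem between the two steps rather than fixing it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of UIMA or DUCC.
final class Latin1Wrap {
    // "Force encode": reinterpret the UTF-8 bytes of the text as ISO-8859-1 characters.
    static String wrap(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // "Force decode": turn the ISO-8859-1 characters back into bytes and read them as UTF-8.
    static String unwrap(String wrapped) {
        return new String(wrapped.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627"; // five Arabic letters
        String wrapped = wrap(arabic);
        // Each of these letters takes two bytes in UTF-8, so the wrapped form is twice as long.
        System.out.println("wrapped length: " + wrapped.length());
        System.out.println("round trip intact: " + unwrap(wrapped).equals(arabic));
    }
}
```

Any component that reads the text between the two steps sees the wrapped form, not the Arabic text, which is one reason to look for the actual charset mismatch instead.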

Re: Problem in running DUCC Job for Arabic Language

2018-06-18 Thread Eddie Epstein
Hi Rohit, In a DUCC job the CAS created by the user's CR in the Job Driver is serialized into cas.xmi format, transported to the Job Process where it is deserialized and given to the user's analytics. Likely the problem is in CAS serialization or deserialization, perhaps due to the active LANG environme
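The question-mark symptom reported in this thread is what Java produces when text is encoded with a charset that cannot represent it. The sketch below is an illustration under that assumption only; it does not use DUCC or the CAS serializers, the class name is hypothetical, and whether this was the actual cause in the reported job is not confirmed by the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only exercises java.lang.String, not DUCC.
final class QuestionMarkDemo {
    // Encode the text to bytes and decode it again with the same charset.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for every character it cannot map.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062D\u0628\u0627";
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("via US-ASCII: " + roundTrip(arabic, StandardCharsets.US_ASCII));
        System.out.println("via UTF-8 intact: " + roundTrip(arabic, StandardCharsets.UTF_8).equals(arabic));
    }
}
```

Printing the default charset inside both the Job Driver and the Job Process is a cheap way to see whether the two JVMs agree.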

Re: [External Sender] Re: Runtime Parameters to Annotators Running as Services

2018-06-04 Thread Eddie Epstein
but in the processing of refactoring my > CollectionReader I was trying to slim it down and just have it pass > document identifiers to the aggregate analysis engine. I'm fuzzy on whether > 2) is an option and if so how to implement. > > -John > > > ______

Re: Runtime Parameters to Annotators Running as Services

2018-05-31 Thread Eddie Epstein
I may not understand the scenario. For meta-data that would modify the behavior of the analysis, for example changing what analysis is run for a CAS, putting it into the CAS itself is definitely recommended. The example above is for the UIMA service to access the artifact itself from a remote so

Re: Batch Checkpoints with DUCC?

2018-05-16 Thread Eddie Epstein
he pipeline and finally cached by the CC. > Then, I can somehow (have to read this up) have the work item CAS sent to > the CC as the effective “batch processing complete” signal. > > Is that correct? > > > On 15. May 2018, at 20:50, Eddie Epstein wrote: > > > > Hi Erik

Re: Batch Checkpoints with DUCC?

2018-05-15 Thread Eddie Epstein
Hi Erik, There is a brief discussion of this in the duccbook in section 9.3 ... https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1880009.3 In particular, the 3rd option, "Flushing cached data". This assumes that the batch of work to be flushed is represented by each workitem CAS. Regar

Re: DUCC job Issue

2018-04-20 Thread Eddie Epstein
DUCC is designed for multi-user environments, and in particular tries to balance resources fairly quickly across users in a fair-share allocation. The default mechanism used is preemption. To eliminate preemption specify a "non-preemptable" scheduling class for jobs such as "fixed". Other options

Re: DUCC and CAS Consumers

2018-04-16 Thread Eddie Epstein
Hi, Are you specifying to DUCC all three component descriptors: CM, AE and CC? I'm guessing not, but rather your CM is included in the AAE aggregate given to DUCC as the AE_Descriptor. Can you confirm? Eddie On Fri, Apr 13, 2018 at 8:21 AM, Erik Fäßler wrote: > Hi Eddie, thanks for the reply!

Re: DUCC and CAS Consumers

2018-04-11 Thread Eddie Epstein
Hi Erik, DUCC jobs can scale out user's components in two ways, horizontally by running multiple processes (process_deployments_max) and vertically by running the pipeline defined by the CM, AE and CC components in multiple threads (process_pipeline_count). Since the constructed top AAE is desig

Re: Exception: UIMA - Annotator Processing Failed

2018-02-28 Thread Eddie Epstein
Hi, An annotation feature structure can only be added to the index of the view it was created in. It looks like the application at edu.cmu.lti.oaqa.baseqa.evidence.concept.PassageConceptRecognizer.process(PassageConceptRecognizer.java:96) is trying to add an annotation created in one view to th

Re: Completion event for replicated components

2018-01-18 Thread Eddie Epstein
There will be a new mechanism to help do this in the upcoming uima-as-2.10.2 version. This version includes an additional listener on every service that can be addressed individually. A uima-as client could then iterate thru all service instances calling CPC, assuming the client knew about all exis

Re: Ducc Service Registration Error

2017-11-20 Thread Eddie Epstein
Hi, Annotator class "org.orkash.annotator.AnalysisEngine.TreebankChunkerAnnotator" was not found ... means that this class is not in the classpath specified by the registration. Eddie On Mon, Nov 20, 2017 at 9:17 AM, priyank sharma wrote: > Hi! > > When i am registering the service on the ducc

Re: DUCC's job goes into infintie loop

2017-11-13 Thread Eddie Epstein
because of the Java Heap Space? > > Please suggest something as there are nothing in the logs regarding to my > problem. > > Thanks and Regards > Priyank Sharma > > On Friday 10 November 2017 09:00 PM, Eddie Epstein wrote: > >> Hi Priyank, >> >> Looks like

Re: DUCC's job goes into infintie loop

2017-11-10 Thread Eddie Epstein
Hi Priyank, Looks like you are running DUCC v2.0.x. There are so many bugs fixed in subsequent versions, the latest being v2.2.1. Newer versions have a ducc_update command that will upgrade an existing install, but given all the changes since v2.0.x I suggest a clean install. Eddie On Fri, Nov 1

Re: UIMA analysis from a database

2017-09-15 Thread Eddie Epstein
> computing host, and it seems like Hadoop/Spark are much more likely to be > supported there. > > David Fox > > On 9/15/17, 1:57 PM, "Eddie Epstein" wrote: > > >There are a few DUCC features that might be of particular interest for > >scaling out UIMA

Re: UIMA analysis from a database

2017-09-15 Thread Eddie Epstein
There are a few DUCC features that might be of particular interest for scaling out UIMA analytics. - all user code for batch processing continues to use the existing UIMA component model: collection readers, cas multipliers, analysis engines, and cas consumers.** - DUCC supports assembling and d

Re: DUCC job automatically fails and gives Reason,or extraordinary status as cancelled by User | DUCC Version: 2.0.1

2017-05-17 Thread Eddie Epstein
How long does the job run before stopping? Cancelled by user could come if the job is submitted with cancel_on_interrupt and the client submitting the job were stopped. Eddie On Tue, May 16, 2017 at 8:31 AM, Lou DeGenaro wrote: > Dunno why the connection would be refused. Are the JD and JP on

Re: Synchonizing Batches AE and StatusCallbackListener

2017-04-21 Thread Eddie Epstein
Hi Erik, A few words about DUCC and your application. DUCC is a cluster controller that includes a resource manager and 3 applications: batch processing, long running services and singleton processes. The batch processing application consists of a user's CollectionReader which defines work items a

Re: Free instance of agreggate with cas multiplier in MultiprocessingAnalysisEngine

2016-11-09 Thread Eddie Epstein
quired. > > 2016-11-09 9:40 GMT-05:00, Eddie Epstein : > > Is behavior the same for single-threaded AnalysisEngine instantiation? > > > > On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera > > > wrote: > > > >> I have an aggregate analysis engine that conta

Re: Free instance of agreggate with cas multiplier in MultiprocessingAnalysisEngine

2016-11-09 Thread Eddie Epstein
Is behavior the same for single-threaded AnalysisEngine instantiation? On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera wrote: > I have an aggregate analysis engine that contains a casmultiplier > annotator. I instantiate this aggregate with the interface > UIMAFramework.produceAnalysisEngine(speci

Re: java.lang.ClassCastException with binary SerializationStrategy

2016-11-03 Thread Eddie Epstein
ith xmiCas serialization everything works fine. The client and > the input Cas have identical type system definitions, because i get > the cas from UimaAsynchronousEngine with the line > "asynchronousEngine.getCAS()", any idea of problem > > 2016-11-03 16:49 GMT-0

Re: java.lang.ClassCastException with binary SerializationStrategy

2016-11-03 Thread Eddie Epstein
Hi, Binary serialization for a service call only works if the client and service have identical type system definitions. Have you confirmed everything works with the default XmiCas serialization? Eddie On Thu, Nov 3, 2016 at 3:51 PM, nelson rivera wrote: > I want to consume a service uima-as a

Re: UIMA DUCC limit max memory of node

2016-11-01 Thread Eddie Epstein
"Comment from > NodeMemInfoCollector.java: if running ducc in simulation mode skip memory > adjustment. Report free memory = fakeMemorySize". But I am not sure if we > can use this safely since it is for testing. > > So we basically want to give ducc an upper limit of usable memory. >

Re: UIMA DUCC limit max memory of node

2016-10-31 Thread Eddie Epstein
Hi Daniel, For each node Ducc sums RSS for all "system" user processes and excludes that from Ducc usable memory on the node. System users are defined by a ducc.properties setting with default value: ducc.agent.node.metrics.sys.gid.max = 500 Ducc's simulation mode is intended for creating a scale

Re: Uima Ducc Service restart on timeout

2016-10-29 Thread Eddie Epstein
Hi Wahed, One approach would be to configure the service itself to self-destruct if processing exceeds a processing threshold. UIMA-AS error configuration does support timeouts for remote delegates, but not for in-process delegates. So this would require starting a timer thread in the annotator th
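The timer-thread idea described in this message can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the class name ProcessWatchdog and its methods are hypothetical, it is not a UIMA-AS facility, and the action taken on timeout is left to the caller. In an annotator, the work passed to guard() would be the body of process(), and the timeout action could terminate the JVM so that the service instance gets restarted.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical watchdog; not part of UIMA-AS or DUCC.
final class ProcessWatchdog {
    // A single daemon timer thread, so the watchdog never keeps the JVM alive by itself.
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "process-watchdog");
            t.setDaemon(true);
            return t;
        });
    private final long timeoutMillis;
    private final Runnable onTimeout;

    ProcessWatchdog(long timeoutMillis, Runnable onTimeout) {
        this.timeoutMillis = timeoutMillis;
        this.onTimeout = onTimeout;
    }

    // Runs the work on the calling thread. If it is still running when the
    // timeout expires, onTimeout fires on the timer thread; otherwise the
    // pending alarm is cancelled once the work returns.
    void guard(Runnable work) {
        ScheduledFuture<?> alarm = timer.schedule(onTimeout, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            work.run();
        } finally {
            alarm.cancel(false);
        }
    }

    public static void main(String[] args) {
        // In a real service the action might be: () -> Runtime.getRuntime().halt(1)
        ProcessWatchdog dog = new ProcessWatchdog(1000, () -> System.err.println("processing exceeded threshold"));
        dog.guard(() -> System.out.println("fast work finished before the alarm"));
    }
}
```

The timeout action runs on a separate thread, so it still fires when the processing thread is stuck in a loop or blocked on I/O.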

Re: C++/Python annotators in Eclipse on Mac OS

2016-05-06 Thread Eddie Epstein
Hi Sean, There are example .mak files for compiling and creating shared libraries from C++ annotator code. A couple of env parameters need to be set for the build. It should be straightforward to configure eclipse CDT to build an annotator and a C++ application calling annotators from a makefile.

Re: UIMACPP and multi-threading

2016-04-28 Thread Eddie Epstein
> > -- > Benjamin De Boe | Product Manager > M: +32 495 19 19 27 | T: +32 2 464 97 33 > InterSystems Corporation | http://www.intersystems.com > > -Original Message- > From: Eddie Epstein [mailto:eaepst...@gmail.com] > Sent: Tuesday, April 26, 2016 4:58 AM > To: user@

Re: UIMACPP and multi-threading

2016-04-25 Thread Eddie Epstein
hanks, > benjamin > > -- > Benjamin De Boe | Product Manager > M: +32 495 19 19 27 | T: +32 2 464 97 33 > InterSystems Corporation | http://www.intersystems.com > > -Original Message- > From: Eddie Epstein [mailto:eaepst...@gmail.com] > Sent: Thursday, April 7

Re: UIMACPP and multi-threading

2016-04-07 Thread Eddie Epstein
ed error > > (5002) > > at org.apache.uima.uimacpp.UimacppEngine.destroyJNI(Native Method) > > at > org.apache.uima.uimacpp.UimacppEngine.destroy(UimacppEngine.java:304) > > at > org.apache.uima.uimacpp.UimacppAnalysisComponent.destroy(UimacppAnalysisComponent.java

Re: UIMACPP and multi-threading

2016-04-04 Thread Eddie Epstein
Hi Benjamin, UIMACPP is thread safe, as is the JNI interface. To confirm, I just created a UIMA-AS service with 10 instances of DaveDetector, and fed the service 800 CASes with up to 10 concurrent CASes at any time. It is not the case with DaveDetector, but at annotator initialization some analyt

Re: DUCC: Unable to do "Fixed" type of Reservation

2016-03-31 Thread Eddie Epstein
Hi Reshu, Reserve type allows users to allocate an unconstrained resource. Because reserve allocations are not constrained by cgroup containers, in v2.x these allocations were restricted to be an entire machine. Fixed type allocations, which are always associated with a specific user process, hav

Re: DUCC 2.0.1 : JP Http Client Unable to Communicate with JD

2016-01-12 Thread Eddie Epstein
Hi Reshu, This is caused by the CollectionReader running in the JobDriver putting character data in the work item CAS that cannot be XML serialized. DUCC needs to do better in making this problem clear. Two choices to fix this: 1) have the CR screen for illegal characters and not put them in the
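The first choice mentioned here, having the CR screen for illegal characters, can be sketched as follows. This is a minimal sketch assuming the Char production of XML 1.0 is the relevant rule; the class and method names are hypothetical and not DUCC or UIMA API. A collection reader would call screen() on the text before putting it into the work item CAS.

```java
// Hypothetical screening helper; not part of DUCC or UIMA.
final class XmlCharScreen {
    // True if the code point is allowed by the XML 1.0 "Char" production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces every code point that XML 1.0 cannot carry with the given character.
    // Every illegal code point occupies a single Java char, so the string length
    // and all character offsets stay the same.
    static String screen(String text, char replacement) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isXml10Char(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "a\u0000b\u000Bc";
        System.out.println(screen(dirty, ' '));
    }
}
```

Replacing instead of deleting keeps offsets stable, which matters if annotations are later created against the original text positions.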

Re: Uima-AS Cas merger with cas multiplier

2016-01-11 Thread Eddie Epstein
Hi Hemati, If all the components listed are delegates of a single AAE, and the AAE is deployed by UIMA-AS as "async", then by default only a single instance of each delegate will be instantiated. Does the UIMA-AS deployment descriptor specify more than one instance of any of the delegates? Regard

Re: DUCC 1.1.0- Remain in Completing state.

2016-01-05 Thread Eddie Epstein
Hi Reshu, Each DUCC machine has an agent responsible for starting and killing processes. There was a bug ( https://issues.apache.org/jira/browse/UIMA-4194 ) where the agent failed to issue a kill -9 against "hung" JPs when a job was stopping. The fix is in v2.0. Regards, Eddie On Tue, Jan 5, 20

Re: UIMA-DUCC installation with multiple machines

2015-11-30 Thread Eddie Epstein
Hi, Did you confirm that user ducc@ducc-head can do passwordless ssh to ducc-node-1? If so, running ./check_ducc from the admin folder should give some useful feedback about ducc-node-1. Eddie On Mon, Nov 30, 2015 at 5:14 AM, Sylvain Surcin wrote: > Hello, > > Despite experimenting for a few

Re: remote Analysis Engines freely available

2015-10-13 Thread Eddie Epstein
There are several remote AE samples in the UIMA-AS sdk, currently "Apache UIMA Version 2.6.0" download link at http://uima.apache.org/downloads.cgi. $UIMA_HOME/examples/deploy/as includes Deploy_MeetingDetectorTAE.xml Deploy_MeetingFinder.xml Deploy_RoomNumberAnnotator.xml After unpackin

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-06 Thread Eddie Epstein
> After installing DUCC 2.1 from binary when I click onDuccBook on > webserver,I got following error. > > HTTP ERROR: 404 > > Problem accessing /doc/duccbook.html. Reason: > > Not Found > > > Thanks and Regards, > Satya Nand Kanodia > > On 10/05/2015 07:01 PM, Eddie Epstein wrote: > >> 0 made a small change in cgconfig.conf, adding two lines to enable >> > >

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-05 Thread Eddie Epstein
_in_bytes memory.usage_in_bytes > memory.limit_in_bytes memory.move_charge_at_immigrate memory.use_hierarchy > > these are the permissions on /cgroup/ducc > > drwxr-xr-x 2 ducc root 0 Oct 5 09:29 . > > > Thanks and Regards, > Satya Nand Kanodia > > On 10/01/2015 07:49 PM,

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-01 Thread Eddie Epstein
memory.soft_limit_in_bytes release_agent ~$ ls -ld /cgroup/ducc/ drwxr-xr-x 2 ducc root 0 Sep 5 11:31 /cgroup/ducc/ On Thu, Oct 1, 2015 at 8:20 AM, Eddie Epstein wrote: > Well, please list the contents of /cgroups to confirm that the custom > cgconfig.conf is operating. > Eddie > > On Thu, Oct 1, 2

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-01 Thread Eddie Epstein
so written > in documentation.) > > anything else ? > > Thanks and Regards, > Satya Nand Kanodia > > On 09/30/2015 05:28 PM, Eddie Epstein wrote: > >> Hi Satya, >> >> There is a custom cgconfig.conf that has to be installed in /etc/ before >> starti

Re: C-Groups status remains off in web server after installing C-Groups

2015-09-30 Thread Eddie Epstein
's owner or permissions. It is having currently 644 > permissions. > > Thanks and Regards, > Satya Nand Kanodia > > On 09/29/2015 06:46 PM, Eddie Epstein wrote: > >> DUCC's /etc/cgconfig.conf specifies user=ducc to create cgroups. >> Is DUCC running as user=d

Re: C-Groups status remains off in web server after installing C-Groups

2015-09-29 Thread Eddie Epstein
DUCC's /etc/cgconfig.conf specifies user=ducc to create cgroups. Is DUCC running as user=ducc? Using sudo for cgreate testing suggests that the ducc userid is not being used. Eddie On Tue, Sep 29, 2015 at 3:12 AM, Satya Nand Kanodia < satya.kano...@orkash.com> wrote: > Hi, > > I am using CentOS

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
nless UIMA > would by default drop any CAS that has its only remaining view removed. > > Dropping the whole unit-of-work (the CAS) instead of stripping its content > appears to me a cleaner solution. > > -- Richard > > On 07.09.2015, at 17:45, Eddie Epstein wrote: > >

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
the document text, but > as far as I know the document text cannot be changed once it is set. > > Am 07/09/15 17:14 schrieb "Eddie Epstein" unter : > > >Can the filter in the INNER_AAE modify such CASes, perhaps > >by deleting data, that would result in the existing

Re: CAS merger/multiplier N:M mapping

2015-09-07 Thread Eddie Epstein
Petr, > > > (I'm somewhat tempted to cut my losses short (much too late) and > > > abandon UIMA flow control altogether, using only simple pipelines and > > > having custom glue code to connect these together, as it seems like > > > getting the flow to work in interesting cases is a huge time s

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
Can the filter in the INNER_AAE modify such CASes, perhaps by deleting data, that would result in the existing consumer effectively ignoring them? On Mon, Sep 7, 2015 at 11:08 AM, Zesch, Torsten wrote: > >The consumer does not have to be modified if the flow controller > >drops CASes marked to b

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
ion with a special FeatureStructure, but > this has the disadvantage that the consumer needs to be aware of that. > It would be easier if some CASes could simply be dropped. > I guess this could even be useful for flat workflows. > > -Torsten > > > Am 06/09/15 17:31 schrie

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
't get why. > > Cheers, > > -- Richard > > On 06.09.2015, at 17:14, Eddie Epstein wrote: > > > How about the filter adds a FeatureStructure indicating that the CAS > should > > be dropped. > > Then when the INNER_AAE returns the CAS, the flow controller in the >

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
e do explicitly not want certain CASes to continue the processing path. > > -- Richard > > On 06.09.2015, at 17:04, Eddie Epstein wrote: > > > Richard, > > > > In general the input CAS must continue down some processing path. > > Where is it stored and what trigg

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
returned by processAndOutputNewCASes does not contain the input CAS? > > Cheers, > > -- Richard > > On 06.09.2015, at 16:21, Eddie Epstein wrote: > > > Hi Richard, > > > > FinalStep() in a CasMultiplier aggregate means to stop further flow > > in

Re: CAS merger/multiplier N:M mapping

2015-09-06 Thread Eddie Epstein
Hi Petr On Sun, Sep 6, 2015 at 10:11 AM, Petr Baudis wrote: > Hi! > > I'm currently struggling to perform a complex flow transformation with > UIMA. I have multiple (N) CASes with some fulltext search results. > I chop these search results to sentences and would like to pick the top > M sen

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
not be dropped? > > Cheers, > > -- Richard > > On 06.09.2015, at 15:58, Eddie Epstein wrote: > > > Hi Torsten, > > > > The documentation says ... > > > > public FinalStep(boolean aForceCasToBeDropped) > > > > Creates a new FinalStep, and may

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
Hi Torsten, The documentation says ... public FinalStep(boolean aForceCasToBeDropped) Creates a new FinalStep, and may indicate that a CAS should be dropped. This can only be used for CASes that are produced internally to the aggregate. It is an error to attempt to drop a CAS that was p

Re: DUCC multi-node installation. Beginner's questions.

2015-07-23 Thread Eddie Epstein
l the > other > nodes. > Maybe it will save someone like me a couple of hours. > > Thanks again and cheers, > Sergii > > On Wed, Jul 22, 2015 at 3:02 PM, Eddie Epstein > wrote: > > > Hi Sergii, > > > > The ducc_runtime tree needs to be installed on a

Re: DUCC multi-node installation. Beginner's questions.

2015-07-22 Thread Eddie Epstein
Hi Sergii, The ducc_runtime tree needs to be installed on a shared filesystem that all DUCC nodes have mounted in the same location. Just install the ducc runtime once from the DUCC head node. All other DUCC nodes simply need to have the mounted filesystem and common user accounts with identical u

Re: UIMAj3 ideas

2015-07-10 Thread Eddie Epstein
Hi Petr, Good comments which will likely generate lots of responses. For now please see comments on scaleout below. On Thu, Jul 9, 2015 at 6:52 PM, Petr Baudis wrote: > * UIMAfit is not part of core UIMA and UIMA-AS is not part of core > UIMA. It seems to me that UIMA-AS is doing things

Re: Multi-threaded UIMA ParallelStep

2015-05-20 Thread Eddie Epstein
eline instances in separate threads? UIMA-AS would do this by specifying N instances of a synchronous top-level aggregate. Eddie On Wed, May 20, 2015 at 8:49 AM, Petr Baudis wrote: > Hi! > > On Wed, May 20, 2015 at 07:56:33AM -0400, Eddie Epstein wrote: > > Parallel-step curre

Re: Multi-threaded UIMA ParallelStep

2015-05-20 Thread Eddie Epstein
Parallel-step currently only works with remote delegates. The other approach, using CasMultipliers, allows an arbitrary amount of parallel processing in-process. A CM would create a separate CAS for each delegate intended to run in parallel, and use a feature structure to hold a unique identifier

Re: DUCC- process_dd

2015-05-01 Thread Eddie Epstein
process_dd. But > How?? > > Thanks in advanced. > > Reshu. > > > On 05/01/2015 03:28 AM, Eddie Epstein wrote: > >> The simplest way of vertically scaling a Job process is to specify the >> analysis pipeline using core UIMA descriptors and then using >> --pro

Re: DUCC- process_dd

2015-04-30 Thread Eddie Epstein
The simplest way of vertically scaling a Job process is to specify the analysis pipeline using core UIMA descriptors and then using --process_thread_count to specify how many copies of the pipeline to deploy, each in a different thread. No use of UIMA-AS at all. Please check out the "Raw Text Proce

Re: UIMA-AS and ActiveMQ ports

2015-04-27 Thread Eddie Epstein
UIMA-AS has example deployment descriptors using placeholders for the broker: ${defaultBrokerURL} If these placeholders are used and the user doesn't specify a value for the Java property "defaultBrokerURL" then some code in UIMA-AS will use a default value of tcp://localhost:61616. That is the onl
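The fallback behavior described in this message can be modeled with plain Java system properties. This is an illustrative sketch, not the actual UIMA-AS code: the class and method names are hypothetical, and only the placeholder, the property name "defaultBrokerURL" and the default value tcp://localhost:61616 come from the message.

```java
// Hypothetical model of the placeholder resolution; not the actual UIMA-AS implementation.
final class BrokerUrlResolver {
    static final String PLACEHOLDER = "${defaultBrokerURL}";
    static final String FALLBACK = "tcp://localhost:61616";

    // A literal broker URL is returned unchanged. The placeholder resolves to the
    // Java property "defaultBrokerURL" if it is set, and to the fallback otherwise.
    static String resolve(String brokerUrlFromDescriptor) {
        if (!PLACEHOLDER.equals(brokerUrlFromDescriptor)) {
            return brokerUrlFromDescriptor;
        }
        return System.getProperty("defaultBrokerURL", FALLBACK);
    }

    public static void main(String[] args) {
        // Run with -DdefaultBrokerURL=tcp://somehost:61617 to see the property take effect.
        System.out.println(resolve(PLACEHOLDER));
    }
}
```

Under this model, port 61616 only appears when the descriptor uses the placeholder and the property is left unset; an explicit URL in the descriptor or an explicit property value is used as given.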

Re: Error handling in flow control

2015-04-26 Thread Eddie Epstein
e me > consider implementing a custom multithreaded collection processor but I > wanted to avoid this. > > Hope this clarifies what I am trying to do. Cheers :) > > > On 24 Apr 2015, at 16:50 , Eddie Epstein wrote: > > > > Can you give more details on the overall pipel

Re: Error handling in flow control

2015-04-24 Thread Eddie Epstein
Can you give more details on the overall pipeline deployment? The initial description mentions a CPE and it mentions services. The CPE was created before flow controllers or CasMultipliers existed and has no support for them. Services could be Vinci services for the CPE or UIMA-AS services or ??? On

Re: UIMA CPE appears not to utilise more than a single thread

2015-04-13 Thread Eddie Epstein
The CPE runs pipeline threads in parallel, not necessarily CAS processors. In a CPE descriptor, generally all non-CasConsumer components make up the pipeline. Change the following line to indicate how many pipeline threads to run, and make sure the casPoolSize is number of threads +2. Eddie On

Re: Ducc Problems

2015-03-03 Thread Eddie Epstein
troy() >>>> methods >>>> in UIMA-AS are not called. >>>> There should be some evidence in JP logs at the very end. Look for >>>> something like this: >>>> >>>> Process Received a Message. Is Process target for message:tr

Re: Ruta parallel execution

2014-12-19 Thread Eddie Epstein
Hi Silvestre, An aggregate deployed with UIMA-AS can be used to run delegate annotators in parallel, with a few restrictions. - the aggregate must be deployed as async=true - the parallel delegates must each be running in remote processes - the delegates must not modify preexisting FS As Jens

Re: DUCC- Agent1 is on Physical and Agent2 is on virtual=Slow the job process timing

2014-12-19 Thread Eddie Epstein
Hi Reshu, On Fri, Dec 19, 2014 at 12:26 AM, reshu.agarwal wrote: > > Hi, > > Is there any problem if one Agent node is on Physical(Master) and one > agent node is on virtual? > > I am running a job which is having avg processing timing of 20 min when I > have configured a single machine DUCC (ph

Re: Serializing Specific View to XMI

2014-12-04 Thread Eddie Epstein
I think that is not supported directly. One could use the CasCopier to copy the view(s) of interest to a new, empty CAS and serialize to xmi file from that. Eddie On Wed, Dec 3, 2014 at 9:04 AM, Jakob Sahlström wrote: > Hi, > > I'm dealing with a CAS with multiple views, namely a Gold View and

Re: DUCC doesn't use all available machines

2014-11-30 Thread Eddie Epstein
On Sun, Nov 30, 2014 at 11:48 AM, Simon Hafner wrote: > 2014-11-30 7:25 GMT-06:00 Eddie Epstein : > > On Sat, Nov 29, 2014 at 4:46 PM, Simon Hafner > wrote: > > > >> I've thrown some numbers at it (doubling each) and it's running at > >> comfort

Re: DUCC doesn't use all available machines

2014-11-30 Thread Eddie Epstein
On Sat, Nov 29, 2014 at 4:46 PM, Simon Hafner wrote: > I've thrown some numbers at it (doubling each) and it's running at > comfortable 125 procs. However, at about 6.1k of 6.5k items, the procs > drop down to 30. > 125 processes at 8 threads each = 1000 active pipelines. How many CPU cores are these

Re: DUCC doesn't use all available machines

2014-11-28 Thread Eddie Epstein
on - the BaseCap - > > so a max of 16 will be scheduled for it, subject to fair-share > constraint. > > > > 17 Nov 2014 15:07:38,880 INFO RM.RmJob - */getPrjCap/* 208927 bobuser > O 2 > > T 343171 NTh 128 TI 143171 TR 6748.601431980907 R 1.8967e-02 QR 5043 P > 6509 >

Re: Ducc: Rename failed

2014-11-28 Thread Eddie Epstein
-28 10:45 GMT-06:00 Eddie Epstein : > > DuccCasCC component has presumably created > > /home/ducc/analysis/txt.processed/5911.txt_0_processed.zip_temp and > written > > to it? > I don't know, the _temp file doesn't exist anymore. > > > Did you run this s

Re: Ducc: Rename failed

2014-11-28 Thread Eddie Epstein
DuccCasCC component has presumably created /home/ducc/analysis/txt.processed/5911.txt_0_processed.zip_temp and written to it? Did you run this sample job in something other than cluster mode? On Fri, Nov 28, 2014 at 10:23 AM, Simon Hafner wrote: > When running DUCC in cluster mode, I get "Re

Re: DUCC org.apache.uima.util.InvalidXMLException and no logs

2014-11-27 Thread Eddie Epstein
Those are the only two log files? Should be a ducc.log (probably with no more info than on the console), and either one or both of the job driver logfiles: jd.out.log and jobid-JD-jdnode-jdpid.log. If for some reason the job driver failed to start, check the job driver agent log (the agent managing

Re: DUCC web server interfacing

2014-11-21 Thread Eddie Epstein
On Thu, Nov 20, 2014 at 10:01 PM, D. Heinze wrote: > Eddie... thanks. Yes, that sounds like I would not have the advantage of > DUCC managing the UIMA pipeline. > Depends on the definition of "managing". DUCC manages the lifecycle of analytic pipelines running as job processes and as services.

Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
Ooops, in this case the web server would be feeding the service directly. On Thu, Nov 20, 2014 at 9:04 PM, Eddie Epstein wrote: > The preferred approach is to run the analytics as a DUCC service, and have > an application driver that feeds the service instances with incoming data.

Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
The preferred approach is to run the analytics as a DUCC service, and have an application driver that feeds the service instances with incoming data. This service would be a scalable UIMA-AS service, which could have as many instances as are needed to keep up with the load. The driver would use the

Re: DUCC-Un-managed Reservation??

2014-11-18 Thread Eddie Epstein
On Tue, Nov 18, 2014 at 1:05 AM, reshu.agarwal wrote: > > Hi, > > I am bit confused. Why we need un-managed reservation? Suppose we give 5GB > Memory size to this reservation. Can this RAM be consumed by any process if > required? > Basically yes. See more info about "Rogue Process" in the duccb

Re: DUCC doesn't use all available machines

2014-11-17 Thread Eddie Epstein
DuccRawTextSpec.job specifies that each job process (JP) run 8 analytic pipeline threads. So for this job with 100 work items, no more than 13 JPs would ever be started. After successful initialization of the first JP, DUCC begins scaling up the number of JPs using doubling. During JP scale up the
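
The arithmetic behind the "no more than 13 JPs" figure can be sketched as follows. This is a simplified Java illustration, not DUCC code: the class and method names are hypothetical, and the doubling sequence assumes nothing else (fair share, available nodes, initialization failures) limits the scale-up.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: JpScaleUp is a hypothetical name, not part of DUCC.
final class JpScaleUp {

    /** Upper bound on useful job processes: ceil(workItems / threadsPerJp). */
    static int maxJobProcesses(int workItems, int threadsPerJp) {
        if (workItems < 1 || threadsPerJp < 1) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        return (workItems + threadsPerJp - 1) / threadsPerJp;
    }

    /** Idealized scale-up: start at one JP, double each step, stop at the cap. */
    static List<Integer> doublingSteps(int maxJps) {
        List<Integer> steps = new ArrayList<>();
        int current = 1;
        while (current < maxJps) {
            steps.add(current);
            current *= 2;
        }
        steps.add(maxJps); // final step is clamped to the cap
        return steps;
    }

    public static void main(String[] args) {
        int cap = maxJobProcesses(100, 8);      // 100 work items, 8 threads per JP
        System.out.println(cap);                // prints 13
        System.out.println(doublingSteps(cap)); // prints [1, 2, 4, 8, 13]
    }
}
```

With 100 work items and 8 pipeline threads per JP, a 13th process would have only 4 items left to work on, which is why more than 13 would never be useful.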

Re: DUCC stuck at WaitingForResources on an Amazon Linux

2014-11-15 Thread Eddie Epstein
On Fri, Nov 14, 2014 at 8:11 PM, Simon Hafner wrote: > So to run effectively, I would need more memory, because the job wants > two shares? ... Yes. With a larger node it works. What would be a > reasonable memory size for a ducc node? > > Really depends on the application code. Quoting from the
