Re: Anybody using UIMA DUCC? Care to give a hand?

2022-11-11 Thread Eddie Epstein
Hi Richard, Our last DUCC cluster was retired earlier this year. I would vote for retirement. Regards, Eddie On Fri, Nov 11, 2022 at 10:38 AM Richard Eckart de Castilho wrote: > Hi all, > > is anybody using UIMA DUCC? > > If yes, it would be great if you could lend us a hand in preparing a

Re: Recover or invalidate Collection Reader CAS

2022-08-26 Thread Eddie Epstein
On Fri, Aug 26, 2022 at 10:01 AM Daniel Cosio wrote: > Any chance you could point me to where this is defined in the docs? > Daniel Cosio > dcco...@gmail.com > > > > > On Aug 26, 2022, at 8:52 AM, Eddie Epstein wrote: > > > > UIMA-AS supports timeouts for re

Re: Recover or invalidate Collection Reader CAS

2022-08-26 Thread Eddie Epstein
on that communicates the CAS releases. I was > >> wondering if there was any way of getting the temp queue connection and > >> sending the message back to return the CAS. Possibly in a shutdown > hook. > >> > >> > >> Daniel Cosio > >> dcco...@g

Re: Towards a (new) UIMA CAS JSON format - feedback welcome!

2021-08-26 Thread Eddie Epstein
Richard, Looks promising! I put a few comments in the drive document. Regards, Eddie On Fri, Aug 20, 2021 at 5:27 AM Richard Eckart de Castilho wrote: > Hi all, > > to facilitate working with UIMA CAS data and to promote interoperability > between different UIMA implementations, a new UIMA JSON

Re: UIMA DUCC slow processing

2020-06-15 Thread Eddie Epstein
CasConsumer would return without writing anything in the Elasticsearch >index. I checked the job logs and couldn't find any error messages. > > I'm sorry for another long message and I truly am grateful to you for your > kind guidance. > > Thank you very much. > > On

Re: UIMA DUCC slow processing

2020-06-14 Thread Eddie Epstein
/tutorials_and_users_guides.html#ugr.tug.cpe On Sun, Jun 14, 2020 at 7:06 PM Eddie Epstein wrote: > In this case the problem is not DUCC, rather it is the high overhead of > opening small files and sending them to a remote computer individually. I/O > works much more efficiently with larg

Re: UIMA DUCC slow processing

2020-06-14 Thread Eddie Epstein
> Hello, > > Thank you very much for your response and even more so for the detailed > explanation. > > So, if I understand it correctly, DUCC is more suited for scenarios where > we have large input documents rather than many small ones? > > Thank you once again. > > O

Re: UIMA DUCC slow processing

2020-06-12 Thread Eddie Epstein
ening in the code. > > Any ideas please? > > Thank you. > > On 2020/05/18 12:47:41, Eddie Epstein wrote: > > Hi, > > > > Removing the AE from the pipeline was a good idea to help isolate the > > bottleneck. The other two most likely possibilities are the

Re: UIMA DUCC slow processing

2020-05-18 Thread Eddie Epstein
Hi, Removing the AE from the pipeline was a good idea to help isolate the bottleneck. The other two most likely possibilities are the collection reader pulling from Elasticsearch or the CAS consumer writing the processing output. DUCC Jobs are a simple way to scale out compute bottlenecks

Re: Use of CASes with sofaURI?

2019-10-25 Thread Eddie Epstein
Besides very large documents and remote data, another major motivation was for non-text data, such as audio or video. Eddie On Fri, Oct 25, 2019 at 1:33 PM Marshall Schor wrote: > Hi, > > Here's what I vaguely remember were the driving use cases for the sofa as a > URI. > > 1. The main use case

Re: DUCC without shared file system

2019-09-05 Thread Eddie Epstein
Unless all CLI/API submissions are done from the head node, DUCC still has a dependency on a shared filesystem to authenticate such requests for configurations where user processes run with user credentials. On Wed, Sep 4, 2019 at 9:41 AM Lou DeGenaro wrote: > The DUCC Book for the Apache-UIMA

Re: Customizing Sample Pinger of Uima

2019-04-30 Thread Eddie Epstein
Hi Florian, Interesting questions. First, yes the intended behavior is to leave 1 instance running. Services are either started by having autostart=true, or by a job or another service having a dependency on the service. Logically it could be possible to let a pinger stop all instances and have

Re: DUCC Job does not work on any other language except English

2018-08-04 Thread Eddie Epstein
Hi Rohit, Hopefully this is something fairly easy to fix. Thanks for the information. Eddie On Thu, Aug 2, 2018 at 2:46 AM, Rohit Yadav wrote: > Hi, > > I've tried running DUCC Job for various languages but all the content is > replaced by (Question Mark) > > But for english it works

Re: Restrict resource of a DUCC node

2018-07-25 Thread Eddie Epstein
Hi Erik, Your user ID has hit the limit for "max user processes" on the machine. Note that processes and threads are the same in Linux, and a single JVM may spawn many threads (for example many GC threads :) This parameter used to be unlimited for users, but there was a change in Red Hat distros
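The "max user processes" limit mentioned above is the value that `ulimit -u` reports on Linux. A minimal Python sketch for checking it from a script, assuming a Linux host where the standard `resource` module exposes `RLIMIT_NPROC`; the helper name `nproc_limit` is made up for illustration and is not part of DUCC:

```python
import resource


def nproc_limit():
    """Return the (soft, hard) limit on processes/threads for the current user.

    A value equal to resource.RLIM_INFINITY means the limit is unlimited.
    """
    return resource.getrlimit(resource.RLIMIT_NPROC)


soft, hard = nproc_limit()
for name, value in (("soft", soft), ("hard", hard)):
    shown = "unlimited" if value == resource.RLIM_INFINITY else value
    print(f"max user processes ({name}): {shown}")
```

Since threads count against this limit, a low soft value is worth checking on every node that runs multi-threaded JVMs under the same user ID.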

Re: DUCC services statistics

2018-07-19 Thread Eddie Epstein
Hi, As you may see, the default DUCC pinger for UIMA-AS services scrapes JMX stats from the service broker to report the number of producers and consumers, the queue depth and a few other things. This pinger also does stat reset operations on each call to the pinger, I think every 2 minutes. A

Re: High CPU Load on Job Driver

2018-07-19 Thread Eddie Epstein
Hi Rohit, What is the collection reader running in the job driver doing? Look at the memory use (RSS) value for the job driver on the job details page. If nothing is logged (be sure to check ducc.log file) my guess would be that the JD ran out of RAM in its cgroup and was killed. The JD cgroup

Re: run existing AE instance on different view

2018-07-10 Thread Eddie Epstein
I think the UIMA code uses the annotator context to map the _InitialView and the context remains static for the life of the annotator. Replicating annotators to handle different views has been used here too, but agree it is ugly. If the annotator code can be changed, then one approach would be to

Re: Problem in running DUCC Job for Arabic Language

2018-07-05 Thread Eddie Epstein
ere the cas is generated > by DUCC. > This can also be an issue of the environment (language) of DUCC because the > default language is English. > > Best Regards > Rohit > > On 2018/07/03 13:11:50, Eddie Epstein wrote: > > Rohit, > > > > Before sending the d

Re: Problem in running DUCC Job for Arabic Language

2018-07-03 Thread Eddie Epstein
Rohit, Before sending the data into jcas if i force encode it :- > > String content2 = null; > content2 = new String(content.getBytes("UTF-8"), "ISO-8859-1"); > jcas.setDocumentText(content2); > Where is this code, in the job CR? > > And when i go in my first annotator i force decode it:- > >
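The forced re-encoding quoted above can be reproduced outside DUCC. The sketch below is a Python analogue of the quoted Java line, not the poster's code; the sample string and the function names are illustrative assumptions. It suggests that decoding UTF-8 bytes as ISO-8859-1 is reversible, whereas encoding Arabic text directly into a charset that cannot represent it produces question marks.

```python
def force_encode(text: str) -> str:
    # Analogue of: new String(content.getBytes("UTF-8"), "ISO-8859-1")
    return text.encode("utf-8").decode("iso-8859-1")


def force_decode(text: str) -> str:
    # Inverse step: turn the Latin-1 view back into the original characters.
    return text.encode("iso-8859-1").decode("utf-8")


# Illustrative five-letter Arabic sample (not taken from the thread).
sample = "\u0645\u0631\u062d\u0628\u0627"

garbled = force_encode(sample)
restored = force_decode(garbled)
print(restored == sample)

# Encoding into a charset that lacks these characters is lossy:
# every unencodable character is replaced by "?".
lossy = sample.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
print(lossy)
```

The round trip only hides the problem; if some step in the path uses a platform default charset that cannot represent the text, the loss shown in the last two lines is what ends up in the output.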

Re: Problem in running DUCC Job for Arabic Language

2018-06-18 Thread Eddie Epstein
Hi Rohit, In a DUCC job the CAS created by the user's CR in the Job Driver is serialized into cas.xmi format, transported to the Job Process where it is deserialized and given to the user's analytics. Likely the problem is in CAS serialization or deserialization, perhaps due to the active LANG

Re: [External Sender] Re: Runtime Parameters to Annotators Running as Services

2018-06-04 Thread Eddie Epstein
), but in the process of refactoring my > CollectionReader I was trying to slim it down and just have it pass > document identifiers to the aggregate analysis engine. I'm fuzzy on whether > 2) is an option and if so how to implement. > > -John > > > _______

Re: Runtime Parameters to Annotators Running as Services

2018-05-31 Thread Eddie Epstein
I may not understand the scenario. For meta-data that would modify the behavior of the analysis, for example changing what analysis is run for a CAS, putting it into the CAS itself is definitely recommended. The example above is for the UIMA service to access the artifact itself from a remote

Re: Batch Checkpoints with DUCC?

2018-05-16 Thread Eddie Epstein
output > by the CM, processed by the pipeline and finally cached by the CC. > Then, I can somehow (have to read this up) have the work item CAS sent to > the CC as the effective “batch processing complete” signal. > > Is that correct? > > > On 15. May 2018, at 20:50, Eddie Epstein <

Re: Batch Checkpoints with DUCC?

2018-05-15 Thread Eddie Epstein
Hi Erik, There is a brief discussion of this in the duccbook in section 9.3 ... https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-1880009.3 In particular, the 3rd option, "Flushing cached data". This assumes that the batch of work to be flushed is represented by each workitem CAS.

Re: DUCC job Issue

2018-04-20 Thread Eddie Epstein
DUCC is designed for multi-user environments, and in particular tries to balance resources fairly quickly across users in a fair-share allocation. The default mechanism used is preemption. To eliminate preemption specify a "non-preemptable" scheduling class for jobs such as "fixed". Other options

Re: DUCC and CAS Consumers

2018-04-11 Thread Eddie Epstein
Hi Erik, DUCC jobs can scale out user's components in two ways, horizontally by running multiple processes (process_deployments_max) and vertically by running the pipeline defined by the CM, AE and CC components in multiple threads (process_pipeline_count). Since the constructed top AAE is

Re: Exception: UIMA - Annotator Processing Failed

2018-02-28 Thread Eddie Epstein
Hi, An annotation feature structure can only be added to the index of the view it was created in. It looks like the application at edu.cmu.lti.oaqa.baseqa.evidence.concept.PassageConceptRecognizer.process(PassageConceptRecognizer.java:96) is trying to add an annotation created in one view to

Re: Completion event for replicated components

2018-01-18 Thread Eddie Epstein
There will be a new mechanism to help do this in the upcoming uima-as-2.10.2 version. This version includes an additional listener on every service that can be addressed individually. A uima-as client could then iterate thru all service instances calling CPC, assuming the client knew about all

Re: Ducc Service Registration Error

2017-11-20 Thread Eddie Epstein
Hi, Annotator class "org.orkash.annotator.AnalysisEngine.TreebankChunkerAnnotator" was not found ... means that this class is not in the classpath specified by the registration. Eddie On Mon, Nov 20, 2017 at 9:17 AM, priyank sharma wrote: > Hi! > > When i am

Re: DUCC's job goes into infintie loop

2017-11-10 Thread Eddie Epstein
Hi Priyank, Looks like you are running DUCC v2.0.x. Many bugs have been fixed in subsequent versions, the latest being v2.2.1. Newer versions have a ducc_update command that will upgrade an existing install, but given all the changes since v2.0.x I suggest a clean install. Eddie On Fri, Nov

Re: UIMA analysis from a database

2017-09-15 Thread Eddie Epstein
some of our workload to a cloud > computing host, and it seems like Hadoop/Spark are much more likely to be > supported there. > > David Fox > > On 9/15/17, 1:57 PM, "Eddie Epstein" <eaepst...@gmail.com> wrote: > > >There are a few DUCC features that might be

Re: UIMA analysis from a database

2017-09-15 Thread Eddie Epstein
There are a few DUCC features that might be of particular interest for scaling out UIMA analytics. - all user code for batch processing continues to use the existing UIMA component model: collection readers, CAS multipliers, analysis engines, and CAS consumers. - DUCC supports assembling and

Re: DUCC job automatically fails and gives Reason,or extraordinary status as cancelled by User | DUCC Version: 2.0.1

2017-05-17 Thread Eddie Epstein
How long does the job run before stopping? "Cancelled by user" can occur if the job is submitted with cancel_on_interrupt and the client submitting the job was stopped. Eddie On Tue, May 16, 2017 at 8:31 AM, Lou DeGenaro wrote: > Dunno why the connection would be

Re: Synchonizing Batches AE and StatusCallbackListener

2017-04-21 Thread Eddie Epstein
Hi Erik, A few words about DUCC and your application. DUCC is a cluster controller that includes a resource manager and 3 applications: batch processing, long running services and singleton processes. The batch processing application consists of a user's CollectionReader which defines work items

Re: Free instance of agreggate with cas multiplier in MultiprocessingAnalysisEngine

2016-11-09 Thread Eddie Epstein
erate all > child cas required. > > 2016-11-09 9:40 GMT-05:00, Eddie Epstein <eaepst...@gmail.com>: > > Is behavior the same for single-threaded AnalysisEngine instantiation? > > > > On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera <nelsonriver...@gmail.com >

Re: Free instance of agreggate with cas multiplier in MultiprocessingAnalysisEngine

2016-11-09 Thread Eddie Epstein
Is behavior the same for single-threaded AnalysisEngine instantiation? On Tue, Nov 8, 2016 at 10:00 AM, nelson rivera wrote: > I have an aggregate analysis engine that contains a casmultiplier > annotator. I instantiate this aggregate with the interface >

Re: java.lang.ClassCastException with binary SerializationStrategy

2016-11-03 Thread Eddie Epstein
ver...@gmail.com> wrote: > Yes with xmiCas serialization everything works fine. The client and > the input Cas have identical type system definitions, because i get > the cas from UimaAsynchronousEngine with the line > "asynchronousEngine.getCAS()", any idea of problem > >

Re: java.lang.ClassCastException with binary SerializationStrategy

2016-11-03 Thread Eddie Epstein
Hi, Binary serialization for a service call only works if the client and service have identical type system definitions. Have you confirmed everything works with the default XmiCas serialization? Eddie On Thu, Nov 3, 2016 at 3:51 PM, nelson rivera wrote: > I want to

Re: UIMA DUCC limit max memory of node

2016-11-01 Thread Eddie Epstein
mment from > NodeMemInfoCollector.java: if running ducc in simulation mode skip memory > adjustment. Report free memory = fakeMemorySize". But I am not sure if we > can use this safely since it is for testing. > > So we basically want to give ducc an upper limit of usable memory. > >

Re: UIMA DUCC limit max memory of node

2016-10-31 Thread Eddie Epstein
Hi Daniel, For each node Ducc sums RSS for all "system" user processes and excludes that from Ducc usable memory on the node. System users are defined by a ducc.properties setting with default value: ducc.agent.node.metrics.sys.gid.max = 500 Ducc's simulation mode is intended for creating a

Re: Uima Ducc Service restart on timeout

2016-10-29 Thread Eddie Epstein
Hi Wahed, One approach would be to configure the service itself to self-destruct if processing time exceeds a threshold. UIMA-AS error configuration does support timeouts for remote delegates, but not for in-process delegates. So this would require starting a timer thread in the annotator
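The timer-thread idea above can be sketched in a few lines. This is a language-neutral illustration in Python, not UIMA-AS code: the class name `ProcessingWatchdog` and the callback are assumptions made for the example, and a real annotator would be written in Java and would terminate its own process in the callback so that the service manager can restart it.

```python
import threading
import time


class ProcessingWatchdog:
    """Calls on_timeout if a unit of work runs longer than limit_seconds."""

    def __init__(self, limit_seconds, on_timeout):
        self.limit_seconds = limit_seconds
        self.on_timeout = on_timeout

    def run(self, work, *args):
        # Arm the timer just before processing starts.
        timer = threading.Timer(self.limit_seconds, self.on_timeout)
        timer.daemon = True
        timer.start()
        try:
            return work(*args)
        finally:
            # Disarm on completion; harmless if the timer already fired.
            timer.cancel()


# Demo: the "work" sleeps longer than the limit, so the callback fires.
timed_out = threading.Event()
ProcessingWatchdog(0.05, timed_out.set).run(time.sleep, 0.3)
print(timed_out.is_set())
```

In the demo the callback only sets a flag; in a service it would be the place to log the stuck work item and exit the process.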

Re: C++/Python annotators in Eclipse on Mac OS

2016-05-06 Thread Eddie Epstein
Hi Sean, There are example .mak files for compiling and creating shared libraries from C++ annotator code. A couple of env parameters need to be set for the build. It should be straightforward to configure Eclipse CDT to build an annotator and a C++ application calling annotators from a makefile.

Re: UIMACPP and multi-threading

2016-04-28 Thread Eddie Epstein
Benjamin De Boe | Product Manager > M: +32 495 19 19 27 | T: +32 2 464 97 33 > InterSystems Corporation | http://www.intersystems.com > > -Original Message- > From: Eddie Epstein [mailto:eaepst...@gmail.com] > Sent: Tuesday, April 26, 2016 4:58 AM > To: user@uima.apache.org

Re: UIMACPP and multi-threading

2016-04-25 Thread Eddie Epstein
> -- > Benjamin De Boe | Product Manager > M: +32 495 19 19 27 | T: +32 2 464 97 33 > InterSystems Corporation | http://www.intersystems.com > > -Original Message- > From: Eddie Epstein [mailto:eaepst...@gmail.com] > Sent: Thursday, April 7, 2016 1:58 PM > To: u

Re: UIMACPP and multi-threading

2016-04-04 Thread Eddie Epstein
Hi Benjamin, UIMACPP is thread safe, as is the JNI interface. To confirm, I just created a UIMA-AS service with 10 instances of DaveDetector, and fed the service 800 CASes with up to 10 concurrent CASes at any time. It is not the case with DaveDetector, but at annotator initialization some

Re: DUCC: Unable to do "Fixed" type of Reservation

2016-03-31 Thread Eddie Epstein
Hi Reshu, Reserve type allows users to allocate an unconstrained resource. Because reserve allocations are not constrained by cgroup containers, in v2.x these allocations were restricted to be an entire machine. Fixed type allocations, which are always associated with a specific user process,

Re: DUCC 2.0.1 : JP Http Client Unable to Communicate with JD

2016-01-12 Thread Eddie Epstein
Hi Reshu, This is caused by the CollectionReader running in the JobDriver putting character data in the work item CAS that cannot be XML serialized. DUCC needs to do better in making this problem clear. Two choices to fix this: 1) have the CR screen for illegal characters and not put them in the
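The first option above, screening for characters that cannot be XML serialized, can be illustrated with a small filter. This is a sketch in Python based on the `Char` production of the XML 1.0 specification; the function name `strip_illegal_xml_chars` is hypothetical, and a real CollectionReader would apply the same check in Java before putting text into the work item CAS.

```python
import re

# Anything outside the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str, replacement: str = "") -> str:
    """Remove (or replace) characters that XML 1.0 cannot represent."""
    return _ILLEGAL_XML_CHARS.sub(replacement, text)


# NUL and vertical tab are dropped; tab is legal and kept.
cleaned = strip_illegal_xml_chars("ab\x00c\x0bd\te")
print(repr(cleaned))
```

Passing a visible `replacement` such as a space keeps character offsets stable, which may matter if annotations are later mapped back to the original text.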

Re: DUCC 1.1.0- Remain in Completing state.

2016-01-05 Thread Eddie Epstein
Hi Reshu, Each DUCC machine has an agent responsible for starting and killing processes. There was a bug ( https://issues.apache.org/jira/browse/UIMA-4194 ) where the agent failed to issue a kill -9 against "hung" JPs when a job was stopping. The fix is in v2.0. Regards, Eddie On Tue, Jan 5,

Re: UIMA-DUCC installation with multiple machines

2015-11-30 Thread Eddie Epstein
Hi, Did you confirm that user ducc@ducc-head can do passwordless ssh to ducc-node-1? If so, running ./check_ducc from the admin folder should give some useful feedback about ducc-node-1. Eddie On Mon, Nov 30, 2015 at 5:14 AM, Sylvain Surcin wrote: > Hello, > >

Re: remote Analysis Engines freely available

2015-10-13 Thread Eddie Epstein
There are several remote AE samples in the UIMA-AS sdk, currently "Apache UIMA Version 2.6.0" download link at http://uima.apache.org/downloads.cgi. $UIMA_HOME/examples/deploy/as includes Deploy_MeetingDetectorTAE.xml Deploy_MeetingFinder.xml Deploy_RoomNumberAnnotator.xml After

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-05 Thread Eddie Epstein
_in_bytes memory.usage_in_bytes > memory.limit_in_bytes memory.move_charge_at_immigrate memory.use_hierarchy > > these are the permissions on /cgroup/ducc > > drwxr-xr-x 2 ducc root 0 Oct 5 09:29 . > > > Thanks and Regards, > Satya Nand Kanodia > > On 10/01/2015 07:49 PM,

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-01 Thread Eddie Epstein
as also written > in documentation.) > > anything else ? > > Thanks and Regards, > Satya Nand Kanodia > > On 09/30/2015 05:28 PM, Eddie Epstein wrote: > >> Hi Satya, >> >> There is a custom cgconfig.conf that has to be installed in /etc/ before >> starti

Re: C-Groups status remains off in web server after installing C-Groups

2015-10-01 Thread Eddie Epstein
memory.soft_limit_in_bytes release_agent ~$ ls -ld /cgroup/ducc/ drwxr-xr-x 2 ducc root 0 Sep 5 11:31 /cgroup/ducc/ On Thu, Oct 1, 2015 at 8:20 AM, Eddie Epstein <eaepst...@gmail.com> wrote: > Well, please list the contents of /cgroups to confirm that the custom > cgconfig.conf is operating. > Eddie

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
ompletely empty the CAS including the document text, but > as far as I know the document text cannot be changed once it is set. > > Am 07/09/15 17:14 schrieb "Eddie Epstein" unter <eaepst...@gmail.com>: > > >Can the filter in the INNER_AAE modify such CASes, per

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
> We also thought about the solution with a special FeatureStructure, but > this has the disadvantage that the consumer needs to be aware of that. > It would be easier if some CASes could simply be dropped. > I guess this could even be useful for flat workflows. > > -Torsten > >

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
Can the filter in the INNER_AAE modify such CASes, perhaps by deleting data, that would result in the existing consumer effectively ignoring them? On Mon, Sep 7, 2015 at 11:08 AM, Zesch, Torsten wrote: > >The consumer does not have to be modified if the flow controller

Re: Error when trying to drop CAS with FlowController

2015-09-07 Thread Eddie Epstein
be a clean solution unless UIMA > would by default drop any CAS that has its only remaining view removed. > > Dropping the whole unit-of-work (the CAS) instead of stripping its content > appears to me a cleaner solution. > > -- Richard > > On 07.09.2015, at 17:45

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
Hi Torsten, The documentation says ... public FinalStep(boolean aForceCasToBeDropped) Creates a new FinalStep, and may indicate that a CAS should be dropped. This can only be used for CASes that are produced internally to the aggregate. It is an error to attempt to drop a CAS that was

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
e the aggregate not be dropped? > > Cheers, > > -- Richard > > On 06.09.2015, at 15:58, Eddie Epstein <eaepst...@gmail.com> wrote: > > > Hi Torsten, > > > > The documentation says ... > > > > public FinalStep(boolean aForceCasToBeDropped) > >

Re: CAS merger/multiplier N:M mapping

2015-09-06 Thread Eddie Epstein
Hi Petr On Sun, Sep 6, 2015 at 10:11 AM, Petr Baudis wrote: > Hi! > > I'm currently struggling to perform a complex flow transformation with > UIMA. I have multiple (N) CASes with some fulltext search results. > I chop these search results to sentences and would like to pick

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
k that the iterator > returned by processAndOutputNewCASes does not contain the input CAS? > > Cheers, > > -- Richard > > On 06.09.2015, at 16:21, Eddie Epstein <eaepst...@gmail.com> wrote: > > > Hi Richard, > > > > FinalStep() in a CasMultiplier aggregate means

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
NSUMER. > > So we do explicitly not want certain CASes to continue the processing path. > > -- Richard > > On 06.09.2015, at 17:04, Eddie Epstein <eaepst...@gmail.com> wrote: > > > Richard, > > > > In general the input CAS must continue down some processing

Re: Error when trying to drop CAS with FlowController

2015-09-06 Thread Eddie Epstein
I/we just don't get why. > > Cheers, > > -- Richard > > On 06.09.2015, at 17:14, Eddie Epstein <eaepst...@gmail.com> wrote: > > > How about the filter adds a FeatureStructure indicating that the CAS > should > > be dropped. > > Then when the INNER_AAE retu

Re: DUCC multi-node installation. Beginner's questions.

2015-07-22 Thread Eddie Epstein
Hi Sergii, The ducc_runtime tree needs to be installed on a shared filesystem that all DUCC nodes have mounted in the same location. Just install the ducc runtime once from the DUCC head node. All other DUCC nodes simply need to have the mounted filesystem and common user accounts with identical

Re: Multi-threaded UIMA ParallelStep

2015-05-20 Thread Eddie Epstein
Parallel-step currently only works with remote delegates. The other approach, using CasMultipliers, allows an arbitrary amount of parallel processing in-process. A CM would create a separate CAS for each delegate intended to run in parallel, and use a feature structure to hold a unique

Re: Multi-threaded UIMA ParallelStep

2015-05-20 Thread Eddie Epstein
instances in separate threads? UIMA-AS would do this by specifying N instances of a synchronous top-level aggregate. Eddie On Wed, May 20, 2015 at 8:49 AM, Petr Baudis pa...@ucw.cz wrote: Hi! On Wed, May 20, 2015 at 07:56:33AM -0400, Eddie Epstein wrote: Parallel-step currently only works

Re: DUCC- process_dd

2015-05-01 Thread Eddie Epstein
?? Thanks in advance. Reshu. On 05/01/2015 03:28 AM, Eddie Epstein wrote: The simplest way of vertically scaling a Job process is to specify the analysis pipeline using core UIMA descriptors and then using --process_thread_count to specify how many copies of the pipeline to deploy, each

Re: UIMA-AS and ActiveMQ ports

2015-04-27 Thread Eddie Epstein
UIMA-AS has example deployment descriptors using placeholders for the broker: ${defaultBrokerURL} If these placeholders are used and the user doesn't specify a value for the Java property defaultBrokerURL then some code in UIMA-AS will use a default value of tcp://localhost:61616. That is the only

Re: Error handling in flow control

2015-04-24 Thread Eddie Epstein
Can you give more details on the overall pipeline deployment? The initial description mentions a CPE and it mentions services. The CPE was created before flow controllers or CasMultipliers existed and has no support for them. Services could be Vinci services for the CPE or UIMA-AS services or ???

Re: UIMA CPE appears not to utilise more than a single thread

2015-04-13 Thread Eddie Epstein
The CPE runs pipeline threads in parallel, not necessarily CAS processors. In a CPE descriptor, generally all non-CasConsumer components make up the pipeline. Change the following line to indicate how many pipeline threads to run, and make sure the casPoolSize is the number of threads + 2.

Re: Ducc Problems

2015-03-03 Thread Eddie Epstein
processing at cas consumer level like PersonTitleDBWriterCasConsumer. Thanks in advance. Reshu. On 03/31/2014 04:14 PM, reshu.agarwal wrote: On 03/28/2014 05:28 PM, Eddie Epstein wrote: Another alternative would be to do the final flush in the Cas consumer's destroy method. Another

Re: DUCC- Agent1 is on Physical and Agent2 is on virtual=Slow the job process timing

2014-12-19 Thread Eddie Epstein
Hi Reshu, On Fri, Dec 19, 2014 at 12:26 AM, reshu.agarwal reshu.agar...@orkash.com wrote: Hi, Is there any problem if one Agent node is on Physical(Master) and one agent node is on virtual? I am running a job which is having avg processing timing of 20 min when I have configured a single

Re: Ruta parallel execution

2014-12-19 Thread Eddie Epstein
Hi Silvestre, An aggregate deployed with UIMA-AS can be used to run delegate annotators in parallel, with a few restrictions. - the aggregate must be deployed as async=true - the parallel delegates must each be running in remote processes - the delegates must not modify preexisting FS As Jens

Re: Serializing Specific View to XMI

2014-12-04 Thread Eddie Epstein
I think that is not supported directly. One could use the CasCopier to copy the view(s) of interest to a new, empty CAS and serialize to xmi file from that. Eddie On Wed, Dec 3, 2014 at 9:04 AM, Jakob Sahlström jakob.sahlst...@gmail.com wrote: Hi, I'm dealing with a CAS with multiple views,

Re: DUCC doesn't use all available machines

2014-11-30 Thread Eddie Epstein
On Sat, Nov 29, 2014 at 4:46 PM, Simon Hafner reactorm...@gmail.com wrote: I've thrown some numbers at it (doubling each) and it's running at comfortable 125 procs. However, at about 6.1k of 6.5k items, the procs drop down to 30. 125 processes at 8 threads each = 1000 active pipelines. How

Re: Ducc: Rename failed

2014-11-28 Thread Eddie Epstein
wrote: 2014-11-28 10:45 GMT-06:00 Eddie Epstein eaepst...@gmail.com: DuccCasCC component has presumably created /home/ducc/analysis/txt.processed/5911.txt_0_processed.zip_temp and written to it? I don't know, the _temp file doesn't exist anymore. Did you run this sample job in something

Re: DUCC doesn't use all available machines

2014-11-28 Thread Eddie Epstein
, 3:44 PM, Eddie Epstein wrote: DuccRawTextSpec.job specifies that each job process (JP) run 8 analytic pipeline threads. So for this job with 100 work items, no more than 13 JPs would ever be started. After successful initialization of the first JP, DUCC begins scaling up the number
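The figure of 13 job processes follows from dividing the work items by the pipeline threads per process and rounding up. A tiny Python sketch of that arithmetic; the function name is an illustrative assumption and this is not how DUCC itself is implemented:

```python
import math


def max_job_processes(work_items: int, threads_per_process: int) -> int:
    """Upper bound on job processes that can each have a work item in flight."""
    return math.ceil(work_items / threads_per_process)


# 100 work items, 8 pipeline threads per JP: 12 full JPs plus one partial.
print(max_job_processes(100, 8))  # 13
```

The same rule explains why adding machines does not help a small job: once every work item has a thread, further job processes would have nothing to do.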

Re: DUCC org.apache.uima.util.InvalidXMLException and no logs

2014-11-27 Thread Eddie Epstein
Those are the only two log files? Should be a ducc.log (probably with no more info than on the console), and either one or both of the job driver logfiles: jd.out.log and jobid-JD-jdnode-jdpid.log. If for some reason the job driver failed to start, check the job driver agent log (the agent

Re: DUCC web server interfacing

2014-11-21 Thread Eddie Epstein
On Thu, Nov 20, 2014 at 10:01 PM, D. Heinze dhei...@gnoetics.com wrote: Eddie... thanks. Yes, that sounds like I would not have the advantage of DUCC managing the UIMA pipeline. Depends on the definition of managing. DUCC manages the lifecycle of analytic pipelines running as job processes

Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
The preferred approach is to run the analytics as a DUCC service, and have an application driver that feeds the service instances with incoming data. This service would be a scalable UIMA-AS service, which could have as many instances as are needed to keep up with the load. The driver would use

Re: DUCC web server interfacing

2014-11-20 Thread Eddie Epstein
Ooops, in this case the web server would be feeding the service directly. On Thu, Nov 20, 2014 at 9:04 PM, Eddie Epstein eaepst...@gmail.com wrote: The preferred approach is to run the analytics as a DUCC service, and have an application driver that feeds the service instances with incoming

Re: DUCC-Un-managed Reservation??

2014-11-18 Thread Eddie Epstein
On Tue, Nov 18, 2014 at 1:05 AM, reshu.agarwal reshu.agar...@orkash.com wrote: Hi, I am bit confused. Why we need un-managed reservation? Suppose we give 5GB Memory size to this reservation. Can this RAM be consumed by any process if required? Basically yes. See more info about Rogue

Re: DUCC stuck at WaitingForResources on an Amazon Linux

2014-11-15 Thread Eddie Epstein
On Fri, Nov 14, 2014 at 8:11 PM, Simon Hafner reactorm...@gmail.com wrote: So to run effectively, I would need more memory, because the job wants two shares? ... Yes. With a larger node it works. What would be a reasonable memory size for a ducc node? Really depends on the application code.

Re: DUCC stuck at WaitingForResources on an Amazon Linux

2014-11-13 Thread Eddie Epstein
Simon, The DUCC resource manager logs into rm.log. Did you look there for reasons the resources are not being allocated? Eddie On Wed, Nov 12, 2014 at 4:07 PM, Simon Hafner reactorm...@gmail.com wrote: 4 shares total, 2 in use. 2014-11-12 5:06 GMT-06:00 Lou DeGenaro lou.degen...@gmail.com:

Re: UIMA DUCC - Multi-machine Installation

2014-11-05 Thread Eddie Epstein
:37 PM, Eddie Epstein eaepst...@gmail.com wrote: Hi Tam, In the install documentation, http://uima.apache.org/d/uima-ducc-1.0.0/installation.html, the section Multi-User Installation and Verification describes how to configure setuid-root for ducc_ling so that DUCC jobs are run

Re: UIMA DUCC - Multi-machine Installation

2014-10-31 Thread Eddie Epstein
successfully set up and ran my UIMA analysis engine in single user mode. I also followed the DUCCBOOK to set up ducc_ling but I am not sure how to get it working on a cluster of machines. Thanks, Tam On Thu, Oct 30, 2014 at 11:08 PM, Eddie Epstein eaepst...@gmail.com wrote: The $DUCC_RUNTIME tree needs

Re: UIMA DUCC - Multi-machine Installation

2014-10-30 Thread Eddie Epstein
The $DUCC_RUNTIME tree needs to be on a shared filesystem accessible from all machines. For single user mode ducc_ling could be referenced from there as well. But for multiuser setup, ducc_ling needs setuid and should be installed on the root drive. Eddie On Thu, Oct 30, 2014 at 10:08 AM, James

Re: Could UIMA AS client send custom key value parameters to annotator?

2014-10-01 Thread Eddie Epstein
There is no mechanism for a uima-as client to modify the result specification of a remote service. Since type/feature control cannot indicate many other behavioral characteristics, like speed vs accuracy tradeoffs, the suggested approach for dynamic control is to use dedicated feature structures

Re: Uima AS out of memory

2014-08-20 Thread Eddie Epstein
When using deployAsyncService.sh to start a UIMA AS service, the default Java heap size is -Xmx800M. To override this, export an environment variable UIMA_JVM_OPTS with JVM arguments. For example: $ export UIMA_JVM_OPTS="-Xmx6G -Xms2G" $ deployAsyncService.sh myDeploymentDescriptor.xml On
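
A note on the export in the message above: because the value contains a space, it needs quoting for both JVM flags to end up in UIMA_JVM_OPTS. The following is only a minimal sketch of that quoting; the heap sizes are the ones from the message, and deployAsyncService.sh itself is not invoked here.

```shell
# Quote the value so that both flags are stored in the one variable.
# Unquoted, the shell would see -Xms2G as a separate argument to export
# rather than as part of the value.
export UIMA_JVM_OPTS="-Xmx6G -Xms2G"

# Print the value that a launcher script started from this shell would inherit.
printf '%s\n' "$UIMA_JVM_OPTS"
```

Running the deployment script from the same shell session afterwards would let it pick up the exported value.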

Re: UIMA AS NullPointerException in CasDefinition constructor

2014-08-04 Thread Eddie Epstein
? Thanks, Egbert On Monday, July 28, 2014 05:14:39 PM Eddie Epstein wrote: Hi Egbert, The README file for UIMA-AS shows an application example with Deploy_MeetingDetectorTAE.xml. Does that run OK for you? Assuming yes, can you give more details about the scenario

Re: Building UIMA-CPP on (K)Ubuntu 14.04

2014-07-29 Thread Eddie Epstein
. Where should I report these feature requests? Thanks again! Egbert On Tuesday, July 22, 2014 05:04:36 PM Eddie Epstein wrote: Good to hear the build worked. UIMACPP implements only a core subset of UIMA functionality, basically the CAS API and the ability to create primitive and aggregate

Re: UIMA AS NullPointerException in CasDefinition constructor

2014-07-28 Thread Eddie Epstein
Hi Egbert, The README file for UIMA-AS shows an application example with Deploy_MeetingDetectorTAE.xml. Does that run OK for you? Assuming yes, can you give more details about the scenario, perhaps the explicit commands used? The descriptors used? Eddie On Mon, Jul 28, 2014 at 11:46 AM,

Re: Passing additional parameters through to CPE components

2014-07-24 Thread Eddie Epstein
A CPE descriptor can override configuration parameters defined in any integrated components. Documentation a little bit below http://uima.apache.org/d/uimaj-2.6.0/references.html#ugr.ref.xml.cpe_descriptor.descriptor.cas_processors.individual 3.6.1.2. configurationParameterSettings Element This

Re: Passing additional parameters through to CPE components

2014-07-24 Thread Eddie Epstein
to UimaContext presumably because it isn’t defined in the MyCollectionReader.xml. Hope that helps clear it up. On 24 Jul 2014, at 14:51, Eddie Epstein eaepst...@gmail.com wrote: A CPE descriptor can override configuration parameters defined in any integrated components. Documentation a little bit

Re: Building UIMA-CPP on (K)Ubuntu 14.04

2014-07-22 Thread Eddie Epstein
Looking at a build on RHEL, jni.h was resolved with: --with-jdk=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/include -I/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/include/linux which follows the instructions in README.4src I had built ICU [and all other dependencies] and installed it in a

Re: Is there a way to tell UIMA component to only extract some kind of entities when run opennlp.pear?

2014-06-11 Thread Eddie Epstein
Hi Jeffery, According to the info at http://uima.apache.org/d/uimaj-2.6.0/tutorials_and_users_guides.html#ugr.tug.aae.result_specification_setting the default Result Specification is taken from the Engine's output Capability Specification. So it should be possible to deploy the UIMA-AS service

Re: Sofa-unaware AEs that create new views in an AAE

2014-04-22 Thread Eddie Epstein
method. Is this a better model for you? Eddie On Tue, Apr 22, 2014 at 6:47 AM, Peter Klügl pklu...@uni-wuerzburg.de wrote: Am 18.04.2014 15:23, schrieb Eddie Epstein: On Thu, Apr 17, 2014 at 9:17 AM, Peter Klügl pklu...@uni-wuerzburg.de wrote: Am 17.04.2014 15:01, schrieb Eddie Epstein

Re: Sofa-unaware AEs that create new views in an AAE

2014-04-18 Thread Eddie Epstein
On Thu, Apr 17, 2014 at 9:17 AM, Peter Klügl pklu...@uni-wuerzburg.de wrote: Am 17.04.2014 15:01, schrieb Eddie Epstein: Hi Peter, The logic is that since a sofa aware component may have one or more input and/or output views, such a component needs to use getView to specify which to use

Re: problem in calling DUCC Service with ducc_submit

2014-04-01 Thread Eddie Epstein
Declaring a service dependency does not affect application code paths. The job still needs to connect to the service in the normal way. DUCC uses service dependencies for several reasons: to automatically start services when needed by a job; to not give resources to a job or service for which a

Re: Cas Timeout Exception in DUCC

2014-03-31 Thread Eddie Epstein
Reshu, Please look in the logfile of the job process. Maybe 10 minutes is still not enough? Eddie On Mon, Mar 31, 2014 at 2:42 AM, reshu.agarwal reshu.agar...@orkash.com wrote: On 03/28/2014 05:36 PM, Eddie Epstein wrote: There is a job specification parameter: --process_per_item_time_max
