Re: AW: Working with very large text documents

2013-10-18 Thread Thilo Goetz
Don't you have a Hadoop cluster you can use? Hadoop would handle the file splitting for you, and if your UIMA analysis is well-behaved, you can deploy it as an M/R job, one record at a time. --Thilo On 10/18/2013 12:25 PM, armin.weg...@bka.bund.de wrote: Hi Jens, It's a log file. Cheers,

Re: custom FSRepository(?)

2013-07-19 Thread Thilo Goetz
On 07/19/2013 11:03 AM, Ingo Thon wrote: Hi List-Members, I'm using UIMA in a very large project. For two reasons I would like to store annotations /partly and the SofaText/SofaStream: 1.) The workflow of our application is roughly as follows: First, UIMA AE is used to add Meta Data to the

Re: Multiple References to an Array

2013-07-02 Thread Thilo Goetz
On 07/01/2013 07:39 PM, John David Osborne (Campus) wrote: Thanks Thilo, that was helpful. Is this (1.0) the standard you were referring to? http://docs.oasis-open.org/uima/v1.0/uima-v1.0.html -John Yes. On 6/20/13 7:09 AM, Thilo Goetz twgo...@gmx.de wrote: On 06/19/2013 10:14 PM

Re: Multiple References to an Array

2013-06-20 Thread Thilo Goetz
On 06/19/2013 10:14 PM, John David Osborne (Campus) wrote: Does anybody know the underlying reason why this WARNING is generated? WARNING: Warning: multiple references to an array. Reference identity will not be preserved in XMI. 6/19/13 2:22:12 PM - 11:

Re: Ruta - Token Order

2013-05-21 Thread Thilo Goetz
On 05/21/2013 01:37 PM, Peter Klügl wrote: Hi, On 21.05.2013 12:47, armin.weg...@bka.bund.de wrote: Hi, In Ruta 2.0.2-SNAPSHOT a token with begin offset 0 and end offset 2 comes before a token with begin offset 0 and end offset 0. The token order is not as I expected. Thus in my case,
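
The ordering described in the question matches how UIMA's built-in annotation index sorts: ascending begin offset, and for equal begin offsets, descending end offset, so the longer span comes first. Whether that is the explanation given in this thread cannot be told from the truncated snippet. The following is a minimal sketch of that ordering in plain Java; `Span` and `SpanOrder` are hypothetical stand-ins and not UIMA or Ruta classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an annotation; not a UIMA or Ruta class. */
final class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    @Override
    public String toString() {
        return "(" + begin + "," + end + ")";
    }
}

final class SpanOrder {
    // Ascending begin offset; for equal begins, descending end offset (longer span first).
    static final Comparator<Span> BEGIN_ASC_END_DESC =
        Comparator.comparingInt((Span s) -> s.begin)
                  .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    /** Returns a sorted copy and leaves the input list untouched. */
    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(BEGIN_ASC_END_DESC);
        return copy;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(new Span(0, 0), new Span(0, 2), new Span(3, 5));
        System.out.println(sorted(spans));
    }
}
```

Under this comparator, a span with offsets 0 to 2 sorts before a zero-length span at offset 0, which is the behaviour the question reports.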

Re: Is it possible to add Feature(s) to Top?

2012-10-16 Thread Thilo Goetz
On 16/10/12 01:21, Shahim Essaid wrote: Hi All, Does the UIMA API provide a way to add features to the base type system? I see that the default TS is created and locked in CASImpl so I am assuming that there is no API way for doing this. Can I add additional features in the CASImpl code

Re: PEAR Classpath issues

2012-07-27 Thread Thilo Goetz
Hi Erik, On 27/07/12 10:51, Erik Fäßler wrote: Hi Thilo! Thanks for your answer! Some comments and further questions below: On 27.07.2012 at 08:24, Thilo Goetz wrote: I did intentionally not include my libraries in the classpath here, because the documentation says

Re: UTF8 Encoded documents processing

2012-05-27 Thread Thilo Goetz
On 27/05/12 16:59, Seid Muhie wrote: Dear Thilo Goetz, Thank you for your response. I have already tried different ways of reading text files with different encodings. For example, using the commons IO FileUtils class, I tried as follows: String document = FileUtils.file2String
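
For comparison, here is a minimal sketch of reading a text file with an explicitly named encoding using only the Java standard library. It does not reproduce the poster's code, which is cut off above; the class `Utf8FileReader` and its methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileReader {
    /** Reads the whole file and decodes it as UTF-8, independent of the platform default. */
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    /** Writes text to a temporary file as UTF-8, reads it back, and deletes the file. */
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three Ethiopic characters, written as escapes so the source file stays ASCII.
        String sample = "\u1230\u120b\u121d";
        System.out.println(roundTrip(sample).equals(sample));
    }
}
```

Naming the charset on both the write and the read side removes the dependency on the platform default, which is one common source of garbled non-ASCII text.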

Re: InlineXMLCasConsumer fails depending on locale

2012-02-21 Thread Thilo Goetz
On 21/02/12 15:59, Jens Grivolla wrote: Hi, it appears that InlineXMLCasConsumer depends on the system locale for some internal transformations. The output appears to be written in UTF8 (outStream.write(xmlAnnotations.getBytes("UTF-8"));) but when used on a machine with a locale of ASCII all
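
The general pitfall behind reports like this is that `String.getBytes()` and several stream constructors fall back to the platform default charset when none is named. The sketch below shows the explicit form using only the Java standard library. It is not the InlineXMLCasConsumer code, and it does not claim to locate the actual bug discussed in the thread; `ExplicitCharsetDemo` and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetDemo {
    /** Encodes the text as UTF-8 regardless of the platform default charset. */
    static void writeUtf8(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    /** Convenience wrapper that collects the encoded bytes in memory. */
    static byte[] toUtf8Bytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeUtf8(buffer, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        // "a-umlaut" and "sharp s" written as escapes so the source file stays ASCII.
        String text = "F\u00e4\u00dfler";
        byte[] bytes = toUtf8Bytes(text);
        System.out.println(bytes.length + " bytes");
        System.out.println(new String(bytes, StandardCharsets.UTF_8).equals(text));
    }
}
```

Every conversion between characters and bytes in a pipeline needs the same treatment; a single step that relies on the default charset is enough to make the output locale-dependent.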

Re: InlineXMLCasConsumer fails depending on locale

2012-02-21 Thread Thilo Goetz
On 21/02/12 16:15, Jens Grivolla wrote: On 02/21/2012 04:08 PM, Thilo Goetz wrote: On 21/02/12 15:59, Jens Grivolla wrote: it appears that InlineXMLCasConsumer depends on the system locale for some internal transformations. The output appears to be written in UTF8 (outStream.write

Re: Having multiple instances of an AE writing in the same output file - thread safe

2012-01-26 Thread Thilo Goetz
On 26/01/12 10:26, Alexander Klenner wrote: Hi there, is there a tutorial for the problem mentioned above? We have multiple instances of an AE that produce output that has to be collected in one final output file (all instances ought to share this file via e.g. UIMAContext), the order
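
One common way to let several workers append to a single output is to route all writes through one shared object whose write method is synchronized. The sketch below illustrates that idea in plain Java, with threads standing in for AE instances and a `StringWriter` standing in for the file. It is not taken from the reply in this thread and uses no UIMA API; `SharedOutput` and `SharedOutputDemo` are hypothetical names.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

/** Wraps one Writer so that each record is written atomically. */
final class SharedOutput {
    private final Writer out;

    SharedOutput(Writer out) {
        this.out = out;
    }

    // synchronized keeps a record and its line terminator together,
    // so records from different threads never interleave.
    synchronized void writeRecord(String record) {
        try {
            out.write(record);
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class SharedOutputDemo {
    /** Starts the given number of workers and returns everything they wrote. */
    static String run(int workers, int recordsPerWorker) {
        StringWriter sink = new StringWriter();
        SharedOutput shared = new SharedOutput(sink);
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                for (int r = 0; r < recordsPerWorker; r++) {
                    shared.writeRecord("worker-" + id + " record-" + r);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        String output = run(4, 50);
        System.out.println(output.split("\n").length + " records written");
    }
}
```

This keeps each record intact but does not impose any order between records from different workers; if the order matters, the records would need a sequence key and a sort step afterwards.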

Re: AW: Annotation/Feature creation, changing types

2011-12-07 Thread Thilo Goetz
By [mailto:t...@cmu.edu] Sent: Wednesday, 7 December 2011 07:21 To: user@uima.apache.org Subject: Re: Annotation/Feature creation, changing types Hi, Thanks for the reply. On Wed, December 7, 2011 12:04 am, Thilo Goetz wrote: On 06/12/11 20:47, Tomas By wrote: So suppose my data looks

Re: AW: Annotation/Feature creation, changing types

2011-12-07 Thread Thilo Goetz
Forgot to add a link to the docs: http://uima.apache.org/d/uimaj-2.3.1/references.html#ugr.ref.cas On 07/12/11 14:07, Thilo Goetz wrote: On 07/12/11 07:44, armin.weg...@bka.bund.de wrote: Hello Tomas, try this in your annotator: // cas is a CAS, not a JCas final Type type

Re: determining whether a feature has been set

2010-05-03 Thread Thilo Goetz
On 5/3/2010 15:15, Klaus Rothenhäusler wrote: Hi, is there any way to determine whether a feature of a primitive numerical type has been set in a particular feature structure? The methods like getIntValue(feat), getFloatValue(feat), etc. all return a zero value if the feature hasn't been set.