Re: Running CasMultiplier inside a JCasIterable

2013-12-05 Thread Richard Eckart de Castilho

No, the issue is still open. 

When I start working on an issue that is still recorded on Google Code, I
open a corresponding issue in the Apache Jira and link the two to each
other. I also set the ASFJira flag on the Google Code tracker to true.

-- Richard

On 05.12.2013, at 02:07, Swirl  wrote:

> 
>> Option 2 - let UIMA do the heavy lifting
>> 
>> An alternative and much simpler approach might be to create an aggregate
>> which contains not only the engines but also the reader. Then you don't
>> have to worry about the reader at all anymore. Just create a UIMA
>> JCasIterator and poll CASes from it until it is empty. Some additional
>> info may be found in the legacy issue 89 [1].
>> 
> 
> Hi Richard,
> Is the code in issue 89 implemented in uimafit 2.0.0?
> It does not work in uimafit 1.4.0, which I currently have.

Re: Running CasMultiplier inside a JCasIterable

2013-12-04 Thread Swirl

 
> Option 2 - let UIMA do the heavy lifting
> 
> An alternative and much simpler approach might be to create an aggregate
> which contains not only the engines but also the reader. Then you don't
> have to worry about the reader at all anymore. Just create a UIMA
> JCasIterator and poll CASes from it until it is empty. Some additional
> info may be found in the legacy issue 89 [1].
> 

Hi Richard,
Is the code in issue 89 implemented in uimafit 2.0.0?
It does not work in uimafit 1.4.0, which I currently have.

Re: Running CasMultiplier inside a JCasIterable

2013-12-04 Thread Richard Eckart de Castilho

Option 1 - by foot:

I guess the uimaFIT JCasIterator should continue to read CAS by CAS
from the reader. However, for each CAS read by the reader, it should be
able to return 0-x CASes. Currently it can only return 1 because it
calls engine.process(jCas) on each engine in turn. To return 0-x, I think
it would have to create a single aggregate engine from all the engines,
call engine.processAndOutputNewCASes(jCas) on that, and handle the UIMA
JCasIterator that is returned by it (sorry for two classes having the same
name here…).

The UIMA JCasIterator would need to become part of the uimaFIT JCasIterator
state. Special handling needs to be introduced to make sure the hasNext()
method still works, in particular for the case that a CAS produced by the
reader does not result in any output CAS.
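
The control flow of Option 1 can be sketched in plain Java. This is only an
illustration under stated assumptions, not uimaFIT code: the class name
FlatteningIterator is hypothetical, the Iterator<I> stands in for the
reader, and the Function stands in for calling
engine.processAndOutputNewCASes(jCas) on the aggregate. No UIMA classes are
used, so it shows only the state handling and the hasNext() special case.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

/**
 * Hypothetical stand-in for the outer (uimaFIT-style) iterator.
 * I = item produced by the reader, O = item returned by the aggregate.
 */
class FlatteningIterator<I, O> implements Iterator<O> {
    // Stand-in for the collection reader.
    private final Iterator<I> reader;
    // Stand-in for aggregate.processAndOutputNewCASes(jCas).
    private final Function<I, Iterator<O>> aggregate;
    // The inner iterator is kept as part of the outer iterator's state.
    private Iterator<O> current = Collections.emptyIterator();

    FlatteningIterator(Iterator<I> reader, Function<I, Iterator<O>> aggregate) {
        this.reader = reader;
        this.aggregate = aggregate;
    }

    @Override
    public boolean hasNext() {
        // Keep reading until some input yields at least one output, so an
        // input that produces zero outputs does not end the iteration early.
        while (!current.hasNext() && reader.hasNext()) {
            current = aggregate.apply(reader.next());
        }
        return current.hasNext();
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

In a real implementation the inner iterator would be the UIMA JCasIterator
returned by the aggregate, and the CASes it hands out would presumably have
to be released once the caller is done with them. That resource handling is
left out of the sketch.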

Option 2 - let UIMA do the heavy lifting

An alternative and much simpler approach might be to create an aggregate
which contains not only the engines but also the reader. Then you don't
have to worry about the reader at all anymore. Just create a UIMA
JCasIterator and poll CASes from it until it is empty. Some additional
info may be found in the legacy issue 89 [1].

There are probably nasty details, but those should be roughly the general
approaches.

Cheers,

-- Richard

[1] https://code.google.com/p/uimafit/issues/detail?id=89

On 04.12.2013, at 01:16, Swirl  wrote:

> Richard Eckart de Castilho  writes:
> 
>> 
>> For further reference:
>> 
>> https://issues.apache.org/jira/browse/UIMA-3470
> 
> Thanks for raising the Jira.
> 
> I tried looking at the source code, but I don't think I am able to come up
> with a solution for this.
> Do you have any pointers to get me started?
> 
> Thanks.

Re: Running CasMultiplier inside a JCasIterable

2013-12-03 Thread Swirl

Richard Eckart de Castilho  writes:

> 
> For further reference:
> 
> https://issues.apache.org/jira/browse/UIMA-3470
> 

Thanks for raising the Jira.

I tried looking at the source code, but I don't think I am able to come up
with a solution for this.
Do you have any pointers to get me started?

Thanks.

Re: Running CasMultiplier inside a JCasIterable

2013-11-28 Thread Richard Eckart de Castilho

For further reference:

https://issues.apache.org/jira/browse/UIMA-3470

-- Richard

On 22.11.2013, at 07:37, Richard Eckart de Castilho  wrote:

> I believe the JCasIterable is currently implemented as a loop which calls
> "process" on the analysis engines for every CAS produced by the reader
> and then returns the corresponding CAS. This wouldn't work with multipliers.
> 
> Can you please file an issue in the Apache Jira, preferably with a minimal
> test case attached? It shouldn't be a big problem to fix this for the next
> release. A patch already fixing this would also work, of course ;)
> 
> Cheers,
> 
> -- Richard
> 
> On 22.11.2013, at 08:01, Swirl  wrote:
> 
>> I have successfully used a CasMultiplier to split up a document into
>> segments for further processing using SimplePipeline.runPipeline().
>> I did this by wrapping the CasMultiplier and the succeeding Annotator
>> within an aggregate.
>> 
>> But after simply switching from SimplePipeline.runPipeline() to a
>> JCasIterable, the code no longer runs correctly, i.e., it returns only as
>> many CASes as there are physical documents instead of the segments that I
>> expected.
>> 
>> How can I get CasMultiplier to work with a JCasIterable?

Re: Running CasMultiplier inside a JCasIterable

2013-11-22 Thread Richard Eckart de Castilho

I believe the JCasIterable is currently implemented as a loop which calls
"process" on the analysis engines for every CAS produced by the reader
and then returns the corresponding CAS. This wouldn't work with multipliers.
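
To make the suspected behaviour concrete, here is a minimal plain-Java model
of such a loop. It is a sketch of the description above, not the actual
uimaFIT source: OneInOneOutIterator is a hypothetical name, C stands in for
a CAS, and each Consumer stands in for a call to engine.process(jCas).

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical model of a one-in-one-out loop: every item read from the
 * reader is passed through all engines and then returned unchanged.
 */
class OneInOneOutIterator<C> implements Iterator<C> {
    private final Iterator<C> reader;         // stand-in for the reader
    private final List<Consumer<C>> engines;  // stand-ins for engine.process(jCas)

    OneInOneOutIterator(Iterator<C> reader, List<Consumer<C>> engines) {
        this.reader = reader;
        this.engines = engines;
    }

    @Override
    public boolean hasNext() {
        // Only the reader decides whether more items exist.
        return reader.hasNext();
    }

    @Override
    public C next() {
        C cas = reader.next();
        for (Consumer<C> engine : engines) {
            // Whatever an engine creates internally is never handed back.
            engine.accept(cas);
        }
        // Always exactly the item that came from the reader.
        return cas;
    }
}
```

With a loop of this shape, the number of items returned always equals the
number of items the reader produced, no matter how many segments an engine
creates along the way.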

Can you please file an issue in the Apache Jira, preferably with a minimal
test case attached? It shouldn't be a big problem to fix this for the next
release. A patch already fixing this would also work, of course ;)

Cheers,

-- Richard

On 22.11.2013, at 08:01, Swirl  wrote:

> I have successfully used a CasMultiplier to split up a document into
> segments for further processing using SimplePipeline.runPipeline().
> I did this by wrapping the CasMultiplier and the succeeding Annotator
> within an aggregate.
> 
> But after simply switching from SimplePipeline.runPipeline() to a
> JCasIterable, the code no longer runs correctly, i.e., it returns only as
> many CASes as there are physical documents instead of the segments that I
> expected.
> 
> How can I get CasMultiplier to work with a JCasIterable?

Running CasMultiplier inside a JCasIterable

2013-11-21 Thread Swirl

I have successfully used a CasMultiplier to split up a document into
segments for further processing using SimplePipeline.runPipeline().
I did this by wrapping the CasMultiplier and the succeeding Annotator
within an aggregate.

But after simply switching from SimplePipeline.runPipeline() to a
JCasIterable, the code no longer runs correctly, i.e., it returns only as
many CASes as there are physical documents instead of the segments that I
expected.

How can I get CasMultiplier to work with a JCasIterable?
