Re: I think I found a bug. [EXTERNAL]

2020-08-31 Thread Miller, Timothy
Peter,
I think the email server doesn't let images through. Can you post an
imgur link maybe?
Tim

On Sun, 2020-08-30 at 14:35 -0700, Peter Abramowitsch wrote:
> Hi,
> I was getting a StringIndexOutOfBoundsException in
> DependencyUtil.doesSubsume(annot1, annot2)  with exactly this
> situation:
> 
> negex annotator
> the text begins  "negative for "
> 
> If the chunk negative for xyz is preceded by anything else, even a
> space, the problem goes away.  It also goes away when you choose
> another style of negation.   "no headache", for instance
> 
> I've traced the problem back to some illegal entries in the jCAS.  You
> can see from the image below that the ContextAnnotation's begin
> offset is illegal.  
> 
> Clearly there's an off-by-one error and this triggered the exception
> because in my example, the Annotation is created right from the 0th
> char of my note text.  But it occurred to me that in every other
> case, where the annotation doesn't begin on the first character and
> it doesn't throw an exception, it might cause  downstream methods
> like doesSubsume to give the wrong result because the begin/end
> offsets are wrong.
> 
> I'm not sure how to follow this up.  But if anyone wants to tackle
> it?
> 
> This is from HistoryAttributeClassifier beginning at line 274
>
> [image: image.png]


Re: I think I found a bug.

2020-08-31 Thread Kean Kaufmann
Hi Peter,

I believe I've encountered this too; I never got around to tracking it down
to the root cause, and didn't have the civic-mindedness to report it as you
have.  Thanks!
To shut it up I implemented a brutal brute-force workaround, enclosed for
your possible amusement.

> But it occurred to me that in every other case, where the annotation
> doesn't begin on the first character and it doesn't throw an exception, it
> might cause  downstream methods like doesSubsume to give the wrong result
> because the begin/end offsets are wrong.


One would think so, but interestingly enough, this does *not* seem to be
the case.  Everywhere I've checked (quite a few, over the past few years),
non-initial ContextAnnotation offsets look correct.

Workaround: a class that extends NegexAnnotator and adjusts the offsets at
the end of the process() method.

public class NegexAnnotator extends
        org.apache.ctakes.ytex.uima.annotators.NegexAnnotator {
...

// Clamp ContextAnnotation offsets that fall outside the document text.
private void adjustContextOffsets(JCas jCas) {
    String text = jCas.getDocumentText();
    if (text == null) return;

    Collection<ContextAnnotation> contexts =
            JCasUtil.select(jCas, ContextAnnotation.class);
    if (contexts == null || contexts.isEmpty()) return;

    // A negative begin offset is what triggers the exception.
    contexts.stream()
            .filter(c -> c.getBegin() < 0)
            .peek(c -> logger.debug("adjusting begin=" + c.getBegin()))
            .forEach(c -> c.setBegin(0));

    // don't know if this happens
    // UIMA end offsets are exclusive, so end == docTextLen is still legal.
    int docTextLen = text.length();
    contexts.stream()
            .filter(c -> c.getEnd() > docTextLen)
            .peek(c -> logger.debug("adjusting end=" + c.getEnd()))
            .forEach(c -> c.setEnd(docTextLen));
}
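
For anyone who wants to try the clamping idea without a cTAKES pipeline, here is a minimal plain-Java sketch of the same approach. Span and SpanClamp are hypothetical stand-ins invented for this illustration; they are not UIMA or cTAKES types. The sketch assumes end offsets are exclusive, so an end equal to the text length is left alone.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an annotation; not a UIMA or cTAKES type.
final class Span {
    int begin;
    int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

final class SpanClamp {
    // Clamps every span into [0, textLength], treating end as exclusive.
    // Returns the number of spans that had to be changed.
    static int clampAll(List<Span> spans, int textLength) {
        int changed = 0;
        for (Span s : spans) {
            int b = Math.max(0, Math.min(s.begin, textLength));
            int e = Math.max(b, Math.min(s.end, textLength));
            if (b != s.begin || e != s.end) {
                s.begin = b;
                s.end = e;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        String note = "Negative for headache";
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(-1, 13));             // the illegal span from this thread
        spans.add(new Span(13, note.length()));  // legal: ends exactly at the text length
        int changed = clampAll(spans, note.length());
        System.out.println(changed + " span(s) adjusted");
        System.out.println(note.substring(spans.get(0).begin, spans.get(0).end));
    }
}
```

Clamping only hides the symptom: the -1, 13 span becomes 0, 13, which no longer throws but is not necessarily the span the annotator should have produced.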






Re: I think I found a bug. [EXTERNAL]

2020-08-31 Thread Peter Abramowitsch
Thanks Jeff.  I don't think the image is needed; here's what it showed.

With the negex annotator in the pipeline and "Negative for headache" as the
text starting at position 0, in HistoryAttributeClassifier near line 274, the
first IdentifiedAnnotation in

*List lsmentions*

contains a ContextAnnotation whose offset range is -1, 13.
Looking at the text, it should probably have been 0, 11.

Add any text ahead of the "Negative for" and it works brilliantly.
Probably one of those off-by-one errors that come from staying up too
late.
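
To make the failure concrete, here is a small plain-Java sketch of why a begin offset of -1 blows up as soon as anything slices the note text with it. OffsetDemo and covered() are names made up for this illustration, and I am assuming the covered text is taken with String.substring(begin, end), i.e. with an exclusive end offset. No UIMA or cTAKES classes are involved.

```java
// Plain-Java illustration; OffsetDemo and covered() are hypothetical names.
final class OffsetDemo {

    // Returns the text covered by [begin, end), with end exclusive.
    static String covered(String text, int begin, int end) {
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        String note = "Negative for headache";

        // A legal span for the trigger phrase, using an exclusive end offset.
        System.out.println(covered(note, 0, 12));

        // The span reported above: begin = -1.
        try {
            covered(note, -1, 13);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("illegal span: " + e.getMessage());
        }
    }
}
```

Under that exclusive-end assumption the span for "Negative for" would be 0, 12; 0, 11 is the same span if the end offset is read as inclusive.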

Peter




