Hi Pei,

I'm not sure that would solve the problem: the change in the ytex branch
causes newlines to be ignored (i.e. not treated as tokens).  The sentence
splitter in trunk splits sentences on newlines, so a newline would never
be found inside a sentence there.  However, if we had a reproducer we could
check it fairly easily in the ytex branch.
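
For reference, here is a rough sketch of what the ytex change does. It uses
stand-in Token and NewlineToken classes (hypothetical, not the real cTAKES
BaseToken/NewlineToken types) so that it runs on its own; the method name
collectBoundaries is also made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Stand-in types for illustration only; these are NOT the cTAKES classes.
class NewlineFilterSketch {

    /** Hypothetical stand-in for a token with character offsets. */
    static class Token {
        final int begin;
        final int end;

        Token(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Hypothetical stand-in for a token that covers a line break. */
    static class NewlineToken extends Token {
        NewlineToken(int begin, int end) {
            super(begin, end);
        }
    }

    /** Collects the begin/end offsets of every token except newline tokens. */
    static TreeSet<Integer> collectBoundaries(List<Token> tokens) {
        TreeSet<Integer> boundaries = new TreeSet<>();
        for (Token t : tokens) {
            // Skip newline tokens so they contribute no boundaries.
            if (t instanceof NewlineToken) {
                continue;
            }
            boundaries.add(t.begin);
            boundaries.add(t.end);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        // Offsets for the text "Hello\n\nWorld": two words, two newlines.
        List<Token> tokens = Arrays.asList(
                new Token(0, 5),
                new NewlineToken(5, 6),
                new NewlineToken(6, 7),
                new Token(7, 12));
        System.out.println(collectBoundaries(tokens));
    }
}
```

With the filter in place, offset 6 (the boundary between the two newlines)
never enters the set; without it, it would.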

Best,

VJ


On Thu, Dec 19, 2013 at 10:15 AM, Chen, Pei
<pei.c...@childrens.harvard.edu> wrote:

> Vj,
> Do you think this is what was causing the NPE's [1]?
> If so, shall we make the same fix in trunk?
> --Pei
>
> [1]
> http://mail-archives.apache.org/mod_mbox/ctakes-dev/201309.mbox/%3C924DE05C19409B438EB81DE683A942D9105A93CB%40CHEXMBX1A.CHBOSTON.ORG%3E
>
> -----Original Message-----
> From: vjapa...@apache.org [mailto:vjapa...@apache.org]
> Sent: Tuesday, December 17, 2013 9:15 PM
> To: comm...@ctakes.apache.org
> Subject: svn commit: r1551805 -
> /ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
>
> Author: vjapache
> Date: Wed Dec 18 02:14:13 2013
> New Revision: 1551805
>
> URL: http://svn.apache.org/r1551805
> Log:
> add support for sentences that contain newline tokens.
>
> Modified:
>
> ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
>
> Modified:
> ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> URL:
> http://svn.apache.org/viewvc/ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java?rev=1551805&r1=1551804&r2=1551805&view=diff
>
> ==============================================================================
> ---
> ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java
> (original)
> +++ ctakes/branches/ytex/ctakes-assertion/src/main/java/org/apache/ctakes/assertion/medfacts/i2b2/api/CharacterOffsetToLineTokenConverterCtakesImpl.java Wed Dec 18 02:14:13 2013
> @@ -32,8 +32,8 @@ import org.apache.uima.jcas.tcas.Annotat
>  import org.mitre.medfacts.i2b2.api.ApiConcept;
>  import org.mitre.medfacts.zoner.CharacterOffsetToLineTokenConverter;
>  import org.mitre.medfacts.zoner.LineAndTokenPosition;
> -
>  import org.apache.ctakes.typesystem.type.syntax.BaseToken;
> +import org.apache.ctakes.typesystem.type.syntax.NewlineToken;
>  import org.apache.ctakes.typesystem.type.textspan.Sentence;
>
>  public class CharacterOffsetToLineTokenConverterCtakesImpl implements
> CharacterOffsetToLineTokenConverter
> @@ -78,11 +78,13 @@ public class CharacterOffsetToLineTokenC
>           for (Annotation current : annotationIndex)
>           {
>                   BaseToken bt = (BaseToken)current;
> -                 int begin = bt.getBegin();
> -                 int end = bt.getEnd();
> -
> -                 tokenBeginEndTreeSet.add(begin);
> -                 tokenBeginEndTreeSet.add(end);
> +                 // filter out NewlineToken
> +                 if (!(bt instanceof NewlineToken)) {
> +                         int begin = bt.getBegin();
> +                         int end = bt.getEnd();
> +                         tokenBeginEndTreeSet.add(begin);
> +                         tokenBeginEndTreeSet.add(end);
> +                 }
>           }
>    }
>
>
>
>
