Dear Richard:

I am attaching a patch that adds a series of selectIntersects and
indexIntersects methods. There are more signatures than for the corresponding
selectCovered/selectCovering methods, because "intersects" may or may not
include covering annotations. If covering annotations are excluded, the
methods use the same approach you used in selectCovered, advancing the
iterator. Otherwise, they defer to the int-interval variant, which follows
the approach used in selectCovering.
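
To make the distinction concrete, here is a minimal sketch of the overlap
test on plain offsets. It is not part of the patch: SpanSketch and its
method names are hypothetical, and it assumes offsets are half-open
[begin, end) ranges, so two spans that merely touch do not intersect. The
patch itself compares with > and so also accepts spans that touch at the
end of the range.

```java
// Hypothetical helper, not part of uimaFIT or the attached patch.
// Assumption: offsets are half-open ranges [begin, end).
final class SpanSketch {
  private SpanSketch() {}

  /** True if [aBegin, aEnd) and [bBegin, bEnd) share at least one position. */
  static boolean overlaps(int aBegin, int aEnd, int bBegin, int bEnd) {
    return aBegin < bEnd && bBegin < aEnd;
  }

  /** True if span a extends beyond span b on both sides (same test as the patch uses). */
  static boolean strictlyCovers(int aBegin, int aEnd, int bBegin, int bEnd) {
    return aBegin < bBegin && aEnd > bEnd;
  }

  /** Overlap test for candidate a against range b, optionally keeping covering candidates. */
  static boolean intersects(int aBegin, int aEnd, int bBegin, int bEnd, boolean covering) {
    if (!overlaps(aBegin, aEnd, bBegin, bEnd)) {
      return false;
    }
    return covering || !strictlyCovers(aBegin, aEnd, bBegin, bEnd);
  }

  public static void main(String[] args) {
    // Candidate [3, 8) partially overlaps range [5, 10).
    System.out.println(intersects(3, 8, 5, 10, false));  // true
    // Candidate [0, 20) covers range [5, 10): kept only when covering is true.
    System.out.println(intersects(0, 20, 5, 10, false)); // false
    System.out.println(intersects(0, 20, 5, 10, true));  // true
    // Candidate [10, 15) only touches range [5, 10).
    System.out.println(intersects(10, 15, 5, 10, true)); // false
  }
}
```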

Maybe this is useful to someone else besides myself?

Also, I have no experience with unit testing, so I didn't even try to add
tests for the new methods. I did some naive testing by hand, and it
seems to work... but I'm particularly bad with interval operations, so I
wouldn't be surprised if I made some egregious error. My apologies in
advance.
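
One way such a hand check could be automated, sketched here on plain int
pairs rather than on a real CAS: compare a scan that stops early on a list
sorted by begin offset against a brute-force filter, over every query range.
IntersectScanCheck and its methods are hypothetical names, spans are
assumed to be half-open [begin, end), and none of this is uimaFIT API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical check, not part of uimaFIT. Each span is an int[]{begin, end},
// interpreted as the half-open range [begin, end).
final class IntersectScanCheck {
  private IntersectScanCheck() {}

  /** Reference implementation: test every span against the query range. */
  static List<int[]> bruteForce(List<int[]> spans, int begin, int end) {
    List<int[]> out = new ArrayList<>();
    for (int[] s : spans) {
      if (s[0] < end && begin < s[1]) {
        out.add(s);
      }
    }
    return out;
  }

  /** Scan a list sorted by begin offset, stopping once spans start at or after the query end. */
  static List<int[]> earlyBreak(List<int[]> sortedByBegin, int begin, int end) {
    List<int[]> out = new ArrayList<>();
    for (int[] s : sortedByBegin) {
      if (s[0] >= end) {
        break; // every later span begins even further right
      }
      if (s[1] > begin) {
        out.add(s);
      }
    }
    return out;
  }

  /** Element-wise comparison, since int[] does not override equals. */
  static boolean sameSpans(List<int[]> a, List<int[]> b) {
    if (a.size() != b.size()) {
      return false;
    }
    for (int i = 0; i < a.size(); i++) {
      if (!Arrays.equals(a.get(i), b.get(i))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<int[]> spans = new ArrayList<>(Arrays.asList(
        new int[] {10, 20}, new int[] {0, 4}, new int[] {5, 10},
        new int[] {2, 12}, new int[] {30, 40}, new int[] {9, 15}));
    spans.sort(Comparator.comparingInt((int[] s) -> s[0]));

    // Exhaustively compare both implementations on every non-empty query range.
    for (int b = 0; b <= 40; b++) {
      for (int e = b + 1; e <= 41; e++) {
        if (!sameSpans(earlyBreak(spans, b, e), bruteForce(spans, b, e))) {
          throw new AssertionError("mismatch for range [" + b + ", " + e + ")");
        }
      }
    }
    System.out.println("early-break scan agrees with brute force");
  }
}
```

The same idea would carry over to a JUnit test against a small CAS, with
the brute-force filter standing in as the reference result.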

Best,
jta


On Mon Jan 26 2015 at 3:19:37 PM José Tomás Atria <jtat...@gmail.com> wrote:

> Cool, I'll look into it and let you know if I manage to make something
> useful. Thanks for the tips.
>
> On Sun Jan 25 2015 at 12:47:52 PM Richard Eckart de Castilho <
> r...@apache.org> wrote:
>
>> Hi José,
>>
>> we had no need for such a method so far ;) The easiest way would probably
>> be to copy the
>> selectCovering method from uimaFIT and adjust it to catch all
>> intersecting annotations.
>> You can probably add an optimization to a selectIntersecting method which
>> breaks the loop as soon as the begin offset of an annotation is larger than
>> the end offset of your intersection range.
>>
>> Cheers,
>>
>> -- Richard
>>
>> On 24.01.2015, at 22:25, José Tomás Atria <jtat...@gmail.com> wrote:
>>
>> > Hello all,
>> >
>> > I am looking for the best approach to select all annotations of a given
>> > type that intersect an annotation of a different type.
>> >
>> > I am aware of selectCovered and selectCovering, which, as far as I
>> > understand, will select all annotations (of a given type) that cover
>> > ranges of text which are, respectively, subsets or supersets of another
>> > annotation. Is there a similar method for annotations that cover ranges
>> > which merely _intersect_ with the range covered by a given annotation?
>> >
>> > What would be the recommended way of achieving this?
>> >
>> > Any help would be appreciated. Thanks!
>> > jta.
>> >
>> > --
>> > entia non sunt multiplicanda praeter necessitatem
>>
>>
Index: src/main/java/org/apache/uima/fit/util/CasUtil.java
===================================================================
--- src/main/java/org/apache/uima/fit/util/CasUtil.java	(revision 1656160)
+++ src/main/java/org/apache/uima/fit/util/CasUtil.java	(working copy)
@@ -616,6 +616,237 @@
   }
 
   /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(Type, AnnotationFS, boolean)} if
+   * you want to include covering annotations, but this is significantly slower.
+   *
+   * @param type
+   *        the UIMA type of annotations to select
+   * @param intersect
+   *        the annotation to select intersects for
+   * @return
+   *        a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(Type type, AnnotationFS intersect) {
+    return selectIntersects(intersect.getView(), type, intersect, false);
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * @param type
+   *        the UIMA type of annotations to select
+   * @param intersect
+   *        the annotation to select intersects for
+   * @param covering
+   *        if true, covering annotations are included, but this will be slower.
+   * @return
+   *        a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(Type type, AnnotationFS intersect, boolean covering) {
+    return selectIntersects(intersect.getView(), type, intersect, covering);
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(CAS, Type, AnnotationFS, boolean)}
+   * if you want to include covering annotations, but this is significantly slower.
+   *
+   * @param cas
+   *        a CAS
+   * @param type
+   *        the UIMA type of annotations to select
+   * @param intersect
+   *        the annotation to select intersects for
+   * @return
+   *        a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(CAS cas, Type type, AnnotationFS intersect) {
+    return selectIntersects(cas, type, intersect, false);
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * @param cas
+   *        a CAS.
+   * @param type
+   *        the UIMA type of annotations to select.
+   * @param intersect
+   *        the annotation for which to select intersecting annotations.
+   * @param covering
+   *        if true, covering annotations are included, but this will be slower.
+   * @return
+   *        a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(CAS cas, Type type, AnnotationFS intersect, boolean covering) {
+    int begin = intersect.getBegin();
+    int end = intersect.getEnd();
+
+    if (covering) {
+      return selectIntersects(cas, type, begin, end, covering);
+    }
+
+    TypeSystem ts = cas.getTypeSystem();
+    List<AnnotationFS> list = new ArrayList<AnnotationFS>();
+    FSIterator<AnnotationFS> it = cas.getAnnotationIndex(type).iterator();
+
+    it.moveTo(intersect);
+
+    if (!it.isValid()) {
+      it.moveToLast();
+      if (!it.isValid()) {
+        return list;
+      }
+    }
+
+    while (it.isValid() && it.get().getBegin() >= begin) {
+      it.moveToPrevious();
+    }
+
+    if (!it.isValid()) {
+      it.moveToFirst();
+    }
+
+    while (it.isValid() && (it.get()).getEnd() < begin) {
+      it.moveToNext();
+    }
+
+    while (it.isValid()) {
+      // Walk forward, collecting candidates until one begins past the end of the range.
+      AnnotationFS a = it.get();
+      if (a.getBegin() > end) break;
+      it.moveToNext();
+      if (a.getEnd() <= begin) continue;
+
+      assert ( a.getEnd() >= begin ) : "Error";
+      assert ( a.getBegin() <= end ) : "Error";
+
+      if (!a.equals(intersect)) {
+        list.add(a);
+      }
+    }
+
+    return list;
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain range of the CAS text.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   *
+   * Because no annotation is passed in, nothing is filtered out by identity: an annotation that
+   * spans exactly the given range is returned like any other match.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(CAS, Type, int, int, boolean)}
+   * if you want to include covering annotations.
+   *
+   * <p>
+   * <b>Note:</b> this is significantly slower than using
+   * {@link #selectIntersects(CAS, Type, AnnotationFS)}. It is possible to use
+   * {@code  selectIntersects(cas, type, new Annotation(jCas, int, int))}, but that will allocate memory
+   * in the jCas for the new annotation. If you do that repeatedly many times, memory may fill up.
+   *
+   * @param cas
+   *          a CAS.
+   * @param type
+   *          the UIMA type of annotations to select.
+   * @param begin
+   *          begin offset.
+   * @param end
+   *          end offset.
+   * @return a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(CAS cas, Type type, int begin, int end) {
+    return selectIntersects( cas, type, begin, end, false );
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain range of the CAS text.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   *
+   * Because no annotation is passed in, nothing is filtered out by identity: an annotation that
+   * spans exactly the given range is returned like any other match.
+   *
+   * <p>
+   * <b>Note:</b> this is significantly slower than using
+   * {@link #selectIntersects(CAS, Type, AnnotationFS)}. It is possible to use
+   * {@code  selectIntersects(cas, type, new Annotation(jCas, int, int))}, but that will allocate memory
+   * in the jCas for the new annotation. If you do that repeatedly many times, memory may fill up.
+   *
+   * @param cas
+   *          a CAS.
+   * @param type
+   *          the UIMA type of annotations to select.
+   * @param begin
+   *          begin offset.
+   * @param end
+   *          end offset.
+   * @param covering
+   *        if true, covering annotations are included, but this will be slower.
+   * @return a list of annotations of the given type that intersect the given annotation.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static List<AnnotationFS> selectIntersects(CAS cas, Type type, int begin, int end, boolean covering) {
+    TypeSystem ts = cas.getTypeSystem();
+    List<AnnotationFS> list = new ArrayList<AnnotationFS>();
+    FSIterator<AnnotationFS> it = cas.getAnnotationIndex(type).iterator();
+
+    while (it.hasNext()) {
+      AnnotationFS a = it.next();
+
+      if (a.getBegin() > end) break;
+      if (!covering && (a.getBegin() < begin && a.getEnd() > end)) continue;
+
+      if (a.getEnd() > begin && (type == null || (ts.subsumes(type, a.getType())))) {
+        list.add(a);
+      }
+    }
+
+    return list;
+  }
+
+  /**
    * Create an index for quickly lookup up the annotations covering a particular annotation. This is
    * preferable to using {@link #selectCovering(CAS, Type, int, int)} because the overhead of
    * scanning the CAS occurs only when the index is build. Subsequent lookups to the index are fast.
@@ -700,6 +931,67 @@
   }
 
   /**
+   * Create an index for quickly looking up the annotations intersecting a particular annotation. This
+   * is preferable to using {@link #selectIntersects(CAS, Type, int, int)} because the overhead of
+   * scanning the CAS occurs only when the index is built. Subsequent lookups to the index are fast.
+   *
+   * Covering annotations are excluded from the index.
+   * Use {@code indexIntersects(cas, type, intersectType, true)} if you want to include covering annotations.
+   *
+   * @param cas
+   *          a CAS.
+   * @param type
+   *          type to create the index for - this is used in lookups.
+   * @param intersectType
+   *          type of intersecting annotations.
+   * @return the index.
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static Map<AnnotationFS, Collection<AnnotationFS>> indexIntersects( CAS cas, Type type, Type intersectType) {
+    return indexIntersects( cas, type, intersectType, false);
+  }
+
+  /**
+   * Create an index for quickly looking up the annotations intersecting a particular annotation. This
+   * is preferable to using {@link #selectIntersects(CAS, Type, int, int)} because the overhead of
+   * scanning the CAS occurs only when the index is built. Subsequent lookups to the index are fast.
+   *
+   * @param cas
+   *          a CAS.
+   * @param type
+   *          type to create the index for - this is used in lookups.
+   * @param intersectType
+   *          type of intersecting annotations.
+   * @param covering
+   *          if false, covering annotations are excluded.
+   * @return the index.
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static Map<AnnotationFS, Collection<AnnotationFS>> indexIntersects( CAS cas, Type type, Type intersectType, boolean covering ) {
+    Map<AnnotationFS, Collection<AnnotationFS>> index = new HashMap<AnnotationFS, Collection<AnnotationFS>>() {
+      private static final long serialVersionUID = 1L;
+
+      @Override
+      public Collection<AnnotationFS> get(Object key) {
+        Collection<AnnotationFS> res = super.get(key);
+        if (res == null) return emptyList();
+        else return res;
+      }
+    };
+    for (AnnotationFS s : select(cas, type)) {
+      for (AnnotationFS u : selectIntersects(cas, intersectType, s, covering)) {
+        Collection<AnnotationFS> c = index.get(s);
+        if ( c == EMPTY_LIST ) {
+          c = new LinkedList<AnnotationFS>();
+          index.put(s, c);
+        }
+        c.add(u);
+      }
+    }
+    return unmodifiableMap(index);
+  }
+
+  /**
    * This method exists simply as a convenience method for unit testing. It is not very efficient
    * and should not, in general be used outside the context of unit testing.
    * 
Index: src/main/java/org/apache/uima/fit/util/JCasUtil.java
===================================================================
--- src/main/java/org/apache/uima/fit/util/JCasUtil.java	(revision 1656160)
+++ src/main/java/org/apache/uima/fit/util/JCasUtil.java	(working copy)
@@ -407,6 +407,183 @@
   }
 
   /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(Class, AnnotationFS, boolean)} if
+   * you want to include covering annotations, but this is significantly slower.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param type
+   *          a UIMA type.
+   * @param intersect
+   *          the intersecting annotation.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(Class<T> type, AnnotationFS intersect) {
+    return cast(CasUtil.selectIntersects(CasUtil.getType(intersect.getCAS(), type), intersect));
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param type
+   *          a UIMA type.
+   * @param intersect
+   *          the intersecting annotation.
+   * @param covering
+   *          if true, covering annotations are included, but this will be slower.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(Class<T> type, AnnotationFS intersect, boolean covering) {
+    return cast(CasUtil.selectIntersects(CasUtil.getType(intersect.getCAS(), type), intersect, covering));
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(JCas, Class, AnnotationFS, boolean)}
+   * if you want to include covering annotations, but this is significantly slower.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          a UIMA type.
+   * @param intersect
+   *          the intersecting annotation.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(JCas jCas, Class<T> type, AnnotationFS intersect) {
+    return cast(CasUtil.selectIntersects(jCas.getCas(), getType(jCas, type), intersect));
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain annotation.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   * <p>
+   * The intersecting annotation is never returned itself, even if it is of the queried-for type or
+   * a subtype of that type.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          a UIMA type.
+   * @param intersect
+   *          the intersecting annotation.
+   * @param covering
+   *          if true, covering annotations are included, but this will be slower.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(JCas jCas, Class<T> type, AnnotationFS intersect, boolean covering) {
+    return cast(CasUtil.selectIntersects(jCas.getCas(), getType(jCas, type), intersect, covering));
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain range of the CAS text.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   *
+   * Because no annotation is passed in, nothing is filtered out by identity: an annotation that
+   * spans exactly the given range is returned like any other match.
+   *
+   * Covering annotations are excluded. Use {@link #selectIntersects(JCas, Class, int, int, boolean)}
+   * if you want to include covering annotations.
+   *
+   * <p>
+   * <b>Note:</b> this is significantly slower than using
+   * {@link #selectIntersects(JCas, Class, AnnotationFS)}. It is possible to use
+   * {@code  selectIntersects(jCas, type, new Annotation(jCas, int, int))}, but that will allocate memory
+   * in the jCas for the new annotation. If you do that repeatedly many times, memory may fill up.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          a UIMA type.
+   * @param begin
+   *          beginning offset.
+   * @param end
+   *          ending offset.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(JCas jCas, Class<T> type, int begin, int end) {
+    return cast(CasUtil.selectIntersects(jCas.getCas(), getType(jCas, type), begin, end));
+  }
+
+  /**
+   * Get a list of annotations of the given type that intersect a certain range of the CAS text.
+   * Iterates over all annotations of the given type to find intersecting annotations. Does not use
+   * subiterators and does not respect type priorities. Was adapted from {@link Subiterator}. Uses
+   * the same approach except that type priorities are ignored.
+   *
+   * Because no annotation is passed in, nothing is filtered out by identity: an annotation that
+   * spans exactly the given range is returned like any other match.
+   *
+   * <p>
+   * <b>Note:</b> this is significantly slower than using
+   * {@link #selectIntersects(JCas, Class, AnnotationFS)}. It is possible to use
+   * {@code  selectIntersects(jCas, type, new Annotation(jCas, int, int))}, but that will allocate memory
+   * in the jCas for the new annotation. If you do that repeatedly many times, memory may fill up.
+   *
+   * @param <T>
+   *          the JCas type.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          a UIMA type.
+   * @param begin
+   *          beginning offset.
+   * @param end
+   *          ending offset.
+   * @param covering
+   *          if true, covering annotations are included.
+   * @return a return value.
+   * @see Subiterator
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation> List<T> selectIntersects(JCas jCas, Class<T> type, int begin, int end, boolean covering) {
+    return cast(CasUtil.selectIntersects(jCas.getCas(), getType(jCas, type), begin, end, covering));
+  }
+
+  /**
    * Create an index for quickly lookup up the annotations covering a particular annotation. This is
    * preferable to using {@link #selectCovering(JCas, Class, int, int)} because the overhead of
    * scanning the CAS occurs only when the index is build. Subsequent lookups to the index are fast.
@@ -455,6 +632,52 @@
   }
 
   /**
+   * Create an index for quickly looking up the annotations intersecting a particular annotation. This
+   * is preferable to using {@link #selectIntersects(JCas, Class, int, int)} because the overhead of
+   * scanning the CAS occurs only when the index is built. Subsequent lookups to the index are fast.
+   *
+   * @param <T>
+   *          the JCas type to index for.
+   * @param <S>
+   *          the JCas type to include in the index.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          type to create the index for - this is used in lookups.
+   * @param intersectType type of intersecting annotations.
+   * @return the index.
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation, S extends Annotation> Map<T, Collection<S>> indexIntersects(
+          JCas jCas, Class<T> type, Class<S> intersectType ) {
+    return cast(CasUtil.indexIntersects(jCas.getCas(), getType(jCas, type), getType(jCas, intersectType)));
+  }
+
+  /**
+   * Create an index for quickly looking up the annotations intersecting a particular annotation. This
+   * is preferable to using {@link #selectIntersects(JCas, Class, int, int)} because the overhead of
+   * scanning the CAS occurs only when the index is built. Subsequent lookups to the index are fast.
+   *
+   * @param <T>
+   *          the JCas type to index for.
+   * @param <S>
+   *          the JCas type to include in the index.
+   * @param jCas
+   *          a JCas.
+   * @param type
+   *          type to create the index for - this is used in lookups.
+   * @param intersectType type of intersecting annotations.
+   * @param covering
+   *          if true, the index will include covering annotations.
+   * @return the index.
+   * @see <a href="package-summary.html#SortOrder">Order of selected feature structures</a>
+   */
+  public static <T extends Annotation, S extends Annotation> Map<T, Collection<S>> indexIntersects(
+          JCas jCas, Class<T> type, Class<S> intersectType, boolean covering ) {
+    return cast(CasUtil.indexIntersects(jCas.getCas(), getType(jCas, type), getType(jCas, intersectType), covering));
+  }
+
+  /**
    * Check if the given annotation contains any annotation of the given type.
    * 
    * @param jCas
