Marshall Schor wrote:
>
> Thilo Goetz wrote:
>> Adam Lally wrote:
>>
>>> On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schor<m...@schor.com> wrote:
>>>
>>>> Adam Lally wrote:
>>>>
>>>>> On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmann<kottm...@gmail.com> wrote:
>>>>>
>>>>>> Marshall Schor wrote:
>>>>>>
>>>>>>> The generification of FSIndex currently specifies one type, <T extends
>>>>>>> FeatureStructure> that is the type of item being returned.
>>>>>>>
>>>>>>> The contains and find methods have arguments of type FeatureStructure.
>>>>>>> These could be changed to take type "T".
>>>>>>>
>>>>>> No, I do not think that they could be changed to take type T.
>>>>>> Let's take the case of the contains method.
>>>>>> The javadoc says:
>>>>>> "Check if the index contains an element equal to the given feature
>>>>>> structure according to the ordering of the index. Note that this is
>>>>>> in general not the same as feature structure identity."
>>>>>> and for the param fs it says "The FS we're looking for.". There is no
>>>>>> place where it says that contains can only be called for FSes which
>>>>>> have the type of the index.
>>>>>>
>>>>>> The change of the parameter from FeatureStructure to T would also change
>>>>>> the contract of the method a little, because then it would no longer be
>>>>>> possible to pass a FeatureStructure which does not have type T.
>>>>>>
>>>>> I agree. It's sometimes useful to call the FSIterator.moveTo method and
>>>>> pass an FS of a Type other than the one that the index was defined
>>>>> over, as part of implementing something like a subiterator.
>>>>>
>>>> I agree with the case where it's the "bag" index being used, because
>>>> that uses a test which works on all feature structures.
>>>>
>>>> However, for Set and Sorted, the implication of passing a FS which is
>>>> not in the type hierarchy is, according to the JavaDocs, "undefined".
>>>> This is because the code assumes the layout of the features and their
>>>> values is appropriate for the type. In other words, if the type of some
>>>> key was a string, it might take a value from the main int heap and use
>>>> it as an index into the string array - and if the int heap object was
>>>> not the right type, it could pull an arbitrary value from that slot, and
>>>> end up throwing an array index out of bounds exception. When I looked,
>>>> it didn't appear to me that the code checked for any kind of type
>>>> subsumption before proceeding... (but I may have missed it...)
>>>>
>>>> It could turn out that the data (whatever is being pulled) would just
>>>> happen to "match", even though the types are different.
>>>>
>>>> Because this is an undefined operation that could throw various kinds of
>>>> runtime errors, or return an equal match where none really exists, I
>>>> think it should not be allowed, for set and sorted indexes.
>>>>
>>> In the case of the AnnotationIndex, the object that you pass to
>>> FSIterator.moveTo must be a subtype of Annotation (else you would get
>>> all the weird effects that you describe). But it is still valid for a
>>> user to do:
>>> FSIndex<AnnotType1> index = cas.getAnnotationIndex(annotType1);
>>> index.moveTo(annotType2);
>>>
>>> where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.
>>>
>>> In general, the object that you pass to moveTo() must be a subtype of
>>> the type that was in the index definition (in the user's descriptor,
>>> or for the case of the built-in AnnotationIndex,
>>> uima.tcas.Annotation).
>>>
>>> -Adam
>>>
>> One concrete example of Adam's point: suppose you have a sentenceFS
>> and a tokenIterator. Then tokenIterator.moveTo(sentenceFS) will
>> position the token iterator at the first word in the sentence (modulo
>> some subtleties that are beside the point here). Very useful.
>>
>
> Yes.
> This works because the "index" being iterated over is the general
> annotation index (the one that's built-in) - and the presumption is
> that "token" and "sentenceFS" are both subtypes of AnnotationFS.
>
> Is there any reason *not* to add a check to the various methods in
> indexing that take one or more Feature Structures, to see if they are
> being passed a subtype of the type being indexed (except for bag indexes)?

Sure. Suppose B < A, and we have an index on B, but the ordering
relation is defined in terms of features that B inherits from A.
Then I can use an A to position iterators on the B index. This
works now, and I don't see why we should prohibit it.

> -Marshall
>
>> --Thilo
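
To make the B < A case concrete, here is a minimal sketch in plain Java. It is not UIMA code and does not use the UIMA API: the classes A, B and SortedIndex, the `begin` and `label` fields, and `sampleIndex()` are all hypothetical stand-ins. The assumption is simply that the index holds only B's while its ordering reads a feature declared on A, so a plain A is a well-defined probe for positioning.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java sketch; every name here is hypothetical, none of it is UIMA API.
class SupertypeProbeSketch {

    /** Stand-in for supertype A: declares the feature the index orders by. */
    static class A {
        final int begin;
        A(int begin) { this.begin = begin; }
    }

    /** Stand-in for subtype B (B < A): inherits begin, adds a feature of its own. */
    static class B extends A {
        final String label;
        B(int begin, String label) { super(begin); this.label = label; }
    }

    /** Sorted index over T whose ordering only needs features of K, a supertype of T. */
    static class SortedIndex<K, T extends K> {
        private final List<T> items = new ArrayList<>();
        private final Comparator<K> order;

        SortedIndex(Comparator<K> order) { this.order = order; }

        /** Inserts item at its sorted position. */
        void add(T item) { items.add(moveTo(item), item); }

        /** Position of the first element not less than probe; probe only has to be a K. */
        int moveTo(K probe) {
            int lo = 0, hi = items.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (order.compare(items.get(mid), probe) < 0) lo = mid + 1;
                else hi = mid;
            }
            return lo;
        }

        T get(int pos) { return items.get(pos); }
    }

    /** Builds an index of B's ordered by the feature inherited from A. */
    static SortedIndex<A, B> sampleIndex() {
        SortedIndex<A, B> index =
            new SortedIndex<A, B>(Comparator.comparingInt((A a) -> a.begin));
        index.add(new B(9, "z"));
        index.add(new B(0, "x"));
        index.add(new B(5, "y"));
        return index;
    }

    public static void main(String[] args) {
        SortedIndex<A, B> index = sampleIndex();
        int pos = index.moveTo(new A(4));          // probe is an A, not a B
        System.out.println(index.get(pos).label);  // prints "y"
    }
}
```

In this sketch, a signature like moveTo(T) would reject the A probe at compile time even though the comparison is perfectly well defined for it. A runtime check for "is a subtype of the indexed type" would reject it in the same way.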