Re: Generification of FSIndex

2009-08-28 Thread Marshall Schor


Jörn Kottmann wrote:
 Marshall Schor wrote:
 Jörn Kottmann wrote:
  
 Marshall Schor wrote:

 OK - sounds like this won't be on the fix list for 2.3.0 ;-)
 And, given the value of being able to do this in selected cases, it
 doesn't sound right, either, to change the arguments to find, moveTo,
 etc from FeatureStructure.
 So let's leave this as is. 
 I created a jira issue to change moveTo back to FeatureStructure
 argument,
 it was not changed yet because I misunderstood you.

 Which other methods are you referring to ?
 

 The other methods that take a FeatureStructure for purposes of
 specifying a location in an index:

 For these, I'm not sure if the test for equality includes testing that
 the types are the same, or just is testing (for sorted and set
 indexes) that the declared keys are equal.

 The methods are:

 contains
 find
 compare (although the javadoc says the result is undefined if the FS's
 are not of the type of the index, but I wonder if that's checked?)
 iterator(fs)
   

 Here are the declarations of these methods:

 boolean contains(FeatureStructure fs);
 FeatureStructure find(FeatureStructure fs);
 Maybe we should return T here ?
+1, since FSIndex which has this method is already parameterized on T -
it should be returning only things of type T.

 FeatureStructure find(FeatureStructure fs);
 I think it makes sense to return a FS of type T here.
+1

 int compare(FeatureStructure fs1, FeatureStructure fs2);
 If we use T here, it would enforce the API specification a little,
 because at least the java type must match.
I think we should keep this, because it could change the behavior in
this case:

Suppose you have an index over Token, and you have an instance of
Sentence, and want to see if a token is in a sentence.  Assuming the
Token index uses the same keys as the base Annotation Index, doing a
compare between Token and Sentence would be useful and make sense.

-Marshall


Re: Generification of FSIndex

2009-08-26 Thread Jörn Kottmann

Marshall Schor wrote:
OK - sounds like this won't be on the fix list for 2.3.0 ;-) 


And, given the value of being able to do this in selected cases, it
doesn't sound right, either, to change the arguments to find, moveTo,
etc from FeatureStructure.
So let's leave this as is. 
  

I created a jira issue to change moveTo back to FeatureStructure argument,
it was not changed yet because I misunderstood you.

Which other methods are you referring to ?

Jörn


Re: Generification of FSIndex

2009-08-26 Thread Marshall Schor


Jörn Kottmann wrote:
 Marshall Schor wrote:
 OK - sounds like this won't be on the fix list for 2.3.0 ;-)
 And, given the value of being able to do this in selected cases, it
 doesn't sound right, either, to change the arguments to find, moveTo,
 etc from FeatureStructure.
 So let's leave this as is.   
 I created a jira issue to change moveTo back to FeatureStructure
 argument,
 it was not changed yet because I misunderstood you.

 Which other methods are you referring to ?

The other methods that take a FeatureStructure for purposes of
specifying a location in an index:

For these, I'm not sure if the test for equality includes testing that
the types are the same, or just is testing (for sorted and set
indexes) that the declared keys are equal.

The methods are:

contains
find
compare (although the javadoc says the result is undefined if the FS's
are not of the type of the index, but I wonder if that's checked?)
iterator(fs)

-Marshall

 Jörn




Re: Generification of FSIndex

2009-08-26 Thread Thilo Goetz
Marshall Schor wrote:
 
 Jörn Kottmann wrote:
 Marshall Schor wrote:
 OK - sounds like this won't be on the fix list for 2.3.0 ;-)
 And, given the value of being able to do this in selected cases, it
 doesn't sound right, either, to change the arguments to find, moveTo,
 etc from FeatureStructure.
 So let's leave this as is.   
 I created a jira issue to change moveTo back to FeatureStructure
 argument,
 it was not changed yet because I misunderstood you.

 Which other methods are you referring to ?
 
 The other methods that take a FeatureStructure for purposes of
 specifying a location in an index:
 
 For these, I'm not sure if the test for equality includes testing that
 the types are the same, or just is testing (for sorted and set
 indexes) that the declared keys are equal.
 
 The methods are:
 
 contains
 find
 compare (although the javadoc says the result is undefined if the FS's
 are not of the type of the index, but I wonder if that's checked?)

I looked yesterday, there is no check.  There is definitely room
for improvement here.  I'll open a Jira issue, but I'm not proposing
to do anything for this release.

--Thilo

 iterator(fs)
 
 -Marshall
 
 Jörn





Re: Generification of FSIndex

2009-08-11 Thread Thilo Goetz
Adam Lally wrote:
 On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 Adam Lally wrote:
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:

 Marshall Schor wrote:

 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature 
 structure
 according to the
 ordering of the index. Note that this is in general not the same as feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of 
 the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.


 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.

 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.

 
 In the case of the AnnotationIndex, the object that you pass to
 FSIterator.moveTo must be a subtype of Annotation (else you would get
 all the weird effects that you describe).  But it is still valid for a
 user to do:
 FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
 index.moveTo(annotType2);
 
 where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.
 
 In general, the object that you pass to moveTo() must be a subtype of
 the type that was in the index definition (in the user's descriptor,
 or for the case of the built-in AnnotationIndex,
 uima.tcas.Annotation).
 
   -Adam

One concrete example of Adam's point: suppose you have a sentenceFS
and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
position the token iterator at the first word in the sentence (modulo
some subtleties that are beside the point here).  Very useful.

--Thilo



Re: Generification of FSIndex

2009-08-11 Thread Marshall Schor


Thilo Goetz wrote:
 Adam Lally wrote:
   
 On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 
 Adam Lally wrote:
   
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:

 
 Marshall Schor wrote:

   
 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature 
 structure
 according to the
 ordering of the index. Note that this is in general not the same as 
 feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of 
 the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.


   
 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.

 
 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.

   
 In the case of the AnnotationIndex, the object that you pass to
 FSIterator.moveTo must be a subtype of Annotation (else you would get
 all the weird effects that you describe).  But it is still valid for a
 user to do:
 FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
 index.moveTo(annotType2);

 where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.

 In general, the object that you pass to moveTo() must be a subtype of
 the type that was in the index definition (in the user's descriptor,
 or for the case of the built-in AnnotationIndex,
 uima.tcas.Annotation).

   -Adam
 

 One concrete example of Adam's point: suppose you have a sentenceFS
 and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
 position the token iterator at the first word in the sentence (modulo
 some subtleties that are beside the point here).  Very useful.
   

Yes.  This works because the index being iterated over is the general
annotation index (the one that's built-it) - and the presumeption is
that token and sentenceFS are both subtypes of AnnotationFS.

Is there any reason *not* to add a check to the various methods in
indexing that take one or more Feature Structures, to see if they are
being passed a subtype of the type being indexed (except for bag indexes)?

-Marshall


 --Thilo



   


Re: Generification of FSIndex

2009-08-11 Thread Thilo Goetz
Marshall Schor wrote:
 
 Thilo Goetz wrote:
 Adam Lally wrote:
   
 On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 
 Adam Lally wrote:
   
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:

 
 Marshall Schor wrote:

   
 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature 
 structure
 according to the
 ordering of the index. Note that this is in general not the same as 
 feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of 
 the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be 
 possible
 anymore
 to pass a FeatureStructure which has not type T.


   
 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.

 
 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.

   
 In the case of the AnnotationIndex, the object that you pass to
 FSIterator.moveTo must be a subtype of Annotation (else you would get
 all the weird effects that you describe).  But it is still valid for a
 user to do:
 FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
 index.moveTo(annotType2);

 where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.

 In general, the object that you pass to moveTo() must be a subtype of
 the type that was in the index definition (in the user's descriptor,
 or for the case of the built-in AnnotationIndex,
 uima.tcas.Annotation).

   -Adam
 
 One concrete example of Adam's point: suppose you have a sentenceFS
 and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
 position the token iterator at the first word in the sentence (modulo
 some subtleties that are beside the point here).  Very useful.
   
 
 Yes.  This works because the index being iterated over is the general
 annotation index (the one that's built-it) - and the presumeption is
 that token and sentenceFS are both subtypes of AnnotationFS.
 
 Is there any reason *not* to add a check to the various methods in
 indexing that take one or more Feature Structures, to see if they are
 being passed a subtype of the type being indexed (except for bag indexes)?

Sure.  Suppose B  A, and we have an index on B, but the ordering
relation is defined in terms of features that B inherits from
   A.
Then I can use an A to position iterators on the B index.  This
works now, and I don't see why we should prohibit it.


 -Marshall
 
 
 --Thilo



   


Re: Generification of FSIndex

2009-08-11 Thread Thilo Goetz


Marshall Schor wrote:
 
 Thilo Goetz wrote:
 Marshall Schor wrote:
   
 Thilo Goetz wrote:
 
 Adam Lally wrote:
   
   
 On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 
 
 Adam Lally wrote:
   
   
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com 
 wrote:

 
 
 Marshall Schor wrote:

   
   
 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 
 
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature 
 structure
 according to the
 ordering of the index. Note that this is in general not the same as 
 feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is 
 no
 place where
 it says that contains can only be called for FSes which have the type 
 of the
 index.

 The change of the parameter from FeatureStructure to T would also 
 change
 the contract of the method a little, because then it would not be 
 possible
 anymore
 to pass a FeatureStructure which has not type T.


   
   
 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.

 
 
 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.

   
   
 In the case of the AnnotationIndex, the object that you pass to
 FSIterator.moveTo must be a subtype of Annotation (else you would get
 all the weird effects that you describe).  But it is still valid for a
 user to do:
 FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
 index.moveTo(annotType2);

 where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.

 In general, the object that you pass to moveTo() must be a subtype of
 the type that was in the index definition (in the user's descriptor,
 or for the case of the built-in AnnotationIndex,
 uima.tcas.Annotation).

   -Adam
 
 
 One concrete example of Adam's point: suppose you have a sentenceFS
 and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
 position the token iterator at the first word in the sentence (modulo
 some subtleties that are beside the point here).  Very useful.
   
   
 Yes.  This works because the index being iterated over is the general
 annotation index (the one that's built-it) - and the presumeption is
 that token and sentenceFS are both subtypes of AnnotationFS.

 Is there any reason *not* to add a check to the various methods in
 indexing that take one or more Feature Structures, to see if they are
 being passed a subtype of the type being indexed (except for bag indexes)?
 
 Sure.  Suppose B  A, and we have an index on B, but the ordering
 relation is defined in terms of features that B inherits from
A.
 Then I can use an A to position iterators on the B index.  This
 works now, and I don't see why we should prohibit it.
   
 
 Right, I knew I was missing some detail.  So the test for validity is
 not a simple subsumes test, but rather a check to see if the feature
 structure slots used in the keys used for testing by the index are in
 the same position in the feature structure layout as the type being
 passed in.  Correct? 

Essentially, yes.

 
 If so, is there any reason not to add this as a check?

No reason other than that the relevant information is
thrown away after the index has been generated, and it
would be a medium size effort to implement.  I haven't
looked

Re: Generification of FSIndex

2009-08-11 Thread Marshall Schor


Thilo Goetz wrote:
 Marshall Schor wrote:
   
 Thilo Goetz wrote:
 
 Marshall Schor wrote:
   
   
 Thilo Goetz wrote:
 
 
 Adam Lally wrote:
   
   
   
 On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 
 
 
 Adam Lally wrote:
   
   
   
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com 
 wrote:

 
 
 
 Marshall Schor wrote:

   
   
   
 The generification of FSIndex currently specifies one type, T 
 extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type 
 FeatureStructure.
 These could be changed to take type T.


 
 
 
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature 
 structure
 according to the
 ordering of the index. Note that this is in general not the same as 
 feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is 
 no
 place where
 it says that contains can only be called for FSes which have the type 
 of the
 index.

 The change of the parameter from FeatureStructure to T would also 
 change
 the contract of the method a little, because then it would not be 
 possible
 anymore
 to pass a FeatureStructure which has not type T.


   
   
   
 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.

 
 
 
 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.

   
   
   
 In the case of the AnnotationIndex, the object that you pass to
 FSIterator.moveTo must be a subtype of Annotation (else you would get
 all the weird effects that you describe).  But it is still valid for a
 user to do:
 FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
 index.moveTo(annotType2);

 where annotType1 and annotType2 are both subtypes of 
 uima.tcas.Annotation.

 In general, the object that you pass to moveTo() must be a subtype of
 the type that was in the index definition (in the user's descriptor,
 or for the case of the built-in AnnotationIndex,
 uima.tcas.Annotation).

   -Adam
 
 
 
 One concrete example of Adam's point: suppose you have a sentenceFS
 and a tokenIterator.  Then tokenIterator.moveTo(sentenceFS) will
 position the token iterator at the first word in the sentence (modulo
 some subtleties that are beside the point here).  Very useful.
   
   
   
 Yes.  This works because the index being iterated over is the general
 annotation index (the one that's built-it) - and the presumeption is
 that token and sentenceFS are both subtypes of AnnotationFS.

 Is there any reason *not* to add a check to the various methods in
 indexing that take one or more Feature Structures, to see if they are
 being passed a subtype of the type being indexed (except for bag indexes)?
 
 
 Sure.  Suppose B  A, and we have an index on B, but the ordering
 relation is defined in terms of features that B inherits from
A.
 Then I can use an A to position iterators on the B index.  This
 works now, and I don't see why we should prohibit it.
   
   
 Right, I knew I was missing some detail.  So the test for validity is
 not a simple subsumes test, but rather a check to see if the feature
 structure slots used in the keys used for testing by the index are in
 the same position in the feature structure layout as the type being
 passed in.  Correct

Generification of FSIndex

2009-08-10 Thread Marshall Schor
The generification of FSIndex currently specifies one type, T extends
FeatureStructure that is the type of item being returned.


The contains and find methods have arguments of type FeatureStructure. 
These could be changed to take type T.

I think the current implementation has maybe a bug - if you use contains
or find methods, and pass in Feature Structures whose types are not a T
or a subtype of T, the index comparator code (I think) doesn't check if
the argument is of the right type, and just presumes it can ask for
feature values from the type to use in the testing.

So it would be an improvement to require that these arguments be ?
extends T, I think.

The compare is similar - I think both args should be ? extends T

Other opinions?

-Marshall


Re: Generification of FSIndex

2009-08-10 Thread Jörn Kottmann

Jörn Kottmann wrote:

No I do not think that they could be changed to take type T.
Lets take the case of the contains method.
The javadoc says:
Check if the index contains an element equal to the given feature 
structure according to the
ordering of the index. Note that this is in general not the same as 
feature structure identity.
and it for the param fs it says The FS we're looking for.. There is 
no place where
it says that contains can only be called for FSes which have the type 
of the index.


The change of the parameter from FeatureStructure to T would also change
the contract of the method a little, because then it would not be 
possible anymore

to pass a FeatureStructure which has not type T.

The java developers had similar case in the Collection interface.
Collection has a contains method which has still the signature 
contains(Object)

and not contains(E).

See here:
http://java.sun.com/javase/6/docs/api/java/util/Collection.html

Jörn


Re: Generification of FSIndex

2009-08-10 Thread Jörn Kottmann

Marshall Schor wrote:

The generification of FSIndex currently specifies one type, T extends
FeatureStructure that is the type of item being returned.


The contains and find methods have arguments of type FeatureStructure. 
These could be changed to take type T.
  

No I do not think that they could be changed to take type T.
Lets take the case of the contains method.
The javadoc says:
Check if the index contains an element equal to the given feature 
structure according to the
ordering of the index. Note that this is in general not the same as 
feature structure identity.
and it for the param fs it says The FS we're looking for.. There is no 
place where
it says that contains can only be called for FSes which have the type of 
the index.


The change of the parameter from FeatureStructure to T would also change
the contract of the method a little, because then it would not be 
possible anymore

to pass a FeatureStructure which has not type T.

Jörn




Re: Generification of FSIndex

2009-08-10 Thread Adam Lally
On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:
 Marshall Schor wrote:

 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature structure
 according to the
 ordering of the index. Note that this is in general not the same as feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.


I agree.  It's sometimes useful to call FSIterator.moveTo method and
pass an FS of a Type other than the one that the index was defined
over, as part of implementing something like a subiterator.
  -Adam


Re: Generification of FSIndex

2009-08-10 Thread Marshall Schor


Adam Lally wrote:
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:
   
 Marshall Schor wrote:
 
 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.

   
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature structure
 according to the
 ordering of the index. Note that this is in general not the same as feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.

 

 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.
   

I agree with the case where it's the bag index being used, because
that uses a test which works on all feature structures.

However, for Set and Sorted, the implication of passing a FS which is
not in the type hierarchy is, according to the JavaDocs, undefined. 
This is because the code assumes the layout of the features and their
values is appropriate for the type.  In other words, if the type of some
key was a string, it might take a value from the main int heap and use
it as an index into the string array - and if the int heap object was
not the right type, it could pull an arbitrary value from that slot, and
end up throwing an array index out of bounds exception.  When I looked,
it didn't appear to me that the code checked for any kind of type
subsumption before proceeding... (but I may have missed it...)

It could turn out that the data (whatever is being pulled) would just
happen to match, even though the types are different.

Because this is an undefined operation that could throw various kinds of
runtime errors, or return an equal match where none really exists, I
think it should not be allowed, for set and sorted indexes.

-Marshall
   -Adam


   


Re: Generification of FSIndex

2009-08-10 Thread Marshall Schor


Marshall Schor wrote:
 Adam Lally wrote:
   
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:
   
 
 Marshall Schor wrote:
 
   
 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.

   
 
 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature structure
 according to the
 ordering of the index. Note that this is in general not the same as feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.

 
   
 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.
   
 

 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined. 
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.
   

Or - we should change the implementation of these methods to check for
type subsumption of the arguments before calling the low-level compare code.

-Marshall
 -Marshall
   
   -Adam


   
 


   


Re: Generification of FSIndex

2009-08-10 Thread Adam Lally
On Mon, Aug 10, 2009 at 5:32 PM, Marshall Schorm...@schor.com wrote:
 Adam Lally wrote:
 On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:

 Marshall Schor wrote:

 The generification of FSIndex currently specifies one type, T extends
 FeatureStructure that is the type of item being returned.


 The contains and find methods have arguments of type FeatureStructure.
 These could be changed to take type T.


 No I do not think that they could be changed to take type T.
 Lets take the case of the contains method.
 The javadoc says:
 Check if the index contains an element equal to the given feature structure
 according to the
 ordering of the index. Note that this is in general not the same as feature
 structure identity.
 and it for the param fs it says The FS we're looking for.. There is no
 place where
 it says that contains can only be called for FSes which have the type of the
 index.

 The change of the parameter from FeatureStructure to T would also change
 the contract of the method a little, because then it would not be possible
 anymore
 to pass a FeatureStructure which has not type T.



 I agree.  It's sometimes useful to call FSIterator.moveTo method and
 pass an FS of a Type other than the one that the index was defined
 over, as part of implementing something like a subiterator.


 I agree with the case where it's the bag index being used, because
 that uses a test which works on all feature structures.

 However, for Set and Sorted, the implication of passing a FS which is
 not in the type hierarchy is, according to the JavaDocs, undefined.
 This is because the code assumes the layout of the features and their
 values is appropriate for the type.  In other words, if the type of some
 key was a string, it might take a value from the main int heap and use
 it as an index into the string array - and if the int heap object was
 not the right type, it could pull an arbitrary value from that slot, and
 end up throwing an array index out of bounds exception.  When I looked,
 it didn't appear to me that the code checked for any kind of type
 subsumption before proceeding... (but I may have missed it...)

 It could turn out that the data (whatever is being pulled) would just
 happen to match, even though the types are different.

 Because this is an undefined operation that could throw various kinds of
 runtime errors, or return an equal match where none really exists, I
 think it should not be allowed, for set and sorted indexes.


In the case of the AnnotationIndex, the object that you pass to
FSIterator.moveTo must be a subtype of Annotation (else you would get
all the weird effects that you describe).  But it is still valid for a
user to do:
FSIndexAnnotType1 index = cas.getAnnotationIndex(annotType1);
index.moveTo(annotType2);

where annotType1 and annotType2 are both subtypes of uima.tcas.Annotation.

In general, the object that you pass to moveTo() must be a subtype of
the type that was in the index definition (in the user's descriptor,
or for the case of the built-in AnnotationIndex,
uima.tcas.Annotation).

  -Adam


Re: Generification of FSIndex

2009-08-10 Thread Jörn Kottmann

Marshall Schor wrote:

Marshall Schor wrote:
  

Adam Lally wrote:
  


On Mon, Aug 10, 2009 at 4:07 PM, Jörn Kottmannkottm...@gmail.com wrote:
  

  

Marshall Schor wrote:

  


The generification of FSIndex currently specifies one type, T extends
FeatureStructure that is the type of item being returned.


The contains and find methods have arguments of type FeatureStructure.
These could be changed to take type T.

  

  

No I do not think that they could be changed to take type T.
Lets take the case of the contains method.
The javadoc says:
Check if the index contains an element equal to the given feature structure
according to the
ordering of the index. Note that this is in general not the same as feature
structure identity.
and it for the param fs it says The FS we're looking for.. There is no
place where
it says that contains can only be called for FSes which have the type of the
index.

The change of the parameter from FeatureStructure to T would also change
the contract of the method a little, because then it would not be possible
anymore
to pass a FeatureStructure which has not type T.


  


I agree.  It's sometimes useful to call FSIterator.moveTo method and
pass an FS of a Type other than the one that the index was defined
over, as part of implementing something like a subiterator.
  

  

I agree with the case where it's the bag index being used, because
that uses a test which works on all feature structures.

However, for Set and Sorted, the implication of passing a FS which is
not in the type hierarchy is, according to the JavaDocs, undefined. 
This is because the code assumes the layout of the features and their

values is appropriate for the type.  In other words, if the type of some
key was a string, it might take a value from the main int heap and use
it as an index into the string array - and if the int heap object was
not the right type, it could pull an arbitrary value from that slot, and
end up throwing an array index out of bounds exception.  When I looked,
it didn't appear to me that the code checked for any kind of type
subsumption before proceeding... (but I may have missed it...)

It could turn out that the data (whatever is being pulled) would just
happen to match, even though the types are different.

Because this is an undefined operation that could throw various kinds of
runtime errors, or return an equal match where none really exists, I
think it should not be allowed, for set and sorted indexes.
  



Or - we should change the implementation of these methods to check for
type subsumption of the arguments before calling the low-level compare code.
  

The implementation should handle this case correctly, thats the only
way to fix it. With generification it can not be fixed because a client 
would

still be allowed to use a FSIndex as a raw type.

Jörn