[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15591065#comment-15591065 ] Peter Klügl commented on UIMA-5115: --- Sorry, I haven't found the time lately to follow your specs and discussions completely. I'll try to catch up next week. My first guess would be default=false. > uv3 select() api for iterators and streams over CAS contents > > > Key: UIMA-5115 > URL: https://issues.apache.org/jira/browse/UIMA-5115 > Project: UIMA > Issue Type: New Feature > Components: Core Java Framework >Reporter: Marshall Schor >Priority: Minor > Fix For: 3.0.0SDKexp > > > Design and implement a select() API based on uimaFIT's select, integrated > well with Java 8 concepts. Initial discussions in UIMA-1524. Wiki with > diagram: https://cwiki.apache.org/confluence/display/UIMA/UV3+Iterator+support -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15589120#comment-15589120 ] Marshall Schor commented on UIMA-5115: -- Peter, do you have an opinion about the defaulting of "includeAnnotationsWithEndsBeyondBounds"? This is an alternative, which in Subiterator would be called "not strict". In the current defaulting, if not specified (value is false), it means: while traversing a coveredBy(..) or between(..) selection (one with bounds, where you're going within those bounds), skip any annotations which start within the bounds but whose "end" position lies outside the bounds. This, too, is a kind of automatic (silent) skipping of some annotations. But it only applies to these two kinds of selections (coveredBy(..) and between(..)), and it seems both of those fairly strongly imply this skipping is wanted.
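The skipping rule described in the comment above can be sketched in plain Java. This is only an illustration under stated assumptions: `Span`, `CoveredBySketch`, and the boolean parameter are hypothetical stand-ins, not UIMA classes, and the real select() implementation works on an annotation index rather than a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an annotation: just a begin/end pair. */
final class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

final class CoveredBySketch {
    /**
     * Returns the spans that start within [boundBegin, boundEnd].
     * When includeEndsBeyondBounds is false, spans that start inside the
     * bounds but end past boundEnd are skipped without any notification.
     */
    static List<Span> coveredBy(List<Span> spans, int boundBegin, int boundEnd,
                                boolean includeEndsBeyondBounds) {
        List<Span> result = new ArrayList<>();
        for (Span s : spans) {
            boolean startsInside = s.begin >= boundBegin && s.begin <= boundEnd;
            boolean endsInside = s.end <= boundEnd;
            if (startsInside && (endsInside || includeEndsBeyondBounds)) {
                result.add(s);
            }
        }
        return result;
    }
}
```

With bounds 0..10 and spans (0,5), (3,12), (6,10), (11,14), the strict default keeps two spans; setting the flag additionally keeps (3,12).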
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588684#comment-15588684 ] Richard Eckart de Castilho commented on UIMA-5115: -- Fine by me. In NLP contexts, it will probably just work because annotations tend to be non-overlapping anyway. It is probably not necessary to be extra-strict.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588632#comment-15588632 ] Peter Klügl commented on UIMA-5115: --- change default for AnnotationIndex processing to non-overlapped (unambiguous): I strongly suggest not setting the default to non-overlapping. In addition to Marshall's arguments, there are many annotations beyond those for plain NLP which overlap all the time. I already see people debugging because they missed the flag...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582461#comment-15582461 ]
Marshall Schor commented on UIMA-5115:

Richard made many useful comments on the first cut of the select documentation (pdf) using the Adobe commenting tool. I'm bringing some of those into the Jira (others are good suggestions that I'll incorporate in the next revision).

# Defaults:
#* change default for AnnotationIndex processing to non-overlapped (unambiguous); I'm not sure about this. I agree that in most use cases, the situation will be that there are no overlapping annotations (imagining Sentence with non-overlapping Tokens). But if a pipeline did produce some overlapping Tokens, this default would "silently" skip those. I think this action should not be so "silent", to lessen the chance of mistakes in assumptions made by downstream users of upstream annotators.
#* endWithinBounds - I agree with the comment, and in fact, the default (not clearly expressed) was changed in the code to be as suggested. I'm thinking of a rename like "includeEndBeyondBounds"; I suspect it will get very little (if any) use so the long name won't be significant.
#* skipEquals - this is poorly documented. The implementation **never** includes the "bound", because both the Subiterator and uimaFIT implementations never included the "bound"; it was not the intent of this to sometimes include the bound. So, it needs to be renamed.
#** These two implementations differed in what they meant by the "bound", however. In uimaFIT, the Feature Structure to be skipped was the one which was exactly == (had the same "id") as the bound Feature Structure. In Subiterator, the ones that were skipped were the ones which compared as "equal" using the annotation index's comparator function (which used type priority). What this boolean switch was trying to do was to allow specifying which of these two meanings of "equal" was to be used in doing the skipping. Note that this is a detail that only applies when there are potentially multiple Annotations which compare equal.
# General approach to handling ignored or not-applicable settings: I am slightly favoring some kind of notification, if they are indicative of a likely error or misunderstanding by the Annotator writer; this has to be balanced against making this framework "annoying" to the user. Kinds of notification include throwing exceptions, or (decreasing frequency) logging of warnings.
# re: renaming Processing Actions: I never liked the term much... I'm ok with terminal actions, result forms, but my choice would be the combo: "terminal forms".
# re: renaming the select framework to the CAS Query framework - I think this ties too closely to the CAS as the data source, given that other collections can be the source. We could call it the Feature Structure Query framework, but that seems too verbose, compared to the "select" framework, so I'd prefer to keep "select".
# re: ordering and sorted-ordering. I'll make a pass to clarify the subcases. The general approach is that sort ordering for Annotation Indexes is usually implied (but can be (partially) undone using the unordered() builder, if desired for efficiency).
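To make the "silent skipping" concern about a non-overlapping (unambiguous) default concrete, here is a minimal sketch. It assumes annotations are reduced to hypothetical `{begin, end}` int pairs already sorted by begin; `UnambiguousSketch` is an illustrative name, not part of UIMA.

```java
import java.util.ArrayList;
import java.util.List;

final class UnambiguousSketch {
    /**
     * Keeps a span only if it starts at or after the end of the previously
     * kept span. Overlapping spans are dropped with no warning, which is
     * the behavior the comment above worries about as a default.
     */
    static List<int[]> nonOverlapping(List<int[]> sortedByBegin) {
        List<int[]> result = new ArrayList<>();
        int lastEnd = Integer.MIN_VALUE;
        for (int[] span : sortedByBegin) {
            if (span[0] >= lastEnd) {
                result.add(span);
                lastEnd = span[1];
            }
            // else: the overlapping span is skipped silently
        }
        return result;
    }
}
```

For the input (0,5), (3,8), (5,9), (9,12) the span (3,8) disappears from the result, and a downstream consumer has no way to notice unless it also inspects the full index.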
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548761#comment-15548761 ] Richard Eckart de Castilho commented on UIMA-5115: -- In uimaFIT, we have been using the static methods all over - we had no other chance really. I think implementing such an API in UIMA Core offers the opportunity to implement the methods on the classes proper. I tend to think that is a good thing because usually there is a data object to start with already available... and static methods at least require an additional import. FSArray and FSList getting a generic type sound good.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543128#comment-15543128 ] Marshall Schor commented on UIMA-5115: -- Re: generic typing. The select(...) variants are in Cas, JCas, FSIndex, FSArray and FSList (at the moment). Of these, FSIndex carries a generic type (that is, the class is defined as FSIndex<T extends FeatureStructure>). Using that and a "static" form of select, we can replace: FSIterator = cas.select(indexOverToken) with FSIterator = SelectFSs.select(indexOverToken) and the type inferencing will work. // the SelectFSs. prefix could be omitted if a static import was used. This seems to be due to reducing the level of method chaining, by using a static method. To make this more pervasive, we could add a generic type argument to FSArray and FSList. We could even have both static and non-static select methods. Do these seem like good ideas?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1554#comment-1554 ] Marshall Schor commented on UIMA-5115: -- for now, I'll keep the offset, but add a limit( n ) to the set of builder-keywords supported by select (rather than converting to stream.limit) - this will allow forms like: {code}cas.select(MyType.class).limit(3).coveredBy(fs).toArray(){code}
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533213#comment-15533213 ] Marshall Schor commented on UIMA-5115: -- even though limit(n) is a stream thing, we don't need to implement it that way. So we have a choice.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533199#comment-15533199 ] Richard Eckart de Castilho commented on UIMA-5115: -- limit() is a Stream API method and offset() is not. So I guess after calling limit() we are in stream-land and no longer in SelectFS-land... at least unless we employ covariant return types... and even if we use covariant return types we'd always need to have limit() as a separate call... so I'd tend a bit more towards keeping the offset as the positional parameter... unsure...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15533169#comment-15533169 ]
Marshall Schor commented on UIMA-5115:

uimaFIT has a param for following/preceding that limits the number of annotations returned. The general "positioning" alternatives include a FS, begin/end, and those two plus a possible "offset". These seem to possibly conflict when designing APIs. (offset(n) here means: after positioning the iterator to some start position, and before starting, do "n" moveToPrevious/Next() operations.)

I tend to think that limit(n) is more popular than offset(n); if so, perhaps we should offer it the optional positional parameter spot uniformly in all the API variants instead of offset? Examples:

{code}
cas.select(MyType.class).startAt(begin, end, limit).offset(3)    // limit is (optional) positional, offset via keyword
cas.select(MyType.class).startAt(begin, end, offset).limit(4)    // offset is (optional) positional, limit via keyword
cas.select(MyType.class).following(begin, end, limit).offset(3)  // limit is (optional) positional, offset via keyword
cas.select(MyType.class).following(begin, end, offset).limit(4)  // offset is (optional) positional, limit via keyword
{code}

Other ideas or preferences? I don't have a strong preference, except I'm slightly in favor of being consistent.
* The inconsistent approach would be to have it one way for, say, coveredBy, and the other way for following.
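The interplay of a start position, offset(n), and limit(n) described above can be sketched as follows. This is a simplified model under stated assumptions: annotations are reduced to their begin offsets, only forward (non-negative) offsets are handled, and `OffsetLimitSketch.startAt` is a hypothetical helper, not the proposed UIMA API.

```java
import java.util.ArrayList;
import java.util.List;

final class OffsetLimitSketch {
    /**
     * Positions at the first element whose begin is >= startBegin, then
     * moves forward "offset" times (like repeated moveToNext() calls),
     * then returns at most "limit" elements from there.
     */
    static List<Integer> startAt(List<Integer> sortedBegins, int startBegin,
                                 int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be >= 0 in this sketch");
        }
        int pos = 0;
        while (pos < sortedBegins.size() && sortedBegins.get(pos) < startBegin) {
            pos++;
        }
        pos += offset;
        List<Integer> result = new ArrayList<>();
        for (int i = pos; i < sortedBegins.size() && result.size() < limit; i++) {
            result.add(sortedBegins.get(i));
        }
        return result;
    }
}
```

Whichever of the two values gets the positional slot, the order of application stays the same in this model: position first, then offset, then limit.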
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532099#comment-15532099 ] Richard Eckart de Castilho commented on UIMA-5115: -- For the moment, I would suggest that "coveredBy" and "following" be of the same category "location constraint" and are mutually exclusive.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15530858#comment-15530858 ] Marshall Schor commented on UIMA-5115: -- Bounding style + Following/preceding? This is something that could be supported, but is probably "over-enthusiastic" designing ahead of need: {code}select(MyType.class).coveredBy(coveringFS).following(a_fs_to_follow)...{code} Instead, I could report a validation error: can't do both bounding style and following/preceding style together. This is what I'm planning to do, unless there's a need for the other :-)
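One way such a validation error could be raised is sketched below. The builder class, its method signatures, and the choice of IllegalStateException are all assumptions made for illustration; the comment above does not say how or when the real implementation reports the error.

```java
final class SelectBuilderSketch {
    private boolean boundingStyle;   // set by coveredBy(...)
    private boolean relativeStyle;   // set by following(...)

    SelectBuilderSketch coveredBy(int begin, int end) {
        boundingStyle = true;
        return this;
    }

    SelectBuilderSketch following(int position) {
        relativeStyle = true;
        return this;
    }

    /** Called before iterating; rejects mutually exclusive location constraints. */
    SelectBuilderSketch validate() {
        if (boundingStyle && relativeStyle) {
            throw new IllegalStateException(
                "can't do both bounding style and following/preceding style together");
        }
        return this;
    }
}
```

Deferring the check to a single validate step keeps the builder methods order-independent, at the cost of reporting the mistake later than the call that introduced it.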
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526481#comment-15526481 ] Richard Eckart de Castilho commented on UIMA-5115: -- I also believe that this type of syntax is basically equivalent to a type cast in the sense that we cannot access and reason over the type within the code. E.g. we could not throw an exception such as 'my.Token is not a subtype of Annotation'...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526462#comment-15526462 ]
Richard Eckart de Castilho commented on UIMA-5115:

Ok, that looks interesting... have to look into it more... but right now I feel that a syntax like

{code}
// simple work-around for non-chaining type inferencing - supply the type
FSIterator<Token> it3 = cas.<Token>select("my.Token").fsIterator();
{code}

is beyond what can be expected even from advanced Java coders... I feel/fear it does not fulfill the aims of an intuitive and simple API.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526398#comment-15526398 ]
Marshall Schor commented on UIMA-5115:

This bit of ugliness is due to the fact that type inference from the target to the method doesn't "chain". However, it can be overcome more simply than by passing extra arguments. I did this experiment:

{code}
// fails - can't convert the inferred FSIterator type to FSIterator<Token>
FSIterator<Token> it = cas.select("my.Token").fsIterator();

// split into two statements, and it works (no chained inference needed)
SelectFS<Token> s = cas.select("my.Token");   // OK
FSIterator<Token> it2 = s.fsIterator();       // OK

// simple work-around for non-chaining type inferencing - supply the type
FSIterator<Token> it3 = cas.<Token>select("my.Token").fsIterator();
{code}

An internet search yielded this article by John Rose: http://mail.openjdk.java.net/pipermail/lambda-dev/2013-July/010531.html
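The non-chaining behavior in the experiment above can be reproduced without UIMA. The sketch below uses hypothetical toy classes (`ToyCas`, `ToySelect`, `InferenceDemo`) and String elements in place of Token; only the shape of the generic method matters, since its type parameter appears solely in the return type.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ToySelect<T> {
    private final List<Object> items;

    ToySelect(List<Object> items) {
        this.items = items;
    }

    @SuppressWarnings("unchecked")
    Iterator<T> iterator() {
        List<T> typed = new ArrayList<>();
        for (Object o : items) {
            typed.add((T) o);   // unchecked: nothing verifies T at this point
        }
        return typed.iterator();
    }
}

final class ToyCas {
    private final List<Object> items = new ArrayList<>();

    void add(Object o) {
        items.add(o);
    }

    /** T occurs only in the return type; typeName is ignored in this toy. */
    <T> ToySelect<T> select(String typeName) {
        return new ToySelect<>(new ArrayList<>(items));
    }
}

final class InferenceDemo {
    static String firstTwice(ToyCas cas) {
        // Does not compile: the assignment target cannot flow through the chain,
        // so T is inferred as Object for the receiver of iterator().
        // Iterator<String> it = cas.select("my.Token").iterator();

        // Works: the assignment target supplies T directly.
        ToySelect<String> s = cas.select("my.Token");
        Iterator<String> it2 = s.iterator();

        // Works: explicit type witness on the generic method call.
        Iterator<String> it3 = cas.<String>select("my.Token").iterator();

        return it2.next() + it3.next();
    }
}
```

The unchecked cast inside `iterator()` also shows why this style behaves like a type cast: a wrong type witness is only detected later, when an element is actually used as T.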
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15520459#comment-15520459 ]
Richard Eckart de Castilho commented on UIMA-5115:

Looking into this in more detail, I don't see how methods like coveredBy, covering, or at would be able to switch the generic return type to "? extends AnnotationFS". They could do it in two ways:
* either by inferring the generic type from the variable to which they are assigned (see problem explained above)
* or through an additional class-type parameter from which the new return type is derived

If none of these options above are used and they simply returned "? extends AnnotationFS", then that could override the generic type set via e.g. type(Token.class) ... or at/coveredBy/covering would always have to be invoked *before* invoking type(class)...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15520448#comment-15520448 ]
Richard Eckart de Castilho commented on UIMA-5115:

Ok, so here is my argument... If we infer the return type from the variable to which it is bound, we can happily do this:

{code}
SelectFSs<AnnotationFS> selector = jcas.select().type("my.Token");
{code}

But we cannot do these:

{code}
1 | List<AnnotationFS> selector = jcas.select().type("my.Token").asList()
2 | doSomethingWith(jcas.select().type("my.Token"));
{code}

In case 1, we would have to introduce an additional signature of asList() which would interfere with the normal typesafe one (i.e. once returning the generic type of SelectFSs and once obtaining the generic type from the variable to which the result is assigned). So the List cannot have the generic type AnnotationFS, but would need to use FeatureStructure or would require casting.

In case 2, Java is not smart enough to infer the generic type from the argument of "doSomethingWith" and we would have to resort to type casting. So doSomethingWith could not accept an AnnotationFS but would have to be able to operate on a FeatureStructure.

{code}
1 | List<AnnotationFS> selector = jcas.select().type("my.Token", AnnotationFS.class).asList()
2 | doSomethingWith(jcas.select().type("my.Token", AnnotationFS.class));
{code}
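The class-type parameter variant shown in the last block above is an instance of the class-token idiom. The following sketch uses hypothetical names (`TypedSelector`, `asList(Class)`, `totalLength`) and CharSequence in place of AnnotationFS; it is meant only to show that a class argument both fixes the generic type at the call site and makes a runtime subtype check possible.

```java
import java.util.ArrayList;
import java.util.List;

final class TypedSelector {
    private final List<Object> items;

    TypedSelector(List<Object> items) {
        this.items = items;
    }

    /** The Class argument determines T, so no target-type inference is needed. */
    <T> List<T> asList(Class<T> clazz) {
        List<T> result = new ArrayList<>();
        for (Object o : items) {
            if (!clazz.isInstance(o)) {
                throw new IllegalArgumentException(
                    o.getClass().getName() + " is not a subtype of " + clazz.getName());
            }
            result.add(clazz.cast(o));
        }
        return result;
    }

    /** Plays the role of doSomethingWith(...) in case 2. */
    static int totalLength(List<CharSequence> seqs) {
        int total = 0;
        for (CharSequence s : seqs) {
            total += s.length();
        }
        return total;
    }
}
```

A call such as `TypedSelector.totalLength(selector.asList(CharSequence.class))` compiles without a cast, and asking for `String.class` when the data holds a StringBuilder fails with a descriptive exception instead of a late ClassCastException.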
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513597#comment-15513597 ] Richard Eckart de Castilho commented on UIMA-5115: -- I think it is not an unusual case to ask for "for the given Lemma, give me all other Lemmas at the same location" - so I think only the reference annotation itself should be excluded from the result of a covering/coveredBy/at but not other annotations of the same type...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513511#comment-15513511 ]
Marshall Schor commented on UIMA-5115:

For the prototype, I'm doing all variants. Default is the uimaFIT style (this can be adjusted, based on community feedback). The variants are controlled by the builder methods:
* typePriority - to use type priority as part of the comparator for position and bounds; default is not to
* positionUsesType - when type priority is not used, use the type as part of the "equals" when positioning to the left-most among equals; default is not to
* skipEqual - for boundedBy, bounding, at: skip returning FSs which are "equal"; default: skip only the FS which has the same ID as the bounding FS.
** Note: the at(FSp) with skipEqual is non-empty only for the typePriority or positionUsesType cases, in which case it returns Annotations with the same begin/end, but with different types than FSp.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513455#comment-15513455 ]
Marshall Schor commented on UIMA-5115:

Details of positioning and bounds, with and without typePriorities:

Here's how I think uimaFIT currently is implemented:
* (Assume FSp is the FeatureStructure being used as a position, or as a bound)
* The type of FSp is ignored; only the begin/end values are used
* The starting position is the first (left-most) FS >= FSp (using just begin/end)
** In the case where there are multiple FSs with the same begin/end as FSp, the starting position is the left-most one of these
* For bounding operations, if there is an FS which is exactly equal (has the same "id") to FSp, it is skipped.
** Because of the left-most rule, the bounding operation might return fsA, fsB, (skip fsC which has the same *id* as FSp), fsD, etc.

Here's a proposal of how this would work with typePriorities (this is different from how subiterator works, see following):
* The type of FSp is used as part of the comparison, using the AnnotationIndex comparator
* The rest of the definition is as above, except that the comparisons use the typePriorities

Subiterator uses this logic:
* It always uses type priorities
* For positioning, it works as above (type priority case)
* For bounding operations: it first positions using the FSp, and then advances until it finds the first FS not equal (using begin/end/type), and starts there

Subiterator style, without typePriority (not currently available, here's a proposed design consistent with the subiterator style):
* For positioning: the first (left-most) FS >= FSp (using just begin/end)
* For bounding operations, it first positions using the FSp, then advances until it finds the first FS not equal (using just begin/end), and starts there.

The differences between the subiterator / uimaFIT approaches (for corresponding typePriority use or not) may be accidental artifacts of implementation. Should we make the uv3 select() implementation for bounding operations:
* skip the FS which has the same id (only) [ uimaFIT style ] or
* skip all FSs until finding the first one > than FSp (using either begin/end/type or just begin/end) [ subiterator style ]

I would like to hear from the user community, too :-)
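The "left-most FS >= FSp" positioning rule without type priorities can be sketched as a lower-bound binary search. The sketch assumes annotations are hypothetical `{begin, end}` pairs and that the index orders them by begin ascending, then end descending; `PositionSketch` and both of its methods are illustrative names, not UIMA code.

```java
import java.util.List;

final class PositionSketch {
    /** Assumed ordering: begin ascending, then end descending; type is ignored. */
    static int compare(int[] a, int[] b) {
        if (a[0] != b[0]) {
            return Integer.compare(a[0], b[0]);
        }
        return Integer.compare(b[1], a[1]);
    }

    /**
     * Returns the index of the left-most element that is >= fsp, or
     * sorted.size() if there is none. Among several elements that compare
     * equal to fsp, the first one is chosen.
     */
    static int leftMost(List<int[]> sorted, int[] fsp) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(sorted.get(mid), fsp) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }
}
```

Starting from this position, the uimaFIT style would skip only the element identical to FSp, while the subiterator style would advance past every element that compares equal.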
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511218#comment-15511218 ] Marshall Schor commented on UIMA-5115: -- I discovered that the existing impl and docs for "subiterator" make a point of not returning the bounding FS, and also of not returning any annotations that are equal to it (meaning same begin / end / type). Earlier in this thread, you had mentioned "for the case of at, covering, and coveredBy, we already agreed that we'd return all at the same position except the reference annotation". This seems to be a difference. I'm wondering if it would be preferable (in the sense that more users would find this to be what they intended) to use this definition of "equal", or a more restrictive one where only the identical FS would be skipped. So, for instance, if you had FSs: * Lemma [ begin: 15, end: 20, baseForm: "foo" ] and * Lemma [ begin: 15, end: 20, baseForm: "bar" ] and asked to get items covered by Lemma [ begin: 15, end: 20, baseForm: "XXX" ], would users of this API prefer that the above 2 Lemma FSs be skipped or not?
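The Lemma question in this comment can be made concrete with a small plain-Java model. It is a sketch only: `Lemma`, `skipIdentical` and `skipEqual` are hypothetical names introduced here, and since every entry is a Lemma, "same type" holds by construction.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of the two candidate definitions of "equal"; not UIMA code. */
final class LemmaSkipSketch {

    /** Hypothetical stand-in for a Lemma annotation. */
    static final class Lemma {
        final int begin;
        final int end;
        final String baseForm;

        Lemma(int begin, int end, String baseForm) {
            this.begin = begin;
            this.end = end;
            this.baseForm = baseForm;
        }
    }

    /** Restrictive definition: only the identical object is skipped. */
    static List<String> skipIdentical(List<Lemma> index, Lemma bound) {
        List<String> out = new ArrayList<>();
        for (Lemma l : index) {
            boolean covered = l.begin >= bound.begin && l.end <= bound.end;
            if (covered && l != bound) {
                out.add(l.baseForm);
            }
        }
        return out;
    }

    /** Subiterator definition: everything with the same begin / end / type is skipped. */
    static List<String> skipEqual(List<Lemma> index, Lemma bound) {
        List<String> out = new ArrayList<>();
        for (Lemma l : index) {
            boolean covered = l.begin >= bound.begin && l.end <= bound.end;
            boolean equal = l.begin == bound.begin && l.end == bound.end;
            if (covered && !equal) {
                out.add(l.baseForm);
            }
        }
        return out;
    }
}
```

For the three Lemmas above, `skipIdentical` returns "foo" and "bar", whereas `skipEqual` returns nothing.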
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511141#comment-15511141 ] Marshall Schor commented on UIMA-5115: -- right, between should only have args fs1, fs2.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510558#comment-15510558 ] Richard Eckart de Castilho commented on UIMA-5115: -- The offsets of two FSes are the same iff (fs1.begin == fs2.begin && fs1.end == fs2.end). If the end differs, offsets are not the same. If there are multiple annotations of the same type at a given (offset) position, we need to decide what to do. I think for the case of at, covering, and coveredBy, we already agreed that we'd return all at the same position except the reference annotation. For following and preceding, it is a slightly different case. Is AnnoFS1 following/preceding AnnoFS2 if they both have the same offset? At least if type priorities are disabled, I'd tend towards saying no. That is how it is currently implemented in uimaFIT. selectFollowing currently seeks forward in the index until it reaches the first annotation that starts at/after the end of the reference annotation. It *might* make sense to refine that a bit if type priorities are active... not sure.
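The offset definition and the "starts at/after the end of the reference" rule described in this comment can be sketched in plain Java as follows. `Span`, `sameOffsets` and `following` are hypothetical names for illustration and do not reproduce the uimaFIT implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of offset equality and "following"; not uimaFIT code. */
final class FollowingSketch {

    /** Hypothetical stand-in for an annotation's begin/end offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Two spans share an offset iff both begin and end match. */
    static boolean sameOffsets(Span a, Span b) {
        return a.begin == b.begin && a.end == b.end;
    }

    /** Positions in the index of all spans that start at or after ref.end. */
    static List<Integer> following(List<Span> index, Span ref) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            if (index.get(i).begin >= ref.end) {
                positions.add(i);
            }
        }
        return positions;
    }
}
```

Under this rule, a span with the same offsets as a non-empty reference is never reported as following it, because its begin lies before the reference's end. A zero-width reference is an edge case this sketch does not settle.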
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510537#comment-15510537 ] Marshall Schor commented on UIMA-5115: -- What's your definition of "offset"? If 2 FSs share the same "begin", but differ in their "end", are they at the same or different offset? I think you're saying that "following", applied to the case where there's a bunch of otherwise equal annotations at the starting spot, should also skip all of those until it comes to the first one where the begin or end is different from the FS passed in; is that correct?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510397#comment-15510397 ] Richard Eckart de Castilho commented on UIMA-5115: -- I didn't realize that you consider startAt() to be operating on actual index positions - not on the offsets of the annotation passed to startAt(). Sorry for the confusion. So you're absolutely right about the two points above.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510392#comment-15510392 ] Richard Eckart de Castilho commented on UIMA-5115: -- Hum, no... I think "between" should just return everything between fs1.end and fs2.begin - if type priorities are disabled. Again, we should probably consider having a separate strategy if type priorities are enabled.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510361#comment-15510361 ] Richard Eckart de Castilho commented on UIMA-5115: -- All sounds OK. I just do not think we need both coveredBy(begin,end) and between(begin,end) - they are the same, aren't they?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510383#comment-15510383 ] Richard Eckart de Castilho commented on UIMA-5115: -- Right... following() is indeed quite a bit different from startAt()+shift(). Actually also for "following", I think in terms of offset positions, not of index positions. So following(fs) should never return annotations with the same offset as fs - we might want to consider a different behavior if type priorities are enabled. In that case, I guess, following(fs) would never return annotations of the same type or subtype as fs at the same offsets as fs.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510367#comment-15510367 ] Richard Eckart de Castilho commented on UIMA-5115: -- I'm thinking of offsets when thinking of at - yes. Since uimaFIT ignores type priorities, I'm no longer used to thinking in index positions and tend towards thinking only in offset positions. That has probably contributed to some of the confusion in this discussion.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510331#comment-15510331 ] Marshall Schor commented on UIMA-5115: --
Updated and committed a new version of SelectFSs, based on the above discussion. It has these changes:
# the at method is reserved for the old at(bounds).sameBeginEnd().
# the other use of at, for a starting position, is now called startAt; it takes fs, begin/end, or either of these plus a "shift" amount
# rename covered to coveredBy
# have coveredBy, covering, and between all take a bounds arg: either fs or begin/end
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15510161#comment-15510161 ] Marshall Schor commented on UIMA-5115: -- Re: meaning of at(...) Would you prefer that at(...) be reserved for the sameBeginEnd style of using bounds? The other use of at (to provide a starting position) could be covered by renaming it to "startAt". The other uses of at (to specify a bound) could be absorbed into the coveredBy and covering methods. This would also eliminate the sameBeginEnd general style: it could only be applied with an "at" style bound (e.g., not with a "between" style bound).
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509972#comment-15509972 ] Marshall Schor commented on UIMA-5115: -- There's an ambiguity here, because annoFS might be or might not be in the index. If it was in the index, then shift(1) would be needed to match following(). If it was not in the index, the position is already at the following() spot, and shifting one more would miss an annotation. This complexity can be encapsulated in the following() method.
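The ambiguity described in this comment can be illustrated with a deliberately simplified plain-Java model, in which the index is a sorted list of distinct integer keys and "in the index" means an equal key is present. `startAt` and `following` here are hypothetical helpers, not the SelectFSs methods.

```java
import java.util.List;

/** Simplified model of startAt + shift versus following; not UIMA code. */
final class FollowingShiftSketch {

    /** Position of the first key >= the given key (the startAt position). */
    static int startAt(List<Integer> sortedIndex, int key) {
        int i = 0;
        while (i < sortedIndex.size() && sortedIndex.get(i).intValue() < key) {
            i++;
        }
        return i;
    }

    /** Position just after the key: shift by one only if the key is actually present. */
    static int following(List<Integer> sortedIndex, int key) {
        int i = startAt(sortedIndex, key);
        boolean present = i < sortedIndex.size() && sortedIndex.get(i).intValue() == key;
        return present ? i + 1 : i;
    }
}
```

With keys 10, 20, 30, `following` for key 20 lands on 30, while for the absent key 15 it lands on 20; a blind `startAt(15) + 1` would land on 30 and miss 20.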
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509960#comment-15509960 ] Marshall Schor commented on UIMA-5115: -- If you prefer, I'm happy to switch "between" from being something that specifies start/stop positions in an index, to the combination you think of. It will then be more like one of the "convenience" methods. Please +1 to this, if this is what you prefer :-)
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509947#comment-15509947 ] Marshall Schor commented on UIMA-5115: -- I thought that between meant 2 things: 1) a span, starting at the end of the first and ending at the "begin" of the 2nd. This would have a different "startAt" position than above. 2) an automatic implied switch of the first and 2nd if the 2nd came before the first. I got this definition from the current uimaFIT (unless I misread something), but I'm happy to change it if it's not correct.
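The two-part reading of between given in this comment (a span from the end of the first to the begin of the second, with an implied switch) can be sketched in plain Java. `Span` and `between` are hypothetical names, and the swap test on `begin` is an assumption made for illustration rather than the actual uimaFIT condition.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model of between(fs1, fs2); not uimaFIT or UIMA code. */
final class BetweenSketch {

    /** Hypothetical stand-in for an annotation's begin/end offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Positions of the spans covered by [first.end, second.begin]. */
    static List<Integer> between(List<Span> index, Span a, Span b) {
        Span first = a;
        Span second = b;
        if (b.begin < a.begin) {
            // Implied switch when the second argument comes before the first.
            first = b;
            second = a;
        }
        int lo = first.end;
        int hi = second.begin;
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            Span s = index.get(i);
            if (s.begin >= lo && s.end <= hi) {
                positions.add(i);
            }
        }
        return positions;
    }
}
```

Calling `between` with the arguments in either order gives the same result in this model.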
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509937#comment-15509937 ] Marshall Schor commented on UIMA-5115: -- In v3, I'm thinking of implementing offsets (begin/end) by making a "throw-away" Annotation with these begin/end values, and using that. There's an insignificant penalty for that in v3, since the temporary FS can be allocated on the stack rather than in the heap (see "escape analysis" in http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html), and is effectively GC'd right away.
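The "throw-away" key idea in this comment can be sketched in plain Java as below. `Span`, `ORDER` and `position` are hypothetical names, the ordering (begin ascending, end descending) is the usual annotation order without type priorities, and whether the JIT actually removes the temporary allocation depends on the JVM and is not shown by this code.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of positioning by begin/end via a temporary key; not UIMA code. */
final class OffsetKeySketch {

    /** Hypothetical stand-in for an annotation's begin/end offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Begin ascending, then end descending. */
    static final Comparator<Span> ORDER = (a, b) ->
            a.begin != b.begin ? Integer.compare(a.begin, b.begin) : Integer.compare(b.end, a.end);

    /** Position of a matching entry, or the insertion point if none matches. */
    static int position(List<Span> sortedIndex, int begin, int end) {
        Span key = new Span(begin, end); // throw-away key, never stored anywhere
        int found = Collections.binarySearch(sortedIndex, key, ORDER);
        return found >= 0 ? found : -found - 1;
    }
}
```

Note that `Collections.binarySearch` does not promise the left-most match when several entries compare equal, so a real implementation would still need the left-most adjustment discussed elsewhere in this thread.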
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507915#comment-15507915 ] Richard Eckart de Castilho commented on UIMA-5115: -- And "between" would be short for "startAt(annoFS).endAt(otherAnnoFS)"?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507905#comment-15507905 ] Richard Eckart de Castilho commented on UIMA-5115: -- Ok, np. Should I be able to come up with some convincing argument for 3, I'll post it here ;) But let's then proceed with 2 as the working hypothesis. Good point about 1 being able to switch to Annotation via at(). P.S.: Good that Jira supports threads nowadays!
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507891#comment-15507891 ] Richard Eckart de Castilho commented on UIMA-5115: -- "between(fs1, fs2)" for me is a shorthand for "coveredBy(fs1.end, fs2.begin)". Beyond being a shorthand, I imagine there may be ways of implementing "between(fs1, fs2)" more efficiently because we can seek for the FSes in the index and don't have to go via the offsets. But I may simply not have delved sufficiently deep into the APIs yet to see that offsets can be used efficiently in the same way. No, in my present mental world, "between" cannot be used to specify a range for "sameBeginEnd" (at). Again, if I wanted to do that I would probably write "at(fs1.end, fs2.begin)" and suffer if there are any penalties from offset-based lookups vs. FS-based lookups.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507883#comment-15507883 ] Marshall Schor commented on UIMA-5115: -- I agree 1 is not desirable. At the very least, the fact that an "at(0,2)" clause is included implies this is over Annotations, so that would, I hope, be the return type of get. But I think the way this happens is via (2), where it gets the type inferred from the receiving variable. Choice 3) seems not too useful - it adds a check but it seems it just forces the user to write the type twice - once as the receiving argument, and once as the extra argument to the select.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507873#comment-15507873 ] Marshall Schor commented on UIMA-5115: -- I think the general case is startAt(annoFS).shift(n). For the oft-used desire to start following some annoFS, it's fine to have "following(annoFS)" as a convenience method.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507866#comment-15507866 ] Marshall Schor commented on UIMA-5115: -- Here's the case that may be missed: can between be used to specify a range for covering, or sameBeginEnd (not coveredBy)? Are you tying "between" always to coveredBy?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507867#comment-15507867 ] Richard Eckart de Castilho commented on UIMA-5115: --
I imagine this scenario:
* I have a type descriptor XML which declares Token as a subtype of Annotation
* I did not generate JCas classes for Token

So now the question is which of the following signatures we should prefer:
{code}
1 | Annotation t = (Annotation) cas.select("my.Token").at(0, 2).get(); // where get() returns FeatureStructure
2 | Annotation t = cas.select("my.Token").at(0, 2).get(); // where the return type of get() is inferred from the type of t
3 | Annotation t = cas.select("my.Token", Annotation.class).at(0, 2).get(); // where the return type of get() is inferred from the second arg of select
{code}
I think 1 is not desirable. In most cases people will have to cast to Annotation because it is more common to work with annotations than with plain feature structures. So I would argue the question is between 2 and 3. I kind of tend towards 3...
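The difference between options 2 and 3 in this comment can be shown with ordinary Java generics. This is a sketch only: the nested `FeatureStructure` and `Annotation` classes are empty stand-ins that merely borrow the UIMA names, and `getInferred` / `getChecked` are hypothetical helpers rather than the select() API.

```java
/** Plain-Java model of the two typing styles for get(); not the UIMA select() API. */
final class SelectTypingSketch {

    /** Empty stand-ins named after the UIMA types, for illustration only. */
    static class FeatureStructure { }

    static class Annotation extends FeatureStructure { }

    /** Style 2: the result type is inferred from the receiving variable. */
    @SuppressWarnings("unchecked")
    static <T extends FeatureStructure> T getInferred(FeatureStructure found) {
        return (T) found; // unchecked cast: nothing is verified inside this method
    }

    /** Style 3: the result type comes from an explicit class argument and is checked here. */
    static <T extends FeatureStructure> T getChecked(FeatureStructure found, Class<T> type) {
        return type.cast(found); // throws ClassCastException if found is not a T
    }
}
```

In both styles a wrong type ends in a ClassCastException; with style 2 it is raised at the assignment in the caller's code, with style 3 inside the call that received the class argument.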
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507852#comment-15507852 ] Richard Eckart de Castilho commented on UIMA-5115: -- For me, *coveredBy* and *between* (and *at*) are different choices along the same paradigmatic axis of "location constraints". Either I say "give me all annotations covered by annotation X" or I say "give me all annotations between X and Y". If I said "give me all annotations covered by annotation X *and* between X and Y", then I would be providing two "location constraints". Since we said that we should only support a single "location constraint", I would suggest not allowing these two constraints to be combined into a single statement. If I really wanted to express `coveredBy(between(fs1, fs2))`, I would probably say `coveredBy(fs1.getEnd(), fs2.getBegin())`. I didn't understand the "starting at" semantics yet... you mean to express something like "following(annoFS)" instead as "startAt(annoFS).shift(1)"?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507836#comment-15507836 ] Marshall Schor commented on UIMA-5115: -- In UV3, the built-in types are always present, so we always have TOP and Annotation, for example. If no JCas classes are defined below these, then (just like in V2), these classes are used for the JCas model (they do have all the right features...). So, you can always use one of the built-in supertype classes.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507824#comment-15507824 ] Marshall Schor commented on UIMA-5115: --
I agree it's nice to have
* coveredBy(sentence)
* covering(10, 20)
* sameBeginEnd(token)

There are other cases. How would you write combinations that use between(fs1, fs2) and coveredBy() etc?
* coveredBy(between(fs1, fs2)) // or
* coveredBy().between(fs1, fs2); // the builder methods are only partially ordered, so you could put the between following coveredBy, if that made for better readability.

I don't have a strong preference, so I would be fine with making "between" an argument of the 3 styles of boundary use; it could easily translate into an equivalent begin / end (unless I'm missing something). The at(...) has another purpose, besides setting a boundary - it can specify a starting position. Shall we keep "at" for that purpose (or rename it "startAt")?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507803#comment-15507803 ] Richard Eckart de Castilho commented on UIMA-5115:

I assume even in UV3 there can be the case that there simply are no JCas wrappers available for types, so that types must be referred to by name. In that case (and basically only in that case), I think the additional type argument makes sense - especially in order to distinguish between FeatureStructure and Annotation. I agree that introducing signatures such as `select(Class selectionType, Class resultType)` makes absolutely no sense.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507798#comment-15507798 ] Richard Eckart de Castilho commented on UIMA-5115:

I agree with keeping the "cardinality constraints" single(), singleOrNull() and get() separate. I would not conflate them into convenience methods. But I tend to disagree on splitting up "location constraints" into multiple methods. So I'd prefer having
{code}
jcas.select(Token.class).coveredBy(sentence)
{code}
rather than
{code}
jcas.select(Token.class).at(sentence).coveredBy()
{code}
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507797#comment-15507797 ] Marshall Schor commented on UIMA-5115:

We could. Does it occur enough to warrant this? E.g., if you knew it was "pkg.MyJCasType", then just write it in the other style (MyJCasType.class). I guess this is for when the class is in a variable, and varies at run time... In that case, we still do have the ability to just use the type inferred from the variable to which the result is assigned. I'm not sure of the benefit of writing this constraint twice, in this case...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507788#comment-15507788 ] Marshall Schor commented on UIMA-5115:

Doing the implementation based on these "primitives" is good because you only do things once, and then have other things combine them. I think it can also be useful to help people learn an api - they learn a (few) primitives, and then see how they can be combined in a potentially wide variety of ways, some of which we support with "convenience methods with positional args". I'm imagining for example, a case where someone wanted the "single" constraint applied to a "between" style of boundary specification - something we might not have made a convenience method for, because we didn't think it was wanted/needed. Another possibility is adding the capability to iterate backwards to any of the wide variety of possibilities. With all the variations possible, the convenience methods would only cover a portion...
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507769#comment-15507769 ] Richard Eckart de Castilho commented on UIMA-5115:

My point was: can we have a signature variant for cas.select("pkg.MyJCasType") that allows the caller to provide at least a minimal type bound, e.g. `cas.select("pkg.MyJCasType", Annotation.class)`?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507764#comment-15507764 ] Marshall Schor commented on UIMA-5115:

Re: type is inferred from the variable to which the result is assigned - yes that works. For compile-time type safety, you can use the form:
{code}
// a method in the CAS
<T extends FeatureStructure> SelectFSs<T> select(Class<T> clazz)...

// a use of this method
SelectFSs<MyJCasType> s = cas.select(MyJCasType.class);
{code}
The compiler checks that the class of the argument matches the type argument in the receiver declaration. I agree that if you use cas.select("pkg.MyJCasType"), there's no compile-time checking done.
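The difference between the class-token form and the name form can be tried out in a small self-contained sketch. Everything here is hypothetical: `Fs`, `Annot`, `Tok` and `TypedSelectSketch` are stand-ins invented for illustration, not UIMA classes, and the name form matches on the simple class name only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for a small type hierarchy; not the UIMA classes. */
class Fs { }
class Annot extends Fs { }
class Tok extends Annot { }

/** Toy container with the two select styles discussed in the thread. */
final class TypedSelectSketch {
    private final List<Fs> store = new ArrayList<>();

    void add(Fs fs) {
        store.add(fs);
    }

    /** Class-token form: the compiler ties the result type to the argument. */
    <T extends Fs> List<T> select(Class<T> clazz) {
        List<T> out = new ArrayList<>();
        for (Fs fs : store) {
            if (clazz.isInstance(fs)) {
                out.add(clazz.cast(fs)); // checked cast, cannot fail here
            }
        }
        return out;
    }

    /** Name form: T comes from the assignment target and is never verified. */
    @SuppressWarnings("unchecked")
    <T extends Fs> List<T> select(String simpleName) {
        List<T> out = new ArrayList<>();
        for (Fs fs : store) {
            if (fs.getClass().getSimpleName().equals(simpleName)) {
                out.add((T) fs); // unchecked cast, erased at run time
            }
        }
        return out;
    }
}
```

With this sketch, `List<Tok> toks = sketch.select(Tok.class)` is checked by the compiler, while `List<Tok> wrong = sketch.select("Annot")` compiles and only fails with a ClassCastException once an element is read as a `Tok`.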
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507753#comment-15507753 ] Richard Eckart de Castilho commented on UIMA-5115:

I see how at first breaking up the API into these separate methods saves implementing various method signatures - but since we are going to introduce them again anyway with the "convenience methods with positional args", I don't see the benefit. If you like this model of implementation, I would suggest to implement it but keep the methods private and expose only the convenience methods in the API... maybe some "third-party" feedback might also provide additional new perspectives. Anybody?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507748#comment-15507748 ] Marshall Schor commented on UIMA-5115:

Yes, these are equivalent in V3. But I'm keeping both in order to make the transition to V3 easier because many of the v2 APIs which were agnostic to JCas vs non-JCas used it instead.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507739#comment-15507739 ] Marshall Schor commented on UIMA-5115:

I think I may have caused this confusion by leaving off the spec of how the boundary constraint is used. It should have read, for example, at(location).sameBeginAndEnd().single() or at(location).coveredBy().singleOrNull(), etc. And as before, for ease of reading, it's perfectly OK to combine the at(spec).coveredBy() into coveredBy(at-spec). I just break them apart for clarity in examining all the variants...

I have imagined a slightly more augmented abstraction. (Again, for convenient API use, some of these ought to be combined...)

CONTAINER.select(TYPE-CONSTRAINT).
  BOUNDARY-CONSTRAINT.
    at(fs), at(begin, end)
    between(fs1, fs2)
  USE-OF-BOUNDARY-CONSTRAINT.
    boundedBy, bounding, sameBeginAndEnd
  CARDINALITY-CONSTRAINT + Getter, or TERMINAL
    single, singleOrNull, or iterator, etc.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507569#comment-15507569 ] Richard Eckart de Castilho commented on UIMA-5115:

Well, Java needs to get the type information from somewhere. For signatures where some type can be inferred from a parameter, that's nice. That surely works for `select(Token.class)`, not sure about `select(Token.type)`. It certainly does not work for `select("my.Token")` unless we use a signature where the type is inferred from the variable to which the result is assigned, e.g. (I believe this works...)
{code}
<T> List<T> select(String type);

List<Token> tokens = select("my.Token");
{code}
I am a bit ambivalent about this type of signature though... it somehow seems to be more asking for a ClassCastException than `select("my.Token", Annotation.class)` although I cannot really articulate atm why.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507550#comment-15507550 ] Marshall Schor commented on UIMA-5115:

I agree that it's a bad design if there's lots of casting from FeatureStructure to Annotation (see other comment about AnnotationFS going away in favor of just plain Annotation). I'm hoping this can all be done automatically, without any extra methods.

My thought is that the creator of SelectFSs_impl, e.g. cas.select(some-type-spec), would create a "typed" instance of this where the type would fit the type spec. E.g. SelectFS<MyAnnot> s = cas.select(MyAnnot.type); would create an instance of SelectFS that was typed to be "MyAnnot". Subsequent operations, e.g. s.fsIterator() would return FsIterator<MyAnnot>.

If someone were to create a "typeless" SelectFSs_impl, e.g., cas.select(), and then later specify a type e.g. " ... .type("my.FooType") " I'm hoping we could return an instance of SelectFSs_impl that was "typed" to FooType. Also, if they never specified a type, but implied Annotation (because they used a bounding FS, for example), I'd want the resulting SelectFSs_impl to be typed to Annotation.

Not sure if this is do-able, but want to try to get as close as is reasonable to this :-)
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507534#comment-15507534 ] Richard Eckart de Castilho commented on UIMA-5115:

Uhm... are you also contemplating dropping the distinction between FeatureStructure and TOP?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507521#comment-15507521 ] Richard Eckart de Castilho commented on UIMA-5115:

Extends FeatureStructure is fine as well I guess. But it should not just be "<T>".
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507519#comment-15507519 ] Richard Eckart de Castilho commented on UIMA-5115:

I think `at(location).single()` is quite a different thing. at (between, covered by, etc.) is IMHO all referring to a "location constraint" whereas single is a cardinality constraint. The following variations should all IMHO be possible:
{code}
jcas.select(Token.class).coveredBy(sentence).single()
jcas.select(Token.class).coveredBy(10, 20).singleOrNull()
jcas.select(Token.class).at(10,20).get()
jcas.select(Token.class).between(firstName, lastName).get()
{code}
or abstractly
{noformat}
CONTAINER.select(TYPE-CONSTRAINT).LOCATION-CONSTRAINT(args)[.CARDINALITY-CONSTRAINT()]
{noformat}
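The "location constraint, then optional cardinality constraint" shape can be sketched as a tiny fluent class. This is only an illustration under assumptions: `Span` and `Selection` are hypothetical names, and the semantics chosen for get(), single() and singleOrNull() (first-or-null, exactly-one, at-most-one) are a guess at what the thread intends, not a statement of the UIMA behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical begin/end pair standing in for an annotation. */
final class Span {
    final int begin;
    final int end;

    Span(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

/** Toy selection: a location constraint followed by a cardinality constraint. */
final class Selection {
    private final List<Span> items;

    Selection(List<Span> items) {
        this.items = new ArrayList<>(items);
    }

    /** Location constraint: keep spans lying entirely inside [begin, end]. */
    Selection coveredBy(int begin, int end) {
        List<Span> kept = new ArrayList<>();
        for (Span s : items) {
            if (s.begin >= begin && s.end <= end) {
                kept.add(s);
            }
        }
        return new Selection(kept);
    }

    /** Same constraint, with the bounds taken from another span. */
    Selection coveredBy(Span bound) {
        return coveredBy(bound.begin, bound.end);
    }

    /** Cardinality (assumed): first element, or null when nothing matched. */
    Span get() {
        return items.isEmpty() ? null : items.get(0);
    }

    /** Cardinality (assumed): exactly one element, otherwise an exception. */
    Span single() {
        if (items.size() != 1) {
            throw new IllegalStateException("expected exactly 1, found " + items.size());
        }
        return items.get(0);
    }

    /** Cardinality (assumed): at most one element; null when nothing matched. */
    Span singleOrNull() {
        if (items.size() > 1) {
            throw new IllegalStateException("expected at most 1, found " + items.size());
        }
        return items.isEmpty() ? null : items.get(0);
    }

    /** Terminal operation without a cardinality constraint. */
    List<Span> asList() {
        return new ArrayList<>(items);
    }
}
```

A call such as `new Selection(spans).coveredBy(sentence).single()` then reads the same way as the variations listed above, with the cardinality step left off when a plain list is wanted.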
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507508#comment-15507508 ] Marshall Schor commented on UIMA-5115:

Yes. I waffle back and forth on NEW_TYPE extends TOP and NEW_TYPE extends FeatureStructure. The latter is more backwards compatible with v2 code, so I tend to use it; I don't think it hurts (he said hopefully).
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507500#comment-15507500 ] Marshall Schor commented on UIMA-5115:

you are correct. I've changed these to require "Annotation". There's no need in UV3 for the XyzFS kind of class names - those were the names for either the JCas or non-JCas version of the Java cover class, and in Version 3, we only have the JCas version :-).
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507493#comment-15507493 ] Marshall Schor commented on UIMA-5115:

I agree about readability. What I was proposing was
1) this refactoring
2) (left unstated, sorry!) we can have whatever positional argument variations on these we want.

In particular, I completely agree that the coveredBy(boundingFs) reads very well, and should be one of the variants. The point of the refactoring is to realize that you can have (more-or-less easily) other combinations, such as: at(location).single(), with any of the 3 variations of at, etc.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15507180#comment-15507180 ] Richard Eckart de Castilho commented on UIMA-5115:

So what you are suggesting is e.g.

*New variant*
{code}
jcas.select().type(Token.class).at(sentence).covered()
// or
jcas.select().type(Token.class).covered().at(sentence)
{code}
instead of

*Old variant*
{code}
jcas.select().type(Token.class).coveredBy(sentence)
{code}
I am not very convinced... I think splitting the API up in this rather atomic way makes it less intuitive/readable.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506947#comment-15506947 ] Richard Eckart de Castilho commented on UIMA-5115:

An alternative could be to distinguish between select(), which would be bound to AnnotationFS, and selectFS(), which would be bound to FeatureStructure. uimaFIT has select and selectFS in the CasUtil.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506945#comment-15506945 ] Richard Eckart de Castilho commented on UIMA-5115:

It might be worth introducing pseudo-type-safe variations of these methods:
{code}
SelectFSs type(Type uimaType);
SelectFSs type(String fullyQualifiedTypeName);
SelectFSs type(int jcasClass_dot_type);
{code}
These additional methods should allow the user to specify a built-in JCas type or CAS FS interface as boundary, e.g.
{code}
SelectFSs type(String fullyQualifiedTypeName, Class aBoundingType);
{code}
Usage example:
{code}
cas.select().type("my.Token", AnnotationFS.class);
{code}
Otherwise we'd probably be casting a lot from FeatureStructure to AnnotationFS.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506927#comment-15506927 ] Richard Eckart de Castilho commented on UIMA-5115:

I assume we still have TOP in UV3? Then IMHO the generics should be bounded by that and not by Object, e.g.
{noformat}
<T extends TOP> type(Class<T> jcasClass_dot_class);
{noformat}
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506921#comment-15506921 ] Richard Eckart de Castilho commented on UIMA-5115:

I think methods that require offsets should not take `FeatureStructure` but require `AnnotationFS`, e.g.
{noformat}
+ SelectFSs at(FeatureStructure fs); // AI
+ SelectFSs between(FeatureStructure fs1, FeatureStructure fs2); // AI
{noformat}
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506790#comment-15506790 ] Marshall Schor commented on UIMA-5115:

I checked in a possible initial interface for result of select(...), it's here: http://svn.apache.org/viewvc/uima/uimaj/branches/experiment-v3-jcas/uimaj-core/src/main/java/org/apache/uima/cas/SelectFSs.java?view=markup
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506742#comment-15506742 ] Marshall Schor commented on UIMA-5115:

I noticed a maybe cleaner refactoring:
* at(FS), at(begin, end) and between(fs1, fs2) - all specify begin / end boundaries
* covered, covering, and the other at(fs) (meaning all FS with the same begin/end position) - all specify ways to use the begin/end boundaries.

So I'm thinking of having at/between be the way to specify begin/end boundaries, and covered(), covering(), and sameBeginEnd() be 3 ways of using these. WDYT?
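The split between "specifying a boundary" and "using a boundary" can be made concrete with a short sketch. All names here (`Bounds`, `BoundaryUse`) are hypothetical, and the reading of between(a, b) as the region from the end of a to the begin of b is an assumption made for this example, not something the thread pins down.

```java
/** Hypothetical begin/end pair; at(...) or between(...) would each yield one. */
final class Bounds {
    final int begin;
    final int end;

    Bounds(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }

    /** Assumed reading of between(a, b): from the end of a to the begin of b. */
    static Bounds between(Bounds a, Bounds b) {
        return new Bounds(a.end, b.begin);
    }
}

/** The three ways of using a boundary named in the comment above. */
enum BoundaryUse {
    COVERED {
        @Override
        boolean accepts(Bounds bound, Bounds c) {
            // candidate lies entirely inside the boundary
            return c.begin >= bound.begin && c.end <= bound.end;
        }
    },
    COVERING {
        @Override
        boolean accepts(Bounds bound, Bounds c) {
            // candidate spans the whole boundary
            return c.begin <= bound.begin && c.end >= bound.end;
        }
    },
    SAME_BEGIN_END {
        @Override
        boolean accepts(Bounds bound, Bounds c) {
            // candidate has exactly the boundary's offsets
            return c.begin == bound.begin && c.end == bound.end;
        }
    };

    /** Decides whether a candidate satisfies this use of the boundary. */
    abstract boolean accepts(Bounds bound, Bounds candidate);
}
```

Because every boundary source reduces to one `Bounds` value, each of the three uses works with at(fs), at(begin, end) and between(fs1, fs2) alike, which is the combinatorial point of the refactoring.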
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506574#comment-15506574 ] Richard Eckart de Castilho commented on UIMA-5115:

> I think the answer is you return Sentence and subtypes; that is, there's no
> additional type filtering based on the type of the bounding FS (again,
> because it's not needed - if only tokens are wanted, that's more directly
> expressible using Token.class as the first argument).

That is correct.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506568#comment-15506568 ] Richard Eckart de Castilho commented on UIMA-5115: -- If get() returned an Optional, then the way to get the actual value would be get().get(). I'm not a fan of Optional either. I'm quite fine with get(), single() and singleOrNull() - but since Optional is one of the hip things today, I wanted to at least make sure that we know why we don't do it.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506567#comment-15506567 ] Marshall Schor commented on UIMA-5115: -- I was thinking of selectCovered. The question I was posing arises in the case where the type of the bounding FS is a subtype of the index type being iterated, just the opposite of your example. For instance: selectCovered(Sentence.type, aVeryWideToken), where someone has (for reasons unknown) constructed a "Token" covering multiple sentences. Here the question is, do you return items of Sentence and its subtypes, or Token and its subtypes? I think the answer is you return Sentence and subtypes; that is, there's no additional type filtering based on the type of the bounding FS (again, because it's not needed - if only tokens are wanted, that's more directly expressible using Token.class as the first argument).
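The behaviour described above can be sketched as follows. This is not the uimaFIT or UIMA implementation: Anno, Sentence, Heading, Token and CoveredSketch are hypothetical stand-ins, and a plain list replaces the annotation index. The point it illustrates is that the requested type alone decides what is returned, while the bounding FS contributes only its offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal annotation hierarchy, for illustration only.
class Anno {
    final int begin;
    final int end;

    Anno(int begin, int end) {
        this.begin = begin;
        this.end = end;
    }
}

class Sentence extends Anno {
    Sentence(int begin, int end) { super(begin, end); }
}

// Hypothetical subtype of Sentence, to show that subtypes are returned too.
class Heading extends Sentence {
    Heading(int begin, int end) { super(begin, end); }
}

class Token extends Anno {
    Token(int begin, int end) { super(begin, end); }
}

final class CoveredSketch {
    // Returns every annotation of the requested type (or a subtype) that lies
    // within the bounds of 'bound'. The type of 'bound' is never consulted.
    static <T extends Anno> List<T> selectCovered(List<Anno> index, Class<T> type, Anno bound) {
        List<T> result = new ArrayList<>();
        for (Anno a : index) {
            boolean inside = a.begin >= bound.begin && a.end <= bound.end;
            // Assumption: the bounding annotation itself is skipped.
            if (a != bound && type.isInstance(a) && inside) {
                result.add(type.cast(a));
            }
        }
        return result;
    }
}
```

Called with Sentence.class and a very wide Token as the bound, this returns the Sentence and Heading instances inside the token's span; asking for Token.class with the same bound returns only tokens.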
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506523#comment-15506523 ] Marshall Schor commented on UIMA-5115: -- I didn't follow the example get().get(). I think the first get() would return the single element or null, and the 2nd get would not compile (since FS doesn't have a "get" method). The small value in Optional (which seems only "documentational" to me) is outweighed (IMHO) by its verbosity and inefficiency (creating an extra Java object wrapping things). But I certainly could be missing something... The single(true), single(false) forms communicate less to the reader than singleOrNull and single. Other forms which might communicate well include nullOk().single(). But that seems somewhat stilted. It might be useful if there are other methods where sometimes null is OK and other times you want to throw an exception.
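For comparison, here is a small sketch of the alternatives being weighed, assuming a hypothetical Selection wrapper around a list. The exact contract of single() and singleOrNull() is not settled in this thread; the sketch assumes that single() requires exactly one element, and that singleOrNull() tolerates an empty result but still rejects more than one. singleOptional() is a made-up name for the Optional-returning variant.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical wrapper over an already computed selection result.
final class Selection<T> {
    private final List<T> items;

    Selection(List<T> items) {
        this.items = items;
    }

    // Assumed contract: exactly one element, otherwise an exception.
    T single() {
        if (items.size() != 1) {
            throw new IllegalStateException(
                "expected exactly one element, found " + items.size());
        }
        return items.get(0);
    }

    // Assumed contract: null when empty, exception when more than one.
    T singleOrNull() {
        if (items.size() > 1) {
            throw new IllegalStateException(
                "expected at most one element, found " + items.size());
        }
        return items.isEmpty() ? null : items.get(0);
    }

    // Optional-returning variant: allocates a wrapper object per call,
    // and the caller needs a second call such as get() to reach the value.
    Optional<T> singleOptional() {
        return Optional.ofNullable(singleOrNull());
    }
}
```

The method names carry the null policy to the reader, which a boolean argument such as single(true) does not. With the Optional variant, reaching the value takes a chained call like singleOptional().get(), which is the double call discussed above.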
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506016#comment-15506016 ] Richard Eckart de Castilho commented on UIMA-5115: --
I assume you mean selectCovering because there is no coveredBy in uimaFIT? Return all Sentence annotations (including subtypes of Sentence) that contain the specified Token annotation.
{code}
Token token = ...;
// all sentences (and subtypes) whose span contains the token
List<Sentence> sentences = selectCovering(Sentence.class, token);
{code}
In case you mean selectCovered: return all Token annotations (including subtypes of Token) that are contained within the boundaries of the specified sentence.
{code}
Sentence sentence = ...;
// all tokens (and subtypes) lying within the sentence
List<Token> tokens = selectCovered(Token.class, sentence);
{code}
I hope that the statements above are correct with respect to the implementation. They are at least how it should work. selectCovered uses cas.getAnnotationIndex(type).iterator(), which afaik returns subtypes as well. selectCovering goes over the full annotation index and then uses ts.subsumes() to check whether an annotation should be included - so subtypes should also be included here.
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506000#comment-15506000 ] Richard Eckart de Castilho commented on UIMA-5115: -- Possibly consider use of Java Optional type (in some of these cases)... e.g. get().get() looks stupid... Maybe `single(boolean optional)`?
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504781#comment-15504781 ] Marshall Schor commented on UIMA-5115: -- Does uimaFIT iteration for coveredBy(fs1) filter the result by the type of fs1? That is, if fs1's begin and end enclose 10 FSs, but only 3 of them are of the type (or a subtype) of fs1, does the result contain only those 3 FSs? I don't believe UIMA's subiterator design does this filtering - it just uses fs1 for bounds and for type priority ordering (not type/subtype filtering).
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504694#comment-15504694 ] Marshall Schor commented on UIMA-5115: -- related feature requests to consider
[jira] [Commented] (UIMA-5115) uv3 select() api for iterators and streams over CAS contents
[ https://issues.apache.org/jira/browse/UIMA-5115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504684#comment-15504684 ] Marshall Schor commented on UIMA-5115: -- wish lists to consider