Hi Richard,

Sorry for the late reply; I’ve been busy with other things lately. Thanks 
for the extensive explanation, which gives me a much better understanding of 
the select API.  In general I’m not opposed to any of your ideas, although 
I don’t always find them intuitive (yet). There haven’t been any issues for 
our limited use cases, except for the bug you fixed very quickly. Thanks for 
that :)

Cheers
Mario


> On 20 Nov 2020, at 22.04, Richard Eckart de Castilho <[email protected]> wrote:
>
> Hi Mario,
>
> if I understand you correctly, you imagine select() to be a streaming 
> operation. Actually, it is not - at least not immediately.
>
> When select() is invoked, it creates an object that is a hybrid between a 
> builder, an Iterable and a Stream. If at any point you invoke an Iterable or 
> Stream method on it, it loses the other personalities.
>
> The methods such as the following are part of the "builder" personality:
>
> - following(x)
> - coveredBy(x)
> - covering(x)
> - ...
>
> - shifted(y)
> - backwards()
> - nonOverlapping()
> - typePriority()
> - ...
>
> While operating on the "builder" personality, the order of methods has no 
> effect. E.g. the following calls are all equivalent:
>
> cas.select(Token.class).shifted(-1).following(t3).backwards()
> cas.select(Token.class).following(t3).backwards().shifted(-1)
> cas.select(Token.class).backwards().shifted(-1).following(t3)
>
> If you try to give conflicting instructions to the builder personality, the 
> last instruction should be used, e.g.
>
> cas.select(Token.class).following(t3).shifted(-1).preceding(t4)
>
> should be equivalent to
>
> cas.select(Token.class).preceding(t4).shifted(-1)
>
> (... or if there are bugs it might do something unexpected ...)
>
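
A toy sketch of the builder behaviour described above, in plain Java. ToySelect
and its integer "annotations" are made up for illustration and are not how UIMA
implements select(); clamping a negative shift to 0 is likewise only an
assumption. The point is that builder calls merely record settings, so their
order does not matter and a later conflicting call overwrites an earlier one.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: it mimics the "builder" personality described above and is
// NOT the UIMA implementation of select().
class ToySelect {
    private enum Relation { ALL, FOLLOWING, PRECEDING }

    private final List<Integer> items;        // stand-in for annotations in document order
    private Relation relation = Relation.ALL; // last of following()/preceding() wins
    private int anchor;                       // argument of following()/preceding()
    private int shift;                        // argument of shifted()

    ToySelect(List<Integer> items) { this.items = items; }

    // Builder methods only record a setting and return this.
    ToySelect following(int x) { relation = Relation.FOLLOWING; anchor = x; return this; }
    ToySelect preceding(int x) { relation = Relation.PRECEDING; anchor = x; return this; }
    ToySelect shifted(int n)   { shift = n; return this; }

    // Terminal operation: the recorded settings are evaluated only here.
    List<Integer> asList() {
        List<Integer> result = new ArrayList<>();
        for (int item : items) {
            boolean keep = relation == Relation.ALL
                    || (relation == Relation.FOLLOWING && item > anchor)
                    || (relation == Relation.PRECEDING && item < anchor);
            if (keep) {
                result.add(item);
            }
        }
        // Assumption for this toy: a negative shift is clamped to 0.
        int skip = Math.min(Math.max(shift, 0), result.size());
        return result.subList(skip, result.size());
    }

    public static void main(String[] args) {
        List<Integer> doc = List.of(1, 2, 3, 4, 5);
        // Same settings in a different order give the same result.
        System.out.println(new ToySelect(doc).shifted(1).following(2).asList());
        System.out.println(new ToySelect(doc).following(2).shifted(1).asList());
        // Conflicting instructions: preceding(4) overrides following(2).
        System.out.println(new ToySelect(doc).following(2).shifted(1).preceding(4).asList());
    }
}
```
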
> Methods like coveredBy(x) or covering(x) set up bounds for the iterator 
> internally created by SelectFS.
>
> I think the initial idea for following(x)/preceding(x) was that they would 
> not define bounds - but IMHO that doesn't make too much sense. From my 
> perspective, they also define bounds: either from the beginning of the 
> document to x (preceding) or from x to the end of the document (following). 
> There is also the startAt(x) method. It does not define a boundary; it just 
> moves the iterator to a given start position.
>
> So while the following operations are bounded:
>
> cas.select(Token.class).following(x).asList()
> cas.select(Token.class).preceding(x).asList()
>
> these operations are their respective unbounded counterparts:
>
> cas.select(Token.class).startAt(x).asList()
> cas.select(Token.class).startAt(x).backwards().asList()
>
> The unbounded versions behave a bit differently from the bounded ones. E.g. 
> preceding(x) returns annotations in document order, while 
> startAt(x).backwards() returns them in iteration order. Also, following(x) 
> and preceding(x) never include x in their results, while startAt(x) should 
> return x as the first entry in the result list. I hope I have explained this 
> correctly, that it makes sense, and that it mostly matches the 
> implementation. I am still working on setting up a tighter test suite to 
> ensure it does ;)
>
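
To make the difference between the bounded and unbounded variants concrete,
here is a small plain-Java model of the behaviour as described above. The class
BoundedVsUnbounded and its helper methods are hypothetical and operate on a
list of strings rather than on a CAS; they mirror the description in this mail,
not the actual UIMA code, and they assume x occurs in the list.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the described semantics; assumes x is an element of doc.
class BoundedVsUnbounded {
    // Bounded: everything after x, in document order, excluding x.
    static List<String> following(List<String> doc, String x) {
        return new ArrayList<>(doc.subList(doc.indexOf(x) + 1, doc.size()));
    }

    // Bounded: everything before x, in document order, excluding x.
    static List<String> preceding(List<String> doc, String x) {
        return new ArrayList<>(doc.subList(0, doc.indexOf(x)));
    }

    // Unbounded: position the iterator at x and walk forwards, so x comes first.
    static List<String> startAt(List<String> doc, String x) {
        return new ArrayList<>(doc.subList(doc.indexOf(x), doc.size()));
    }

    // Unbounded: position the iterator at x and walk backwards (iteration order).
    static List<String> startAtBackwards(List<String> doc, String x) {
        List<String> result = new ArrayList<>();
        for (int i = doc.indexOf(x); i >= 0; i--) {
            result.add(doc.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("t1", "t2", "t3", "t4", "t5");
        System.out.println(following(doc, "t3"));        // [t4, t5]
        System.out.println(preceding(doc, "t3"));        // [t1, t2]
        System.out.println(startAt(doc, "t3"));          // [t3, t4, t5]
        System.out.println(startAtBackwards(doc, "t3")); // [t3, t2, t1]
    }
}
```
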
> select() only really becomes a stream if you invoke stream() or a method from 
> the Stream interface (e.g. filter() or map()). It can also become a list, an 
> array, or an iterator. So the following is actually *not* possible:
>
> select(Token.class).filter(t -> t.getCoveredText().equals("blah")).shifted(1)
>
> because "shifted()" is a method from the builder personality of SelectFS 
> while "filter()" is a method of the Stream personality. However, this would 
> work:
>
> select(Token.class).filter(t -> t.getCoveredText().equals("blah")).skip(1)
>
> because "skip()" is a method on Stream.
>
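
The filter-then-skip combination is ordinary java.util.stream behaviour and can
be tried without UIMA. The sketch below uses plain strings as stand-ins for
tokens; StreamSkipDemo and skipFirstMatch are hypothetical names chosen for
this example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Plain java.util.stream example; strings stand in for Token annotations.
class StreamSkipDemo {
    // Keep the tokens starting with the given prefix, then drop the first match.
    static List<String> skipFirstMatch(List<String> tokens, String prefix) {
        return tokens.stream()
                .filter(t -> t.startsWith(prefix)) // Stream method
                .skip(1)                           // also a Stream method, so chaining works
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("blah@0", "foo@5", "blah@9", "blah@14");
        System.out.println(skipFirstMatch(tokens, "blah")); // [blah@9, blah@14]
    }
}
```

Once filter() has been called, the result is a java.util.stream.Stream, which
is why only Stream methods such as skip() remain available.
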
> Ok, but independently of the different personalities of select(), I 
> understand that you don't find it logical or intuitive that limit and 
> shifted interact with each other. But you do support the idea of capping 
> shift at 0 and simply ignoring any smaller values for bounded selections.
>
> Cheers,
>
> -- Richard

