Hi, Once again thanks for the response. It is really appreciated :)
I tried moveTo(fs) instead of just using an iterator constructed from the FS. When I accidentally left the values on the FS unset, this appeared to give me all items of the specified type, but when I set the key property to what I was searching for, I got zero items back. I am not sure what I might be doing wrong here.

In the meantime, though, I have learned something that is maybe more important for our use case: the cost of indexing far exceeds the benefit of any expected lookup speed. We are annotating a number of items with a lot of extracted feature information, and the hope was to be able to quickly get the top 5 or 10 (or however many) items for a given key. That is why the index was sorted by key first in natural sort order and then by value in reverse order (a higher value is better): we could quickly move to the first item with the right key and then pull the topmost items until we had those we needed.

So even if I could get this to work optimally, it would not be beneficial in our case given the cost of indexing. It seems we would need many of those queries before it pays off, since the amount of feature information is much larger than the number of items it is associated with. I have therefore reached the preliminary conclusion not to have features in any index at all and to just use plain FS record structures instead. In our case it appears much cheaper to run through all target items, of which there are comparatively few, to find what we need than to index all associated features and find the relevant target items through feature lookup.

Cheers,
Mario

> On 6 Sep 2019, at 16:50, Marshall Schor <[email protected]> wrote:
>
> Please don't add to the indexes the FS you're temporarily using as the
> argument for the moveTo operation. (And of course, if you don't add it,
> you won't need to remove it...)
>
> If you describe your use case in a bit more detail, I can perhaps comment on
> this more.
>
> -Marshall
>
> On 9/6/2019 2:50 AM, Mario Juric wrote:
>> Hi,
>>
>> Thanks for responding.
>>
>> I tried with a temporary FS where the key value was set, but I got every
>> annotation from the index, so that didn’t appear to change anything, and it
>> also broke my unit tests immediately. I also stepped through the iterator
>> implementation and found construction of the iterator quite a bit complex
>> with an FS, so that went over my head without spending time to get a deeper
>> understanding of the underlying index implementation. Therefore I tried with
>> an indexed FS and this seemed to return the correct items, but it would be
>> awkward having to add some FS to the index in order to retrieve something
>> else and then having to remove the FS from the index again. I am now also in
>> doubt about the insertion costs, but I haven’t measured that yet.
>>
>> I am not sure how many use custom FSIndex, but currently the API doesn’t
>> really support very well the type of use cases that we are working with, so
>> this is a disappointment for us. Does UIMA 3 improve on this? We are still
>> on 2.x since we are awaiting the next major DKPro release with UIMA 3
>> because of dependencies.
>>
>> Thanks a lot and cheers,
>> Mario
>>
>>> On 5 Sep 2019, at 23:42, Richard Eckart de Castilho <[email protected]>
>>> wrote:
>>>
>>> On 5. Sep 2019, at 23:40, Marshall Schor <[email protected]> wrote:
>>>> The normal way to get the "binary search" kind of behavior is to get a
>>>> plain iterator over the sorted index, and then use the moveTo method,
>>>> specifying a target FS as the one to move to. The target FS can be a
>>>> "temporary" FS, one that is never added to the indexes, itself; it is
>>>> just used to supply values used in the comparison.
>>> Is there a way to do this using a "temporary" FS which does not take up CAS
>>> heap space in UIMAv2?
>>>
>>> -- Richard
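
For illustration, the two approaches discussed in this thread can be sketched in plain Java. This is not UIMA code and uses no UIMA API: the `Feature` class, the `ORDER` comparator, and the `topNSorted` and `topNScan` methods are hypothetical names made up for this sketch. It assumes a feature is just a key plus a numeric value, ordered by key ascending and then by value descending as described above. The probe object plays the role of the temporary, never-indexed FS that only supplies values for the comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TopFeatures {

    // Hypothetical stand-in for a feature annotation: a key plus a score.
    static final class Feature {
        final String key;
        final double value;

        Feature(String key, double value) {
            this.key = key;
            this.value = value;
        }
    }

    // Ordering from the mail: key in natural order, then value in reverse order.
    static final Comparator<Feature> ORDER = (a, b) -> {
        int byKey = a.key.compareTo(b.key);
        return byKey != 0 ? byKey : Double.compare(b.value, a.value);
    };

    // Sorted approach: binary-search to the first entry with the key, then read
    // at most n entries. Requires 'sorted' to be ordered by ORDER already.
    static List<Feature> topNSorted(List<Feature> sorted, String key, int n) {
        // The probe is never stored; it sorts before every finite value of its key.
        Feature probe = new Feature(key, Double.POSITIVE_INFINITY);
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (ORDER.compare(sorted.get(mid), probe) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        List<Feature> result = new ArrayList<>();
        for (int i = lo; i < sorted.size() && result.size() < n
                && sorted.get(i).key.equals(key); i++) {
            result.add(sorted.get(i));
        }
        return result;
    }

    // Unindexed approach: scan everything, keep the matches, sort only those.
    static List<Feature> topNScan(List<Feature> items, String key, int n) {
        List<Feature> matches = new ArrayList<>();
        for (Feature f : items) {
            if (f.key.equals(key)) {
                matches.add(f);
            }
        }
        matches.sort(ORDER);
        return new ArrayList<>(matches.subList(0, Math.min(n, matches.size())));
    }

    public static void main(String[] args) {
        List<Feature> items = new ArrayList<>();
        items.add(new Feature("b", 0.2));
        items.add(new Feature("a", 0.5));
        items.add(new Feature("b", 0.9));
        items.add(new Feature("b", 0.4));

        List<Feature> sorted = new ArrayList<>(items);
        sorted.sort(ORDER); // the up-front cost that only pays off over many queries

        for (Feature f : topNSorted(sorted, "b", 2)) {
            System.out.println(f.key + " " + f.value);
        }
    }
}
```

The sorted variant pays for ordering all features once and then answers each query in logarithmic time plus n reads; the scan variant pays nothing up front but touches every item on each query. Which one wins depends on how many queries are run against how much data, which is the trade-off described in the message above.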
