Hi Mario,

> On 11. Nov 2020, at 09:11, Mario Juric <[email protected]> wrote:
> 
> I ran the latest benchmarks, and they seem to confirm your initial 
> conclusions, that the JCasUtil.selectCovered method performs better than the 
> corresponding SelectFSs coveredBy method. I somehow picked up the idea that 
> the new select API should improve performance, which it does for other select 
> calls, but it is not the case for coveredBy. I was wondering whether you have 
> some ideas as to why the new API performance isn't closer to the 
> JCasUtil.selectCovered method?

First, the benchmarks are not very representative at the moment. They 
create annotation structures that are highly overlapping and very dense - 
something you would not normally see in real-life data. So some operations 
show up slower in the benchmarks than they would in practice. The benchmarks 
should probably be adjusted to test against different types of structures.

The select methods of uimaFIT are very lightweight and much less 
flexible/configurable than SelectFSs. If you look at situations where you 
have small CASes and a very high frequency of method calls, the uimaFIT 
methods are likely to be better because they have a lower setup cost.

For larger CASes and lower call frequencies, SelectFSs can be better, in 
particular if you can iterate over the results (and maybe stop before having 
iterated over all of them) instead of retrieving them as a list. SelectFSs 
tries hard not to calculate the full result list, while uimaFIT will usually 
calculate and return the full result list.
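
To make the difference between the two styles concrete, here is a rough 
plain-Java sketch. It is not UIMA or uimaFIT code: Span, coveredEager, 
coveredLazy and the examined counter are names made up for illustration, and 
the linear scan ignores the index-based lookups the real implementations use. 
It only shows why stopping early helps a lazy result but not an eagerly built 
list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Plain-Java illustration only; none of these names are UIMA or uimaFIT API.
final class CoveredSelectSketch {

    // Hypothetical stand-in for an annotation with begin/end offsets.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Number of candidate spans tested so far (reset before each measurement).
    static int examined = 0;

    // True if inner lies completely within outer.
    static boolean covers(Span outer, Span inner) {
        examined++;
        return outer.begin <= inner.begin && inner.end <= outer.end;
    }

    // Eager style: always tests every candidate and builds the full list.
    static List<Span> coveredEager(List<Span> all, Span outer) {
        List<Span> result = new ArrayList<>();
        for (Span candidate : all) {
            if (covers(outer, candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    // Lazy style: candidates are tested only as the caller consumes the stream.
    static Stream<Span> coveredLazy(List<Span> all, Span outer) {
        return all.stream().filter(candidate -> covers(outer, candidate));
    }

    // Small made-up data set: three of the five spans lie within [10, 20].
    static List<Span> sampleSpans() {
        return Arrays.asList(new Span(0, 5), new Span(10, 12), new Span(11, 13),
                new Span(14, 20), new Span(20, 25));
    }

    public static void main(String[] args) {
        Span outer = new Span(10, 20);

        examined = 0;
        int hits = coveredEager(sampleSpans(), outer).size();
        System.out.println("eager: " + hits + " hits, " + examined + " spans examined");

        examined = 0;
        Span first = coveredLazy(sampleSpans(), outer).findFirst().get();
        System.out.println("lazy: first hit [" + first.begin + "," + first.end + "], "
                + examined + " spans examined");
    }
}
```

With this sample data, the eager variant examines all five spans to return 
three hits, while the lazy variant stops after two spans once the first hit 
is found. If you always consume the full result, the lazy variant gains 
nothing and only adds setup overhead, which matches the benchmark picture 
above.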

There may be more to it, but that's my insight/intuition for the moment.

Cheers,

-- Richard
