Re: big offsets efficiency, and multiple offsets

2013-12-05 Thread Jens Grivolla
I agree that it might make more sense to model our needs more directly instead of trying to squeeze it into the schema we normally use for text processing. But at the same time I would of course like to avoid having to reimplement many of the things that are already available when using

Re: big offsets efficiency, and multiple offsets

2013-12-05 Thread Jens Grivolla
I forgot to say that the text analysis view(s) will necessarily have to use character offsets so that we can obtain the coveredText, which means that all resulting annotations will also use character offsets. The merged view will need to use time-based offsets, which means that we have to

Re: big offsets efficiency, and multiple offsets

2013-12-05 Thread Eddie Epstein
On 05/12/13 10:04, Jens Grivolla wrote: I agree that it might make more sense to model our needs more directly instead of trying to squeeze it into the schema we normally use for text processing. But at the same time I would of course like to avoid having to reimplement many of the things

big offsets efficiency, and multiple offsets

2013-12-04 Thread Jens Grivolla
Hi, we're now starting the EUMSSI project, which deals with integrating annotation layers coming from audio, video and text analysis. We're thinking of basing it all on UIMA, with different views that have separate sofas for audio, video, transcribed text, etc. In order to align the different views

Re: big offsets efficiency, and multiple offsets

2013-12-04 Thread Richard Eckart de Castilho
Why is it bad if you cannot inherit from Annotation? The getCoveredText() will not work anyway if you are working with audio/video data. -- Richard On 04.12.2013, at 12:31, Jens Grivolla j+...@grivolla.net wrote: Hi, we're now starting the EUMSSI project, which deals with integrating

Re: big offsets efficiency, and multiple offsets

2013-12-04 Thread Jens Grivolla
True, but don't things like selectCovered() etc. expect Annotations (to match on begin/end)? So using Annotation might make it easier in some cases to select the annotations we're interested in. -- Jens On 04/12/13 15:35, Richard Eckart de Castilho wrote: Why is it bad if you cannot inherit

Re: big offsets efficiency, and multiple offsets

2013-12-04 Thread Richard Eckart de Castilho
selectCovered() and friends expect annotations (or AnnotationFS), yes. Anyway, I don't want to convince you to deviate from your idea. Frame offsets sound very reasonable. Just trying to discuss potential implications and confusions (e.g. getCoveredText() not working). Also, can I have

Re: big offsets efficiency, and multiple offsets

2013-12-04 Thread Marshall Schor
Echoing Richard, 1) It would perhaps make more sense to be more direct about each of the different types of data. UIMA has built in only the most popular things, and Annotation is one of them. Annotation derives from AnnotationBase, which just defines an associated Sofa / view. So it would make

Re: big offsets efficiency, and multiple offsets

2013-12-04 Thread Richard Eckart de Castilho
:) Btw. the indexing system in UIMA didn't appear extensible to me last time I checked. Suppose somebody introduced an x/y coordinate scheme for image data. This would call for some spatial index, e.g. a k-d tree. While it is possible to define different indexes of the bag, set, and