Re: New CAS heap impl?

Eddie Epstein Fri, 19 Oct 2007 09:15:58 -0700

On 10/19/07, Thilo Goetz <[EMAIL PROTECTED]> wrote:
>
> >
> > As far as I know, the main requirement for delta CAS is that it is easy
> > (i.e. cheap) to know,
> >  1. which FS were created in the current call
> >  2. which preexisting FS were deleted from the index
> >  3. when setting a feature value, if the containing FS was preexisting
>
> None of these are particularly easy to do now, and they
> won't be any easier or harder when I'm done ;-)  As I said,
> there will still be unique IDs, and as long as you don't
> refer to the heap directly, my changes should not affect
> this design.



With the current design, the position of the top of the FS heap at the time
process is called is used to identify new versus preexisting FS during or
after the call: just compare any FS address to that position to know whether
it is new.
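
A minimal sketch of that comparison, assuming a simple bump-allocated heap
with integer addresses. The class and method names (HeapMarkSketch, allocate,
markCallStart, isNew) are made up for illustration and are not the actual CAS
implementation:

```java
// Hypothetical illustration of the "heap top as high-water mark" idea.
// None of these names come from the UIMA code base.
public class HeapMarkSketch {
    private int top = 1;   // next free heap address; 0 is reserved for "null"
    private int mark = 1;  // heap top recorded when the process call started

    /** Reserves size cells and returns the address of the new FS. */
    public int allocate(int size) {
        int addr = top;
        top += size;
        return addr;
    }

    /** Records the current heap top; meant to be called on entry to process. */
    public void markCallStart() {
        mark = top;
    }

    /** An FS is new if it was allocated at or above the recorded mark. */
    public boolean isNew(int fsAddr) {
        return fsAddr >= mark;
    }

    public static void main(String[] args) {
        HeapMarkSketch heap = new HeapMarkSketch();
        int preexisting = heap.allocate(3);  // created by the client
        heap.markCallStart();                // process call begins
        int created = heap.allocate(3);      // created during the call
        System.out.println(heap.isNew(preexisting));
        System.out.println(heap.isNew(created));
    }
}
```

The check is a single integer comparison, which is why it is cheap. It does
depend on addresses growing monotonically with allocation order.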

>
> > Another thing to keep in mind for calls to remote services is the
> > requirement that any FS references in the client are still valid after
> > making a call.
> >
> > As for impact on binary serialization performance, an easy experiment
> > would be to modify binary serialization to end up with a string list
> > instead of a string heap, using a scenario that had a lot of strings in
> > the CAS. This would give a good idea of the extra overhead of creating
> > individual FS objects.
>
> I must admit that I don't understand what you mean.
>
> For both paragraphs?

Consider the following code:
        AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(specifier);
        CAS cas = ae.newCAS();
        cas.setDocumentText("some text");
        AnnotationFS fs = cas.createAnnotation(cas.getAnnotationType(), 0, 4);
        ae.process(cas);
        System.out.println(fs.getCoveredText());

Preexisting fs in the client must be valid after a process call, no?

For the 2nd paragraph, I was referring to binary blob serialization.

Eddie
