Re: Backwards compatibility for CAS API redesign

2006-12-21 Thread Adam Lally

On 12/21/06, Thilo Goetz [EMAIL PROTECTED] wrote:

 The idea is that a CAS has a current view (best term I can think of
 for it right now).  Any methods on the CAS that are view-oriented will
 apply to the current view.  This includes but is not limited to:
 getSofa()
 getDocumentText()
 getIndexRepository()
 addFsToIndexes()
 createAnnotation(int begin, int end) //needs to know which Sofa to refer to

It seems to me that this makes the CAS a view, maybe a deprecated one ;-)



Well, the current view isn't fixed.  For each annotator that's called
the current view may actually be a different physical view.  That's
why I think the better mental model is of the CAS having several views
and at any given time one is designated as the current view.



 Note that this approach also allows single-sofa application code to
 work.  We have a lot of code that does:
 AnalysisEngine ae = ...
 CAS cas = ae.newCAS();
 cas.setDocumentText(someString);
 ae.process(cas);

 and I think it would be really nice if this continues to work.

Very true, if this should cease to work, it would break a lot of code.
+1 to preserving this functionality.



Excellent.  There haven't been nearly enough +1's in this thread so far. :)



 /**
 * Gets the global index repository, which provides access to all indexed FS
 * in the entire CAS.
 */
 FSIndexRepository CAS.getGlobalIndexRepository()

 /**
 * Gets the index repository for the current view.
 */
 FSIndexRepository CAS.getIndexRepository()

And what about addFsToIndexes()?  I guess it should be local to the
current view.


Yes, for backwards compatibility to work we would need
CAS.addFsToIndexes() to apply to the current view only.



What I'm not so sure about is, do we need
addToAllIndexes()?  It doesn't make sense anyway to add annotations to
indexes of other views.


We need to sort out the meaning of global indexes over on the other
thread before we can come to a final answer here.  But, I was hoping
that if we have CAS.getGlobalIndexRepository() we'd also have
CAS.addFsToGlobalIndexes(), just for consistency of naming.



I'll just say this once, because I know I won't get through with
changing it: to me, the term view in this context has different
associations from what we mean by it.  When I hear indexes and views, I
think databases.  In DBs, a view is just a different way to look at your
data, and not necessarily a filter.  Our views are always filters, and
don't make the data accessible in any different way than it was before.
On the other hand, our use of the term index is not DB conformant
either, so maybe I should just get this association out of my head.  I
do wonder if other people have the same issue, though.



Duly noted. :)  Maybe documentation can help... in the chapter that
introduces the CAS we can point out that our definitions are not
consistent with how those terms are used in databases.

-Adam


Backwards compatibility for CAS API redesign

2006-12-19 Thread Adam Lally

This is a proposal for how to preserve some amount of backwards
compatibility if we do the CAS / CasView API redesign currently being
discussed.  This only addresses single-Sofa annotators/applications,
but that is the vast majority of the code out there and so I'm willing
to accept breaking compatibility for multi-Sofa code.

The idea is that a CAS has a current view (best term I can think of
for it right now).  Any methods on the CAS that are view-oriented will
apply to the current view.  This includes but is not limited to:
getSofa()
getDocumentText()
getIndexRepository()
addFsToIndexes()
createAnnotation(int begin, int end) //needs to know which Sofa to refer to


The current view is determined by the framework and can be different
for different annotators.  For single-sofa annotators the current view
is the view that the annotator should process, as determined by sofa
mappings in the usual way.
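
To make the delegation idea concrete, here is a minimal sketch. It does not use the real UIMA interfaces: ToyCas, ToyView, getView(), setCurrentView() and the view name "initial" are all hypothetical names invented for illustration. It only shows view-oriented methods on the CAS forwarding to whichever view is currently designated.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Small demo driver; everything here is a toy model, not the UIMA API.
class CurrentViewDemo {
    public static void main(String[] args) {
        ToyCas cas = new ToyCas();
        // Single-sofa style: no view is mentioned, the current view is used.
        cas.setDocumentText("some text");
        System.out.println(cas.getDocumentText());

        // Assumption: the framework (not the annotator) switches the current view.
        cas.setCurrentView("translation");
        System.out.println(cas.getDocumentText()); // text of the other view
    }
}

// Hypothetical stand-in for a view: holds the document text of one Sofa.
class ToyView {
    private final String name;
    private String documentText;

    ToyView(String name) { this.name = name; }

    String getName() { return name; }
    String getDocumentText() { return documentText; }
    void setDocumentText(String text) { this.documentText = text; }
}

// Hypothetical stand-in for the CAS: owns several views, one of them current.
class ToyCas {
    private final Map<String, ToyView> views = new LinkedHashMap<>();
    private ToyView currentView;

    ToyCas() {
        // A new CAS starts with its initial view as the current view.
        currentView = getView("initial");
    }

    // Returns the named view, creating it on first use.
    ToyView getView(String name) {
        return views.computeIfAbsent(name, ToyView::new);
    }

    // Designates a different view as current.
    void setCurrentView(String name) {
        currentView = getView(name);
    }

    // View-oriented methods simply forward to the current view.
    String getDocumentText() { return currentView.getDocumentText(); }
    void setDocumentText(String text) { currentView.setDocumentText(text); }
}
```

In this toy model, code that never mentions views keeps working because it always lands on the current view, while multi-view code can still reach a specific view by name.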

Note that this approach also allows single-sofa application code to
work.  We have a lot of code that does:
AnalysisEngine ae = ...
CAS cas = ae.newCAS();
cas.setDocumentText(someString);
ae.process(cas);

and I think it would be really nice if this continues to work.


We could deprecate these APIs and encourage people to switch to the
view-oriented APIs, which would be something like:
AnalysisEngine ae = ...
CAS cas = ae.newCAS();
CasView initialView = cas.getInitialView();
initialView.setDocumentText(someString);
ae.process(cas);


Although really, I wonder if we can explain the current view concept
well enough that it doesn't need to be deprecated and can continue to
be used for single-Sofa cases.


One consequence of this approach when combined with the global
indexes proposal is that methods on the CAS interface that deal with
the global indexes need to have different names than our current
methods.  For example:

/**
 * Gets the global index repository, which provides access to all indexed FS
 * in the entire CAS.
 */
FSIndexRepository CAS.getGlobalIndexRepository()

/**
 * Gets the index repository for the current view.
 */
FSIndexRepository CAS.getIndexRepository()
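
A rough sketch of how the two names could sit side by side follows. It is a toy model only: IndexedCas and setCurrentView() are hypothetical, feature structures are represented as plain strings, and treating the global repository as the union of all per-view indexes is an assumption made purely for illustration, since the meaning of global indexes has not been settled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Small demo driver for the toy model below.
class GlobalIndexDemo {
    public static void main(String[] args) {
        IndexedCas cas = new IndexedCas();
        cas.addFsToIndexes("Token[0,5]");      // goes to the initial view
        cas.setCurrentView("translation");
        cas.addFsToIndexes("Token[0,7]");      // goes to the translation view

        System.out.println(cas.getIndexRepository());
        System.out.println(cas.getGlobalIndexRepository());
    }
}

// Hypothetical CAS with one index per view; FSs are modeled as strings.
class IndexedCas {
    private final Map<String, List<String>> indexesByView = new LinkedHashMap<>();
    private String currentViewName = "initial";

    // Assumption: called by the framework before each annotator runs.
    void setCurrentView(String viewName) {
        currentViewName = viewName;
    }

    // Existing name keeps its meaning: adds to the current view only.
    void addFsToIndexes(String fs) {
        indexesByView.computeIfAbsent(currentViewName, k -> new ArrayList<>()).add(fs);
    }

    // Existing name keeps its meaning: the index of the current view.
    List<String> getIndexRepository() {
        return new ArrayList<>(
            indexesByView.getOrDefault(currentViewName, Collections.emptyList()));
    }

    // New name for the CAS-wide access path (modeled here as a union of views).
    List<String> getGlobalIndexRepository() {
        List<String> all = new ArrayList<>();
        for (List<String> perView : indexesByView.values()) {
            all.addAll(perView);
        }
        return all;
    }
}
```

The point of the sketch is only the naming: the unqualified methods stay bound to the current view, so existing single-Sofa code sees no change, and anything CAS-wide gets an explicitly "global" name.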



This may not be absolutely the cleanest design we could come up with,
but I think it's reasonable and I think it's worth trading off a
little elegance in return for not overly frustrating our users and
creating barriers to their migration to Apache UIMA.

-Adam