On 1/8/07, Marshall Schor <[EMAIL PROTECTED]> wrote:
Here's a short proposal / straw-person that might address many of the
concerns raised.

1) Drop the JCas interface - only have the CAS interface.  (one less
interface; process(...) method doesn't need variants depending on kind
of API interface passed in being CAS / JCas).


Hmmm... I'm curious how it's possible to merge CAS and JCas,
especially if we need to do it without sacrificing any performance.
For example, JCas has somewhat expensive initialization where it tries
to load classes for each type in the type system.  We don't currently
pay that cost if JCas is never used.  But if JCas and CAS are merged,
how do we know if the user needs to have these classes loaded?  That's
only one particular issue.
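
To make the cost question concrete, here is a minimal sketch of deferring that initialization until the first JCas-style access. This is not how UIMA implements it; `LazyCoverClasses`, `coverClassFor` and the loader function are hypothetical names made up for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not UIMA code: loads one class per type in the
// type system, but only when a JCas-style accessor is first called.
final class LazyCoverClasses {
    private final List<String> typeNames;
    private final Function<String, Class<?>> classLoader; // assumed lookup hook
    private Map<String, Class<?>> coverClasses;           // null until first use

    LazyCoverClasses(List<String> typeNames, Function<String, Class<?>> classLoader) {
        this.typeNames = typeNames;
        this.classLoader = classLoader;
    }

    synchronized boolean isInitialized() {
        return coverClasses != null;
    }

    // The first call pays the full cost for every type; later calls are map lookups.
    synchronized Class<?> coverClassFor(String typeName) {
        if (coverClasses == null) {
            Map<String, Class<?>> loaded = new HashMap<>();
            for (String name : typeNames) {
                loaded.put(name, classLoader.apply(name));
            }
            coverClasses = loaded;
        }
        return coverClasses.get(typeName);
    }
}
```

Even with something like this, a merged interface would still have to decide which of its methods count as "JCas use" and trigger the load, which is the part I don't see an obvious answer to.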

Implementation challenges aside, I have some mixed feelings about
this.  In some ways it makes things simpler, but in other ways it
seems like CAS and JCas are different ways of thinking and that we
shouldn't try to hide that.  Maybe this is another area where we could
benefit from asking some users what they think.


2) Put all the user-facing methods for the CAS into the CAS interface
(this includes the view stuff and the sofa stuff).  Conceptually, this
interface contains all the methods needed by a user using the CAS.

3) Add a new interface called "CasViewSelector".  This interface has
just the 2-3 methods that select a view.  This interface is passed to
process methods.


Thilo mentioned a very similar idea I think.  This does solve the
issue of the "base CAS" with all its unsupported operations.  However
it has the drawback of not really matching the names that the UIMA
specification proposal uses.  The spec says that CASes are what's
passed between analytics and that CASes contain views.  We'd still be
calling a "CAS" what the spec calls a "View".  That just makes me
nervous about rushing to implement this.
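
Here is a rough sketch of how I read points 2 and 3, just to show where the naming ends up. The method names (`getView`, `createView`, `getViewName`) and the `SimpleViewSelector` class are my own assumptions for illustration, not something from the proposal or from the existing API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical: the user-facing interface from point 2. Under this proposal
// it keeps the name "CAS", but it is what the spec would call a View.
interface CAS {
    String getViewName();
}

// Hypothetical: the small interface from point 3 that is passed to process(...).
// It only selects or creates views, so it has no unsupported operations.
interface CasViewSelector {
    CAS getView(String viewName);
    CAS createView(String viewName);
}

// Toy in-memory implementation, only so the sketch can be run.
final class SimpleViewSelector implements CasViewSelector {
    private final Map<String, CAS> views = new LinkedHashMap<>();

    @Override
    public CAS createView(String viewName) {
        if (views.containsKey(viewName)) {
            throw new IllegalArgumentException("view already exists: " + viewName);
        }
        CAS view = () -> viewName;
        views.put(viewName, view);
        return view;
    }

    @Override
    public CAS getView(String viewName) {
        CAS view = views.get(viewName);
        if (view == null) {
            throw new IllegalArgumentException("no such view: " + viewName);
        }
        return view;
    }
}
```

Note that the object holding all the views, which is the thing the spec calls a CAS, ends up being the `CasViewSelector`, while `CAS` names a single view.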

Actually, forgetting about the spec for a second, what do we say in
our documentation is the thing that carries the analysis data between
annotators?  If we still say that's called a "CAS", and that the thing
that's serialized and sent between remote annotators is still a "CAS",
then this just doesn't seem consistent with this suggested naming of
interfaces.



But if I insist on calling that a CAS, and also don't want to break
backward compatibility for people who are using the existing CAS
interface, we seem to be stuck.  Hmm.  org.apache.uima3.cas.CAS,
anyone? :-)

-Adam
