> On 29. Dec 2022, at 13:01, Pablo Duboue <[email protected]> wrote:
>
> Here is some dream concept code:
> https://gist.github.com/DrDub/9413410626b5a77d8f1f576f6447d64e (getting
> the syntax and approach right will take a lot of iterations and
> consultations of course)
Thanks for the example code :) It has some interesting ideas. I'll consider
them based on my background with Cassis and UIMA-J.
== Type system
I can see that you imagine defining types in a natural pythonic way here. For
Cassis, we chose a different approach that is based on a type system definition
(either created programmatically [1] or loaded from XML [2]) and then uses
factory methods to generate type classes (comparable to JCas classes).
We needed the type classes to have special properties and we wanted to be able
to handle UIMA features like type system merging - so we couldn't go with
simple Python classes.
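To illustrate why a factory is handy here, below is a minimal toy sketch. It is not how Cassis is implemented and it does not use the Cassis API: the dict-based type system and the helper names `merge_type_systems` and `make_type_class` are made up for this example. It only shows the general idea that merging has to happen on the definitions first, and the classes are generated afterwards.

```python
from dataclasses import field, make_dataclass

# Toy type system: type name -> {feature name: Python range type}.
# Hypothetical data model for illustration only, not the Cassis one.


def merge_type_systems(a, b):
    """Merge two definitions; a shared feature must have the same range type."""
    merged = {name: dict(features) for name, features in a.items()}
    for name, features in b.items():
        target = merged.setdefault(name, {})
        for feature, range_type in features.items():
            if target.get(feature, range_type) is not range_type:
                raise ValueError(f"Conflicting range type for {name}:{feature}")
            target[feature] = range_type
    return merged


def make_type_class(name, features):
    """Factory: generate a class for one type from its feature definitions."""
    fields = [(f, t, field(default=None)) for f, t in features.items()]
    return make_dataclass(name.split(".")[-1], fields)


ts_a = {"my.Token": {"begin": int, "end": int}}
ts_b = {"my.Token": {"pos": str}, "my.Sentence": {"begin": int, "end": int}}

# Merge first, then generate the class, so Token sees features from both sides.
merged = merge_type_systems(ts_a, ts_b)
Token = make_type_class("my.Token", merged["my.Token"])
token = Token(begin=0, end=5, pos="NN")
```

With hand-written Python classes, the `pos` feature contributed by the second definition would have to be patched into an already existing class, which is the kind of thing the factory approach avoids.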
== Access to CAS contents
Your Python code seems inspired by the UIMAv2 CAS index API.
UIMAv3 introduces a new "select" API for retrieving FSes from the CAS [3]. This
was inspired by the popular "select" methods of uimaFIT. In Cassis, a simple
version of select has been implemented [4] which feels more like the uimaFIT
methods than like the V3 select API.
Note that Cassis does not support indices or type priorities. To be honest,
those always seemed to be more in the way than helpful anyway. The UIMAv3
select API also ignores type priorities by default (they can be turned on for
a given select call though).
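To give a feeling for what a uimaFIT-style select without indices or type priorities boils down to, here is a small self-contained sketch. `ToyCas` and its methods are hypothetical stand-ins written for this mail, not the Cassis or UIMA-J classes; the ordering (ascending begin, then descending end) mirrors the usual annotation order minus the type-priority tie-breaker.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    begin: int
    end: int


@dataclass
class Token(Annotation):
    pass


@dataclass
class Sentence(Annotation):
    pass


class ToyCas:
    """Hypothetical stand-in for a CAS; not the Cassis or UIMA-J class."""

    def __init__(self):
        self._annotations = []

    def add(self, annotation):
        self._annotations.append(annotation)

    def select(self, type_):
        # Filter by type (including subtypes), then sort by ascending begin
        # and descending end. No type priorities are consulted.
        matches = [a for a in self._annotations if isinstance(a, type_)]
        return sorted(matches, key=lambda a: (a.begin, -a.end))

    def select_covered(self, type_, covering):
        # Keep only annotations that lie within the span of `covering`.
        return [
            a
            for a in self.select(type_)
            if covering.begin <= a.begin and a.end <= covering.end
        ]


cas = ToyCas()
sentence = Sentence(0, 11)
cas.add(Token(6, 11))
cas.add(sentence)
cas.add(Token(0, 5))

tokens = cas.select_covered(Token, sentence)
```

Here `tokens` contains the two tokens in document order even though they were added out of order, and no index definition was needed to get there.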
== Component concept
The Python annotation with component metadata on the analysis engine class
looks interesting. I wonder if you need the indexes though. Can you not work
simply with the built-in annotation index?
== Data mapping
The `wrap` code in there looks very interesting, e.g.
-----
SetFeature({MyNER.Source: "spaCy"}).wrap(
    TypeMapper(output={spacy.Sentence: MySentence, spacy.NER: MyNER}).wrap(
        SpacyAnnotator({"load": "en"})
    )
)
-----
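As a rough sketch of how I read the `wrap` idea: each wrapper holds an inner annotator and post-processes whatever that annotator produces. Everything below is an assumption made for illustration. Annotations are plain dicts, type names are strings, `Annotator`, `Wrapper` and `FakeNerAnnotator` are made-up names, and `FakeNerAnnotator` stands in for `SpacyAnnotator` so the example runs without spaCy. The real semantics in the gist may well differ.

```python
class Annotator:
    """Hypothetical base class: process(text) returns a list of annotation dicts."""

    def process(self, text):
        raise NotImplementedError


class Wrapper(Annotator):
    """Runs an inner annotator and transforms each annotation it returns."""

    def __init__(self):
        self.inner = None

    def wrap(self, inner):
        self.inner = inner
        return self  # returning self allows nested wrap(...) calls

    def process(self, text):
        return [self.transform(a) for a in self.inner.process(text)]

    def transform(self, annotation):
        return annotation


class TypeMapper(Wrapper):
    """Renames annotation types according to the `output` mapping."""

    def __init__(self, output):
        super().__init__()
        self.output = output

    def transform(self, annotation):
        mapped = dict(annotation)
        mapped["type"] = self.output.get(annotation["type"], annotation["type"])
        return mapped


class SetFeature(Wrapper):
    """Sets fixed feature values; keys are (type name, feature name) pairs."""

    def __init__(self, features):
        super().__init__()
        self.features = features

    def transform(self, annotation):
        updated = dict(annotation)
        for (type_name, feature), value in self.features.items():
            if updated["type"] == type_name:
                updated[feature] = value
        return updated


class FakeNerAnnotator(Annotator):
    """Stand-in for SpacyAnnotator: marks the whole text as one entity."""

    def process(self, text):
        return [{"type": "spacy.NER", "begin": 0, "end": len(text)}]


pipeline = SetFeature({("MyNER", "source"): "spaCy"}).wrap(
    TypeMapper(output={"spacy.NER": "MyNER"}).wrap(FakeNerAnnotator())
)
result = pipeline.process("Berlin")
```

In this reading the innermost annotator runs first, the type mapping is applied next, and the feature is set last, so `result` holds a single `MyNER` annotation with `source` set to `"spaCy"`.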
Cheers,
-- Richard
[1] https://github.com/dkpro/dkpro-cassis#creating-types-and-adding-features
[2] https://github.com/dkpro/dkpro-cassis#loading-a-cas
[3] https://uima.apache.org/d/uimaj-current/version_3_users_guide.html#_uv3.new_extended_apis.select
[4] https://github.com/dkpro/dkpro-cassis#selecting-annotations