On 06/22/2015 06:23 PM, Marshall Schor wrote:
In reading this paper, it seems one of the key ideas is "dynamic typing". There
seem to be multiple aspects to this, including type "adapters" of various
kinds, to enable more-easily fitting together independently developed
components. I also get the sense that making things "easy" for developers is a
value that dynamic typing provides. Are you thinking here of the JavaScript
style of declaring values as "var" instead of giving them specific static types?
If dynamic typing means something beyond getting independently developed
components' type systems to work together "easily", can you give a couple of use
cases of what the dream is here?
That would be a good start. Beyond that, think about what we call
generic annotators, i.e., annotators that take a spec as input (e.g., a
bunch of regex rules) and produce annotations or other data as output.
The data types that the generic annotator produces vary with the spec,
and so it can't have a static, external type system. It might produce
tokens with one spec, sentences with another, and person names with a third.
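To make the generic annotator idea concrete, here is a minimal Python sketch. It is not UIMA code: the function name `annotate`, the spec layout (annotation type name mapped to a regex), and the `type`/`begin`/`end` keys are all assumptions made for illustration. The point is only that the output types come from the spec, not from a type system declared ahead of time.

```python
import re


def annotate(text, spec):
    """Apply a spec (annotation type name -> regex) to text.

    Returns plain dicts. Which types appear in the output is decided
    entirely by the spec that was passed in.
    """
    annotations = []
    for type_name, pattern in spec.items():
        for match in re.finditer(pattern, text):
            annotations.append(
                {"type": type_name, "begin": match.start(), "end": match.end()}
            )
    # Order by position so downstream consumers see document order.
    annotations.sort(key=lambda a: (a["begin"], a["end"]))
    return annotations


text = "Marshall met Thilo. They talked."

# Same annotator, two different specs, two different output types.
token_spec = {"Token": r"\w+"}
sentence_spec = {"Sentence": r"[^.]+\."}

token_annotations = annotate(text, token_spec)
sentence_annotations = annotate(text, sentence_spec)
```

With a static, external type system, "Token" and "Sentence" would both have to be declared before the annotator could run; here they only exist as strings in the spec.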
Also, and I can't stress this enough, I want to be able to communicate
with annotators just at the level of the data. I want to be able to read
data from files, or from network streams. I want to read from Kafka or
sequence files in HDFS. And I want to be able to do that without having
to know the precise type system that the data was written with. And I
want to be able to do this in Python or Go if I feel like it, so there
must be no framework dependency. Think JSON.
Of course I need to know a thing or two about the data format, otherwise
the data is not very useful. However, if I just need the tokens, I don't
want to have to know all the rest, and I'd like this to be a lot easier
than it is now in UIMA.
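As a rough sketch of what "just the tokens" could look like, assuming the data arrives as one JSON object per line with a `type` field (that wire format, the field names, and the helper `select` are hypothetical, not anything UIMA defines):

```python
import io
import json


def select(stream, wanted_type):
    """Yield records of one type from a stream of JSON lines.

    Records of any other type are skipped without being interpreted,
    so the reader needs no knowledge of the rest of the type system.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("type") == wanted_type:
            yield record


# Stand-in for a file or network stream: anything that yields lines.
data = io.StringIO(
    '{"type": "Token", "begin": 0, "end": 8}\n'
    '{"type": "PersonName", "begin": 0, "end": 8, "confidence": 0.9}\n'
    '{"type": "Token", "begin": 9, "end": 12}\n'
)

only_tokens = list(select(data, "Token"))
```

The reader never learns what a "PersonName" is or which extra fields it carries, and nothing here depends on a framework; the same few lines could be written in Go.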
I put some concepts into the wiki-page; feel free to correct/augment etc.
Thanks! -Marshall
On 6/22/2015 5:30 AM, Thilo Goetz wrote:
Let me throw last year's dream into the ring then:
http://aclweb.org/anthology/W14-5209
--Thilo
On 06/18/2015 04:41 PM, Marshall Schor wrote:
I've put up another wiki page as a place to collect ideas for UIMA version 3.
It's a place to dream a bit. It goes with the earlier page,
https://cwiki.apache.org/confluence/display/UIMA/Modernizing+the+internals+of+UIMA
.
Feel free to contribute, of course!
The page is linked from the above page, but here's the direct link:
https://cwiki.apache.org/confluence/display/UIMA/Ideas+for+UIMAJ+v3
-Marshall