Hi,
On 12.09.2016 at 10:52, Joern Kottmann wrote:
> I strongly disagree here I think the really static type system (and with
> JCas even compile time static) in UIMA makes it hard reuse a component,
> because I need to write explicit type system converters in many cases to be
> able to use them.

In my opinion, the static type system is one of the big advantages of
UIMA compared to GATE. An explicit type system converter can be a
performance problem, but it is the only approach that works for
non-trivial types. By the way, a generic converter would cause even more
performance problems.

I can see your point, but I do not agree. How often does one write an
analysis engine compared to a converter? The converter is written once
and adapted only when the type system changes, which normally does not
happen often. So I would rather keep the advantages of a static type
system for developing analysis engines.

>
> The alternative to this would be a type system which is much less static
> (or dynamic) and APIs to write AEs which can adapt well to similar but
> different user defined type systems. This could be achieved by allowing
> type system mappings, by adding explicit support for adapters in the
> framework, allowing dynamic definition of types,

Type system mapping is not as easy as it sounds, and it leads exactly to
the explicit converters mentioned above. Yes, you can do that for simple
use cases, but not for complex type systems. This is not a problem
specific to UIMA but a general one.

I can see that type mapping, like sofa mapping for aggregate analysis
engines, can be handy, but it will work only for simple use cases, e.g.,
read-only access or equal feature ranges. Ruta, for example, also
provides type aliasing when importing type systems.

Dynamic type systems where new types and features are incrementally
added by analysis engines can be a nice feature, but they can also
reduce the maintainability of pipelines.
It would have been a nice feature for Ruta, since Ruta spams new types,
but generating the type system descriptors at compile time works
perfectly well for me now.

>
> Together with Thilo I wrote a paper which speaks a bit about this topic
> (see at 6.4):
> http://www.aclweb.org/anthology/W14-5209
>
> You have a different view and that is ok, and other people here too.
>

I know the paper, of course, and I liked it. There is a difference
between stating that something "is just wrong", or complaining about
JCas in general with arguments that are not accurate (in my opinion),
and providing arguments for what can be improved in UIMA and how.
Different views will always be for the better of UIMA if the arguments
are constructive.

> If have a large pipeline you will end up writing two converters if you use
> an AE which can't adapt to your type system, one to convert to the AEs type
> system, this one you place before, and one to convert back from the AE type
> system to yours. I was speaking here about a simple example, and not a
> simple pipeline.
>

Well, I implemented the converters for the major type systems once, and
now I can use the analysis engines wrapped in an aggregate analysis
engine together with the converters. This is of course not an optimal
solution, but I do not see a realistically better one. Can you provide a
better one that will work, e.g., for combining cTAKES and DKPro Core
components up to the parser level without loss of information? If yes,
I'll be the first to adopt it.

Best,

Peter

> Jörn
>
