On 10.06.2012 at 19:03, Marshall Schor wrote:

> Hmmm,  it seems to me that something is wrong if a UIMA pipeline ended up 
> sending a CAS to a sofa-unaware component without a default view having been 
> set up.  I would guess that in this situation, it would be better to throw an 
> exception rather than hide this by automatically creating the view.   If a 
> missing view is created, its subject-of-analysis would be left unset?  I'm 
> guessing that most sofa-unaware annotators would not expect that, and would 
> fail in mysterious ways.
> 
> What would be the use cases where it would be more valuable to create the 
> view, rather than signal something's amiss?

My use-case is an aggregate analysis engine that uses a CollectionReader as its 
first component (a CasMultiplier may also work, I didn't test that). UIMA 
doesn't support sofa mappings for readers other than in CPEs (or I missed 
something). We would like to add support for sofa-mapped readers in uimaFIT 
though and would like to do so implementing as little infrastructure as 
possible on top of UIMA. Ideally, we'd just cleverly configure UIMA to get the 
feature implemented.

So, to work around the fact that CollectionReaderDescriptions do not support 
sofa mappings, I configured an AnalysisEngineDescription for a 
CollectionReader. UIMA internally doesn't really care much which kind of 
processing component is declared in an AnalysisEngineDescription, because 
internally it is all handled the same. I dimly remember a post to one of the 
UIMA mailing lists saying that the distinction between readers, analysis 
engines and consumers is largely arbitrary and that everything could be done 
with CasMultipliers as well.

So when I run the aggregate, the collection reader tries to write data to some 
mapped sofa, but the sofa does not yet exist. The reader is not sofa-aware, so 
it shouldn't have to create its initial view itself. If I use a sofa-unaware 
CasMultiplier instead, I suppose the same thing will happen. The 
reader/CasMultiplier would set the sofa of course, but since it is 
sofa-unaware, it wouldn't create the view.

I guess another option would be to change CollectionReaderAdapter to create 
any missing initial view for sofa-unaware readers. That would not have any side 
effects on other component types and it would solve the problem for my use-case as well. 
The problem is, that doesn't work, because the 
PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess() already tries to 
access the mapped view and fails. Changing that to test if the 
mAnalysisComponent is a sofa-unaware CollectionReaderAdapter and creating a new 
view only in that case looks rather like a hack to me, although it would 
probably resolve the situation. I didn't test that yet, but if you think it 
reasonable, I can check it.

Actually, thinking about it, I wonder if missing views should not be created on 
the first request in general. I have several times seen people use some helper 
methods that try to get a view and if an exception is thrown create the view 
and return it.
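
A minimal sketch of such a get-or-create helper, for illustration only. To keep 
it runnable without UIMA on the classpath, it uses a small hypothetical stand-in 
class (ViewRegistry) instead of the real CAS, and NoSuchElementException instead 
of the UIMA exception; with the real API the same shape would presumably wrap 
CAS.getView(String) and CAS.createView(String). All class and method names in 
the sketch are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;

/**
 * Hypothetical stand-in for a CAS: it holds named views and throws when a
 * missing view is requested. This is NOT the UIMA API.
 */
final class ViewRegistry {
    private final Map<String, StringBuilder> views = new LinkedHashMap<>();

    /** Returns the named view, or throws if it has not been created yet. */
    StringBuilder getView(String name) {
        StringBuilder view = views.get(name);
        if (view == null) {
            throw new NoSuchElementException("No view named " + name);
        }
        return view;
    }

    /** Creates a new, empty view; refuses to overwrite an existing one. */
    StringBuilder createView(String name) {
        if (views.containsKey(name)) {
            throw new IllegalStateException("View already exists: " + name);
        }
        StringBuilder view = new StringBuilder();
        views.put(name, view);
        return view;
    }

    int viewCount() {
        return views.size();
    }
}

/** The helper pattern: try to get the view, create it if the lookup fails. */
final class ViewHelper {
    private ViewHelper() {
    }

    static StringBuilder getOrCreateView(ViewRegistry registry, String name) {
        try {
            return registry.getView(name);
        } catch (NoSuchElementException e) {
            // The view is missing, so create it on first request.
            return registry.createView(name);
        }
    }
}
```

Calling ViewHelper.getOrCreateView(registry, "source") twice returns the same 
view both times and leaves exactly one view in the registry. The view name 
"source" is only an example.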

Or maybe it'd make sense to simply add the possibility to declare sofa mappings 
to the CollectionReaderDescription.

-- Richard

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
[email protected] 
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
------------------------------------------------------------------- 





