On 23.10.2013 16:34, Marshall Schor wrote:
> On 10/23/2013 8:36 AM, Peter Klügl wrote:
>> Is it correct that the type system may not change if the analysis engine
>> implementation extends JCasAnnotator_ImplBase? I cannot find the method
>> typeSystemInit() there. Hmm, should I really switch to CasAnnotator_ImplBase,
>> or have I missed something?
> I think the type system is "equal" for these 2 CASes, but not "==", since the
> "failing" case recreates a new CAS from the identical metadata. 


The types were not equal enough for a HashMap :-)
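
For anyone reading along, here is a minimal sketch of the effect in plain Java, without UIMA on the classpath. FakeType and buildTypeSystem() are made-up stand-ins, not UIMA API. The sketch assumes the type objects rely on identity-based equals()/hashCode(), which is what the behaviour in this thread suggests.

```java
import java.util.HashMap;
import java.util.Map;

final class TypeIdentitySketch {

  // Hypothetical stand-in for a type object. It deliberately does NOT
  // override equals()/hashCode(), so a HashMap compares keys by identity.
  static final class FakeType {
    final String name;

    FakeType(String name) {
      this.name = name;
    }
  }

  // Hypothetical stand-in for "create a type system from metadata":
  // every call creates fresh type objects, even for identical names.
  static Map<String, FakeType> buildTypeSystem(String... typeNames) {
    Map<String, FakeType> typeSystem = new HashMap<>();
    for (String typeName : typeNames) {
      typeSystem.put(typeName, new FakeType(typeName));
    }
    return typeSystem;
  }

  public static void main(String[] args) {
    Map<String, FakeType> first = buildTypeSystem("CW");
    Map<String, FakeType> second = buildTypeSystem("CW");

    // Internal index keyed on the type objects of the first type system.
    Map<FakeType, Integer> index = new HashMap<>();
    index.put(first.get("CW"), 1);

    System.out.println(index.containsKey(first.get("CW")));  // true
    System.out.println(index.containsKey(second.get("CW"))); // false
  }
}
```

The second lookup misses although both type objects carry the same name, because they are different instances.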


> UIMA is designed with the lifecycle:  1) assemble / configure pipeline,
> including merging type systems; 2) use the internal Java objects that were
> created in (1) to process multiple work-items, typically by reusing CASes (via
> reset()) or by getting new CASes from the AnalysisEngine representing the
> top level of the pipeline using analysisEngine.newJCas() or
> analysisEngine.newCAS().  This produces new CASes where the type system impl
> objects are == (identical).
>
> Approaches which produce type system objects which are equal but not == should
> be discouraged.
>
> You could probably easily detect when a user passes a CAS where the type system
> is not ==, and redo your internal setups...

Stupid me. Yes, I did that now.
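
A minimal sketch of what such a check could look like, again in plain Java with made-up stand-in classes (FakeTypeSystem, process(), knows() are all hypothetical). It is not the actual Ruta code, only the idea: remember the type system instance used for the last setup and redo the setup when a different instance shows up.

```java
import java.util.HashMap;
import java.util.Map;

final class TypeSystemGuardSketch {

  // Hypothetical stand-in for a type system; only its identity matters here.
  static final class FakeTypeSystem {
    final Map<String, Object> types = new HashMap<>();

    FakeTypeSystem(String... typeNames) {
      for (String typeName : typeNames) {
        types.put(typeName, new Object());
      }
    }
  }

  private FakeTypeSystem lastSeen;                       // instance used for the last setup
  private Map<String, Object> knownTypes = new HashMap<>();
  int rebuildCount = 0;                                  // only here to make the effect visible

  // Would be called at the start of every process call.
  void process(FakeTypeSystem current) {
    if (current != lastSeen) {                           // identity check, not equals()
      knownTypes = new HashMap<>(current.types);         // redo the internal setup
      lastSeen = current;
      rebuildCount++;
    }
    // ... rule matching would use knownTypes from here on ...
  }

  // True if the given type object belongs to the current setup.
  boolean knows(Object type) {
    return knownTypes.containsValue(type);
  }

  public static void main(String[] args) {
    TypeSystemGuardSketch engine = new TypeSystemGuardSketch();

    FakeTypeSystem ts1 = new FakeTypeSystem("CW");
    engine.process(ts1);
    engine.process(ts1);                                 // same instance: no rebuild

    FakeTypeSystem ts2 = new FakeTypeSystem("CW");
    engine.process(ts2);                                 // new instance: rebuild

    System.out.println(engine.rebuildCount);             // 2
    System.out.println(engine.knows(ts2.types.get("CW"))); // true
  }
}
```

The check costs one reference comparison per process call, so the common case (the same CAS reused via reset()) stays cheap.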

Thanks :-)

Peter


> -Marshall
>
>
>> Peter
>>
>> On 23.10.2013 14:35, Peter Klügl (JIRA) wrote:
>>>     [ 
>>> https://issues.apache.org/jira/browse/UIMA-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802850#comment-13802850
>>>  ] 
>>>
>>> Peter Klügl commented on UIMA-3357:
>>> -----------------------------------
>>>
>>> Thanks for reporting this. I added a test for now.
>>>
>>> The problem is that the type system has changed, at least its 
>>> representation in Java, but nobody told the analysis engine about it. On 
>>> the one hand, the environment of the script stores the known types. This is 
>>> done by {{initializeTypes()}}, either when the analysis engine has not been 
>>> initialized yet or when the analysis engine is forced to update itself with 
>>> each process call (parameter reloadScript). On the other hand, the internal 
>>> "indexing" (begin and end map in RutaBasic) uses the current CAS, its 
>>> annotations and their types. So we end up with different type objects for 
>>> the same types, which causes the problems.
>>>
>>>
>>>> CONTAINS fails when running script as AE in a pipeline with a new CAS
>>>> ---------------------------------------------------------------------
>>>>
>>>>                 Key: UIMA-3357
>>>>                 URL: https://issues.apache.org/jira/browse/UIMA-3357
>>>>             Project: UIMA
>>>>          Issue Type: Bug
>>>>          Components: ruta, uimaFIT
>>>>    Affects Versions: 2.0.1ruta, 2.1.0ruta
>>>>            Reporter: Daniel Maeurer
>>>>            Assignee: Peter Klügl
>>>>            Priority: Minor
>>>>
>>>> When running my Ruta script as an analysis engine in a pipeline, it does 
>>>> not work correctly when creating a new CAS and processing the pipeline a 
>>>> second time with the new CAS. 
>>>> While reusing the old CAS with "cas.reset()" works, creating a new CAS 
>>>> causes rules that include "CONTAINS" in the Ruta script to fail.
>>>> The Ruta script used in the example:
>>>> {code:title=mystic.ruta|borderStyle=solid}
>>>> PACKAGE de.tudarmstadt.algo.vpino.ruta;
>>>> DECLARE test;
>>>> Document{CONTAINS(CW)->MARK(test)};
>>>> {code}
>>>> The following Java class reproduces the error. It creates four XMI 
>>>> files. The last XMI file is missing the annotations created by rules 
>>>> that include "CONTAINS".
>>>> {code:title=MysticPipe.java|borderStyle=solid}
>>>> package org.uimafit.pipeline;
>>>> import java.io.File;
>>>> import java.io.FileOutputStream;
>>>> import java.io.IOException;
>>>> import java.io.OutputStream;
>>>> import java.util.ArrayList;
>>>> import java.util.List;
>>>> import org.apache.uima.UIMAFramework;
>>>> import org.apache.uima.analysis_engine.AnalysisEngine;
>>>> import org.apache.uima.analysis_engine.AnalysisEngineDescription;
>>>> import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
>>>> import org.apache.uima.cas.CAS;
>>>> import org.apache.uima.cas.impl.XmiCasSerializer;
>>>> import org.apache.uima.fit.factory.AnalysisEngineFactory;
>>>> import org.apache.uima.fit.pipeline.SimplePipeline;
>>>> import org.apache.uima.resource.ResourceInitializationException;
>>>> import org.apache.uima.resource.metadata.ResourceMetaData;
>>>> import org.apache.uima.util.CasCreationUtils;
>>>> import org.apache.uima.util.InvalidXMLException;
>>>> import org.apache.uima.util.XMLInputSource;
>>>> import org.apache.uima.util.XMLSerializer;
>>>> import org.xml.sax.SAXException;
>>>> public class MysticPipe {
>>>>    public static void main(String[] args) throws Exception {
>>>>            working("This is a test.", initPipeline());
>>>>            failing("This is a test.", initPipeline());
>>>>    }
>>>>    private static AnalysisEngine initPipeline()
>>>>            throws ResourceInitializationException, IOException, InvalidXMLException {
>>>>            File specFile = new File("./descriptor/de/tudarmstadt/algo/vpino/ruta/mysticEngine.xml");
>>>>            XMLInputSource in = new XMLInputSource(specFile);
>>>>            AnalysisEngineDescription ruta = (AnalysisEngineDescription)
>>>>                    UIMAFramework.getXMLParser().parseResourceSpecifier(in);
>>>>            return AnalysisEngineFactory.createEngine(ruta);
>>>>    }
>>>>    private static void working(String input, AnalysisEngine theEngine)
>>>>            throws ResourceInitializationException, AnalysisEngineProcessException,
>>>>            IOException, SAXException {
>>>>            final List<ResourceMetaData> metaData = new ArrayList<ResourceMetaData>();
>>>>            metaData.add(theEngine.getMetaData());
>>>>            final CAS cas = CasCreationUtils.createCas(metaData);
>>>>            System.out.println("create a new cas...");
>>>>            cas.setDocumentLanguage("de");
>>>>            cas.setDocumentText(input);
>>>>            SimplePipeline.runPipeline(cas, theEngine);
>>>>            writeXmiFile(cas, "works_test1");//CHECK
>>>>            //THE DIFFERENCE
>>>>            cas.reset();
>>>>            //END DIFFERENCE
>>>>            System.out.println("create a new cas...");
>>>>            cas.setDocumentLanguage("de");
>>>>            cas.setDocumentText(input);
>>>>            SimplePipeline.runPipeline(cas, theEngine);
>>>>            writeXmiFile(cas, "works_test2");//CHECK
>>>>    }
>>>>    private static void failing(String input, AnalysisEngine theEngine)
>>>>            throws ResourceInitializationException, AnalysisEngineProcessException,
>>>>            IOException, SAXException {
>>>>            final List<ResourceMetaData> metaData = new ArrayList<ResourceMetaData>();
>>>>            metaData.add(theEngine.getMetaData());
>>>>            final CAS cas = CasCreationUtils.createCas(metaData);
>>>>            System.out.println("create a new cas...");
>>>>            cas.setDocumentLanguage("de");
>>>>            cas.setDocumentText(input);
>>>>            SimplePipeline.runPipeline(cas, theEngine);
>>>>            writeXmiFile(cas, "works_test3"); // CHECK
>>>>            //THE DIFFERENCE
>>>>            final CAS cas2 = CasCreationUtils.createCas(metaData);
>>>>            //END DIFFERENCE
>>>>            System.out.println("create a new cas...");
>>>>            cas2.setDocumentLanguage("de");
>>>>            cas2.setDocumentText(input);
>>>>            SimplePipeline.runPipeline(cas2, theEngine);
>>>>            writeXmiFile(cas2, "fail_test4"); //FAIL
>>>>            return;
>>>>    }
>>>>    
>>>>    public static void writeXmiFile(CAS aCas, String Fname)
>>>>            throws IOException, SAXException {
>>>>            File outFile = new File("output", Fname + ".xmi");
>>>>            OutputStream out = null;
>>>>            try {
>>>>                    // out = new StringOutputStream();
>>>>                    out = new FileOutputStream(outFile);
>>>>                    XmiCasSerializer ser = new XmiCasSerializer(aCas.getTypeSystem());
>>>>                    XMLSerializer xmlSer = new XMLSerializer(out, false);
>>>>                    ser.serialize(aCas, xmlSer.getContentHandler());
>>>>            } finally {
>>>>                    if (out != null) {
>>>>                            out.close();
>>>>                    }
>>>>            }
>>>>    }
>>>> }
>>>> {code}
>>> --
>>> This message was sent by Atlassian JIRA
>>> (v6.1#6144)
>>>
