Hi all,

I'm just starting to get familiar with UIMA Ruta and the workbench, and I'm having some strange issues.

I got a project from a co-worker who already prepared some scripts for me to extend. The project has .html files in the input folder, and he already provided a Ruta script to convert HTML markup into annotations. The script is adapted from the Ruta manual:

ENGINE utils.HtmlAnnotator;
ENGINE utils.HtmlConverter;
ENGINE HtmlViewWriter;
TYPESYSTEM utils.HtmlTypeSystem;
TYPESYSTEM utils.SourceDocumentInformation;

Document{->CONFIGURE(HtmlAnnotator, "onlyContent"=true), EXEC(HtmlAnnotator, {TAG})};

Document { -> CONFIGURE(HtmlConverter, "inputView" = "_InitialView",
    "outputView" = "plain", "expandOffsets"=false, "replaceLinebreaks"=true, "skipWhitespacs"=true, "linebreakReplacement"=" ", "processAll"=true),
      EXEC(HtmlConverter)};

Document{ -> CONFIGURE(HtmlViewWriter, "inputView" = "plain",
    "outputView" = "_InitialView", "output" = "../converted"),
    EXEC(HtmlViewWriter)};

On my machine and with my settings, when I run this script, my console gets spammed with org.apache.uima.analysis_engine.AnalysisEngineProcessException errors caused by java.io.FileNotFoundException with the message "../converted (Permission denied)". I checked the permissions on this directory, which were 775. I even changed them to 777 with chmod, but the issue remains.

Despite all these exceptions, the output still gets generated. I could live with that if there weren't a second issue: although the script should write the annotations into _InitialView, I have to switch the view to "plain" in the editor to see the plain text with HTML annotations. The _InitialView still shows the HTML markup.

I think both issues are related. Any ideas?

Cheers,

Mandy


System info: Eclipse Oxygen.3a Release (4.7.3a), UIMA Ruta Workbench 2.6.1, OS Kubuntu 18.04
