Hi all,
I'm just starting to get familiar with UIMA Ruta and the workbench, and
I'm having some strange issues.
I got a project from a co-worker who already prepared some scripts for
me to extend. The project has .html files in the input folder, and he
already provided a Ruta script to convert HTML markup into annotations.
The script is adapted from the Ruta manual:

    ENGINE utils.HtmlAnnotator;
    ENGINE utils.HtmlConverter;
    ENGINE HtmlViewWriter;
    TYPESYSTEM utils.HtmlTypeSystem;
    TYPESYSTEM utils.SourceDocumentInformation;

    Document{->CONFIGURE(HtmlAnnotator, "onlyContent"=true),
        EXEC(HtmlAnnotator, {TAG})};

    Document { -> CONFIGURE(HtmlConverter, "inputView" = "_InitialView",
        "outputView" = "plain", "expandOffsets"=false,
        "replaceLinebreaks"=true, "skipWhitespacs"=true,
        "linebreakReplacement"=" ", "processAll"=true),
        EXEC(HtmlConverter)};

    Document{ -> CONFIGURE(HtmlViewWriter, "inputView" = "plain",
        "outputView" = "_InitialView", "output" = "../converted"),
        EXEC(HtmlViewWriter)};

On my machine and with my settings, when I run this script, my console
gets spammed with
org.apache.uima.analysis_engine.AnalysisEngineProcessException errors
caused by java.io.FileNotFoundException
with the message "../converted (Permission denied)". I checked the
file permissions on this directory, which were 775. I even chmodded it
to 777, but the issue persists.
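
In case it helps to narrow this down, here is a minimal sketch of how
the relative path from the message above could be checked. It assumes
that "../converted" is resolved against the JVM working directory
(user.dir), which I have not verified for HtmlViewWriter. The class
name OutputPathCheck and the helper resolveAgainst are made up for
illustration and are not part of Ruta.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputPathCheck {

    // Resolve a (possibly relative) output path against a base directory
    // and collapse any ".." segments, so the real target becomes visible.
    static Path resolveAgainst(String baseDir, String output) {
        return Paths.get(baseDir).resolve(output).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // Assumption: the writer resolves "output" against the JVM working directory.
        String workingDir = System.getProperty("user.dir");
        String output = args.length > 0 ? args[0] : "../converted";

        Path resolved = resolveAgainst(workingDir, output);

        System.out.println("working dir : " + workingDir);
        System.out.println("resolved to : " + resolved);
        System.out.println("exists      : " + Files.exists(resolved));
        System.out.println("directory   : " + Files.isDirectory(resolved));
        System.out.println("writable    : " + Files.isWritable(resolved));
    }
}
```

Running it from the same working directory as the Ruta launch should
show whether the path ends up somewhere other than the directory I
chmodded.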
In spite of all these exceptions, the output still gets generated. I
could live with that if there weren't a second issue: although the
script should write the annotations into _InitialView, I need to
switch the view to "plain" in the editor to get plain text with
HTML annotations. The _InitialView still shows the HTML markup.
I think both issues are related. Any ideas?
Cheers,
Mandy
System Info: Eclipse Oxygen.3a Release (4.7.3a), UIMA Ruta Workbench
2.6.1, OS Kubuntu 18.04