Hi,

Does the plain vs. _InitialView problem occur in the CASes in the output
folder or in the converted folder?


"output" should contain the result of the script processing. The
_InitialView is set by the launcher; it is static and cannot be changed.

"converted" should contain additional CASes where the plain view is
copied to the _InitialView, which hasn't been set yet.


(Although I think I wrote those rules as an example some time ago, I
personally prefer to perform the HTML conversion in Java.)


Best,


Peter


On 06.02.2019 at 16:18, Mandy Neumann wrote:
> Hi,
>
> after some additional digging I found this setting in the workbench
> preferences where SourceDocumentInformation is used for the output
> parameter. This seems to have fixed the permission issue, I get no
> more exceptions.
>
> Unfortunately, the problem with plain vs. _InitialView still persists,
> which is kind of annoying. Any ideas on that? (I'd like to also make
> sure that this is not causing any further problems in my planned
> workflow.)
>
> Best,
>
> Mandy
>
> On 06.02.19 at 15:40, Marshall Schor wrote:
>> hi,
>>
>> I'm not an expert, but I'm guessing that there still is a permissions
>> issue, perhaps on a different file or directory than the one you checked.
>>
>> Try having someone else take a look at your stack trace / error message,
>> and your file system permissions.  A second pair of eyes often is helpful
>> (I speak from personal experience).
>>
>> Cheers. -Marshall
>>
>> On 2/6/2019 5:44 AM, Mandy Neumann wrote:
>>> Hi all,
>>>
>>> I'm just starting to get familiar with UIMA Ruta and the workbench, and
>>> I'm having some strange issues.
>>>
>>> I got a project from a co-worker who already prepared some scripts for me
>>> to extend. The project has .html files in the input folder, and he already
>>> provided a Ruta script to convert HTML markup into annotations. The script
>>> is adapted from the Ruta manual:
>>>
>>>> ENGINE utils.HtmlAnnotator;
>>>> ENGINE utils.HtmlConverter;
>>>> ENGINE HtmlViewWriter;
>>>> TYPESYSTEM utils.HtmlTypeSystem;
>>>> TYPESYSTEM utils.SourceDocumentInformation;
>>>>
>>>> Document{->CONFIGURE(HtmlAnnotator, "onlyContent"=true),
>>>>      EXEC(HtmlAnnotator, {TAG})};
>>>>
>>>> Document { -> CONFIGURE(HtmlConverter, "inputView" = "_InitialView",
>>>>      "outputView" = "plain", "expandOffsets"=false,
>>>>      "replaceLinebreaks"=true, "skipWhitespacs"=true,
>>>>      "linebreakReplacement"=" ", "processAll"=true),
>>>>      EXEC(HtmlConverter)};
>>>>
>>>> Document{ -> CONFIGURE(HtmlViewWriter, "inputView" = "plain",
>>>>      "outputView" = "_InitialView", "output" = "../converted"),
>>>>      EXEC(HtmlViewWriter)};
>>> On my machine and with my settings, when I run this script, my console
>>> gets spammed with
>>> org.apache.uima.analysis_engine.AnalysisEngineProcessExceptions
>>> caused by java.io.FileNotFoundException with the message
>>> "../converted (Permission denied)". I checked the file permissions on
>>> this directory which were 775 - I even chmodded to 777 but still the same
>>> issue.
>>>
>>> In spite of all these exceptions, the output still gets generated, though.
>>> I would be fine with it if there weren't another issue - although the
>>> script should write the annotations into _InitialView, I need to change
>>> the view to "plain" in the editor to get plain text with HTML annotations.
>>> The _InitialView still shows the html markup.
>>>
>>> I think both issues are related. Any ideas?
>>>
>>> Cheers,
>>>
>>> Mandy
>>>
>>>
>>> System Info: eclipse Oxygen.3a Release (4.7.3a), UIMA Ruta workbench 2.6.1,
>>> OS Kubuntu 18.04
>>>
>>>
-- 
Dr. Peter Klügl
R&D Text Mining/Machine Learning

Averbis GmbH
Tennenbacher Str. 11
79106 Freiburg
Germany

Fon: +49 761 708 394 0
Fax: +49 761 708 394 10
Email: peter.klu...@averbis.com
Web: https://averbis.com

Headquarters: Freiburg im Breisgau
Register Court: Amtsgericht Freiburg im Breisgau, HRB 701080
Managing Directors: Dr. med. Philipp Daumke, Dr. Kornél Markó
