[
https://issues.apache.org/jira/browse/UIMA-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Klügl reassigned UIMA-3512:
---------------------------------
Assignee: Peter Klügl
> Add additional engine parameter for Ruta HtmlConverter to configure linebreak
> replacement.
> ------------------------------------------------------------------------------------------
>
> Key: UIMA-3512
> URL: https://issues.apache.org/jira/browse/UIMA-3512
> Project: UIMA
> Issue Type: Improvement
> Components: ruta
> Affects Versions: 2.1.1ruta
> Reporter: Philip-Daniel Beck
> Assignee: Peter Klügl
> Fix For: 2.1.1ruta
>
> Attachments: linebreakReplacementEngineParameter.core_patch,
> linebreakReplacementEngineParameter.docbook_patch
>
>
> When converting an HTML file to plain text with the HtmlConverter engine in
> Ruta, the boolean engine parameter "replaceLinebreaks" decides whether
> linebreaks in the text are replaced. If set to true, all linebreaks are kept
> in the document. If set to false, all linebreaks are deleted, so the last
> word of a line and the first word of the next line are joined without any
> whitespace in between. It would often be better to replace a linebreak with
> a whitespace. To configure this, another engine parameter that defines the
> String with which a linebreak is replaced would be useful.
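
The requested behavior can be illustrated with a minimal, self-contained
sketch. It is not the HtmlConverter code and not the attached patch: the class
name LinebreakSketch, the method process, and the parameter name
linebreakReplacement are assumptions made only for this example, and the
true/false semantics of replaceLinebreaks simply follow the description above.

```java
import java.util.regex.Matcher;

// Hypothetical illustration only; not part of Ruta or of the attached patches.
final class LinebreakSketch {

    // Matches Windows (\r\n), old Mac (\r) and Unix (\n) linebreaks.
    private static final String LINEBREAK = "\r\n|\r|\n";

    /**
     * @param text                 plain text extracted from an HTML text node
     * @param replaceLinebreaks    as described in the issue: true keeps all
     *                             linebreaks, false removes them
     * @param linebreakReplacement assumed new parameter: the String that is
     *                             written in place of each removed linebreak
     */
    static String process(String text, boolean replaceLinebreaks,
                          String linebreakReplacement) {
        if (replaceLinebreaks) {
            // Linebreaks stay in the document unchanged.
            return text;
        }
        // quoteReplacement keeps '$' and '\' in the replacement literal.
        return text.replaceAll(LINEBREAK,
                Matcher.quoteReplacement(linebreakReplacement));
    }
}
```

Under these assumptions, an empty replacement String reproduces the current
behavior (process("last\nfirst", false, "") joins the two words), while a
single space gives the behavior asked for in this issue
(process("last\nfirst", false, " ") separates them by one space).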
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)