[ 
https://issues.apache.org/jira/browse/CONNECTORS-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julien Massiera updated CONNECTORS-1679:
----------------------------------------
    Description: The output of the HTML extractor is generated with escaped 
entities (eg '&' becomes '& amp ;'), which is not the wanted behavior as we 
want this connector to extract text from HTML as it is  (was: The output of the 
HTML extractor is generated with escaped entities (eg '&' becomes '&amp;'), 
which is not the wanted behavior as we want this connector to extract text from 
HTML as it is)

> HTML Extractor: output has escaped entities
> -------------------------------------------
>
>                 Key: CONNECTORS-1679
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1679
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: HTML extractor
>    Affects Versions: ManifoldCF 2.20
>            Reporter: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF 2.21
>
>         Attachments: patch-CONNECTORS-1679.txt
>
>
> The output of the HTML extractor is generated with escaped entities (eg '&' 
> becomes '& amp ;'), which is not the wanted behavior as we want this 
> connector to extract text from HTML as it is



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
