[ https://issues.apache.org/jira/browse/JAMES-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000499#comment-16000499 ]
Quynh Nguyen commented on JAMES-2019:
-------------------------------------

We use TextExtractor for two cases:
- one for textBody
- the other for preview

In these cases we expect:
- textBody to have empty HTML tags replaced by "\n"
- preview to be a normalized string

This needs to be considered when we do the ticket https://issues.apache.org/jira/browse/JAMES-2018, because Jsoup normalizes strings by default, so textBody would be wrong in some cases.

> JMAP Preview is not display because of empty
> --------------------------------------------
>
>                 Key: JAMES-2019
>                 URL: https://issues.apache.org/jira/browse/JAMES-2019
>             Project: James Server
>          Issue Type: Bug
>            Reporter: Quynh Nguyen
>
> This one will be fixed automatically when we do this one
> https://issues.apache.org/jira/browse/JAMES-2018
> - With huge HTML, TikaTextExtractor does not remove the white space, so
> when we truncate the preview string, it returns MAX_LENGTH_TRUNCATION spaces
> - Another issue: Tika does not include the <title> tag in extracted content but
> Jsoup does --> we will remove the <title> tag with JsoupTextExtractor.
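To illustrate the preview problem described in the issue, here is a minimal sketch in plain Java of why whitespace has to be normalized before truncation. It does not use the James, Tika, or Jsoup APIs. The class name `PreviewSketch`, its methods, and the value chosen for `MAX_LENGTH_TRUNCATION` are assumptions made for illustration only; the issue mentions the constant but not its value.

```java
import java.util.regex.Pattern;

public class PreviewSketch {
    // Hypothetical limit: the issue names MAX_LENGTH_TRUNCATION but does not give its value.
    static final int MAX_LENGTH_TRUNCATION = 20;

    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Cuts the text to at most MAX_LENGTH_TRUNCATION characters.
    static String truncate(String text) {
        return text.length() <= MAX_LENGTH_TRUNCATION
                ? text
                : text.substring(0, MAX_LENGTH_TRUNCATION);
    }

    // Collapses every run of whitespace into a single space and trims both ends.
    static String normalize(String text) {
        return WHITESPACE_RUN.matcher(text).replaceAll(" ").trim();
    }

    // Preview as expected in the comment: normalize first, then truncate.
    static String preview(String extractedText) {
        return truncate(normalize(extractedText));
    }

    // Builds a string of `count` spaces, standing in for the leading
    // whitespace left over after extracting text from a large HTML body.
    static String spaces(int count) {
        StringBuilder builder = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            builder.append(' ');
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String extracted = spaces(500) + "Hello\n\n   world";
        // Truncating without normalizing yields only spaces.
        System.out.println("[" + truncate(extracted) + "]");
        // Normalizing first keeps the visible text.
        System.out.println("[" + preview(extracted) + "]");
    }
}
```

The same normalization must not be applied to textBody, where the expectation above is that line breaks are preserved.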