[ 
https://issues.apache.org/jira/browse/MAILBOX-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253875#comment-17253875
 ] 

René Cordier commented on MAILBOX-403:
--------------------------------------

[~btellier] can you help close this ticket? I don't have the rights on MAILBOX.
Thanks!

> Email main body is also indexed as an attachment
> ------------------------------------------------
>
>                 Key: MAILBOX-403
>                 URL: https://issues.apache.org/jira/browse/MAILBOX-403
>             Project: James Mailbox
>          Issue Type: Bug
>            Reporter: Benoit Tellier
>            Priority: Major
>
> h2. What
> I discovered that the main body part, holding the text of an email, and 
> already indexed as part of textBody/htmlBody properties, is also indexed as 
> an attachment.
> This behaviour is functionally wrong, as it returns attachment hits for terms 
> contained in the body of the message. 
> It also causes a larger index size, meaning higher disk costs and higher
> latencies.
> h2. Definition of done
> Unit tests demonstrating that, with ElasticSearch, main bodies are no longer
> indexed as attachments.
> h2. How
> When turning child subparts into attachments (flattening), only keep MIME
> parts that explicitly have a Content-Disposition (either inline or
> attachment).
> As a side effect, this also avoids indexing multiparts as attachments (they
> were not filtered out...)
> Proposed fix: https://github.com/linagora/james-project/pull/4152
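
The flattening rule described under "How" can be illustrated with a minimal sketch. This is not the code from the proposed fix: the Part class, its fields, and the flatten/hasExplicitDisposition helpers are hypothetical names introduced only for illustration, and the sketch assumes the disposition type (inline or attachment) has already been parsed out of the Content-Disposition header, without its parameters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class AttachmentFlattener {
    /** Hypothetical, simplified MIME part; not the James or mime4j model. */
    static final class Part {
        final String contentType;
        final Optional<String> disposition; // empty when no Content-Disposition header is present
        final List<Part> children;

        Part(String contentType, String disposition, List<Part> children) {
            this.contentType = contentType;
            this.disposition = Optional.ofNullable(disposition);
            this.children = children;
        }

        static Part leaf(String contentType, String disposition) {
            return new Part(contentType, disposition, List.of());
        }
    }

    /** True only when the part explicitly declares "inline" or "attachment". */
    static boolean hasExplicitDisposition(Part part) {
        return part.disposition
            .map(value -> value.trim().toLowerCase(Locale.ROOT))
            .filter(value -> value.equals("inline") || value.equals("attachment"))
            .isPresent();
    }

    /** Walks the child subparts of the root and keeps only parts with an explicit disposition. */
    static List<Part> flatten(Part root) {
        List<Part> kept = new ArrayList<>();
        for (Part child : root.children) {
            collect(child, kept);
        }
        return kept;
    }

    private static void collect(Part part, List<Part> kept) {
        if (hasExplicitDisposition(part)) {
            kept.add(part);
        }
        // Multipart containers usually carry no disposition, so they are skipped
        // while their children are still visited.
        for (Part child : part.children) {
            collect(child, kept);
        }
    }

    public static void main(String[] args) {
        Part message = new Part("multipart/mixed", null, List.of(
            new Part("multipart/alternative", null, List.of(
                Part.leaf("text/plain", null),
                Part.leaf("text/html", null))),
            Part.leaf("application/pdf", "attachment"),
            Part.leaf("image/png", "inline")));

        // Prints application/pdf, then image/png; the text bodies and the
        // multipart containers are not reported as attachments.
        for (Part part : flatten(message)) {
            System.out.println(part.contentType);
        }
    }
}
```

In this sketch the text/plain and text/html bodies are dropped because they declare no disposition, which is the behaviour the ticket asks for. How a body part that does carry an explicit inline disposition should be treated is left open here.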



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscr...@james.apache.org
For additional commands, e-mail: server-dev-h...@james.apache.org
