[ https://issues.apache.org/jira/browse/MAILBOX-403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253875#comment-17253875 ]
René Cordier commented on MAILBOX-403:
--------------------------------------

[~btellier] can you help close this ticket? I don't have the rights on MAILBOX. Thanks!

> Email main body is also indexed as an attachment
> ------------------------------------------------
>
>                 Key: MAILBOX-403
>                 URL: https://issues.apache.org/jira/browse/MAILBOX-403
>             Project: James Mailbox
>          Issue Type: Bug
>            Reporter: Benoit Tellier
>            Priority: Major
>
> h2. What
> I discovered that the main body part, holding the text of an email, and
> already indexed as part of the textBody/htmlBody properties, is also indexed
> as an attachment.
> This behaviour is functionally wrong, as it returns attachment hits for terms
> contained in the body of the message.
> It also causes a larger index size, meaning higher disk costs and higher
> latencies.
> h2. Definition of done
> Unit tests demonstrating that, in ElasticSearch, main bodies are no longer
> indexed as attachments.
> h2. How
> When turning child subparts into attachments (flattening), only keep MIME
> parts that explicitly have a Content-Disposition (either inline or
> attachment).
> As a side effect, this also avoids indexing multiparts as attachments (they
> were not filtered out...)
> Proposed fix: https://github.com/linagora/james-project/pull/4152

--
This message was sent by Atlassian Jira (v8.3.4#803005)
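
For readers of the archive, the filtering rule described under "How" can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the proposed fix: the `MimePart` record, `hasExplicitDisposition` and `flattenAttachments` are hypothetical names, and the Content-Disposition handling is reduced to reading the disposition type before any `;` parameters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class AttachmentFlattening {

    // Hypothetical, simplified stand-in for a parsed MIME part (not the James model).
    record MimePart(String contentType, Optional<String> contentDisposition, List<MimePart> children) {
        static MimePart leaf(String contentType, String disposition) {
            return new MimePart(contentType, Optional.ofNullable(disposition), List.of());
        }

        static MimePart multipart(String contentType, MimePart... children) {
            return new MimePart(contentType, Optional.empty(), List.of(children));
        }
    }

    // True only when the part explicitly declares an "inline" or "attachment" disposition.
    // Assumption: the header value may carry parameters (e.g. a filename) after a ';'.
    static boolean hasExplicitDisposition(MimePart part) {
        return part.contentDisposition()
            .map(value -> value.split(";", 2)[0].trim().toLowerCase(Locale.ROOT))
            .filter(type -> type.equals("inline") || type.equals("attachment"))
            .isPresent();
    }

    // Walks the MIME tree depth-first and keeps only parts with an explicit disposition.
    // Body parts and multipart containers without one are skipped, but still traversed.
    static List<MimePart> flattenAttachments(MimePart root) {
        List<MimePart> result = new ArrayList<>();
        collect(root, result);
        return result;
    }

    private static void collect(MimePart part, List<MimePart> result) {
        if (hasExplicitDisposition(part)) {
            result.add(part);
        }
        for (MimePart child : part.children()) {
            collect(child, result);
        }
    }

    public static void main(String[] args) {
        MimePart message = MimePart.multipart("multipart/mixed",
            MimePart.multipart("multipart/alternative",
                MimePart.leaf("text/plain", null),
                MimePart.leaf("text/html", null)),
            MimePart.leaf("application/pdf", "attachment; filename=\"report.pdf\""),
            MimePart.leaf("image/png", "inline"));

        for (MimePart attachment : flattenAttachments(message)) {
            System.out.println(attachment.contentType());
        }
    }
}
```

With the sample message in `main`, only the `application/pdf` and `image/png` parts are kept; the `text/plain` and `text/html` bodies and both multipart containers are left out, which is the behaviour the issue asks for.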