[ https://issues.apache.org/jira/browse/CONNECTORS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970023#comment-15970023 ]
Karl Wright commented on CONNECTORS-1410:
-----------------------------------------

[~kamaci]: This code is not bounded in memory use; the entire message body must be read in here in order to be decoded. That's not allowed for ManifoldCF.

{code}
- InputStream is = msg.getInputStream();
+ InputStream is = new ByteArrayInputStream(extractBodyContent(msg).getBytes(StandardCharsets.UTF_8));
{code}

[~cguzel] The door is closed for non-critical fixes for 2.7. This fix has problems (described above) and does not seem critical to me. I am not going to hold the release for new features at this point.

> Binary Attachment Data as Plain Text at Email Content
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1410
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1410
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Furkan KAMACI
>            Assignee: Furkan KAMACI
>             Fix For: ManifoldCF 2.8
>
>         Attachments: CONNECTORS-1410.patch
>
>
> Previously, we were indexing e-mails and its attachments together. We changed
> this logic with CONNECTORS-1375 as indexing e-mail and its attachments
> separately.
> However, there is a problem. Content fields of emails which has attachment(s)
> includes both body and attachments's binary content as plain text.
> As we index attachments separately, we can just index body as content instead
> of appending email body and all attachments' binary data as plain text.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
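
To illustrate the memory concern raised in the comment above, here is a minimal Java sketch that contrasts the two approaches using only java.io. It is not ManifoldCF or JavaMail code: the class name BoundedCopySketch, the methods bufferedBody and copyBounded, and the 8192-byte buffer size are all hypothetical names and values chosen for the example. The first method mirrors the shape of the patched line (whole body held as a String, then copied again into a byte array), while the second copies a stream through a fixed-size buffer so working memory stays constant regardless of message size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from ManifoldCF or the attached patch.
final class BoundedCopySketch {
    // Assumed chunk size: working memory is this buffer, whatever the message size.
    static final int BUFFER_SIZE = 8192;

    /**
     * Unbounded pattern: the caller already holds the whole body as a String,
     * and getBytes() allocates a second full-size copy before any byte is read.
     */
    static InputStream bufferedBody(String wholeBody) {
        return new ByteArrayInputStream(wholeBody.getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Bounded pattern: copies from in to out through one reusable buffer
     * and returns the number of bytes copied.
     */
    static long copyBounded(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "Hello, body".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(body)) {
            // In-memory sink used only to keep the demo self-contained;
            // a real consumer would be a file, socket, or indexing pipeline.
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            long copied = copyBounded(in, sink);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

In the bounded variant the only allocation that scales is whatever the downstream consumer chooses to keep. A fix along these lines would need to select the body part and hand its stream onward without first converting the content to a String.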