[ 
https://issues.apache.org/jira/browse/CONNECTORS-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970023#comment-15970023
 ] 

Karl Wright commented on CONNECTORS-1410:
-----------------------------------------

[~kamaci]:  This code is not bounded in memory use; the entire message body 
must be read in here in order to be decoded.  That's not allowed for ManifoldCF.

{code}
-              InputStream is = msg.getInputStream();
+              InputStream is = new ByteArrayInputStream(extractBodyContent(msg).getBytes(StandardCharsets.UTF_8));
{code}
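
To illustrate the memory concern, here is a minimal sketch of what "bounded" means in practice. It is not ManifoldCF code and not a proposed fix: the class name BoundedCopy and the method copyBounded are hypothetical, and a byte array stands in for the message stream. The point is only that data passes through a fixed-size buffer, so heap use does not grow with the size of the message, whereas building a String from the whole body and calling getBytes() holds the full content in memory at least once.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical illustration only; not part of ManifoldCF or the attached patch.
final class BoundedCopy {

    // Copies 'in' to 'out' through a fixed-size buffer and returns the byte count.
    // Heap use inside this method is bounded by bufferSize, however large the stream is.
    static long copyBounded(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a large message body; a real connector would pass the
        // stream obtained from the message instead of an in-memory array.
        InputStream body = new ByteArrayInputStream(new byte[1_000_000]);
        // Stand-in for the real consumer; used here only so the sketch runs.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyBounded(body, sink, 8192);
        System.out.println("copied " + copied + " bytes through an 8 KiB buffer");
    }
}
```

Any body-only extraction would need to keep this property, i.e. decode and forward the body as a stream rather than materialize it as a String first.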

[~cguzel] The door is closed for non-critical fixes for 2.7.  This fix has 
problems (described above) and does not seem critical to me.  I am not going to 
hold the release for new features at this point.


> Binary Attachment Data as Plain Text at Email Content
> -----------------------------------------------------
>
>                 Key: CONNECTORS-1410
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1410
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Email connector
>    Affects Versions: ManifoldCF 2.6
>            Reporter: Furkan KAMACI
>            Assignee: Furkan KAMACI
>             Fix For: ManifoldCF 2.8
>
>         Attachments: CONNECTORS-1410.patch
>
>
> Previously, we indexed e-mails and their attachments together. With 
> CONNECTORS-1375 we changed this logic so that an e-mail and its attachments 
> are indexed separately.
> However, there is a problem: the content field of an e-mail that has 
> attachment(s) includes both the body and the attachments' binary content as 
> plain text.
> Because we index attachments separately, we can index just the body as content 
> instead of appending the e-mail body and all attachments' binary data as plain 
> text.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
