[ 
https://issues.apache.org/jira/browse/TIKA-461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004163#comment-13004163
 ] 

Sjoerd Smeets edited comment on TIKA-461 at 3/8/11 7:52 PM:
------------------------------------------------------------

Added a test document for CC and BCC metadata, plus a test document for email 
with large address lists, for TIKA-461-config.patch

      was (Author: ssmeets):
    CC and BCC metadata testdocument plus a test for testing email with big 
addresslists for the TIKA-461-config-patch
  
> RFC822 messages not parsed
> --------------------------
>
>                 Key: TIKA-461
>                 URL: https://issues.apache.org/jira/browse/TIKA-461
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 0.7
>            Reporter: Joshua Turner
>            Assignee: Julien Nioche
>         Attachments: TIKA-461-config.patch, TIKA-461-parse.patch, 
> TIKA-461-plus-tests-1.patch, TIKA-461.patch, testRFC822-CC-BCC, 
> testRFC822-big, testRFC822-multipart
>
>
> Presented with an RFC822 message exported from Thunderbird, AutoDetectParser 
> produces an empty body, and a Metadata containing only one key-value pair: 
> "Content-Type=message/rfc822". Directly calling MboxParser likewise gives an 
> empty body, but with two metadata pairs: "Content-Encoding=us-ascii 
> Content-Type=application/mbox".
> A quick peek at the source of MboxParser shows that the implementation is 
> pretty naive. If the wiring can be sorted out, something like Apache James' 
> mime4j might be a better bet.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
