[ 
https://issues.apache.org/jira/browse/MIME4J-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated MIME4J-249:
-------------------------------
    Description: 
On TIKA-1970, [[email protected]] submitted an email that he 
generated in Mac Mail by "saving as text."  The file is available 
[here|https://issues.apache.org/jira/secure/attachment/12804129/Testemail-nodate.txt].
  The date is of format {{16 May 2016 at 09:30:32 GMT+1}}, and we're getting a 
{{null}} when we use the LenientFieldParser to parse the date field.
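
For illustration, here is a minimal Java sketch of one way to accept the Mac 
Mail format above. It is not Mime4j's parser and not the Tika workaround; the 
class name MacMailDateSketch, the method parse, and the regular expression are 
all assumptions made for this example. It also assumes that a bare "GMT" with 
no trailing offset means +0.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mime4j or Tika.
class MacMailDateSketch {

    // Matches e.g. "16 May 2016 at 09:30:32 GMT+1" (offset in whole hours, optional).
    private static final Pattern MAC_MAIL = Pattern.compile(
            "^(\\d{1,2} [A-Za-z]{3} \\d{4}) at (\\d{2}:\\d{2}:\\d{2}) GMT([+-]\\d{1,2})?$");

    private static final DateTimeFormatter LOCAL =
            DateTimeFormatter.ofPattern("d MMM uuuu HH:mm:ss", Locale.ENGLISH);

    // Returns the parsed date, or null if the value is not in the assumed format.
    static OffsetDateTime parse(String value) {
        Matcher m = MAC_MAIL.matcher(value.trim());
        if (!m.matches()) {
            return null;
        }
        try {
            LocalDateTime local = LocalDateTime.parse(m.group(1) + " " + m.group(2), LOCAL);
            int hours = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
            return OffsetDateTime.of(local, ZoneOffset.ofHours(hours));
        } catch (DateTimeException e) {
            // Covers invalid calendar values and offsets outside -18..+18.
            return null;
        }
    }

    public static void main(String[] args) {
        // prints 2016-05-16T09:30:32+01:00
        System.out.println(parse("16 May 2016 at 09:30:32 GMT+1"));
    }
}
```

Returning null on failure mirrors the behaviour described above only so that a 
caller could fall through to other candidate formats; a real fix may prefer a 
different error signal.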

After fixing that, I ran our Mime4j wrapper on ~19k rfc822 files from our 
regression testing corpus.  Mime4j failed to parse the date in ~3700 of those 
files (~20%).  I added some substantial workarounds in Tika, and would be happy 
to contribute test files/code.

I also found that Mime4j was misparsing dates with two-digit years, such as 
{{14 Dec 95 00:16:22 GMT}}, as the year 95 A.D. rather than 1995.
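
As a sketch of how two-digit years could be handled, the Java example below 
applies the obsolete-date rule from RFC 2822 section 4.3: two-digit years 00-49 
map to 2000-2049, 50-99 map to 1950-1999, and three-digit years have 1900 
added. It is not Mime4j's code; the class TwoDigitYearSketch and its methods 
are hypothetical, and it only accepts the single layout {{d MMM yy HH:mm:ss GMT}}.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mime4j or Tika.
class TwoDigitYearSketch {

    // Matches e.g. "14 Dec 95 00:16:22 GMT" with a 2-, 3-, or 4-digit year.
    private static final Pattern DATE = Pattern.compile(
            "^(\\d{1,2}) ([A-Za-z]{3}) (\\d{2,4}) (\\d{2}:\\d{2}:\\d{2}) GMT$");

    private static final DateTimeFormatter LOCAL =
            DateTimeFormatter.ofPattern("d MMM uuuu HH:mm:ss", Locale.ENGLISH);

    // RFC 2822 section 4.3 interpretation of short years.
    static int normalizeYear(int year, int digits) {
        if (digits == 2) {
            return year < 50 ? 2000 + year : 1900 + year;
        }
        if (digits == 3) {
            return 1900 + year;
        }
        return year;
    }

    // Returns the parsed date in UTC, or null if the value does not match.
    static OffsetDateTime parse(String value) {
        Matcher m = DATE.matcher(value.trim());
        if (!m.matches()) {
            return null;
        }
        int year = normalizeYear(Integer.parseInt(m.group(3)), m.group(3).length());
        String normalized = m.group(1) + " " + m.group(2) + " "
                + String.format(Locale.ROOT, "%04d", year) + " " + m.group(4);
        try {
            return OffsetDateTime.of(LocalDateTime.parse(normalized, LOCAL), ZoneOffset.UTC);
        } catch (DateTimeException e) {
            return null; // e.g. unknown month name or day 31 in a 30-day month
        }
    }

    public static void main(String[] args) {
        // prints 1995-12-14T00:16:22Z
        System.out.println(parse("14 Dec 95 00:16:22 GMT"));
    }
}
```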

  was:On TIKA-1970, [[email protected]] submitted an email 
that he generated in Mac Mail by "saving as text."  The file is available 
[here|https://issues.apache.org/jira/secure/attachment/12804129/Testemail-nodate.txt].
  The date is of format {{16 May 2016 at 09:30:32 GMT+1}}, and we're getting a 
{{null}} when we use the LenientFieldParser to parse the date field.


> Date parser could be more robust
> --------------------------------
>
>                 Key: MIME4J-249
>                 URL: https://issues.apache.org/jira/browse/MIME4J-249
>             Project: James Mime4j
>          Issue Type: Improvement
>          Components: parser (core)
>    Affects Versions: 0.7.2
>            Reporter: Tim Allison
>
> On TIKA-1970, [[email protected]] submitted an email that he 
> generated in Mac Mail by "saving as text."  The file is available 
> [here|https://issues.apache.org/jira/secure/attachment/12804129/Testemail-nodate.txt].
>   The date is of format {{16 May 2016 at 09:30:32 GMT+1}}, and we're getting 
> a {{null}} when we use the LenientFieldParser to parse the date field.
> After fixing that, I ran our Mime4j wrapper on ~19k rfc822 files from our 
> regression testing corpus.  Mime4j failed to parse the date in ~3700 of those 
> files (~20%).  I added some substantial workarounds in Tika, and would be 
> happy to contribute test files/code.
> I also found that Mime4j was misparsing dates with two-digit years, such as 
> {{14 Dec 95 00:16:22 GMT}}, as the year 95 A.D. rather than 1995.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
