[ 
https://issues.apache.org/jira/browse/MIME4J-196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleg Kalnichevski updated MIME4J-196:
-------------------------------------

    Fix Version/s: 0.8

I am working on a set of low-level parsing routines that could be used to 
assemble more lenient / tolerant field parsers, but this issue may have to wait 
until 0.8.

Oleg 

> Lenient parsing of Mailadresses should be a little more lenient
> ---------------------------------------------------------------
>
>                 Key: MIME4J-196
>                 URL: https://issues.apache.org/jira/browse/MIME4J-196
>             Project: JAMES Mime4j
>          Issue Type: Wish
>          Components: parser (core)
>            Reporter: Jens Wilmer
>            Priority: Trivial
>             Fix For: 0.8
>
>
> Parsing a mail address as in https://issues.apache.org/jira/browse/MIME4J-31 
> results in a ParseException. Parsing a mail address starting with a dot (.) 
> also results in a ParseException.
> When parsing an address field with multiple addresses, the exception thrown 
> while parsing a single address is caught and null is returned as the 
> resulting address list. (This breaks Tika, as it expects an empty list rather 
> than null.)
> It would be nice if invalid addresses were handled more gracefully when 
> in lenient mode, and if at least the valid addresses were returned when 
> parsing an address list that contains a corrupted address.
> I am using Mime4j via the Apache Tika project to extract text from emails for 
> indexing in Lucene. The text stream of Tika is read directly by a Lucene field, 
> and indexing fails if an exception is thrown by Mime4j. This currently 
> happens every time a header field contains more than 1000 characters, because 
> Tika uses the unusable Mime4j standard configuration ( 
> https://issues.apache.org/jira/browse/TIKA-640 ), and every time a malformed 
> email address is encountered ( https://issues.apache.org/jira/browse/TIKA-641 
> ). 
> These problems can be taken care of in Tika, but there is no way for Tika to 
> retrieve the working mail addresses out of a list if Mime4j returns only 
> null; maybe this problem could be addressed in Mime4j.
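
For illustration only, the behaviour requested above (keep the addresses that parse, skip the ones that do not, and return an empty list rather than null) could be sketched roughly as follows. This is plain Java and does not use or reflect the Mime4j API. The class name, the method names and the validity check are hypothetical, and splitting on commas ignores quoted display names and group syntax, which a real field parser would have to handle. Dropping an address that starts with a dot is just one possible lenient policy; accepting it as-is would be another.

```java
import java.util.ArrayList;
import java.util.List;

public class LenientAddressListSketch {

    // Hypothetical and deliberately simplistic check; this is NOT the
    // RFC 5322 grammar and NOT what Mime4j implements.
    static boolean looksValid(String address) {
        int at = address.indexOf('@');
        return at > 0                              // non-empty local part
                && at == address.lastIndexOf('@')  // exactly one '@'
                && at < address.length() - 1       // non-empty domain
                && !address.startsWith(".")        // leading dot treated as malformed
                && address.indexOf(' ') < 0;       // no embedded spaces
    }

    // Returns every entry that passes the check. A malformed entry is
    // skipped instead of failing the whole list, and the result is never null.
    static List<String> parseLenient(String fieldBody) {
        List<String> result = new ArrayList<>();
        if (fieldBody == null) {
            return result;
        }
        // Assumption: entries are separated by plain commas.
        for (String part : fieldBody.split(",")) {
            String candidate = part.trim();
            if (!candidate.isEmpty() && looksValid(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The malformed middle entry is dropped, the other two survive.
        System.out.println(parseLenient(
                "alice@example.org, .bob@example.org, carol@example.org"));
        // A list with only malformed entries yields an empty list, not null.
        System.out.println(parseLenient(".bob@example.org"));
    }
}
```

With this shape, a caller such as Tika could iterate over the result without a null check and would still see the usable addresses when one entry in the field is corrupted.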

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
