I typically use JavaMail to parse eml files, but it is not terribly
forgiving. I've looked at using Mime4J in some situations, most notably
when there are invalid headers, and its leniency is great!
For whatever reason, we sometimes get messages in which the date
parameters of a header (creation-date, modification-date) are not enclosed
in quotes. For example:
Content-Type: text/plain; name="attachment.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
        filename="attachment.txt";
        size=64;
        creation-date=Sat, 30 Apr 2005 19:28:29 -0300;
        modification-date=Sat, 30 Apr 2005 19:28:29 -0300
JavaMail cannot parse these, but Mime4J can, and with its DOM API it is
easy to rewrite these headers so that they are compliant. However, the DOM
API sometimes modifies other parts of the source message (this seems to be
related to parts that are labeled "quoted-printable" but are not actually
encoded that way), so I've started looking at the streaming components.
Ideally, I would like to leave the original message as-is, even where it is
otherwise incorrect, and change only these headers (rewriting either all
headers or just Content-Disposition, which appears to be the only place
this issue occurs).
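
To make the intended rewrite concrete, here is a minimal sketch of the fix
for a single header value. It uses only the JDK (no JavaMail or Mime4J
calls), and the class name DispositionDateQuoter and the method quoteDates
are hypothetical names of my own. It assumes the value has already been
extracted from the message, that only the creation-date, modification-date
and read-date parameters are affected, and that these parameter names do
not appear inside some other quoted value such as the filename.

```java
import java.util.regex.Pattern;

// Hypothetical helper: quotes unquoted date parameters in a
// Content-Disposition header value and leaves everything else untouched.
final class DispositionDateQuoter {

    // Matches a date parameter whose value does not start with a quote.
    // The value runs lazily up to the next ';' or the end of the input;
    // whitespace before that terminator is left outside the match, so
    // folding (CRLF plus indentation) is preserved as-is.
    private static final Pattern UNQUOTED_DATE = Pattern.compile(
        "(?i)\\b(creation-date|modification-date|read-date)"
        + "\\s*=\\s*([^\";\\s][^;]*?)(?=\\s*(?:;|$))");

    // Returns the header value with each unquoted date wrapped in quotes.
    // Values that are already quoted do not match and are not changed.
    static String quoteDates(String headerValue) {
        return UNQUOTED_DATE.matcher(headerValue).replaceAll("$1=\"$2\"");
    }
}
```

Applied to the example above, this should turn
creation-date=Sat, 30 Apr 2005 19:28:29 -0300 into
creation-date="Sat, 30 Apr 2005 19:28:29 -0300" while leaving filename and
size alone. What I am still missing is the part around it: locating each
Content-Disposition header in the raw message and copying everything else
through byte for byte.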
Does anyone have an example of how to do such a modification (or something
close enough, such as using a stream parser to make a copy of the original
message)?
Thanks in advance!