[ https://issues.apache.org/jira/browse/TIKA-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900088#comment-15900088 ]

Tim Allison commented on TIKA-1879:
-----------------------------------

For "from", I assumed a single sender (which isn't always the case with "on behalf of" and/or "via"), and I created separate fields for Exchange email formats, e.g. "/o=ExchangeLabs/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=polyspot1.onmicrosoft.com-50609-Some-One" was mapped to:
message_from_o=ExchangeLabs,
message_from_ou=Exchange Administrative Group (FY...)
message_from_cn=polyspot1....

However, this won't map neatly to handling the "to" fields.

One unsatisfactory option is to keep three parallel arrays of names, smtpemails and exchangeemails, with empty cells in smtpemails when there is an Exchange-formatted email and vice versa.

A cleaner option would be to have a single pair of parallel arrays, name[] and email[], where email[] would hold the literal email value, whether it is SMTP or Exchange; users would then have to parse an Exchange email address themselves if they wanted to differentiate _o, _ou and _cn.

[~mcaruanagalizia] and [~lfcnassif], any recommendations?

> Extract recipient information in MSG files with more granularity
> ----------------------------------------------------------------
>
>                 Key: TIKA-1879
>                 URL: https://issues.apache.org/jira/browse/TIKA-1879
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Tim Allison
>            Priority: Minor
>
> As proposed in the parent task, it might be nice to have a parallel array for
> recipient name/recipient email for TO, CC and BCC.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
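
To make the "single pair of parallel arrays" option from the comment above concrete, here is a minimal sketch in Python. It is not Tika code: the `Recipients` class, the `parse_exchange_address` helper, and the sample recipients are all hypothetical names chosen for illustration, and the parser assumes that no component value itself contains a "/".

```python
from dataclasses import dataclass, field


@dataclass
class Recipients:
    """Hypothetical holder: names[i] and emails[i] describe the same recipient."""
    names: list = field(default_factory=list)
    emails: list = field(default_factory=list)

    def add(self, name, email):
        # The email is stored literally, whether it is SMTP or Exchange style.
        self.names.append(name)
        self.emails.append(email)


def parse_exchange_address(address):
    """Split an Exchange-style address into (key, value) pairs.

    Returns None when the value does not start with "/o=", e.g. for a
    plain SMTP address. A list of pairs is used instead of a dict because
    "cn" can occur more than once. Assumes values contain no "/".
    """
    if not address.lower().startswith("/o="):
        return None
    pairs = []
    for part in address.strip("/").split("/"):
        key, sep, value = part.partition("=")
        if sep:
            pairs.append((key.lower(), value))
    return pairs


# Hypothetical recipients; the Exchange address is the one quoted in the comment.
to = Recipients()
to.add("Some One",
       "/o=ExchangeLabs/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)"
       "/cn=Recipients/cn=polyspot1.onmicrosoft.com-50609-Some-One")
to.add("Jane Doe", "jane@example.com")

for name, email in zip(to.names, to.emails):
    print(name, parse_exchange_address(email) or email)
```

With this layout the two lists always have the same length and no empty cells; the cost, as noted in the comment, is that a caller who wants the o, ou and cn parts has to parse them out of the literal value.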