[ 
https://issues.apache.org/jira/browse/TIKA-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558384#comment-16558384
 ] 

Tim Allison edited comment on TIKA-2694 at 7/26/18 3:02 PM:
------------------------------------------------------------

This is one of the ways that "addresses" can be stored in Outlook: X.500 vs. 
SMTP. I've seen SMTP email addresses in .msg files, but these Exchange (X.500) 
addresses are quite common, and very annoying if you're expecting actual email 
addresses.  If you find that the SMTP email address is stored somewhere in 
the MAPIMessage object for this file, let us know.
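
As a rough illustration of the difference between the two forms, a check along 
these lines could tell them apart. This is only a sketch: `SenderAddressKind` 
and its methods are hypothetical names, not part of Tika or POI, and the prefix 
test is a heuristic based on the value reported in this issue rather than a 
full parser for either address format.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Tika or POI.
final class SenderAddressKind {
    private SenderAddressKind() {}

    /**
     * Heuristic: treats values such as "/O=ORG/OU=.../CN=NAME" as
     * Exchange (X.500-style) addresses.
     */
    static boolean looksLikeX500(String value) {
        if (value == null) {
            return false;
        }
        String v = value.trim().toUpperCase(Locale.ROOT);
        return v.startsWith("/O=") && v.contains("/CN=");
    }

    /**
     * Heuristic: exactly one '@' with text on both sides and no spaces.
     * This is not a full RFC 5322 validation.
     */
    static boolean looksLikeSmtp(String value) {
        if (value == null) {
            return false;
        }
        String v = value.trim();
        int at = v.indexOf('@');
        return at > 0
                && at == v.lastIndexOf('@')
                && at < v.length() - 1
                && v.indexOf(' ') < 0;
    }
}
```

A caller could use `looksLikeX500` on the extracted sender value to decide 
whether it still needs to look elsewhere in the message for an SMTP address.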

See: http://askme4tech.com/exchange-server-x500-address-amazing-thing-know


was (Author: talli...@mitre.org):
I'm pretty sure this is the way that "addresses" can be stored in Outlook.  
I've seen actual email addresses in .msg, but these Outlook exchange addresses 
are quite common, and very annoying if you're expecting actual email addresses. 
 If you can find that the actual email address is stored somewhere in the 
MAPIMessage object for this file, let us know.

> "From" headers is not always extracted correctly on msg mails
> -------------------------------------------------------------
>
>                 Key: TIKA-2694
>                 URL: https://issues.apache.org/jira/browse/TIKA-2694
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.17
>         Environment: CentOS 7
> Windows 10
>            Reporter: Celpan Valeria
>            Priority: Major
>         Attachments: Fw Anime User Analysis.msg
>
>
> For some emails, instead of the email address, the "From" field contains a 
> value that looks like 
> `/O=SONY/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=EBERGER`.
> The issue seems to be connected to the library 
> `org.apache.poi:poi-scratchpad:3.17`: when running 
> `org.apache.tika.parser.microsoft.OutlookExtractor::OutlookExtractor(DirectoryNode, ParserContext)` 
> we get 
> `this.msg.mainChunks.allChunks.SenderEmailAddress = "/O=SONY/OU=EXCHANGE ADMINISTRATIVE GROUP (FYDIBOHF23SPDLT)/CN=RECIPIENTS/CN=EBERGER"`.
> See the attachment to reproduce this defect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
