[
https://issues.apache.org/jira/browse/TIKA-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ghenadie updated TIKA-2723:
---------------------------
Description:
Hello,
I have a file with an .mht extension. Tika processes this file as an email (Is
Email? - true) and uses RFC822Parser to parse it.
This is a problem for me, and it appears to be a bug in Tika. Since an .mht
file is a web archive container, it should not be parsed by RFC822Parser (which
is an email parser).
Please investigate this issue as soon as possible.
Please let me know if you have any questions.
Thank you,
Ghenadie R.
was:
Hello,
I have a file with an .mht extension. Tika processes this file as an email (Is
Email? - true) and uses RFC822Parser to parse it. As a result, the extracted
content contains email fields such as From, To, CC, BCC, and Subject.
This is a problem for me, and it appears to be a bug in Tika. Since an .mht
file is a web archive container, it should not be parsed by RFC822Parser (which
is an email parser).
Please investigate this issue as soon as possible.
Please let me know if you have any questions.
Thank you,
Ghenadie R.
> Issue with parsing .mht container
> ---------------------------------
>
> Key: TIKA-2723
> URL: https://issues.apache.org/jira/browse/TIKA-2723
> Project: Tika
> Issue Type: Bug
> Components: mime
> Affects Versions: 1.17
> Reporter: Ghenadie
> Priority: Major
> Labels: patch
> Fix For: 1.17
>
> Attachments: Sample-excel.mht, [TIKA-2723] Issue with parsing _mht
> container - ASF JIRA.mht
>
>
> Hello,
> I have a file with an .mht extension. Tika processes this file as an email
> (Is Email? - true) and uses RFC822Parser to parse it.
> This is a problem for me, and it appears to be a bug in Tika. Since an .mht
> file is a web archive container, it should not be parsed by RFC822Parser
> (which is an email parser).
> Please investigate this issue as soon as possible.
> Please let me know if you have any questions.
>
> Thank you,
> Ghenadie R.
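
For context on the behavior described above: an .mht (MHTML) file is itself a
MIME message, so its opening headers resemble those of an RFC 822 email, which
is presumably why it gets detected as one. A minimal sketch of one way to tell
the two apart before choosing a parser is to check whether the top-level
Content-Type is multipart/related. The class and method names (MhtSniffer,
looksLikeMhtml) are hypothetical, this is not Tika's detection logic, and the
check is only a heuristic, since ordinary emails with inline images can also
use multipart/related.

```java
import java.util.Locale;

// Hypothetical helper, not part of Tika: sniffs the top-level MIME headers.
final class MhtSniffer {

    private static final String HEADER = "content-type:";

    /** Returns true when the top-level Content-Type is multipart/related. */
    static boolean looksLikeMhtml(String rawMessage) {
        StringBuilder contentType = new StringBuilder();
        boolean inContentType = false;

        for (String line : rawMessage.split("\r?\n", -1)) {
            if (line.isEmpty()) {
                break; // a blank line ends the header block
            }
            char first = line.charAt(0);
            if (first == ' ' || first == '\t') {
                // Folded header line: belongs to the previous header.
                if (inContentType) {
                    contentType.append(' ').append(line.trim());
                }
                continue;
            }
            inContentType = line.toLowerCase(Locale.ROOT).startsWith(HEADER);
            if (inContentType) {
                contentType.setLength(0); // keep only the latest occurrence
                contentType.append(line.substring(HEADER.length()).trim());
            }
        }
        return contentType.toString()
                .toLowerCase(Locale.ROOT)
                .startsWith("multipart/related");
    }

    public static void main(String[] args) {
        String mht = "From: <Saved by a browser>\r\n"
                + "MIME-Version: 1.0\r\n"
                + "Content-Type: multipart/related;\r\n"
                + "\ttype=\"text/html\";\r\n"
                + "\tboundary=\"----example\"\r\n"
                + "\r\n"
                + "body";
        System.out.println(looksLikeMhtml(mht));
    }
}
```

A caller could use such a check to route the file to an HTML-oriented handler
instead of treating its From/To/Subject headers as email metadata.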