[
https://issues.apache.org/jira/browse/TIKA-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753683#action_12753683
]
Yonik Seeley edited comment on TIKA-193 at 9/10/09 9:26 AM:
------------------------------------------------------------
Hmmm, I'm testing Solr Cell from the current solr-trunk (which has Tika 0.4),
and I'm seeing Content-Type added twice, for PDFs only.
<arr name="attr_Content-Type">
<str>application/pdf</str>
<str>application/pdf</str>
</arr>
EDIT: false alarm. There was an old Tika jar on the classpath.
> PDFParser adds mime-type twice
> ------------------------------
>
> Key: TIKA-193
> URL: https://issues.apache.org/jira/browse/TIKA-193
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 0.3
> Reporter: Jonathan Koren
> Assignee: Jukka Zitting
> Priority: Minor
> Fix For: 0.4
>
> Attachments: patch
>
>
> Using AutoDetectParser to call PDFParser causes the mime-type to be added
> twice. It should be added exactly once.
> Proposed Fix:
> parser/pdf/PDFParser.java should be changed from:
> metadata.add(Metadata.CONTENT_TYPE, "application/pdf");
> to:
> metadata.set(Metadata.CONTENT_TYPE, "application/pdf");
> as per other Tika bundled parsers.
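
The difference between the two calls in the proposed fix can be sketched without Tika on the classpath. The SimpleMetadata class below is a hypothetical stand-in written for illustration, not Tika's actual Metadata class; it only mimics the multi-valued add/set semantics the report describes. The assumption that an earlier detection step has already stored the content type before the parser runs is also mine, inferred from the report that the duplicate appears when going through AutoDetectParser.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetadataAddVsSet {

    // Hypothetical stand-in for a multi-valued metadata store (not Tika's class).
    static class SimpleMetadata {
        private final Map<String, List<String>> values = new HashMap<>();

        // Appends a value, keeping whatever is already stored under the name.
        void add(String name, String value) {
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        // Replaces every value stored under the name with a single value.
        void set(String name, String value) {
            List<String> single = new ArrayList<>();
            single.add(value);
            values.put(name, single);
        }

        // Returns all values for the name, or an empty array if none exist.
        String[] getValues(String name) {
            return values.getOrDefault(name, new ArrayList<>()).toArray(new String[0]);
        }
    }

    public static void main(String[] args) {
        String key = "Content-Type";

        // Assumed scenario: a detection step stores the type, then the parser calls add().
        SimpleMetadata withAdd = new SimpleMetadata();
        withAdd.set(key, "application/pdf");
        withAdd.add(key, "application/pdf");
        System.out.println(withAdd.getValues(key).length); // prints 2

        // Same scenario, but the parser calls set() as in the proposed fix.
        SimpleMetadata withSet = new SimpleMetadata();
        withSet.set(key, "application/pdf");
        withSet.set(key, "application/pdf");
        System.out.println(withSet.getValues(key).length); // prints 1
    }
}
```

With add() the second write leaves two identical entries, which matches the duplicated attr_Content-Type values shown in the comment above; with set() the write is idempotent and exactly one value remains.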