[ 
https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997086#comment-13997086
 ] 

Tim Allison commented on TIKA-1295:
-----------------------------------

Good point. TextBag is not the right option, although it would be expedient. :)
I see that there is an ALT PropertyType. Are there plans to implement it (or did
I miss an existing implementation), or should I give it a try?

For now, I'm going to turn off my incorrect multi-valued extraction for title 
and description in PDFParser.
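
To illustrate why a plain bag is a poor fit here: in XMP, a language
alternative is a set of text values, each qualified by xml:lang, with
"x-default" marking the fallback. A bag keeps the values but loses the
language key. The following is only a minimal sketch of that idea using the
Java standard library. LangAltSketch and its methods are hypothetical names
made up for this example; they are not part of Tika's Property or Metadata
API, and the fallback order is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not Tika's API.
final class LangAltSketch {
    static final String X_DEFAULT = "x-default";

    // One text value per xml:lang key; insertion order is preserved.
    private final Map<String, String> alternatives = new LinkedHashMap<>();

    // Adds or replaces the value for a language tag; returns this for chaining.
    LangAltSketch put(String lang, String text) {
        alternatives.put(lang.toLowerCase(Locale.ROOT), text);
        return this;
    }

    // Assumed lookup order: exact language match, then x-default,
    // then the first value that was added, else null.
    String resolve(String lang) {
        String hit = alternatives.get(lang.toLowerCase(Locale.ROOT));
        if (hit != null) {
            return hit;
        }
        String fallback = alternatives.get(X_DEFAULT);
        if (fallback != null) {
            return fallback;
        }
        return alternatives.isEmpty() ? null : alternatives.values().iterator().next();
    }

    int size() {
        return alternatives.size();
    }
}
```

With a title stored as put("x-default", "Annual Report") and
put("fr", "Rapport annuel"), resolve("fr") would return the French value and
resolve("de") would fall back to the x-default one. A multi-valued text bag
would hold both strings but could not tell the caller which one is which.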

> Make some Dublin Core items multi-valued
> ----------------------------------------
>
>                 Key: TIKA-1295
>                 URL: https://issues.apache.org/jira/browse/TIKA-1295
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 1.6
>
>
> According to: http://www.pdfa.org/2011/08/pdfa-metadata-xmp-rdf-dublin-core, 
> dc:title, dc:description and dc:rights should allow multiple values because 
> of language alternatives.  Unless anyone objects in the next few days, I'll 
> switch those to Property.toInternalTextBag() from Property.toInternalText().  
> I'll also modify PDFParser to extract dc:rights.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
