[ https://issues.apache.org/jira/browse/TIKA-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997086#comment-13997086 ]
Tim Allison commented on TIKA-1295:
-----------------------------------

Good point. TextBag is not the right option, although it would be expedient. :)

I see that there is an ALT PropertyType. Are there any plans to implement that (or did I miss the implementation somewhere) or should I give it a try?

For now, I'm going to turn off my incorrect multi-valued extraction for title and description in PDFParser.

> Make some Dublin Core items multi-valued
> ----------------------------------------
>
>                 Key: TIKA-1295
>                 URL: https://issues.apache.org/jira/browse/TIKA-1295
>             Project: Tika
>          Issue Type: Bug
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 1.6
>
>
> According to: http://www.pdfa.org/2011/08/pdfa-metadata-xmp-rdf-dublin-core,
> dc:title, dc:description and dc:rights should allow multiple values because
> of language alternatives. Unless anyone objects in the next few days, I'll
> switch those to Property.toInternalTextBag() from Property.toInternalText().
> I'll also modify PDFParser to extract dc:rights.

--
This message was sent by Atlassian JIRA
(v6.2#6252)