[ 
https://issues.apache.org/jira/browse/TIKA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771026#comment-15771026
 ] 

David Pilato commented on TIKA-2227:
------------------------------------

Sorry, the answer is {{TikaCoreProperties.KEYWORDS}}.

Don't know how I missed it... :)
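
For anyone landing here later, a minimal sketch of the lookup, assuming the Tika 1.14 API referenced in this issue is on the classpath. The class name {{KeywordsExample}} and the method {{readKeywords}} are made up for illustration and are not part of Tika:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;

// Hypothetical helper class for illustration; not part of Tika.
public class KeywordsExample {

    // Parses the file and returns the first keywords value,
    // or null if the parser extracted none.
    static String readKeywords(Path file) throws IOException, TikaException {
        Metadata metadata = new Metadata();
        try (InputStream in = Files.newInputStream(file)) {
            // The extracted text is discarded; only the metadata is of interest.
            new Tika().parseToString(in, metadata);
        }
        return metadata.get(TikaCoreProperties.KEYWORDS);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the first argument is the path to a document, e.g. an RTF or ODT file.
        System.out.println(readKeywords(Paths.get(args[0])));
    }
}
```

If a document carries several keyword entries, {{metadata.getValues(TikaCoreProperties.KEYWORDS)}} returns all of them rather than only the first.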

> Replacement of MSOffice#KEYWORDS for RTF and ODT docs
> -----------------------------------------------------
>
>                 Key: TIKA-2227
>                 URL: https://issues.apache.org/jira/browse/TIKA-2227
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: David Pilato
>            Priority: Minor
>
> I'm trying to extract metadata from different types of documents.
> For that I'm using {{metadata.get(MSOffice.KEYWORDS)}}, but it's marked as 
> {{Deprecated}} by the {{Office}} class.
> So I changed my code to use {{metadata.get(Office.KEYWORDS)}} instead.
> It does not work for 2 types of docs: 
> * RTF: 
> https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.rtf
> * ODT: 
> https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.odt
> It seems that RTF and ODT keywords are extracted under the {{"Keyword"}} metadata 
> name, although they should probably be stored under {{"meta:keyword"}}.
> If needed, you can reuse the documents I linked to here as test cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
