[ 
https://issues.apache.org/jira/browse/TIKA-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Burch resolved TIKA-695.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 1.1
    
> Custom properties on xlsx, docx, pptx
> -------------------------------------
>
>                 Key: TIKA-695
>                 URL: https://issues.apache.org/jira/browse/TIKA-695
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.10, 1.0
>         Environment: All OS
>            Reporter: Etienne Jouvin
>            Priority: Minor
>             Fix For: 1.1
>
>
> Parser on office Xfiles do not get custom properties.
> In class MetadataExtractor, method extract, only core and extended properties 
> are retrieve.
> I added something like this:
> extractMetadata(extractor.getCustomProperties(), metadata);
> {quote}
>       /**
>        * Add this method to read custom properties on document.
>        * 
>        * @param properties All custom properties.
>        * @param metadata Metadata to complete with read properties.
>        */
>       private void extractMetadata(CustomProperties properties, Metadata 
> metadata) {
>               
> org.openxmlformats.schemas.officeDocument.x2006.customProperties.CTProperties 
> propsHolder = properties.getUnderlyingProperties();
>               String value = null;
>               DateUtils dateUtils = DateUtils.getInstance();
>               BigDecimal bigDecimal;
>               for (CTProperty property : propsHolder.getPropertyList()) {
>                       /* Parse each property */
>                       if (property.isSetLpwstr()) {
>                               value = property.getLpwstr();
>                       } else if (property.isSetFiletime()) {
>                               value = 
> dateUtils.convertDate(property.getFiletime(), null);
>                       } else if (property.isSetDate()) {
>                               value = 
> dateUtils.convertDate(property.getDate(), null);
>                       } else if (property.isSetDecimal()) {
>                               bigDecimal = property.getDecimal();
>                               value = null == bigDecimal ? null : 
> bigDecimal.toString();
>                       } else if (property.isSetBool()) {
>                               value = 
> BooleanUtils.toStringTrueFalse(property.getBool());
>                       } else if (property.isSetInt()) {
>                               value = Integer.toString(property.getInt());
>                       } else if (property.isSetLpstr()) {
>                               value = property.getLpstr();
>                       } else if (property.isSetI4()) {
>                               /* Number in Excel for example.... Why i4 ? Ask 
> microsoft. */
>                               value = Integer.toString(property.getI4());
>                       } else {
>                               /* For other type, do nothing. */
>                               continue;
>                       }
>                       /* Add the custom prefix, as done in old office format. 
> */
>                       addProperty(metadata, "custom:" + property.getName(), 
> value);
>               }
>       }
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to