[ https://issues.apache.org/jira/browse/TIKA-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15504643#comment-15504643 ]
Tim Allison commented on TIKA-2069:
-----------------------------------

I just realized that we might want to handle extraction of Actions and/or JavaScript from PDFs in a similar way. New, related ticket if anyone has an interest?

> Extract Macro text from Microsoft Office documents
> --------------------------------------------------
>
>                 Key: TIKA-2069
>                 URL: https://issues.apache.org/jira/browse/TIKA-2069
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector, parser
>    Affects Versions: 1.13
>         Environment: RHEL 5.x, Apache Tomcat
>            Reporter: Jeff Swindle
>              Labels: features
>         Attachments: excel-macro.PNG, test-macro-doc.docm,
> test-macro-doc.docm-tika-app-output.txt, word-macro.PNG, xlsmacro.xlsm,
> xlsmacro.xlsm.tika-app-output.txt
>
>
> Tika supports macro-enabled Microsoft Office documents by extracting metadata
> and contents; however, macros within the document are not included in the
> metadata or content output.
> The desired behavior is to have the macro text extracted as well.
> Info regarding macro extraction: http://www.decalage.info/vba_tools

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)