[ https://issues.apache.org/jira/browse/TIKA-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated TIKA-605:
-----------------------------------
    Attachment: TIKA-605.Mattmann.100914.2.patch.txt

- OK, here is a fully working, complete test. Unit tests pass, the
System.out.println calls are removed, and it now handles all metadata. I had
to change the invocation command because the ExternalParser cannot extract
both Metadata *and* XHTML output from the same stream. Instead, I carried
forward the ExternalParser's applyPatterns strategy and call it locally
(inheritance was blocked by private methods). I use ExternalParser only to
set up the command invocation, and I parse both the output and the metadata
from it myself. Give it a whirl!

> Tika GDAL parser
> ----------------
>
>                 Key: TIKA-605
>                 URL: https://issues.apache.org/jira/browse/TIKA-605
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>         Environment: indep. of env.
>            Reporter: Chris A. Mattmann
>            Assignee: Chris A. Mattmann
>              Labels: gdal, gsoc2013, integration, mentor, tika
>             Fix For: 1.7
>
>         Attachments: 0001-TIKA-605-Tika-GDAL-parser.patch,
> TIKA-605.Mattmann.092511.patch.txt, TIKA-605.Mattmann.100914.1.patch.txt,
> TIKA-605.Mattmann.100914.2.patch.txt
>
>
> Leverage the GDAL toolkit and its Java SWIG bindings to create a Tika parser
> around GDAL. See here:
> http://trac.osgeo.org/gdal/browser/trunk/gdal/swig/java/apps/gdalinfo.java

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
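
For readers following the comment above, here is a rough, self-contained sketch of the general idea: capture the external command's output once as a string, then derive both the metadata (by applying regex patterns line by line) and an XHTML-style body from that same string. This is not the code in the attached patch and does not use Tika's classes. The class name GdalOutputSketch, its method names, the two patterns, and the sample text are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the TIKA-605 patch and not Tika API.
class GdalOutputSketch {

    // Assumed patterns: each regex has one capture group holding the value,
    // and maps to the metadata key that value should be stored under.
    static Map<Pattern, String> defaultPatterns() {
        Map<Pattern, String> patterns = new LinkedHashMap<>();
        patterns.put(Pattern.compile("^Driver: (.*)$"), "Driver");
        patterns.put(Pattern.compile("^Size is (.*)$"), "Size");
        return patterns;
    }

    // Apply every pattern to every line of the already-captured output.
    static Map<String, String> extractMetadata(String output,
                                               Map<Pattern, String> patterns) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String line : output.split("\\r?\\n")) {
            for (Map.Entry<Pattern, String> entry : patterns.entrySet()) {
                Matcher matcher = entry.getKey().matcher(line);
                if (matcher.find()) {
                    metadata.put(entry.getValue(), matcher.group(1).trim());
                }
            }
        }
        return metadata;
    }

    // Build a minimal XHTML fragment from the same captured output.
    static String toXhtml(String output) {
        String escaped = output.replace("&", "&amp;")
                               .replace("<", "&lt;")
                               .replace(">", "&gt;");
        return "<pre>" + escaped + "</pre>";
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the external command's output.
        String sample = "Driver: GTiff/GeoTIFF\nSize is 512, 512\n";
        System.out.println(extractMetadata(sample, defaultPatterns()));
        System.out.println(toXhtml(sample));
    }
}
```

Because the output is buffered into a string before any parsing happens, the same text can be read twice, which sidesteps the single-stream limitation mentioned in the comment. The trade-off in this sketch is that the whole output is held in memory.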