[ https://issues.apache.org/jira/browse/TIKA-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-4198.
-------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

> Skip blob fields in geopkg files
> --------------------------------
>
>                 Key: TIKA-4198
>                 URL: https://issues.apache.org/jira/browse/TIKA-4198
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> Some geopkg tables store "geom" information in blob fields, starting with
> magic: 47 50 00...
> By default Tika handles blobs as embedded files. This can cause serious
> resource waste on geopkg files that contain hundreds of thousands of rows
> with a geom field.
> We should create a new parser for geopkg that subclasses the sqlite parser
> and skips blobs from the geom fields by default.
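
For illustration only, here is a minimal sketch of how a geom blob could be recognized and skipped before it is treated as an embedded file. It is not the code committed for TIKA-4198 and it uses no Tika APIs. The class name GeomBlobSketch, both method names, and the skipGeomBlobs flag are hypothetical. It also assumes that checking the first two magic bytes quoted in the issue (47 50) is enough, whereas the issue proposes skipping by geom field.

```java
// Hypothetical helper for illustration; not Tika's actual implementation.
final class GeomBlobSketch {

    // First two bytes of the magic quoted in the issue: 0x47 0x50 (ASCII "GP").
    // Assumption: matching this prefix is sufficient to identify a geom blob.
    private static final byte[] MAGIC = {0x47, 0x50};

    private GeomBlobSketch() {
    }

    /** Returns true if the blob is long enough and starts with the magic bytes. */
    static boolean looksLikeGeom(byte[] blob) {
        if (blob == null || blob.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (blob[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Decides whether a blob should be handed on as an embedded file.
     * With skipGeomBlobs set (hypothetical flag), geom blobs are not processed.
     */
    static boolean shouldProcessAsEmbedded(byte[] blob, boolean skipGeomBlobs) {
        return !(skipGeomBlobs && looksLikeGeom(blob));
    }
}
```

With skipGeomBlobs set to true, a blob beginning 47 50 would be skipped, while other blobs (for example an embedded image) would still be processed as embedded files. A parser that already knows which columns hold geometry could skip by column name instead and avoid reading the bytes at all.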