[ https://issues.apache.org/jira/browse/TIKA-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280433#comment-14280433 ]
Tim Allison commented on TIKA-1511:
-----------------------------------

Hmmmm... This will fail if someone sends in a custom EmbeddedDocumentExtractor because there is no way to pass the StatementTablePair to that interface via ParseContext.

Some options:

1) We could go back to treating the db as one big doc, as we do with xls, but I think I'd prefer to treat each table as a separate doc.
2) We could get rid of the StatementTablePair hack, extract the text from each table into a String and then pass that into EmbeddedDocumentExtractor as the InputStream. The drawback to this is that we'd ignore the handler and lose potential <tr> <td> markup....

Any ideas on this?

> Create a parser for SQLite3
> ---------------------------
>
>                 Key: TIKA-1511
>                 URL: https://issues.apache.org/jira/browse/TIKA-1511
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.6
>            Reporter: Luis Filipe Nassif
>             Fix For: 1.8
>
>         Attachments: TIKA-1511v1.patch, testSQLLite3b.db
>
>
> I think it would be very useful, as sqlite is used as data storage by a wide
> range of applications. Opening the ticket to track it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
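
For illustration only, option 2 in the comment above might look roughly like the sketch below. It is not taken from the attached patch: the class name `TableTextSketch` and the helpers `tableToText` and `asStream` are hypothetical, and the table rows are supplied as plain Java lists rather than read through JDBC. It only shows the step of flattening one table to a String and wrapping it as a stream, which is also where any <tr> <td> structure would be lost.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of option 2; none of these names come from the patch.
class TableTextSketch {

    // Flatten one table into tab-separated lines, one line per row.
    // Row and cell boundaries survive only as '\n' and '\t'.
    static String tableToText(List<List<String>> rows) {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(String.join("\t", row)).append('\n');
        }
        return sb.toString();
    }

    // Wrap the extracted text as a stream, encoded as UTF-8.
    static ByteArrayInputStream asStream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Stand-in for the rows of a single SQLite table.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("1", "example"));

        ByteArrayInputStream stream = asStream(tableToText(rows));

        // In Tika, this stream would be what gets handed to the
        // EmbeddedDocumentExtractor; that call is omitted here so the
        // sketch runs with the JDK alone.
        System.out.println(stream.available() + " bytes ready for the extractor");
    }
}
```

Each table would get its own stream this way, so a custom EmbeddedDocumentExtractor would see one embedded document per table without needing anything extra from ParseContext.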