[ https://issues.apache.org/jira/browse/TIKA-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison updated TIKA-1330:
------------------------------
    Attachment: TIKA-1330v1-patch.zip

This is the first version of tika-batch. Much cleanup remains. This first patch is intended to start the framework and offer concrete classes for filesystem (FS) handling: a single input directory and a single output directory. I've included the patch against trunk, example log4j XML files, an example batch-config file, and two sh scripts to kick off the two different processes.

Any and all feedback is welcome!

> Add robust tika-batch code
> --------------------------
>
>                 Key: TIKA-1330
>                 URL: https://issues.apache.org/jira/browse/TIKA-1330
>             Project: Tika
>          Issue Type: Sub-task
>          Components: cli, general, server
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>         Attachments: TIKA-1330v1-patch.zip
>
>
> In my current design plan, I see creating a separate component "tika-batch"
> that includes a small bit of configurable code to run Tika against a large
> batch of documents. This code should be robust against OOM and hangs, and it
> should have fairly robust logging.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)