[ https://issues.apache.org/jira/browse/TIKA-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison updated TIKA-1330:
------------------------------
    Attachment: TIKA-1330v1-patch.zip

This is the first version of tika-batch.  Much cleanup remains.

This first patch is intended to start the framework and offer concrete classes 
for filesystem (FS) handling: a single input directory and a single output 
directory.

I've included the patch against trunk, example log4j XML files, an example 
batch-config file, and two sh scripts to kick off the two different processes.

Any and all feedback is welcome!

> Add robust tika-batch code
> --------------------------
>
>                 Key: TIKA-1330
>                 URL: https://issues.apache.org/jira/browse/TIKA-1330
>             Project: Tika
>          Issue Type: Sub-task
>          Components: cli, general, server
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>         Attachments: TIKA-1330v1-patch.zip
>
>
> In my current design plan, I see creating a separate component "tika-batch" 
> that includes a small bit of configurable code to run Tika against a large 
> batch of documents.  This code should be robust against OOM and hangs, and it 
> should have fairly robust logging.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
