[
https://issues.apache.org/jira/browse/TIKA-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18037982#comment-18037982
]
Mikhail Khludnev edited comment on TIKA-4529 at 11/13/25 7:42 AM:
------------------------------------------------------------------
Let me briefly describe my experience.
h2. Context. Cloud Container Job
A cloud trigger launches a tika-app.jar container with S3 mounted, and it
converts file to file (the files are S3 objects underneath).
h2. Issues
I'd rate the CLI 3 out of 5.
* There are no arguments for a file-to-file mode. However, batch mode makes it
possible to handle many files per invocation, which is good.
* Batch mode makes the {{-v}} option unavailable.
* The {{-bc}} argument is advertised, but it doesn't work.
h2. Advantages
Besides all the usual Tika goodness:
* This scenario requires complex arguments. However, the image has {{python3}},
so I mount a script that reads the notification from the cloud, composes a
filesList text file and invokes {{java -jar tika-app.jar ...}} with the necessary args.
* Such a script can hardly be reused as is, but it could be provided as a sample.
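To give an idea of what such a script does, here is a minimal sketch, not the actual script. It assumes the notification is an S3-style event ({{Records[].s3.object.key}}, with URL-encoded keys); other clouds will use a different shape. The function names and the {{filesList.txt}} path are hypothetical, and the tika-app arguments are deliberately left to the caller rather than spelled out here.

```python
import json
import subprocess
from pathlib import Path
from urllib.parse import unquote_plus


def keys_from_notification(event: dict) -> list:
    """Collect object keys from an S3-style event (assumed notification shape)."""
    keys = []
    for record in event.get("Records", []):
        # Keys in S3 event notifications are URL-encoded, so decode them.
        keys.append(unquote_plus(record["s3"]["object"]["key"]))
    return keys


def write_file_list(keys, list_path) -> Path:
    """Write one path per line, relative to the mounted storage root."""
    path = Path(list_path)
    path.write_text("".join(f"{key}\n" for key in keys), encoding="utf-8")
    return path


def build_command(tika_args, jar="tika-app.jar") -> list:
    """Build the java command line; tika_args are supplied by the caller."""
    return ["java", "-jar", jar, *tika_args]


def convert(event_json: str, list_path, tika_args):
    """Parse the notification, write the file list and run tika-app."""
    keys = keys_from_notification(json.loads(event_json))
    write_file_list(keys, list_path)
    # check=True makes the container job fail if tika-app exits non-zero.
    return subprocess.run(build_command(tika_args), check=True)
```

The entry point of the container would then call something like {{convert(notification_text, "/tmp/filesList.txt", tika_args)}}, where {{tika_args}} holds whatever batch arguments the deployment needs.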
h2. Summary
Just providing {{org.apache.tika.cli.TikaCLI}} in the existing docker image would
let users cover this scenario by overriding {{ENTRYPOINT}}.
> tika-app docker image
> ---------------------
>
> Key: TIKA-4529
> URL: https://issues.apache.org/jira/browse/TIKA-4529
> Project: Tika
> Issue Type: Wish
> Components: docker
> Reporter: Mikhail Khludnev
> Priority: Minor
> Attachments: Dockerfile
>
>
> It would be nice to have a docker image with tika-app.jar. It's quite useful
> for serverless integration in clouds, e.g. it's possible to start a "serverless"
> container (think of AWS Batch), mount object storage and convert one (or
> many??) files.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)