[ https://issues.apache.org/jira/browse/TIKA-179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680430#action_12680430 ]
Jukka Zitting commented on TIKA-179:
------------------------------------
Well, it should also work when the input is piped. I was able to reproduce
this: the --text option works when the input is given as a file argument, but
not when it's piped.
java -cp target/tika-0.3-standalone.jar org.apache.tika.cli.TikaCLI --text PDF.pdf
> Tika stand alone CLI --text output mostly not working, other output formats
> are fine
> ------------------------------------------------------------------------------------
>
> Key: TIKA-179
> URL: https://issues.apache.org/jira/browse/TIKA-179
> Project: Tika
> Issue Type: Bug
> Components: cli
> Affects Versions: 0.2, 0.3
> Environment: Java 1.5 (also tried Java 1.6). OS used: Mac OS X,
> Linux (CentOS)
> Reporter: Paul Borgermans
> Assignee: Jukka Zitting
> Fix For: 0.3
>
>
> When using the Tika standalone jar after mvn install in CLI mode, for most of
> my test documents (pdf, doc, ppt, odt), the plain text output option (-t or
> --text) does not produce any output. When using the other options (xml, html,
> metadata), the output is correct. Activating debug mode (-v) does not produce
> additional info either.
> When using the GUI, dragging and dropping does produce the expected results,
> also in the plain text tab/window.
> I rebuilt Tika many times in the past 2 months (clearing the .m2 directory
> every time) from svn (latest revision tried: 724002); the CLI --text result is
> always the same: usually missing output.
> For now, I use the -x output option chained to html2txt as a workaround, but
> would prefer to use just tika to convert to plain text (which is used for
> further indexing in Solr).
> Thanks
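
The workaround described in the report (XML output chained to a separate
HTML-to-text step) could be approximated in Java along the following lines.
This is only a minimal sketch, not part of Tika and not a reimplementation of
html2txt: the class name XhtmlToText and its extractText method are
hypothetical, and it assumes the input is well-formed XHTML without
DTD-defined entities. It uses only the SAX parser that ships with the JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not a Tika class): keeps character data, drops markup.
public class XhtmlToText {

    /** Parses well-formed XHTML and returns its text content, one line per closed element. */
    static String extractText(InputSource source) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Collect the text found between tags.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Separate the text of consecutive elements with a line break.
                text.append('\n');
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(source, handler);
        return text.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Small inline sample so the sketch runs on its own.
        String sample = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<body><p>Hello</p><p>world</p></body></html>";
        System.out.println(extractText(new InputSource(new StringReader(sample))));
    }
}
```

To use it in a pipeline like the one in the report, main could pass
new InputSource(System.in) to extractText instead of the inline sample.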