[ 
https://issues.apache.org/jira/browse/TIKA-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180343#comment-17180343
 ] 

Akash edited comment on TIKA-3172 at 8/19/20, 7:46 AM:
-------------------------------------------------------

[~tallison] 

If we use the Tika config file mentioned above to extract a plain text file, the
parser list returned by Tika changes.

Without the config file, it returns org.apache.tika.parser.DefaultParser and
org.apache.tika.parser.csv.TextAndCSVParser.
With the config file, it returns org.apache.tika.parser.CompositeParser,
org.apache.tika.parser.DefaultParser, and
org.apache.tika.parser.csv.TextAndCSVParser.

Is this expected behaviour, or is there something wrong in the Tika config file
I shared above?


was (Author: akki1607):
[~tallison] 

If we use the Tika config file mentioned above to extract a plain text file, the
parser list returned by Tika changes.

Without the config file, it returns org.apache.tika.parser.DefaultParser and
org.apache.tika.parser.csv.TextAndCSVParser.
With the config file, it returns org.apache.tika.parser.CompositeParser,
org.apache.tika.parser.DefaultParser, and
org.apache.tika.parser.csv.TextAndCSVParser.

Is this expected behaviour, or is there something wrong in the Tika config file?

> PDF Parser configuration enable auto space using tika config file
> -----------------------------------------------------------------
>
>                 Key: TIKA-3172
>                 URL: https://issues.apache.org/jira/browse/TIKA-3172
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 1.24.1
>            Reporter: Akash
>            Priority: Major
>
> I need information on how to set enableAutoSpace using the Tika config file.
> {code:java}
> <properties>
>   <parsers>
>     <parser class="org.apache.tika.parser.DefaultParser">
>       <parser-exclude class="org.apache.tika.parser.pdf.PDFParser"/>
>     </parser>
>     <parser class="org.apache.tika.parser.pdf.PDFParser">
>       <params>
>         <param name="enableAutoSpace" type="bool">false</param>
>       </params>
>     </parser>
>   </parsers>
> </properties>
> {code}
> The above configuration is not working.
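
One way to rule out a malformed file before looking at Tika itself is to check that the configuration quoted above is well-formed XML and that the parameter sits under the parser it is meant for. The sketch below does that with the JDK's built-in XML parser only. It does not call Tika, and the class name TikaConfigCheck and the method listParsers are hypothetical names chosen for illustration. Any text before the root element makes the parse fail.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Tika: it only inspects the XML structure.
class TikaConfigCheck {

    // Returns one line per <parser> that is a direct child of <parsers>,
    // followed by that parser's <param> entries as name=value.
    // Throws IllegalArgumentException if the XML is not well-formed.
    static List<String> listParsers(String xml) {
        Document doc;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException(
                "Config is not well-formed XML: " + e.getMessage(), e);
        }

        List<String> out = new ArrayList<>();
        NodeList groups = doc.getElementsByTagName("parsers");
        for (int g = 0; g < groups.getLength(); g++) {
            NodeList children = groups.item(g).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and anything that is not <parser>.
                if (child.getNodeType() != Node.ELEMENT_NODE
                        || !"parser".equals(child.getNodeName())) {
                    continue;
                }
                Element parser = (Element) child;
                StringBuilder line = new StringBuilder(parser.getAttribute("class"));
                NodeList params = parser.getElementsByTagName("param");
                for (int p = 0; p < params.getLength(); p++) {
                    Element param = (Element) params.item(p);
                    line.append(' ')
                        .append(param.getAttribute("name"))
                        .append('=')
                        .append(param.getTextContent().trim());
                }
                out.add(line.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The configuration from the issue description, without surrounding markup.
        String config =
              "<properties>"
            + "<parsers>"
            + "<parser class=\"org.apache.tika.parser.DefaultParser\">"
            + "<parser-exclude class=\"org.apache.tika.parser.pdf.PDFParser\"/>"
            + "</parser>"
            + "<parser class=\"org.apache.tika.parser.pdf.PDFParser\">"
            + "<params>"
            + "<param name=\"enableAutoSpace\" type=\"bool\">false</param>"
            + "</params>"
            + "</parser>"
            + "</parsers>"
            + "</properties>";
        for (String line : listParsers(config)) {
            System.out.println(line);
        }
    }
}
```

If the file passes this check and the parameter is listed under org.apache.tika.parser.pdf.PDFParser, the structure of the file is probably not the cause, and the question becomes how Tika loads and applies it.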



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
