[ https://issues.apache.org/jira/browse/TIKA-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187205#comment-15187205 ]
Nick Burch commented on TIKA-1508:
----------------------------------

> I think that's exactly what ParseContext should be for..it should be a
> vehicle for Param passing. We can delineate by property name (FQ) and/or by
> class.

I view {{ParseContext}} as somewhere you configure things on a per-document basis, not a per-parser basis.

So, need to set where Tesseract lives on your system? That applies to everything, so it goes on the parser. Need to tell Tesseract to use a German rather than an English dictionary on this particular JPEG? That applies to just the one document being parsed, so it goes on the {{ParseContext}}.

> Add uniformity to parser parameter configuration
> ------------------------------------------------
>
>                 Key: TIKA-1508
>                 URL: https://issues.apache.org/jira/browse/TIKA-1508
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>             Fix For: 1.13
>
>
> We can currently configure parsers by the following means:
> 1) programmatically by direct calls to the parsers or their config objects
> 2) sending in a config object through the ParseContext
> 3) modifying .properties files for specific parsers (e.g. PDFParser)
> Rather than scattering the landscape with .properties files for each parser,
> it would be great if we could specify parser parameters in the main config
> file, something along the lines of this:
> {noformat}
> <parser class="org.apache.tika.parser.audio.AudioParser">
>   <params>
>     <int name="someparam1">2</int>
>     <str name="someOtherParam2">something or other</str>
>   </params>
>   <mime>audio/basic</mime>
>   <mime>audio/x-aiff</mime>
>   <mime>audio/x-wav</mime>
> </parser>
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
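
For illustration only, the per-parser versus per-document split described in the comment could be modelled roughly as below. This is a minimal sketch that does not use Tika's API: {{DocContext}}, {{OcrOptions}}, {{OcrParser}} and {{describeParse}} are hypothetical stand-ins invented for this example, and the install path and file names are made up. The idea it tries to show is that the Tesseract location is fixed once on the parser, while the language is looked up per document from a context keyed by class, falling back to the parser's default.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigScopes {

    /** Hypothetical per-document context keyed by class (not Tika's ParseContext). */
    static final class DocContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key, T defaultValue) {
            Object value = entries.get(key);
            return value == null ? defaultValue : key.cast(value);
        }
    }

    /** Hypothetical per-document OCR settings. */
    static final class OcrOptions {
        final String language;

        OcrOptions(String language) {
            this.language = language;
        }
    }

    /** Hypothetical parser: the install path is per-parser, the language is per-document. */
    static final class OcrParser {
        private final String tesseractPath;                      // same for every document
        private final OcrOptions defaults = new OcrOptions("eng");

        OcrParser(String tesseractPath) {
            this.tesseractPath = tesseractPath;
        }

        /** Returns a description of the settings that would be used for this document. */
        String describeParse(String document, DocContext context) {
            // A per-document override wins; otherwise use the parser's default.
            OcrOptions options = context.get(OcrOptions.class, defaults);
            return "tesseract=" + tesseractPath
                    + " lang=" + options.language
                    + " doc=" + document;
        }
    }

    public static void main(String[] args) {
        // Per-parser setting: configured once.
        OcrParser parser = new OcrParser("/usr/bin/tesseract");

        // Per-document setting: only this scan uses the German dictionary.
        DocContext german = new DocContext();
        german.set(OcrOptions.class, new OcrOptions("deu"));

        System.out.println(parser.describeParse("scan-de.jpg", german));
        System.out.println(parser.describeParse("scan-en.jpg", new DocContext()));
    }
}
```

Under this model, values from a central config file such as the {{<params>}} block quoted above would most naturally feed the parser-level defaults, with the context left for per-document overrides; that mapping is an assumption of this sketch rather than something the issue specifies.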