Overriding settings in TikaConfig

Peter Kronenberg Wed, 10 Feb 2021 10:22:21 -0800

What is the difference between these two examples?

In the first one, I construct a new TesseractOCRConfig, make changes, and add 
it to the parseContext.
In the second one, I get the TesseractOCRConfig from the TesseractOCRParser in 
the TikaConfig and make changes. I don't add it to the parseContext, since that 
doesn't seem to be necessary.

Are these two approaches equivalent?


public static String parse(String file) throws TikaException, SAXException, IOException {

    AutoDetectParser parser = new AutoDetectParser(new TikaConfig());

    ParseContext parseContext = new ParseContext();

    // Build a fresh OCR config and register it on the ParseContext.
    TesseractOCRConfig tessConfig = new TesseractOCRConfig();
    parseContext.set(AutoDetectParser.class, parser);
    parseContext.set(TesseractOCRConfig.class, tessConfig);

    tessConfig.setEnableImageProcessing(true);

    // (call to parser.parse(...) and the return statement omitted)
}


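public static String parse(String file) throws TikaException, SAXException, IOException {

    // Keep a reference to the TikaConfig so its parsers can be looked up below.
    TikaConfig tikaConfig = new TikaConfig();
    AutoDetectParser parser = new AutoDetectParser(tikaConfig);

    // findParser(...) is a helper that is not shown in this message.
    Parser tesseractOcrParser = findParser(tikaConfig.getParser(),
            org.apache.tika.parser.ocr.TesseractOCRParser.class);

    // Change the parser's own default config instead of using the ParseContext.
    TesseractOCRConfig tessConfig =
            ((TesseractOCRParser) tesseractOcrParser).getDefaultConfig();

    //parseContext.set(AutoDetectParser.class, parser);
    //parseContext.set(TesseractOCRConfig.class, tessConfig);

    tessConfig.setEnableImageProcessing(true);

    // (call to parser.parse(...) and the return statement omitted)
}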
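
To make the question concrete, here is a plain-Java sketch of what the two examples might be doing differently. It does not use Tika at all: OcrConfig, Context, and OcrParser are hypothetical stand-ins for TesseractOCRConfig, ParseContext, and TesseractOCRParser. It also assumes that the parser looks for a config in the context on each parse call and falls back to its own default config when none is set. Under that assumption, the first example changes the setting only for calls that use that particular context, while the second changes it for every later call on that parser instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of the two approaches; none of these classes are Tika's. */
public class ConfigLookupSketch {

    /** Hypothetical stand-in for TesseractOCRConfig. */
    static class OcrConfig {
        boolean enableImageProcessing = false;
    }

    /** Hypothetical stand-in for ParseContext: a map keyed by class. */
    static class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key, T fallback) {
            Object value = entries.get(key);
            return value == null ? fallback : key.cast(value);
        }
    }

    /** Hypothetical stand-in for TesseractOCRParser. */
    static class OcrParser {
        private final OcrConfig defaultConfig = new OcrConfig();

        OcrConfig getDefaultConfig() {
            return defaultConfig;
        }

        /** Returns the setting this parse call would use (assumed lookup order). */
        boolean parse(Context context) {
            OcrConfig effective = context.get(OcrConfig.class, defaultConfig);
            return effective.enableImageProcessing;
        }
    }

    public static void main(String[] args) {
        // First example: a new config is registered on the context, then changed.
        OcrParser first = new OcrParser();
        Context context = new Context();
        OcrConfig perCall = new OcrConfig();
        context.set(OcrConfig.class, perCall);
        perCall.enableImageProcessing = true; // same object, so the context sees it
        System.out.println(first.parse(context));       // true
        System.out.println(first.parse(new Context())); // false: default untouched

        // Second example: the parser's own default config is changed in place.
        OcrParser second = new OcrParser();
        second.getDefaultConfig().enableImageProcessing = true;
        System.out.println(second.parse(new Context())); // true, no context entry needed
    }
}
```

If the real parser behaves like this model, the two examples give the same result for a single parse call but differ in scope: the context entry travels with the call, and the changed default stays with the parser.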
