What is the difference between these two examples?
In the first one, I construct a new TesseractOCRConfig, make changes to it, and
add it to the ParseContext.
In the second one, I get the TesseractOCRConfig from the TesseractOCRParser in
the TikaConfig and make changes to it. I don't add it to the ParseContext, since
that doesn't seem to be necessary.
Are these two approaches equivalent?
// Example 1: a new TesseractOCRConfig, registered in the ParseContext
public static String parse(String file) throws TikaException, SAXException,
        IOException {
    AutoDetectParser parser = new AutoDetectParser(new TikaConfig());
    ParseContext parseContext = new ParseContext();
    TesseractOCRConfig tessConfig = new TesseractOCRConfig();
    parseContext.set(AutoDetectParser.class, parser);
    parseContext.set(TesseractOCRConfig.class, tessConfig);
    tessConfig.setEnableImageProcessing(true);
    // ... parse the file with parser and parseContext and return the text (omitted)
}
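// Example 2: the TesseractOCRConfig taken from the parser held by the TikaConfig
public static String parse(String file) throws TikaException, SAXException,
        IOException {
    TikaConfig tikaConfig = new TikaConfig();
    AutoDetectParser parser = new AutoDetectParser(tikaConfig);
    // findParser is a helper not shown here; it locates the parser of the given class
    Parser tesseractOcrParser = findParser(tikaConfig.getParser(),
            org.apache.tika.parser.ocr.TesseractOCRParser.class);
    TesseractOCRConfig tessConfig =
            ((TesseractOCRParser) tesseractOcrParser).getDefaultConfig();
    //parseContext.set(AutoDetectParser.class, parser);
    //parseContext.set(TesseractOCRConfig.class, tessConfig);
    tessConfig.setEnableImageProcessing(true);
    // ... parse the file with parser and return the text (omitted)
}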
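
To make the question more concrete, here is a minimal sketch of the distinction as I understand it. It assumes that a parser prefers a config found in the per-call context and otherwise falls back to its own default config. The classes OcrConfig, Context, and OcrParser are hypothetical stand-ins written only for this illustration; they are not Tika's classes, and Tika's real lookup may differ (for example, it may merge the two configs instead of picking one).

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigLookupSketch {

    /** Hypothetical stand-in for an OCR config; not Tika's TesseractOCRConfig. */
    static class OcrConfig {
        boolean enableImageProcessing = false;
    }

    /** Hypothetical stand-in for a per-call context keyed by class. */
    static class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when no entry was stored for the key.
            return key.cast(entries.get(key));
        }
    }

    /** Hypothetical parser that owns one default config shared by all calls. */
    static class OcrParser {
        private final OcrConfig defaultConfig = new OcrConfig();

        OcrConfig getDefaultConfig() {
            return defaultConfig;
        }

        // Assumption: a config in the context wins; otherwise the default is used.
        boolean usesImageProcessing(Context context) {
            OcrConfig perCall = context.get(OcrConfig.class);
            OcrConfig effective = (perCall != null) ? perCall : defaultConfig;
            return effective.enableImageProcessing;
        }
    }

    public static void main(String[] args) {
        OcrParser parser = new OcrParser();

        // Approach 1: a per-call override passed through the context.
        Context withOverride = new Context();
        OcrConfig perCall = new OcrConfig();
        withOverride.set(OcrConfig.class, perCall);
        perCall.enableImageProcessing = true; // same object, so setting it after set() still applies
        System.out.println(parser.usesImageProcessing(withOverride));  // prints "true"
        System.out.println(parser.usesImageProcessing(new Context())); // prints "false"

        // Approach 2: mutate the parser's shared default config.
        parser.getDefaultConfig().enableImageProcessing = true;
        System.out.println(parser.usesImageProcessing(new Context())); // prints "true"
    }
}
```

Under that assumption, the first approach only affects calls that receive that particular context, while the second changes the behavior of every later call that uses the same parser instance. Whether Tika behaves the same way is exactly what I am asking.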