Hi,

I'm trying to use the version of Tika that ships with Solr 5.2, but the
following code is not working:

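import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.sax.BodyContentHandler;

public static void parseWithTika() throws Exception
{
    File file = new File("C:\\temp\\test.pdf");

    // try-with-resources closes the stream even if parse() throws
    try (InputStream in = new FileInputStream(file))
    {
        AutoDetectParser parser = new AutoDetectParser();
        Metadata metadata = new Metadata();
        // Pass the file name as a hint for type detection
        metadata.add(Metadata.RESOURCE_NAME_KEY, file.getName());
        BodyContentHandler contentHandler = new BodyContentHandler();

        parser.parse(in, contentHandler, metadata);

        // <=== 'content' is always empty
        String content = contentHandler.toString();
    }
}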

'content' is always an empty string unless the file I pass to Tika is a
plain text file.  Any idea what the issue is?

I have also tried the sample code from
https://tika.apache.org/1.8/examples.html with the same result.


Thanks !!

Steve
