DefaultHandler is effectively a null handler: it ignores every SAX event it receives and stores nothing, so handler.toString() only prints the object's identity.

Try BodyContentHandler, ToXMLContentHandler, or WriteOutContentHandler instead.
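
To illustrate why the handler choice matters, here is a minimal sketch that uses only the JDK's SAX classes, not Tika. TextCollectingHandler, HandlerDemo, and feed are hypothetical names made up for this example; the Tika handlers named above do this kind of collecting for you, so in real code you would normally use one of them rather than writing your own.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class HandlerDemo {

    /** Hypothetical handler: keeps every chunk of character data it receives. */
    static class TextCollectingHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public String toString() {
            return text.toString();
        }
    }

    /** Sends the handler the kind of events a parser would emit, then returns toString(). */
    static String feed(DefaultHandler handler, String content) {
        char[] chars = content.toCharArray();
        try {
            handler.startDocument();
            handler.characters(chars, 0, chars.length);
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.toString();
    }

    public static void main(String[] args) {
        // DefaultHandler drops the text; toString() is Object's "ClassName@hash" form.
        System.out.println(feed(new DefaultHandler(), "text from a zip entry"));
        // The collecting handler returns the text it was given.
        System.out.println(feed(new TextCollectingHandler(), "text from a zip entry"));
    }
}
```

The first line printed has the same shape as the output reported in the question; the second is the text itself.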





If you want to write out each embedded file as a binary, try implementing
EmbeddedResourceHandler.
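
For comparison, the sketch below writes each file in a zip out as a binary using only the JDK's java.util.zip classes. It is not Tika code and does not use EmbeddedResourceHandler; ZipEntryWriter, writeEntries, and sampleZip are hypothetical names for this example. With Tika, the per-file copy would presumably live inside the handler callback instead of this loop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipEntryWriter {

    /** Copies each file entry of the zip stream into outputDir; returns the files written. */
    static List<Path> writeEntries(InputStream zipStream, Path outputDir) throws IOException {
        Path root = outputDir.toAbsolutePath().normalize();
        List<Path> written = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                // Reject entry names such as "../x" that would land outside outputDir.
                Path target = root.resolve(entry.getName()).normalize();
                if (!target.startsWith(root) || target.equals(root)) {
                    throw new IOException("Entry escapes output directory: " + entry.getName());
                }
                Files.createDirectories(target.getParent());
                // Reading from the ZipInputStream stops at the end of the current entry.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    /** Builds a small in-memory zip so the example needs no input file. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("first".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/b.bin"));
            zip.write(new byte[] {1, 2, 3});
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("zip-entries");
        for (Path file : writeEntries(new ByteArrayInputStream(sampleZip()), outputDir)) {
            System.out.println(file + " (" + Files.size(file) + " bytes)");
        }
    }
}
```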



QUOTE:

I am using Apache Tika 1.5 to parse the contents of a zip file.

Here is my sample code:

    Parser parser = new AutoDetectParser();
    ParseContext context = new ParseContext();
    context.set(Parser.class, parser);
    ContentHandler handler = new DefaultHandler();
    Metadata metadata = new Metadata();
    InputStream stream = null;
    try {
        stream = TikaInputStream.get(new File(zipFilePath));
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
    try {
        parser.parse(stream, handler, metadata, context);
        logger.info("Content:\t" + handler.toString());
    } catch (IOException e) {
        e.printStackTrace();
    } catch (SAXException e) {
        e.printStackTrace();
    } catch (TikaException e) {
        e.printStackTrace();
    } finally {
        try {
            stream.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

In the logger statement, all I see is org.xml.sax.helpers.DefaultHandler@5bd8e367.

I am missing something but cannot figure out what. Any help would be appreciated.




-----Original Message-----
From: yeshwanth kumar [mailto:yeshwant...@gmail.com]
Sent: Monday, June 30, 2014 1:28 PM
To: d...@tika.apache.org
Subject: Stack Overflow Question

Unable tp read zipfile using Apache Tika
http://stackoverflow.com/q/24495504/1899893?sem=2
