Author: jukka
Date: Thu Sep 4 10:40:36 2008
New Revision: 692170

URL: http://svn.apache.org/viewvc?rev=692170&view=rev
Log:
TIKA-149: Parser for zip files

Include some newlines to make the plain text output a bit more readable
(and to avoid words running into each other and breaking full text indexing)

Modified:
    incubator/tika/trunk/src/main/java/org/apache/tika/parser/zip/ZipParser.java

Modified: incubator/tika/trunk/src/main/java/org/apache/tika/parser/zip/ZipParser.java
URL: http://svn.apache.org/viewvc/incubator/tika/trunk/src/main/java/org/apache/tika/parser/zip/ZipParser.java?rev=692170&r1=692169&r2=692170&view=diff
==============================================================================
--- incubator/tika/trunk/src/main/java/org/apache/tika/parser/zip/ZipParser.java (original)
+++ incubator/tika/trunk/src/main/java/org/apache/tika/parser/zip/ZipParser.java Thu Sep 4 10:40:36 2008
@@ -83,6 +83,7 @@
             throws IOException, SAXException {
         xhtml.startElement("div", "class", "file");
         xhtml.element("h1", entry.getName());
+        xhtml.characters("\n");
 
         try {
             Metadata metadata = new Metadata();
@@ -95,6 +96,7 @@
             // Could not parse the entry, just skip the content
         }
 
+        xhtml.characters("\n");
         xhtml.endElement("div");
     }
 
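
The rationale in the log message can be illustrated with a small standalone sketch. This is not Tika code: the class `NewlineDemo`, the `TextCollector` handler, and the sample entry name and content are all hypothetical, and only the JDK's SAX classes are used. It assumes a plain text consumer that keeps character events and ignores element boundaries, which is the situation in which an entry name would run straight into the entry's content.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class NewlineDemo {

    /** Hypothetical text sink: keeps character events, ignores element boundaries. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Small helper that sends a string as a SAX characters event. */
    static void chars(DefaultHandler handler, String s) throws SAXException {
        handler.characters(s.toCharArray(), 0, s.length());
    }

    /**
     * Emits one made-up zip entry as SAX events, with or without the
     * "\n" character events that the commit adds, and returns the text
     * that the collector saw.
     */
    static String emitEntry(boolean withNewlines) {
        TextCollector handler = new TextCollector();
        AttributesImpl noAttributes = new AttributesImpl();
        try {
            handler.startElement("", "div", "div", noAttributes);
            handler.startElement("", "h1", "h1", noAttributes);
            chars(handler, "readme.txt");          // sample entry name (assumption)
            handler.endElement("", "h1", "h1");
            if (withNewlines) {
                chars(handler, "\n");              // after the heading
            }
            chars(handler, "Hello world");         // sample entry content (assumption)
            if (withNewlines) {
                chars(handler, "\n");              // before the closing div
            }
            handler.endElement("", "div", "div");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + emitEntry(false) + "]");
        System.out.println("[" + emitEntry(true) + "]");
    }
}
```

Without the extra character events the collected text is `readme.txtHello world`, so a tokenizer would see `readme.txtHello` as a single term. With them, the name and the content end up on separate lines.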