[ https://issues.apache.org/jira/browse/TIKA-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316211#comment-14316211 ]
Hudson commented on TIKA-1544:
------------------------------

SUCCESS: Integrated in tika-trunk-jdk1.7 #485 (See [https://builds.apache.org/job/tika-trunk-jdk1.7/485/])
TIKA-1544 consecutive new lines not preserved in rtf (tallison: http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1658947)
* /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/rtf/TextExtractor.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/rtf/RTFParserTest.java
* /tika/trunk/tika-parsers/src/test/resources/test-documents/testRTFNewlines.rtf

> empty lines are not preserved
> -----------------------------
>
> Key: TIKA-1544
> URL: https://issues.apache.org/jira/browse/TIKA-1544
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.6
> Environment: Windows 8, Java 1.8
> Reporter: mortee
> Priority: Minor
> Fix For: 1.8
>
> Attachments: preserve_new_lines_in_rtf.patch, testRTFNewlines.rtf
>
>
> I'm trying to extract the text content from RTF documents. The files contain
> empty lines (two or more consecutive paragraph-end marks), which further
> processing relies on to tell apart different parts of the text.
> Unfortunately, Tika (with the --text switch) eliminates all those empty lines
> instead of preserving them.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)