[ https://issues.apache.org/jira/browse/TIKA-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460067#comment-16460067 ]
Hudson commented on TIKA-2636:
------------------------------

UNSTABLE: Integrated in Jenkins build tika-2.x-windows #242 (See [https://builds.apache.org/job/tika-2.x-windows/242/])

TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: rev ceb7b42ba2e342e7becb81d0c661ccd6209a915e)
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java
* (edit) tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java
* (add) output.txt
* (edit) tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser
* (add) tika-parsers/src/test/resources/test-documents/ang20150420t182050_corr_v1e_img.hdr

TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: rev 1fae340976e054bc8206cf79dff8b33758eebe82)
* (delete) output.txt

TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: rev d2c412940b976e607b698b598e25481495d0b8e4)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java

TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: rev fb4e39323b1d0576ea8066065febae94765a96c2)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java


> ENVI Header metadata fields can span more than one line
> -------------------------------------------------------
>
>                 Key: TIKA-2636
>                 URL: https://issues.apache.org/jira/browse/TIKA-2636
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.18
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.19
>
>         Attachments: ang20150420t182050_corr_v1e_img.hdr
>
>
> [~tpalsulich] was correct when [he stated|https://issues.apache.org/jira/browse/TIKA-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046140#comment-14046140]
> "...See below for how to read and output line by line (copy & paste between the xml start/end in EnviHeaderParser). I have a hunch this isn't really what we want -- what if a metadata field has a newline in it? What if the line is too long to fit into a string? On the other hand, with nice input, it's much nicer output."
>
> As it turns out, ENVI header metadata fields can span more than one line. An example is as follows:
> {code}
> 1. ENVI
> 2. description = {
> 3. Georeferenced Image built from input GLT. [Wed Jun 10 04:37:54 2015] [Wed
> 4. Jun 10 04:48:52 2015]}
> 5. samples = 739
> 6. lines = 14674
> 7. bands = 432
> 8. header offset = 0
> 9. file type = ENVI Standard
> 10. data type = 4
> 11. interleave = bil
> 12. sensor type = Unknown
> 13. byte order = 0
> 14. map info = { UTM , 1.000 , 1.000 , 724522.127 , 4074620.759 , 1.1000000000e+00 , 1.1000000000e+00 , 12 , North , WGS-84 , units=Meters , rotation=75.00000000 }
> 15. wavelength units = Nanometers
> ...
> {code}
> The case here is a metadata field whose value is enclosed in curly brackets. In the example above, the description value on L2-L4 is spread over three lines, whereas the map info value on L14 fits on a single line.
> This requires a patch to the [EnviHeaderParser|https://github.com/apache/tika/blob/9130bbc1fa6d69419b2ad294917260d6b1cced08/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
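
For illustration only, the multi-line case described in the issue could be handled along the following lines. This is a minimal sketch and not the code committed to EnviHeaderParser: the class name `EnviHeaderSketch` and the method `parse` are hypothetical, and the sketch assumes that a value opening with `{` continues until the first `}` is seen (nested brackets are not handled) and that the header is small enough to hold in memory as a string.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; NOT the actual Tika EnviHeaderParser implementation. */
final class EnviHeaderSketch {

    /**
     * Splits header text into "key = value" fields. A value that starts with
     * '{' is extended with the following lines until a '}' has been seen.
     * Assumption: curly brackets are not nested.
     */
    static Map<String, String> parse(String header) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] lines = header.split("\\r?\\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // e.g. the leading "ENVI" line or a blank line
            }
            String key = line.substring(0, eq).trim();
            StringBuilder value = new StringBuilder(line.substring(eq + 1).trim());
            if (value.length() > 0 && value.charAt(0) == '{') {
                // Consume continuation lines until the closing bracket appears.
                while (value.indexOf("}") < 0 && i + 1 < lines.length) {
                    String next = lines[++i].trim();
                    if (value.length() > 1) {
                        value.append(' '); // join wrapped lines with one space
                    }
                    value.append(next);
                }
            }
            // The surrounding brackets are kept as part of the stored value.
            fields.put(key, value.toString());
        }
        return fields;
    }
}
```

Called on the example header above, `EnviHeaderSketch.parse(...)` would return the description as one joined value and leave single-line fields such as `samples` and `map info` unchanged. Only the first `=` on a field's opening line is treated as the separator, so `units=Meters` inside the `map info` brackets stays part of the value.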