[ https://issues.apache.org/jira/browse/TIKA-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460469#comment-16460469 ]

Hudson commented on TIKA-2636:
------------------------------

UNSTABLE: Integrated in Jenkins build Tika-trunk #1478 (See [https://builds.apache.org/job/Tika-trunk/1478/])
TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: [https://github.com/apache/tika/commit/1e45da928ab699cd3e483d5c1006412c69ff6c09])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java
TIKA-2636 ENVI Header metadata fields can span more than one line (lewis.mcgibbney: [https://github.com/apache/tika/commit/95d967b14c33acd4e82814f6cb470cbab0f4ee08])
* (edit) tika-parsers/src/test/java/org/apache/tika/config/TikaEncodingDetectorTest.java


> ENVI Header metadata fields can span more than one line
> -------------------------------------------------------
>
>                 Key: TIKA-2636
>                 URL: https://issues.apache.org/jira/browse/TIKA-2636
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.18
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.19
>
>         Attachments: ang20150420t182050_corr_v1e_img.hdr
>
>
> [~tpalsulich] was correct when [he stated|https://issues.apache.org/jira/browse/TIKA-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046140#comment-14046140]
> "...See below for how to read and output line by line (copy & paste between
> the xml start/end in EnviHeaderParser). I have a hunch this isn't really what
> we want -- what if a metadata field has a newline in it? What if the line is
> too long to fit into a string? On the other hand, with nice input, it's much
> nicer output."
> As it turns out ENVI header metadata fields can span more than one line. An
> example is as follows
> {code}
> 1. ENVI
> 2. description = {
> 3. Georeferenced Image built from input GLT. [Wed Jun 10 04:37:54 2015] [Wed
> 4. Jun 10 04:48:52 2015]}
> 5. samples = 739
> 6. lines = 14674
> 7. bands = 432
> 8. header offset = 0
> 9. file type = ENVI Standard
> 10. data type = 4
> 11. interleave = bil
> 12. sensor type = Unknown
> 13. byte order = 0
> 14. map info = { UTM , 1.000 , 1.000 , 724522.127 , 4074620.759 , 1.1000000000e+00 , 1.1000000000e+00 , 12 , North , WGS-84 , units=Meters , rotation=75.00000000 }
> 15. wavelength units = Nanometers
> ...
> {code}
> The case here is when a metadata field value is contained within curly
> brackets. The examples above are clearly L2-L4 where the value is spread over
> three lines and L14 where the value is contained within the one line.
> This requires a patch to fix the
> [EnviHeaderParser|https://github.com/apache/tika/blob/9130bbc1fa6d69419b2ad294917260d6b1cced08/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
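
For illustration only, the multi-line case described in the issue (a value opened with a curly bracket on one line and closed on a later one) could be handled roughly as sketched below. This is not the committed patch and not the real EnviHeaderParser code. The class name EnviHeaderSketch and the method parse are hypothetical, and the sketch assumes that a field is a "key = value" pair and that a value containing "{" continues until a line containing "}" is reached.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Apache Tika.
final class EnviHeaderSketch {

    /**
     * Splits ENVI header text into key/value pairs. A value that opens a
     * curly bracket without closing it is joined with the following lines
     * (separated by a single space) until the closing bracket appears.
     */
    static Map<String, String> parse(String header) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] lines = header.split("\\R"); // any line-break sequence
        for (int i = 0; i < lines.length; i++) {
            int eq = lines[i].indexOf('=');
            if (eq < 0) {
                continue; // e.g. the leading "ENVI" line or a blank line
            }
            String key = lines[i].substring(0, eq).trim();
            StringBuilder value =
                    new StringBuilder(lines[i].substring(eq + 1).trim());
            // Assumption: brackets are not nested, so the first '}' ends the value.
            while (value.indexOf("{") >= 0 && value.indexOf("}") < 0
                    && i + 1 < lines.length) {
                value.append(' ').append(lines[++i].trim());
            }
            fields.put(key, value.toString());
        }
        return fields;
    }
}
```

With this approach the "description" field from the example comes back as one value spanning header lines 2 to 4, while single-line fields such as "samples" and "map info" are unaffected. Nested brackets and an unterminated "{" at the end of the file are deliberately left unhandled in this sketch.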