Re: Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/#review46631
---

trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java
https://reviews.apache.org/r/22892/#comment82136

Good comment Nick. I committed the version of this patch without this improvement, and we can make this improvement later on with a new issue.

- Chris Mattmann

On June 23, 2014, 11:14 p.m., Ann Burgess wrote:
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22892/
> ---
>
> (Updated June 23, 2014, 11:14 p.m.)
>
> Review request for tika.
>
> Bugs: TIKA-1274
> https://issues.apache.org/jira/browse/TIKA-1274
>
> Repository: tika
>
> Description
> ---
>
> New parser for ENVI header files. Note, this is a parser for header files that will have an associated, separate data file. This parser will not extract content from the data file.
>
> Diffs
> ---
>
> trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java PRE-CREATION
> trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java PRE-CREATION
> trunk/tika-parsers/src/test/resources/test-documents/envi_test_header.hdr PRE-CREATION
>
> Diff: https://reviews.apache.org/r/22892/diff/
>
> Testing
> ---
>
> Text parsing test completed with file envi_test_header.hdr.
>
> Thanks,
> Ann Burgess
Re: Review Request 22892: New parser for ENVI header files
On June 24, 2014, 12:28 p.m., Nick Burch wrote:
> trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java, lines 75-82
> https://reviews.apache.org/r/22892/diff/3/?file=615266#file615266line75
>
> This might be better using something like a BufferedReader, so you can read in one line of the ENVI file at a time and output each line in its own p tag, or in an li tag within a ul.

Thanks for the input, Nick. I'm working on implementing the BufferedReader now.

- Ann

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/#review46518
---

On June 23, 2014, 11:14 p.m., Ann Burgess wrote:
> (Updated June 23, 2014, 11:14 p.m.)
>
> Review request for tika.
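For readers following the thread, Nick's suggestion of reading one line at a time and giving each line its own p tag could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the JDK's BufferedReader alone and builds a plain string, whereas the actual parser would presumably emit elements through Tika's content handler. The class name EnviLineSketch, the helper names, and the sample header text are all made up for this example and are not taken from the patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration class; not part of the reviewed patch.
final class EnviLineSketch {

    // Minimal escaping so header text cannot break the surrounding markup.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Reads the input one line at a time and wraps each line in its own <p> element.
    static String toParagraphs(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            out.append("<p>").append(escape(line)).append("</p>\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample header text, used only to exercise the method.
        String header = "ENVI\nsamples = 2400\nlines = 2400\n";
        System.out.print(toParagraphs(new StringReader(header)));
    }
}
```

Emitting li elements inside a ul, the other option Nick mentions, would only change the strings appended around each line.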
Re: Review Request 22892: New parser for ENVI header files
In reply to Nick Burch's comment of June 24, 2014, 12:28 p.m.:

Looking into this more, AutoDetectReader is already a subclass of BufferedReader. Should we, as discussed here [1], be reading chunk by chunk, as this code (and TXTParser) is doing manually? If so, we should really just use the built-in BufferedReader implementation.

Which... leads to AutoDetectReader -- we should create a new constructor which accepts a buffer size and passes that along in the super constructor call. Once we create that, we can clean this up to properly read chunk by chunk.

Or, we just don't do that and read line by line, with reader.readLine(), as in the original StackOverflow question ;).

[1] - http://stackoverflow.com/questions/17084657/most-robust-way-of-reading-a-file-or-stream-using-java-to-prevent-dos-attacks

- Tyler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/#review46518
---

On June 23, 2014, 11:14 p.m., Ann Burgess wrote:
> (Updated June 23, 2014, 11:14 p.m.)
>
> Review request for tika.
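As a rough sketch of the first option Tyler describes, the snippet below shows a BufferedReader subclass whose constructor takes a buffer size and passes it to the super constructor, followed by a chunk-by-chunk read loop. SizedReaderSketch is a hypothetical stand-in, not AutoDetectReader itself, and the constructor shown is only an assumption about what the proposed addition might look like. The only JDK calls relied on are BufferedReader(Reader, int) and read(char[], int, int).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a reader with the proposed buffer-size constructor.
final class SizedReaderSketch extends BufferedReader {

    // Passes the requested buffer size along to BufferedReader's constructor.
    SizedReaderSketch(Reader in, int bufferSize) {
        super(in, bufferSize);
    }

    // Reads the input in chunks of at most chunkSize characters and
    // reassembles them; a real parser would process each chunk instead.
    static String readInChunks(Reader in, int chunkSize) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] chunk = new char[chunkSize];
        try (BufferedReader reader = new SizedReaderSketch(in, chunkSize)) {
            int n;
            while ((n = reader.read(chunk, 0, chunk.length)) != -1) {
                out.append(chunk, 0, n);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample text, used only to exercise the method.
        System.out.print(readInChunks(new StringReader("ENVI\nbands = 3\n"), 4));
    }
}
```

The line-by-line alternative Tyler mentions would replace the read loop with repeated calls to reader.readLine().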
Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/
---

Review request for tika.

Bugs: TIKA-1274
https://issues.apache.org/jira/browse/TIKA-1274

Repository: tika

Description
---

New parser for ENVI header files. Note, this is a parser for header files that will have an associated, separate data file. This parser will not extract content from the data file.

Diffs
---

trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java PRE-CREATION
trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java PRE-CREATION
trunk/tika-parsers/src/test/resources/test-documents/envi_test_header.hdr PRE-CREATION

Diff: https://reviews.apache.org/r/22892/diff/

Testing
---

Text parsing test completed with file envi_test_header.hdr.

Thanks,
Ann Burgess
Re: Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/#review46459
---

Looks great, Annie. With the package updates, I think I can commit this.

- Chris Mattmann

On June 23, 2014, 9:43 p.m., Ann Burgess wrote:
> (Updated June 23, 2014, 9:43 p.m.)
>
> Review request for tika.
Re: Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/#review46457
---

trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java
https://reviews.apache.org/r/22892/#comment81848

org.apache.tika.parser.envi

trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java
https://reviews.apache.org/r/22892/#comment81849

org.apache.tika.parser.envi

- Chris Mattmann

On June 23, 2014, 9:43 p.m., Ann Burgess wrote:
> (Updated June 23, 2014, 9:43 p.m.)
>
> Review request for tika.
Re: Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/
---

(Updated June 23, 2014, 10:01 p.m.)

Review request for tika.

Bugs: TIKA-1274
https://issues.apache.org/jira/browse/TIKA-1274

Repository: tika

Diffs (updated)
---

trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java PRE-CREATION
trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java PRE-CREATION
trunk/tika-parsers/src/test/resources/test-documents/envi_test_header.hdr PRE-CREATION

Diff: https://reviews.apache.org/r/22892/diff/

Thanks,
Ann Burgess
Re: Review Request 22892: New parser for ENVI header files
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22892/
---

(Updated June 23, 2014, 11:14 p.m.)

Review request for tika.

Bugs: TIKA-1274
https://issues.apache.org/jira/browse/TIKA-1274

Repository: tika

Diffs (updated)
---

trunk/tika-parsers/src/main/java/org/apache/tika/parser/envi/EnviHeaderParser.java PRE-CREATION
trunk/tika-parsers/src/test/java/org/apache/tika/parser/envi/EnviHeaderParserTest.java PRE-CREATION
trunk/tika-parsers/src/test/resources/test-documents/envi_test_header.hdr PRE-CREATION

Diff: https://reviews.apache.org/r/22892/diff/

Thanks,
Ann Burgess