[ https://issues.apache.org/jira/browse/IO-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150124#comment-13150124 ]
Sebb commented on IO-288:
-------------------------

Good to know that it's easy to unambiguously detect CR and LF.

There seem to be a lot of spurious files in the zip archive.

I'm not sure that getNewLineMatchByteCount() is as efficient as BufferedReader.readLine() - it seems to process characters multiple times. It could probably be improved by just checking the current and previous chars. Also, I don't think it's necessary to encode \n or \r - just use the appropriate characters.

There are no tests for multi-block files where lines may span blocks. Indeed, a CRLF pair may span blocks; I'm not convinced that the code handles that correctly. In order for getNewLineMatchByteCount() to detect all CRLF pairs, it generally needs at least 2 characters to be present; this does not seem to be guaranteed.

Note: the tests could use a smaller block size to make the test files smaller; it is probably sensible to compare the results with a forward line reader. It would then be simple to have a directory of various different test files - read the file forward and store the lines; ensure that the reverse reader matches the reversed lines.

The field totalBlockCount needs to be a long, not an int.

It might simplify the code to use empty arrays rather than null.

> Supply a ReversedLinesFileReader
> ---------------------------------
>
>                 Key: IO-288
>                 URL: https://issues.apache.org/jira/browse/IO-288
>             Project: Commons IO
>          Issue Type: New Feature
>          Components: Utilities
>            Reporter: Georg Henzler
>             Fix For: 2.2
>
>         Attachments: ReversedLinesFileReader0.2.zip
>
>
> I needed to analyse a log file today and I was looking for a
> ReversedLinesFileReader: A class that behaves exactly like BufferedReader
> except that it goes from bottom to top when readLine() is called. I didn't
> find it in IOUtils and the internet didn't help a lot either, e.g.
> http://www.java2s.com/Tutorial/Java/0180__File/ReversingaFile.htm is fairly
> inefficient - the log files I'm analysing are huge and it is not a good idea
> to load the whole content into memory.
> So I ended up writing an implementation myself using little memory and the
> class RandomAccessFile - see attached file. It's used as follows:
> int blockSize = 4096; // only that much memory is needed, no matter how big the file is
> ReversedLinesFileReader reversedLinesFileReader =
>     new ReversedLinesFileReader(myFile, blockSize, "UTF-8"); // encoding is supported
> String line = null;
> while((line=reversedLinesFileReader.readLine())!=null) {
>   ... // use the line
>   if(enoughLinesSeen) {
>     break;
>   }
> }
> reversedLinesFileReader.close();
> I believe this could be useful for other people as well!
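
The comparison test suggested in the comment (read the input forward, store the lines, check that a reverse reader returns them reversed) could be sketched roughly as follows. This is only an illustration under stated assumptions: ReverseLinesSketch, terminatorLength and reverseLines are hypothetical names, not code from the attachment or from Commons IO. The sketch works on an in-memory byte array, so it does not exercise block boundaries, and it assumes an ASCII-compatible encoding such as UTF-8 in which the CR and LF bytes are unambiguous. It detects a CRLF pair by looking only at the current and the previous byte.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Commons IO or the attachment.
class ReverseLinesSketch {

    /** Length (0, 1 or 2) of the line terminator whose last byte is data[i]. */
    static int terminatorLength(byte[] data, int i) {
        if (i < 0) {
            return 0;
        }
        byte cur = data[i];
        if (cur == '\n') {
            // Only the previous byte is needed to recognise a CRLF pair.
            return (i > 0 && data[i - 1] == '\r') ? 2 : 1;
        }
        return cur == '\r' ? 1 : 0;
    }

    /** Returns the lines of data, last line first, scanning backwards. */
    static List<String> reverseLines(byte[] data) {
        List<String> lines = new ArrayList<>();
        if (data.length == 0) {
            return lines;
        }
        // Like readLine(), ignore a single terminator at the very end.
        int end = data.length - terminatorLength(data, data.length - 1);
        int i = end - 1;
        while (i >= 0) {
            int len = terminatorLength(data, i);
            if (len > 0) {
                // Bytes (i, end) form one complete line.
                lines.add(new String(data, i + 1, end - i - 1, StandardCharsets.UTF_8));
                end = i - len + 1; // skip the whole terminator
                i = end - 1;
            } else {
                i--;
            }
        }
        lines.add(new String(data, 0, end, StandardCharsets.UTF_8));
        return lines;
    }

    /** Reference result: lines as read by BufferedReader.readLine(). */
    static List<String> forwardLines(byte[] data) {
        List<String> lines = new ArrayList<>();
        String text = new String(data, StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] samples = {"", "\n", "a\nb\n", "a\r\n\r\nb", "a\n\rb", "a\r\r\nb\r"};
        for (String sample : samples) {
            byte[] data = sample.getBytes(StandardCharsets.UTF_8);
            List<String> expected = forwardLines(data);
            Collections.reverse(expected);
            System.out.println(expected.equals(reverseLines(data)) ? "match" : "MISMATCH");
        }
    }
}
```

For a block-based reader, the same comparison could be run over a directory of test files with a deliberately small block size, so that lines and CRLF pairs are forced to span blocks.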