[jira] [Commented] (CSV-42) Lots of possible changes
[ https://issues.apache.org/jira/browse/CSV-42?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691912#comment-13691912 ] Benedikt Ritter commented on CSV-42: Let's review this once again an decide what we want to change for 1.0 Lots of possible changes Key: CSV-42 URL: https://issues.apache.org/jira/browse/CSV-42 Project: Commons CSV Issue Type: Improvement Affects Versions: 1.0 Reporter: Bob Smith Priority: Minor Fix For: 1.0 Attachments: src.zip I made a lot of changes to pretty much all of the classes in the csv package. I thought it would be better to put all of the the changes here in one issue, but feel free to only take the parts you like (if any). Hopefully if nothing else the test cases will be useful to you. I'll try to list most of the changes here, but I'm sure I'm forgetting some. This should include all of the big changes at least. I focused mostly on the parser, but I also made a few changes to the printer classes (although I don't think I added any new test cases there). h3. General Changes - Changed all class names with CSV in them to use Csv. This is how it appears in the commons-lang escapeCsv methods and I think it's easier to read the class name when acronyms are not in all upper case. (x) {color:red}3 letter acronyms are usually kept in uppercase (for example URLConnection or SAXParser in the JDK, but there are some exceptions){color} - Formatted the code. I used Eclipse with a version of the Java formatting style that uses spaces instead of tabs and with a few other small changes to try to make it more similar to the style of this code. The formatting was inconsistent before (sometimes 2 space indent, sometimes 4) which made it really hard to work on. (/) {color:green}DONE{color} - Removed all deprecated methods/constructors (/) {color:green}DONE{color} - Made all public classes final. If there is ever a need to create subclasses of them then this could be changed, but I think it would be better to at least start them as final (since once they are released as non-final it's hard to go back). - A few bug fixes (and test cases for them) h3. CsvParser - There were a few bugs for special cases, so I made as small of changes as I could to the parser code to fix these. - Added a lot of test cases. I created a test case for all bugs that I found, so even if you don't use my changes to this class you should be able to use the test cases to find all of the same bugs. - Added a close method. (x) {color:red} The try-with-resources statement in Java 7 makes resources management much easier, there is no need to add a close() method to the parser.{color} - Renamed the nextValue method to getValue (so it is more consistent with the getAll and getLine method names). I think I would prefer to use a different method name prefix for all three of these (like readAll) since I wouldn't normally expect a get method to have side effects, but I didn't want to just change the names of the most used methods. (x) {color:red}This method has been removed, the parser now works line by line.{color} - Changed the getLineNumber method to return the correct line number when there are multi-line values. (x) {color:red}The suggested code counts the number of records instead of the number of lines. For debugging it's better to return the actual line number.{color} - Moved all of the lexer methods into an inner CsvLexer class that is completely independent of the CsvParser class. The methods were already separated out, so it wasn't a very big change. I also moved the lexer test cases into a new CsvLexerTest class. (/) {color:green}DONE{color} - Got rid of the interpreting unicode escape options. This doesn't really have anything to do with parsing a CSV file so I think it should be left up to the user of the class to implement this if needed. As an example, I made a CsvParserUnicodeEscapeTest class that uses the code from the lexer in a Reader subclass. One nice thing is that with this implementation, the interpreted values can be used as the delimiter, encapsulator, etc. (/) {color:green}DONE - The unicode unescaping is now handled by a class implementing java.io.Reader (to be contributed to Commons IO).{color} - Got rid of the escape option for the same reason as the unicode escape option. I replaced it with an encapsulator escape option that is only used as an escape operator on the encapsulator character. h3. ExtendedBufferedReader - Greatly simplified this class. I removed all the methods that weren't being used (including keeping track of the line number) and changed the lookahead option to use the BufferedReader mark and reset methods. (/) {color:green}DONE -
[jira] [Commented] (CSV-42) Lots of possible changes
[ https://issues.apache.org/jira/browse/CSV-42?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13691947#comment-13691947 ] Gary Gregory commented on CSV-42: - Just a note on: {quote} Added a close method. The try-with-resources statement in Java 7 makes resources management much easier, there is no need to add a close() method to the parser. {quote} The try-with-resources in Java 7 requires the a class implements Closeable, which means implementing close(), but the parser does not create closeable resources, it takes a {{Reader}}, which itself can be used with try-with-resources in Java 7. Lots of possible changes Key: CSV-42 URL: https://issues.apache.org/jira/browse/CSV-42 Project: Commons CSV Issue Type: Improvement Affects Versions: 1.0 Reporter: Bob Smith Priority: Minor Fix For: 1.0 Attachments: src.zip I made a lot of changes to pretty much all of the classes in the csv package. I thought it would be better to put all of the the changes here in one issue, but feel free to only take the parts you like (if any). Hopefully if nothing else the test cases will be useful to you. I'll try to list most of the changes here, but I'm sure I'm forgetting some. This should include all of the big changes at least. I focused mostly on the parser, but I also made a few changes to the printer classes (although I don't think I added any new test cases there). h3. General Changes - Changed all class names with CSV in them to use Csv. This is how it appears in the commons-lang escapeCsv methods and I think it's easier to read the class name when acronyms are not in all upper case. (x) {color:red}3 letter acronyms are usually kept in uppercase (for example URLConnection or SAXParser in the JDK, but there are some exceptions){color} - Formatted the code. I used Eclipse with a version of the Java formatting style that uses spaces instead of tabs and with a few other small changes to try to make it more similar to the style of this code. The formatting was inconsistent before (sometimes 2 space indent, sometimes 4) which made it really hard to work on. (/) {color:green}DONE{color} - Removed all deprecated methods/constructors (/) {color:green}DONE{color} - Made all public classes final. If there is ever a need to create subclasses of them then this could be changed, but I think it would be better to at least start them as final (since once they are released as non-final it's hard to go back). - A few bug fixes (and test cases for them) h3. CsvParser - There were a few bugs for special cases, so I made as small of changes as I could to the parser code to fix these. - Added a lot of test cases. I created a test case for all bugs that I found, so even if you don't use my changes to this class you should be able to use the test cases to find all of the same bugs. - Added a close method. (x) {color:red} The try-with-resources statement in Java 7 makes resources management much easier, there is no need to add a close() method to the parser.{color} - Renamed the nextValue method to getValue (so it is more consistent with the getAll and getLine method names). I think I would prefer to use a different method name prefix for all three of these (like readAll) since I wouldn't normally expect a get method to have side effects, but I didn't want to just change the names of the most used methods. (x) {color:red}This method has been removed, the parser now works line by line.{color} - Changed the getLineNumber method to return the correct line number when there are multi-line values. (x) {color:red}The suggested code counts the number of records instead of the number of lines. For debugging it's better to return the actual line number.{color} - Moved all of the lexer methods into an inner CsvLexer class that is completely independent of the CsvParser class. The methods were already separated out, so it wasn't a very big change. I also moved the lexer test cases into a new CsvLexerTest class. (/) {color:green}DONE{color} - Got rid of the interpreting unicode escape options. This doesn't really have anything to do with parsing a CSV file so I think it should be left up to the user of the class to implement this if needed. As an example, I made a CsvParserUnicodeEscapeTest class that uses the code from the lexer in a Reader subclass. One nice thing is that with this implementation, the interpreted values can be used as the delimiter, encapsulator, etc. (/) {color:green}DONE - The unicode unescaping is now handled by a class implementing java.io.Reader (to be contributed to Commons IO).{color} - Got rid of the escape option for the same reason as the unicode escape option. I replaced it with an