[jira] [Commented] (CSV-272) Support fixed-width format
[ https://issues.apache.org/jira/browse/CSV-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292170#comment-17292170 ]

Holger Brandl commented on CSV-272:
-----------------------------------

I don't mind the appearance, but I'm stuck with an old fixed-width file from the mid-90s and was hoping to use apache-commons-csv. :-)

I'd also like to support fixed-width files in krangl, a data-frame library for Kotlin (https://github.com/holgerbrandl/krangl) that uses commons-csv for file reading.

I [hacked |https://github.com/holgerbrandl/krangl/commit/65744ccf5dee9b756da7b6fe5e7cb31715bae417#diff-e2fc3be92a3578f8a93fa51930597d181d218627e0c2d144096385a800e7eceeR183] together some fixed-width support this morning, but to me such parsing logic would be better off living in a library such as apache-commons-csv. Unfortunately, I don't know enough about the internals of commons-csv to provide a PR.

> Support fixed-width format
> --------------------------
>
>                 Key: CSV-272
>                 URL: https://issues.apache.org/jira/browse/CSV-272
>             Project: Commons CSV
>          Issue Type: Improvement
>          Components: Parser
>    Affects Versions: 1.8
>            Reporter: Holger Brandl
>            Priority: Major
>
> Although not strictly a delimiter format, fixed-width tables are still very
> common. So it would be great if commons-csv would support fixed-width via a
> dedicated format.
> Since jira does not render fixed-delim content correctly, I've deposited an
> example under
> https://gist.github.com/holgerbrandl/26298ae77d53b3393d9d22c73249ab72

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Created] (CSV-272) Support fixed-width format
Holger Brandl created CSV-272:
---------------------------------

             Summary: Support fixed-width format
                 Key: CSV-272
                 URL: https://issues.apache.org/jira/browse/CSV-272
             Project: Commons CSV
          Issue Type: Improvement
          Components: Parser
    Affects Versions: 1.8
            Reporter: Holger Brandl

Although not strictly a delimiter format, fixed-width tables are still very common. So it would be great if commons-csv would support fixed-width via a dedicated format.

Since jira does not render fixed-delim content correctly, I've deposited an example under https://gist.github.com/holgerbrandl/26298ae77d53b3393d9d22c73249ab72
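
For readers unfamiliar with the format: in a fixed-width table, each column occupies a fixed number of characters, so a line is split by position rather than by delimiter. The following is only a minimal sketch of that idea in plain Java. It is not part of commons-csv and not the krangl implementation; the class name `FixedWidthSketch`, the method `splitFixedWidth`, and the example widths are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSketch {

    /**
     * Splits one line into fields using explicit column widths (an assumption:
     * the caller already knows the widths). Padding is trimmed, and lines that
     * are shorter than the declared layout yield empty strings for the
     * missing trailing columns.
     */
    public static List<String> splitFixedWidth(String line, int[] widths) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            // Clamp both ends so short lines do not throw.
            int from = Math.min(start, line.length());
            int to = Math.min(start + width, line.length());
            fields.add(line.substring(from, to).trim());
            start += width;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical layout: three columns of 6, 10 and 4 characters.
        int[] widths = {6, 10, 4};
        System.out.println(splitFixedWidth("aaaaaabbbbbbbbbbcccc", widths));
        System.out.println(splitFixedWidth("abc", widths));
    }
}
```

A dedicated format in the library would presumably also need a way to declare or infer the widths, which this sketch leaves to the caller.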
[jira] [Comment Edited] (CSV-257) Updating from 1.6 to 1.7 breaks
[ https://issues.apache.org/jira/browse/CSV-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150558#comment-17150558 ]

Holger Brandl edited comment on CSV-257 at 7/2/20, 9:08 PM:
------------------------------------------------------------

I think that commons-csv should not be opinionated about whether an empty header makes sense or not. It should parse it (if the structure is legit csv) and leave this decision to the user.

It's a common pattern to have an empty header for the first column containing the row number/index. See, for example, official examples such as [https://jetbrains.bintray.com/lets-plot/mpg.csv]

was (Author: holgerbrandl):
I think that commons-csv should not be opinionated about whether an empty header makes sense or not. It should parse it (if the structure is legit csv) and leave this decision to the user.

> Updating from 1.6 to 1.7 breaks
> -------------------------------
>
>                 Key: CSV-257
>                 URL: https://issues.apache.org/jira/browse/CSV-257
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.7
>            Reporter: xia0c
>            Priority: Major
>
> When I try to upgrade commons-csv from 1.6 to the latest version, 1.7, the
> following code breaks.
> {code:java}
> public class Demo {
>     @Test
>     public void TestCSV(){
>         try {
>             InputStream testInput = new ByteArrayInputStream(",".getBytes(StandardCharsets.UTF_8));
>             Reader reader = new InputStreamReader(testInput);
>             char character = ",".charAt(0);
>             CSVFormat format = CSVFormat.RFC4180.withDelimiter(character).withFirstRecordAsHeader().withIgnoreSurroundingSpaces();
>             CSVParser parser = new CSVParser(reader, format);
>         } catch (Exception e) {
>             System.out.println("!!!");
>             assertEquals("java.lang.IllegalArgumentException: The header contains a duplicate name: \"\" in [, ]", e.toString());
>         }
>     }
> }
> {code}
> The test should pass, but it failed with error:
> {code:java}
> org.junit.ComparisonFailure: expected:<...lArgumentException: [The header contains a duplicate name: ""] in [, ]> but was:<...lArgumentException: [A header name is missing] in [, ]>
>     at org.junit.Assert.assertEquals(Assert.java:115)
>     at org.junit.Assert.assertEquals(Assert.java:144)
>     at Demo.TestCSV(Demo.java:31)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:89)
>     at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:41)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:541)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:763)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:463)
>     at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:209)
> {code}
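
To make the suggestion in the comment concrete, here is a rough sketch in plain Java of a header policy that tolerates empty names and leaves the decision to the caller. It does not show how commons-csv builds its header map in any version; the class `LenientHeaderSketch`, the method `headerIndex`, and the column names are hypothetical and only illustrate the "unnamed row-index column" pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LenientHeaderSketch {

    /**
     * Maps each non-empty header name to its column index.
     * Empty names are skipped instead of rejected, so those columns stay
     * reachable by position only. Duplicate non-empty names still fail.
     */
    public static Map<String, Integer> headerIndex(String[] headerRecord) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < headerRecord.length; i++) {
            String name = headerRecord[i].trim();
            if (name.isEmpty()) {
                continue; // unnamed column: no entry, no exception
            }
            if (index.containsKey(name)) {
                throw new IllegalArgumentException("Duplicate header name: " + name);
            }
            index.put(name, i);
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical header whose first column is an unnamed row index.
        String[] header = {"", "manufacturer", "model"};
        System.out.println(headerIndex(header));

        // The header from the bug report (a single "," line) has two empty names.
        System.out.println(headerIndex(new String[] {"", ""}));
    }
}
```

Under this policy the `","` input from the report is accepted and simply produces no named columns; whether that is the right default is exactly the question the comment raises.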