[jira] [Commented] (CSV-272) Support fixed-width format

2021-02-27 Thread Holger Brandl (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292170#comment-17292170
 ] 

Holger Brandl commented on CSV-272:
---

I don't mind the appearance, but I'm stuck with an old fixed-width file 
from the mid-90s and was hoping to use apache-commons-csv. :-)

I'd also like to support fixed-width files in krangl, a data-frame library 
for/in Kotlin (https://github.com/holgerbrandl/krangl) that uses commons-csv 
for file reading. I 
[hacked|https://github.com/holgerbrandl/krangl/commit/65744ccf5dee9b756da7b6fe5e7cb31715bae417#diff-e2fc3be92a3578f8a93fa51930597d181d218627e0c2d144096385a800e7eceeR183]
 together some fixed-width support this morning, but to me, such parsing logic 
would better live in a library such as apache-commons-csv.

Unfortunately I don't know enough about the internals of commons-csv to provide 
a PR.
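
To make the request concrete, here is a minimal sketch of what fixed-width 
parsing amounts to, assuming the column widths are known up front. It is not 
the krangl code linked above and not a commons-csv API; the class name, the 
parseLine method, and the sample layout are hypothetical, and only the JDK is 
used.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSketch {

    /**
     * Splits one line into trimmed fields using the given column widths.
     * Lines shorter than the total width yield empty strings for the
     * columns that are missing.
     */
    static List<String> parseLine(String line, int[] widths) {
        List<String> fields = new ArrayList<>(widths.length);
        int start = 0;
        for (int width : widths) {
            // Clamp both ends so short lines do not throw.
            int from = Math.min(start, line.length());
            int to = Math.min(start + width, line.length());
            fields.add(line.substring(from, to).trim());
            start += width;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical layout: a 4-character column followed by a 2-character column.
        int[] widths = {4, 2};
        System.out.println(parseLine("ab  cd", widths));
    }
}
```

A dedicated format in the library would presumably also have to cover header 
detection and padding characters, which this sketch leaves out.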

> Support fixed-width format
> --
>
>                 Key: CSV-272
>                 URL: https://issues.apache.org/jira/browse/CSV-272
>             Project: Commons CSV
>          Issue Type: Improvement
>          Components: Parser
>    Affects Versions: 1.8
>            Reporter: Holger Brandl
>            Priority: Major
>
> Although not strictly a delimiter format, fixed-width tables are still very 
> common. So it would be great if commons-csv would support fixed-width via a 
> dedicated format.
> Since jira does not render fixed-delim content correctly, I've deposited an 
> example under 
> https://gist.github.com/holgerbrandl/26298ae77d53b3393d9d22c73249ab72
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CSV-272) Support fixed-width format

2021-02-26 Thread Holger Brandl (Jira)
Holger Brandl created CSV-272:
-

             Summary: Support fixed-width format
                 Key: CSV-272
                 URL: https://issues.apache.org/jira/browse/CSV-272
             Project: Commons CSV
          Issue Type: Improvement
          Components: Parser
    Affects Versions: 1.8
            Reporter: Holger Brandl


Although not strictly a delimiter format, fixed-width tables are still very 
common. So it would be great if commons-csv would support fixed-width via a 
dedicated format.

Since jira does not render fixed-delim content correctly, I've deposited an 
example under 
https://gist.github.com/holgerbrandl/26298ae77d53b3393d9d22c73249ab72
 





[jira] [Comment Edited] (CSV-257) Updating from 1.6 to 1.7 breaks

2020-07-02 Thread Holger Brandl (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150558#comment-17150558
 ] 

Holger Brandl edited comment on CSV-257 at 7/2/20, 9:08 PM:


I think commons-csv should not be opinionated about whether an empty header 
makes sense. It should parse it (if the structure is legitimate CSV) and leave 
this decision to the user.

 

It's a common pattern to have an empty header for the first column containing 
the row number/index. E.g. see official examples such as 
[https://jetbrains.bintray.com/lets-plot/mpg.csv] 
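
As a rough illustration of the permissive behaviour argued for here, the 
following JDK-only sketch builds a header-name-to-index map that accepts an 
empty name like any other and only rejects repeated names. It is not how 
commons-csv is implemented; the class, the headerMap method, and the sample 
header are hypothetical, and the naive split ignores quoting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EmptyHeaderSketch {

    /**
     * Maps each header name to its column index. An empty name is accepted
     * like any other name; only a repeated name is rejected.
     */
    static Map<String, Integer> headerMap(String headerLine) {
        // The -1 limit keeps trailing empty fields. Quoted fields are not handled.
        String[] names = headerLine.split(",", -1);
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            if (map.putIfAbsent(names[i], i) != null) {
                throw new IllegalArgumentException(
                        "Duplicate header name: \"" + names[i] + "\"");
            }
        }
        return map;
    }

    public static void main(String[] args) {
        // Hypothetical header: the unnamed first column carries the row index.
        Map<String, Integer> header = headerMap(",manufacturer,model");
        System.out.println(header.get(""));
        System.out.println(header.get("model"));
    }
}
```

With this approach the caller decides whether an unnamed column is 
meaningful, while a header such as "," (two empty names) is still rejected as 
a duplicate.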


was (Author: holgerbrandl):
I think commons-csv should not be opinionated about whether an empty header 
makes sense. It should parse it (if the structure is legitimate CSV) and leave 
this decision to the user.

> Updating from 1.6 to 1.7 breaks
> ---
>
>                 Key: CSV-257
>                 URL: https://issues.apache.org/jira/browse/CSV-257
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.7
>            Reporter: xia0c
>            Priority: Major
>
> When I upgrade Commons CSV from 1.6 to the latest version, 1.7, the 
> following code breaks.
> {code:java}
> public class Demo {
>   @Test
>   public void TestCSV() {
>     try {
>       InputStream testInput = new ByteArrayInputStream(",".getBytes(StandardCharsets.UTF_8));
>       Reader reader = new InputStreamReader(testInput);
>       char character = ",".charAt(0);
>       CSVFormat format = CSVFormat.RFC4180.withDelimiter(character).withFirstRecordAsHeader().withIgnoreSurroundingSpaces();
>       CSVParser parser = new CSVParser(reader, format);
>     } catch (Exception e) {
>       System.out.println("!!!");
>       assertEquals("java.lang.IllegalArgumentException: The header contains a duplicate name: \"\" in [, ]", e.toString());
>     }
>   }
> }
> {code}
> The test should pass, but it fails with this error:
> {code:java}
> org.junit.ComparisonFailure: expected:<...lArgumentException: [The header contains a duplicate name: ""] in [, ]> but was:<...lArgumentException: [A header name is missing] in [, ]>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at Demo.TestCSV(Demo.java:31)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:89)
>   at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:41)
>   at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:541)
>   at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:763)
>   at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:463)
>   at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:209)
> {code}





[jira] [Commented] (CSV-257) Updating from 1.6 to 1.7 breaks

2020-07-02 Thread Holger Brandl (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150558#comment-17150558
 ] 

Holger Brandl commented on CSV-257:
---

I think commons-csv should not be opinionated about whether an empty header 
makes sense. It should parse it (if the structure is legitimate CSV) and leave 
this decision to the user.



