[jira] [Commented] (CSV-227) first column always quoting when multilingual language, when not on second column

2019-09-28 Thread Gary D. Gregory (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16940104#comment-16940104
 ] 

Gary D. Gregory commented on CSV-227:
-

A PR on GitHub with tests would help ;)

> first column always quoting when multilingual language, when not on second 
> column
> -
>
>                 Key: CSV-227
>                 URL: https://issues.apache.org/jira/browse/CSV-227
>             Project: Commons CSV
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 1.5
>            Reporter: Jisun, Shin
>            Priority: Major
>
> When the data includes multilingual characters (UTF-8 encoding),
> CSVPrinter always quotes the first column only, not the other columns.
>  
> {code:java}
> // Example code
> CSVFormat format = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL);
> CSVPrinter printer = new CSVPrinter(System.out, format);
> // Typed list, so the for-each loop below compiles without a cast.
> List<String[]> temp = new ArrayList<>();
> temp.add(new String[] { "ㅁㅎㄷㄹ", "ㅁㅎㄷㄹ", "", "test2" });
> temp.add(new String[] { "한글3", "hello3", "3한글3", "test3" });
> temp.add(new String[] { "", "hello4", "", "test4" });
> for (String[] temp1 : temp) {
>     printer.printRecord(temp1);
> }
> printer.close();
> {code}
>  
> result =>
> "ㅁㅎㄷㄹ",ㅁㅎㄷㄹ,,test2
> "한글3",hello3,3한글3,test3
> "",hello4,,test4
>  
> I found the relevant code.
> Multilingual characters are above 0x7E, so when the first field of a record
> starts with such a character, that field is always quoted.
>   
> {code:java}
> // CSVFormat.class, source lines 1173-1178
> ...
> char c = value.charAt(pos);
>
> // RFC4180 (https://tools.ietf.org/html/rfc4180) TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
> if (newRecord && (c < 0x20 || c > 0x21 && c < 0x23 || c > 0x2B && c < 0x2D || c > 0x7E)) {
>     quote = true;
> } else if (c <= COMMENT) {
> ...{code}
>  
> Would you fix this bug?
>  
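
To see why only the first field is affected, the condition quoted above can be restated as a standalone predicate. This is a minimal sketch, not Commons CSV code: the class and method names are hypothetical, and it assumes that `pos` points at the first character of the field and that `newRecord` is true only for the first field of a record. Empty fields are handled outside the quoted snippet and are not modelled here.

```java
/** Standalone restatement of the quoted first-field check; not Commons CSV code. */
final class FirstFieldQuoteCheck {

    /**
     * Same comparisons as the quoted condition: true when c lies outside
     * RFC 4180 TEXTDATA (%x20-21 / %x23-2B / %x2D-7E).
     */
    static boolean outsideTextData(char c) {
        return c < 0x20 || (c > 0x21 && c < 0x23) || (c > 0x2B && c < 0x2D) || c > 0x7E;
    }

    /** Hypothetical helper: would the quoted branch set quote = true for this field? */
    static boolean quotedByFirstCharRule(String value, boolean newRecord) {
        // Assumption: the check looks at the first character of a non-empty field.
        return newRecord && !value.isEmpty() && outsideTextData(value.charAt(0));
    }

    public static void main(String[] args) {
        // Second record from the report; \uD55C\uAE00 are the two Hangul syllables.
        String[] record = { "\uD55C\uAE003", "hello3", "3\uD55C\uAE003", "test3" };
        for (int i = 0; i < record.length; i++) {
            boolean newRecord = (i == 0); // assumption: true only for the first field
            System.out.println("field " + i + " -> " + quotedByFirstCharRule(record[i], newRecord));
        }
    }
}
```

Under these assumptions only field 0 reports true: U+D55C is above 0x7E, and the check is skipped for later fields because `newRecord` is false. That matches the reported output, where the same characters are quoted in the first column and left bare elsewhere.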



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CSV-227) first column always quoting when multilingual language, when not on second column

2019-09-07 Thread Yuji Konishi (Jira)


[ 
https://issues.apache.org/jira/browse/CSV-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924858#comment-16924858
 ] 

Yuji Konishi commented on CSV-227:
--

With the following test, none of the multilingual columns were quoted:

{code:java}
@Test
public void testFoo() throws IOException {
    CSVFormat format = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.MINIMAL);

    CSVPrinter printer = new CSVPrinter(System.out, format);

    // Typed list, so the for-each loop below compiles without a cast.
    List<String[]> temp = new ArrayList<>();

    temp.add(new String[] { "ㅁㅎㄷㄹ", "ㅁㅎㄷㄹ", "", "test2" });
    temp.add(new String[] { "한글3", "hello3", "3한글3", "test3" });
    temp.add(new String[] { "", "hello4", "", "test4" });

    for (String[] temp1 : temp) {
        printer.printRecord(temp1);
    }
    printer.close();
}
{code}

ㅁㅎㄷㄹ,ㅁㅎㄷㄹ,,test2
한글3,hello3,3한글3,test3
"",hello4,,test4


$ git log
commit 1a7c6140825bd7b3abe73c5dd732b090acc84b61 (HEAD -> master, origin/master, origin/HEAD)







[jira] [Commented] (CSV-227) first column always quoting when multilingual language, when not on second column

2019-06-19 Thread Daniel Cattlin (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867455#comment-16867455
 ] 

Daniel Cattlin commented on CSV-227:


With the default "QuoteMode.MINIMAL" I've seen some pretty weird behaviour too. 
It's easy to reproduce if you write two columns that contain the same values 
and compare them side by side. Below is some example data that was output by 
the CSV writer with unexpected quoting.

Notice that any row that starts with a Unicode character gets the first field 
in that row quoted, but not the second field, which is identical. This includes 
the escape character, which I find a bit odd. I also checked that any fields 
containing the delimiter are quoted just fine. I saw a similar question on 
Stack Overflow: 
[https://stackoverflow.com/questions/36663273/unexpected-quoting-in-apache-commons-csv]
{code:java}
"[QElmqgucZ",[QElmqgucZ
"`K^bPRa\Xm",`K^bPRa\Xm
NJ[\LWwY`Z,NJ[\LWwY`Z
c[n`zOk]qv,c[n`zOk]qv
y[KIphm]Bk,y[KIphm]Bk
"\rin\toDOP",\rin\toDOP
McLbuXeP]a,McLbuXeP]a
"\x`U^BHnVj",\x`U^BHnVj
"_\MzHJA]RO",_\MzHJA]RO
XslXnTQOEc,XslXnTQOEc
"-UHlnX\hNu",-UHlnX\hNu
ObGYlN_`g`,ObGYlN_`g`
"[FazYv\vtd",[FazYv\vtd{code}

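The sample above fits a simple pattern: the first field is quoted exactly when the row starts with a character that is not an ASCII letter or digit. The sketch below only checks that observation against the sample data. It does not reproduce Commons CSV's logic, the class and method names are hypothetical, and it assumes the backslashes in the sample are literal characters.

```java
/** Checks an observation about the sample rows; not Commons CSV code. */
final class SampleQuotingPattern {

    // First-column values that were quoted in the sample (backslashes doubled for Java).
    static final String[] QUOTED = {
        "[QElmqgucZ", "`K^bPRa\\Xm", "\\rin\\toDOP", "\\x`U^BHnVj",
        "_\\MzHJA]RO", "-UHlnX\\hNu", "[FazYv\\vtd"
    };

    // First-column values that were left unquoted in the sample.
    static final String[] UNQUOTED = {
        "NJ[\\LWwY`Z", "c[n`zOk]qv", "y[KIphm]Bk",
        "McLbuXeP]a", "XslXnTQOEc", "ObGYlN_`g`"
    };

    static boolean isAsciiAlphanumeric(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** True if every quoted row starts non-alphanumeric and every unquoted row starts alphanumeric. */
    static boolean patternHolds() {
        for (String value : QUOTED) {
            if (isAsciiAlphanumeric(value.charAt(0))) {
                return false;
            }
        }
        for (String value : UNQUOTED) {
            if (!isAsciiAlphanumeric(value.charAt(0))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("pattern holds: " + patternHolds());
    }
}
```

If the pattern holds for a larger data set as well, it would suggest that the quoting decision in the version that produced this output depends on the first character of the record rather than on the field content as a whole.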





[jira] [Commented] (CSV-227) first column always quoting when multilingual language, when not on second column

2019-04-09 Thread Jisun, Shin (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813984#comment-16813984
 ] 

Jisun, Shin commented on CSV-227:
-

Sorry, I did not explain it well enough.

I want to strip the quotes, so I set "QuoteMode.MINIMAL".

From the second column onward it works, but the first column is still quoted.








[jira] [Commented] (CSV-227) first column always quoting when multilingual language, when not on second column

2018-12-28 Thread Amit Chaurasia (JIRA)


[ 
https://issues.apache.org/jira/browse/CSV-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16730462#comment-16730462
 ] 

Amit Chaurasia commented on CSV-227:


[~Trichotomy], it works as you expect when you use QuoteMode.ALL:

CSVFormat format = CSVFormat.DEFAULT.withQuoteMode(QuoteMode.ALL);



