Hello Josh,

Is there a mismatch in expectations of what escaping means?

Escaping works one character at a time: the escape character escapes
only the single character that follows it. There are no escape-start
and escape-end sequences.
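
To make that rule concrete, here is a minimal sketch of a record splitter
where the escape character protects exactly one following character. It
is not the actual Commons CSV lexer; the class name EscapeSketch and the
method splitRecords are hypothetical, and field delimiters and quoting
are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeSketch {

    // Splits text into records. The escape character protects only the
    // single character that follows it (hypothetical helper, not Commons CSV).
    public static List<String> splitRecords(String text, char escape) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length()) {
                // Take exactly one following character literally.
                current.append(text.charAt(++i));
            } else if (c == '\r' || c == '\n') {
                // An unescaped CR, LF, or CRLF ends the record.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                records.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        records.add(current.toString());
        return records;
    }

    public static void main(String[] args) {
        // Backslash, then CR LF: only the CR is escaped, so the LF still
        // ends the record and the input splits into two records.
        List<String> records = splitRecords("row,\\\r\ntest", '\\');
        System.out.println(records.size());
    }
}
```

Under this rule, keeping a CRLF inside a value means escaping the CR and
the LF separately.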

Am I missing something?

Gary

On Wed, Jul 17, 2024 at 5:38 PM Josh Bultman
<josh.bult...@pkware.com.invalid> wrote:
>
> Hello,
>
> I’m using commons csv for a component in our application, and I ran into a 
> weird edge case. In our application, we take in CSV files from the file 
> system without knowing the format beforehand. So, I’m writing a method that 
> guesses the CSV format based on column consistency, encountering trailing 
> data, and a few other things. While writing tests, I encountered the fact 
> that commons csv does not escape the full CRLF with a single escape 
> character. For example, if \ is the escape character, 
> row,\\r\ntest will be parsed as:
>
> row\r
> test
>
> Instead of:
>
> row,\r\ntest
>
> Initially this felt like the wrong decision to me, so I created a fix for it. 
> During the regression tests, the testRandomMySql test failed because 
> occasionally a \\r was generated as the last part of a row, which
> also escaped the \n record separator, causing an incorrect number of rows to 
> be read. This made me question whether it’s a good idea at all to escape both 
> the CR and the LF if they’re together, since maybe it’s best to assume that 
> they would be escaped separately like so: \\r\\n. Though, if
> someone were writing a CSV manually on a Windows machine and decided to
> escape a newline, I could see them simply typing \ and then hitting Enter,
> which would give: \\r\n.
>
> I would be interested to hear other people’s thoughts on this. If it’s still 
> something we deem an issue, I can modify the MySQL test and make a PR.
>
> Thank you,
> Josh

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
