On 26/09/2009, at 3:33 AM, Wilson, Ronald wrote:

> the RFC says that everything between the commas is supposed to be  
> part of the field, including white space.  Normally I trim the white  
> space unless it's quoted.

You can certainly offer an option to trim whitespace, change case or
correct spelling mistakes, but only as a separate step after the CSV
import itself has been done correctly.

> Still, the RFC does not address how to handle rows like this:
>
> 1234,abc"123",abc
> 1235,""123,abc
>
> What are you supposed to do with those?  It is not clear.

You should generate an error, in the same way as you would if an XML
element were missing its closing tag or an SQL statement were missing a
closing bracket before the end of the line. Every syntax has a
definition to which the data must conform or be rejected.
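
As a rough illustration of "conform or be rejected", here is a minimal Python sketch of a strict single-line splitter. The names parse_line_strict and CsvSyntaxError are made up for this example, and it assumes one record per line (no line breaks inside quoted fields), so treat it as a sketch of the idea rather than a complete RFC 4180 parser.

```python
class CsvSyntaxError(ValueError):
    """Hypothetical error type: the line does not follow the CSV field grammar."""


def parse_line_strict(line, delimiter=","):
    """Split one CSV record into fields, rejecting malformed quoting."""
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # Quoted field: read up to the closing quote, turning "" into ".
            i += 1
            chars = []
            while True:
                if i >= n:
                    raise CsvSyntaxError("unterminated quoted field")
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        chars.append('"')  # escaped quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                chars.append(line[i])
                i += 1
            # Only a delimiter or the end of the record may follow.
            if i < n and line[i] != delimiter:
                raise CsvSyntaxError(
                    f"unexpected character after closing quote at column {i + 1}"
                )
            fields.append("".join(chars))
        else:
            # Unquoted field: runs to the next delimiter, may not contain a quote.
            j = line.find(delimiter, i)
            end = n if j == -1 else j
            if '"' in line[i:end]:
                raise CsvSyntaxError(
                    f"quote inside unquoted field starting at column {i + 1}"
                )
            fields.append(line[i:end])
            i = end
        if i >= n:
            return fields
        i += 1  # step over the delimiter


for row in ['1234,"123",abc', '1234,abc"123",abc', '1235,""123,abc']:
    try:
        print(parse_line_strict(row))
    except CsvSyntaxError as exc:
        print(f"rejected {row!r}: {exc}")
```

With this sketch the first row is accepted and both rows quoted above are rejected, which is the behaviour I am arguing for. Whether an importer stops at the first bad row or collects all errors is a separate design choice.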

> Also, are you supposed to strip the quotes upon consuming the field?

Yes, in the same way as you strip the XML tags when pulling XML data  
into an array.

> Are you supposed to un-escape escaped quotes?

Yes, that's the point.

> "1234" -> 1234 or "1234" ?

1234

> "15""" -> 15" or 15"" or "15""" or "15"" ?

15"

> Seems to me if you strip quotes, you have to un-escape any escaped  
> quotes in the field.

Correct.
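
For what it's worth, the two cases above can be checked against an existing parser. This small Python sketch uses the standard library csv module (my choice for illustration, not something from the RFC); parse_record is a hypothetical helper name.

```python
import csv


def parse_record(line):
    """Parse a single CSV line with the csv module's default dialect."""
    return next(csv.reader([line]))


# Enclosing quotes are stripped and each doubled quote becomes one quote.
print(parse_record('"1234"'))   # ['1234']
print(parse_record('"15"""'))   # ['15"']
```

The point is only that stripping the enclosing quotes and un-escaping the doubled quotes are done together, in one pass, by the parser.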

> Then there is the matter of white space outside the quotes.  The RFC  
> seems silent on all these issues, though the ABNF grammar implies  
> that white space outside quotes is not tolerated, which could lead  
> to considerable user surprise.

Technically:

1234,abc"123",abc

does not conform to CSV, as answered above. Similarly:

1234,   "123",abc

does not conform, but I think most importers will tolerate white space  
outside the quotes and ignore it.
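
As one example of that kind of tolerance, Python's standard csv module has a skipinitialspace option that ignores spaces directly after a delimiter. A small sketch follows; parse_tolerant is a hypothetical helper name, and the option only covers leading spaces, not spaces after a closing quote.

```python
import csv


def parse_tolerant(line):
    """Parse one CSV line, ignoring spaces that directly follow a delimiter."""
    return next(csv.reader([line], skipinitialspace=True))


row = '1234,   "123",abc'

# Default dialect: the spaces and quotes stay in the field as literal text.
print(next(csv.reader([row])))   # ['1234', '   "123"', 'abc']

# Tolerant: leading spaces are skipped, so the quotes are recognised.
print(parse_tolerant(row))       # ['1234', '123', 'abc']
```

Note that the default behaviour here silently keeps the quotes in the data rather than raising an error, so this is tolerance as an explicit option, not strict conformance.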

Tom
BareFeet

  --
Comparison of SQLite GUI applications:
http://www.tandb.com.au/sqlite/compare/?ml

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
