On 26/09/2009, at 3:33 AM, Wilson, Ronald wrote:

> the RFC says that everything between the commas is supposed to be
> part of the field, including white space. Normally I trim the white
> space unless it's quoted.

You can certainly offer the option to trim whitespace, change case or correct spelling mistakes, but only after doing the CSV import technically correctly.

> Still, the RFC does not address how to handle rows like this:
>
> 1234,abc"123",abc
> 1235,""123,abc
>
> What are you supposed to do with those? It is not clear.

You should generate an error, in the same way as you would if an XML element were missing its closing tag or if SQL were missing a closing bracket before the end of line. Every syntax has a definition to which the data must conform; data that does not conform must be rejected.

> Also, are you supposed to strip the quotes upon consuming the field?

Yes, in the same way as you strip the XML tags when pulling XML data into an array.

> Are you supposed to un-escape escaped quotes?

Yes, that's the point.

> "1234" -> 1234 or "1234" ?

1234

> "15""" -> 15" or 15"" or "15""" or "15"" ?

15"

> Seems to me if you strip quotes, you have to un-escape any escaped
> quotes in the field.

Correct.

> Then there is the matter of white space outside the quotes. The RFC
> seems silent on all these issues, though the ABNF grammar implies
> that white space outside quotes is not tolerated, which could lead
> to considerable user surprise.

Technically:

1234,abc"123",abc

does not conform to CSV, as answered above. Similarly:

1234, "123",abc

does not conform, but I think most importers will tolerate white space outside the quotes and ignore it.

Tom
BareFeet

--
Comparison of SQLite GUI applications:
http://www.tandb.com.au/sqlite/compare/?ml
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
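
As a rough illustration of the rules discussed above (reject malformed rows, strip the enclosing quotes, un-escape doubled quotes, keep white space in unquoted fields), here is a minimal Python sketch. The names parse_record and CSVSyntaxError and the tolerate_outer_space flag are made up for this example and are not part of SQLite or any library. It handles a single record with no embedded line breaks, so treat it as a sketch under those assumptions rather than a complete importer.

```python
class CSVSyntaxError(ValueError):
    """Raised when a record does not conform to the CSV grammar."""


def parse_record(line, tolerate_outer_space=False):
    """Split one CSV record into fields (hypothetical helper, illustration only).

    Assumes the record contains no embedded line breaks.
    """
    fields = []
    i, n = 0, len(line)
    while True:
        if tolerate_outer_space:
            # Lenient mode: look past blanks to see if a quoted field follows.
            j = i
            while j < n and line[j] == " ":
                j += 1
            if j < n and line[j] == '"':
                i = j
        if i < n and line[i] == '"':
            # Quoted field: strip the enclosing quotes, un-escape "" to ".
            i += 1
            chars = []
            while True:
                if i >= n:
                    raise CSVSyntaxError("missing closing quote")
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        chars.append('"')
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                chars.append(line[i])
                i += 1
            if tolerate_outer_space:
                while i < n and line[i] == " ":
                    i += 1
            if i < n and line[i] != ",":
                raise CSVSyntaxError(
                    f"unexpected text after closing quote at column {i + 1}"
                )
            fields.append("".join(chars))
        else:
            # Unquoted field: everything up to the next comma, spaces included.
            end = line.find(",", i)
            if end == -1:
                end = n
            text = line[i:end]
            if '"' in text:
                raise CSVSyntaxError("quote inside unquoted field")
            fields.append(text)
            i = end
        if i >= n:
            return fields
        i += 1  # skip the comma separating this field from the next
```

With the default strict setting, both of Ronald's problem rows and the row with a space before the opening quote raise CSVSyntaxError. Passing tolerate_outer_space=True accepts only the last of those, which mirrors the leniency most importers seem to apply. Trimming white space inside unquoted fields is deliberately left out, since that is an optional step to apply after a correct import.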