Re: [Python-Dev] These csv test cases seem incorrect to me...
Hi Skip,

On 2007-03-12 03:01, [EMAIL PROTECTED] wrote:
> I decided it would be worthwhile to have a csv module written in Python (no
> C underpinnings) for a number of reasons:
>
>     * It will probably be easier to add Unicode support to a Python version
>
>     * More people will be able to read/grok/modify/fix bugs in a Python
>       implementation than in the current mixed Python/C implementation.
>
>     * With alternative implementations of Python available (PyPy,
>       IronPython, Jython) it makes sense to have a Python version they can
>       use.

Lots of good reasons :-)

I've written a Python-only, Unicode-aware CSV module for a client (mostly
because CSV data tends to be quirky and I needed a quick way of dealing
with corner cases). Perhaps I can get them to donate it to the PSF...

> I'm far from having anything which will pass the current test suite, but in
> diagnosing some of my current failures I noticed a couple test cases which
> seem wrong.  In the TestDialectExcel class I see these two questionable
> tests:
>
>     def test_quotes_and_more(self):
>         self.readerAssertEqual('"a"b', [['ab']])
>
>     def test_quote_and_quote(self):
>         self.readerAssertEqual('"a" "b"', [['a "b"']])
>
> It seems to me that if a field starts with a quote it *has* to be a quoted
> field.  Any quotes appearing within a quoted field have to be escaped and
> the field has to end with a quote.  Both of these test cases fail one or the
> other assumption.  If they are indeed both correct and I'm just looking at
> things crosseyed I think they at least deserve comments explaining why they
> are correct.
>
> Both test cases date from the first checkin.  I performed the checkin
> because, of the group developing the module, I believe I was the only one
> with checkin privileges at the time, not because I wrote the test cases.
>
> Any ideas about why these test cases are in there?  I can't imagine Excel
> generating either one.

My recommendation: Let the module do whatever Excel does with such data.
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Mar 14 2007)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/

Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free !
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] These csv test cases seem incorrect to me...
    >> I'm far from having anything which will pass the current test suite,
    >> but in diagnosing some of my current failures I noticed a couple test
    >> cases which seem wrong.  In the TestDialectExcel class I see these
    >> two questionable tests:
    >>
    >>     def test_quotes_and_more(self):
    >>         self.readerAssertEqual('"a"b', [['ab']])
    >>
    >>     def test_quote_and_quote(self):
    >>         self.readerAssertEqual('"a" "b"', [['a "b"']])

    Andrew> The point was to produce the same results as Excel. Sure, Excel
    Andrew> probably doesn't generate crap like this itself, but 3rd parties
    Andrew> do, and people complain if we don't parse it just like Excel
    Andrew> (sigh).

(sigh) indeed.

Thanks,

Skip
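As a side note on the "Excel probably doesn't generate this itself" point: the writer half of the stdlib csv module does not produce either of the questionable inputs. The sketch below only exercises `csv.writer` and `csv.reader` with the default excel dialect; `write_row` and `read_row` are hypothetical helper names made up for this illustration, and nothing here says anything about what Excel itself emits.

```python
import csv
import io


def write_row(fields):
    """Serialize one row with the default 'excel' dialect (illustrative helper)."""
    buf = io.StringIO()
    csv.writer(buf).writerow(fields)
    return buf.getvalue()


def read_row(text):
    """Parse one serialized row back into a list of fields (illustrative helper)."""
    return next(csv.reader([text]))


if __name__ == "__main__":
    # The two field values the questionable tests expect the reader to return.
    for value in ['ab', 'a "b"']:
        line = write_row([value])
        print(repr(line), "->", read_row(line))
```

A field containing the quote character is written fully quoted with the embedded quotes doubled, so the odd forms `"a"b` and `"a" "b"` can only reach the reader from some other producer.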
Re: [Python-Dev] These csv test cases seem incorrect to me...
Andrew McNamara <[EMAIL PROTECTED]> wrote:
> The point was to produce the same results as Excel. Sure, Excel probably
> doesn't generate crap like this itself, but 3rd parties do, and people
> complain if we don't parse it just like Excel (sigh).

The slight problem with copying Excel is that Excel won't parse its own CSV
output.
Re: [Python-Dev] These csv test cases seem incorrect to me...
>I decided it would be worthwhile to have a csv module written in Python (no
>C underpinnings) for a number of reasons:

Several other people have already done this. I will forward you their
e-mail addresses in a separate private e-mail.

>I'm far from having anything which will pass the current test suite, but in
>diagnosing some of my current failures I noticed a couple test cases which
>seem wrong.  In the TestDialectExcel class I see these two questionable
>tests:
>
>def test_quotes_and_more(self):
>    self.readerAssertEqual('"a"b', [['ab']])
>
>def test_quote_and_quote(self):
>    self.readerAssertEqual('"a" "b"', [['a "b"']])
[...]
>Any ideas about why these test cases are in there?  I can't imagine Excel
>generating either one.

The point was to produce the same results as Excel. Sure, Excel probably
doesn't generate crap like this itself, but 3rd parties do, and people
complain if we don't parse it just like Excel (sigh).

--
Andrew McNamara, Senior Developer, Object Craft
http://www.object-craft.com.au/
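For anyone attempting a pure-Python reader, one way to get the lenient results those two tests expect is a small state machine in which text following a closing quote is simply appended to the field as ordinary unquoted text. What follows is a rough sketch under that assumption. It is not Skip's implementation and not a transcription of the C module; `parse_line`, its state names and its `strict` flag are all invented for the illustration, and it handles a single record with no embedded newlines.

```python
def parse_line(line, delimiter=",", quotechar='"', strict=False):
    """Split one CSV record into fields (hypothetical helper, not the stdlib API)."""
    START, UNQUOTED, QUOTED, QUOTE_SEEN = range(4)
    fields, buf, state = [], [], START

    for ch in line:
        if state == START:
            if ch == quotechar:
                state = QUOTED              # field opens with a quote
            elif ch == delimiter:
                fields.append("")           # empty field
            else:
                buf.append(ch)
                state = UNQUOTED
        elif state == UNQUOTED:
            if ch == delimiter:
                fields.append("".join(buf))
                buf, state = [], START
            else:
                buf.append(ch)              # a quote in mid-field is kept literally
        elif state == QUOTED:
            if ch == quotechar:
                state = QUOTE_SEEN          # closing quote, or first half of ""
            else:
                buf.append(ch)
        else:  # QUOTE_SEEN: the previous character was a quote inside a quoted field
            if ch == quotechar:
                buf.append(ch)              # doubled quote -> one literal quote
                state = QUOTED
            elif ch == delimiter:
                fields.append("".join(buf))
                buf, state = [], START
            elif strict:
                raise ValueError("delimiter expected after closing quote")
            else:
                buf.append(ch)              # lenient: carry on as unquoted text
                state = UNQUOTED

    fields.append("".join(buf))             # flush the last field
    return fields


if __name__ == "__main__":
    for text in ['"a"b', '"a" "b"', '"a,b",c', '"a""b"']:
        print(repr(text), "->", parse_line(text))
```

The only lenient part is the last branch of the `QUOTE_SEEN` state. With `strict=True` the sketch instead enforces the rule Skip describes, where a quoted field has to end at its closing quote.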
[Python-Dev] These csv test cases seem incorrect to me...
I decided it would be worthwhile to have a csv module written in Python (no
C underpinnings) for a number of reasons:

    * It will probably be easier to add Unicode support to a Python version

    * More people will be able to read/grok/modify/fix bugs in a Python
      implementation than in the current mixed Python/C implementation.

    * With alternative implementations of Python available (PyPy,
      IronPython, Jython) it makes sense to have a Python version they can
      use.

I'm far from having anything which will pass the current test suite, but in
diagnosing some of my current failures I noticed a couple test cases which
seem wrong.  In the TestDialectExcel class I see these two questionable
tests:

    def test_quotes_and_more(self):
        self.readerAssertEqual('"a"b', [['ab']])

    def test_quote_and_quote(self):
        self.readerAssertEqual('"a" "b"', [['a "b"']])

It seems to me that if a field starts with a quote it *has* to be a quoted
field.  Any quotes appearing within a quoted field have to be escaped and
the field has to end with a quote.  Both of these test cases fail one or the
other assumption.  If they are indeed both correct and I'm just looking at
things crosseyed I think they at least deserve comments explaining why they
are correct.

Both test cases date from the first checkin.  I performed the checkin
because, of the group developing the module, I believe I was the only one
with checkin privileges at the time, not because I wrote the test cases.

Any ideas about why these test cases are in there?  I can't imagine Excel
generating either one.

Thx,

Skip
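The behaviour in question can be checked directly against the stdlib csv module. The snippet below is only a quick illustration: it feeds the two inputs from the tests to `csv.reader` with the default excel dialect, then repeats the read with the dialect's `strict` flag turned on, which is the closest stdlib match to the "a quoted field has to end with a quote" assumption. `read_one` is a hypothetical helper name introduced here.

```python
import csv


def read_one(line, **fmtparams):
    """Parse a single line with csv.reader and return all rows (illustrative helper)."""
    return list(csv.reader([line], **fmtparams))


if __name__ == "__main__":
    samples = ['"a"b', '"a" "b"']

    # Default (lenient) parsing, as exercised by the two tests.
    for text in samples:
        print(repr(text), "->", read_one(text))

    # Strict parsing rejects text after the closing quote.
    for text in samples:
        try:
            read_one(text, strict=True)
        except csv.Error as exc:
            print(repr(text), "rejected in strict mode:", exc)
```

In the default mode the reader returns `[['ab']]` and `[['a "b"']]`, matching the tests; with `strict=True` both inputs raise `csv.Error`.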