[issue44677] CSV sniffing falsely detects space as a delimiter

2021-07-21 Thread Raymond Hettinger
Raymond Hettinger added the comment: Changing the sniffer logic is risky because it could break existing code that relies on the current predictions. FWIW, in your example, the sniffer gets the desired result if given a delimiter hint: >>> s = "a|b\nc| 'd\ne|' f" >>> pprint.pp(dict(vars(Snif
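
The delimiter hint mentioned above can be sketched as follows. This is a minimal illustration, not the exact session from the comment (which is cut off in this archive): it only assumes the documented optional `delimiters` argument of `csv.Sniffer.sniff`, and the variable names `sample`, `dialect`, and `rows` are made up for the example.

```python
import csv

# Sample from the issue: '|' is the intended delimiter.
sample = "a|b\nc| 'd\ne|' f"

# Restrict the candidate delimiters with the optional `delimiters` hint,
# so the sniffer cannot settle on the space character.
dialect = csv.Sniffer().sniff(sample, delimiters="|")
print(repr(dialect.delimiter))

# Parse the sample using only the sniffed delimiter; the quote character
# is left at the csv module default on purpose.
rows = list(csv.reader(sample.splitlines(), delimiter=dialect.delimiter))
print(rows)
```

Passing a hint leaves the sniffer's heuristics untouched, which is why it sidesteps the compatibility concern, but it only helps callers who already know the set of plausible delimiters.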

[issue44677] CSV sniffing falsely detects space as a delimiter

2021-07-20 Thread Roundup Robot
Change by Roundup Robot: keywords: +patch; nosy: +python-dev; nosy_count: 1.0 -> 2.0; pull_requests: +25801; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/27256

[issue44677] CSV sniffing falsely detects space as a delimiter

2021-07-20 Thread Piotr Tokarski
Piotr Tokarski added the comment: I think changing `(?P<quote>["\']).*?(?P=quote)` to `(?P<quote>["\'])[^\n]*?(?P=quote)` in all of the regexes does the trick, doesn't it?
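
The effect of that change can be checked in isolation with `re`. The sketch below is an assumption-laden illustration: the surrounding `delim` and `space` groups and the `re.DOTALL | re.MULTILINE` flags are modelled on the first of the sniffer's patterns rather than copied from the patch, and the names `current`, `proposed`, `before`, and `after` are hypothetical.

```python
import re

# Sample from the issue: '|' is the intended delimiter.
sample = "a|b\nc| 'd\ne|' f"
flags = re.DOTALL | re.MULTILINE

# Assumed shape of one sniffer pattern: delimiter, optional space,
# quoted field, then the same delimiter again.
current = r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\']).*?(?P=quote)(?P=delim)'
# Proposed variant: the quoted field may no longer span a newline.
proposed = r'(?P<delim>[^\w\n"\'])(?P<space> ?)(?P<quote>["\'])[^\n]*?(?P=quote)(?P=delim)'

# With DOTALL, `.*?` runs from the quote on line 2 to the quote on line 3,
# so the space in front of the first quote is reported as the delimiter.
before = re.compile(current, flags).findall(sample)
# With `[^\n]*?` the two stray quotes can no longer be paired up.
after = re.compile(proposed, flags).findall(sample)

print(before)  # [(' ', '', "'")]
print(after)   # []
```

When a pattern finds nothing, no quote-based guess is made from it, so the two unrelated apostrophes stop steering the result toward the space character.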

[issue44677] CSV sniffing falsely detects space as a delimiter

2021-07-20 Thread Piotr Tokarski
Piotr Tokarski added the comment: Test sample: ``` import csv from io import StringIO def csv_text(): return StringIO("a|b\nc| 'd\ne|' f") with csv_text() as input_file: print('The following text is going to be parsed:') print(input_file.read()) print() with csv_text() as

[issue44677] CSV sniffing falsely detects space as a delimiter

2021-07-19 Thread Piotr Tokarski
New submission from Piotr Tokarski: Consider the following CSV content: "a|b\nc| 'd\ne|' f". The real delimiter in this case is the '|' character, but ' ' is sniffed instead. A verbose example is attached. The problem lies in csv.py, in the following code: ``` matches = [] for re