New submission from Skip Montanaro:

I have a CSV file. Here are a few rows:

"2013-10-30 14:26:46.000528","1.36097023829"
"2013-10-30 14:26:46.999755","1.36097023829"
"2013-10-30 14:26:47.999308","1.36097023829"
"2013-10-30 14:26:49.002472","1.36097023829"
"2013-10-30 14:26:50","1.36097023829"
"2013-10-30 14:26:51.000549","1.36097023829"
"2013-10-30 14:26:51.999315","1.36097023829"
"2013-10-30 14:26:52.999703","1.36097023829"
"2013-10-30 14:26:53.999640","1.36097023829"
"2013-10-30 14:26:54.999139","1.36097023829"

I want to parse the strings in the first column as timestamps. I can, and often do, use dateutil.parser.parse(), but in situations like this where all the timestamps are of the same format, it can be incredibly slow. OTOH, there is no single format I can pass to datetime.datetime.strptime() that will parse all the above timestamps. Using "%Y-%m-%d %H:%M:%S" I get errors about the leftover microseconds. Using "%Y-%m-%d %H:%M:%S.%f" I get errors when I try to parse a timestamp which doesn't have microseconds.

Alas, it is datetime itself which is to blame for this problem. The above timestamps were all printed from an earlier Python program which just dumps the str() of a datetime object to its output CSV file. Consider:

>>> dt = dateutil.parser.parse("2013-10-30 14:26:50")
>>> print dt
2013-10-30 14:26:50
>>> dt2 = dateutil.parser.parse("2013-10-30 14:26:51.000549")
>>> print dt2
2013-10-30 14:26:51.000549

The same holds for isoformat():

>>> print dt.isoformat()
2013-10-30T14:26:50
>>> print dt2.isoformat()
2013-10-30T14:26:51.000549

Whatever happened to "be strict in what you send, but generous in what you receive"? If strptime() is going to complain the way it does, then str() should always generate a full timestamp, including microseconds.

The above is from a Python 2.7 session, but I also confirmed that Python 3.3 behaves the same. I've checked 2.7 and 3.3 in the Versions list, but I don't think it can be fixed there.
Can the __str__ and isoformat methods of datetime (and time) objects be modified for 3.4 to always include the microseconds? Alternatively, can the %S format character be modified to consume an optional decimal point and microseconds? I rate this as "easy" considering the easiest fix is to modify __str__ and isoformat, which seems unchallenging.

----------
components: Extension Modules
keywords: easy
messages: 201917
nosy: skip.montanaro
priority: normal
severity: normal
status: open
title: Inconsistency between datetime's str()/isoformat() and its strptime() method
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19475>
_______________________________________
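
Until one of the changes requested above exists, the mismatch can be worked around in user code. The following is only a sketch of one possible approach, not a proposed patch: the helper names parse_timestamp and format_timestamp are hypothetical, and the two format strings are the ones quoted in the report. The reader tries the microsecond format first and falls back to the plain one; the writer uses strftime() with %f so that six fractional digits are always emitted.

```python
from datetime import datetime

# The two formats quoted in the report; helper names below are hypothetical.
FMT_WITH_MICROS = "%Y-%m-%d %H:%M:%S.%f"
FMT_NO_MICROS = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(text):
    """Parse the str() form of a datetime, with or without microseconds."""
    try:
        return datetime.strptime(text, FMT_WITH_MICROS)
    except ValueError:
        # str() omits the fractional part when microsecond == 0,
        # so retry with the format that has no %f.
        return datetime.strptime(text, FMT_NO_MICROS)


def format_timestamp(dt):
    """Write a timestamp that always carries six fractional digits."""
    return dt.strftime(FMT_WITH_MICROS)


if __name__ == "__main__":
    for text in ("2013-10-30 14:26:50", "2013-10-30 14:26:51.000549"):
        dt = parse_timestamp(text)
        print(format_timestamp(dt))
```

If the writing program uses format_timestamp() instead of str(), every row can then be read back with the single "%Y-%m-%d %H:%M:%S.%f" format and the fallback branch is never taken.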