[issue26118] String performance issue using single quotes

2016-01-15 Thread Steven D'Aprano
Steven D'Aprano added the comment: On Fri, Jan 15, 2016 at 07:56:39AM +, poostenr wrote: > As I did more testing I noticed that appending data to the file slowed > down. The file grew initially with ~30-50KB increments and around > 500KB it had slowed down to ~3-5KB/s, until around 1MB the

[issue26118] String performance issue using single quotes

2016-01-15 Thread poostenr
poostenr added the comment: Thank you for your feedback Victor and Steven. I just copied my scripts and 360MB of CSV files over to Linux. The entire process finished in 4 minutes exactly, using the original python scripts. So there is something different between my environments. If it was a

[issue26118] String performance issue using single quotes

2016-01-15 Thread SilentGhost
SilentGhost added the comment: poostenr, this is demonstrably not a problem with CPython, which this bug tracker is about. There are a few options available on the internet if you need help with your code: mailing lists and IRC are among them. -- nosy: +SilentGhost resolution: ->

[issue26118] String performance issue using single quotes

2016-01-15 Thread STINNER Victor
STINNER Victor added the comment: I implemented an overkill optimization in the _PyUnicodeWriter API used by str.format() and str%args. If the result is the input, the string is not copied by value, but by reference.
    >>> x="hello"
    >>> ("%s" % x) is x
    True
    >>> ("{}".format(x)) is x
    True
If the

[issue26118] String performance issue using single quotes

2016-01-15 Thread STINNER Victor
STINNER Victor added the comment: > If you see a factor of 30x difference in your code, I suspect it's not > related to str.format(), but some other processing in your code. The performance of instructions like ("x=%s" % x) or ("x={}".format(x)) depends on the length of the string. Maybe
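The point about string length can be checked with a rough sketch like the one below. The helper name time_format, the chosen lengths and the repeat counts are illustrative assumptions, not taken from the thread, and the absolute numbers will vary by machine and Python build.

```python
import timeit


def time_format(length, number=2_000, repeat=3):
    """Return the best per-call time (seconds) for formatting a string of the given length."""
    x = "a" * length  # hypothetical input; the reporter's real data is not shown
    timer = timeit.Timer('"x={}".format(x)', globals={"x": x})
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Time the same statement for short, medium and long inputs.
timings = {n: time_format(n) for n in (10, 1_000, 100_000)}

for n, seconds in timings.items():
    print(f"len={n:>7}: {seconds * 1e6:.3f} usec per call")
```

Taking the minimum over several repeats reduces noise from other processes; it does not say anything about where a 30x slowdown in a larger script comes from.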

[issue26118] String performance issue using single quotes

2016-01-14 Thread Steven D'Aprano
Steven D'Aprano added the comment: I cannot replicate that performance difference under Linux. There's a small difference (about 0.1 second per million iterations, or a tenth of a microsecond) on my computer, but I don't think that's meaningful:
    py> from timeit import Timer
    py> t1 =

[issue26118] String performance issue using single quotes

2016-01-14 Thread poostenr
poostenr added the comment: Eric, Steven, thank you for your feedback so far. I am using Windows 7, Intel i7. That one particular file of 6.5MB took ~1 minute on my machine. When I ran that same test on Linux with Python 3.5.1, it took about 3 seconds. I was amazed to see a 20x difference.

[issue26118] String performance issue using single quotes

2016-01-14 Thread poostenr
poostenr added the comment: Eric, I just tried your examples. The loop count is 100x more, but the results are about a factor 10 off.
Test1: My results:
    C:\Data>python -m timeit -s 'x=4' '",{0}".format(x)'
    1 loops, best of 3: 0.0116 usec per loop
Eric's results:
    $ python -m timeit
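The same command line is typed at a cmd.exe prompt and at a POSIX shell prompt in this exchange, and the two shells do not share quoting rules. As a small sketch of the POSIX side only, shlex.split from the standard library shows which arguments a POSIX-style shell would hand to python; cmd.exe does not treat single quotes as quoting characters, so the arguments there may differ. Whether that accounts for the numbers above is not established in the snippet.

```python
import shlex

# The command as typed in the thread.
command = """python -m timeit -s 'x=4' '",{0}".format(x)'"""

# shlex.split tokenizes using POSIX shell quoting rules by default.
posix_args = shlex.split(command)

for arg in posix_args:
    print(repr(arg))
```

Running the timing from inside Python with the timeit module takes shell quoting out of the comparison entirely.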

[issue26118] String performance issue using single quotes

2016-01-14 Thread Eric V. Smith
Eric V. Smith added the comment: I see a small difference, but I think it's related to the fact that in the first example you're concatenating 2 strings (',' and the result of {0}), and in the 2nd example it's 3 strings ("'", {0}, "',"):
    $ echo '",{0}".format(x)'
    ",{0}".format(x)
    $ python -m

[issue26118] String performance issue using single quotes

2016-01-14 Thread Eric V. Smith
Eric V. Smith added the comment: Please show us how you're measuring the performance. Also, please show the output of "python -V". -- components: +Interpreter Core -Windows nosy: +eric.smith -paul.moore, steve.dower, tim.golden, zach.ware

[issue26118] String performance issue using single quotes

2016-01-14 Thread ubehera
Changes by ubehera: -- nosy: +ubehera

[issue26118] String performance issue using single quotes

2016-01-14 Thread poostenr
poostenr added the comment: My initial observations with my Python script using:
    s = "{0},".format(columnvalue) # fast
Processed ~360MB of data from 2:16PM - 2:51PM (35 minutes, ~10MB/min). One particular file 6.5MB took ~1 minute. When I changed this line of code to:
    s =

[issue26118] String performance issue using single quotes

2016-01-14 Thread poostenr
New submission from poostenr: There appears to be a significant performance issue between the following two statements. Unable to explain performance impact.
    s = "{0},".format(columnvalue) # fast
    s = "'{0}',".format(columnvalue) # ~30x slower
So far, no luck trying to find other statements
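The two statements can be timed in isolation with a minimal sketch like this one. The value assigned to columnvalue and the helper best_time are stand-ins for illustration; the reporter's actual data and script are not shown in the thread.

```python
import timeit

# Hypothetical stand-in for the reporter's column value.
columnvalue = "example"


def best_time(stmt, number=50_000, repeat=5):
    """Return the best per-call time in seconds for stmt over several runs."""
    timer = timeit.Timer(stmt, globals={"columnvalue": columnvalue})
    return min(timer.repeat(repeat=repeat, number=number)) / number


plain = best_time('"{0},".format(columnvalue)')
quoted = best_time('"\'{0}\',".format(columnvalue)')

print(f"plain : {plain * 1e6:.3f} usec per call")
print(f"quoted: {quoted * 1e6:.3f} usec per call")
print(f"ratio : {quoted / plain:.2f}x")
```

If the ratio stays close to 1 on the affected machine, the slowdown is more likely to come from another part of the script than from the format call.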

[issue26118] String performance issue using single quotes

2016-01-14 Thread poostenr
poostenr added the comment: Eric, Steven, During further testing I was not able to find any real evidence that the statement I was focused on had a real performance issue. As I did more testing I noticed that appending data to the file slowed down. The file grew initially with ~30-50KB
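One way to check whether appending slows down as the output file grows is to time fixed-size batches of writes and record the file size after each batch. The sketch below is an assumption-laden illustration, not the reporter's script: the function name measure_append_rate, the batch sizes and the line format are all made up for the example.

```python
import os
import tempfile
import time


def measure_append_rate(path, batches=5, lines_per_batch=20_000):
    """Append batches of lines to path; return (file_size_bytes, seconds) for each batch."""
    results = []
    for _ in range(batches):
        start = time.perf_counter()
        # Reopen in append mode for each batch, as a script that appends repeatedly might do.
        with open(path, "a", encoding="utf-8") as f:
            for i in range(lines_per_batch):
                f.write("'{0}',\n".format(i))
        elapsed = time.perf_counter() - start
        results.append((os.path.getsize(path), elapsed))
    return results


with tempfile.TemporaryDirectory() as tmp:
    samples = measure_append_rate(os.path.join(tmp, "out.csv"))

for size, elapsed in samples:
    print(f"{size / 1024:8.0f} KB on disk, batch took {elapsed * 1000:.1f} ms")
```

If batch times stay roughly flat as the file grows, the file append itself is probably not the bottleneck; a steady increase would point toward the I/O path or the environment.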