poostenr added the comment:

Thank you for your feedback Victor and Steven.

I just copied my scripts and 360MB of CSV files over to Linux.
The entire process finished in 4 minutes exactly, using the original python 
scripts.
So there is something different between my environments.
If it was a fragmentation issue, then I would expect to always have a slow 
performance on the Windows system. But I can influence the performance by 
alternating between the two original statements:
s = "{0},".format(columnvalue)   # fast
s = "'{0}',".format(columnvalue) # ~30x slower

I apologize for not being able to provide the entire code.
There is too much code to post at this time.

I am opening a file like this:
#logger = open(filename, rw, buffering, encoding)
logger = open('output.sql', 'a', 1, 'iso-8859-1')

I write to file:
logger.write(text+'\n')

I'm using a library to escape the string before saving to file.
import pymysql.converters as conv
<...>
for key in listkeys:
    keyvalue = self.recordstats[key]
    fieldtype   = keyvalue[0]
    columnvalue = record[key]
    columnvalue = conv.escape_string(columnvalue)
    if (count > 1):
        s = "{0},".format(columnvalue)  # No single quotes
    else
        s = "{0},".format(columnvalue)  # No single quotes
    count -= 1
    logger.write(s+'\n')

I appreciate the feedback and ideas so far.
Trying the profiler is on my list to see if it provides more insight.
I am not using Anaconda3 on Linux. Perhaps that has an impact somehow?

I never suspected inserting the two single quotes to cause such a problem in 
performance. I noticed it when I parsed ~40GB of data and it took almost a week 
to complete instead of my expected 6-7 hrs.
Just the other day I decided to remove the single quotes because it was the 
only thing left that I'd changed. I had discarded that change the past two 
weeks because that couldn't be causing the performance problem.

Today, I wasn't expecting such a big difference between running my script on 
Linux or Windows.

If I discover anything else, I will post an update.
When I get the chance I can remove redundant code and post the source.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26118>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to