On 04/03/2013 04:22 AM, Neil Hodgson wrote:
> rusi:
>> Can you please try one more experiment Neil?
>> Knock off all non-ASCII strings (paths) from your dataset and try
>> again.
>
> Results are the same 0.40 (well, 0.001 less but I don't think the
> timer is that accurate) for Python 3.2 and 0.78 for Python 3.3.
>
> Neil

That would seem to imply that the speed regression on your data is NOT
caused by the differing size encodings. Perhaps it is the difference in
MSC compiler version, or other changes made between 3.2 and 3.3.
Of course, I can't then explain why Steven didn't get the same results.
Perhaps the difference between 32-bit and 64-bit Python on Windows? Or
perhaps you have significantly more (or significantly fewer)
"collisions" than Steven did.
Before I saw this message, I was thinking of suggesting that you supply
a key= parameter to sort, with the key built by replacing each character
with the Unicode character whose code point is 65536 higher. That way
every character in the keys to be sorted would need 32 bits under 3.3's
string representation, while the sort order stays the same. If this made
the timings change noticeably, it could be a big clue.
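
Something along these lines is roughly what I had in mind. It is only a sketch: the name widen and the sample paths are made up for illustration and are not Neil's actual data, and it assumes every character in the input is below U+10000 (otherwise chr() would raise ValueError for the largest code points).

```python
import timeit

def widen(s):
    # Shift every code point up by 0x10000 (65536) so that no character
    # in the key fits in 8 or 16 bits. The shift is the same for every
    # character, so relative ordering of the strings is unchanged.
    return "".join(chr(ord(c) + 0x10000) for c in s)

# Hypothetical sample data; substitute the real list of paths here.
paths = [
    "/usr/lib/python3/os.py",
    "/usr/lib/python3/abc.py",
    "/home/neil/notes.txt",
] * 1000

# Time a plain sort against a sort on the widened keys.
plain = timeit.timeit(lambda: sorted(paths), number=5)
keyed = timeit.timeit(lambda: sorted(paths, key=widen), number=5)
print("plain sort: %.3f s" % plain)
print("widened keys: %.3f s" % keyed)
```

Note that the second timing also includes the cost of building the keys. To isolate the comparisons themselves, one could build the widened list once up front and time a plain sort of that list instead.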
--
DaveA