On Wed, 18 Feb 2009 07:08:04 +1100, Jervis Whitley wrote:
>> This moves the for-loop out of slow Python into fast C and should be
>> much, much faster for very large input.
>>
>
> _Should_ be faster.

Yes, Python's timing results are often unintuitive.

> Here is my test on an XP system Python 2.5.4. I had similar results on
> python 2.7 trunk.
...
> **no vowels**
> any: [0.36063678618957751, 0.36116506191682773, 0.36212355395824081]
> for: [0.24044885376801672, 0.2417684017413404, 0.24084797257163482]

I get similar results.

...
> **BIG word vowel 'U' final char**
> any: [8.0007259193539895, 7.9797344140269644, 7.8901742633514012]
> for: [7.6664422372764101, 7.6784683633957584, 7.6683055766498001]

Well, I did say "for very large input". 10000 chars isn't "very large" --
that's only 9K. Try this instead:

>>> BIGWORD = 'g' * 500000 + 'U'  # less than 500K of text
>>>
>>> Timer("for_test(BIGWORD)", setup).repeat(number=1000)
[4.7292280197143555, 4.633030891418457, 4.6327309608459473]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=1000)
[4.7717428207397461, 4.6366970539093018, 4.6367099285125732]

The difference is not significant. What about bigger?

>>> BIGWORD = 'g' * 5000000 + 'U'  # less than 5MB
>>>
>>> Timer("for_test(BIGWORD)", setup).repeat(number=100)
[4.8875839710235596, 4.7698030471801758, 4.769787073135376]
>>> Timer("any_test(BIGWORD)", setup).repeat(number=100)
[4.8555209636688232, 4.8139419555664062, 4.7710208892822266]

It seems to me that I was mistaken -- for large enough input, the running
time of each version converges to approximately the same speed. What
happens when you have hundreds of megabytes, I don't know.

-- 
Steven
--
http://mail.python.org/mailman/listinfo/python-list
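The setup code and the two test functions are not quoted anywhere in this
message, so here is a minimal sketch of the benchmark that is consistent
with the numbers being discussed. The bodies of for_test and any_test are
an assumption, not the code actually posted earlier in the thread:

from timeit import Timer

setup = '''
VOWELS = 'aeiouAEIOU'

def for_test(word):
    # Explicit Python for-loop: stop at the first vowel found.
    for c in word:
        if c in VOWELS:
            return True
    return False

def any_test(word):
    # any() consumes the generator at C speed, but the "c in VOWELS"
    # membership test still runs as Python bytecode for every character.
    return any(c in VOWELS for c in word)

BIGWORD = 'g' * 500000 + 'U'   # less than 500K of text
'''

print(Timer("for_test(BIGWORD)", setup).repeat(number=1000))
print(Timer("any_test(BIGWORD)", setup).repeat(number=1000))

If the functions look anything like this, the convergence above is what
you'd expect: both versions pay a Python-level membership test for every
character, so on large strings that per-character cost swamps the fixed
overhead of building the generator and calling any().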