Re: [IronPython] More Performance comparisons - dictionary updates and tuples
Dino, this all sounds good. FWIW, I think we can get some advantage in a couple of places in our code by switching from range to xrange. General dictionary performance is critical, of course.

Michael Foord
http://www.ironpythoninaction.com/

Dino Viehland wrote:
> BTW, I did get a chance to look at this, and I believe I have solutions for
> these issues, including a new and improved tuple hash. It is an interesting
> test case, though...
>
> So, looking at the scenarios here:
>
> Tuple create and unpack - we're actually not much slower at the
> actual tuple create/unpack here, but we are slower. We can detect the
> common parallel-assignment case and optimize it into temporary variables
> rather than creating an object array, a tuple, and an enumerator for the
> tuple, and then pulling out the individual values. Once I did that, the
> times seemed comparable. Switching from range to xrange then gave IronPython
> a speed advantage - so range apparently has more overhead than in CPython,
> and with such a small loop it starts to show up.
>
> On the dictionary performance - the new dictionaries really favor reading
> over writing. Previously, in 1.x, we required locking on both reads and
> writes to dictionaries. The new dictionaries are thread-safe for one writer
> and multiple readers. The trade-off is that our resizes are more expensive.
> Unfortunately, they were more expensive than they needed to be - we would
> rehash everything on resize. And update would just copy everything from one
> dictionary to another, potentially resizing multiple times along the way and
> doing tons of unnecessary hashing. It's easy enough to make update
> pre-allocate the final size and also avoid rehashing all the values. That
> gets us competitive again.
>
> The ints also got sped up by this change, to the point where they're close
> to 1.x, but they're not quite the same speed. This seems to mostly come back
> to the different performance characteristics of our new dictionaries and the
> resizing issue.
>
> There's probably still some more tuning to be done on our dictionaries over
> time. We're also not going to be beating CPython's highly tuned dictionaries
> anytime soon, but at least we should be much more competitive now. I haven't
> got the changes checked in yet, but hopefully over the next few days they'll
> make it in. Then I might have some more performance info to share, as I'll
> get runs against a large number of tests.
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Foord
> Sent: Wednesday, April 16, 2008 10:33 AM
> To: Discussion of IronPython
> Subject: [IronPython] More Performance comparisons - dictionary updates and tuples
>
> > Hello guys,
> >
> > I've been looking at performance in Resolver One. (Object creation in
> > IronPython seems to be really good, whereas dictionary lookups are not so
> > good when we compare against CPython.)
> >
> > It turns out that we are getting bitten quite badly by the performance
> > of hashing tuples (fixing this for IP 1.1.2 would be *great*). I did
> > some profiling and have some comparisons - and can also show a
> > regression in performance in IP 2 (Beta 1) when using ints as dictionary
> > keys in an update operation. I thought I would post the results, as they
> > may be useful.
> >
> > (Dictionary update in IP 1 with tuple keys is an order of magnitude
> > slower than CPython. So is IP 2, but it is still twice as fast as IP 1 -
> > though two times *worse* than IP 1 for tuple creation and unpacking.)
> >
> > Results first:
> >
> > CPython
> > e:\Dev>timeit1.py
> > tuple_create_and_unpack took 220.56131 ms
> > dict_update took 541.000127792 ms
> >
> > IP 1.1.1
> > e:\Dev>e:\Dev\ironpython1\ipy.exe timeit1.py
> > tuple_create_and_unpack took 680.9792 ms
> > dict_update took 7891.3472 ms
> >
> > IP 2 Beta 1
> > e:\Dev>e:\Dev\ironpython2\ipy.exe timeit1.py
> > tuple_create_and_unpack took 1341.9296 ms
> > dict_update took 4756.84 ms
> >
> > If we switch to using integers rather than tuples for the dictionary
> > keys, the performance changes:
> >
> > CPython
> > e:\Dev>timeit1.py
> > tuple_create_and_unpack took 200.47684 ms
> > dict_update took 230.46594 ms
> >
> > IP 1.1.1
> > e:\Dev>e:\Dev\ironpython1\ipy.exe timeit1.py
> > tuple_create_and_unpack took 911.3104 ms
> > dict_update took 420.6048 ms
> >
> > IP 2 Beta 1
> > e:\Dev>e:\Dev\ironpython2\ipy.exe timeit1.py
> > tuple_create_and_unpack took 971.3968 ms
> > dict_update took 1582.2752 ms
> >
> > With ints as keys, IP 1 is only half the speed of CPython - but IP 2 is
> > four times slower than IP 1!
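Michael's range-to-xrange suggestion above comes down to list allocation: on Python 2, range() materializes the entire list before the loop runs, while xrange() produces one value at a time. A minimal sketch of the change (the function names are mine, and the Python 3 alias exists only so the snippet runs on either version):

```python
# Python 2's range() builds the whole list up front; xrange() yields
# values lazily. On Python 3, range() is already lazy, so alias it
# so this sketch runs on either version.
try:
    xrange
except NameError:
    xrange = range

def sum_eager(n):
    total = 0
    for i in range(n):   # allocates a list of n ints on Python 2
        total += i
    return total

def sum_lazy(n):
    total = 0
    for i in xrange(n):  # constant memory: one value at a time
        total += i
    return total

assert sum_eager(1000) == sum_lazy(1000) == 499500
```

The results are identical; the difference is purely the temporary list, which is the overhead Dino observes IronPython paying more for than CPython.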
Re: [IronPython] More Performance comparisons - dictionary updates and tuples
BTW, I did get a chance to look at this, and I believe I have solutions for these issues, including a new and improved tuple hash. It is an interesting test case, though...

So, looking at the scenarios here:

Tuple create and unpack - we're actually not much slower at the actual tuple create/unpack here, but we are slower. We can detect the common parallel-assignment case and optimize it into temporary variables rather than creating an object array, a tuple, and an enumerator for the tuple, and then pulling out the individual values. Once I did that, the times seemed comparable. Switching from range to xrange then gave IronPython a speed advantage - so range apparently has more overhead than in CPython, and with such a small loop it starts to show up.

On the dictionary performance - the new dictionaries really favor reading over writing. Previously, in 1.x, we required locking on both reads and writes to dictionaries. The new dictionaries are thread-safe for one writer and multiple readers. The trade-off is that our resizes are more expensive. Unfortunately, they were more expensive than they needed to be - we would rehash everything on resize. And update would just copy everything from one dictionary to another, potentially resizing multiple times along the way and doing tons of unnecessary hashing. It's easy enough to make update pre-allocate the final size and also avoid rehashing all the values. That gets us competitive again.

The ints also got sped up by this change, to the point where they're close to 1.x, but they're not quite the same speed. This seems to mostly come back to the different performance characteristics of our new dictionaries and the resizing issue.

There's probably still some more tuning to be done on our dictionaries over time. We're also not going to be beating CPython's highly tuned dictionaries anytime soon, but at least we should be much more competitive now.
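The two dictionary fixes described above - caching each key's hash so a resize never re-hashes, and having update() pre-allocate the final size rather than resizing repeatedly - can be illustrated with a toy separate-chaining table. This is a sketch under my own naming (ToyDict and its methods are hypothetical), not IronPython's actual implementation:

```python
# Toy separate-chaining hash table. Each entry stores (hash, key, value),
# so _resize() redistributes entries without calling hash() again, and
# update() grows the table once, up front, to the final size.
# Illustrative sketch only - not IronPython's real dictionary code.

class ToyDict(object):
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]
        self.count = 0

    def _insert(self, h, key, value):
        bucket = self.buckets[h % len(self.buckets)]
        for i, (eh, ek, ev) in enumerate(bucket):
            if eh == h and ek == key:
                bucket[i] = (h, key, value)   # overwrite existing key
                return
        bucket.append((h, key, value))
        self.count += 1
        if self.count > 2 * len(self.buckets):
            self._resize(2 * len(self.buckets))

    def _resize(self, new_size):
        old = self.buckets
        self.buckets = [[] for _ in range(new_size)]
        for bucket in old:
            for h, key, value in bucket:
                # reuse the stored hash: no hash(key) call on resize
                self.buckets[h % new_size].append((h, key, value))

    def __setitem__(self, key, value):
        self._insert(hash(key), key, value)

    def __getitem__(self, key):
        h = hash(key)
        for eh, ek, ev in self.buckets[h % len(self.buckets)]:
            if eh == h and ek == key:
                return ev
        raise KeyError(key)

    def update(self, other):
        # Pre-size once for the final element count, instead of
        # resizing (and, without stored hashes, rehashing) many times.
        needed = self.count + other.count
        if needed > 2 * len(self.buckets):
            self._resize(2 * needed)
        for bucket in other.buckets:
            for h, key, value in bucket:
                self._insert(h, key, value)  # reuses other's stored hash
```

update() never calls hash() at all: it reuses the hashes stored in the source table, which is the "avoid rehashing all the values" part of the fix.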
I haven't got the changes checked in yet, but hopefully over the next few days they'll make it in. Then I might have some more performance info to share, as I'll get runs against a large number of tests.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Foord
Sent: Wednesday, April 16, 2008 10:33 AM
To: Discussion of IronPython
Subject: [IronPython] More Performance comparisons - dictionary updates and tuples

[snip - Michael's original message and results are quoted in full elsewhere in this thread]
Re: [IronPython] More Performance comparisons - dictionary updates and tuples
I'll have to look at these closer, but I suspect fixing dict update performance will be a trivial change. We currently go through an overly generic code path which works for any IDictionary. Unfortunately, when used against a PythonDictionary, we're going through a code path which makes a copy of all the members into a list, gets an enumerator for that list, and then copies the members into the new dictionary. Not exactly the fastest thing in the world, and likely a regression caused directly by the new dictionary implementation...

The tuple hashing is better in Beta 1 than it was before (due to your feedback! :)) but it's still not perfect. I ran into a test case in the standard library the other day which verifies that the number of collisions is below 15, or something like that, and we were still getting thousands of collisions in that test. So that still deserves another iteration.

The regressions with ints and tuple assignment certainly deserve a good investigation as well - hopefully it's just something silly that got broken. :) I'll try and get to these tomorrow, but if I don't, I'll open a bug just so it doesn't get lost.

Are there any other issues that you or anyone else would like to see for 1.1.2? We currently have only 2 other issues up on CodePlex marked for 1.1.2.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Foord
Sent: Wednesday, April 16, 2008 10:33 AM
To: Discussion of IronPython
Subject: [IronPython] More Performance comparisons - dictionary updates and tuples

[snip - Michael's original message, results, and code are quoted in full elsewhere in this thread]

_______________________________________________
Users mailing list
Users@lists.ironpython.com
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com
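Dino's point about collision counts can be explored with a short script: hash every (x, y) pair in a grid, like the benchmark's dictionary keys, and count how many tuples end up sharing a hash value. The helper name and grid size here are my own; this is not the standard library test he refers to:

```python
# Rough collision check in the spirit of the stdlib test mentioned above:
# hash all (x, y) pairs in an n-by-n grid and count duplicated hashes.
# "Collisions" here means total keys minus distinct hash values.
def tuple_hash_collisions(n):
    keys = [(x, y) for x in range(n) for y in range(n)]
    distinct = len(set(hash(k) for k in keys))
    return len(keys) - distinct

collisions = tuple_hash_collisions(100)
# A well-spread tuple hash keeps this near zero; a poor one
# (imagine hash((x, y)) == x ^ y) collides thousands of times
# on this grid, which is the behavior Dino describes fixing.
```

The exact count is implementation-dependent, which is precisely why a test like the one Dino mentions pins it below a small threshold.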
Re: [IronPython] More Performance comparisons - dictionary updates and tuples
Simon Dahlbacka wrote:
>> from random import random
>>
>> try:
>>     import clr
>>     from System import DateTime
>>
>>     def timeit(func):
>>         start = DateTime.Now
>>         func()
>>         end = DateTime.Now
>>         print func.__name__, 'took %s ms' % (end - start).TotalMilliseconds
>
> Just a small nitpick, or whatever, but you might want to consider using the
> System.Diagnostics.Stopwatch class, as it "Provides a set of methods
> and properties that you can use to accurately measure elapsed time."

Sure - DateTime has a granularity of 10-15 ms, which I thought was accurate enough for this.

Thanks

Michael

> i.e.
>
> try:
>     import clr
>     from System.Diagnostics import Stopwatch
>
>     def timeit(func):
>         watch = Stopwatch()
>         watch.Start()
>         func()
>         watch.Stop()
>         print func.__name__, 'took %s ms' % watch.Elapsed.TotalMilliseconds
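The 10-15 ms DateTime granularity Michael mentions can be estimated empirically for any clock by spinning until its reported value changes. A sketch - the helper name is mine, and time.time() stands in here for DateTime.Now:

```python
import time

def estimate_granularity(clock, samples=10):
    # Spin until the clock's value changes and keep the smallest
    # observed jump - a rough estimate of the clock's tick size.
    smallest = None
    for _ in range(samples):
        t0 = clock()
        t1 = clock()
        while t1 == t0:
            t1 = clock()
        step = t1 - t0
        if smallest is None or step < smallest:
            smallest = step
    return smallest

granularity_ms = estimate_granularity(time.time) * 1000
print('time.time() ticks roughly every %s ms' % granularity_ms)
```

If the timed function runs for hundreds of milliseconds, as in the benchmark here, a 10-15 ms tick is indeed in the noise; Stopwatch matters more for sub-millisecond measurements.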
Re: [IronPython] More Performance comparisons - dictionary updates and tuples
> from random import random
>
> try:
>     import clr
>     from System import DateTime
>
>     def timeit(func):
>         start = DateTime.Now
>         func()
>         end = DateTime.Now
>         print func.__name__, 'took %s ms' % (end - start).TotalMilliseconds

Just a small nitpick, or whatever, but you might want to consider using the System.Diagnostics.Stopwatch class, as it "Provides a set of methods and properties that you can use to accurately measure elapsed time."

i.e.

try:
    import clr
    from System.Diagnostics import Stopwatch

    def timeit(func):
        watch = Stopwatch()
        watch.Start()
        func()
        watch.Stop()
        print func.__name__, 'took %s ms' % watch.Elapsed.TotalMilliseconds
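Folding Simon's suggestion into the try/except structure of the original script gives a harness that uses Stopwatch under IronPython and falls back to time.time() elsewhere. A sketch - note the class is spelled Stopwatch in System.Diagnostics, the clr branch only runs under IronPython/.NET, and returning the function's result is my own addition:

```python
# Portable timing harness: System.Diagnostics.Stopwatch under
# IronPython (per Simon's suggestion), time.time() under CPython.
# print() is called in its parenthesized form so this parses on
# both Python 2 and Python 3.
try:
    import clr
    from System.Diagnostics import Stopwatch

    def timeit(func):
        watch = Stopwatch()
        watch.Start()
        result = func()
        watch.Stop()
        print('%s took %s ms' % (func.__name__,
                                 watch.Elapsed.TotalMilliseconds))
        return result
except ImportError:
    import time

    def timeit(func):
        start = time.time()
        result = func()
        elapsed_ms = (time.time() - start) * 1000
        print('%s took %s ms' % (func.__name__, elapsed_ms))
        return result
```

Usage is unchanged from Michael's script: timeit(tuple_create_and_unpack), timeit(dict_update).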
[IronPython] More Performance comparisons - dictionary updates and tuples
Hello guys,

I've been looking at performance in Resolver One. (Object creation in IronPython seems to be really good, whereas dictionary lookups are not so good when we compare against CPython.)

It turns out that we are getting bitten quite badly by the performance of hashing tuples (fixing this for IP 1.1.2 would be *great*). I did some profiling and have some comparisons - and can also show a regression in performance in IP 2 (Beta 1) when using ints as dictionary keys in an update operation. I thought I would post the results, as they may be useful.

(Dictionary update in IP 1 with tuple keys is an order of magnitude slower than CPython. So is IP 2, but it is still twice as fast as IP 1 - though two times *worse* than IP 1 for tuple creation and unpacking.)

Results first:

CPython
e:\Dev>timeit1.py
tuple_create_and_unpack took 220.56131 ms
dict_update took 541.000127792 ms

IP 1.1.1
e:\Dev>e:\Dev\ironpython1\ipy.exe timeit1.py
tuple_create_and_unpack took 680.9792 ms
dict_update took 7891.3472 ms

IP 2 Beta 1
e:\Dev>e:\Dev\ironpython2\ipy.exe timeit1.py
tuple_create_and_unpack took 1341.9296 ms
dict_update took 4756.84 ms

If we switch to using integers rather than tuples for the dictionary keys, the performance changes:

CPython
e:\Dev>timeit1.py
tuple_create_and_unpack took 200.47684 ms
dict_update took 230.46594 ms

IP 1.1.1
e:\Dev>e:\Dev\ironpython1\ipy.exe timeit1.py
tuple_create_and_unpack took 911.3104 ms
dict_update took 420.6048 ms

IP 2 Beta 1
e:\Dev>e:\Dev\ironpython2\ipy.exe timeit1.py
tuple_create_and_unpack took 971.3968 ms
dict_update took 1582.2752 ms

With ints as keys, IP 1 is only half the speed of CPython - but IP 2 is four times slower than IP 1!
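The ratios claimed in the prose can be sanity-checked directly against the reported timings. All numbers below are copied from the results; only the variable names are mine:

```python
# Timings (ms) copied from the results above, tuple-key case.
cpy_dict, ip1_dict, ip2_dict = 541.000127792, 7891.3472, 4756.84
ip1_tuple, ip2_tuple = 680.9792, 1341.9296

assert ip1_dict / cpy_dict > 10          # "an order of magnitude slower"
assert 1.5 < ip1_dict / ip2_dict < 2.0   # IP 2 "twice as good as IP 1"
assert 1.9 < ip2_tuple / ip1_tuple < 2.1 # "two times *worse*" at tuples

# Timings (ms) for the int-key case.
cpy_int, ip1_int, ip2_int = 230.46594, 420.6048, 1582.2752

assert 1.5 < ip1_int / cpy_int < 2.0     # IP 1 "half the speed of CPython"
assert 3.5 < ip2_int / ip1_int < 4.0     # IP 2 "four times slower than IP 1"
```

Every claim in the message holds up against its own numbers; the largest gap is the ~14.6x tuple-key dict_update slowdown in IP 1.1.1.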
The code used - which runs under both CPython and IronPython:

from random import random

try:
    import clr
    from System import DateTime

    def timeit(func):
        start = DateTime.Now
        func()
        end = DateTime.Now
        print func.__name__, 'took %s ms' % (end - start).TotalMilliseconds
except ImportError:
    import time

    def timeit(func):
        start = time.time()
        func()
        end = time.time()
        print func.__name__, 'took %s ms' % ((end - start) * 1000)

def tuple_create_and_unpack():
    for val in range(100):
        a, b = val, val + 1

d1 = {}
for x in range(100):
    for y in range(100):
        d1[x, y] = random()

d2 = {}
for x in range(1000):
    for y in range(1000):
        d2[x, y] = random()

def dict_update():
    d1.update(d2)

timeit(tuple_create_and_unpack)
timeit(dict_update)

Michael Foord
http://www.ironpythoninaction.com/