Re: [racket-dev] Racket hash tables vs. Python dicts - Performance

Jake Eakle Fri, 25 Jul 2014 05:41:37 -0700

This is probably not responsible for much of the performance difference,
but the racket code has (random 23) where it looks like it ought to be
(random 26), resulting in a smaller space of possible random keys. This
will cause the branch with the lookup to happen somewhat more often than in
the python version.



On Fri, Jul 25, 2014 at 1:34 AM, Laurent <[email protected]> wrote:

> Even more idiomatic:
> (time (for ([w words])
>         (hash-update! d w add1 0)))
>
> But it seems to be slightly slower that Robby's version.
>
> Laurent
>
>
> On Thu, Jul 24, 2014 at 11:26 AM, Robby Findler <
> [email protected]> wrote:
>
>> Not an answer to your direction question, but this is the more
>> idiomatic way to write that and it seems to be a bit faster:
>>
>> (time (for ([w (in-list words)])
>>         (hash-set! d w (add1 (hash-ref d w 0)))))
>>
>> On Wed, Jul 23, 2014 at 10:54 AM, Pedro Ramos
>> <[email protected]> wrote:
>> > Hi,
>> >
>> > I've been developing an implementation of Python in Racket, where I'm
>> > implementing Python's dictionaries over Racket custom hash tables.
>> >
>> > While my occasional benchmarks typically show better performance on
>> Racket
>> > programs than their Python equivalents, Racket's hash tables generally
>> seem
>> > to be slower than Python's dicts.
>> >
>> > I've set up this benchmark in Racket:
>> >
>> >
>> > #lang racket
>> >
>> > (define alphabet "abcdefghijklmnopqrstuvwxyz")
>> > (define (random-word n)
>> >   (build-string n (lambda (x) (string-ref alphabet (random 23)))))
>> >
>> > (define words (for/list ([k 1000000])
>> >                 (random-word 3)))
>> > (define d (make-hash))
>> >
>> > (time (for ([w words])
>> >         (if (hash-has-key? d w)
>> >             (hash-set! d w (add1 (hash-ref d w)))
>> >             (hash-set! d w 1))))
>> >
>> >
>> > And its equivalent in Python:
>> >
>> >
>> > import random
>> > import time
>> >
>> > alphabet = "abcdefghijklmnopqrstuvwxyz"
>> > def random_word(n):
>> >   return ''.join([random.choice(alphabet) for i in range(n)])
>> >
>> > words = [random_word(3) for k in xrange(1000000)]
>> > d = {}
>> >
>> > a = time.time()
>> > for w in words:
>> >   if w in d:
>> >     d[w] = d[w] + 1
>> >   else:
>> >     d[w] = 1
>> > b = time.time()
>> > print b-a, 'seconds'
>> >
>> >
>> > The Racket example yields running times of around 500 ms (running on
>> Racket
>> > v6.0.1) while the Python example yields running times of around 330 ms
>> > (running on Python 2.7.3).
>> >
>> > I find this unusual because Python is somewhat more dynamic than Racket,
>> > since
>> > (a) Python's equality and hashing functions have to dispatched at
>> runtime
>> > for each key;
>> > (b) referencing and setting values in a Python dict is done using a very
>> > general operator, [], whose behaviour also has to be dispatched at
>> runtime,
>> > unlike the more specific hash-ref and hash-set! Racket functions.
>> >
>> > Is there something I'm missing about Racket's hash tables which explains
>> > this slower speed?
>> >
>> > Thanks,
>> > Pedro Ramos
>> >
>> > _________________________
>> >   Racket Developers list:
>> >   http://lists.racket-lang.org/dev
>> >
>> _________________________
>>   Racket Developers list:
>>   http://lists.racket-lang.org/dev
>>
>
>
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev
>
>


-- 
A warb degombs the brangy. Your gitch zanks and leils the warb.

_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev

Re: [racket-dev] Racket hash tables vs. Python dicts - Performance

Reply via email to