This is probably not responsible for much of the performance difference,
but the racket code has (random 23) where it looks like it ought to be
(random 26), resulting in a smaller space of possible random keys. This
will cause the branch with the lookup to happen somewhat more often than in
the python version.


On Fri, Jul 25, 2014 at 1:34 AM, Laurent <laurent.ors...@gmail.com> wrote:

> Even more idiomatic:
> (time (for ([w words])
>         (hash-update! d w add1 0)))
>
> But it seems to be slightly slower that Robby's version.
>
> Laurent
>
>
> On Thu, Jul 24, 2014 at 11:26 AM, Robby Findler <
> ro...@eecs.northwestern.edu> wrote:
>
>> Not an answer to your direction question, but this is the more
>> idiomatic way to write that and it seems to be a bit faster:
>>
>> (time (for ([w (in-list words)])
>>         (hash-set! d w (add1 (hash-ref d w 0)))))
>>
>> On Wed, Jul 23, 2014 at 10:54 AM, Pedro Ramos
>> <pedropra...@tecnico.ulisboa.pt> wrote:
>> > Hi,
>> >
>> > I've been developing an implementation of Python in Racket, where I'm
>> > implementing Python's dictionaries over Racket custom hash tables.
>> >
>> > While my occasional benchmarks typically show better performance on
>> Racket
>> > programs than their Python equivalents, Racket's hash tables generally
>> seem
>> > to be slower than Python's dicts.
>> >
>> > I've set up this benchmark in Racket:
>> >
>> >
>> > #lang racket
>> >
>> > (define alphabet "abcdefghijklmnopqrstuvwxyz")
>> > (define (random-word n)
>> >   (build-string n (lambda (x) (string-ref alphabet (random 23)))))
>> >
>> > (define words (for/list ([k 1000000])
>> >                 (random-word 3)))
>> > (define d (make-hash))
>> >
>> > (time (for ([w words])
>> >         (if (hash-has-key? d w)
>> >             (hash-set! d w (add1 (hash-ref d w)))
>> >             (hash-set! d w 1))))
>> >
>> >
>> > And its equivalent in Python:
>> >
>> >
>> > import random
>> > import time
>> >
>> > alphabet = "abcdefghijklmnopqrstuvwxyz"
>> > def random_word(n):
>> >   return ''.join([random.choice(alphabet) for i in range(n)])
>> >
>> > words = [random_word(3) for k in xrange(1000000)]
>> > d = {}
>> >
>> > a = time.time()
>> > for w in words:
>> >   if w in d:
>> >     d[w] = d[w] + 1
>> >   else:
>> >     d[w] = 1
>> > b = time.time()
>> > print b-a, 'seconds'
>> >
>> >
>> > The Racket example yields running times of around 500 ms (running on
>> Racket
>> > v6.0.1) while the Python example yields running times of around 330 ms
>> > (running on Python 2.7.3).
>> >
>> > I find this unusual because Python is somewhat more dynamic than Racket,
>> > since
>> > (a) Python's equality and hashing functions have to dispatched at
>> runtime
>> > for each key;
>> > (b) referencing and setting values in a Python dict is done using a very
>> > general operator, [], whose behaviour also has to be dispatched at
>> runtime,
>> > unlike the more specific hash-ref and hash-set! Racket functions.
>> >
>> > Is there something I'm missing about Racket's hash tables which explains
>> > this slower speed?
>> >
>> > Thanks,
>> > Pedro Ramos
>> >
>> > _________________________
>> >   Racket Developers list:
>> >   http://lists.racket-lang.org/dev
>> >
>> _________________________
>>   Racket Developers list:
>>   http://lists.racket-lang.org/dev
>>
>
>
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev
>
>


-- 
A warb degombs the brangy. Your gitch zanks and leils the warb.
_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev

Reply via email to