Raymond Hettinger added the comment: Reopening this with a better patch that nicely tightens the inner loop without an algorithmic change or any downside.
Currently, the entry computation adds a constant (i+j), masks it (& mask) to wrap on overflow, scales it to a table index (<< 4), and then does the key lookup (one memory access). These steps are sequential (not parallelizable). The patch moves the wrap-on-overflow decision out of the inner loop, adding a single, fast, predictable branch (i + LINEAR_PROBES > mask). The inner loop then simplifies to an add, fetch, and test (no masking and shifting).

The generated code on Clang is very tight:

    LBB29_26:
        cmpq    $0, (%rsi)      # entry->key == NULL
        je      LBB29_30
        incq    %rdx            # j++
        addq    $16, %rsi       # entry++ (done in parallel with the incq)
        cmpq    $9, %rdx        # j <= LINEAR_PROBES
        jbe     LBB29_26

On gcc-4.9, the loop is unrolled and we get even better code:

        leaq    (%rbx,%rsi), %rdx    # entry = table[i]
        cmpq    $0, (%rdx)           # entry->key == NULL
        je      L223
        leaq    16(%rbx,%rsi), %rdx  # entry = table[i+1]
        cmpq    $0, (%rdx)           # entry->key == NULL
        je      L223
        leaq    32(%rbx,%rsi), %rdx  # entry = table[i+2]
        cmpq    $0, (%rdx)
        je      L223

----------
resolution: rejected -> 
status: closed -> open
Added file: http://bugs.python.org/file37857/tight2.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23269>
_______________________________________