On Tue, Nov 12, 2013 at 09:48:23AM +0100, Armin Rigo wrote:

Hi Armin,

> Ok, but then it seems that the first problem we hit down this road, before
> arriving at the __contains__ performance, is actually mixing ints and
> floats in the same list.  We need to do something to store such lists
> efficiently.  I'm suggesting using a list of floats with a custom encoding
> of N-bit integers into NaN values (for N==32 or maybe a higher number that
> still fits).  This requires a bit of tweaking here and there (maybe all the
> way to the JIT) but shouldn't be too hard.

This is something I have been thinking about too. I can't pretend that I've
thought of every in and out, but my current thinking is that it needs to be
alongside an IntegerListStrategy. If we stored every list with numbers as
floats, lists which only store ints (which we know are very common [1]) would
double in storage size. The JIT would also then be disadvantaged, as
currently it can trivially prove that every item coming out of an integer
list is an integer, allowing it to remove many type checks; in an int+float
list, it will have to add a type check for every item that is extracted
(maybe changes to the JIT can reduce the overhead of this). So, if we decide
to do this, I think we would still want IntegerListStrategy -- with some
variant of the patch to 'find' I proposed -- alongside an
IntegerFloatListStrategy.


Laurie

[1] See table 3:
  
http://tratt.net/laurie/research/pubs/html/bolz_diekmann_tratt__storage_strategies_for_collections_in_dynamically_typed_languages/
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
https://mail.python.org/mailman/listinfo/pypy-dev

Reply via email to