I have started doing practice creating C extensions for CPython, so
here are two ideas I have had, possibly useless.
If you keep adding elements to a CPython dict/set, it periodically
rebuilds itself. So maybe dict.reserve(n) and a set.reserve(n) methods
may help, reserving enough (empty) memory for about n *distinct* keys
the programmer wants to add to the dict/set in a short future. I have
seen that the the C API of the dicts doesn't allow this, and I don't
know if this can be implemented modifying the dicts a bit. Do you think
this may be useful?
-------------------------------
Sometime I need to remove some elements from dicts/sets. If I know a
rule that tells what elements have to be removed I can use code like
this:
import itertools
def filterset(predicate, inset):
return set(itertools.ifilter(predicate, inset))
print filterset(lambda x: x & 1, set(range(10)))
Output:
set([1, 3, 9, 5, 7])
And a similar one to filter dicts that works with a predicate(key,
val).
Most of the times such functions are good enough, but sometimes the
dicts are big, so to reduce memory used I remove keys in place:
def filterdict(pred, indict):
todel = [k for k,v in indict.iteritems() if not pred(k,v)]
for key in todel:
del indict[key]
d = dict.fromkeys(xrange(8), 0)
print d
filterdict(lambda k,v: k & 1, d)
print d
# Output:
# {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0}
# {1: 0, 3: 0, 5: 0, 7: 0}
But doing the same thing while iterating on the dict may be faster and
use even less memory.
This iteration&deletion capability is probably not safe enough to be
used inside Python programs, but a compiled C module with a function
that works like that filterdict (and deletes while iterating) may be
created, and its use is safe from Python programs.
The C++ STL API of maps allows to delete items while iterating on the
map (but probably it has to be done only when necessary, because it may
be a source for bugs):
typedef map<int, string> isMap;
typedef isMap::value_type isValType;
typedef isMap::iterator isMapItor;
void main() {
isMap m;
... // items added to m
for(isMapItor itr = m.begin(); itr != m.end();) {
if(itr->second == "somestring")
m.erase(itr++);
else
++itr;
}
The Python/C API, 7.4.1 Dictionary Objects, says that's not possible
with CPython dicts:
>The dictionary p should not be mutated during iteration. It is safe (since
>Python 2.1) to modify the values of the keys as you iterate over the
>dictionary, but only so long as the set of keys does not change.<
Do you think it may be useful to create to create such C filterdict
function that deletes while iterating? (To create such function it
probably has to bypass the CPython dict API).
Bye,
bearophile
--
http://mail.python.org/mailman/listinfo/python-list