[issue31265] Remove doubly-linked list from C OrderedDict

2017-11-09 Thread STINNER Victor
Change by STINNER Victor : -- nosy: +haypo ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python

[issue31265] Remove doubly-linked list from C OrderedDict

2017-12-15 Thread Raymond Hettinger
Change by Raymond Hettinger : -- stage: -> resolved status: open -> closed ___ Python tracker ___ ___ Python-bugs-list mailing list

[issue31265] Remove doubly-linked list from C OrderedDict

2017-12-15 Thread STINNER Victor
STINNER Victor added the comment: Dividing the memory consumption by two is quite impressive. Raymond: Would you mind to explain why you close the issue? -- ___ Python tracker ___

[issue31265] Remove doubly-linked list from C OrderedDict

2017-12-15 Thread INADA Naoki
INADA Naoki added the comment: Since normal dict preserves insertion order, OrderedDict usage will be decreased. So I don't think 2x memory consumption is big issue. BTW, functools.lru_cache uses custom doubly-linked list which is much memory inefficient. Rewriting it to use OrderedDict will r

[issue31265] Remove doubly-linked list from C OrderedDict

2017-12-15 Thread STINNER Victor
STINNER Victor added the comment: Writing code compatible with Python older than 3.5 (like Python 2.7) requires to use OrderedDict. I expect that very few users will use "dict if sys.version_info >= (3, 6) else OrderedDict" to get an ordered dict. OrderedDict remains useful when you need the

[issue31265] Remove doubly-linked list from C OrderedDict

2017-12-15 Thread Raymond Hettinger
Raymond Hettinger added the comment: > Raymond: Would you mind to explain why you close the issue? On python-dev, Inada suggested he would drop this issue if Guido approved regular dicts having guaranteed order. That occurred and I closed this issue. The other reasons were listed earlier in

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
New submission from INADA Naoki: Since dict preserves insertion order, doubly linked list in OrderedDict can be removed. There is small performance improvement for odict creation: $ curl https://api.github.com/orgs/python/repos > repos.json $ ./py-patched -m perf timeit --compare-to `pwd`/py-de

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
Changes by INADA Naoki : -- pull_requests: +3232 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.p

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
INADA Naoki added the comment: There are some failing tests remaining, and I want to discuss about some of them here. Traceback (most recent call last): File "/home/inada-n/work/python/nodebug/Lib/test/test_ordered_dict.py", line 261, in test_pop self.assertEqual(m.pop('a', default=6), 6)

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread Raymond Hettinger
Changes by Raymond Hettinger : -- nosy: +rhettinger ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https://mai

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread Raymond Hettinger
Raymond Hettinger added the comment: I would be very careful going down this path. Regular dict ordering is not yet guaranteed and is subject to change. Its design was primarily about compaction and the ordering was a side-effect. In contrast, the data structure for collections.OrderedDict(

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I like the idea. Actually I wanted to write such patch myself, but this is very delicate thing and needs to be very careful. The largest benefit is not just memory saving and performance, but robustness. Currently it is easy to went OrderedDict in incorrect

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
INADA Naoki added the comment: I think typical usage is: get, set (incl, creating), and iterate. * get: there is no difference * set: inserting new key is little faster since no updating linked list * iterate: in typical case, new odict is faster because current odict iterator do lookup for eac

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
INADA Naoki added the comment: od.move_to_end() is slower too: $ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import OrderedDict as odict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("a"); od.move_to_end("b")' py-default: ...

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
INADA Naoki added the comment: Another idea is making "dict preserves insertion order" as language spec. There are still some difference between dict and odict: .move_to_end() and __eq__. But most use cases of odict is just keep insertion order. For example, parsing config file, json, or csv. I

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What about move_to_end(last=False)? -- ___ Python tracker ___ ___ Python-bugs-list mailing list Un

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-23 Thread INADA Naoki
INADA Naoki added the comment: When no change happens: $ ./py-patched -m perf timeit --compare-to `pwd`/py-default -s 'from collections import Ordedict; od = odict.fromkeys("abcdefghijklmnopqrstuvwxyz")' -- 'od.move_to_end("a", 0)'

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-24 Thread INADA Naoki
Changes by INADA Naoki : -- nosy: +eric.snow ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.pytho

[issue31265] Remove doubly-linked list from C OrderedDict

2017-08-26 Thread Raymond Hettinger
Raymond Hettinger added the comment: I'll bring this up at the language sprints. Last year, there was an explicit decision to not go down this path. Here are a few thoughts for now: It is a big deal to change the API. We try hard not to do that (breaking code for no reason and making it mor

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-04 Thread Eric Snow
Eric Snow added the comment: > For now, this proposal is on hold because > 1) it isn't clear that it should be done, > 2) it needs a lot of serious discussion before proceeding, > 3) it may be premature while the status of the regular dict > is still in flux. +1 When writing the C implementatio

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-04 Thread INADA Naoki
INADA Naoki added the comment: FYI, my approach is not original, but copied from PyPy. https://bitbucket.org/pypy/pypy/src/3e52029e9a5a677f7de62ef49f36090465cfbf4c/rpython/rtyper/lltypesystem/rordereddict.py?at=default&fileviewer=file-view-default#rordereddict.py-1437:1551 -- _

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread Raymond Hettinger
Raymond Hettinger added the comment: Based on the python-dev discussion, can we close this now? -- ___ Python tracker ___ ___ Python-

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread Raymond Hettinger
Raymond Hettinger added the comment: Just for the record, here is the draft of the post I was going to make on python-dev but didn't prove to be necessary. - I think would be a mistake to replace the implementation of collections.OrderedDict() with

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread INADA Naoki
INADA Naoki added the comment: The discussion on [1] was for removing pure Python implementation, not about changing C implementation. [1]: https://mail.python.org/pipermail/python-dev/2017-September/149147.html While I withdrawed my suggestion about removing pure Python implementation, I still

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread INADA Naoki
INADA Naoki added the comment: > Just for the record, here is the draft of the post I was going to make on > python-dev but didn't prove to be necessary. Thank you for write down your thought. For move_to_end(), I admit new behavior is *amortized* O(1) and current behavior is *worst-case* O(1)

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread Raymond Hettinger
Raymond Hettinger added the comment: ISTM that what is being proposed is an algorithmically flawed re-implementation of the ordered dictionary. I'm unclear about whether you understand and acknowledge why the doubly-linked list was chosen and what kind of workloads it supports (we didn't choo

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread INADA Naoki
INADA Naoki added the comment: > I'm unclear about whether you understand and acknowledge why the > doubly-linked list was chosen and what kind of workloads it supports (we > didn't choose it because it was either convenient or fun, we chose it because > it was an algorithmically correct way

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Note that mixed insertion and deletion is worst-case O(n) in current implementation. Original Raymond's design didn't preserve ordering during deletion. It had worst-case O(1) for mixed insertion and deletion. I didn't follow the numerous discussions about

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread INADA Naoki
INADA Naoki added the comment: > Original Raymond's design didn't preserve ordering during deletion. Original Raymond's pure Python implementation rebuilds index table. Without it, proving can be very long or infinite loop. See L89-92 in http://code.activestate.com/recipes/578375/ Strictly speak

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-10 Thread Armin Rigo
Armin Rigo added the comment: I would side with Inada in thinking they both give the same amortized complexity, but beyond that, benchmarks are the real answer. There is little value in keeping the current implementation of OrderedDict *if* benchmarks show that it is rarely faster. -

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-11 Thread Eric Snow
Eric Snow added the comment: On Sun, Sep 10, 2017 at 10:27 PM, Serhiy Storchaka wrote: > Note that mixed insertion and deletion is worst-case O(n) in current > implementation. Could you elaborate? Note that every operation of the current implementation matches the complexity of the Python imp

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-12 Thread INADA Naoki
INADA Naoki added the comment: > Eric Snow added the comment: > > On Sun, Sep 10, 2017 at 10:27 PM, Serhiy Storchaka > wrote: >> Note that mixed insertion and deletion is worst-case O(n) in current >> implementation. > > Could you elaborate? Note that every operation of the current > implement

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-12 Thread INADA Naoki
INADA Naoki added the comment: > I would side with Inada in thinking they both give the same amortized > complexity, but beyond that, benchmarks are the real answer. There is little > value in keeping the current implementation of OrderedDict *if* benchmarks > show that it is rarely faster.

[issue31265] Remove doubly-linked list from C OrderedDict

2017-09-12 Thread Eric Snow
Eric Snow added the comment: > It means rebuilding hash table to clean up dummy entries. > So, even when dict size is not increasing, remove + insert loop has > worst case O(n), amortized O(1) complexity. Ah, so it matches the pure Python implementation then. -- ___