On Sat, 25 Oct 2008 21:53:10 +0000, Lie Ryan wrote: > On Sat, 25 Oct 2008 09:21:05 +0000, Steven D'Aprano wrote: > >> On Sat, 25 Oct 2008 08:58:18 +0000, Lie Ryan wrote: >> >>> <anotherrandommusing> >>> Since python is dynamic language, I think it should be possible to do >>> something like this: >>> >>> a = list([1, 2, 3, 4, 5], implementation = 'linkedlist') b = >>> dict({'a': 'A'}, implementation = 'binarytree') c = dict({'a': 'A'}, >>> implementation = 'binarytree') >> >> Oh I hope not. I think you have mistaken "dynamic" for "chaotic". >> >> When I see a dict, I want to know that any two dicts work the same way. > > Oh no, the two dict implementation would work _exactly_ the same from > the outside, they are transparently interchangeable. Only the > performance characteristic differs because of the different > implementation.
Exactly. That was my point. [...] >> I don't want to have to search the entire project's source code to find >> out if it is a dict implemented as a hash table with O(1) lookups, or a >> dict implemented as a binary tree with O(log N) lookups, or a dict >> implemented as a linear array with O(N) lookups. > > No, you'd only need to look at the dict's creation point (or actually > much better at projects docs, but not everyone create good docs). And how do you find an arbitrary object's creation point without searching the project's source code? >> If I wanted that sort of nightmare, I can already do it by shadowing >> the builtin: >> >> dict = binarytree >> D = dict({'a': 'A'}) # make a binary tree > > I DON'T want THAT sort of nightmare you mentioned... And it'd be > impossible to have two dictionary that have two different > implementations. Nonsense. dict = binarytree D1 = dict({'a': 'A'}) # make a binary tree "dict" dict = __builtin__.dict D2 = dict({'a': 'A'}) # make a standard dict dict = someothertype D3 = dict({'a': 'A'}) I'm not suggesting this is a good idea. This is a terrible idea. But it is not much worse than your idea: D1 = dict({'a': 'A'}, implementation='binarytree') D2 = dict({'a': 'A'}, implementation='dict') D3 = dict({'a': 'A'}, implementation='someothertype') >> There is no possible good that come from this suggestion. The beauty of >> Python is that the built-in data structures (list, dict, set) are >> powerful enough for 99% of uses[1], and for the other 1%, you can >> easily and explicitly use something else. > > Oh really? As far as I know, python's list is extremely bad if you're > inserting data at the beginning of the list And how often do you do that? And when you do, use a deque. Just call it a deque. [...] >> But *explicitly* is the point. There's never any time where you can do >> this: > > Yes, true, explicitly IS the point. How more explicit can you be than: > dict({foo: bar}, implementation = 'binarytree') > >> type(mydict) is dict You miss the point. With your plan, you can do this: D1 = dict({foo: bar}, implementation = 'binarytree') D2 = dict({foo: bar}, implementation = 'dict') type(D1) is type(D2) and yet D1 and D2 have UTTERLY different performance characteristics. So now you need to add ANOTHER test to distinguish dicts-which-are-dicts from dicts-which-are-binary-trees: D1.implementation != D2.implementation And why? So you can avoid calling a goose a goose, and call it a duck instead. > If my memory serves right, binary tree dict and hashed table dict is > both a dict right? (Duck Typing) > Only their implementation differs. Implementation is... well, > "implementation detail". Duck typing refers to *interface*, not implementation. I have no problem with you using a type with the same interface as a dict. That's what duck typing is all about. Just don't call it a dict! >> and not know exactly what performance characteristics mydict will have. > > Oh... why do I need to know what the performance characteristic of > mydict is? Unless I know what I'm doing. http://www.joelonsoftware.com/articles/fog0000000319.html Because when you do this: mydict[key] = 1 it's important whether each dict lookup is O(1), O(log N) or O(N). For a dict with one million items, that means that an implementation based on a binary tree does O(20) times more processing than a dict, and an implementation based on linear searching does O(1000000) times more processing. If you think implementation details don't matter, try this: s1 = 'c'*(10**6) versus s2 = '' for i in xrange(10**6): s2 = 'c' + s2 # defeat optimizer >> (Unless you shadow dict or type, or otherwise do something that breaks >> the rules.) You never need to ask, "Okay, it's a dict. What sort of >> dict?" > > Okay, it's a dict. What sort of dict? Who the hell cares? If you don't care, then why are you specifying the implementation type? mydict = dict({'foo': 'bar'}, implementation="surprise me!") You can't have it both ways. If you care, then you know enough to want a hash table based dict (the standard) or a binary tree or something else. So go ahead and use whatever data structure you want. Just don't call it a dict. But if you don't care, then just use the Python standard data types. > I don't need > to know, they all looks and behave the same (Duck Typing)... Until you try a simple operation like len(mydict) and it takes three minutes because the implementation you choose is really inefficient. > Sometimes we need a data type to use a specific data structure that have > some specific performance characteristic, because we know we'll be doing > a specific operation a lot more than other operations. I'm not arguing that you should never use any other data structure. Go right ahead and use them all you like. >> If you want a binary tree, ask for a binary tree. > > Yeah, you ask for binary tree EXPLICITLY: bintreedict = dict({a:b}, > implementation = 'binarytree') Or you could do it the right way: D1 = binarytree(data) D2 = dict(data) type(D1), type(D2) => returns binarytree, dict versus: D1 = dict(data, implementation='binarytree') D2 = dict(data) type(D1), type(D2) => returns dict, dict -- Steven -- http://mail.python.org/mailman/listinfo/python-list