Re: Why operations between dict views return a set and not a frozenset?
Thank you a lot for letting me understand :) On Tue, 11 Jan 2022 at 22:09, Peter J. Holzer wrote: > On 2022-01-11 19:49:20 +0100, Marco Sulla wrote: > > I think this is what you mean: > > > > >>> dis.dis("for _ in {1, 2}: pass") > > 1 0 SETUP_LOOP 12 (to 14) > > 2 LOAD_CONST 3 (frozenset({1, 2})) > > 4 GET_ITER > > >>6 FOR_ITER 4 (to 12) > > 8 STORE_NAME 0 (_) > > 10 JUMP_ABSOLUTE6 > > >> 12 POP_BLOCK > > >> 14 LOAD_CONST 2 (None) > > 16 RETURN_VALUE > > >>> a = {1, 2} > > >>> dis.dis("for _ in a: pass") > > 1 0 SETUP_LOOP 12 (to 14) > > 2 LOAD_NAME0 (a) > > 4 GET_ITER > > >>6 FOR_ITER 4 (to 12) > > 8 STORE_NAME 1 (_) > > 10 JUMP_ABSOLUTE6 > > >> 12 POP_BLOCK > > >> 14 LOAD_CONST 0 (None) > > 16 RETURN_VALUE > > I think you have omitted the part that Chris was hinting at. > > >>> dis.dis("a = {1, 2};\nfor _ in a: pass") > 1 0 LOAD_CONST 0 (1) > 2 LOAD_CONST 1 (2) > 4 BUILD_SET2 > 6 STORE_NAME 0 (a) > > 2 8 LOAD_NAME0 (a) > 10 GET_ITER > >> 12 FOR_ITER 4 (to 18) > 14 STORE_NAME 1 (_) > 16 JUMP_ABSOLUTE 12 > >> 18 LOAD_CONST 2 (None) > 20 RETURN_VALUE > > Now compare > > 2 LOAD_CONST 3 (frozenset({1, 2})) > > with > > 1 0 LOAD_CONST 0 (1) > 2 LOAD_CONST 1 (2) > 4 BUILD_SET2 > > and you see the difference between using a frozenset as a constant and > building a set at runtime. > > hp > > -- >_ | Peter J. Holzer| Story must make more sense than reality. > |_|_) || > | | | h...@hjp.at |-- Charles Stross, "Creative writing > __/ | http://www.hjp.at/ | challenge!" > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On 2022-01-11 19:49:20 +0100, Marco Sulla wrote: > I think this is what you mean: > > >>> dis.dis("for _ in {1, 2}: pass") > 1 0 SETUP_LOOP 12 (to 14) > 2 LOAD_CONST 3 (frozenset({1, 2})) > 4 GET_ITER > >>6 FOR_ITER 4 (to 12) > 8 STORE_NAME 0 (_) > 10 JUMP_ABSOLUTE6 > >> 12 POP_BLOCK > >> 14 LOAD_CONST 2 (None) > 16 RETURN_VALUE > >>> a = {1, 2} > >>> dis.dis("for _ in a: pass") > 1 0 SETUP_LOOP 12 (to 14) > 2 LOAD_NAME0 (a) > 4 GET_ITER > >>6 FOR_ITER 4 (to 12) > 8 STORE_NAME 1 (_) > 10 JUMP_ABSOLUTE6 > >> 12 POP_BLOCK > >> 14 LOAD_CONST 0 (None) > 16 RETURN_VALUE I think you have omitted the part that Chris was hinting at. >>> dis.dis("a = {1, 2};\nfor _ in a: pass") 1 0 LOAD_CONST 0 (1) 2 LOAD_CONST 1 (2) 4 BUILD_SET2 6 STORE_NAME 0 (a) 2 8 LOAD_NAME0 (a) 10 GET_ITER >> 12 FOR_ITER 4 (to 18) 14 STORE_NAME 1 (_) 16 JUMP_ABSOLUTE 12 >> 18 LOAD_CONST 2 (None) 20 RETURN_VALUE Now compare 2 LOAD_CONST 3 (frozenset({1, 2})) with 1 0 LOAD_CONST 0 (1) 2 LOAD_CONST 1 (2) 4 BUILD_SET2 and you see the difference between using a frozenset as a constant and building a set at runtime. hp -- _ | Peter J. Holzer| Story must make more sense than reality. |_|_) || | | | h...@hjp.at |-- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!" signature.asc Description: PGP signature -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, Jan 12, 2022 at 5:49 AM Marco Sulla wrote: > > Ok... so I suppose, since you're inviting me to use dis and look at the > bytecode, that are you talking about constants in assembly, so const in C? > Sorry for the confusion, I'm not so skilled in C and I know nearly nothing > about assembly. Furthermore I never look at the bytecode of any language > before, so I simply didn't understand you. > No, I'm talking about constants in Python. > I think this is what you mean: > > >>> dis.dis("for _ in {1, 2}: pass") > 1 0 SETUP_LOOP 12 (to 14) > 2 LOAD_CONST 3 (frozenset({1, 2})) This is a constant. > 4 GET_ITER > >>6 FOR_ITER 4 (to 12) > 8 STORE_NAME 0 (_) > 10 JUMP_ABSOLUTE6 > >> 12 POP_BLOCK > >> 14 LOAD_CONST 2 (None) This is a constant. > 16 RETURN_VALUE > >>> a = {1, 2} > >>> dis.dis("for _ in a: pass") > 1 0 SETUP_LOOP 12 (to 14) > 2 LOAD_NAME0 (a) > 4 GET_ITER > >>6 FOR_ITER 4 (to 12) > 8 STORE_NAME 1 (_) > 10 JUMP_ABSOLUTE6 > >> 12 POP_BLOCK > >> 14 LOAD_CONST 0 (None) This is a constant. > 16 RETURN_VALUE > Try the same thing with other code and see whether you can see a difference. In other words, *play around with dis.dis*. If you're trying to hack on the internals of the CPython dictionary implementation and do not understand how Python bytecode is executed, you are doomed to make many MANY errors of judgment about performance. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
Ok... so I suppose, since you're inviting me to use dis and look at the bytecode, that are you talking about constants in assembly, so const in C? Sorry for the confusion, I'm not so skilled in C and I know nearly nothing about assembly. Furthermore I never look at the bytecode of any language before, so I simply didn't understand you. I think this is what you mean: >>> dis.dis("for _ in {1, 2}: pass") 1 0 SETUP_LOOP 12 (to 14) 2 LOAD_CONST 3 (frozenset({1, 2})) 4 GET_ITER >>6 FOR_ITER 4 (to 12) 8 STORE_NAME 0 (_) 10 JUMP_ABSOLUTE6 >> 12 POP_BLOCK >> 14 LOAD_CONST 2 (None) 16 RETURN_VALUE >>> a = {1, 2} >>> dis.dis("for _ in a: pass") 1 0 SETUP_LOOP 12 (to 14) 2 LOAD_NAME0 (a) 4 GET_ITER >>6 FOR_ITER 4 (to 12) 8 STORE_NAME 1 (_) 10 JUMP_ABSOLUTE6 >> 12 POP_BLOCK >> 14 LOAD_CONST 0 (None) 16 RETURN_VALUE On Tue, 11 Jan 2022 at 01:05, Chris Angelico wrote: > On Tue, Jan 11, 2022 at 10:26 AM Marco Sulla > wrote: > > > > On Wed, 5 Jan 2022 at 23:02, Chris Angelico wrote: > > > > > > On Thu, Jan 6, 2022 at 8:01 AM Marco Sulla < > marco.sulla.pyt...@gmail.com> wrote: > > > > > > > > On Wed, 5 Jan 2022 at 14:16, Chris Angelico > wrote: > > > > > That's an entirely invisible optimization, but it's more than just > > > > > "frozenset is faster than set". It's that a frozenset or tuple can > be > > > > > stored as a function's constants, which is a massive difference. > > > > > > > > Can you explain this? > > > > > > Play around with dis.dis and timeit. > > > > ? I don't understand. You're talking about function constants. What > > are they? I can't dig deep into something if I can't know what it is. > > Maybe are you talking about function default values for parameters? > > No, I'm talking about constants. Every function has them. > > > Of course. You can use a proxy and slow down almost everything much > > more. Or you can simply create a version of the mutable object with > > fewer methods, as more or less frozenset is. I checked the > > implementation, no fast iteration is implemented. I do not understand > > why in `for x in {1, 2, 3}` the set is substituted by a frozenset. > > Constants. Like I said, play around with dis.dis, and explore what's > already happening. A set can't be a constant, a frozenset can be. > Constants are way faster than building from scratch. > > Explore. Play around. I'm not going to try to explain everything in detail. > > If you're delving into the details of the C implementation of the > dictionary, I would have expected you'd already be familiar with the > way that functions behave. > > ChrisA > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Tue, Jan 11, 2022 at 10:26 AM Marco Sulla wrote: > > On Wed, 5 Jan 2022 at 23:02, Chris Angelico wrote: > > > > On Thu, Jan 6, 2022 at 8:01 AM Marco Sulla > > wrote: > > > > > > On Wed, 5 Jan 2022 at 14:16, Chris Angelico wrote: > > > > That's an entirely invisible optimization, but it's more than just > > > > "frozenset is faster than set". It's that a frozenset or tuple can be > > > > stored as a function's constants, which is a massive difference. > > > > > > Can you explain this? > > > > Play around with dis.dis and timeit. > > ? I don't understand. You're talking about function constants. What > are they? I can't dig deep into something if I can't know what it is. > Maybe are you talking about function default values for parameters? No, I'm talking about constants. Every function has them. > Of course. You can use a proxy and slow down almost everything much > more. Or you can simply create a version of the mutable object with > fewer methods, as more or less frozenset is. I checked the > implementation, no fast iteration is implemented. I do not understand > why in `for x in {1, 2, 3}` the set is substituted by a frozenset. Constants. Like I said, play around with dis.dis, and explore what's already happening. A set can't be a constant, a frozenset can be. Constants are way faster than building from scratch. Explore. Play around. I'm not going to try to explain everything in detail. If you're delving into the details of the C implementation of the dictionary, I would have expected you'd already be familiar with the way that functions behave. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, 5 Jan 2022 at 23:02, Chris Angelico wrote: > > On Thu, Jan 6, 2022 at 8:01 AM Marco Sulla > wrote: > > > > On Wed, 5 Jan 2022 at 14:16, Chris Angelico wrote: > > > That's an entirely invisible optimization, but it's more than just > > > "frozenset is faster than set". It's that a frozenset or tuple can be > > > stored as a function's constants, which is a massive difference. > > > > Can you explain this? > > Play around with dis.dis and timeit. ? I don't understand. You're talking about function constants. What are they? I can't dig deep into something if I can't know what it is. Maybe are you talking about function default values for parameters? > > > Function positional arguments aren't interchangeable, so it makes > > > sense to have them as a tuple. > > > > You are wrong, since kwarg is a dict. Indeed I proposed to use > > frozendict for kwargs, and Guido said that it's a pity that this will > > break a lot of existing Python code :D, since the fact that args is > > _immutable_ and kwargs not always bothered him. > > Excuse me? I mentioned kwargs in the part that you removed from the > quote, and the part you're quoting explicitly says "positional > arguments". Ok, I quote also the other part: > (Function *keyword* arguments, on the other hand, are different; as > long as the mapping from keys to values is maintained, you can remove > some of them and pass the rest on, without fundamentally changing > their meaning.) First of all, I repeat, Guido said (more or less) that in a perfect world, kwargs are immutable. Or maybe I did not understand what he said, maybe he said that in a perfect world also args are mutable. But I suppose it's more probable the first hypothesis :D Secondly, you can also get the args from a function, transform it in a list, change something and pass it unpacked to another function. You will not change the meaning of the tuple, since, well, you copied it in another mutable object. The original object is untouched. I perfectly agree that, in the majority of cases, returning an immutable vs a mutable are a matter of... sense? Meaning? Ok, I perfectly agree. But IMHO there are many cases in which immutable objects are used for a matter of speed, and I bet that args is one of them. > > Anyway, I'm starting to think that neither set nor frozenset are good > > for dict items: > > > > (venv_3_10) marco@buzz:~$ python > > Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) > > [GCC 10.1.1 20200718] on linux > > Type "help", "copyright", "credits" or "license" for more information. > > >>> a = {1: 2} > > >>> b = {3: []} > > >>> a | b > > {1: 2, 3: []} > > >>> a.items() | b.items() > > Traceback (most recent call last): > > File "", line 1, in > > TypeError: unhashable type: 'list' > > >>> > > Well yes. Only dict keys can be considered to be set-like. This is not true. It's at least from Python 3.6, and I think also before, that almost the full Set API was added to both keys and items view. Items indeed are a sort of set, in a mathematical sense, since any pair (key, value) is unique, even if value is mutable. > I don't > know WHAT you think you're trying to do here, but if you ever thought > of set operations on dict values, you may want to completely rethink > what you're doing. set ops on values? Never said that :) I said that currently you can operate on item views with set operators. This is a fact. I also said that, since py sets accept only hashable objects, maybe another ad-hoc object should be used for the result of the items operations. But maybe the change isn't worth the additional trouble. Indeed I didn't know about the new set methods and operations on dict views until I explored dictobject.c > Performance is not an automatic result of immutability. That simply > isn't how it works. Of course. You can use a proxy and slow down almost everything much more. Or you can simply create a version of the mutable object with fewer methods, as more or less frozenset is. I checked the implementation, no fast iteration is implemented. I do not understand why in `for x in {1, 2, 3}` the set is substituted by a frozenset. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Thu, Jan 6, 2022 at 8:01 AM Marco Sulla wrote: > > On Wed, 5 Jan 2022 at 14:16, Chris Angelico wrote: > > That's an entirely invisible optimization, but it's more than just > > "frozenset is faster than set". It's that a frozenset or tuple can be > > stored as a function's constants, which is a massive difference. > > Can you explain this? Play around with dis.dis and timeit. > > Function positional arguments aren't interchangeable, so it makes > > sense to have them as a tuple. > > You are wrong, since kwarg is a dict. Indeed I proposed to use > frozendict for kwargs, and Guido said that it's a pity that this will > break a lot of existing Python code :D, since the fact that args is > _immutable_ and kwargs not always bothered him. Excuse me? I mentioned kwargs in the part that you removed from the quote, and the part you're quoting explicitly says "positional arguments". > Anyway, I'm starting to think that neither set nor frozenset are good > for dict items: > > (venv_3_10) marco@buzz:~$ python > Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) > [GCC 10.1.1 20200718] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> a = {1: 2} > >>> b = {3: []} > >>> a | b > {1: 2, 3: []} > >>> a.items() | b.items() > Traceback (most recent call last): > File "", line 1, in > TypeError: unhashable type: 'list' > >>> Well yes. Only dict keys can be considered to be set-like. I don't know WHAT you think you're trying to do here, but if you ever thought of set operations on dict values, you may want to completely rethink what you're doing. Performance is not an automatic result of immutability. That simply isn't how it works. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, 5 Jan 2022 at 14:16, Chris Angelico wrote: > That's an entirely invisible optimization, but it's more than just > "frozenset is faster than set". It's that a frozenset or tuple can be > stored as a function's constants, which is a massive difference. Can you explain this? > In fact, the two data types are virtually identical in performance once > created [...] This is really strange, since in theory frozenset should not have to check if itself is mutated during the iteration, on each cycle. So the speed should be noticeable faster. Maybe frozenset was not optimised, because the use case is really little and will add potentially useless C code? Furthermore, more the code, more the memory consumption and less the speed. I have to check setobject.c. > Function positional arguments aren't interchangeable, so it makes > sense to have them as a tuple. You are wrong, since kwarg is a dict. Indeed I proposed to use frozendict for kwargs, and Guido said that it's a pity that this will break a lot of existing Python code :D, since the fact that args is _immutable_ and kwargs not always bothered him. Anyway, I'm starting to think that neither set nor frozenset are good for dict items: (venv_3_10) marco@buzz:~$ python Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) [GCC 10.1.1 20200718] on linux Type "help", "copyright", "credits" or "license" for more information. >>> a = {1: 2} >>> b = {3: []} >>> a | b {1: 2, 3: []} >>> a.items() | b.items() Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'list' >>> -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
Op 4/01/2022 om 19:27 schreef Marco Sulla: $ python Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) [GCC 10.1.1 20200718] on linux Type "help", "copyright", "credits" or "license" for more information. a = {1:2} c = {1:2, 3:4} c.keys() - a.keys() {3} Why not frozenset({3})? My 2 cents worths: Because dictviews mutate with the directory. That makes them more like set than like frozenset. So operations on them produce a set. -- Antoon Pardon. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Thu, Jan 6, 2022 at 12:05 AM Marco Sulla wrote: > > On Wed, 5 Jan 2022 at 00:54, Chris Angelico wrote: > > That's because a tuple is the correct data type when returning two > > distinct items. It's not a list that has two elements in it; it's a > > tuple of (key, value). Immutability is irrelevant. > > Immutability is irrelevant, speed no. A tuple is faster than a list > and more compact. Also frozenset is faster than set. Indeed CPython > optimises internally a > > for x in {1, 2, 3} > > transforming the set in a frozenset for a matter of speed. That's why > tuple is usually preferred. I expected the same for frozenset That's an entirely invisible optimization, but it's more than just "frozenset is faster than set". It's that a frozenset or tuple can be stored as a function's constants, which is a massive difference. In fact, the two data types are virtually identical in performance once created: rosuav@sikorsky:~$ python3 -m timeit -s "stuff = {1,2,3}" "for x in stuff: pass" 500 loops, best of 5: 46.2 nsec per loop rosuav@sikorsky:~$ python3 -m timeit -s "stuff = frozenset({1,2,3})" "for x in stuff: pass" 500 loops, best of 5: 46.7 nsec per loop rosuav@sikorsky:~$ python3 -m timeit -s "stuff = set(range(1))" "for x in stuff: pass" 5000 loops, best of 5: 82.1 usec per loop rosuav@sikorsky:~$ python3 -m timeit -s "stuff = frozenset(range(1))" "for x in stuff: pass" 5000 loops, best of 5: 81.3 usec per loop Mutability is irrelevant, and so is the speed of the data type. Having set operations on keys views return frozensets wouldn't improve anything. > > Got any examples of variable-length sequences? > > function positional args are tuples, for example. > > > Usually a tuple is a > > structure, not just a sequence. > > eh? Are you talking about the underlying C code? No, I'm talking about purpose. In general, a list contains a sequence of things whose order matters but which can be removed from - you can remove one item from your shopping list and the rest still mean what they were. A tuple has a specific set of things in a specific order, like Cartesian coordinates. You can't just remove the x coordinate from an (x,y,z) tuple without fundamentally changing what it is. Function positional arguments aren't interchangeable, so it makes sense to have them as a tuple. Removing the first argument would redefine what all the others mean, so a tuple is correct - it's not just a list that's been made immutable for performance's sake. (Function *keyword* arguments, on the other hand, are different; as long as the mapping from keys to values is maintained, you can remove some of them and pass the rest on, without fundamentally changing their meaning.) Do you have any examples of actually variable-length sequences that are tuples for speed? Measure before claiming a speed difference. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, 5 Jan 2022 at 00:54, Chris Angelico wrote: > That's because a tuple is the correct data type when returning two > distinct items. It's not a list that has two elements in it; it's a > tuple of (key, value). Immutability is irrelevant. Immutability is irrelevant, speed no. A tuple is faster than a list and more compact. Also frozenset is faster than set. Indeed CPython optimises internally a for x in {1, 2, 3} transforming the set in a frozenset for a matter of speed. That's why tuple is usually preferred. I expected the same for frozenset > Got any examples of variable-length sequences? function positional args are tuples, for example. > Usually a tuple is a > structure, not just a sequence. eh? Are you talking about the underlying C code? > If something is just returning a > sequence, it'll most often return a dedicated sequence type (like > range in Py3) or a list (like lots of things in Py2). Python 2 is now obsolete, I don't think is relevant for the discussion. About your sentence, yes, usually a dedicated view, sequence or generator is returned, but tuples too are really much used. A list is returned very sporadically, for what I remember. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On 2022-01-05 at 08:30:30 +1100, Cameron Simpson wrote: > On 04Jan2022 21:03, Marco Sulla wrote: > >On Tue, 4 Jan 2022 at 19:38, Chris Angelico wrote: > >> [...] should the keys view be considered > >> frozen or not? Remember the set of keys can change (when the > >> underlying dict changes). > > > >Well, also the items can change, but they are returned as tuples with > >2 elements. > > > >It seems to me that the stdlib, when something should return a > >sequence, prefers to return a tuple. So I expected the same preference > >for frozenset over set. > > > >> It's not difficult to construct a frozenset from a set. > > > >This sentence has the commutative property :) > > Indeed. > > But speaking for myself, I may well want to perform additional work on > the object returned. Making a copy of it for tht purpose seems very > wasteful (imagine the set is quite large). A modifiable version can be > used immediately with no time or space cost. And it can be left alone if > it is to be unchanged. If I got a frozenset back I would inherently > have to copy it to do "modifying work". Unless the additional work is to use it as a dictionary key, or to add it to an existing [necessarily mutable!] set. Then again, that's not work *on* the object returned, that's additional work *with* the object returned. We could go around and around on this all day. :-) Python began with the premise of mutability. IIRC, many attempts at immutable class instances (for use in sets or as dictionary keys) have run into the same issues Marco Sulla is having. -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, Jan 5, 2022 at 7:04 AM Marco Sulla wrote: > > On Tue, 4 Jan 2022 at 19:38, Chris Angelico wrote: > > [...] should the keys view be considered > > frozen or not? Remember the set of keys can change (when the > > underlying dict changes). > > Well, also the items can change, but they are returned as tuples with > 2 elements. That's because a tuple is the correct data type when returning two distinct items. It's not a list that has two elements in it; it's a tuple of (key, value). Immutability is irrelevant. > It seems to me that the stdlib, when something should return a > sequence, prefers to return a tuple. So I expected the same preference > for frozenset over set. Got any examples of variable-length sequences? Usually a tuple is a structure, not just a sequence. If something is just returning a sequence, it'll most often return a dedicated sequence type (like range in Py3) or a list (like lots of things in Py2). ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On 04Jan2022 21:03, Marco Sulla wrote: >On Tue, 4 Jan 2022 at 19:38, Chris Angelico wrote: >> [...] should the keys view be considered >> frozen or not? Remember the set of keys can change (when the >> underlying dict changes). > >Well, also the items can change, but they are returned as tuples with >2 elements. > >It seems to me that the stdlib, when something should return a >sequence, prefers to return a tuple. So I expected the same preference >for frozenset over set. > >> It's not difficult to construct a frozenset from a set. > >This sentence has the commutative property :) Indeed. But speaking for myself, I may well want to perform additional work on the object returned. Making a copy of it for tht purpose seems very wasteful (imagine the set is quite large). A modifiable version can be used immediately with no time or space cost. And it can be left alone if it is to be unchanged. If I got a frozenset back I would inherently have to copy it to do "modifying work". So I prefer getting a modifiable object back. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Tue, 4 Jan 2022 at 19:38, Chris Angelico wrote: > [...] should the keys view be considered > frozen or not? Remember the set of keys can change (when the > underlying dict changes). Well, also the items can change, but they are returned as tuples with 2 elements. It seems to me that the stdlib, when something should return a sequence, prefers to return a tuple. So I expected the same preference for frozenset over set. > It's not difficult to construct a frozenset from a set. This sentence has the commutative property :) -- https://mail.python.org/mailman/listinfo/python-list
Re: Why operations between dict views return a set and not a frozenset?
On Wed, Jan 5, 2022 at 5:29 AM Marco Sulla wrote: > > $ python > Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) > [GCC 10.1.1 20200718] on linux > Type "help", "copyright", "credits" or "license" for more information. > >>> a = {1:2} > >>> c = {1:2, 3:4} > >>> c.keys() - a.keys() > {3} > >>> > Let's start with this. >>> a = {1} >>> c = {1, 3} >>> c - a {3} Do you agree that this should be a set? If so, then the next question is: should the keys view be considered frozen or not? Remember the set of keys can change (when the underlying dict changes). It's not difficult to construct a frozenset from a set. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Why operations between dict views return a set and not a frozenset?
$ python Python 3.10.0 (heads/3.10-dirty:f6e8b80d20, Nov 18 2021, 19:16:18) [GCC 10.1.1 20200718] on linux Type "help", "copyright", "credits" or "license" for more information. >>> a = {1:2} >>> c = {1:2, 3:4} >>> c.keys() - a.keys() {3} >>> Why not frozenset({3})? -- https://mail.python.org/mailman/listinfo/python-list
Return and set
I have a method getToken() which checks to see if a value is set, and if so, return it. However, it doesn't feel pythonic to me: def getToken(self): if self.tok: t = self.tok self.tok = None return t # ... Is there a way to trim the 'if' block to reset self.tok upon return? -- Bill -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
I don't know if this is better (or more pythonic), but it works for me on python 2.7. class MyKlass(object): ... def __init__(self, tok): ... self.tok = tok ... def gettoken(self): ... t, self.tok = self.tok or None, None ... return t On Tue, Jul 19, 2011 at 9:23 AM, Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com wrote: I have a method getToken() which checks to see if a value is set, and if so, return it. However, it doesn't feel pythonic to me: def getToken(self): if self.tok: t = self.tok self.tok = None return t # ... Is there a way to trim the 'if' block to reset self.tok upon return? -- Bill -- http://mail.python.org/**mailman/listinfo/python-listhttp://mail.python.org/mailman/listinfo/python-list -- Paul Woolcock pwool...@gmail.com -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com writes: I have a method getToken() which checks to see if a value is set, and if so, return it. However, it doesn't feel pythonic to me: Clearly that's because the function name is not Pythonic :-) I'll assume the name is a PEP-8 compatible ‘get_token’. def getToken(self): if self.tok: t = self.tok self.tok = None return t # ... Are you testing ‘self.tok’ in a boolean context because you don't care whether it it might be ‘’ or ‘0’ or ‘0.0’ or ‘[]’ or ‘False’ or lots of other things that evaluate false in a boolean context? If you want to test whether it is any value other than ‘None’, that's not the way to do it. Instead, use ‘if self.token is not None’. But I don't see why you test it at all, in that case, since you're immediately setting it to ‘None’ afterward. Also, the function name is quite misleading; the implication for a function named ‘get_foo’ is that it is a non-destructive read. I would expect the name of this function to indicate what's going on much more explicitly. My suggestion:: def get_and_reset_token(self): result = self.token self.token = None return result -- \ “I stayed up all night playing poker with tarot cards. I got a | `\ full house and four people died.” —Steven Wright | _o__) | Ben Finney -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
On 07/19/2011 09:43 AM, Ben Finney wrote: Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com writes: I have a method getToken() which checks to see if a value is set, and if so, return it. However, it doesn't feel pythonic to me: Clearly that's because the function name is not Pythonic :-) I'll assume the name is a PEP-8 compatible ‘get_token’. def getToken(self): if self.tok: t = self.tok self.tok = None return t # ... Are you testing ‘self.tok’ in a boolean context because you don't care whether it it might be ‘’ or ‘0’ or ‘0.0’ or ‘[]’ or ‘False’ or lots of other things that evaluate false in a boolean context? If you want to test whether it is any value other than ‘None’, that's not the way to do it. Instead, use ‘if self.token is not None’. But I don't see why you test it at all, in that case, since you're immediately setting it to ‘None’ afterward. Also, the function name is quite misleading; the implication for a function named ‘get_foo’ is that it is a non-destructive read. I would expect the name of this function to indicate what's going on much more explicitly. My suggestion:: def get_and_reset_token(self): result = self.token self.token = None return result This function is used in a file parser. There are two methods, getToken() and peekToken(). getToken pops a token from the file, while peekToken keeps the token, but still returns it. Code: def getToken(self): if self.tok: t = self.tok self.tok = None return t try: t = self.gen.next() except StopIteration: return NULL else: return t def peekToken(self): if not self.tok: self.tok = self.getToken() return self.tok NULL is an enumerated value I have defined above. The idea is for peekToken to reuse getToken, but to keep the token still around. -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
On 7/19/2011 9:52 AM, Billy Mays wrote: On 07/19/2011 09:43 AM, Ben Finney wrote: Billy Mays 81282ed9a88799d21e77957df2d84bd6514d9...@myhashismyemail.com writes: I have a method getToken() which checks to see if a value is set, and if so, return it. However, it doesn't feel pythonic to me: Clearly that's because the function name is not Pythonic :-) I'll assume the name is a PEP-8 compatible ‘get_token’. def getToken(self): if self.tok: t = self.tok self.tok = None return t # ... Are you testing ‘self.tok’ in a boolean context because you don't care whether it it might be ‘’ or ‘0’ or ‘0.0’ or ‘[]’ or ‘False’ or lots of other things that evaluate false in a boolean context? If you want to test whether it is any value other than ‘None’, that's not the way to do it. Instead, use ‘if self.token is not None’. But I don't see why you test it at all, in that case, since you're immediately setting it to ‘None’ afterward. Also, the function name is quite misleading; the implication for a function named ‘get_foo’ is that it is a non-destructive read. I would expect the name of this function to indicate what's going on much more explicitly. My suggestion:: def get_and_reset_token(self): result = self.token self.token = None return result This function is used in a file parser. There are two methods, getToken() and peekToken(). getToken pops a token from the file, while peekToken keeps the token, but still returns it. Code: def getToken(self): if self.tok: t = self.tok self.tok = None return t try: t = self.gen.next() except StopIteration: return NULL else: return t def peekToken(self): if not self.tok: self.tok = self.getToken() return self.tok You did not answer Ben's question about the allowed values of self.tok and whether you really want to clobber all 'false' values. The proper code depends on that answer. NULL is an enumerated value I have defined above. The idea is for peekToken to reuse getToken, but to keep the token still around. I think about reversing and have getToken use peekToken and then reset. But that depends on the exact logic which depends on the specs. I would more likely have just one function with a reset parameter defaulted to the more common value. -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
That sounds artificially backwards; why not let getToken() reuse peekToken()? def peek(self): if self.tok is None: try: self.tok = self.gen.next() except StopIteration: self.tok = NULL return self.tok def pop(self): token = self.peek() self.tok = None return token -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
On 07/19/2011 01:00 PM, Micah wrote: That sounds artificially backwards; why not let getToken() reuse peekToken()? def peek(self): if self.tok is None: try: self.tok = self.gen.next() except StopIteration: self.tok = NULL return self.tok def pop(self): token = self.peek() self.tok = None return token I actually like this way better, thanks! -- Bill -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
On 07/19/2011 01:02 PM, Terry Reedy wrote: You did not answer Ben's question about the allowed values of self.tok and whether you really want to clobber all 'false' values. The proper code depends on that answer. NULL is an enumerated value I have defined above. The idea is for peekToken to reuse getToken, but to keep the token still around. I think about reversing and have getToken use peekToken and then reset. But that depends on the exact logic which depends on the specs. I would more likely have just one function with a reset parameter defaulted to the more common value. self.gen is a generator that gets filters single characters from a file. Values that come from self.gen.next() will always be string values since the file generator closes on EOF. I can be sure that I will either get a string from self.gen.next() or catch an exception so its okay to use NULL (which evaluates to 0). You are correct it makes more sense to use peekToken() to do the lifting and getToken() to reuse the token. However, I am not sure what you mean by reset. -- Bill -- http://mail.python.org/mailman/listinfo/python-list
Re: Return and set
On 7/19/2011 1:00 PM, Micah wrote: That sounds artificially backwards; why not let getToken() reuse peekToken()? def peek(self): if self.tok is None: try: self.tok = self.gen.next() If this is changed (as intended for the iteration protocol) to self.tok = next(self.gen) then this code (and OP's version) should run as is on both Py2.6/7 and Py3. except StopIteration: self.tok = NULL return self.tok def pop(self): token = self.peek() self.tok = None return token -- Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
On 2006-07-01, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: There's a few good reasons. 1 - golden handcuffs. Breaking old code is bad 90% of the time 2 - creating a set MAY be slower. Python's sets seem to imply to that they will always be a hash map. in this case, some creative hash map mapping could allow one to create a set without calculating hash codes (make the set hashmap have the same dimentions and rules as the dictionary one). If there was intent to allow Python implementations to use trees for the set, then a list is far faster to create (O(n) time instead of O(nlogn)). 3 - using a set is sometimes slower (just as using a list is sometimes slower) I can't speak for your code, but this is the most common use of keys in my coding: # d is some dictionary keys = d.keys() keys.sort() for k in keys: #blah Wouldn't you be better of with a tree instead of dictionary? Maybe there are other reasons to prefer a dict, but have a look at: http://www.pardon-sleeuwaegen.be/antoon/avltree.html Suppose t is a tree implementing the same mapping a your dictionary d above, the code would be: # t is some tree for k in t: #blah And the keys will be treated in order. If you try it, let me know what you think of it. -- Antoon Pardon -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
Yes, this is what he's saying. Its not broken, just a bit different. After all, you dont have a problem with: lst = [1, 2, 3] ptr = lst lst.append(4) # this changes ptr And a view of the dictionary is orders faster than creating a copy of it (which is required to keep k0 from changing in your example). If you're LUCKY, copying a dictionary is O(n), and i would expect it to take longer because hashmaps usually have a lot of empty space in them (anyone with python specific experiance know if its diffent for python hashmaps?), and each empty space takes time to iterate across. So a dictionary is over O(n), while a view takes O(1) time. considering how often this operation is used, it makes sense to do this optimization And if you really want an independant list, you can always say k0 = list(d.keys()) Paul Rubin wrote: Wait a sec, you're saying k0 = d.keys() d['whee'] = 'parrot'# this can change k0??? That sounds broken. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
[EMAIL PROTECTED] [EMAIL PROTECTED] writes: And a view of the dictionary is orders faster than creating a copy of it (which is required to keep k0 from changing in your example). If you're LUCKY, copying a dictionary is O(n), There are ways to do it so you don't have to copy, but the dict would then no longer be a straightforward hash table. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
Ooh, can you point me to them? This sounds very interesting. The only way I can think of doing it is to have some fun with pointers and not make the final copy until a change is made to the table. I'd love to read about an algoritm which can get around this! I feel so behind in the times, I still prefer the simpliciy of the redblacktree sometimes =p Paul Rubin wrote: There are ways to do it so you don't have to copy, but the dict would then no longer be a straightforward hash table. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [withPython3000 in mind]
You bring up a good point. the for x in d: notation is a relativly new construction, for x in d.keys() is much older. Some of the value of d.keys() goes away because we have this new construction, but there's some reasons to keep it around: 1. Consitency. You can get the values, you can get the (key, value) pairs. it'd be odd not to be able to get the keys 2. Choices. if d.keys() is a FAST operation, then you can then use that to create any structure you want. For example, if you want a set, set(d.keys()) only requires you to create a set. If d.keys() created an independant list, python would first need to create a list, then create a set. Paul Rubin wrote: Delaney, Timothy (Tim) [EMAIL PROTECTED] writes: If you want an independent data set, you have to take a snapshot. For the above, that's doing: k0 = list(d.keys()) I don't understand. Why have .keys() at all, if it doesn't get you an independent data set? If all you want is to iterate through the dict, you can already do that: for k in d: -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
Paddy [EMAIL PROTECTED] (P) wrote: P [EMAIL PROTECTED] wrote: This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? P I think the order of the items returned by keys() and values() are P related. I decided on a short empirical test: yes, it is documented that their order is related. In fact d.items() == zip(d.keys(), d.values()) This wouldn't work with sets instead of lists. -- Piet van Oostrum [EMAIL PROTECTED] URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4] Private email: [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
RE: Dictionary .keys() and .values() should return a set[withPython3000 in mind]
Paul Rubin wrote: Delaney, Timothy (Tim) [EMAIL PROTECTED] writes: If you want an independent data set, you have to take a snapshot. For the above, that's doing: k0 = list(d.keys()) I don't understand. Why have .keys() at all, if it doesn't get you an independent data set? If all you want is to iterate through the dict, you can already do that: for k in d: Several reasons. d.keys() d.values() d.items() You can only have iter(d) equivalent to iter(*one* of the above). Also, d.keys() can provide an indexable view i.e. you can index it like the list currently returned from d.keys(). For most use cases, you can treat it exactly like you can treat the list returned now - just don't try adding to it or (probably) removing from it. Tim Delaney -- http://mail.python.org/mailman/listinfo/python-list
RE: Dictionary .keys() and .values() should return a set [withPython3000 in mind]
Paul Rubin wrote: [EMAIL PROTECTED] [EMAIL PROTECTED] writes: And a view of the dictionary is orders faster than creating a copy of it (which is required to keep k0 from changing in your example). If you're LUCKY, copying a dictionary is O(n), There are ways to do it so you don't have to copy, but the dict would then no longer be a straightforward hash table. Plus copy-on-write isn't something you want in general - it's *much* slower since it needs to copy the internal data every time it's modified. The vast majority of uses of data structures does not involve concurrent modification. Tim Delaney -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [withPython3000 in mind]
Delaney, Timothy (Tim) wrote: Plus copy-on-write isn't something you want in general - it's *much* slower since it needs to copy the internal data every time it's modified. The vast majority of uses of data structures does not involve concurrent modification. But copy-on-write in the presence of reference counts doesn't do the mad useless-copy stuff. --Scott David Daniels [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
RE: Dictionary .keys() and .values() should return a set[withPython3000 in mind]
Scott David Daniels wrote: Delaney, Timothy (Tim) wrote: Plus copy-on-write isn't something you want in general - it's *much* slower since it needs to copy the internal data every time it's modified. The vast majority of uses of data structures does not involve concurrent modification. But copy-on-write in the presence of reference counts doesn't do the mad useless-copy stuff. Good point - if there's only one reference, there's no need for copying - this is definitely something that could be taken advantage of. Tim Delaney -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? I think this is an interesting suggestion. Of course, the current situation is as much a product of historical progression as anything: lists and dicts pre-date sets, so the collection of keys had to be returned as a list. Since lists also carry some concept of order to them, the documentation for the list returned by dict.keys() went out of its way to say that the order of the elements in the dict.keys() list had no bearing on the dict, the insertion order of entries, or anything else, that the order of keys was purely arbitrary. In fact, there is not a little irony to this proposal, since it seems it was just a few months ago that c.l.py had just about weekly requests for how to create an ordered dict, with various ideas of how a dict should be ordered, but most intended to preserve the order of insertion of items into the dict. And now here we have just about the opposite proposal - dicts should not only *not* be ordered, they should revel in their disorderliness. I liked the example in which the OP (of this thread) wanted to compare the keys from two different dicts, for equality of keys. Since the keys() method returns a set of indeterminate order, we can't simply perform dictA.keys() == dictB.keys(). But how often does this really happen? In practice, I think the keys of a dict, when this collection is used at all, are usually sorted and then iterated over, usually to prettyprint the keys and values in the dict. Internally, this set of items shouldn't even exist as a separate data structure - the dict's keys are merely labels on nodes in some sort of hash tree. Now that we really do have sets in Python, if one were to design dicts all over again, it does seem like set would be a better choice than list for the type returned by the keys() method. To preserve compatibility, would a second method, keyset(), do the trick? The OP used this term himself, referring to the keyset of the dictionary. Still given the few cases where I would access this as a true set, using set(dictA.keys()) doesn't seem to be that big a hardship, and its obviousness will probably undercut any push for a separate keyset() method. -- Paul -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
[EMAIL PROTECTED] wrote: This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? I think the order of the items returned by keys() and values() are related. I decided on a short empirical test: import random n=50 d = dict((i,random.randint(0,n-1)) for i in range(n)) k,v = d.keys(), d.values() [d[k[i]] == v[i] for i in range(n)] [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True] ## for larger n try # False in [d[k[i]] == v[i] for i in range(n)] The order of keys() and the order of values() are related, so a change to returning sets would also loose this functionality. Mind you, Never rely on that implied ordering. Always use items(). And so *if* sets were returned for keys() and values() then it should for items() too. - Paddy. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
Paddy: Mind you, Never rely on that implied ordering. Always use items(). Using dict.items() is probably better, but the manual says: If items(), keys(), values(), iteritems(), iterkeys(), and itervalues() are called with no intervening modifications to the dictionary, the lists will directly correspond. This allows the creation of (value, key) pairs using zip(): pairs = zip(a.values(), a.keys()). The same relationship holds for the iterkeys() and itervalues() methods: Is this going to change? dict.keyset() seems nice, but you usually don't want to make a too much big API. Keeping APIs small is very important in Python, otherwise you need the manual to write code. I think a better solution to solve such key set problems is to optimize Python itself, so Python computes set(dict) really fast (it can just copies the hash of the dict). Bye, bearophile -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
Nick Vatamaniuc wrote: Robert Kern wrote: [EMAIL PROTECTED] wrote: The same thing goes for the values(). Here most people will argue that ... This part is pretty much a non-starter. Not all Python objects are hashable. ... Also, I may need keys to map to different objects that happen to be equal. -- Robert Kern So, values() can't return a set because of (at least) the two reasons given above. And since, as Scott David Daniels pointed out, dicts support the iterator protocol, you can ask for a set of the keys easily enough if you want it: d = dict(a=1, b=2, c=3) set(d) set(['a', 'c', 'b']) So, IMHO, there's not much point to having keys() return a set. Peace, ~Simon -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
The meaning of dict.keys, etc, will not change for the 2.x series. For 3.0, I believe that Guido already intends that .keys() no longer return a separate list. For one thing, a major, if not the main use, of the method is for iteration, as in 'for keys in d.keys():'. For this, creating and them discarding a separate list is inefficient and, in a sense, silly. One option is to make .keys be what .iterkeys is today. Another, more recent, is to make .keys return a new iterable dict view object, details not yet worked out. Either way, one would make an independent set or list with set(d.keys()) or list(d.keys)). Terry Jan Reedy -- http://mail.python.org/mailman/listinfo/python-list
RE: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
[EMAIL PROTECTED] wrote: This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? There has been much discussion of this on the Python-3000 mailing list. Suggestings included making keys(), etc return an iterator, and getting rid of iterkeys(), etc. The eventual consensus was that keys(), etc should return views on the dictionary. These views would be re-iterable and will have basically the same behaviour as the lists returned from keys(), etc. However, such a view could have O(1) __contains__ behaviour, and would not incur the overhead of creating a list and populating it - they would be backed by the dictionary. Of course, this introduces the issue of concurrent modification, but you can always get an independent copy by calling list(dict.keys()). Tim Delaney -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python3000 in mind]
Delaney, Timothy (Tim) [EMAIL PROTECTED] writes: The eventual consensus was that keys(), etc should return views on the dictionary. These views would be re-iterable and will have basically the same behaviour as the lists returned from keys(), etc. However, such a view could have O(1) __contains__ behaviour, and would not incur the overhead of creating a list and populating it - they would be backed by the dictionary. Of course, this introduces the issue of concurrent modification, but you can always get an independent copy by calling list(dict.keys()). Wait a sec, you're saying k0 = d.keys() d['whee'] = 'parrot'# this can change k0??? That sounds broken. -- http://mail.python.org/mailman/listinfo/python-list
RE: Dictionary .keys() and .values() should return a set [withPython3000 in mind]
Paul Rubin wrote: Delaney, Timothy (Tim) [EMAIL PROTECTED] writes: The eventual consensus was that keys(), etc should return views on the dictionary. These views would be re-iterable and will have basically the same behaviour as the lists returned from keys(), etc. However, such a view could have O(1) __contains__ behaviour, and would not incur the overhead of creating a list and populating it - they would be backed by the dictionary. Of course, this introduces the issue of concurrent modification, but you can always get an independent copy by calling list(dict.keys()). Wait a sec, you're saying k0 = d.keys() d['whee'] = 'parrot'# this can change k0??? That sounds broken. That's the problem with views. You either take a snapshot of the current state, or you deal with the backing store changing. Java's solution is to raise a ConcurrentModificationException when it determines this has happened (with lots of weasel words about no guarantees). If you want an independent data set, you have to take a snapshot. For the above, that's doing: k0 = list(d.keys()) or k0 = set(d.keys()) depending on what behaviour you want. Tim Delaney -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [withPython3000 in mind]
Delaney, Timothy (Tim) [EMAIL PROTECTED] writes: If you want an independent data set, you have to take a snapshot. For the above, that's doing: k0 = list(d.keys()) I don't understand. Why have .keys() at all, if it doesn't get you an independent data set? If all you want is to iterate through the dict, you can already do that: for k in d: -- http://mail.python.org/mailman/listinfo/python-list
Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? I know that sets were introduced a lot later and lists/dictionaries were used instead but I think the only correct way now is for the dictionary keys and values to be sets. Presently {1:0,2:0,3:0}.keys() will produce [1,2,3] but it could also produce [3,1,2] or [3,2,1] on a different machine architecture. The Python documentation states that: Keys and values are listed in an _arbitrary_(my emphasis) order which is non-random, varies across Python implementations, and depends on the dictionary's history of insertions and deletions. So on the same machine it will be the case that: {1:0, 2:0, 3:0}.keys() == {3:0, 2:0, 1:0}.keys() is True. But if there are 2 lists of keys of the same dictionary, one is pickled on another machine, with a different arbitrary non-random ordering, then the keys will not be equal to each other. It seems like the problem could be solved by returning a set instead. The same thing goes for the values(). Here most people will argue that values are not necessarily unique, so a list is more appropriate, but in fact the values are unique it is just that more than one key could map to the same value. What is 'seen' in dictionary is the mapping rule. Also the values are not ordered and should not be indexable -- they should be a set. Just as the keys(), one ordered list of values from a dictionary on one machine will not necessarily be equal to another list of values an on another machine, while in fact, they should be. On a more fundamental and general level, a dictionary is actually an explicit function, also called a 'map'. A set of keys, traditionally called a 'domain' are mapped to a set of values, traditionally called a 'range'. This mapping produces at most a surjective function (i.e. two or more keys can map to same value and all the values are mapped to by some keys). Notice that the traditional counterparts to keys and values are sets and not just lists. This seems like theory babble, but in the case of Python staying 'true' to the theory is usually a GoodThing(tm). I love Python primarily because it does something in only one, correct, and reasonable way. The same principle should probably apply to Python itself including to its built-in data structures. Of course the compatibilty will be broken. Any code relying on a certain ordering of keys() or values() returned would need to be updated. One could argue though that such code was not entirely correct in the first place to asssume that, and should be fixed anyway. Obviously this fix should not be in Python 2.X but perhaps would be worth considering for Python 3000. What does everyone think? -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
There's a few good reasons. 1 - golden handcuffs. Breaking old code is bad 90% of the time 2 - creating a set MAY be slower. Python's sets seem to imply to that they will always be a hash map. in this case, some creative hash map mapping could allow one to create a set without calculating hash codes (make the set hashmap have the same dimentions and rules as the dictionary one). If there was intent to allow Python implementations to use trees for the set, then a list is far faster to create (O(n) time instead of O(nlogn)). 3 - using a set is sometimes slower (just as using a list is sometimes slower) I can't speak for your code, but this is the most common use of keys in my coding: # d is some dictionary keys = d.keys() keys.sort() for k in keys: #blah sets cannot be sorted, while lists can. If keys() returned a set, then I'd have to turn it into a list every time. There's potential to add views to python (a key view of a dictionary being a frozenset containing the dictionary's keys, which is updated whenever the dictionary updates), but I think thats annother topic which is out of the scope of your origional idea. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
[EMAIL PROTECTED] wrote: I can't speak for your code, but this is the most common use of keys in my coding: # d is some dictionary keys = d.keys() keys.sort() for k in keys: #blah This you can rewrite quite effectively as: for k in sorted(d): #blah --Scott David Daniels [EMAIL PROTECTED] -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
1 - golden handcuffs. Breaking old code is bad 90% of the time I agree with you on that, mostly for code that counted on list methods of result of keys() - like you later show with sort. But there is a side note: old code that assumed a particular ordering of the keys or values is broken anyway. So even if ks={1:0,2:0,3:0}.keys() returns [1,2,3] on my machine I should not do something like 'my_savings_account + ks[0]' That code should be fixed anyway, since on a different machine it might produce different values for ks[0]. 2 - creating a set MAY be slower. Creating a set from the dictionary's keys should not be a lot slower because the keys are already unique, there is no need to check each key against the other keys just return them as a set. I assume this is what you meant by make the set hashmap have the same dimensions and rules as the dictionary one. Perhaps a dictionary would internally just copy its keys to the set and return it rather than construct as set from scratch (with duplication checks and all). 3 - using a set is sometimes slower Again, depending how it is used. In your case you argue that you usually sort the keys anyway so a list is more convinient. But different use cases can call for differnent operations on the keys() after they have been retrieved. What if someone wants to do an intersection to find common keys with another dictionary, then a set would be more appropriate. The intent of the set() type was to not temp anyone into assuming an ordering of keys() just because a list is indexable. And eliminate the need for a long footnote in the documentation of the dict type that talks about 'an arbitrary non-random ordering' - it takes while just to grasp what that means... In general I believe that a small performance penalty is acceptable in order to have a correct and consistent data type, especially for Python i.e. I might not argue the same for Perl or C. -Nick V. [EMAIL PROTECTED] wrote: There's a few good reasons. 1 - golden handcuffs. Breaking old code is bad 90% of the time 2 - creating a set MAY be slower. Python's sets seem to imply to that they will always be a hash map. in this case, some creative hash map mapping could allow one to create a set without calculating hash codes (make the set hashmap have the same dimentions and rules as the dictionary one). If there was intent to allow Python implementations to use trees for the set, then a list is far faster to create (O(n) time instead of O(nlogn)). 3 - using a set is sometimes slower (just as using a list is sometimes slower) I can't speak for your code, but this is the most common use of keys in my coding: # d is some dictionary keys = d.keys() keys.sort() for k in keys: #blah sets cannot be sorted, while lists can. If keys() returned a set, then I'd have to turn it into a list every time. There's potential to add views to python (a key view of a dictionary being a frozenset containing the dictionary's keys, which is updated whenever the dictionary updates), but I think thats annother topic which is out of the scope of your origional idea. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
[EMAIL PROTECTED] wrote: The same thing goes for the values(). Here most people will argue that values are not necessarily unique, so a list is more appropriate, but in fact the values are unique it is just that more than one key could map to the same value. What is 'seen' in dictionary is the mapping rule. Also the values are not ordered and should not be indexable -- they should be a set. Just as the keys(), one ordered list of values from a dictionary on one machine will not necessarily be equal to another list of values an on another machine, while in fact, they should be. This part is pretty much a non-starter. Not all Python objects are hashable. In [1]: d = {} In [2]: for i in range(1, 10): ...: d[i] = range(i) ...: ...: In [3]: set(d.values()) --- exceptions.TypeError Traceback (most recent call last) /Users/kern/ipython console TypeError: list objects are unhashable Also, I may need keys to map to different objects that happen to be equal. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
On 2/07/2006 6:01 AM, [EMAIL PROTECTED] wrote: This has been bothering me for a while. [snip] Summary of OP's post: d.keys() and d.values() should return sets in Python 3.0. Observations: (1) My code [some of which dates back to Python 1.5.1] uses about 12 x d.items() and 11 x d.keys() for each 1 x d.values() (2) Most cases of d.X() don't need/want the untrammelled list and could be replaced by d.iterX(). Example: max_frequency = max(tally.values()) Opinion: d.X() could be deprecated, but I'd rather see a consciousness-raising for the d.iterX() methods, and for the construct for key in d: Cheers, John -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
Nick Vatamaniuc [EMAIL PROTECTED] wrote: But there is a side note: old code that assumed a particular ordering of the keys or values is broken anyway. From a testing point of view, it would be interesting if there was a flag which said, Deliberately change everything which isn't guaranteed to be a specific way. So, for example, dict.keys() would return a list in reverse order of how it normally does it (even if it cost more to do it that way). An alternate hash key generator would be slipped into place. Floating point math would get a little noise added to the least significant bits. And so on. Might be interesting to see what sorts of bugs that would shake out from user code. -- http://mail.python.org/mailman/listinfo/python-list
Re: Dictionary .keys() and .values() should return a set [with Python 3000 in mind]
You are correct I should have thought of that. I still think the keys() method should return a set() though. Robert Kern wrote: [EMAIL PROTECTED] wrote: The same thing goes for the values(). Here most people will argue that values are not necessarily unique, so a list is more appropriate, but in fact the values are unique it is just that more than one key could map to the same value. What is 'seen' in dictionary is the mapping rule. Also the values are not ordered and should not be indexable -- they should be a set. Just as the keys(), one ordered list of values from a dictionary on one machine will not necessarily be equal to another list of values an on another machine, while in fact, they should be. This part is pretty much a non-starter. Not all Python objects are hashable. In [1]: d = {} In [2]: for i in range(1, 10): ...: d[i] = range(i) ...: ...: In [3]: set(d.values()) --- exceptions.TypeError Traceback (most recent call last) /Users/kern/ipython console TypeError: list objects are unhashable Also, I may need keys to map to different objects that happen to be equal. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list