Re: Non-deterministic set ordering
On 2022-05-16 04:20, Rob Cliffe via Python-list wrote:
> On 16/05/2022 04:13, Dan Stromberg wrote:
> > On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list wrote:
> > > I was shocked to discover that when repeatedly running the following
> > > program (condensed from a "real" program) under Python 3.8.3
> > >
> > >     for p in { ('x','y'), ('y','x') }:
> > >         print(p)
> > >
> > > the output was sometimes
> > >
> > >     ('y', 'x')
> > >     ('x', 'y')
> > >
> > > and sometimes
> > >
> > >     ('x', 'y')
> > >     ('y', 'x')
> > >
> > > Can anyone explain why running identical code should result in
> > > traversing a set in a different order?
> >
> > Sets are defined as unordered so that they can be hashed internally to
> > give O(1) operations for many tasks.
> >
> > It wouldn't be unreasonable for sets to use a fixed-but-arbitrary
> > ordering for a given group of set operations, but being unpredictable
> > deters developers from mistakenly assuming they are ordered.
> >
> > If you need order, you should use a tuple, list, or something like
> > https://grantjenks.com/docs/sortedcontainers/sortedset.html
>
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it deliberate
> (as your answer seems to imply)?  AFAIK the same code within the *same
> run* of a program does produce identical results.

Basically, Python uses hash randomisation in order to protect it against
denial-of-service attacks. (Search for "PYTHONHASHSEED" in the docs.)

It also applied to dicts (the code for sets was based on that for
dicts), but dicts now remember their insertion order.
--
https://mail.python.org/mailman/listinfo/python-list
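For anyone who wants to see the effect of hash randomisation directly, here is a minimal sketch. It runs the snippet from the original post in fresh child interpreters with the PYTHONHASHSEED environment variable set explicitly. The names SNIPPET and run_with_seed are made up for this illustration; only PYTHONHASHSEED itself comes from the docs. With a fixed seed the order should be the same on every run; with different seeds it may or may not differ.

```python
import os
import subprocess
import sys

# The set from the original post, printed as a list so the order is visible.
SNIPPET = "print(list({('x', 'y'), ('y', 'x')}))"


def run_with_seed(seed: str) -> str:
    """Run SNIPPET in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Same seed: the str hashes are the same, so the iteration order repeats.
first = run_with_seed("1")
second = run_with_seed("1")
print(first == second)

# Different seeds: collect the distinct orders seen (one or two of them).
orders = {run_with_seed(str(seed)) for seed in range(5)}
print(orders)
```

Leaving PYTHONHASHSEED unset gives the behaviour Rob saw: a new random seed per interpreter start, hence an order that can change from run to run but not within a run.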
Re: Non-deterministic set ordering
Thanks, Paul.  Question answered!
Rob Cliffe

On 16/05/2022 04:36, Paul Bryan wrote:
> This may explain it:
> https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions
>
> On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:
> > On 16/05/2022 04:13, Dan Stromberg wrote:
> > > On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list wrote:
> > > > I was shocked to discover that when repeatedly running the following
> > > > program (condensed from a "real" program) under Python 3.8.3
> > > >
> > > >     for p in { ('x','y'), ('y','x') }:
> > > >         print(p)
> > > >
> > > > the output was sometimes
> > > >
> > > >     ('y', 'x')
> > > >     ('x', 'y')
> > > >
> > > > and sometimes
> > > >
> > > >     ('x', 'y')
> > > >     ('y', 'x')
> > > >
> > > > Can anyone explain why running identical code should result in
> > > > traversing a set in a different order?
> > >
> > > Sets are defined as unordered so that they can be hashed internally to
> > > give O(1) operations for many tasks.
> > >
> > > It wouldn't be unreasonable for sets to use a fixed-but-arbitrary
> > > ordering for a given group of set operations, but being unpredictable
> > > deters developers from mistakenly assuming they are ordered.
> > >
> > > If you need order, you should use a tuple, list, or something like
> > > https://grantjenks.com/docs/sortedcontainers/sortedset.html
> >
> > Thanks, I can work round this behaviour.
> > But I'm curious: where does the variability come from?  Is it deliberate
> > (as your answer seems to imply)?  AFAIK the same code within the *same
> > run* of a program does produce identical results.
> > Best wishes
> > Rob Cliffe
Re: Non-deterministic set ordering
This may explain it:
https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions

On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:
> On 16/05/2022 04:13, Dan Stromberg wrote:
> > On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list wrote:
> > > I was shocked to discover that when repeatedly running the following
> > > program (condensed from a "real" program) under Python 3.8.3
> > >
> > >     for p in { ('x','y'), ('y','x') }:
> > >         print(p)
> > >
> > > the output was sometimes
> > >
> > >     ('y', 'x')
> > >     ('x', 'y')
> > >
> > > and sometimes
> > >
> > >     ('x', 'y')
> > >     ('y', 'x')
> > >
> > > Can anyone explain why running identical code should result in
> > > traversing a set in a different order?
> >
> > Sets are defined as unordered so that they can be hashed internally to
> > give O(1) operations for many tasks.
> >
> > It wouldn't be unreasonable for sets to use a fixed-but-arbitrary
> > ordering for a given group of set operations, but being unpredictable
> > deters developers from mistakenly assuming they are ordered.
> >
> > If you need order, you should use a tuple, list, or something like
> > https://grantjenks.com/docs/sortedcontainers/sortedset.html
>
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it deliberate
> (as your answer seems to imply)?  AFAIK the same code within the *same
> run* of a program does produce identical results.
> Best wishes
> Rob Cliffe
Re: Non-deterministic set ordering
On 16/05/2022 04:13, Dan Stromberg wrote:
> On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list wrote:
> > I was shocked to discover that when repeatedly running the following
> > program (condensed from a "real" program) under Python 3.8.3
> >
> >     for p in { ('x','y'), ('y','x') }:
> >         print(p)
> >
> > the output was sometimes
> >
> >     ('y', 'x')
> >     ('x', 'y')
> >
> > and sometimes
> >
> >     ('x', 'y')
> >     ('y', 'x')
> >
> > Can anyone explain why running identical code should result in
> > traversing a set in a different order?
>
> Sets are defined as unordered so that they can be hashed internally to
> give O(1) operations for many tasks.
>
> It wouldn't be unreasonable for sets to use a fixed-but-arbitrary
> ordering for a given group of set operations, but being unpredictable
> deters developers from mistakenly assuming they are ordered.
>
> If you need order, you should use a tuple, list, or something like
> https://grantjenks.com/docs/sortedcontainers/sortedset.html

Thanks, I can work round this behaviour.
But I'm curious: where does the variability come from?  Is it deliberate
(as your answer seems to imply)?  AFAIK the same code within the *same
run* of a program does produce identical results.
Best wishes
Rob Cliffe
Re: Non-deterministic set ordering
On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list <python-list@python.org> wrote:
> I was shocked to discover that when repeatedly running the following
> program (condensed from a "real" program) under Python 3.8.3
>
>     for p in { ('x','y'), ('y','x') }:
>         print(p)
>
> the output was sometimes
>
>     ('y', 'x')
>     ('x', 'y')
>
> and sometimes
>
>     ('x', 'y')
>     ('y', 'x')
>
> Can anyone explain why running identical code should result in
> traversing a set in a different order?

Sets are defined as unordered so that they can be hashed internally to
give O(1) operations for many tasks.

It wouldn't be unreasonable for sets to use a fixed-but-arbitrary
ordering for a given group of set operations, but being unpredictable
deters developers from mistakenly assuming they are ordered.

If you need order, you should use a tuple, list, or something like
https://grantjenks.com/docs/sortedcontainers/sortedset.html
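As a small sketch of the "if you need order" advice, using only the standard library: either keep the items in a list or tuple from the start, or sort the set at the point where order matters. The variable names are made up for the example, and sorting assumes the elements are mutually comparable, which holds for these tuples of strings.

```python
# A list keeps the order in which the items were written.
pairs_in_order = [('x', 'y'), ('y', 'x')]
for p in pairs_in_order:
    print(p)

# A set has no guaranteed order, but sorting it gives a repeatable one.
pairs = {('x', 'y'), ('y', 'x')}
ordered = sorted(pairs)  # tuples compare element by element
for p in ordered:
    print(p)
```

The sorted() call costs O(n log n) each time, so for a set that is iterated often it may be simpler to store a list in the first place.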
Re: Changing calling sequence
On 16/05/22 1:20 am, 2qdxy4rzwzuui...@potatochowder.com wrote:
> IMO, classmethods were/are a bad idea (yes, I'm probably in the
> minority around here, but someone has to be).

I don't think class methods are a bad idea per se, but having them
visible through instances seems unnecessary and confusing. I suspect
that wasn't a deliberate design decision, but just a side effect of
using a single class dict for both class and instance things.

--
Greg
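To make the point about visibility concrete, here is a rough sketch. It borrows the SecuritySettings name from the example given elsewhere in this thread; the fields and the strict() method are invented purely for illustration and are not from any real library.

```python
from dataclasses import dataclass


@dataclass
class SecuritySettings:
    # Hypothetical fields, chosen only for this example.
    require_tls: bool = True
    max_login_attempts: int = 5

    @classmethod
    def strict(cls) -> "SecuritySettings":
        """Alternate constructor returning a preconfigured object."""
        return cls(require_tls=True, max_login_attempts=3)


settings = SecuritySettings()

# The class method is reachable through the class...
a = SecuritySettings.strict()

# ...and also through an existing instance, whose own state it ignores.
b = settings.strict()

print(a == b)
print(settings.max_login_attempts, b.max_login_attempts)
```

Calling strict() through the instance builds a fresh object from the class; nothing about `settings` is consulted, which is the part that can read as confusing.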
Non-deterministic set ordering
I was shocked to discover that when repeatedly running the following
program (condensed from a "real" program) under Python 3.8.3

    for p in { ('x','y'), ('y','x') }:
        print(p)

the output was sometimes

    ('y', 'x')
    ('x', 'y')

and sometimes

    ('x', 'y')
    ('y', 'x')

Can anyone explain why running identical code should result in
traversing a set in a different order?
Thanks
Rob Cliffe
Re: Changing calling sequence
On 2022-05-15 at 14:44:09 +1000,
Chris Angelico wrote:

> On Sun, 15 May 2022 at 14:27, dn wrote:
> >
> > On 15/05/2022 11.34, 2qdxy4rzwzuui...@potatochowder.com wrote:
> > > On 2022-05-15 at 10:22:15 +1200,
> > > dn wrote:
> > >
> > >> That said, a function which starts with a list of ifs-buts-and-maybes*
> > >> which are only there to ascertain which set of arguments have been
> > >> provided by the calling-routine; obscures the purpose/responsibility
> > >> of the function and decreases its readability (perhaps not by much,
> > >> but varying by situation).
> > >
> > > Agreed.
> > >
> > >> Accordingly, if the function is actually a method, recommend following
> > >> @Stefan's approach, ie multiple-constructors. Although, this too can
> > >> result in lower readability.
> > >
> > > (Having proposed that approach myself (and having used it over the
> > > decades for functions, methods, procedures, constructors, ...), I also
> > > agree.)
> > >
> > > Assuming good names,¹ how can this lead to lower readability?  I guess
> > > if there's too many of them, or programmers have to start wondering
> > > which one to use?  Or is this in the same generally obfuscating category
> > > as the ifs-buts-and-maybes at the start of a function?
> > >
> > > ¹ and properly invalidated caches
> >
> > Allow me to extend the term "readability" to include "comprehension".
> > Then add the statistical expectation that a class has only __init__().

Aha.  In that light, yeah, in general, the more stuff there is, the
harder it is to get your head around it.  And even if I document the
class (or the module), no one makes the time to read (let alone
comprehend) the document, which *should* clarify all those things that
are hard to discern from the code itself.

> > Thus, assuming this is the first time (or, ... for a while) that the
> > class is being employed, one has to read much further to realise that
> > there are choices of constructor.
>
> Yeah. I would generally say, though, that any classmethod should be
> looked at as a potential alternate constructor, or at least an
> alternate way to obtain objects (eg preconstructed objects with
> commonly-used configuration - imagine a SecuritySettings class with a
> classmethod to get different defaults).

I think opening up the class and sifting through its classmethods to
find the factory functions is what dn is talking about.  Such a design
also means that once I have a SecuritySettings object, its (the
instance's) methods include both instance and class level methods.  IMO,
classmethods were/are a bad idea (yes, I'm probably in the minority
around here, but someone has to be).  The first person to scream "but
discoverability" will be severely beaten with a soft cushion.

> > Borrowing from the earlier example:
> >
> > > This would be quite pythonic. For example, "datetime.date"
> > > has .fromtimestamp(timestamp), .fromordinal(ordinal),
> > > .fromisoformat(date_string), ...
> >
> > Please remember that this is only relevant if the function is actually a
> > module - which sense does not appear from the OP (IMHO).

Note that datetime.date is a class, not a module.

> > The alternatives' names are well differentiated and (apparently#)
> > appropriately named*.

[...]

> > Continuing the 'have to read further' criticism (above), it could
> > equally-well be applied to my preference for keyword-arguments, in that
> > I've suggested defining four parameters but the user will only call the
> > function with either three or one argument(s). Could this be described
> > as potentially-confusing?

Potentially.  :-)

In a well designed *library*, common keywords across multiple functions
provide consistency, which is generally good.  Even a bit of redundancy
can be good for the same reason.
OTOH, when there's only one function, and it has a pile of keyword
parameters that can only be used in certain combinations, then it
definitely can be harder to read/understand/use than separate functions
with simpler interfaces.

> Yes, definitely. Personally, I'd split it into two, one that takes the
> existing three arguments (preferably with the same name, for
> compatibility), and one with a different name that takes just the one
> arg. That could be a small wrapper that calls the original, or the
> original could become a wrapper that calls the new one, or the main
> body could be refactored into a helper that they both call. It all
> depends what makes the most sense internally, because that's not part
> of the API at that point.
>
> But it does depend on how the callers operate. Sometimes it's easier
> to have a single function with switchable argument forms, other times
> it's cleaner to separate them.

"Easier" and "cleaner" are very often orthogonal.  ;-)  (Rich Hickey
(creator of Clojure) talks a lot about the difference between "easy" and
"simple."  Arguments for and against Unix often involve similar terms.)

And "easier" or "cleaner" for whom?  The person writing the