Bugs item #1721372, was opened at 2007-05-18 10:10
Message generated for change (Comment added) made by aisaac0
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1721372&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Documentation
Group: None
Status: Closed
Resolution: Rejected
Priority: 5
Private: No
Submitted By: Alan (aisaac0)
Assigned to: Nobody/Anonymous (nobody)
Summary: emphasize iteration volatility for set

Initial Comment:
For <URL:http://docs.python.org/lib/types-set.html>, append the following new sentence to the 2nd paragraph.

Iteration over a set returns elements in an indeterminate order, which generally depends on factors outside the scope of the containing program.

*Justification:* users should not be expected to understand without being told that iteration order depends on factors outside the scope of the containing program. (Additionally, unlike the documentation for dictionaries, the documentation for sets fails to give a serious warning not to rely on iteration order.)

----------------------------------------------------------------------

>Comment By: Alan (aisaac0)
Date: 2007-05-21 22:27

Message:
Logged In: YES
user_id=1025672
Originator: YES

Note that on c.l.python, Raymond Hettinger justifies this rejection as follows:
"the docs are sufficient when they say that set ordering is arbitrary"

Where exactly do the docs say this? I do not see it. I am looking here: <URL:http://docs.python.org/lib/types-set.html>

I also take this as a concession that the docs *should* say something like this, which is about half of the language I proposed (unless there is some reason why 'arbitrary' is superior to 'indeterminate').

Btw, I did provide the source code to several people before Peter. This was clear in the thread on c.l.python. I do not think they would appreciate being called "not experienced".

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-19 16:29

Message:
Logged In: YES
user_id=21627
Originator: NO

aisaac0, thanks for elaborating. Your remark now convinces me that it was the right thing to reject this change.

It seems that you suggest that experienced users
a) are aware that some objects compare and hash by their id(), and
b) that the id() is the address in memory, and
c) that the id() will influence the order in which objects are iterated, and
d) fail to see that the id() may differ across runs.

Such users are *not* experienced. There are many more reasons why the id of an object may vary across runs. E.g. Linux 2.6 deliberately randomizes memory management, so that identical processes get their objects allocated at different addresses, to defeat security exploits that rely on deterministic addresses of things in main memory (there is a system call to disable this randomization).

Looking at the entire thread, I agree with Carsten Haese's posting: That even experienced users couldn't diagnose this correctly is because they
a) did not receive the source code, and
b) were talked into believing that this has something to do with the random module.

The library reference is a specification, not a tutorial.

----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-19 08:09

Message:
Logged In: YES
user_id=1025672
Originator: YES

The previous comment completely misses the point. Again, please see the discussion on c.l.python. Not one of the participants expected sets to be "ordered". What was surprising to them was that the order can *change* across sequential executions of an **unchanged** source. This is of course *quite* different than expecting that sets are ordered; I am perplexed that anyone would conflate the two.

One cannot credibly argue that anyone who understands that sets are not ordered will not be surprised, since even sophisticated users were as a matter of fact surprised in the c.l.python discussion.
(Until it was explained by Peter of course.) A natural conclusion is that the docs should offer better protection against such surprise, since we have concrete evidence that even sophisticated users can be surprised by this.

In sum, the previous comment conflates two distinct issues and so fails to address the reasons for the proposed docs patch.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-19 01:38

Message:
Logged In: YES
user_id=21627
Originator: NO

The documentation already says "Being an unordered collection, sets do not record element position or order of insertion."

If users read this and fail to understand the notion of an unordered collection, I see no way of "fixing" this.

----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-18 21:28

Message:
Logged In: YES
user_id=1025672
Originator: YES

While I do not mind my language being rejected, *something* should be added to warn users. What the previous comment fails to mention is the number of people on c.l.python, some of whom are quite sophisticated users, who failed to discover the source of indeterminacy. Users should not have to "rediscover" this because of a documentation failure.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2007-05-18 18:08

Message:
Logged In: YES
user_id=80475
Originator: NO

While the OP knows what he means here, the suggested text does not add clarity, it only makes the subject harder to understand and implies that some mysterious, dark force is in place.

Further, the suggested text is simply incorrect. Given deterministic assignment of hash values and a consistent insertion order, the order of keys in a set or dictionary is fully determined.

I've read the source of this suggestion on comp.lang.python and commented there.
The underlying issue had nothing to do with either sets or dicts. The code in question "re-discovered" that the location of objects in memory would vary between runs if the user deleted a pyc file for a module. The OP's script used object ids as hash values, hence the set/dict ordering could vary between runs. This was at odds with his expectation that the ordering would be deterministic.

The moral is that non-deterministic hash values lead to non-deterministic set/dict ordering. The docs for sets and dicts should not be muddled with tangential discussions about implementation specific details regarding what governs where objects are placed in memory.

----------------------------------------------------------------------

Comment By: Alan (aisaac0)
Date: 2007-05-18 13:00

Message:
Logged In: YES
user_id=1025672
Originator: YES

Location in memory. See Peter Otten's discussion at
http://www.thescripts.com/forum/post2552380-16.html

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2007-05-18 12:05

Message:
Logged In: YES
user_id=21627
Originator: NO

What factors outside the containing program influence iteration order? Iteration is completely deterministic, and only depends on the items inserted, and the order in which they were inserted, neither of which is outside the scope of the containing program. It's just that the order is not easily predictable.

----------------------------------------------------------------------
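
The point made in the thread (hash values plus insertion order determine set iteration order, so hashes that vary between runs make the order vary too) can be illustrated with a small sketch. This is not the OP's script, which is not reproduced in the thread. The class `Node`, its `fake_address` attribute, and the helper `iteration_order` are hypothetical names introduced here, and `fake_address` is only a stand-in for an `id()`-based hash so that two "runs" can be simulated inside one process.

```python
class Node:
    """Toy object whose hash is supplied explicitly.

    fake_address is a hypothetical stand-in for id(self), which is what
    the default object hash is derived from in CPython.
    """

    def __init__(self, label, fake_address):
        self.label = label
        self.fake_address = fake_address

    def __hash__(self):
        return self.fake_address

    def __eq__(self, other):
        # Compare by identity, as plain objects do by default.
        return self is other

    def __repr__(self):
        return self.label


def iteration_order(addresses):
    """Insert nodes a, b, c in that fixed order; return the set's iteration order."""
    s = set()
    for label, address in zip("abc", addresses):
        s.add(Node(label, address))
    return [node.label for node in s]


# Same hash values and same insertion order: the order is reproducible.
run_1 = iteration_order([1, 2, 3])
run_1_again = iteration_order([1, 2, 3])

# Same program, same insertion order, but the "addresses" came out
# differently, as they may between two real interpreter runs.
run_2 = iteration_order([3, 2, 1])

print(run_1, run_1_again, run_2)
```

The first two results always match. Whether `run_2` differs from `run_1` is an implementation detail of the interpreter's set type, which is the reason the docs tell users not to rely on the order.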