On 3 March 2017 at 18:47, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:

> On 03/03/2017 04:36 AM, Nick Coghlan wrote:
>
>> On 2 March 2017 at 21:06, Wolfgang Maier
>> <wolfgang.ma...@biologie.uni-freiburg.de> wrote:
>>
>>> - overall I looked at 114 code blocks that contain one or more breaks
>>
>> Thanks for doing that research :)
>>
>>> Of the remaining 19 non-trivial cases
>>>
>>> - 9 are variations of your classical search idiom above, i.e.,
>>> there's an else clause there and nothing more is needed
>>>
>>> - 6 are variations of your "nested side-effects" form presented
>>> above with debatable (see above) benefit from except break
>>>
>>> - 2 do not use an else clause currently, but have multiple breaks
>>> that do partly redundant things that could be combined in a single
>>> except break clause
>>
>> Those 8 cases could also be reviewed to see whether a flag variable
>> might be clearer than relying on nested side effects or code repetition.
>
> [...]
>
>> This is a case where a flag variable may be easier to read than loop
>> state manipulations:
>>
>>     may_have_common_prefix = True
>>     while may_have_common_prefix:
>>         prefix = None
>>         for item in items:
>>             if not item:
>>                 may_have_common_prefix = False
>>                 break
>>             if prefix is None:
>>                 prefix = item[0]
>>             elif item[0] != prefix:
>>                 may_have_common_prefix = False
>>                 break
>>         else:
>>             # all subitems start with a common "prefix".
>>             # move it out of the branch
>>             for item in items:
>>                 del item[0]
>>             subpatternappend(prefix)
>>
>> Although the whole thing could likely be cleaned up even more via
>> itertools.zip_longest:
>>
>>     for first_uncommon_idx, aligned_entries in enumerate(itertools.zip_longest(*items)):
>>         if not all_true_and_same(aligned_entries):
>>             break
>>     else:
>>         # Everything was common, so clear all entries
>>         first_uncommon_idx = None
>>     for item in items:
>>         del item[:first_uncommon_idx]
>>
>> (Batching the deletes like that may even be slightly faster than
>> deleting common entries one at a time)
>>
>> Given the following helper function:
>>
>>     def all_true_and_same(entries):
>>         itr = iter(entries)
>>         try:
>>             first_entry = next(itr)
>>         except StopIteration:
>>             return False
>>         if not first_entry:
>>             return False
>>         for entry in itr:
>>             if not entry or entry != first_entry:
>>                 return False
>>         return True
>>
>>> - finally, 1 is a complicated break dance to achieve sth that
>>> clearly would have been easier with except break; from typing.py:
>
> [...]
>
>> I think this is another case that is asking for the inner loop to be
>> factored out to a named function, not for reasons of re-use, but for
>> reasons of making the code more readable and self-documenting :)
>
> It's true that using a flag or factoring out redundant code is always a
> possibility. Having the except clause would clearly not let people do
> anything they couldn't have done before.
> On the other hand, the same is true for the else clause - its only
> advantage here is that it's existing already

I forget where it came up, but I seem to recall Guido saying that if he
were designing Python today, he wouldn't include the "else:" clause on
loops, since it inevitably confuses folks the first time they see it.
(Hence articles like mine that attempt to link it with try/except/else
rather than if/else).

> - because a single flag could always distinguish between a break having
> occurred or not:
>
>     brk = False
>     for item in iterable:
>         if some_condition:
>             brk = True
>             break
>     if brk:
>         do_stuff_upon_breaking_out()
>     else:
>         do_alternative_stuff()
>
> is a general pattern that would always work without except *and* else.
>
> However, the fact that else exists generates a regrettable asymmetry in
> that there is direct language support for detecting one outcome, but not
> the other.

It's worth noting that this asymmetry doesn't necessarily exist in the
corresponding C idiom that I assume was the inspiration for the Python
equivalent:

    int data_array_len = sizeof(data_array) / sizeof(data_array[0]);
    int idx = 0;
    for (idx = 0; idx < data_array_len; idx++) {
        if (condition(data_array[idx])) {
            break;
        }
    }
    if (idx < data_array_len) {
        // We found a relevant entry
    } else {
        // We didn't find anything
    }

In Python prior to 2.2 (when PEP 234 added the iterator protocol), a
similar approach could be used for Python's for loops:

    num_items = len(container)
    for idx in range(num_items):
        if condition(container[idx]):
            break
    if num_items and idx < num_items:
        # We found a relevant entry
    else:
        # We didn't find anything

However, while my own experience with Python is mainly with 2.2+ (and
hence largely after the era where "for i in range(len(container)):" was
still common), I've spent a lot of time working with C and the
corresponding iterator protocols in C++, and there it is pretty common to
move the "entry found" code before the break and then invert the
conditional check that appears after the loop:

    int data_array_len = sizeof(data_array) / sizeof(data_array[0]);
    int idx = 0;
    for (idx = 0; idx < data_array_len; idx++) {
        if (condition(data_array[idx])) {
            // We found a relevant entry
            break;
        }
    }
    if (idx >= data_array_len) {
        // We didn't find anything
    }

And it's *this* version of the C/C++ idiom that Python's "else:" clause
replicates.

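As a rough Python sketch of that correspondence (the helper name
find_first and its arguments are made up for illustration and do not
come from the thread), the "entry found" handling sits before the break
and the "else:" clause takes the place of the inverted index check:

```python
def find_first(items, condition):
    """Return the first item for which condition(item) is true, else None.

    Hypothetical helper for illustration only.
    """
    for item in items:
        if condition(item):
            # "We found a relevant entry": handled before the break
            result = item
            break
    else:
        # "We didn't find anything": runs only if the loop never hit break,
        # mirroring the C check `if (idx >= data_array_len)`
        result = None
    return result


def is_even(value):
    return value % 2 == 0


print(find_first([1, 4, 6], is_even))
print(find_first([1, 3, 5], is_even))
```

Since no index is compared after the loop, the same shape should also
work for arbitrary iterables (generators, files) that have no length.
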
One key aspect of this particular idiomatic structure is that it retains
the same overall shape regardless of whether the inner structure is:

    if condition(item):
        # Condition is true, so process the item
        process(item)
        break

or:

    if maybe_process_item(item):
        # Item was processed, so we're done here
        break

Whereas the "post-processing" model can't handle pre-composed helper
functions that implement both the conditional check and the item
processing, and then report back which branch they took.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/