[issue42937] 192.0.0.8 (IPv4 dummy address) considered globally reachable
Martijn Pieters added the comment:

> Private is a subset of special use. Should a "_special_use" constant be
> created. This would include multicast, link_local, private_use, and a few
> more.

There are already dedicated tests for those other special use networks in ipaddress. 192.0.0.0/24 is the block reserved for "IETF Protocol Assignments", which really means: private use. https://datatracker.ietf.org/doc/html/rfc6890#section-2.2.2 marks the block as "Not usable unless by virtue of a more specific reservation.".

The registry at https://www.iana.org/assignments/iana-ipv4-special-registry/iana-ipv4-special-registry.xhtml lists those specific reservations, and only 2 to date are *globally reachable*, which means they are probably not private:

- 192.0.0.9/32, Port Control Protocol Anycast, RFC 7723
- 192.0.0.10/32, Traversal Using Relays around NAT Anycast, RFC 8155

I strongly feel that _any other IP address in the reserved range_ should be treated as private unless marked, by IANA, as globally reachable, at some future date.

That would require the list of networks for IPv4Address / IPv4Network is_private to include all of 192.0.0.0/24 _minus those two exceptions_; calculating the network masks for these:

>>> from ipaddress import IPv4Network
>>> def exclude_all(network, *excluded):
...     try:
...         for sub in network.address_exclude(excluded[0]):
...             yield from exclude_all(sub, *excluded[1:])
...     except (IndexError, ValueError):
...         yield network
...
>>> iana_reserved = IPv4Network("192.0.0.0/24")
>>> to_remove = IPv4Network("192.0.0.9/32"), IPv4Network("192.0.0.10/32")
>>> for sub in exclude_all(iana_reserved, *to_remove):
...     print(sub)
...
192.0.0.128/25
192.0.0.64/26
192.0.0.32/27
192.0.0.16/28
192.0.0.0/29
192.0.0.12/30
192.0.0.11/32
192.0.0.8/32

The module could trivially do this on import, or we could hard-code the above list.
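For illustration only, the proposed rule ("everything in 192.0.0.0/24 is private except the two globally reachable assignments") can also be written as a direct membership test. This is a minimal sketch, not the ipaddress module's implementation; the names IANA_RESERVED, GLOBALLY_REACHABLE and in_reserved_private_block are hypothetical.

```python
from ipaddress import IPv4Address, IPv4Network

# Block reserved for IETF Protocol Assignments (RFC 6890).
IANA_RESERVED = IPv4Network("192.0.0.0/24")

# The two assignments the comment above lists as globally reachable.
GLOBALLY_REACHABLE = (
    IPv4Network("192.0.0.9/32"),   # Port Control Protocol Anycast, RFC 7723
    IPv4Network("192.0.0.10/32"),  # TURN Anycast, RFC 8155
)


def in_reserved_private_block(address):
    """Hypothetical helper: True if address is in the reserved block
    and is not one of the globally reachable exceptions."""
    addr = IPv4Address(address)
    if addr not in IANA_RESERVED:
        return False
    return not any(addr in net for net in GLOBALLY_REACHABLE)


for candidate in ("192.0.0.8", "192.0.0.9", "192.0.0.10", "192.0.0.11"):
    print(candidate, in_reserved_private_block(candidate))
```

A list of non-overlapping networks, as computed in the session above, has the advantage that it fits the existing "is the address in any of these networks" structure without a separate exception list.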
-- ___ Python tracker <https://bugs.python.org/issue42937> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44167] ipaddress.IPv6Address.is_private makes redundant checks
Change by Martijn Pieters : -- keywords: +patch pull_requests: +24826 stage: -> patch review pull_request: https://github.com/python/cpython/pull/26209 ___ Python tracker <https://bugs.python.org/issue44167> ___
[issue42937] 192.0.0.8 (IPv4 dummy address) considered globally reachable
Martijn Pieters added the comment: Oops, I got my issue numbers mixed up. This is related to #44167, I meant. -- ___ Python tracker <https://bugs.python.org/issue42937> ___
[issue42937] 192.0.0.8 (IPv4 dummy address) considered globally reachable
Martijn Pieters added the comment:

This is related to #42937, the IPv4 private network list is not considering the whole of 192.0.0.0/24 to be private.

RFC 5736 / 6890 reserved 192.0.0.0/24 for special purposes (private networks) and to date a few subnets of that network have received assignments. The ipaddress module should use that whole subnet for any `is_private` test, and not just the subnets of that network that have received specific assignments.

E.g. the list currently contains just 192.0.0.0/29 and 192.0.0.170/31, but as this bug report points out, 192.0.0.8/32 has since been added, as have 192.0.0.9/32 and 192.0.0.10/32.

The IPv6 implementation *does* cover the whole reserved subnet (although it also includes 2 specific registrations, see the aforementioned #42937); it is just IPv4 that is inconsistent and incomplete here.

-- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue42937> ___
[issue44167] ipaddress.IPv6Address.is_private makes redundant checks
New submission from Martijn Pieters :

ipaddress.IPv6Address.is_private uses a hard-coded list of `IPv6Network` objects that cover private networks to test against.

This list contains two networks that are subnets of a 3rd network in the list. IP addresses that are not private are tested against all 3 networks where only a single test is needed.

The networks in question are:

    IPv6Network('2001::/23'),
    IPv6Network('2001:2::/48'),  # within 2001::/23
    ...
    IPv6Network('2001:10::/28'),  # within 2001::/23

The first is a supernet of the other two, so any IP address that is tested against the first and is not part of that network, will also not be part of the other two networks:

>>> from ipaddress import IPv6Network
>>> sub_tla_id = IPv6Network('2001::/23')
>>> sub_tla_id.supernet_of(IPv6Network('2001:2::/48'))
True
>>> sub_tla_id.supernet_of(IPv6Network('2001:10::/28'))
True

We can safely drop these two network entries from the list.

On a separate note: the definition here is inconsistent with IPv4Address's list of private networks. 2001::/23 is the whole subnet reserved for special purpose addresses (RFC 2928), regardless of what ranges have actually been assigned. The IPv4 list on the other hand only contains _actual assignments within the reserved subnet_, not the whole reserved block (RFC 5736 / RFC 6890, reserving 192.0.0.0/24, IPv4Address only considers 192.0.0.0/29 and 192.0.0.170/31). I'll file a separate issue for that if not already reported.

-- components: Library (Lib) messages: 393860 nosy: mjpieters priority: normal severity: normal status: open title: ipaddress.IPv6Address.is_private makes redundant checks type: performance versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue44167> ___
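The same redundancy check can be generalised to any list of networks. The following is a rough sketch under the assumption that the list is small enough for a quadratic scan; redundant_networks and the candidates list are hypothetical names used only for this example (fc00::/7 is included merely as an unrelated entry).

```python
from ipaddress import IPv6Network


def redundant_networks(networks):
    """Return the networks that are fully covered by another entry."""
    nets = list(networks)
    return [
        net
        for net in nets
        # subnet_of() is also True for equal networks, so skip identical entries
        if any(net != other and net.subnet_of(other) for other in nets)
    ]


candidates = [
    IPv6Network("2001::/23"),
    IPv6Network("2001:2::/48"),
    IPv6Network("2001:10::/28"),
    IPv6Network("fc00::/7"),
]
redundant = redundant_networks(candidates)
print(redundant)
```

Any network reported here can be dropped from a membership list without changing which addresses match.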
[issue30256] Adding a SyncManager Queue proxy to a SyncManager dict or Namespace proxy raises an exception
Martijn Pieters added the comment: Might it be better to just *drop* the AutoProxy object altogether? All that it adds is a delayed call to MakeProxyType(f"AutoProxy[{typeid}]", exposed) (with exposed defaulting to public_methods(instance)), per instance per process. It could be replaced by a direct call to `MakeProxyType()`, using `public_methods` directly on the registered type. This wouldn't work for callables that are not classes or where instances add functions to the instance dict, but for those rare cases you can pass in the `exposed` argument. The advantage is that it would simplify the codebase; no more need to special-case the BaseProxy.__reduce__ method, removing the get_methods() method on the Server class, etc. Less surface for this class of bugs to happen in the future. -- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue30256> ___
[issue40647] Building with a libreadline.so located outside the ld.so.conf search path fails
Martijn Pieters added the comment: Last but not least, this is essentially a duplicate of https://bugs.python.org/issue4010 -- ___ Python tracker <https://bugs.python.org/issue40647> ___
[issue40647] Building with a libreadline.so located outside the ld.so.conf search path fails
Martijn Pieters added the comment: Actually, this won't do it either, as `self.lib_dirs` already contains the --prefix. Clearly, I just need to add -R=${PREFIX}/lib to CPPFLAGS. -- resolution: -> not a bug ___ Python tracker <https://bugs.python.org/issue40647> ___
[issue40647] Building with a libreadline.so located outside the ld.so.conf search path fails
New submission from Martijn Pieters :

This issue goes back a long time. The libreadline handling in the modules setup.py doesn't add the location of the readline library to the runtime library paths:

    self.add(Extension('readline', ['readline.c'],
                       library_dirs=['/usr/lib/termcap'],
                       extra_link_args=readline_extra_link_args,
                       libraries=readline_libs))

This requires the readline library to have been installed in a traditional location, or the user to have taken care of either ld.so.conf or LD_LIBRARY_PATH.

I'm building a series of Python binaries with a custom `--prefix` where I also installed a local copy of readline (so both are configured with the same prefix), and while setup.py finds the correct library, importing the compiled result fails because no `RPATH` is set.

This could be fixed by adding the parent path of the located `libreadline` shared library as a `runtime_library_dirs` entry:

    readline_libdirs = None
    if do_readline not in self.lib_dirs:
        readline_libdirs = [
            os.path.abspath(os.path.dirname(do_readline))
        ]

    self.add(Extension('readline', ['readline.c'],
                       library_dirs=['/usr/lib/termcap'],
                       extra_link_args=readline_extra_link_args,
                       runtime_library_dirs=readline_libdirs,
                       libraries=readline_libs))

-- components: Extension Modules messages: 369054 nosy: mjpieters priority: normal severity: normal status: open title: Building with a libreadline.so located outside the ld.so.conf search path fails type: compile error versions: Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue40647> ___
[issue36077] Inheritance dataclasses fields and default init statement
Martijn Pieters added the comment:

I've supported people hitting this issue before (see https://stackoverflow.com/a/53085935/100297, where I used a series of mixin classes to make use of the changing MRO when the mixins share base classes, to enforce a field order from inherited classes).

I'd be very much in favour of dataclasses using the attrs approach to field order: any field named in a base class *moves to the end*, so you can 'insert' your own fields by repeating parent fields that need to come later:

    import attr

    @attr.s(auto_attribs=True)
    class Parent:
        foo: str
        bar: int
        baz: bool = False

    @attr.s(auto_attribs=True)
    class Child(Parent):
        spam: str
        baz: bool = False

The above gives you a `Child(foo: str, bar: int, spam: str, baz: bool = False)` object; note that `baz` moved to the end of the arguments.

`dataclasses` currently doesn't do this, so it'd be a breaking change.

-- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue36077> ___
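For comparison, here is a small sketch of what the standard library `dataclasses` module does with the same shape of classes. It only uses documented dataclasses behaviour; the class names mirror the attrs example, and define_child_without_default is a hypothetical helper added so the failure can be shown without aborting the script.

```python
from dataclasses import dataclass, fields


@dataclass
class Parent:
    foo: str
    bar: int
    baz: bool = False


@dataclass
class Child(Parent):
    baz: bool = True     # redefined field keeps its inherited position
    spam: str = "eggs"   # needs a default, because baz (with default) precedes it


field_order = [f.name for f in fields(Child)]
print(field_order)


def define_child_without_default():
    """Try the attrs-style layout; dataclasses rejects it at class creation."""
    @dataclass
    class BrokenChild(Parent):
        spam: str
        baz: bool = False
    return BrokenChild


try:
    define_child_without_default()
except TypeError as exc:
    print("TypeError:", exc)
```

Because the redefined `baz` stays where the parent declared it, a new field without a default cannot follow it, which is the situation the comment above describes.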
[issue35278] [security] directory traversal in tempfile prefix
Martijn Pieters added the comment: I found this issue after helping someone solve a Stack Overflow question at https://stackoverflow.com/q/58767241/100297; they eventually figured out that their prefix was a path, not a path element. I'd be all in favour of making tempfile._sanitize_params either reject a prefix or suffix with `os.sep` or `os.altsep` characters, or just take the last element of os.path.split(). -- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue35278> ___
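The two options mentioned above could look roughly like this. This is only a sketch of the idea, not a patch to tempfile; reject_path_separators and last_path_element are hypothetical helper names, and nothing here touches tempfile._sanitize_params itself.

```python
import os


def reject_path_separators(affix):
    """Option 1: refuse a prefix or suffix that contains a path separator."""
    # os.altsep is None on platforms without an alternative separator
    separators = [sep for sep in (os.sep, os.altsep) if sep]
    if any(sep in affix for sep in separators):
        raise ValueError(f"prefix/suffix must not contain path separators: {affix!r}")
    return affix


def last_path_element(affix):
    """Option 2: silently keep only the final path element."""
    return os.path.split(affix)[1]


print(reject_path_separators("tmp_"))
print(last_path_element(os.path.join("some", "dir", "tmp_")))
```

Rejecting is the stricter choice and surfaces the mistake to the caller; taking the last element is more forgiving but changes where the file ends up compared to what the caller may have expected.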
[issue38364] inspect.iscoroutinefunction / isgeneratorfunction / isasyncgenfunction can't handle partialmethod objects
Change by Martijn Pieters : -- keywords: +patch pull_requests: +16188 stage: -> patch review pull_request: https://github.com/python/cpython/pull/16600 ___ Python tracker <https://bugs.python.org/issue38364> ___
[issue38374] Remove weakref.ReferenceError entry from documentation
New submission from Martijn Pieters : The weakref documentation still mentions weakref.ReferenceError: https://docs.python.org/3/library/weakref.html#weakref.ReferenceError But this alias for the built-in ReferenceError exception was removed in the 3.0 development cycle (https://github.com/python/cpython/commit/2633c69fae7e413b2b64b01d8c0c901ae649a225#diff-b7975e9ef5a6be5f64e9bb391de03057), the last version where `weakref.ReferenceError` still exists is Python 2.7. Please remove it, it's just confusing now. -- assignee: docs@python components: Documentation messages: 353977 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Remove weakref.ReferenceError entry from documentation versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue38374> ___
[issue38364] inspect.iscoroutinefunction / isgeneratorfunction / isasyncgenfunction can't handle partialmethod objects
New submission from Martijn Pieters :

This is a follow-up to #33261, which added general support for detecting generator / coroutine / async generator functions wrapped in partials. It appears that partialmethod objects were missed out.

While a partialmethod object will produce a functools.partial() object on binding to an instance, the .func attribute of that partial is a bound method, not a function, and the current _has_code_flag implementation unwraps methods *before* it unwraps partials.

Next, binding to a class produces a partialmethod._make_unbound_method.<locals>._method wrapper function. _unwrap_partial can't unwrap this, as it doesn't handle this case; it could look for the `_partialmethod` attribute and follow that to find the `.func` attribute.

Test case:

    import inspect
    import functools

    class Foo:
        async def bar(self, a): return a
        ham = functools.partialmethod(bar, "spam")

    print(inspect.iscoroutinefunction(Foo.bar))  # True
    print(inspect.iscoroutinefunction(Foo.ham))  # False
    instance = Foo()
    print(inspect.iscoroutinefunction(instance.bar))  # True
    print(inspect.iscoroutinefunction(instance.ham))  # False

-- components: Library (Lib) messages: 353849 nosy: mjpieters priority: normal severity: normal status: open title: inspect.iscoroutinefunction / isgeneratorfunction / isasyncgenfunction can't handle partialmethod objects type: behavior versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue38364> ___
[issue37470] Make it explicit what happens when using a bounded queue with QueueHandler
New submission from Martijn Pieters : The documentation doesn't make it explicit what happens if you use a bounded queue together with logging.handlers.QueueHandler. If the queue is bounded in size and attempts are made to add logrecords faster than a queue listener removes them, then the resulting `queue.Full` exception is passed to `handler.handleError()` and that usually means the record is simply dropped (see https://docs.python.org/3/library/logging.html#logging.Handler.handleError). That may be the desired behaviour, but making it explicit is always better. -- assignee: docs@python components: Documentation messages: 347018 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Make it explicit what happens when using a bounded queue with QueueHandler versions: Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue37470> ___
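The behaviour described above can be observed with a small experiment. This is a sketch only: CountingQueueHandler and the logger name are hypothetical, and no listener is attached, so the bounded queue fills up after the first record.

```python
import logging
import logging.handlers
import queue


class CountingQueueHandler(logging.handlers.QueueHandler):
    """QueueHandler that counts records routed to handleError()."""

    def __init__(self, q):
        super().__init__(q)
        self.dropped = 0

    def handleError(self, record):
        # Called by emit() when enqueueing raises, e.g. queue.Full
        self.dropped += 1


bounded = queue.Queue(maxsize=1)
handler = CountingQueueHandler(bounded)

logger = logging.getLogger("bounded-queue-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

# Nothing consumes the queue, so only the first record fits.
for i in range(3):
    logger.info("record %d", i)

print("queued:", bounded.qsize(), "dropped:", handler.dropped)
```

With the default handleError() the extra records would be reported on stderr (when logging.raiseExceptions is true) and then discarded, which is why the caller never sees an exception.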
[issue37469] Make it explicit that logging QueueHandler / QueueListener accepts a SimpleQueue.
New submission from Martijn Pieters : The implementation of logging.handlers.QueueHandler and logging.handlers.QueueListener does not make use of the task tracking API of queues (queue.task_done(), queue.join()), nor does it check whether the queue is bounded (queue.full(), catching the Full exception). As such, it can work just as well with the new queue.SimpleQueue implementation (new in 3.7, see https://docs.python.org/3/library/queue.html#queue.SimpleQueue), which is fast and lightweight, implemented in C. Can the documentation be updated to make this option explicit? -- assignee: docs@python components: Documentation messages: 347017 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Make it explicit that logging QueueHandler / QueueListener accepts a SimpleQueue. versions: Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue37469> ___
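A minimal sketch of the combination being asked about, assuming Python 3.7 or newer. ListHandler and the logger name are hypothetical and exist only so the example has something to observe.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Collects formatted messages in a list, for demonstration only."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


log_queue = queue.SimpleQueue()   # no task_done() / join(), always unbounded
collector = ListHandler()
listener = logging.handlers.QueueListener(log_queue, collector)

logger = logging.getLogger("simplequeue-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))

listener.start()
logger.info("hello from %s", "SimpleQueue")
listener.stop()   # waits for the listener thread to drain the queue

print(collector.messages)
```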
[issue12169] Factor out common code for d2 commands register, upload and upload_docs
Change by Martijn Pieters : -- pull_requests: +12168 ___ Python tracker <https://bugs.python.org/issue12169> ___
[issue12169] Factor out common code for d2 commands register, upload and upload_docs
Change by Martijn Pieters : -- pull_requests: -12166 ___ Python tracker <https://bugs.python.org/issue12169> ___
[issue36188] Remove vestiges of Python 2 unbound methods from Python 3
Change by Martijn Pieters : -- keywords: +patch pull_requests: +12167 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue36188> ___
[issue36188] Remove vestiges of Python 2 unbound methods from Python 3
New submission from Martijn Pieters :

The implementation of method_hash, method_call and method_descr_get all still contain assumptions that __self__ can be set to None, a holdover from Python 2 where methods could be *unbound*.

These vestiges can safely be removed, because method_new() and PyMethod_New() both ensure that self is always non-null.

In addition, the datamodel description of methods includes this section:

    When a user-defined method object is created by retrieving another method
    object from a class or instance, the behaviour is the same as for a
    function object, except that the :attr:`__func__` attribute of the new
    instance is not the original method object but its :attr:`__func__`
    attribute.

which also only applies to Python 2 unbound methods. Python 3 bound methods never change what they are bound to, let alone produce a new method object from __get__ that has to be careful about what __func__ is set to.

I'm submitting a PR that removes these vestiges; there is no need to maintain code that never runs.

-- components: Interpreter Core messages: 337142 nosy: mjpieters priority: normal severity: normal status: open title: Remove vestiges of Python 2 unbound methods from Python 3 versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue36188> ___
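The claim that Python 3 bound methods never rebind can be checked from Python code. This is an illustrative sketch on CPython 3; the Greeter class is hypothetical.

```python
class Greeter:
    def greet(self):
        return "hello"


first, second = Greeter(), Greeter()

# Accessing the attribute on the class gives a plain function in Python 3;
# there is no unbound method object any more.
print(type(Greeter.greet).__name__)

# A bound method's __get__ hands back the same object, still bound to `first`.
bound = first.greet
rebound = bound.__get__(second, Greeter)
print(rebound is bound, rebound.__self__ is first)
```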
[issue12169] Factor out common code for d2 commands register, upload and upload_docs
Change by Martijn Pieters : -- pull_requests: +12166 ___ Python tracker <https://bugs.python.org/issue12169> ___
[issue36002] configure --enable-optimizations with clang fails to detect llvm-profdata
New submission from Martijn Pieters :

This is probably an automake bug.

When running CC=clang CXX=clang++ ./configure --enable-optimizations, configure tests for a non-existing -llvm-profdata binary:

    checking for --enable-optimizations... yes
    checking for --with-lto... no
    checking for -llvm-profdata... no
    configure: error: llvm-profdata is required for a --enable-optimizations build but could not be found.

The generated configure script looks for "$target_alias-llvm-profdata", and $target_alias is an empty string.

This problem is not visible on Macs, where additional checks for "/usr/bin/xcrun -find llvm-profdata" locate the binary.

The work-around would be to specify a target when configuring.

-- components: Build messages: 335610 nosy: mjpieters priority: normal severity: normal status: open title: configure --enable-optimizations with clang fails to detect llvm-profdata versions: Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue36002> ___
[issue35805] email package folds msg-id identifiers using RFC2047 encoded words where it must not
New submission from Martijn Pieters :

When encountering identifier headers such as Message-ID containing a msg-id token longer than 77 characters (including the <...> angle brackets), the email package folds that header using RFC 2047 encoded words, e.g.

    Message-ID: <154810422972.4.16142961424846318...@aaf39fce-569e-473a-9453-6862595bd8da.prvt.dyno.rt.heroku.com>

becomes

    Message-ID: =?utf-8?q?=3C154810422972=2E4=2E16142961424846318784=40aaf39fce-?=
     =?utf-8?q?569e-473a-9453-6862595bd8da=2Eprvt=2Edyno=2Ert=2Eheroku=2Ecom=3E?=

The msg-id token here is this long because Heroku Dyno machines use a UUID in the FQDN, but Heroku is hardly the only source of such long msg-id tokens. Microsoft's Outlook.com / Office365 email servers balk at the RFC2047 encoded word use here and attempt to wrap the email in a TNEF winmail.dat attachment, then may fail at this under some conditions that I haven't quite worked out yet and deliver an error message to the recipient with the helpful message "554 5.6.0 Corrupt message content", or just deliver the ever unhelpful winmail.dat attachment to the unsuspecting recipient (I'm only noting these symptoms here for future searches).

I encountered this issue with long Message-ID values generated by email.utils.make_msgid(), but this applies to all RFC 5322 section 3.6.4 Identification Fields headers, as well as the corresponding headers from RFC 822 section 4.6 (covered by section 4.5.4 in 5322).

What is happening here is that the email._header_value_parser module has no handling for the msg-id tokens *at all*, and email.headerregistry has no dedicated header class for identifier headers. So these headers are parsed as unstructured, and folded at will.

RFC2047 section 5 on the other hand states that the msg-id token is strictly off-limits, and no RFC2047 encoding should be used to encode such elements.

Because headers *can* exceed 78 characters (RFC 5322 section 2.1.1 states that "Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters[.]") I think that RFC5322 msg-id tokens should simply not be folded, at all. The obsoleted RFC822 syntax for msg-id makes them equal to the addr-spec token, where the local-part (before the @) contains word tokens; those would be fair game, but then at least apply the RFC2047 encoded word replacement only to those word tokens.

For now, I worked around the issue by using a custom policy that uses 998 as the maximum line length for identifier headers:

    from email.policy import EmailPolicy

    # Headers that contain msg-id values, RFC5322
    MSG_ID_HEADERS = {'message-id', 'in-reply-to', 'references', 'resent-msg-id'}

    class MsgIdExcemptPolicy(EmailPolicy):
        def _fold(self, name, value, *args, **kwargs):
            if name.lower() in MSG_ID_HEADERS and self.max_line_length - len(name) - 2 < len(value):
                # RFC 5322, section 2.1.1: "Each line of characters MUST be no
                # more than 998 characters, and SHOULD be no more than 78
                # characters, excluding the CRLF.". To avoid msg-id tokens from being folded
                # by means of RFC2047, fold identifier lines to the max length instead.
                return self.clone(max_line_length=998)._fold(name, value, *args, **kwargs)
            return super()._fold(name, value, *args, **kwargs)

This ignores the fact that In-Reply-To and References contain foldable whitespace in between each msg-id, but it at least let us send email through smtp.office365.com again without confusing recipients.
-- components: email messages: 334210 nosy: barry, mjpieters, r.david.murray priority: normal severity: normal status: open title: email package folds msg-id identifiers using RFC2047 encoded words where it must not versions: Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue35805> ___
[issue35654] Remove 'guarantee' that sorting only relies on __lt__ from sorting howto
Martijn Pieters added the comment: (I have no opinion on this having to be a language feature however) -- ___ Python tracker <https://bugs.python.org/issue35654> ___
[issue35654] Remove 'guarantee' that sorting only relies on __lt__ from sorting howto
Martijn Pieters added the comment: Well, if this is indeed by design (and I missed the list.sort() reference) then I agree the HOWTO should not be changed! I'd be happy to change this to asking for more explicit mentions in the docs for sorted, heapq and bisect that using only < (__lt__) is a deliberate choice. -- ___ Python tracker <https://bugs.python.org/issue35654> ___
[issue35654] Remove 'guarantee' that sorting only relies on __lt__ from sorting howto
New submission from Martijn Pieters :

Currently, the sorting HOWTO at https://docs.python.org/3/howto/sorting.html#odd-and-ends contains the text:

> The sort routines are guaranteed to use __lt__() when making comparisons
> between two objects. So, it is easy to add a standard sort order to a class
> by defining an __lt__() method

Nowhere else in the Python documentation is this guarantee made, however. That sort currently uses __lt__ only is, in my opinion, an implementation detail.

The above advice also goes against the advice PEP 8 gives:

> When implementing ordering operations with rich comparisons, it is best to
> implement all six operations (__eq__, __ne__, __lt__, __le__, __gt__, __ge__)
> rather than relying on other code to only exercise a particular comparison.
>
> To minimize the effort involved, the functools.total_ordering() decorator
> provides a tool to generate missing comparison methods.

The 'guarantee' seems to have been copied verbatim from the Wiki version of the HOWTO in https://github.com/python/cpython/commit/0fe095e87f727f4a19b6cbfd718d51935a888740, where that part of the Wiki page was added by an anonymous user in revision 44 to the page: https://wiki.python.org/moin/HowTo/Sorting?action=diff&rev1=43&rev2=44

Can this be removed from the HOWTO?

-- assignee: docs@python components: Documentation messages: 332949 nosy: docs@python, mjpieters, rhettinger priority: normal severity: normal status: open title: Remove 'guarantee' that sorting only relies on __lt__ from sorting howto versions: Python 3.6 ___ Python tracker <https://bugs.python.org/issue35654> ___
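To make the PEP 8 advice quoted above concrete, here is a short sketch using functools.total_ordering. The Version class is a hypothetical example; only `__eq__` and `__lt__` are written by hand, and the decorator supplies the remaining comparisons.

```python
from functools import total_ordering


@total_ordering
class Version:
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"


releases = [Version(3, 8), Version(2, 7), Version(3, 6)]
ordered = sorted(releases)
print(ordered)
print(Version(3, 8) >= Version(3, 6))   # provided by total_ordering
```

Code written this way sorts correctly whether or not the sort routine restricts itself to `__lt__`, which is the point of the PEP 8 recommendation.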
[issue35547] email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
Martijn Pieters added the comment:

While RFC2047 clearly states that an encoder MUST not split multi-byte encodings in the middle of a character (section 5, "Each 'encoded-word' MUST represent an integral number of characters. A multi-octet character may not be split across adjacent 'encoded-word's."), it also states that to fit length restrictions, CRLF SPACE is used as a delimiter between encoded words (section 2, "If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used."). In section 6.2 it states

    When displaying a particular header field that contains multiple
    'encoded-word's, any 'linear-white-space' that separates a pair of
    adjacent 'encoded-word's is ignored. (This is to allow the use of
    multiple 'encoded-word's to represent long strings of unencoded text,
    without having to separate 'encoded-word's where spaces occur in the
    unencoded text.)

(linear-white-space is the RFC822 term for foldable whitespace).

The parser is leaving spaces between two encoded-word tokens in place, where it must remove them instead. And it is doing so correctly for unstructured headers, just not in get_bare_quoted_string, get_atom and get_dot_atom.

Then there is Postel's law (*be liberal in what you accept from others*), and the email package already applies that principle to RFC2047 elsewhere; RFC2047 also states that "An 'encoded-word' MUST NOT appear within a 'quoted-string'." yet email._header_value_parser's handling of quoted-string will process EW sections.
-- title: email.parser / email.policy does correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers -> email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers ___ Python tracker <https://bugs.python.org/issue35547> ___
[issue35547] email.parser / email.policy does correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
Martijn Pieters added the comment:

That regex is incorrect, I should not post untested code from a mobile phone. Corrected workaround with more context:

    import re
    from email.policy import EmailPolicy

    class UnfoldingEncodedStringHeaderPolicy(EmailPolicy):
        def header_fetch_parse(self, name, value):
            # remove any leading whitespace from header lines
            # that separates apparent encoded-word token before further processing
            # using somewhat crude CRLF-FWS-between-encoded-word matching
            value = re.sub(r'(?<=\?=)((?:\r\n|[\r\n])[\t ]+)(?==\?)', '', value)
            return super().header_fetch_parse(name, value)

-- ___ Python tracker <https://bugs.python.org/issue35547> ___
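To see what the corrected regex does in isolation, here is a small sketch that applies it to raw header values. EW_FOLD and join_folded_encoded_words are hypothetical names; the pattern is the one from the workaround above, and the sample values are made up for the demonstration.

```python
import re

# Folding whitespace that sits between the end of one encoded-word ("?=")
# and the start of the next ("=?").
EW_FOLD = re.compile(r'(?<=\?=)((?:\r\n|[\r\n])[\t ]+)(?==\?)')


def join_folded_encoded_words(value):
    """Remove line folds that only separate adjacent encoded-words."""
    return EW_FOLD.sub('', value)


folded = "=?utf-8?b?4ZuX?=\r\n =?utf-8?b?4Zqr?="
joined = join_folded_encoded_words(folded)
print(repr(joined))

# Ordinary folded text is left alone, because it is not between encoded-words.
plain = "first line\r\n second line"
print(repr(join_folded_encoded_words(plain)))
```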
[issue35547] email.parser / email.policy does correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
Martijn Pieters added the comment:

Right, re-educating myself on the MIME RFCs, and found https://bugs.python.org/issue1372770 where the same issue is being discussed for previous incarnations of the email library.

Removing the FWS after CRLF is the wrong thing to do, **unless** it is separating RFC2047 encoded-word tokens. The work-around regex is a bit more complicated, but ideally the EW handling should use a specialist FWS token to delimit encoded-word sections that renders to '', as is done in unstructured headers, but everywhere. Because in practice, there are email clients out there that use EW in structured headers, regardless.

Regex to work around this:

    # crude CRLF-FWS-between-encoded-word matching
    value = re.sub(r'(?<=\?=(\r\n|\n|\r))([\t ]+)(?==\?)', '', value)

-- ___ Python tracker <https://bugs.python.org/issue35547> ___
[issue35547] email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
Change by Martijn Pieters : -- components: +email nosy: +barry, r.david.murray type: -> behavior versions: +Python 3.7 ___ Python tracker <https://bugs.python.org/issue35547> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue35547] email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers
New submission from Martijn Pieters :

The From header in the following email headers is not correctly decoded; both the Subject and From headers contain UTF-8 data encoded with RFC2047 encoded-words, and in both cases a multi-byte UTF-8 codepoint has been split between the two encoded-word tokens:

    >>> msgdata = '''\
    From: =?utf-8?b?4ZuX4Zqr4Zqx4ZuP4ZuB4ZuD4Zq+4ZuI4ZuB4ZuW4ZuP4ZuW4Zo=?=
     =?utf-8?b?seGbiw==?=
    Subject: =?utf-8?b?c8qHdcSxb2THnXBvyZQgOC3ihLLiiqXiiKkgx53Kh8qOcS3E?=
     =?utf-8?b?scqHyoNuya8gyaXKh8Sxyo0gx53Gg8mQc3PHncmvIMqHc8edyocgybnHncaDdW/Kgw==?=
    '''
    >>> from io import StringIO
    >>> from email.parser import Parser
    >>> from email import policy
    >>> msg = Parser(policy=policy.default).parse(StringIO(msgdata))
    >>> print(msg['Subject'])  # correct
    sʇuıodǝpoɔ 8-Ⅎ⊥∩ ǝʇʎq-ıʇʃnɯ ɥʇıʍ ǝƃɐssǝɯ ʇsǝʇ ɹǝƃuoʃ
    >>> print(msg['From'])  # incorrect
    ᛗᚫᚱᛏᛁᛃᚾᛈᛁᛖᛏᛖ� �ᛋ

Note the two FFFD placeholders in the From line.

The issue is that the raw values of the From and Subject headers contain the folding space at the start of the continuation lines:

    >>> for name, value in msg.raw_items():
    ...     if name in {'Subject', 'From'}:
    ...         print(name, repr(value))
    ...
    From '=?utf-8?b?4ZuX4Zqr4Zqx4ZuP4ZuB4ZuD4Zq+4ZuI4ZuB4ZuW4ZuP4ZuW4Zo=?=\n =?utf-8?b?seGbiw==?= '
    Subject '=?utf-8?b?c8qHdcSxb2THnXBvyZQgOC3ihLLiiqXiiKkgx53Kh8qOcS3E?=\n =?utf-8?b?scqHyoNuya8gyaXKh8Sxyo0gx53Gg8mQc3PHncmvIMqHc8edyocgybnHncaDdW/Kgw==?='

For the Subject header, _header_value_parser.get_unstructured is used, which *expects* there to be spaces between encoded words; it inserts EWWhiteSpaceTerminal tokens in between, which are turned into empty strings. But for the From header, the AddressHeader parser does not; the space at the start of the line is retained, and the surrogate escapes at the end of one encoded-word and the start of the next encoded-word never adjoin, so the later handling of turning surrogates back into proper data fails.

Since unstructured header parsing doesn't mind if a space is missing between encoded-word atoms, the work-around is to explicitly remove the space at the start of every line; this can be done in a custom policy:

    import re
    from email.policy import EmailPolicy

    class UnfoldingHeaderEmailPolicy(EmailPolicy):
        def header_fetch_parse(self, name, value):
            # remove any leading whitespace from header lines
            # before further processing
            value = re.sub(r'(?<=[\n\r])([\t ])', '', value)
            return super().header_fetch_parse(name, value)

    custom_policy = UnfoldingHeaderEmailPolicy()

after which the From header comes out without placeholders:

    >>> msg = Parser(policy=custom_policy).parse(StringIO(msgdata))
    >>> msg['from']
    'ᛗᚫᚱᛏᛁᛃᚾᛈᛁᛖᛏᛖᚱᛋ '
    >>> msg['subject']
    'sʇuıodǝpoɔ 8-Ⅎ⊥∩ ǝʇʎq-ıʇʃnɯ ɥʇıʍ ǝƃɐssǝɯ ʇsǝʇ ɹǝƃuoʃ'

This issue was found by way of https://stackoverflow.com/q/53868584/100297

-- messages: 332243 nosy: mjpieters priority: normal severity: normal status: open title: email.parser / email.policy does not correctly handle multiple RFC2047 encoded-word tokens across RFC5322 folded headers ___ Python tracker <https://bugs.python.org/issue35547> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue35366] Monkey Patching class derived from ctypes.Union doesn't work
Martijn Pieters added the comment: This is a repeat of old-tracker issue 1700288, see https://github.com/python/cpython/commit/08ccf202e606a08f4ef85df9a9c0d07e1ba1#diff-998bfefaefe2ab83d5f523e18f158fa4, which fixed this for StructType_setattro but failed to do the same for UnionType_setattro -- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue35366> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26832] ProactorEventLoop doesn't support stdin/stdout nor files with connect_read_pipe/connect_write_pipe
Martijn Pieters added the comment: I'm trying to figure out why Windows won't let us do this. I think the reason is that sys.std(in|out) filehandles are not opened as pipes, and do not have the required OVERLAPPED flag set (see the CreateIoCompletionPort documentation at https://docs.microsoft.com/en-us/windows/desktop/fileio/createiocompletionport); it's that function that is used to handle pipes (via IocpProactor.recv -> IocpProactor._register_with_iocp -> overlapped.CreateIoCompletionPort). The solution then would be to create a pipe for a stdio filehandle with the flag set. And that's where my Windows-fu ends, and where I lack the VM and motivation to go try that out. -- nosy: +mjpieters ___ Python tracker <https://bugs.python.org/issue26832> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33567] Explicitly mention bytes and other buffers in the documentation for float()
New submission from Martijn Pieters : float(bytesobject) treats the contents of the bytesobject as a sequence of ASCII characters, and converts those to a float value as if you used float(bytesobject.decode('ASCII')). The same support is extended to other objects implementing the buffer protocol. The documentation, however, doesn't mention this: > Return a floating point number constructed from a number or string x. Everywhere else in the functions documentation, "string" refers to an object of type `str`. Please make it explicit that `bytes` is also accepted, as the int() documentation does. -- messages: 317022 nosy: mjpieters priority: normal severity: normal status: open title: Explicitly mention bytes and other buffers in the documentation for float() ___ Python tracker <https://bugs.python.org/issue33567> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
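The behaviour described in the report above can be checked with a few lines. This is only an illustrative sketch; the variable names are arbitrary and the sample values are not taken from the report.

```python
# float() parses the ASCII text held in a bytes-like object, much as it
# would parse the equivalent str.
from_bytes = float(b"3.25")
from_bytearray = float(bytearray(b"1e3"))

# The explicit-decoding spelling that the report compares it to.
from_str = float(b"3.25".decode("ASCII"))

print(from_bytes, from_bytearray, from_str)
```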
[issue33516] unittest.mock: Add __round__ to supported magicmock methods
New submission from Martijn Pieters : I notice that __trunc__, __floor__ and __ceil__ are supported methods for MagicMock, but __round__ (in the same grouping of numeric types emulation methods, see https://docs.python.org/3/reference/datamodel.html#object.__round__), is not. Please add this to the mapping too. -- components: Library (Lib) messages: 316641 nosy: mjpieters priority: normal severity: normal status: open title: unittest.mock: Add __round__ to supported magicmock methods type: enhancement ___ Python tracker <https://bugs.python.org/issue33516> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33492] Updating the Evaluation order section to cover *expression in calls
New submission from Martijn Pieters : Can the *Evaluation order* (_evalorder) section in reference/expressions.rst please be updated to cover this exception in a *call* primary (quoting from the _calls section): A consequence of this is that although the ``*expression`` syntax may appear *after* explicit keyword arguments, it is processed *before* the keyword arguments (and any ``**expression`` arguments -- see below). So:: This exception to the normal expression evaluation order is rather buried in the _calls section only. -- assignee: docs@python components: Documentation messages: 316494 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Updating the Evaluation order section to cover *expression in calls versions: Python 3.5, Python 3.6, Python 3.7, Python 3.8 ___ Python tracker <https://bugs.python.org/issue33492> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
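The exception described in the request above can be made visible by recording when each argument expression runs. The sketch below is an illustration only: `note` and `f` are hypothetical helpers, and the order shown reflects CPython's behaviour as described in the quoted _calls text.

```python
order = []


def note(label, value):
    """Record when an argument expression is evaluated, then return its value."""
    order.append(label)
    return value


def f(a, b):
    return a, b


# The keyword argument is written first and the *expression second,
# yet the *expression is evaluated first.
result = f(b=note("keyword", 1), *note("star", (2,)))

print(order)   # ['star', 'keyword'] on CPython
print(result)  # (2, 1)
```

Reading the call left to right would suggest `keyword` is recorded first, which is why a mention in the Evaluation order section would help.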
[issue32963] Python 2.7 tutorial claims source code is UTF-8 encoded
Martijn Pieters added the comment: Thanks for the quick fix, sorry I didn't have a PR for this one! -- ___ Python tracker <https://bugs.python.org/issue32963> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32963] Python 2.7 tutorial claims source code is UTF-8 encoded
Change by Martijn Pieters : -- nosy: +Mariatta, rhettinger ___ Python tracker <https://bugs.python.org/issue32963> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32963] Python 2.7 tutorial claims source code is UTF-8 encoded
New submission from Martijn Pieters : Issue #29381 updated the tutorial to clarify #! use, but the 2.7 patch re-used Python 3 material that doesn't apply. See r40ba60f6 at https://github.com/python/cpython/commit/40ba60f6bf2f7192f86da395c71348d0fa24da09 It now reads: "By default, Python source files are treated as encoded in UTF-8." and " To display all these characters properly, your editor must recognize that the file is UTF-8, and it must use a font that supports all the characters in the file." This is a huge deviation from the previous text, and confusing and wrong to people new to Python 2. -- assignee: docs@python components: Documentation messages: 312986 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Python 2.7 tutorial claims source code is UTF-8 encoded versions: Python 2.7 ___ Python tracker <https://bugs.python.org/issue32963> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32836] Symbol table for comprehensions (list, dict, set) still includes temporary _[1] variable
New submission from Martijn Pieters :

In Python 2.6, a list comprehension was implemented in the current scope using a temporary _[1] variable to hold the list object:

    >>> import dis
    >>> dis.dis(compile('[x for x in y]', '?', 'exec'))
      1           0 BUILD_LIST               0
                  3 DUP_TOP
                  4 STORE_NAME               0 (_[1])
                  7 LOAD_NAME                1 (y)
                 10 GET_ITER
            >>   11 FOR_ITER                13 (to 27)
                 14 STORE_NAME               2 (x)
                 17 LOAD_NAME                0 (_[1])
                 20 LOAD_NAME                2 (x)
                 23 LIST_APPEND
                 24 JUMP_ABSOLUTE           11
            >>   27 DELETE_NAME              0 (_[1])
                 30 POP_TOP
                 31 LOAD_CONST               0 (None)
                 34 RETURN_VALUE

Nick Coghlan moved comprehensions into a separate scope in #1660500, and removed the need for a temporary variable in the process (the list / dict / set lives only on the stack). However, the symbol table still generates the _[1] name:

    >>> import symtable
    >>> symtable.symtable('[x for x in y]', '?', 'exec').get_children()[0].get_symbols()
    [, , ]

Can this be dropped? I think all temporary variable handling can be ripped out.

-- messages: 312081 nosy: mjpieters priority: normal severity: normal status: open title: Symbol table for comprehensions (list, dict, set) still includes temporary _[1] variable ___ Python tracker <https://bugs.python.org/issue32836> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue32112] Should uuid.UUID() accept another UUID() instance?
New submission from Martijn Pieters :

When someone accidentally passes in an existing uuid.UUID() instance into uuid.UUID(), an attribute error is thrown because it is not a hex string:

    >>> import uuid
    >>> value = uuid.uuid4()
    >>> uuid.UUID(value)
    Traceback (most recent call last):
      File "", line 1, in
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python2.7/uuid.py", line 133, in __init__
        hex = hex.replace('urn:', '').replace('uuid:', '')
    AttributeError: 'UUID' object has no attribute 'replace'

This happened in the Stack Overflow question at https://stackoverflow.com/q/47429929/100297, because the code there didn't take into account that some database drivers may already have mapped the PostgreSQL UUID column to a Python uuid.UUID() object.

The fix could be as simple as:

    if hex is not None:
        if isinstance(hex, uuid.UUID):
            int = hex.int
        else:
            hex = hex.replace('urn:', '').replace('uuid:', '')
            hex = hex.strip('{}').replace('-', '')
            if len(hex) != 32:
                raise ValueError('badly formed hexadecimal UUID string')
            int = int_(hex, 16)

Or we could add a uuid=None keyword argument, and use

    if hex is not None:
        if isinstance(hex, uuid.UUID):
            uuid = hex
        else:
            hex = hex.replace('urn:', '').replace('uuid:', '')
            hex = hex.strip('{}').replace('-', '')
            if len(hex) != 32:
                raise ValueError('badly formed hexadecimal UUID string')
            int = int_(hex, 16)
    if uuid is not None:
        int = uuid.int

-- messages: 306719 nosy: mjpieters priority: normal severity: normal status: open title: Should uuid.UUID() accept another UUID() instance? ___ Python tracker <https://bugs.python.org/issue32112> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
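Until something like the proposal above exists in the library, calling code can guard against the database-driver case itself. The helper below is a hypothetical caller-side sketch (`to_uuid` is not part of the uuid module) and makes no change to `uuid.UUID` itself.

```python
import uuid


def to_uuid(value):
    """Return a uuid.UUID for either an existing UUID or a hex string."""
    if isinstance(value, uuid.UUID):
        # Already converted, for example by a database driver.
        return value
    return uuid.UUID(value)


existing = uuid.uuid4()
same = to_uuid(existing)
parsed = to_uuid(str(existing))
```

Returning the instance unchanged is safe here because `uuid.UUID` objects are immutable.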
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Changes by Martijn Pieters : -- pull_requests: +3125 ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Changes by Martijn Pieters : -- pull_requests: +3124 ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Martijn Pieters added the comment: Disregard my last message, I misread Serhiy's sentence (read 'correct' for 'incorrect'). -- ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Martijn Pieters added the comment: This does not increase clarity. It creates confusion. There are two distinct syntax errors, and they should be reported separately, just like `print "abc" 42` is two syntax errors; you'll hear about the second one once the first one is fixed. -- ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Martijn Pieters added the comment: It's confusing; a syntax error reports on the first error found, not two errors at once. The TabError or IndentationError exception detail message itself is lost (it should be "TabError: Improper mixture of spaces and tabs." or "IndentationError: Improper indentation.", respectively). So you end up with an end-user scratching their head, because the two parts of the message make no sense together. -- ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
Martijn Pieters added the comment: Credit for uncovering this gem: https://stackoverflow.com/questions/45591883/why-is-an-indentionerror-being-raised-here-rather-than-a-syntaxerror -- ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue31161] Only check for print and exec parentheses cases for SyntaxError, not subclasses
New submission from Martijn Pieters:

SyntaxError.__init__() checks for the `print` and `exec` error cases where the user forgot to use parentheses:

    >>> exec 1
      File "", line 1
        exec 1
             ^
    SyntaxError: Missing parentheses in call to 'exec'
    >>> print 1
      File "", line 1
        print 1
              ^
    SyntaxError: Missing parentheses in call to 'print'

However, this check is also applied to *subclasses* of SyntaxError:

    >>> if True:
    ... print "Look ma, no parens!"
      File "", line 2
        print "Look ma, no parens!"
            ^
    IndentationError: Missing parentheses in call to 'print'

and

    >>> compile('if 1:\n1\n\tprint "Look ma, tabs!"', '', 'single')
    Traceback (most recent call last):
      File "", line 1, in
      File "", line 3
        print "Look ma, tabs!"
                             ^
    TabError: Missing parentheses in call to 'print'

Perhaps the check needs to be limited to just the exact type.

-- messages: 32 nosy: mjpieters priority: normal severity: normal status: open title: Only check for print and exec parentheses cases for SyntaxError, not subclasses ___ Python tracker <http://bugs.python.org/issue31161> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
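The distinction the report above relies on, an exact `SyntaxError` versus one of its subclasses, can be expressed in Python as follows. This is a sketch of the idea only: `classify` is a hypothetical helper and the sample sources are made up; the real check lives in C inside `SyntaxError.__init__()`.

```python
def classify(source):
    """Compile source; report the exception class name and whether it is
    exactly SyntaxError rather than a subclass."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return type(exc).__name__, type(exc) is SyntaxError
    return None, False


plain = classify("x = = 1")                # an ordinary syntax error
indented = classify("if True:\nprint(1)")  # missing indentation

print(plain)
print(indented)
```

An `isinstance(exc, SyntaxError)` test would be true in both cases, which is the situation that lets the parentheses hint leak into IndentationError and TabError messages.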
[issue30524] iter(classmethod, sentinel) broken for Argument Clinic class methods?
Martijn Pieters added the comment: Forgot to add this: this bug was found via https://stackoverflow.com/questions/44283540/iter-not-working-with-datetime-now -- ___ Python tracker <http://bugs.python.org/issue30524> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30524] iter(classmethod, sentinel) broken for Argument Clinic class methods?
New submission from Martijn Pieters:

I'm not sure where exactly the error lies, but issue 27128 broke iter() for Argument Clinic class methods. The following works in Python 3.5, but not in Python 3.6:

    from datetime import datetime
    from asyncio import Task

    next(iter(datetime.now, None))
    next(iter(Task.all_tasks, None))

In 3.6 StopIteration is raised:

    >>> next(iter(datetime.now, None))
    Traceback (most recent call last):
      File "", line 1, in
    StopIteration
    >>> next(iter(Task.all_tasks, None))
    Traceback (most recent call last):
      File "", line 1, in
    StopIteration

(In 3.5 a `datetime.datetime` and `set` object are produced, respectively)

The only thing these two methods have in common is that they are class methods with no arguments, parsed out by the Argument Clinic generated code (so using _PyArg_Parser).

What appears to have changed is that iter() was switched from using PyObject_Call to _PyObject_FastCall, see https://github.com/python/cpython/commit/99ee9c70a73ec2f3db68785821a9f2867c3f637f

-- messages: 294835 nosy: mjpieters priority: normal severity: normal status: open title: iter(classmethod, sentinel) broken for Argument Clinic class methods? versions: Python 3.6, Python 3.7 ___ Python tracker <http://bugs.python.org/issue30524> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
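For readers unfamiliar with the two-argument form of `iter()` that the report above exercises: it calls the given callable with no arguments on every step and stops when the sentinel is returned. A minimal sketch with a plain Python callable (the `make_countdown` helper is made up for illustration):

```python
def make_countdown(start):
    """Return a zero-argument callable that yields start-1, start-2, ..."""
    state = {"n": start}

    def step():
        state["n"] -= 1
        return state["n"]

    return step


# iter(callable, sentinel) keeps calling step() until it returns 0.
values = list(iter(make_countdown(3), 0))

# With a sentinel that is never returned, next() yields the first call's result.
first = next(iter(make_countdown(3), None))

print(values, first)
```

The expectation in the report is the same as for `first` here: a sentinel of `None` should never match the return value of `datetime.now`, so `next()` should produce a value rather than raise StopIteration.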
[issue30293] Peephole binops folding can lead to memory and bytecache ballooning
Martijn Pieters added the comment: Thanks Raymond, for the response. I agree, we can't prevent all possible misuse, and avoiding the memory issue would require overly costly checks as to what is being multiplied or added. -- ___ Python tracker <http://bugs.python.org/issue30293> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30293] Peephole binops folding can lead to memory and bytecache ballooning
New submission from Martijn Pieters: The following expression produces 127MB in constants in `co_consts` due to two 63.5MB integer objects produced when folding: ((200*200 - 2) & ((1 << 5) - 1)) + ((200*200 - 2) >> 5) The optimizer already does not store optimized *sequences* of more than 20 elements to avoid making bytecode files too large: If the new constant is a sequence, only folds when the size is below a threshold value. That keeps pyc files from becoming large in the presence of code like: (None,)*1000. Perhaps the same should be extended to number objects? Context: https://stackoverflow.com/questions/43823807/why-does-using-arguments-make-this-function-so-much-slower -- messages: 293167 nosy: mjpieters priority: normal severity: normal status: open title: Peephole binops folding can lead to memory and bytecache ballooning ___ Python tracker <http://bugs.python.org/issue30293> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
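The folding discussed in the report above can be observed on a small, harmless expression by looking at the constants stored in the compiled code object. This is a sketch under the assumption that the interpreter is a CPython version that folds constant binary operations at compile time; the source string is made up for illustration.

```python
import dis

# A tiny constant expression; the compiler is expected to store the
# computed result rather than the two operands and a shift instruction.
code = compile("x = 1 << 10", "<example>", "exec")

print(code.co_consts)
dis.dis(code)
```

The same mechanism stores every folded result in `co_consts`, and therefore in the .pyc file, which is how a very large folded integer ends up ballooning both memory use and the bytecode cache.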
[issue30154] subprocess.run with stderr connected to a pipe won't timeout when killing a never-ending shell command
Martijn Pieters added the comment:

Apologies, I copied the wrong sleep 10 demo. The correct demo is:

    cat >test.sh <<EOF
    #!/bin/sh
    sleep 10
    EOF
    time bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)"
    Traceback (most recent call last):
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 405, in run
        stdout, stderr = process.communicate(input, timeout=timeout)
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 836, in communicate
        stdout, stderr = self._communicate(input, endtime, timeout)
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 1497, in _communicate
        self._check_timeout(endtime, orig_timeout)
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 864, in _check_timeout
        raise TimeoutExpired(self.args, orig_timeout)
    subprocess.TimeoutExpired: Command '['./test.sh']' timed out after 3 seconds

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "", line 1, in
      File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 410, in run
        stderr=stderr)
    subprocess.TimeoutExpired: Command '['./test.sh']' timed out after 3 seconds

    real    0m10.054s
    user    0m0.033s
    sys     0m0.015s

-- ___ Python tracker <http://bugs.python.org/issue30154> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue30154] subprocess.run with stderr connected to a pipe won't timeout when killing a never-ending shell command
New submission from Martijn Pieters: You can't time out a process tree that includes a never-ending process, *and* which redirects stderr: cat >test.sh< /dev/null # never-ending EOF chmod +x test.sh python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)" This hangs forever; the timeout kicks in, but then the kill on the child process fails and Python forever tries to read stderr, which won't produce data. See https://github.com/python/cpython/blob/v3.6.1/Lib/subprocess.py#L407-L410. The `sh` process is killed, but listed as a zombie process and the `cat` process has migrated to parent id 1: ^Z bg jobs -lr [2]- 21906 Running bin/python -c "import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3)" & pstree 21906 -+= 21906 mjpieters bin/python -c import subprocess; subprocess.run(['./test.sh'], stderr=subprocess.PIPE, timeout=3) \--- 21907 mjpieters (sh) ps -j | grep 'cat /dev/random' mjpieters 24706 1 24704 01 Rs0030:26.54 cat /dev/random mjpieters 24897 99591 24896 02 R+ s0030:00.00 grep cat /dev/random Killing Python at that point leaves the `cat` process running indefinitely. 
Replace the `cat /dev/random > /dev/null` line with `sleep 10`, and the `subprocess.run()` call returns after 10+ seconds: cat >test.sh<", line 1, in File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 403, in run with Popen(*popenargs, **kwargs) as process: File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 707, in __init__ restore_signals, start_new_session) File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/subprocess.py", line 1326, in _execute_child raise child_exception_type(errno_num, err_msg) OSError: [Errno 8] Exec format error real0m12.326s user0m0.041s sys 0m0.018s When you redirect stdin instead, `process.communicate()` does return, but the `cat` subprocess runs on indefinitely nonetheless; only the `sh` process was killed. Is this something subprocess.run should handle better (perhaps by adding in a second timeout poll and a terminate())? Or should the documentation be updated to warn about this behaviour instead (with suitable advice on how to write a subprocess that can be killed properly). -- components: Library (Lib) messages: 292217 nosy: mjpieters priority: normal severity: normal status: open title: subprocess.run with stderr connected to a pipe won't timeout when killing a never-ending shell commanad type: behavior versions: Python 3.6, Python 3.7 ___ Python tracker <http://bugs.python.org/issue30154> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
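One way a caller can avoid the hang described in the report above is to put the child in its own process group and signal the whole group on timeout, so that grandchildren holding the stderr pipe are killed as well. This is a POSIX-only sketch, not what `subprocess.run()` does; `run_with_group_timeout` is a hypothetical helper and the choice of SIGKILL is an assumption made for brevity.

```python
import os
import signal
import subprocess


def run_with_group_timeout(args, timeout):
    """Run args with stderr piped; on timeout, kill the whole process group.

    Returns (returncode, stderr_bytes, timed_out). POSIX only.
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group id equals its pid.
    proc = subprocess.Popen(args, stderr=subprocess.PIPE, start_new_session=True)
    try:
        _, err = proc.communicate(timeout=timeout)
        return proc.returncode, err, False
    except subprocess.TimeoutExpired:
        # Signal every process in the group, not just the direct child.
        os.killpg(proc.pid, signal.SIGKILL)
        # All writers of the pipe are gone now, so this read can finish.
        _, err = proc.communicate()
        return proc.returncode, err, True
```

For example, `run_with_group_timeout(['sh', '-c', 'sleep 30; echo done'], 0.5)` should return shortly after the timeout instead of waiting for `sleep` to finish. A gentler variant would send SIGTERM first and escalate only if the group does not exit.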
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Changes by Martijn Pieters : -- pull_requests: +318 ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Martijn Pieters added the comment:

> Is 2.7 free from this bug?

No, 2.7 is affected too:

    >>> class SubclassedStr(str):
    ...     def __rmod__(self, other):
    ...         return 'Success, self.__rmod__({!r}) was called'.format(other)
    ...
    >>> 'lhs %% %r' % SubclassedStr('rhs')
    "lhs % 'rhs'"

Expected output is "Success, self.__rmod__('lhs %% %r') was called"

On the plus side, unicode is not affected:

    >>> class SubclassedUnicode(unicode):
    ...     def __rmod__(self, other):
    ...         return u'Success, self.__rmod__({!r}) was called'.format(other)
    ...
    >>> u'lhs %% %r' % SubclassedUnicode(u'rhs')
    u"Success, self.__rmod__(u'lhs %% %r') was called"

-- ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Changes by Martijn Pieters : -- pull_requests: +299 ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Changes by Martijn Pieters : -- pull_requests: +294 ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Changes by Martijn Pieters : -- pull_requests: +57 ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Martijn Pieters added the comment: I'm not sure if issues are linked automatically yet. I put the patch up as a pull request on GitHub: https://github.com/python/cpython/pull/51 -- ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12268] file readline, readlines & readall methods can lose data on EINTR
Martijn Pieters added the comment: Follow-up bug, readahead was missed: http://bugs.python.org/issue1633941 -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue12268> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1633941] for line in sys.stdin: doesn't notice EOF the first time
Martijn Pieters added the comment: It looks like readahead was missed when http://bugs.python.org/issue12268 was fixed. -- ___ Python tracker <http://bugs.python.org/issue1633941> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21090] File read silently stops after EIO I/O error
Martijn Pieters added the comment: The Python 2.7 issue (using fread without checking for interrupts) looks like a duplicate of http://bugs.python.org/issue1633941 -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue21090> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1633941] for line in sys.stdin: doesn't notice EOF the first time
Martijn Pieters added the comment:

This bug affects all use of `file.__iter__` and interrupts (EINTR), not just sys.stdin. You can reproduce the issue by reading from a (slow) pipe in a terminal window and resizing that window, for example; the interrupt is not handled and a future call ends up raising `IOError: [Errno 0] Error`, a rather confusing message.

The Mercurial community is switching away from using direct iteration over this bug; Jun's excellent analysis is included and enlightening: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-November/090522.html

The fix is to use

    interrupted = ferror(f->f_fp) && errno == EINTR;
    // ..
    if (interrupted) {
        clearerr(f->f_fp);
        if (PyErr_CheckSignals()) {
            Py_DECREF(v);
            return NULL;
        }
    }

and check for interrupted == 0 in the chunksize == 0 case after Py_UniversalNewlineFread calls, as file_read does, for example, but which readahead doesn't (where the only public user of readahead is file_iternext).

-- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue1633941> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
Martijn Pieters added the comment: Here's a proposed patch for tip; what versions would it be worth backporting this to? (Note, there's no NEWS update in this patch). -- keywords: +patch Added file: http://bugs.python.org/file45338/issue28598.patch ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28598] RHS not consulted in `str % subclass_of_str` case.
New submission from Martijn Pieters:

The `BINARY_MODULO` operator hardcodes a test for `PyUnicode`:

    TARGET(BINARY_MODULO) {
        PyObject *divisor = POP();
        PyObject *dividend = TOP();
        PyObject *res = PyUnicode_CheckExact(dividend) ?
            PyUnicode_Format(dividend, divisor) :
            PyNumber_Remainder(dividend, divisor);

This means that a RHS subclass of str can't override the operator:

    >>> class Foo(str):
    ...     def __rmod__(self, other):
    ...         return self % other
    ...
    >>> "Bar: %s" % Foo("Foo: %s")
    'Bar: Foo: %s'

The expected output there is "Foo: Bar: %s". This works correctly for `bytes`:

    >>> class FooBytes(bytes):
    ...     def __rmod__(self, other):
    ...         return self % other
    ...
    >>> b"Bar: %s" % FooBytes(b"Foo: %s")
    b'Foo: Bar: %s'

and for all other types where the RHS is a subclass. Perhaps there should be a test to see if `divisor` is a subclass, and in that case take the slow path?

-- components: Interpreter Core messages: 279993 nosy: mjpieters priority: normal severity: normal status: open title: RHS not consulted in `str % subclass_of_str` case. type: behavior versions: Python 2.7, Python 3.3, Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue28598> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
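The rule the report above expects `str` to follow is the general one from the data model: when the right operand is an instance of a subclass of the left operand's type and supplies its own reflected method, that method is tried first. A minimal sketch with made-up classes `Base` and `Sub`, plus the `bytes` case from the report:

```python
class Base:
    def __mod__(self, other):
        return "Base.__mod__"


class Sub(Base):
    # Sub supplies a reflected method that Base does not have, so it is
    # consulted before Base.__mod__ when Sub is the right operand.
    def __rmod__(self, other):
        return "Sub.__rmod__"


result = Base() % Sub()


class FooBytes(bytes):
    def __rmod__(self, other):
        return self % other


# The bytes case from the report, where the RHS subclass is consulted.
bytes_result = b"Bar: %s" % FooBytes(b"Foo: %s")

print(result, bytes_result)
```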
[issue27797] ASCII file with UNIX line conventions and enough lines throws SyntaxError when ASCII-compatible codec is declared
New submission from Martijn Pieters: To reproduce, create an ASCII file with > io.DEFAULT_BUFFER_SIZE bytes (can be blank lines) and *UNIX line endings*, with the first two lines reading: #!/usr/bin/env python # -*- coding: cp1252 -*- Try to run this as a script on Windows: C:\Python35\python.exe encoding-problem-cp1252.py File "encoding-problem-cp1252.py", line 2 SyntaxError: encoding problem: cp1252 Converting the file to use CRLF (Windows) line endings makes the problem go away. This appears to be a fallout from issue #20731. Demo file that reproduces this issue at 710 bytes: https://github.com/techtonik/testbin/raw/fbb8aec3650b45f690c4febfd621fe5d6892b14a/python/encoding-problem-cp1252.py First reported by anatoly techtonik at https://stackoverflow.com/questions/39032416/python-3-5-syntaxerror-encoding-prob-em-cp1252 -- components: Interpreter Core, Windows messages: 273087 nosy: mjpieters, paul.moore, steve.dower, tim.golden, zach.ware priority: normal severity: normal status: open title: ASCII file with UNIX line conventions and enough lines throws SyntaxError when ASCII-compatible codec is declared versions: Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue27797> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Martijn Pieters added the comment: The catalyst for this question was a Stack Overflow question I answered: https://stackoverflow.com/questions/37365311/why-are-python-3-6-literal-formatted-strings-so-slow Compared to `str.format()`, the BUILD_LIST is the bottleneck here; dropping the LOAD_ATTR and CALL_FUNCTION opcodes is a nice bonus. -- ___ Python tracker <http://bugs.python.org/issue27078> ___
[issue27078] Make f'' strings faster than .format: BUILD_STRING opcode?
Changes by Martijn Pieters : -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue27078> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26650] calendar: OverflowErrors for year == 1 and firstweekday > 0
New submission from Martijn Pieters: For anything other than calendar.Calendar(0), many methods lead to OverflowError exceptions: >>> import calendar >>> c = calendar.Calendar(0) >>> list(c.itermonthdays(1, 1)) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 0, 0, 0, 0] >>> c = calendar.Calendar(1) >>> list(c.itermonthdays(1, 1)) Traceback (most recent call last): File "", line 1, in File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python2.7/calendar.py", line 188, in itermonthdays for date in self.itermonthdates(year, month): File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python2.7/calendar.py", line 160, in itermonthdates date -= datetime.timedelta(days=days) OverflowError: date value out of range This echoes a similar problem with year = , see issue #15421 -- components: Library (Lib) messages: 262514 nosy: mjpieters priority: normal severity: normal status: open title: calendar: OverflowErrors for year == 1 and firstweekday > 0 type: crash versions: Python 2.7, Python 3.5 ___ Python tracker <http://bugs.python.org/issue26650> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
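The traceback points at `date -= datetime.timedelta(days=days)`, which cannot succeed when the month starts on `datetime.date.min` and the week does not start on a Monday. A small sketch of that arithmetic; the helper `days_before` is hypothetical and only modelled on the line shown in the traceback, not copied from `calendar.py`:

```python
import datetime

first_day = datetime.date(1, 1, 1)  # equal to datetime.date.min, a Monday


def days_before(firstweekday):
    """Number of leading days needed so the week starts on firstweekday."""
    return (first_day.weekday() - firstweekday) % 7


for firstweekday in (0, 1):
    days = days_before(firstweekday)
    try:
        start = first_day - datetime.timedelta(days=days)
        outcome = "ok, week starts at %s" % start
    except OverflowError as exc:
        outcome = "OverflowError: %s" % exc
    print(firstweekday, days, outcome)
```

With `firstweekday=0` nothing has to be subtracted, which is why only `calendar.Calendar(0)` is unaffected in the report.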
[issue26477] typing forward references and module attributes
Martijn Pieters added the comment: > I wonder why they forward references are evaluated *at all* at this point. The Union type tries to reduce the set of allowed types by removing any subclasses (so Union[int, bool] becomes Union[int] only). That's all fine, but it should not at that point fail if a forward reference is not available yet. Arguably, the `except NameError` there should be converted to an `except Exception`, since forward references are supposed to be *a valid Python expression [...] and it should evaluate without errors once the module has been fully loaded.* (from the PEP); anything goes, and thus any error goes until the module is loaded. -- ___ Python tracker <http://bugs.python.org/issue26477> ___
[issue26477] typing forward references and module attributes
Martijn Pieters added the comment:

A temporary work-around is to use a function to raise a NameError exception when the module attribute doesn't exist yet:

    def _forward_A_reference():
        try:
            return a.A
        except AttributeError:
            # not yet..
            raise NameError('A')

    class B:
        def spam(self: 'B', eggs: typing.Union['_forward_A_reference()', None]):
            pass

-- ___ Python tracker <http://bugs.python.org/issue26477> ___
[issue26477] typing forward references and module attributes
Martijn Pieters added the comment:

Sorry, that should have read "the forward references section of PEP 484". The section uses this example:

    # File models/a.py
    from models import b
    class A(Model):
        def foo(self, b: 'b.B'): ...

    # File models/b.py
    from models import a
    class B(Model):
        def bar(self, a: 'a.A'): ...

    # File main.py
    from models.a import A
    from models.b import B

which doesn't fail because the forward references are not being tested until after all imports have completed; creating a Union however triggers a subclass test between the different types in the union.

-- ___ Python tracker <http://bugs.python.org/issue26477> ___
[issue26477] typing forward references and module attributes
New submission from Martijn Pieters: Forward references to a module can fail, if the module doesn't yet have the required object. The "forward references" section names circular dependencies as one use for forward references, but the following example fails: $ cat test/__init__.py from .a import A from .b import B $ cat test/a.py import typing from . import b class A: def foo(self: 'A', bar: typing.Union['b.B', None]): pass $ cat test/b.py import typing from . import a class B: def spam(self: 'B', eggs: typing.Union['a.A', None]): pass $ bin/python -c 'import test' Traceback (most recent call last): File "", line 1, in File "/Users/mjpieters/Development/venvs/stackoverflow-3.5/test/__init__.py", line 1, in from .a import A File "/Users/mjpieters/Development/venvs/stackoverflow-3.5/test/a.py", line 2, in from . import b File "/Users/mjpieters/Development/venvs/stackoverflow-3.5/test/b.py", line 4, in class B: File "/Users/mjpieters/Development/venvs/stackoverflow-3.5/test/b.py", line 5, in B def spam(self: 'B', eggs: typing.Union['a.A', None]): File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.5/typing.py", line 537, in __getitem__ dict(self.__dict__), parameters, _root=True) File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.5/typing.py", line 494, in __new__ for t2 in all_params - {t1} if not isinstance(t2, TypeVar)): File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.5/typing.py", line 494, in for t2 in all_params - {t1} if not isinstance(t2, TypeVar)): File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.5/typing.py", line 185, in __subclasscheck__ self._eval_type(globalns, localns) File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.5/typing.py", line 172, in _eval_type eval(self.__forward_code__, globalns, localns), File "", line 1, in AttributeError: module 'test.a' has no attribute 'A' The forward reference test 
fails because only NameError exceptions are caught, not AttributeError exceptions. -- components: Library (Lib) messages: 261172 nosy: mjpieters priority: normal severity: normal status: open title: typing forward references and module attributes versions: Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue26477> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
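The distinction between the two exceptions can be reproduced without `typing` at all. A minimal sketch, in which the empty module object `a` and the helper `try_eval` are stand-ins invented for illustration (they are not part of `typing`):

```python
import types

# Stand-in for a partially initialised module `a` that has no attribute `A` yet.
a = types.ModuleType("a")


def try_eval(expression, namespace):
    """Return the name of the exception raised while evaluating, or None."""
    try:
        eval(expression, namespace)
    except Exception as exc:
        return type(exc).__name__
    return None


bare_name = try_eval("A", {})            # the name itself is missing
dotted_name = try_eval("a.A", {"a": a})  # module found, attribute missing

print(bare_name)
print(dotted_name)
```

A handler that catches only `NameError` therefore covers the first case but lets the second one propagate, which matches the traceback in the report.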
[issue26449] Tutorial on Python Scopes and Namespaces uses confusing 'read-only' terminology
Martijn Pieters added the comment: +1 for "... can only be read". read-only can too easily be construed to mean that the variable cannot be set from *anywhere*, even the original scope. Another alternative would be "... is effectively read-only", but "... can only be read" is simpler. -- ___ Python tracker <http://bugs.python.org/issue26449> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26449] Tutorial on Python Scopes and Namespaces uses confusing 'read-only' terminology
Changes by Martijn Pieters : -- assignee: -> docs@python components: +Documentation nosy: +docs@python ___ Python tracker <http://bugs.python.org/issue26449> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue26449] Tutorial on Python Scopes and Namespaces uses confusing 'read-only' terminology
New submission from Martijn Pieters:

From the 9.2. Python Scopes and Namespaces section:

> If a name is declared global, then all references and assignments go directly
> to the middle scope containing the module’s global names. To rebind variables
> found outside of the innermost scope, the nonlocal statement can be used; if
> not declared nonlocal, those variables are read-only (an attempt to write to
> such a variable will simply create a new local variable in the innermost
> scope, leaving the identically named outer variable unchanged).

This terminology is extremely confusing to newcomers; see https://stackoverflow.com/questions/35667757/read-only-namespace-in-python for an example. Variables are never read-only. The parent scope name simply is *not visible*, which is an entirely different concept.

Can this section be re-written to not use the term 'read-only'?

-- messages: 260933 nosy: mjpieters priority: normal severity: normal status: open title: Tutorial on Python Scopes and Namespaces uses confusing 'read-only' terminology
___ Python tracker <http://bugs.python.org/issue26449> ___
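The behaviour the tutorial paragraph describes can be shown in a few lines. A minimal sketch with hypothetical function names: assigning in the inner function creates a new local unless the name is declared `nonlocal`.

```python
def outer_without_nonlocal():
    count = 0

    def inner():
        count = 1  # creates a new local name; the outer `count` is untouched
        return count

    inner()
    return count


def outer_with_nonlocal():
    count = 0

    def inner():
        nonlocal count
        count = 1  # rebinds the variable in the enclosing scope

    inner()
    return count


print(outer_without_nonlocal())
print(outer_with_nonlocal())
```

In neither case is the outer variable protected against writes from its own scope, which is the point the report makes about the term 'read-only'.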
[issue24856] Mock.side_effect as iterable or iterator
Martijn Pieters added the comment: Bugger, that's the last time I take someone's word for it and not test properly. Indeed, I missed the inheritance of NonCallableMock, so the property is inherited from there. Mea Culpa! -- ___ Python tracker <http://bugs.python.org/issue24856> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue24856] Mock.side_effect as iterable or iterator
New submission from Martijn Pieters:

The documentation states that `side_effect` can be set to an [iterable](https://docs.python.org/3/glossary.html#term-iterable):

> If you pass in an iterable, it is used to retrieve an iterator which must
> yield a value on every call. This value can either be an exception instance
> to be raised, or a value to be returned from the call to the mock (`DEFAULT`
> handling is identical to the function case).

but the [actual handling of the side effect](https://github.com/testing-cabal/mock/blob/27a20329b25c8de200a8964ed5dd7762322e91f6/mock/mock.py#L1112-L1123) expects it to be an [*iterator*](https://docs.python.org/3/glossary.html#term-iterator):

    if not _callable(effect):
        result = next(effect)

This excludes using a list or tuple object to produce the side effect sequence.

Can the documentation be updated to state that an *iterator* is required (so an object that defines __next__ and whose __iter__ method returns self), or can the CallableMixin constructor be updated to call iter() on the side_effect argument if it is not an exception or a callable? You could even re-use the [_MockIter() class](https://hg.python.org/cpython/file/256d2f01e975/Lib/unittest/mock.py#l348) already used for the [NonCallableMock.side_effect property](https://hg.python.org/cpython/file/256d2f01e975/Lib/unittest/mock.py#l509).

-- components: Library (Lib) messages: 248501 nosy: mjpieters priority: normal severity: normal status: open title: Mock.side_effect as iterable or iterator versions: Python 3.4, Python 3.5, Python 3.6
___ Python tracker <http://bugs.python.org/issue24856> ___
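The iterable versus iterator distinction, and what `unittest.mock` does with a plain list, can be checked directly. A short sketch using only the standard library; the variable names are illustrative:

```python
from unittest import mock

values = [1, 2]

# A list is iterable but not an iterator, so next() rejects it directly.
try:
    next(values)
    direct_next_works = True
except TypeError:
    direct_next_works = False

# Wrapping the list with iter() yields an iterator that next() accepts.
iterator = iter(values)
first = next(iterator)

# Mock accepts the plain list as side_effect and returns one value per call.
m = mock.Mock(side_effect=values)
results = [m(), m()]

print(direct_next_works, first, results)
```

This is consistent with the follow-up comment in this thread, which notes that the `side_effect` property inherited from `NonCallableMock` takes care of the conversion.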
[issue12892] UTF-16 and UTF-32 codecs should reject (lone) surrogates
Martijn Pieters added the comment: I don't understand why encoding with `surrogateescape` isn't supported still; is it the fact that a surrogate would have to produce *single bytes* rather than double? E.g. b'\x80' -> '\udc80' -> b'\x80' doesn't work because that would mean the UTF-16 and UTF-32 codec could then end up producing an odd number of bytes? -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue12892> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23495] The writer.writerows method should be documented as accepting any iterable (not only a list)
Martijn Pieters added the comment: I'd be happy to provide a patch for the DictWriter.writerows code; I was naively counting on it accepting an iterable and that it would not pull the whole sequence into memory (while feeding it gigabytes of CSV data). -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue23495> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
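For reference, passing a generator to `DictWriter.writerows` looks like the sketch below. The `generate_rows` helper and the field names are made up for illustration, and whether rows are buffered internally depends on the Python version in use, which this example does not measure:

```python
import csv
import io


def generate_rows(count):
    """Yield rows one at a time instead of building a list first."""
    for i in range(count):
        yield {"id": i, "square": i * i}


buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "square"])
writer.writeheader()
writer.writerows(generate_rows(3))  # any iterable of dicts is accepted

output = buffer.getvalue()
print(output)
```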
[issue23864] issubclass without registration only works for "one-trick pony" collections ABCs.
Martijn Pieters added the comment: I should have added the mixin methods for the Sequence implementation; the more complete demonstration is: >>> from collections.abc import Sequence, Container, Sized >>> class MySequence(object): ... def __contains__(self, item): pass ... def __len__(self): pass ... def __iter__(self): pass ... def __getitem__(self, index): pass ... def __len__(self): pass ... def __reversed__(self): pass ... def index(self, item): pass ... def count(self, item): pass ... >>> issubclass(MySequence, Container) True >>> issubclass(MySequence, Sized) True >>> issubclass(MySequence, Sequence) False -- ___ Python tracker <http://bugs.python.org/issue23864> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23864] issubclass without registration only works for "one-trick pony" collections ABCs.
New submission from Martijn Pieters: The collections.abc documentation implies that *any* of the container ABCs can be used in an issubclass test against a class that implements all abstract methods: > These ABCs allow us to ask classes or instances if they provide particular > functionality [...] In reality this only applies to the "One Trick Ponies" (term from PEP 3119, things like Container and Iterable, those classes with one or two methods). It fails for the compound container ABCs: >>> from collections.abc import Sequence, Container, Sized >>> class MySequence(object): ... def __contains__(self, item): pass ... def __len__(self): pass ... def __iter__(self): pass ... def __getitem__(self, index): pass ... def __len__(self): pass ... >>> issubclass(MySequence, Container) True >>> issubclass(MySequence, Sized) True >>> issubclass(MySequence, Sequence) False That's because the One Trick Ponies implement a __subclasshook__ method that is locked to the specific class and returns NotImplemented for subclasses; for instance, the Iterable.__subclasshook__ implementation is: @classmethod def __subclasshook__(cls, C): if cls is Iterable: if any("__iter__" in B.__dict__ for B in C.__mro__): return True return NotImplemented The compound container classes build on top of the One Trick Ponies, so the class test will fail, NotImplemented is returned and the normal ABC tests for base classes that have been explicitly registered continues, but this won't include unregistered complete implementations. Either the compound classes need their own __subclasshook__ implementations, or the documentation needs to be updated to make it clear that without explicit registrations the issubclass() (and isinstance()) tests only apply to the One Trick Ponies. 
-- assignee: docs@python components: Documentation, Library (Lib) messages: 240060 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: issubclass without registration only works for "one-trick pony" collections ABCs. ___ Python tracker <http://bugs.python.org/issue23864> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
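The difference between the structural check and explicit registration can be seen side by side. A minimal sketch with a hypothetical `MySequence` class that implements the abstract methods of `Sequence` without inheriting from it:

```python
from collections.abc import Sequence, Sized


class MySequence:
    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


# Sized defines __subclasshook__, so having __len__ is enough.
structural_sized = issubclass(MySequence, Sized)

# Sequence has no such hook; only inheritance or registration counts.
before_register = issubclass(MySequence, Sequence)
Sequence.register(MySequence)
after_register = issubclass(MySequence, Sequence)

print(structural_sized, before_register, after_register)
```

Registration only records the claim; it does not add the mixin methods such as `index()` or `count()` to the class.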
[issue23730] Document return value for ZipFile.extract()
New submission from Martijn Pieters: The documentation for zipfile.ZipFile.extract() doesn't mention at all that it returns the local path created, either for the directory that the member represents, or the new file created from the zipped data. *Returns the full local path created (a directory or new file)* or similar. -- assignee: docs@python components: Documentation messages: 238778 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: Document return value for ZipFile.extract() type: enhancement versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue23730> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
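A short sketch of the return value in question, using a throwaway archive in a temporary directory; the file and directory names are arbitrary:

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "example.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("hello.txt", "hello")

    destination = os.path.join(workdir, "out")
    with zipfile.ZipFile(archive) as zf:
        # extract() returns the path of the file (or directory) it created.
        extracted = zf.extract("hello.txt", path=destination)

    extracted_exists = os.path.isfile(extracted)
    extracted_name = os.path.basename(extracted)

print(extracted_name, extracted_exists)
```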
[issue23583] IDLE: printing unicode subclasses broken (again)
Martijn Pieters added the comment: I like the unicode.__getitem__(s, slice(None)) approach, it has the advantage of not having to rely on len(s). -- ___ Python tracker <http://bugs.python.org/issue23583> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23583] IDLE: printing unicode subclasses broken (again)
Martijn Pieters added the comment:

Proposed fix, replace lines 1352-1353 in PseudoOutputFile.write():

    if isinstance(s, unicode):
        s = unicode.__getslice__(s, None, None)

with

    if isinstance(s, unicode):
        s = unicode.__getslice__(s, 0, len(s))

-- ___ Python tracker <http://bugs.python.org/issue23583> ___
[issue23583] IDLE: printing unicode subclasses broken (again)
Changes by Martijn Pieters : -- components: +IDLE ___ Python tracker <http://bugs.python.org/issue23583> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23583] IDLE: printing unicode subclasses broken (again)
New submission from Martijn Pieters: This is a regression or recurrence of issue #19481. To reproduce, create a subclass of unicode and try and print an instance of that class: class Foo(unicode): pass print Foo() results in a traceback: Traceback (most recent call last): File "", line 1, in print Foo() File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/idlelib/PyShell.py", line 1353, in write s = unicode.__getslice__(s, None, None) TypeError: an integer is required because unicode.__getslice__ does not accept None for the start and end indices. -- messages: 237181 nosy: mjpieters priority: normal severity: normal status: open title: IDLE: printing unicode subclasses broken (again) versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue23583> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue19481] IDLE hangs while printing instance of Unicode subclass
Martijn Pieters added the comment: Created a new issue: http://bugs.python.org/issue23583 -- ___ Python tracker <http://bugs.python.org/issue19481> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue19481] IDLE hangs while printing instance of Unicode subclass
Martijn Pieters added the comment:

This change causes printing BeautifulSoup NavigableString objects to fail; the code actually could never work as `unicode.__getslice__` insists on getting passed in integers, not None.

To reproduce, create a new file in IDLE and paste in:

    from bs4 import BeautifulSoup
    html_doc = """The Dormouse's story"""
    soup = BeautifulSoup(html_doc)
    print soup.title.string

Then pick *Run Module* to see:

    Traceback (most recent call last):
      File "/private/tmp/test.py", line 4, in
        print soup.title.string
      File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/idlelib/PyShell.py", line 1353, in write
        s = unicode.__getslice__(s, None, None)
    TypeError: an integer is required

The same error can be induced with:

    unicode.__getslice__(u'', None, None)

while specifying a start and end index (0 and len(s)) should fix this.

-- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue19481> ___
[issue7334] ElementTree: file locking in Jython 2.5 (OSError on Windows)
Martijn Pieters added the comment: Indeed, the 2.7 backport was not correctly applied for _elementtree.c, leaving files open because the close_source flag is set to False *again* when opening a filename. Should a new issue be opened or should this ticket be re-opened? -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue7334> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17876] Doc issue with threading.Event
Martijn Pieters added the comment: Ah! Mea Culpa, you are correct. The issue is then with Python 2.7 only for which no doubt exists a separate ticket. -- ___ Python tracker <http://bugs.python.org/issue17876> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17876] Doc issue with threading.Event
Martijn Pieters added the comment: I notice that the same issue still exists in the 3.5 documentation. Surely this can at least be fixed in the development copy? -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue17876> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22755] contextlib.closing documentation should use a new example
New submission from Martijn Pieters: urllib.request.urlopen() now always produces a context manager (either an HTTPResponse or an addinfourl object). The example for contextlib.closing still uses urllib.request.urlopen as an example for the context manager wrapper, see https://docs.python.org/3/library/contextlib.html#contextlib.closing This is confusing users, who now expect the object not to be a context manager, see: http://stackoverflow.com/questions/26619404/with-and-closing-of-files-in-python Can a different example be chosen? -- assignee: docs@python components: Documentation messages: 230184 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: contextlib.closing documentation should use a new example versions: Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <http://bugs.python.org/issue22755> ___
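One possible shape for a replacement example is an object that has a `close()` method but is not itself a context manager. The `Resource` class below is hypothetical and only illustrates what `contextlib.closing` adds; it is not a proposal taken from the tracker:

```python
from contextlib import closing


class Resource:
    """Hypothetical object with close() but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


resource = Resource()
with closing(resource) as r:
    closed_inside_block = r.closed  # still open while the block runs

print(closed_inside_block, resource.closed)
```

Because `Resource` defines no `__enter__`, using it directly in a `with` statement would fail, so the wrapper is doing real work here.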
[issue13769] json.dump(ensure_ascii=False) return str instead of unicode
Martijn Pieters added the comment: I'd say this is a bug in the library, not the documentation. The library varies the output type, making it impossible to use `json.dump()` with a `io.open()` object as the library will *mix data type* when writing. That is *terrible* behaviour. -- nosy: +mjpieters ___ Python tracker <http://bugs.python.org/issue13769> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22575] bytearray documentation confuses string for unicode objects
New submission from Martijn Pieters: The Python 2 version of the bytearray() documentation appears to be copied directly from its Python 3 counterpart and states that when passing in a string an encoding is required: * If it is a string, you must also give the encoding (and optionally, errors) parameters; bytearray() then converts the string to bytes using str.encode(). (from https://docs.python.org/2/library/functions.html#bytearray). This obviously doesn't apply to Python 2 str() objects, but would only apply to unicode() objects. Can this be corrected? The current wording is confusing new users (see http://stackoverflow.com/questions/26230745/how-to-convert-python-str-to-bytearray). -- assignee: docs@python components: Documentation messages: 228771 nosy: docs@python, mjpieters priority: normal severity: normal status: open title: bytearray documentation confuses string for unicode objects versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue22575> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
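The quoted sentence describes Python 3 semantics, where `str` is a text type. A brief sketch of that behaviour, run under Python 3, for comparison with the Python 2 case the report is about:

```python
text = "abc"

# In Python 3 a str argument without an encoding is rejected.
try:
    bytearray(text)
    needs_encoding = False
except TypeError:
    needs_encoding = True

# Supplying the encoding converts the text to bytes first.
encoded = bytearray(text, "utf-8")

# A bytes argument needs no encoding.
from_bytes = bytearray(b"abc")

print(needs_encoding, encoded, from_bytes)
```

Per the report, a Python 2 `str` already holds bytes, so there the requirement would only make sense for `unicode` objects.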
[issue22288] Incorrect Call grammar in documentation
Martijn Pieters added the comment: Fixed by revision 3ae399c6ecf6 -- resolution: -> fixed status: open -> closed ___ Python tracker <http://bugs.python.org/issue22288> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22288] Incorrect Call grammar in documentation
Changes by Martijn Pieters : -- hgrepos: -270 ___ Python tracker <http://bugs.python.org/issue22288> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue22288] Incorrect Call grammar in documentation
Martijn Pieters added the comment: Sigh, patch creation fails against a remote repository; I guess this only works for the default branch. Attached as a patch file instead. -- Added file: http://bugs.python.org/file36488/issue22288.patch ___ Python tracker <http://bugs.python.org/issue22288> ___
[issue22288] Incorrect Call grammar in documentation
Changes by Martijn Pieters : Removed file: http://bugs.python.org/file36487/ffe77dc2979a.diff ___ Python tracker <http://bugs.python.org/issue22288> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com