[issue35459] Use PyDict_GetItemWithError() instead of PyDict_GetItem()

2019-02-26 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

#36110 was closed as a duplicate; the superseder is #36109 (which has been 
fixed). The change should still be documented, just in case anyone gets bitten 
by it.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35459>
___



[issue35996] Optional modulus argument for new math.prod() function

2019-02-14 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

"One other issue is that the arguments to prod() need not be integers, so a 
modulus argument wouldn't make sense in those contexts."

The arguments to pow don't need to be integers either, yet the optional third 
argument is only really relevant to integers.

Not saying we should do this, but we definitely allow optional arguments that 
are only meaningful for certain input types in other cases.

The best argument for this change I can come up with from other Python 
functions is the push for an efficient math.comb (#35431); if we didn't want to 
bother supporting minimizing intermediate values, math.comb could be 
implemented directly in terms of math.factorial instead of trying to pre-cancel 
values. But even that's not a strong argument for this, given the relative 
frequency with which each feature is needed (the binomial coefficient coming up 
much more often than modular reduction of huge products).

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35996>
___



[issue25737] array is not a Sequence

2019-02-14 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Correction: It should actually be registered as a subclass of MutableSequence 
(which should make it a virtual subclass of Sequence too; list is only 
registered on MutableSequence as well).
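A sketch of the registration in question (run by user code here; the proposal is 
for the array module itself to do this):

    from array import array
    from collections.abc import MutableSequence, Sequence

    MutableSequence.register(array)

    issubclass(array, MutableSequence)            # True
    isinstance(array('b', [1, 2, 3]), Sequence)   # True, via MutableSequence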

--

___
Python tracker 
<https://bugs.python.org/issue25737>
___



[issue25737] array is not a Sequence

2019-02-14 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
versions: +Python 3.7, Python 3.8 -Python 3.5

___
Python tracker 
<https://bugs.python.org/issue25737>
___



[issue25737] array is not a Sequence

2019-02-14 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This should not be closed as a duplicate. Yes, array.array isn't automatically 
a Sequence, but since it isn't, the array module should be modified to 
explicitly do the equivalent of:

import _collections_abc

_collections_abc.Sequence.register(array)

so it's properly registered manually.

--
nosy: +josh.r
resolution: duplicate -> 
status: closed -> open
superseder: issubclass without registration only works for "one-trick pony" 
collections ABCs. -> 

___
Python tracker 
<https://bugs.python.org/issue25737>
___



[issue35190] collections.abc.Sequence cannot be used to test whether a class provides a particular interface (doc issue)

2019-02-14 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Wait, why should #25737 be closed? This bug is a docs issue; collections.abc 
shouldn't claim that all the ABCs do duck-typing checks since Sequence doesn't. 
But #25737 is specific: array.array *should* be registered as a Sequence, but 
isn't; that requires a code fix (to make array perform the registration), not a 
doc fix.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35190>
___



[issue15753] No-argument super in method with variable arguments raises SystemError

2019-02-13 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Moving from pending back to open (not sure what was "pending" about it?).

The workaround is viable (and used by Python-implemented dict subclasses in the 
standard library, since they must accept **kwargs with arbitrary strings, 
including self), but it does seem a little silly that it's required. Leaving it 
as low priority since the workaround exists.

Still, it would be nice to make super() seamless, pulling the first argument if 
the function accepts it as non-varargs, and the first element of the first 
argument if it's varargs. If someone reassigns self/args, that's on them; it's 
fine to raise a RuntimeError if they use no-arg super(), requiring them to use 
two-arg super explicitly in that case.
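For concreteness, a sketch of that workaround (the Record name is illustrative, 
not code from the standard library): accept the instance via *args so a keyword 
literally named "self" still works, and use two-argument super() explicitly 
because zero-argument super() can't find the instance when there is no named 
first parameter.

    class Record(dict):
        def __init__(*args, **kwargs):    # no named "self" parameter
            self, *rest = args            # the instance is the first positional arg
            super(Record, self).__init__(*rest, **kwargs)

    Record(self='a key literally named self', other=1)
    # -> {'self': 'a key literally named self', 'other': 1}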

--
priority: normal -> low
status: pending -> open
versions: +Python 3.8 -Python 3.2, Python 3.3, Python 3.4

___
Python tracker 
<https://bugs.python.org/issue15753>
___



[issue35988] Python interpreter segfault

2019-02-13 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

"your application is using more memory than what is available in the system." 

Well, it alone may not be using more memory, but the cumulative usage on the 
system is "too high" by whatever metric the OOM killer is using (IIRC the 
default rule is that actual committed memory must be less than swap size + 50% 
of RAM). The OOM killer is a strange and terrible beast, and the behavior 
varies based on configuration, relative memory usage of each process grouping, 
minimizing number of processes killed, etc.

You can find deep implementation details on it (including how to disable a 
given process for consideration) here: https://linux-mm.org/OOM_Killer

The real solution to problems like this usually amounts to:

1. Install more RAM.

2. Increase the size of your swap partition. This doesn't "fix" being short of 
memory if you're actually using more memory than you have RAM, but it lets you 
handle overcommit (particularly for fork+exec scenarios where the forked 
process's memory will be freed momentarily) without the OOM killer getting 
involved, and lets you limp along slowly, without actually failing, if you do 
allocate and use more memory than you have RAM.

3. Tweak the overcommit heuristics to allow more overcommit before invoking the 
OOM killer.

4. Disable overcommit entirely, so memory allocations fail immediately if 
sufficient backing storage is not available, rather than succeeding, only to 
invoke the OOM killer when the allocated memory gets used and the shortage is 
discovered. This is a good solution if the program(s) in question aren't poorly 
designed such that they try to allocate many GBs of memory up front even when 
they're unlikely to need it; unfortunately, there are commonly used programs 
that overallocate like this and render this solution non-viable if they're part 
of your setup.

Regardless, this isn't a bug in Python itself. Any process that uses a lot of 
memory (Apache, MySQL) and hasn't explicitly removed itself from OOM killer 
consideration is going to look tasty when an OOM scenario occurs.

--
nosy: +josh.r
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35988>
___



[issue35961] test_slice: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small

2019-02-12 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

+1 on PR 11830 from me. We can always revisit if #11107 is ever implemented and 
it turns out that the reference count manipulation makes startup too slow due 
to all the comparisons triggered by slice interning (unlikely, but theoretically 
possible, I guess).

--

___
Python tracker 
<https://bugs.python.org/issue35961>
___



[issue35961] test_slice: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small

2019-02-12 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Ah, I see Victor posted an alternative PR that avoids the reference counting 
overhead by explicitly removing the temporary tuples from GC tracking. I'm 
mildly worried by that approach, only because the only documented use case for 
PyObject_GC_UnTrack is in tp_dealloc (that said, the code explicitly allows it 
to be called twice due to the Py_TRASHCAN mechanism, so it's probably safe so 
long as the GC design never changes dramatically). If slice comparison really 
is performance sensitive enough to justify this, so be it, but I'd personally 
prefer to reduce the custom code involved in a rarely used code path (we're not 
even caching constant slices yet, so no comparisons are likely to occur for 
99.99% of slices, right?).

--
nosy: +josh.r
versions: +Python 3.6, Python 3.7

___
Python tracker 
<https://bugs.python.org/issue35961>
___



[issue35961] test_slice: gc_decref: Assertion "gc_get_refs(g) > 0" failed: refcount is too small

2019-02-12 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Victor found the same bug I found while I was composing this, posting only to 
incorporate proposed solution:

I *think* I have a cause for this, but someone else with greater understanding 
of the cycle collector should check me if the suggested fix has non-trivial 
performance implications (I suspect the answer is no, performance is 
unaffected).

slice_richcompare borrows its behavior from tuple by creating a temporary tuple 
for each slice, then delegating to the tuple comparison ( 
https://github.com/python/cpython/blob/master/Objects/sliceobject.c#L591 ).

Problem is, it uses borrowed references when creating said tuples, not owned 
references. Because test_slice's BadCmp.__eq__ is implemented in Python, the 
comparison can be interrupted by cycle collection during the __eq__ call. When 
that happens, there are precisely two references to the BadCmp object:

1. In the slice (owned)
2. In the temporary tuple (borrowed)

When a cycle collection occurs during the comparison, and subtract_refs ( 
https://github.com/python/cpython/blob/master/Modules/gcmodule.c#L398 ) is 
called, the BadCmp object in question is visited via both the slice and the 
tuple, and since it has no non-container objects referencing it, it ends up 
with the initial reference count of 1 attempting to drop to -1, and the 
assertion is violated. While the code of gcmodule.c appears to have been 
refactored since 3.7 so the assert occurs in a different function, with a 
slightly different message, it would break the same way in both 3.7 and master, 
and whether or not it triggers the bug, the broken behavior of 
slice_richcompare hasn't changed for a *long* time. 

Underlying problem would seem to be slice's richcompare believing it's okay to 
make a tuple from borrowed references, then make a call on it that can trigger 
calls into Python level code (and therefore into the cycle collector); 
everything else is behaving correctly here. I'm guessing the only reason it's 
not seen in the wild is that slices based on Python defined types are almost 
never compared at all, let alone compared on debug builds that would be 
checking the assert and with an accelerated cycle collection cycle that would 
make a hit likely.

Solution would be to stop trying to microoptimize slice_richcompare to avoid 
reference count manipulation and just build a proper tuple. It would even 
simplify the code since we could just use PyTuple_Pack, reducing custom code by 
replacing:

    t1 = PyTuple_New(3);
    if (t1 == NULL)
        return NULL;
    t2 = PyTuple_New(3);
    if (t2 == NULL) {
        Py_DECREF(t1);
        return NULL;
    }

    PyTuple_SET_ITEM(t1, 0, ((PySliceObject *)v)->start);
    PyTuple_SET_ITEM(t1, 1, ((PySliceObject *)v)->stop);
    PyTuple_SET_ITEM(t1, 2, ((PySliceObject *)v)->step);
    PyTuple_SET_ITEM(t2, 0, ((PySliceObject *)w)->start);
    PyTuple_SET_ITEM(t2, 1, ((PySliceObject *)w)->stop);
    PyTuple_SET_ITEM(t2, 2, ((PySliceObject *)w)->step);

with:

    t1 = PyTuple_Pack(3, ((PySliceObject *)v)->start,
                         ((PySliceObject *)v)->stop,
                         ((PySliceObject *)v)->step);
    if (t1 == NULL)
        return NULL;
    t2 = PyTuple_Pack(3, ((PySliceObject *)w)->start,
                         ((PySliceObject *)w)->stop,
                         ((PySliceObject *)w)->step);
    if (t2 == NULL) {
        Py_DECREF(t1);
        return NULL;
    }

and makes cleanup simpler, since you can just delete:

    PyTuple_SET_ITEM(t1, 0, NULL);
    PyTuple_SET_ITEM(t1, 1, NULL);
    PyTuple_SET_ITEM(t1, 2, NULL);
    PyTuple_SET_ITEM(t2, 0, NULL);
    PyTuple_SET_ITEM(t2, 1, NULL);
    PyTuple_SET_ITEM(t2, 2, NULL);

and let the DECREFs for t1/t2 do their work normally.

If for some reason the reference count manipulation is unacceptable, this 
*could* switch between two behaviors depending on whether or not 
start/stop/step are of known types (e.g. if all are NoneType/int, this could 
use the borrowed refs code path safely) where a call back into Python level 
code is impossible; given that slices are usually made of None and/or ints, 
this would remove most of the cost for the common case, at the expense of more 
complicated code. Wouldn't help numpy types though, and I suspect the cost of 
pre-checking the types for all six values involved would eliminate most of the 
savings.

Sorry for not submitting a proper PR; the work machine I use during the day is 
not suitable for development (doesn't even have Python installed).

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35961>
___



[issue35918] multiprocessing's SyncManager.dict.has_key() method is broken

2019-02-12 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35918>
___



[issue5996] abstract class instantiable when subclassing built-in types

2019-02-12 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Closed #35958 as a duplicate of this issue (and updated the title, since 
clearly the problem is not specific to dict).

Patch probably needs to be rebased/rewritten against latest trunk (given it 
dates from Mercurial days).

--
nosy: +Jon McMahon, josh.r
stage:  -> patch review
title: abstract class instantiable when subclassing dict -> abstract class 
instantiable when subclassing built-in types
versions: +Python 3.5, Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue5996>
___



[issue35904] Add statistics.fmean(seq)

2019-02-05 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Correct me if I'm wrong, but at least initially, the first listed goal of 
statistics (per the PEP) was:

"Correctness over speed. It is easier to speed up a correct but slow function 
than to correct a fast but buggy one."

numpy already exists for people who need insane speed for these algorithms and 
are willing to compromise accuracy; am I wrong in my impression that statistics 
is more about providing correct batteries included that are fast enough for 
simple uses, not reimplementing numpy piece by piece for hardcore number 
crunching?

Even if such a function were desirable, I don't like the naming symmetry 
between fsum and fmean; it's kind of misleading. math.fsum is a slower, but 
more precise, version of the built-in sum. Having statistics.fmean be a faster, 
less accurate, version of statistics.mean reverses that relationship between 
the f-prefixed and non-f-prefixed versions of a function.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35904>
___



[issue35862] Change the environment for a new process

2019-01-31 Thread Josh Rosenberg


Change by Josh Rosenberg :


Removed file: https://bugs.python.org/file48088/bq-nix.snapshot.json

___
Python tracker 
<https://bugs.python.org/issue35862>
___



[issue35862] Change the environment for a new process

2019-01-31 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Why isn't it an option to have the target assign the relevant os.environ keys 
before doing whatever depends on the environment?
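A minimal sketch of what that looks like (the worker/MY_SETTING names are 
illustrative, not from the OP's code):

    import multiprocessing
    import os

    def worker(env_overrides):
        os.environ.update(env_overrides)   # takes effect only in this child process
        # ... work that depends on the environment goes here ...
        print(os.environ['MY_SETTING'])

    if __name__ == '__main__':
        p = multiprocessing.Process(target=worker, args=({'MY_SETTING': 'value'},))
        p.start()
        p.join()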

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35862>
___



[issue35862] Change the environment for a new process

2019-01-31 Thread Josh Rosenberg


Change by Josh Rosenberg :


Removed file: https://bugs.python.org/file48087/core-nix.snapshot.json

___
Python tracker 
<https://bugs.python.org/issue35862>
___



[issue35862] Change the environment for a new process

2019-01-31 Thread Josh Rosenberg


Change by Josh Rosenberg :


Removed file: https://bugs.python.org/file48089/bq-nix.manifest

___
Python tracker 
<https://bugs.python.org/issue35862>
___



[issue35862] Change the environment for a new process

2019-01-31 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
Removed message: https://bugs.python.org/msg334593

___
Python tracker 
<https://bugs.python.org/issue35862>
___



[issue35866] concurrent.futures deadlock

2019-01-31 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I've only got 3.7.1 Ubuntu bash on Windows (also amd64) immediately available, 
but I'm not seeing a hang, nor is there any obvious memory leak that might 
eventually lead to problems (memory regularly drops back to under 10 MB shared, 
24 KB private working set). I modified your code to add a sys.stdout.flush() 
after the write so it would actually echo the dots as they were written instead 
of waiting for a few thousand of them to build up in the buffer, but otherwise 
it's the same code.

Are you sure you're actually hanging, and it's not just the output getting 
buffered?
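For reference, the shape of the check (not the OP's exact code): with an explicit 
flush after each write, progress shows up immediately instead of sitting in the 
stdout buffer and looking like a hang.

    import sys
    import concurrent.futures

    def work(i):
        return i * i

    if __name__ == '__main__':
        with concurrent.futures.ProcessPoolExecutor() as pool:
            for _ in pool.map(work, range(1000)):
                sys.stdout.write('.')
                sys.stdout.flush()   # the flush added when reproducing the report
        print()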

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35866>
___



[issue35842] A potential bug about use of uninitialised variable

2019-01-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

One additional note, just in case you're wondering. slice explicitly does not 
set Py_TPFLAGS_BASETYPE (in either Py2 or Py3), so you can't make a subclass of 
slice with NULLable fields by accident (you'll get a TypeError the moment you 
try to define it). There is one, and only one, slice type, and its fields are 
never NULL.

--

___
Python tracker 
<https://bugs.python.org/issue35842>
___



[issue35842] A potential bug about use of uninitialised variable

2019-01-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Yes, the 2.7 version of _PyEval_SliceIndex would bypass the NULL pointer 
dereference, so *if* you could make a slice with a NULL stop value, you could 
trigger a read from uninitialized stack memory, rather than dying due to a NULL 
pointer dereference.

But just like Python 3, 2.7's PySlice_New explicitly replaces all NULLs with 
None ( https://github.com/python/cpython/blob/2.7/Objects/sliceobject.c#L60 ), 
so such a slice cannot exist.

Since you can't make a slice with a NULL value through any supported API, and 
any unsupported means of doing this means you already have the ability to 
execute arbitrary code (and do far worse things that just trigger a read from 
an uninitialized C stack value), the fact that _PyEval_SliceIndex returns 
success for v == NULL is irrelevant; v isn't NULL in any code path aside from the 
specific one documented (the SLICE opcode, gone in Py3, which can pass in NULL, 
but uses defaults of 0 and PY_SSIZE_T_MAX for low and high respectively, so the 
silent success just leaves the reasonable defaults set), because all other uses 
use slice objects as the source for v, and they cannot have NULL values.

--
resolution:  -> not a bug
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35842>
___



[issue35707] time.sleep() should support objects with __float__

2019-01-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

You've got a reference leak in your __index__ based paths. PyNumber_Index is 
returning a new reference (either to the existing obj, or a new one, if the 
existing obj isn't already an int). You never release this reference. Simplest 
fix is to make intobj top level, initialized to NULL, and Py_XDECREF it along 
the convert_from_int code path (you can't DECREF it in the index specific path 
because it needs to survive past the goto, since it's replacing obj).

I'm also mildly concerned by how duplicative the code becomes post-patch. If 
it's not a major performance hit (don't think it is; not even sure the API is 
even used anymore), perhaps just implement _PyTime_ObjectToTime_t as a wrapper 
for _PyTime_ObjectToDenominator (with a denominator of 2, so rounding 
simplifies to just 0 == round down, 1 == round up)?

Example:

int
_PyTime_ObjectToTime_t(PyObject *obj, time_t *sec, _PyTime_round_t round)
{
    long numerator;
    if (_PyTime_ObjectToDenominator(obj, sec, &numerator, 2, round) == 0) {
        if (numerator) {
            if (*sec == _Py_IntegralTypeMax(time_t)) {
                error_time_t_overflow();
                return -1;
            }
            ++*sec;
        }
        return 0;
    }
    return -1;
}

Sorry for not commenting on GitHub, but my work computer has a broken Firefox 
that GitHub no longer supports properly.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35707>
___



[issue35431] Add a function for computing binomial coefficients to the math module

2019-01-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Steven: I'm assuming Brett rearranged the title to put emphasis on the new 
function and to place it earlier in the title. Especially important if you're 
reading e-mails with the old subject on an e-mail client with limited subject 
preview lengths, you end up seeing something like:

"The math module should provide a function for computing..."

rather than the more useful:

"Add a function for computing binomial coefficients to t..."

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35431>
___



[issue35842] A potential bug about use of uninitialised variable

2019-01-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Your analysis would be (almost) correct if a slice object could have a stop 
value of NULL. It's wrong in that the error would be a NULL dereference, not a 
silent use of an uninitialized value, but it would be a bug. In your scenario 
where v == NULL, it would pass the test for v != Py_None, then call 
PyIndex_Check(v), and since the macro doesn't check for the passed value being 
NULL, it would perform a NULL dereference.

But even that's not possible; PySlice_New (which is ultimately responsible for 
all slice construction) explicitly replaces any argument of NULL with Py_None, 
so there is no such thing as a slice with *any* value being NULL.

So since r->stop is definitely non-NULL, either:

1. It's None, PySlice_Unpack line 232 executes, and stop is initialized

or

2. It's non-None, _PyEval_SliceIndex is called with a v that is definitely not 
None and non-NULL, so it always enters the `if (v != Py_None) {` block, and 
either it received a valid index integer, in which case it initializes *pi (aka 
stop) and returns 1 (success), or returns 0 (failure), which means stop is 
never used.

The only way you could trigger your bug is to make a slice with an actual NULL 
for its stop value (and as noted, the bug would be a NULL dereference in 
PyIndex_Check, not a use of an uninitialized value, because v != Py_None would 
return true for v == NULL), which is only possible through intentionally 
misusing PySliceObject (reaching in and tweaking values of the struct 
directly). And if you can do that, you're already a C extension (or ctypes 
code) and can crash the interpreter any number of ways without resorting to 
this level of complexity.

--
nosy: +josh.r
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35842>
___



[issue20399] Comparison of memoryview

2019-01-18 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Not my use case specifically, but my link in the last post (repeated below) was 
to a StackOverflow answer to a problem where using buffer was simple and fast, 
but memoryview required annoying workarounds. Admittedly, in most cases it's 
people wanting to do this with strings, so in Python 3 it only actually works 
if you convert to bytes first (possibly wrapped in a memoryview cast to a 
larger width if you need to support ordinals outside the latin-1 range). But it 
seems a valid use case.

Examples where rich comparisons were needed include:

Effcient way to find longest duplicate string for Python (From Programming 
Pearls) - https://stackoverflow.com/a/13574862/364696 (which provides a 
side-by-side comparison of code using buffer and memoryview, and memoryview 
lost, badly)

strcmp for python or how to sort substrings efficiently (without copy) when 
building a suffix array - https://stackoverflow.com/q/2282579/364696 (a case 
where they needed to sort based on potentially huge suffixes of huge strings, 
and didn't want to end up copying all of them)

--

___
Python tracker 
<https://bugs.python.org/issue20399>
___



[issue20399] Comparison of memoryview

2019-01-18 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The lack of support for the rich comparison operators on even the most basic 
memoryviews (e.g. 'B' format) means that memoryview is still a regression from 
some of the functionality buffer offered back in Python 2 ( 
https://stackoverflow.com/a/13574862/364696 ); you either need to convert back 
to bytes (losing the zero-copy behavior) or hand-write a comparator of your own 
to allow short-circuiting (which thanks to sort not taking cmp functions 
anymore, means you need to write it, then wrap it with functools.cmp_to_key if 
you're sorting, not just comparing individual items).
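A sketch of those workarounds for the sorting case (this copies each view back 
to bytes, which is exactly the zero-copy loss being complained about):

    from functools import cmp_to_key

    data = b'banana'
    views = [memoryview(data)[i:] for i in range(len(data))]   # zero-copy suffixes

    # Workaround 1: key function that copies back to bytes.
    by_copy = sorted(views, key=bytes)

    # Workaround 2: hand-written comparator wrapped with cmp_to_key (a real
    # version would walk the views element by element to short-circuit).
    def cmp_views(a, b):
        a, b = bytes(a), bytes(b)
        return (a > b) - (a < b)

    by_cmp = sorted(views, key=cmp_to_key(cmp_views))
    assert [bytes(v) for v in by_copy] == [bytes(v) for v in by_cmp]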

While I'll acknowledge it gets ugly to try to support every conceivable format, 
it seems like, at the very least, we could provide the same functionality as 
buffer for 1D contiguous memoryviews in the 'B' and 'c' formats (both of which 
should be handleable by a simple memcmp).

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue20399>
___



[issue35757] slow subprocess.Popen(..., close_fds=True)

2019-01-17 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Others can correct me if I'm wrong, but I'm fairly sure 2.7 isn't making 
changes unless they fix critical or security-related bugs.

The code here is suboptimal, but it's already been fixed in Python 3 (in 
#8052), as part of a C accelerator module (that reduces the risk of race 
conditions and other conflicts your Python level fix entails). Unless someone 
corrects me, I'll close this as "Won't Fix".

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35757>
___



[issue35701] [uuid] 3.8 breaks weak references for UUIDs

2019-01-17 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The UUID module documentation (and docstring) begin with:

"This module provides immutable UUID objects"

Immutable is a stronger guarantee than __slots__ enforces already, so the 
documentation already ruled out adding arbitrary attributes to UUID (and the 
__setattr__ that unconditionally raised TypeError('UUID objects are immutable') 
supported that).

Given the behavior hasn't changed in any way that contradicts the docs, nor 
would it affect anyone who wasn't intentionally working around the __setattr__ 
block, I don't feel a need to mention the arbitrary attribute limitation.

It's fine to leave in the What's New note (it is a meaningful memory savings 
for applications using lots of UUIDs), but the note can simplify to just:

"""uuid.UUID now uses __slots__ to reduce its memory footprint."""

--

___
Python tracker 
<https://bugs.python.org/issue35701>
___



[issue35701] [uuid] 3.8 breaks weak references for UUIDs

2019-01-16 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

David, the What's New note about weak references no longer being possible 
should be removed as part of this change. I'm not sure the note on arbitrary 
attributes no longer being addable is needed either (__setattr__ blocked that 
beforehand, it's just even more impossible now).

--

___
Python tracker 
<https://bugs.python.org/issue35701>
___



[issue29871] Enable optimized locks on Windows

2019-01-15 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I assume you meant #35662 (based on the superseder note in the history).

--

___
Python tracker 
<https://bugs.python.org/issue29871>
___



[issue35712] Make NotImplemented unusable in boolean context

2019-01-10 Thread Josh Rosenberg


New submission from Josh Rosenberg :

I don't really expect this to go anywhere until Python 4 (*maybe* 3.9 after a 
deprecation period), but it seems like it would have been a good idea to make 
NotImplementedType's __bool__ explicitly raise a TypeError (rather than leaving 
it unset, so NotImplemented evaluates as truthy). Any correct use of 
NotImplemented per its documented intent would never evaluate it in a boolean 
context, but rather use identity testing, e.g. back in the Py2 days, the 
canonical __ne__ delegation to __eq__ for any class should be implemented as 
something like:

    def __ne__(self, other):
        equal = self.__eq__(other)
        return equal if equal is NotImplemented else not equal

Problem is, a lot of folks would make mistakes like doing:

    def __ne__(self, other):
        return not self.__eq__(other)

which silently returns False when __eq__ returns NotImplemented, rather than 
returning NotImplemented and allowing Python to check the mirrored operation. 
Similar issues arise when hand-writing the other rich comparison operators in 
terms of each other.

It seems like, given NotImplemented is a sentinel value that should never be 
evaluated in a boolean context, at some point it might be nice to explicitly 
prevent it, to avoid errors like this.
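A quick demonstration of the silent failure (NotImplemented is currently truthy, 
so the buggy __ne__ returns False instead of letting Python try the reflected 
comparison):

    class Broken:
        def __eq__(self, other):
            if not isinstance(other, Broken):
                return NotImplemented
            return True

        def __ne__(self, other):
            return not self.__eq__(other)    # wrong: swallows NotImplemented

    Broken() != object()    # False with a truthy NotImplemented, even though the
                            # comparison was never really answered; the reflected
                            # __ne__ is never tried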

Main argument against it is that I don't know of any other type/object that 
explicitly makes itself unevaluable in a boolean context, so this could be 
surprising if someone uses NotImplemented as a sentinel unrelated to its 
intended purpose and suffers the problem.

--
messages: 333421
nosy: josh.r
priority: normal
severity: normal
status: open
title: Make NotImplemented unusable in boolean context
type: behavior

___
Python tracker 
<https://bugs.python.org/issue35712>
___



[issue35698] Division by 2 in statistics.median

2019-01-09 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

vstinner: The problem isn't the averaging, it's the type inconsistency. In both 
examples (median([1]), median([1, 1])), the median is unambiguously 1 (no 
actual average is needed; the values are identical), yet it gets converted to 
1.0 only in the latter case.

I'm not sure it's possible to fix this though; right now, there is consistency 
among two cases:

1. When the length is odd, you get the median by identity (and therefore type 
and value are unchanged)
2. When the length is even, you get the median by adding and dividing by 2 (so 
for ints, the result is always float).

A fix that changed that would add yet another layer of complexity:

1. When the length is odd, you get the median by identity (and therefore type 
and value are unchanged)
2. When the length is even, 
  a. If the two middle values are equal (possibly only if they have equal types 
as well, to resolve the issue with [1, 1.0] or [1, True]), return the first of 
the two middle values (median by identity as in #1)
  b. Otherwise, you get the median by adding and dividing by 2

And note the required type checking in 2a required to even make it that 
consistent. Even if we accepted that, we'd pretty quickly get into a debate 
over whether median([3, 5]) should try to return 4 instead of 4.0, given that 
the median is representable in the source type (which would further damage 
consistency).

If anything, I think the best design would have been to *always* include a 
division step (so odd length cases performed middle_elem / 1, while even did 
(middle_elem1 + middle_elem2) / 2) so the behavior was consistent regardless of 
odd vs. even input length, but that ship has probably sailed, given the 
documented behavior specifically notes that the precise middle data point is 
itself returned for the odd case.

I think the solution for people concerned is to explicitly convert int values 
to be median-ed to fractions.Fraction (or decimal.Decimal) ahead of time, so 
floating point math never gets involved, and the return type is consistent 
regardless of length.
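For example (the Fraction inputs keep the even-length average exact and keep the 
result type independent of length):

    from fractions import Fraction
    from statistics import median

    median([1])                          # 1   (int, returned by identity)
    median([1, 1])                       # 1.0 (float, via division by 2)
    median([Fraction(1), Fraction(1)])   # Fraction(1, 1), exact
    median(map(Fraction, [3, 5]))        # Fraction(4, 1), exact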

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35698>
___



[issue35700] Place, Pack and Grid should return the widget

2019-01-09 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Closing as rejected; to my knowledge, *no* built-in Python method both mutates 
an object and returns the object just mutated, precisely because:

1. It allows for chaining that leads fairly quickly to unreadable code (Python 
is not Perl/Ruby)

2. It creates doubt as to whether the original object was mutated or not (if 
list.sort returns a sorted list, it becomes unclear as to whether the original 
list was sorted as well, or whether a new list was returned; sortedlist = 
unsortedlist.sort() might give an inaccurate impression of what was going on). 
Zachary's example of using top-level functions to do the work instead is 
basically the same practicality compromise that sorted makes in relation to 
list.sort.

--
nosy: +josh.r
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35700>
___



[issue35701] 3.8 needlessly breaks weak references for UUIDs

2019-01-09 Thread Josh Rosenberg


New submission from Josh Rosenberg :

I 100% agree with the aim of #30977 (reduce uuid.UUID() memory footprint), but 
it broke compatibility for any application that was weak referencing UUID 
instances (which seems a reasonable thing to do; a strong reference to a UUID 
can be stored in a single master container or passed through a processing 
pipeline, while also keying WeakKeyDictionary with cached supplementary data). 
I specifically noticed this because I was about to do that very thing in a 
processing flow, then noticed UUIDs in 3.6 were a bit heavyweight, memory-wise, 
went to file a bug on memory usage to add __slots__, and discovered someone had 
already done it for me.

Rather than break compatibility in 3.8, why not simply include '__weakref__' in 
the __slots__ listing? It would also remove the need for a What's New level 
description of the change, since the description informs people that:

1. Instances can no longer be weak-referenced (which adding __weakref__ would 
undo)
2. Instances can no longer have arbitrary attributes added (which was already 
the case in terms of documented API, programmatically enforced via a 
__setattr__ override, so it seems an unnecessary thing to highlight outside of 
Misc/NEWS)

The cost of changing __slots__ from:

__slots__ = ('int', 'is_safe')

to:

__slots__ = 'int', 'is_safe', '__weakref__'

would only be 4-8 bytes (for 64 bit Python, total cost of object + int would go 
from 100 to 108 bytes, still about half of the pre-__slots__ cost of 212 
bytes), and avoid breaking any code that might rely on being able to weak 
reference UUIDs.

I've marked this as release blocker for the time being because if 3.8 actually 
releases with this change, it will cause back compat issues that might prevent 
people relying on UUID weak references from upgrading their code.

--
components: Library (Lib)
keywords: 3.7regression, easy
messages: 38
nosy: Nir Soffer, josh.r, serhiy.storchaka, taleinat, vstinner, wbolster
priority: release blocker
severity: normal
stage: needs patch
status: open
title: 3.8 needlessly breaks weak references for UUIDs
type: behavior
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue35701>
___



[issue35657] multiprocessing.Process.join() ignores timeout if child process use os.exec*()

2019-01-04 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Looks like the cause of the change was when os.pipe was changed to create 
non-inheritable pipes by default; if I monkey-patch 
multiprocessing.popen_fork.Popen._launch to use os.pipe2(0) instead of 
os.pipe() to get inheritable descriptors or just clear FD_CLOEXEC in the child 
with fcntl.fcntl(child_w, fcntl.F_SETFD, 0), the behavior returns to Python 2's 
behavior.

The problem is caused by the mismatch in lifetimes between the pipe fd and the 
child process itself; normally the pipe lives as long as the child process 
(it's never actually touched in the child process at all, so it just dies with 
the child), but when exec gets involved, the pipe is closed long before the 
child ends.

The code in Popen.wait that is commented with "This shouldn't block if wait() 
returned successfully" is probably the issue; wait() first waits on the parent 
side of the pipe fd, which returns immediately when the child execs and the 
pipe is closed. The code assumes the poll on the process itself can be run in 
blocking mode (since the process should have ended already), but this 
assumption is of course wrong.

Possible solutions:

1. No code changes; document that exec in worker processes is unsupported (use 
subprocess, possibly with a preexec_fn, for this use case).

2. Precede the call to process_obj._bootstrap() in the child with 
fcntl.fcntl(child_w, fcntl.F_SETFD, 0) to clear the CLOEXEC flag on the child's 
descriptor, so the file descriptor remains open in the child post-exec. Using 
os.pipe2(0) instead of os.pipe() in _launch would also work and restore the 
precise 3.3 and earlier behavior, but it would reintroduce race 
conditions with parent threads, so it's better to limit the scope to the child 
process alone, for the child's version of the fd alone.

3. Change multiprocessing.popen_fork.Popen.wait to use os.WNOHANG for all calls 
with a non-None timeout (not just timeout=0.0), rather than trusting 
multiprocessing.connection.wait's return value (which only says whether the 
pipe is closed, not whether the process is closed). Problem is, this would just 
change the behavior from waiting for the lifetime of the child no matter what 
to waiting until the exec and then returning immediately, even well before the 
timeout; it might also introduce race conditions if the fd registers as being 
closed before the process is fully exited. Point is, this approach would likely 
require a lot of subtle tweaks to make it work.

I'm in favor of either #1 or #2. #2 feels like intentionally opening a 
resource leak on the surface, but I think it's actually fine, since we already 
signed up for a file descriptor that would live for the life of the process; 
the fact that it's exec-ed seems sort of irrelevant.
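For reference, a sketch of the mechanism behind option #2, shown on a plain 
os.pipe() pair rather than multiprocessing's internal child_w (POSIX only):

    import fcntl
    import os

    r, w = os.pipe()                  # CLOEXEC / non-inheritable by default since 3.4
    print(os.get_inheritable(w))      # False

    flags = fcntl.fcntl(w, fcntl.F_GETFD)
    fcntl.fcntl(w, fcntl.F_SETFD, flags & ~fcntl.FD_CLOEXEC)
    print(os.get_inheritable(w))      # True: this descriptor now survives an exec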

--
keywords: +3.4regression

___
Python tracker 
<https://bugs.python.org/issue35657>
___



[issue35657] multiprocessing.Process.join() ignores timeout if child process use os.exec*()

2019-01-04 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I don't know what triggered the change, but I strongly suspect this is not a 
supported use of the multiprocessing module; Process is for worker processes 
(still running Python), and it has a lot of coordination machinery set up 
between parent and child (for use by, among other things,  join) that exec 
severs rather abruptly.

Launching unrelated child processes is what the subprocess module is for.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35657>
___



[issue35588] Speed up mod/divmod/floordiv for Fraction type

2018-12-28 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

divmod imposes higher fixed overhead in exchange for operating more efficiently 
on larger values.

Given the differences are small either way, and using divmod reduces 
scalability concerns for larger values (which are more likely to occur in code 
that delays normalization), I'd be inclined to stick with the simpler 
divmod-based implementation.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35588>
___



[issue35338] set union/intersection/difference could accept zero arguments

2018-12-10 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Given the "feature" in question isn't actually an intended feature (just an 
accident of how unbound methods work), I'm closing this. We're not going to try 
to make methods callable without self.

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35338>
___



[issue35438] Cleanup extension functions using _PyObject_LookupSpecial

2018-12-10 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Agreed with everything in Serhiy's comments. This patch disregards why 
_PyObject_LookupSpecial and the various _Py_IDENTIFIER related stuff was 
created in the first place (to handle a non-trivial task efficiently/correctly) 
in favor of trying to avoid C-APIs that are explicitly okay to use for the 
CPython standard extensions. The goal is a mistake in the first place; no patch 
fix will make the goal correct.

Closing as not a bug.

--
resolution:  -> not a bug
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35438>
___



[issue35438] Extension modules using non-API functions

2018-12-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Batteries-included extension modules aren't limited to the public and/or 
limited API; they use tons of undocumented internal APIs (everything to do with 
Py_IDENTIFIERs being an obvious and frequently used non-public API).

_PyObject_LookupSpecial is necessary to lookup special methods on the class of 
an instance (bypassing the instance itself) when no C level slot is associated 
with the special method (e.g. the math module using it to look up __ceil__ to 
implement math.ceil). Sure, each of these modules could reimplement it from 
scratch, but I'm not seeing the point in doing so.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35438>
___



[issue35434] Wrong bpo linked in What's New in 3.8

2018-12-06 Thread Josh Rosenberg

New submission from Josh Rosenberg :

https://docs.python.org/3.8/whatsnew/3.8.html#optimizations begins with:

shutil.copyfile(), shutil.copy(), shutil.copy2(), shutil.copytree() and 
shutil.move() use platform-specific “fast-copy” syscalls on Linux, macOS and 
Solaris in order to copy the file more efficiently. ... more explanation ... 
(Contributed by Giampaolo Rodola’ in bpo-25427.)

That's all correct, except bpo-25427 is about removing the pyvenv script; it 
should be referencing bpo-33671.

--
assignee: docs@python
components: Documentation
keywords: easy
messages: 331264
nosy: docs@python, giampaolo.rodola, josh.r
priority: low
severity: normal
status: open
title: Wrong bpo linked in What's New in 3.8
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue35434>
___



[issue11107] Cache constant "slice" instances

2018-12-03 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
versions: +Python 3.8 -Python 3.5

___
Python tracker 
<https://bugs.python.org/issue11107>
___



[issue35338] set union/intersection/difference could accept zero arguments

2018-11-30 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

set.union() without constructing the set you call union on only happens to work 
for the set.union(a) case because `a` is already a set. union takes arbitrary 
iterables, not just sets, and you're just cheating by explicitly passing `a` as 
the expected self argument. If you'd set `a = [1, 2]` (a list, not a set), 
set.union(a) would fail, because set.union(a) was only working by accident of a 
being interpreted as self; any such use is misuse.

Point is, the zero args case isn't a unique corner case;

args = ([1, 2], ANY OTHER ITERABLES HERE)
set.union(*args)

fails too, because the first argument is interpreted as self, and must be a set 
for this to work.

SilentGhost's solution of constructing the set before union-ing via 
set().union(*args) is the correct solution; it's free of corner cases, removing 
the specialness of the first element in args (because self is passed in 
correctly), and not having any troubles with empty args.

intersection is the only interesting case here, where preconstruction of the 
empty set doesn't work, because that would render the result the empty set 
unconditionally. The solution there is set(args[0]).intersection(*args) (or 
*args[1:]), but that's obviously uglier.
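A quick demo of the calls in question (assuming a mix of iterable types in args):

    args = ([1, 2], range(3), {2, 5})

    set().union(*args)                      # {0, 1, 2, 5}; also fine if args is empty
    set(args[0]).intersection(*args[1:])    # {2}
    set(args[0]).intersection(*args)        # {2}; the redundant first argument is harmless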

I'm -1 on making any changes to set.union to support this misuse case.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35338>
___



[issue19865] create_unicode_buffer() fails on non-BMP strings on Windows

2018-11-28 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
keywords:  -3.2regression

___
Python tracker 
<https://bugs.python.org/issue19865>
___



[issue19865] create_unicode_buffer() fails on non-BMP strings on Windows

2018-11-28 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
keywords: +3.2regression
versions: +Python 3.6, Python 3.7, Python 3.8

___
Python tracker 
<https://bugs.python.org/issue19865>
___



[issue35314] fnmatch failed with leading caret (^)

2018-11-26 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Finished typing this while Serhiy was closing, but just for further explanation:

This isn't a bug. fnmatch provides "shell-style" wildcards, but that doesn't 
mean it supports every shell's extensions to the globbing syntax. It doesn't 
even claim support for full POSIX globbing syntax. The docs explicitly specify 
support for only four forms:

*
?
[seq]
[!seq]

There is no support for [^seq]; [^seq] isn't even part of POSIX globbing per 
glob(7):

"POSIX has declared the effect of a wildcard pattern "[^...]" to be undefined."

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35314>
___



[issue35303] A reference leak in _operator.c's methodcaller_repr()

2018-11-26 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This is completely fixed, right? Just making sure there is nothing left to be 
done to close the issue.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35303>
___



[issue35273] 'eval' in generator expression behave different in dict from list

2018-11-20 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The "bug" is the expected behavior for 2.7, as previously noted, and does not 
exist on Python 3 (where list comprehensions follow the same rules as generator 
expressions for scoping), where NameErrors are raised consistently.

--
nosy: +josh.r
resolution:  -> not a bug
versions: +Python 2.7 -Python 3.6

___
Python tracker 
<https://bugs.python.org/issue35273>
___



[issue34805] Explicitly specify `MyClass.__subclasses__()` returns classes in definition order

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Keep in mind, had this guarantee been in place in 3.4, the improvement in 3.5 
couldn't have been made, and issue17936 might have been closed and never 
addressed, even once dicts were ordered, simply because we never rechecked it. 
It makes the whole "potential downside" more obvious, because we would have 
paid that price not so long ago. Knowing that 3.5 improved by breaking this 
guarantee was part of what made me cautious here.

--

___
Python tracker 
<https://bugs.python.org/issue34805>
___



[issue34805] Explicitly specify `MyClass.__subclasses__()` returns classes in definition order

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I wrote the response to the OP's use case before I saw your response; it wasn't 
really intended as an additional critique of the proposed change or a 
counterargument to your post, just a note that the required behavior could be 
obtained on all versions of Python via metaclasses, including on 3.5.

I have no specific plans to rewrite the typeobject.c, nor make a C implemented 
WeakSet. I'm just leery of adding language guarantees that limit future 
development when they:

1. Provide very little benefit (I doubt one package in 10,000 even uses 
__subclasses__, let alone relies on its ordering)

2. The benefit is achievable without herculean efforts with existing tools 
(metaclasses can provide the desired behavior with minimal effort at the 
trivial cost of an additional side-band dict on the root class)

If the guarantee never limits a proposed change, then our best case scenario is 
we provided a guarantee that benefits almost no one (guaranteed upside 
minimal). But if it limits a proposed change, we might lose out on a 
significant improvement in performance, code maintainability, what have you 
(much larger potential downside). I'm just not seeing enough of a benefit to 
justify the potential cost.

--

___
Python tracker 
<https://bugs.python.org/issue34805>
___



[issue34805] Explicitly specify `MyClass.__subclasses__()` returns classes in definition order

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I'm also a little skeptical of the OP's proposed use case for other reasons. In 
any circumstance other than "all classes are defined in the same module", you 
can't really make useful guarantees about subclass definition order, because:

1. If the converters are defined in multiple modules in a single package, the 
module with IntConverter could be imported first explicitly, and now 
BoolConverter will come second.

2. If all the provided converters occur in a single monolithic module, and some 
other package tries to make a converter for their own int subclass, well, 
IntConverter is already first in the list of subclasses, so the other package's 
converter will never be called (unless it's for the direct subclass of int, 
rather than a grandchild of int, but that's an implementation detail of the 
OP's project).

Essentially, to write such a class hierarchy properly, you'd need to rejigger 
the ordering each time a class was registered such that any converter for a 
parent class was pushed until after the converter for all of its descendant 
classes (and if there is multiple inheritance involved, you're doomed).

Even ignoring all that, their use case doesn't require explicit registration if 
they don't want it to. By making a simple metaclass for the root class, the 
metaclass's __new__ can perform registration on the descendant class's behalf, 
either with the definition time ordering of the current design, or with a more 
complicated rejiggering I described that would be necessary to ensure parent 
classes are considered after child classes (assuming no multiple inheritance).
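A sketch of that metaclass approach, using the simpler definition-order variant 
(the Converter/registry names are illustrative, not the OP's):

    class ConverterMeta(type):
        def __new__(mcls, name, bases, ns):
            cls = super().__new__(mcls, name, bases, ns)
            if bases:                      # skip the root class itself
                cls.registry.append(cls)
            return cls

    class Converter(metaclass=ConverterMeta):
        registry = []                      # ordered side-band list on the root class

    class IntConverter(Converter): pass
    class BoolConverter(IntConverter): pass

    Converter.registry   # IntConverter then BoolConverter, in definition order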

--

___
Python tracker 
<https://bugs.python.org/issue34805>
___



[issue34805] Explicitly specify `MyClass.__subclasses__()` returns classes in definition order

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

First off, the OP's original case seems like a use case for 
functools.singledispatch. Not really related to the problem, just thought I'd 
mention it.

Secondly, are we sure we want to make such a guarantee? That restricts the 
underlying storage to ordered types (list/dict; possibly tuple at the cost of 
making modifications slightly more expensive), or an unordered type with 
additional ordering layered on it (like old-school OrderedDict).

That does tie our hands in the future. For example, it seems like it would be a 
perfectly reasonable approach for the internal collection of subclasses to be 
implemented as a weakref.WeakSet (some future version of it implemented in C, 
rather than the current Python layer version) so as to reduce code duplication 
and improve handling when a subclass disappears. Right now, tp_subclasses is a 
dict keyed by the raw memory address of the subclass (understandable, but eww), 
with a value of a weakref to the subclass itself. There is tons of custom code 
involved in handling this (e.g. the dict only self-cleans because the dealloc 
for classes explicitly removes the subclass from the parent classes, but every 
use of the dict still has to assume weakrefs have gone dead anyway, because of 
reentrancy issues; these are solved problems in WeakSet which hides all the 
complexity from the user). Being able to use WeakSets would mean a huge amount 
of special purpose code in typeobject.c could go away, but guaranteeing 
ordering would make that more difficult (it would require 
providing an ordering guarantee for WeakSet, which, being built on set, would 
likely require ordering guarantees for sets in general, or changing WeakSet to 
be built on dicts).

There is also (at least) one edge case that would need to be fixed (based on a 
brief skim of the code). type_set_bases (which handles assignment to __bases__ 
AFAICT, admittedly a niche use case) simplified its own implementation by 
making the process of changing __bases__ be to remove itself as a subclass of 
all of its original bases, then add itself as a subclass of the new bases. This 
is done even if there are overlaps in the bases, and even if the new bases are 
the same.

Minimal repro:

>>> class A: pass
>>> class B(A): pass
>>> class C(A): pass
>>> A.__subclasses__()  # Appear in definition order
[__main__.B, __main__.C]

>>> B.__bases__ = B.__bases__  # Should be no-op...
>>> A.__subclasses__()   # But oops, order changed
[__main__.C, __main__.B]

I'm not going to claim this is common or useful (I've done something like this 
exactly once, interactively, while making an OrderedCounter from OrderedDict 
and Counter back before dicts were ordered; I got the inheritance order wrong 
and reversed it after the fact), but making the guarantee would be more than 
just stating it; we'd either have to complicate the code to back it up, or 
qualify the guarantee with some weird, possibly CPython-specific details.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34805>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35182] Popen.communicate() breaks when child closes its side of pipe but not exits

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Hmm... Correction to my previous post. communicate itself has a test for:

"if self._communication_started and input:"

that raises an error if it passes, so the second call to communicate can only 
be passed None/empty input. And _communicate only explicitly closes self.stdin 
when input is falsy and _communication_started is False, so the required 
behavior right now is:

1. First call *may* pass input
2. Second call must not pass (non-empty) input under any circumstance

So I think we're actually okay on the code for stdin, but it would be a good 
idea to document that input *must* be None on all but the first call, and that 
the input passed to the first call is cached such that as long as at least one 
call to communicate completes without a TimeoutError (and the stdin isn't 
explicitly closed), it will all be sent.

Sorry for the noise; I should have rechecked communicate itself, not just 
_communicate.

--

___
Python tracker 
<https://bugs.python.org/issue35182>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35182] Popen.communicate() breaks when child closes its side of pipe but not exits

2018-11-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Sounds like the solution you'd want here is to just change each if check in 
_communicate, so instead of:

if self.stdout:
    selector.register(self.stdout, selectors.EVENT_READ)
if self.stderr:
    selector.register(self.stderr, selectors.EVENT_READ)

it does:

if self.stdout and not self.stdout.closed:
    selector.register(self.stdout, selectors.EVENT_READ)
if self.stderr and not self.stderr.closed:
    selector.register(self.stderr, selectors.EVENT_READ)

The `if self.stdin and input:` would also have to change. Right now it's buggy 
in a related, but far more complex way. Specifically if you call it with input 
the first time:

1. If some of the input is sent but not all, and the second time you call 
communicate you rely on the (undocumented, but necessary for consistency) input 
caching and don't pass input at all, it won't register the stdin handle for 
read (and in fact, will explicitly close the stdin handle), and the remaining 
cached data won't be sent. If you try to pass some other non-empty input, it 
just ignores it and sends whatever remains in the cache (and fails out as in 
the stdout/stderr case if the data in the cache was sent completely before the 
timeout).

2. If all of the input was sent on the first call, you *must* pass input=None, 
or you'll die trying to register self.stdin with the selector

The fix for this would be to either:

1. Follow the pattern for self.stdout/stderr (adding "and not 
self.stdin.closed"), and explicitly document that repeated calls to communicate 
must pass the exact same input each time (and optionally validate this in the 
_save_input function, which as of right now just ignores the input if a cache 
already exists); if input is passed the first time, incompletely transmitted, 
and not passed the second time, the code will error as in the OP's case, but it 
will have violated the documented requirements (ideally the error would be a 
little more clear though)

or

2. Change the code so populating the cache (if not already populated) is the 
first step, and replace all subsequent references to input with references to 
self._input (for setup tests, also checking if self._input_offset >= 
len(self._input), so it doesn't register for notifications on self.stdin if all 
the input has been sent), so it becomes legal to pass input=None on a second 
call and rely on the first call to communicate caching it. It would still 
ignore new input values on the subsequent calls, but at least it would behave 
in a sane way (not closing sys.stdin despite having unsent cached data, then 
producing a confusing error that is several steps removed from the actual 
problem)

Either way, the caching behavior for input should be properly documented; we 
clearly specify that output is preserved after a timeout and retrying 
communicate ("If the process does not terminate after timeout seconds, a 
TimeoutExpired exception will be raised. Catching this exception and retrying 
communication will not lose any output."), but we don't say anything about 
input, and right now, the behavior is the somewhat odd and hard to express:

"Retrying a call to communicate when the original call was passed 
non-None/non-empty input requires subsequent call(s) to pass non-None, 
non-empty input. The input on said subsequent calls is otherwise ignored; only 
the unsent remainder of the original input is sent. Also, it will just fail 
completely if you pass non-empty input and it turns out the original input was 
sent completely on the previous call, in which case you *must* call it with 
input=None."

It might also be worth changing the selectors module to raise a more obvious 
exception when register is passed a closed file-like object, but given it only 
requires non-integer fileobjs to have a .fileno() method, adding a requirement 
for a "closed" attribute/property could break other code.

--
nosy: +josh.r
stage:  -> needs patch

___
Python tracker 
<https://bugs.python.org/issue35182>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35180] Ctypes segfault or TypeError tested for python2.7 and 3

2018-11-06 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

As soon as you use ctypes, you sign up for all the security vulnerabilities, 
including denial of service, buffer overrun, use-after-free, etc. that plain 
old C programs are subject to. In this case, it's just a NULL pointer 
dereference (read: segfault in most normal cases), but in general, if you don't 
use ctypes with the same discipline as you would actual C code (at best it 
provides a little in the way of automatic memory management), you're subject to 
all the same problems.

Side-note: When replying to e-mails, don't include the quotes from the e-mail 
you're replying to; it just clutters the tracker.

--

___
Python tracker 
<https://bugs.python.org/issue35180>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35180] Ctypes segfault or TypeError tested for python2.7 and 3

2018-11-06 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The TypeError on Py3 would be because functions taking c_char_p need bytes-like 
objects, not str, on Python 3. '%s' % directory is pointless when directory is 
a str; instead you need to encode it to a bytes-like object, e.g. 
opendir(os.fsencode(directory)) (os.fsencode is Python 3 specific; plain str 
works fine on Py 2).

Your segfault isn't occurring when you load dirfd, it occurs when you call it 
on the result of opendir, when opendir returned NULL on failure (due to the 
non-existent directory you call it with). You didn't check the return value, 
and end up doing flagrantly illegal things with it.

In neither case is this a bug in Python; ctypes lets you do evil things that 
break the rules, and if you break the rules the wrong way, segfaults are to be 
expected. Fix your argument types (for Py3), check your return values (for Py2).
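
For reference, a minimal sketch of doing both correctly on a POSIX system (assuming libc can be located via ctypes.util; this is illustrative, not the OP's code):

import ctypes
import ctypes.util
import os

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.opendir.argtypes = [ctypes.c_char_p]
libc.opendir.restype = ctypes.c_void_p   # opaque DIR *
libc.dirfd.argtypes = [ctypes.c_void_p]
libc.dirfd.restype = ctypes.c_int

def directory_fd(directory):
    handle = libc.opendir(os.fsencode(directory))  # bytes, not str, on Python 3
    if not handle:  # opendir returned NULL; don't pass it to dirfd
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), directory)
    return libc.dirfd(handle)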

--
nosy: +josh.r
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35180>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35175] Builtin function all() is handling dict() types in a weird way.

2018-11-06 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue35175>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35114] ssl.RAND_status docs describe it as returning True/False; actually returns 1/0

2018-10-30 Thread Josh Rosenberg

New submission from Josh Rosenberg :

The ssl.RAND_status online docs say (with code format on True/False):

"Return True if the SSL pseudo-random number generator has been seeded with 
‘enough’ randomness, and False otherwise."

This is incorrect; the function actually returns 1 or 0 (and the docstring 
agrees).

Fix can be one of:

1. Update docs to be less specific about the return type (use true/false, not 
True/False)
2. Update docs to match docstring (which specifically says 1/0, not True/False)
3. Update implementation and docstring to actually return True/False (replacing 
PyLong_FromLong with PyBool_FromLong and changing docstring to use True/False 
to match online docs)

#3 involves a small amount of code churn, but it also means we're not 
needlessly replicating a C API's use of int return values when the function is 
logically bool (there is no error status for the C API AFAICT, so it's not like 
returning int gains us anything on flexibility). bool would be mathematically 
equivalent to the original 1/0 return value in the rare cases someone uses it 
mathematically.
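
For illustration, the current behavior on affected versions (a quick interactive check, not part of any proposed patch):

>>> import ssl
>>> ssl.RAND_status()
1
>>> ssl.RAND_status() is True
False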

--
assignee: docs@python
components: Documentation, SSL
messages: 328917
nosy: docs@python, josh.r
priority: low
severity: normal
status: open
title: ssl.RAND_status docs describe it as returning True/False; actually 
returns 1/0
type: behavior

___
Python tracker 
<https://bugs.python.org/issue35114>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35098] Deleting __new__ does not restore previous behavior

2018-10-30 Thread Josh Rosenberg


Change by Josh Rosenberg :


--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> Assigning and deleting __new__ attr on the class does not allow 
to create instances of this class

___
Python tracker 
<https://bugs.python.org/issue35098>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35043] functools.reduce doesn't work properly with itertools.chain

2018-10-23 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Blech. Copy'n'paste error in last post:

a = list(itertools.chain.from_iterable(*my_list))

should be:

a = list(itertools.chain.from_iterable(my_list))

(Note removal of *, which is the whole point of from_iterable)

--

___
Python tracker 
<https://bugs.python.org/issue35043>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35043] functools.reduce doesn't work properly with itertools.chain

2018-10-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Your example code doesn't behave the way you claim. my_list isn't changed, and 
`a` is a chain generator, not a list (without a further list wrapping).

In any event, there is no reason to involve reduce here. chain already handles 
varargs what you're trying to do without involving reduce at all:

a = list(itertools.chain(*my_list))

or if you prefer to avoid unnecessary unpacking:

a = list(itertools.chain.from_iterable(*my_list))

Either way, a will be [1, 2, 3, 4], and my_list will be unchanged, with no 
wasteful use of reduce.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35043>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35006] itertools.combinations has wrong type when using the typing package

2018-10-16 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Looks like a bug in the typeshed (which mypy depends on to provide typing info 
for most of the stdlib, which isn't explicitly typed). Affects both 
combinations and combinations_with_replacement from a quick check of the code: 
https://github.com/python/typeshed/blob/94485f9e4f86df143801c1810a58df993b2b79b3/stdlib/3/itertools.pyi#L103

Presumably this should be opened on the typeshed tracker. 
https://github.com/python/typeshed/issues

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue35006>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34947] inspect.getclosurevars() does not get all globals

2018-10-09 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Problem: The variables from the nested functions (which comprehensions are 
effectively a special case of) aren't actually closure variables for the 
function being inspected.

Allowing recursive identification of all closure variables might be helpful in 
some contexts, but you wouldn't want it to be the only behavior; it's easier to 
convert a non-recursive solution to a recursive solution than the other way 
around.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34947>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19270] Document that sched.cancel() doesn't distinguish equal events and can break order

2018-10-09 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Victor: "I would be interested of the same test on Windows."

Looks like someone performed it by accident, and filed #34943 in response 
(because time.monotonic() did in fact return the exact same time twice in a row 
on Windows).

--

___
Python tracker 
<https://bugs.python.org/issue19270>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34889] int.to_bytes and int.from_bytes should default to the system byte order like the struct module does

2018-10-03 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

to_bytes and from_bytes aren't remotely related to native primitive types, 
struct is. If the associated lengths aren't 2, 4 or 8, there is no real 
correlation with system level primitives, and providing these defaults makes it 
easy to accidentally write non-portable code.

Providing a default might make sense, but if you do, it should be a fixed 
default (so output is portable). Making it depend on the system byte order for 
no real reason aside from "so I can do struct-like things faster in a 
non-struct way" is not a valid reason to make a behavior both implicit and 
inconsistent.
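
A small sketch of the portability point (explicit byte order is what the current API requires):

(1024).to_bytes(2, 'big')           # b'\x04\x00' on every platform
int.from_bytes(b'\x04\x00', 'big')  # 1024 on every platform

import sys
(1024).to_bytes(2, sys.byteorder)   # b'\x04\x00' or b'\x00\x04', depending on the host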

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34889>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34886] subprocess.run throws exception when input and stdin are passed as kwargs

2018-10-03 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The actual code receives input by name, but stdin is received in **kwargs. The 
test is just:

if input is not None:
    if 'stdin' in kwargs:
        raise ValueError(...)
    kwargs['stdin'] = PIPE

Perhaps just change `if 'stdin' in kwargs:` to:

if kwargs.get('stdin') is not None:

so it obeys the documented API (that says stdin defaults to None, and therefore 
passing stdin=None explicitly should be equivalent to not passing it at all)?

--

___
Python tracker 
<https://bugs.python.org/issue34886>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34886] subprocess.run throws exception when input and stdin are passed as kwargs

2018-10-03 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

I just tried:

subprocess.run('ls', input=b'', stdin=None)

and I got the same ValueError as for passing using kwargs. Where did you get 
the idea subprocess.run('ls', input=b'', stdin=None) worked?

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34886>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34784] Heap-allocated StructSequences

2018-10-03 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This looks like a duplicate of #28709, though admittedly, that bug hasn't seen 
any PRs.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34784>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34750] locals().update doesn't work in Enum body, even though direct assignment to locals() does

2018-09-21 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

The documentation for locals ( 
https://docs.python.org/3/library/functions.html#locals ) specifically states:

Note: The contents of this dictionary should not be modified; changes may not 
affect the values of local and free variables used by the interpreter. 

The docstring for locals is similar, making it clear that any correlation 
between the returned dict and the state of locals if *either* is subsequently 
modified is implementation dependent, subject to change without back-compat 
concerns; even if we made this change, we've given ourselves the freedom to 
undo it at any time, which makes it useless to anyone who might try to rely on 
it.

The fact that even locals()["a"] = 1 happens to work is an implementation 
detail AFAICT; normally, locals() is and should remain read-only (or at least, 
modifications don't actually affect the local scope aside from the dict 
returned by locals()).

I'm worried that making _EnumDict inherit from collections.abc.MutableMapping 
in general would slow down Enums (at the very least creation, I'm not clear on 
whether _EnumDict remains, hidden behind the mappingproxy, for future lookups 
on the class), since MutableMapping would introduce a Python layer of overhead 
to most calls.

I'm also just not inclined to encourage the common assumption that locals() 
returns a dict where mutating it actually works, since it usually doesn't.
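
For illustration, the function-scope case where mutating the snapshot has no effect (CPython behavior; other implementations may differ):

def f():
    x = 1
    locals()["x"] = 2  # mutates the snapshot dict, not the real local
    return x

print(f())  # 1 on CPython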

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34750>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34601] Typo: "which would rather raise MemoryError than give up", than or then?

2018-09-07 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

"than" is correct; "giving up" in this context would mean "not even trying to 
allocate the memory and just preemptively raising OverflowError, like 
non-integer numeric types with limited ranges". Rather than giving up that way, 
it chooses to try to allocate the huge integer and raises a MemoryError only if 
that fails.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34601>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34574] OrderedDict iterators are exhausted during pickling

2018-09-05 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

This would presumably be a side-effect of all generic pickling operations of 
iterators; figuring out what the iterator produces requires running out the 
iterator. You could special case it case-by-case, but that just makes the 
behavior unreliable/confusing; now some iterators pickle without being mutated, 
and others don't. Do you have a proposal to fix it? Is it something that needs 
to be fixed at all, when the option to pickle the original OrderedDict directly 
is there?
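
A small sketch of the reported behavior (the generic iterator reduction has to walk the iterator to capture its remaining items):

import pickle
from collections import OrderedDict

od = OrderedDict(a=1, b=2)
it = iter(od)
data = pickle.dumps(it)  # pickling walks the original iterator...
print(list(it))          # ...so this prints [], per the report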

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34574>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34535] queue.Queue(timeout=0.001) avg delay Windows:14.5ms, Ubuntu: 0.063ms

2018-08-29 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Victor, that was a little overboard. By that logic, there doesn't need to be a 
Windows version of Python.

That said, Paul doesn't seem to understand that the real resolution limit isn't 
1 ms; that's the lower limit on arguments to the API, but the real limit is the 
system clock, which has a granularity in the 10-16 ms range. It's a problem 
with Windows in general, and the cure is worse than the disease.

Per 
https://msdn.microsoft.com/en-us/library/windows/desktop/ms724411(v=vs.85).aspx 
, the resolution of the system timer is typically in the range of 10 
milliseconds to 16 milliseconds.

Per 
https://docs.microsoft.com/en-us/windows/desktop/Sync/wait-functions#wait-functions-and-time-out-intervals
 :

> Wait Functions and Time-out Intervals

> The accuracy of the specified time-out interval depends on the resolution of 
> the system clock. The system clock "ticks" at a constant rate. If the 
> time-out interval is less than the resolution of the system clock, the wait 
> may time out in less than the specified length of time. If the time-out 
> interval is greater than one tick but less than two, the wait can be anywhere 
> between one and two ticks, and so on.

All the Windows synchronization primitives (e.g. WaitForSingleObjectEx 
https://docs.microsoft.com/en-us/windows/desktop/api/synchapi/nf-synchapi-waitforsingleobjectex
 , which is what ultimately implements timed lock acquisition on Windows) are 
based on the system clock, so without drastic measures, it's impossible to get 
better granularity than the 10-16 ms of the default system clock configuration.
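
A rough way to observe that granularity (a sketch, not a precise benchmark) is to time lock acquisitions that are guaranteed to time out:

import threading
import time

def average_timeout_delay(timeout=0.001, reps=20):
    lock = threading.Lock()
    lock.acquire()  # never released, so every timed acquire below times out
    start = time.perf_counter()
    for _ in range(reps):
        lock.acquire(timeout=timeout)  # returns False after roughly one clock tick
    return (time.perf_counter() - start) / reps

print(average_timeout_delay())  # ~0.001 s on Linux, typically 0.010-0.016 s on Windows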

The link on "Wait Functions and Time-out Intervals" does mention that this 
granularity *can* be increased, but it recommends against fine-grained tuning 
(so you can't just tweak it before a wait and undo the tweak after; the only 
safe thing to do is change it on program launch and undo it on program exit). 
Even then, it's a bad idea for Python to use it; per timeBeginPeriod's own docs 
( 
https://docs.microsoft.com/en-us/windows/desktop/api/timeapi/nf-timeapi-timebeginperiod
 ):

> This function affects a global Windows setting. Windows uses the lowest value 
> (that is, highest resolution) requested by any process. Setting a higher 
> resolution can improve the accuracy of time-out intervals in wait functions. 
> However, it can also reduce overall system performance, because the thread 
> scheduler switches tasks more often. High resolutions can also prevent the 
> CPU power management system from entering power-saving modes. Setting a 
> higher resolution does not improve the accuracy of the high-resolution 
> performance counter.

Basically, to improve the resolution of timed lock acquisition, we'd have to 
change the performance profile of the entire OS while Python was running, 
likely increasing power usage and possibly reducing performance. Global 
solutions to local problems are a bad idea.

The most reasonable solution to the problem is to simply document it (maybe not 
for queue.Queue, but for the threading module). Possibly even provide an 
attribute in the threading module similar to  threading.TIMEOUT_MAX that 
reports the system clock's granularity for informational purposes (might need 
to be a function so it reports the potentially changing granularity).

Other, less reasonable solutions, would be:

1. Expose a function (with prominent warnings about not using it in a fine 
grained manner, and the effects on power management and performance) that would 
increase the system clock granularity as far as timeGetDevCaps reports 
possible (possibly limited to a user-provided suggestion, so while the 
clock could go to 1 ms resolution, the user could request only 5 ms resolution 
to reduce the costs of doing so). Requires some additional state (whether 
timeBeginPeriod has been called, and with what values) so timeEndPeriod can be 
called properly before each adjustment and when Python exits. Pro is the code 
is *relatively* simple and would mostly fix the problem. Cons are that it 
wouldn't be super discoverable (unless we put notes in every place that uses 
timeouts, not just in threading docs), it encourages bad behavior (one 
application deciding its needs are more important than conserving power), and 
we'd have to be *really* careful to pair our calls universally (timeEndPeriod 
must be called, even when other cleanup is skipped, such as when calling 
os._exit; AFAICT, the docs imply that per-process adjustments to the clock 
aren't undone even when the process completes, which means failure to pair all 
calls would leave the system with a suboptimal system clock resolution that 
would remain in effect until rebooted).

2. (Likely a terrible idea, and like option 1, should be explicitly opt-in, not 
enabled by default) Offer the option to have Python lock timeouts only use 
WaitForSingleObjectEx 

[issue34494] simple "sequence" class ignoring __len__

2018-08-26 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

That's the documented behavior. Per 
https://docs.python.org/3/reference/datamodel.html#object.__getitem__ :

>Note: for loops expect that an IndexError will be raised for illegal indexes 
>to allow proper detection of the end of the sequence. 

The need for *only* __getitem__ is also mentioned in the documentation of the 
iter builtin ( https://docs.python.org/3/library/functions.html#iter ):

>Without a second argument, object must be a collection object which supports 
>the iteration protocol (the __iter__() method), or it must support the 
>sequence protocol (the __getitem__() method with integer arguments starting at 
>0).

At no point is a dependency on __len__ mentioned.
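
A minimal sketch of that protocol, with no __len__ (or __iter__) anywhere:

class Squares:
    def __getitem__(self, index):
        if index >= 5:
            raise IndexError  # this is what terminates the iteration
        return index * index

print(list(Squares()))  # [0, 1, 4, 9, 16]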

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34494>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34434] Removal of kwargs for built-in types not covered with "changed in Python" note in documentation

2018-08-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

For tuple and list, no, they couldn't have looked at the help (because the help 
calls the argument "iterable", while the only keyword accepted was "sequence"). 
Nor was "sequence" documented in the online docs, nor anywhere else that I can 
find; it was solely in the C source code.

If it was discoverable in any other way, I wouldn't say documenting the change 
(outside of What's New) was completely unjustifiable (I acknowledge that int, 
bool and float warrant a mention, since they did document a functioning name 
for the argument; I was a little too down on them in my original messages).

But the only way someone would accidentally use keyword arguments for 
list/tuple is if they were fuzzing the constructor by submitting random keyword 
arguments until something worked. That seems an odd thing to worry about 
breaking. The error message wouldn't help either; the exception raised tells 
you what argument was unrecognized, but not the names of recognized arguments.

Even if you want to document it, it's hard to do so without being confusing, 
inaccurate, or both. The original PR's versionchanged message was:

*iterable* is now a positional-only parameter.

But "iterable" was never a legal keyword, so saying it's "now a positional-only 
parameter" implies that at some point, it wasn't, and you could pass it with 
the name "iterable", which is wrong/confusing. If you mention "sequence", 
you're mentioning a now defunct detail (confusing, but not wrong). I suppose 
you could have the versionchanged say "This function does not accept keyword 
arguments", but again, for all discoverable purposes, it never did.

I'm not saying *no* documentation of the change is needed, but I am saying, for 
list/tuple, the What's New note is sufficient to cover it for those people who 
went mucking through the CPython source code to find an undocumented keyword 
they could use.

--

___
Python tracker 
<https://bugs.python.org/issue34434>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34434] Removal of kwargs for built-in types not covered with "changed in Python" note in documentation

2018-08-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Oh, I was checking old docs when I said the online docs didn't call int's 
argument "x"; the current docs do, so int, float and bool all justify a change 
(barely), it's just tuple and list for which it's completely unjustifiable.

--

___
Python tracker 
<https://bugs.python.org/issue34434>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34434] Removal of kwargs for built-in types not covered with "changed in Python" note in documentation

2018-08-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Bloating the documentation is almost certainly unjustifiable for list and 
tuple, and only barely justifiable for int, bool and float, given that:

1. The documentation (at least for Python 3) has *never* claimed the arguments 
could be passed by keyword (all of them used brackets to indicate the argument 
was optional without implying a meaningful default, which is typically how 
"does not take arguments by keyword" was described before the current "/" 
convention)

and

2. Aside from bool and float (and to a lesser extent, int), the documented name 
of said parameter didn't match the name it was accepted under, e.g.:

   a. The docs for tuple and list claimed the name was "iterable"; the only 
accepted name was "sequence"
   b. The online docs for int gave a wholly invalid "name", calling it "number 
| string", when in fact it was accepted only as "x". That said, int's docstring 
does describe the name "correctly" as "x"

So for tuple/list it would have been impossible to write code that depended on 
being able to pass the first parameter by keyword unless you'd gone mucking 
about in the CPython source code to figure out the secret keyword name. I could 
justify a note for int/bool/float given that the docstrings for all of them 
named the argument, and bool/float named it in the online docs, but we don't 
need to document a change that no one could have taken a dependency on without 
going to extreme trouble.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34434>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34458] No way to alternate options

2018-08-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

That's a *really* niche use case; you want to store everything to a common 
destination list, in order, but distinguish which switch added each one? I 
don't know of any programs that use such a design outside of Python (and 
therefore, it seems unlikely there would be enough demand from argparse users 
to justify the development, maintenance, and complexity cost of adding it).

argparse does support defining custom Actions, so it's wholly possible to add 
this sort of support for yourself if there isn't enough demand to add it to 
argparse itself. For example, a simple implementation would be:

class AppendWithSwitchAction(argparse.Action):
    def __init__(self, option_strings, dest, *args, **kwargs):
        super().__init__(option_strings, dest, *args, **kwargs)
        # Map all possible switches to the final switch provided
        # so we store a consistent switch name
        self.option_map = dict.fromkeys(option_strings, option_strings[-1])

    def __call__(self, parser, namespace, values, option_string=None):
        option = self.option_map.get(option_string)
        try:
            getattr(namespace, self.dest).append((option, values))
        except AttributeError:
            setattr(namespace, self.dest, [(option, values)])

then use it with:

parser.add_argument('-p', '--preload', help='preload asset', 
action=AppendWithSwitchAction, metavar='NAMESPACE')

parser.add_argument('-f', '--file', help='preload file', 
action=AppendWithSwitchAction, metavar='FILE', dest='preload')

All that does is append ('--preload', argument) or ('--file', argument) instead 
of just appending the argument, so you can distinguish one from the other (for 
switch, arg in args.preload:, then test if switch=='--preload' or '--file'). 
It's bare bones (the actual class underlying the 'append' action ensures nargs 
isn't 0, and that if const is provided, nargs is '?'), but it would serve.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34458>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34410] itertools.tee not thread-safe; can segfault interpreter when wrapped iterator releases GIL

2018-08-22 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Carlo: The point of Xiang's post is that this is only tangentially related to 
multiprocessing; the real problem is that tee-ing an iterator implemented in 
Python (of which pool.imap_unordered is just one example) and using the 
resulting tee-ed iterators in multiple threads (which pool.imap_unordered does 
implicitly, as there is a thread involved in dispatching work) is not safe.

The problem is *exposed* by multiprocessing.pool.imap_unordered, but it is 
entirely a problem with itertools.tee, and as Xiang's repro indicates, it can 
be triggered easily without the complexity of multiprocessing being involved.

I've updated the bug title to reflect this.

--
components: +Library (Lib)
nosy: +josh.r
title: Segfault/TimeoutError: itertools.tee of 
multiprocessing.pool.imap_unordered -> itertools.tee not thread-safe; can 
segfault interpreter when wrapped iterator releases GIL
versions: +Python 3.6

___
Python tracker 
<https://bugs.python.org/issue34410>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34364] problem with traceback for syntax error in f-string

2018-08-08 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

So the bug is that the line number and module are incorrect for the f-string, 
right? Nothing else?

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34364>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34321] mmap.mmap() should not necessarily clone the file descriptor

2018-08-03 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Why would it "cause an issue if the file is closed before accessing the mmapped 
region"? As shown in your own link, the constructor performs the mmap call 
immediately after the descriptor is duplicated, with the GIL held; any race 
condition that could close the file before the mmap occurs could equally well 
close it before the descriptor is duplicated.

The possible issues aren't tied to accessing the memory (once the mapping has 
been performed, the file descriptor can be safely closed in general), but 
rather, to the size and resize methods of mmap objects (the former using the fd 
to fstat the file, the latter using it to ftruncate the file). As long as you 
don't use size/resize, nothing else depends on the file descriptor after 
construction has completed. The size method in particular seems like a strange 
wart on the API; it returns the total file size, not the size of the mapping 
(len(mapping) gets the size of the actual mapping).

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34321>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34259] Improve docstring of list.sort

2018-08-01 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Copying from the sorted built-in's docstring would make sense here, given that 
sorted is implemented in terms of list.sort in the first place.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34259>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue29842] Make Executor.map work with infinite/large inputs correctly

2018-07-25 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

In response to Max's comments:

>But consider the case where input is produced slower than it can be processed 
>(`iterables` may fetch data from a database, but the callable `fn` may be a 
>fast in-memory transformation). Now suppose the `Executor.map` is called when 
>the pool is busy, so there'll be a delay before processing begins. In this 
>case, the most efficient approach is to get as much input as possible while 
>the pool is busy, since eventually (when the pool is freed up) it will become 
>the bottleneck. This is exactly what the current implementation does.

I'm not sure the "slow input iterable, fast task, competing tasks from other 
sources" case is all that interesting. Uses of Executor.map in the first place 
are usually a replacement for complex task submission; perhaps my viewpoint is 
blinkered, but I see the Executors used for *either* explicit use of submit 
*or* map, rather than mixing and matching (you might use it for both, but 
rarely interleave usages). Without a mix and match scenario (and importantly, a 
mix and match scenario where enough work is submitted before the map to occupy 
all workers, and very little work is submitted after the map begins to space 
out map tasks such that additional map input is requested while workers are 
idle), the smallish default prefetch is an improvement, simply by virtue of 
getting initial results more quickly.

The solution of making a dedicated input thread would introduce quite a lot of 
additional complexity, well beyond what I think it justifiable for a relatively 
niche use case, especially one with many available workarounds, e.g.

1. Raising the prefetch count explicitly

2. Having the caller listify the iterable (similar to passing an arbitrarily 
huge prefetch value, with the large prefetch value having the advantage of 
sending work to the workers immediately, while listifying has the advantage of 
allowing you to handle any input exceptions up front rather than receiving them 
lazily during processing)

3. Use cheaper inputs (e.g. the query string, not the results of the DB query) 
and perform the expensive work as part of the task (after all, the whole point 
is to parallelize the most expensive work)

4. Using separate Executors so the manually submitted work doesn't interfere 
with the mapped work, and vice versa

5. Making a separate ThreadPoolExecutor to generate the expensive input values 
via its own map function (optionally with a larger prefetch count), e.g. 
instead of

with SomeExecutor() as executor:
    for result in executor.map(func, (get_from_db(query) for query in queries)):

do:

with SomeExecutor() as executor, ThreadPoolExecutor() as inputexec:
    inputs = inputexec.map(get_from_db, queries)
    for result in executor.map(func, inputs):

Point is, yes, there will still be niche cases where Executor.map isn't 
perfect, but this patch is intentionally a bit more minimal to keep the Python 
code base simple (no marshaling exceptions across thread boundaries) and avoid 
extreme behavioral changes; it has some smaller changes, e.g. it necessarily 
means input-iterator-triggered exceptions can be raised after some results are 
successfully produced, but it doesn't involve adding more implicit threading, 
marshaling exceptions across threads, etc.

Your proposed alternative, with a thread for prefetching inputs, a thread for 
sending tasks, and a thread for returning results creates a number of problems:

1. As you mentioned, if no prefetch limit is imposed, memory usage remains 
unbounded; if the input is cheap to generate and slow to process, memory 
exhaustion is nearly guaranteed for infinite inputs, and more likely for "very 
large" inputs. I'd prefer the default arguments to be stable in (almost) all 
cases, rather than try to maximize performance for rare cases at the expense of 
stability in many cases.

2. When input generation is CPU bound, you've just introduced an additional 
source of unavoidable GIL contention; granted, after the GIL fixes in 3.2, GIL 
contention tends to hurt less (before those fixes, I could easily occupy 1.9 
cores doing 0.5 cores worth of actual work with just two CPU bound threads). 
Particularly in the ProcessPoolExecutor case (where avoiding GIL contention is 
the goal), it's a little weird if you can end up with unavoidable GIL 
contention in the main process.

3. Exception handling from the input iterator just became a nightmare; in a 
"single thread performs input pulls and result yield" scenario, the exceptions 
from the input thread naturally bubble to the caller of Executor.map (possibly 
after several results have been produced, but eventually). If a separate thread 
is caching from the input iterator, we'd need to marshal the exception from 
that thread back to the thread running Executor.map so it's visible to the 
caller, and providing a traceba

[issue29842] Make Executor.map work with infinite/large inputs correctly

2018-07-25 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

In any event, sorry to be a pain, but is there any way to get some movement on 
this issue? One person reviewed the code with no significant concerns to 
address. There have been a duplicate (#30323) and closely related (#34168) 
issues opened that this would address; I'd really like to see Executor.map made 
more bulletproof against cases that plain map handles with equanimity.

Even if it's not applied as is, something similar (with prefetch count defaults 
tweaked, or, at the expense of code complexity, a separate worker thread to 
perform the prefetch to address Max's concerns) would be a vast improvement 
over the status quo.

--

___
Python tracker 
<https://bugs.python.org/issue29842>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34168] RAM consumption too high using concurrent.futures (Python 3.7 / 3.6 )

2018-07-25 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

Note: While this particular use case wouldn't be fixed (map returns in order, 
not as completed), applying the fix from #29842 would make many similar use 
cases both simpler to implement and more efficient/possible.

That said, no action has been taken on #29842 (no objections, but no action 
either), so I'm not sure what to do to push it to completion.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue34168>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1617161] Instance methods compare equal when their self's are equal

2018-06-21 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

If [].append == [].append is True in the "unique set of callbacks" scenario, 
that implies that it's perfectly fine to not call one of them when both are 
registered. But this means that only one list ends up getting updated, when you 
tried to register both for updates. That's definitely surprising, and not in a 
good way.

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue1617161>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33864] collections.abc.ByteString does not register memoryview

2018-06-15 Thread Josh Rosenberg


Josh Rosenberg  added the comment:

memoryview isn't just for byte strings though; the format can make it a 
sequence of many types of different widths, meanings, etc. Calling it a 
ByteString would be misleading in many cases.
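
For example (itemsize assumes a typical platform where C unsigned int is 4 bytes):

mv = memoryview(bytearray(8)).cast('I')  # view the buffer as unsigned ints, not bytes
print(mv.format, mv.itemsize, len(mv))   # 'I' 4 2
print(mv.tolist())                       # [0, 0]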

--
nosy: +josh.r

___
Python tracker 
<https://bugs.python.org/issue33864>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33361] readline() + seek() on codecs.EncodedFile breaks next readline()

2018-05-21 Thread Josh Rosenberg

Change by Josh Rosenberg <shadowranger+pyt...@gmail.com>:


--
title: readline() + seek() on io.EncodedFile breaks next readline() -> 
readline() + seek() on codecs.EncodedFile breaks next readline()

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33361>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33381] Incorrect documentation for strftime()/strptime() format code %f

2018-05-02 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

Note: strftime follows the existing documentation:

>>> datetime.datetime(1970, 1, 1, microsecond=1).strftime('%f')
'000001'

The strptime behavior bug seems like a duplicate of #32267, which claims to be 
fixed in master as of early January; may not have made it into a release yet 
though. I can't figure out how to view the patch on that issue, it doesn't seem 
to be linked to GitHub like normal.

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33381>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33404] Phone Number Generator

2018-05-01 Thread Josh Rosenberg

Change by Josh Rosenberg <shadowranger+pyt...@gmail.com>:


--
versions:  -Python 3.6

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33404>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33404] Phone Number Generator

2018-05-01 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

You named your loop variable i, overlapping the name of your second to last 
digit, so you end up replacing the original value of i on each (given the 
break, the only) iteration of the loop.

So before the loop begins, i has the expected value of '6', but on the first 
iteration, i is rebound to the value of a (the first element in the tuple), 
'5', and your format string uses that value instead. If you removed the break, 
you'd see the second to last digit cycle through all the other values as it 
goes, because i would be repeatedly rebound to each digit as it goes.

This is a bug in your code, not a problem with Python; in the future, direct 
questions of this sort to other online resources (e.g. Stack Overflow); unless 
you have a provable bug in Python itself, odds are it's a bug in your code's 
logic.

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33404>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33315] Allow queue.Queue to be used in type annotations

2018-04-20 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

None of the actual classes outside of the typing module support this either to 
my knowledge. You can't do:

from collections import deque

a: deque[int]

nor can you do:

a: list[int]

Adding Queue to the typing module might make sense (feel free to edit it if 
that's what you're looking for), but unless something has changed in 3.7 (my 
local install is 3.6.4), it's never been legal to do what you're trying to do 
with queue.Queue itself with the original type, only with the special typing 
types that exist for that specific purpose.
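
For reference, the spellings that do work on 3.6 are the typing module's aliases, e.g.:

from typing import Deque, List

a: Deque[int]  # stands in for collections.deque
b: List[int]   # stands in for the built-in list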

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33315>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33319] `subprocess.run` documentation doesn't tell is using `stdout=PIPE` safe

2018-04-20 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

If the goal is just to suppress stdout, that's what passing subprocess.DEVNULL 
is for (doesn't exist in Py2, but opening os.devnull and passing that is a 
slightly higher overhead equivalent).

subprocess.run includes a call to communicate as part of its default behavior, 
and stores its results, so call() isn't quite equivalent to run().returncode 
when PIPE was passed for standard handles, because call only includes an 
implicit call to wait, not communicate, and therefore pipes are not explicitly 
read and can block.

Basically, subprocess.run is deadlock-safe (because it uses communicate, not 
just wait), but if you don't care about the results, and the results might be 
huge, don't pass it PIPE for stdout/stderr (because it will store the complete 
outputs in memory, just like any use of communicate with PIPE).

The docs effectively tell you PIPE is safe; it returns a CompletedProcess 
object, and explicitly tells you that it has attributes that are (completely) 
populated based on whether capture was requested. If it had such attributes and 
still allowed deadlocks, it would definitely merit a warning.
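
A short sketch of the two options ('some_command' is just a placeholder):

import subprocess

# Discard output entirely; nothing is buffered in memory
subprocess.run(['some_command'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

# Capture output; deadlock-safe, but the whole output is held in memory
result = subprocess.run(['some_command'], stdout=subprocess.PIPE)
print(result.returncode, len(result.stdout))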

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33319>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33267] ctypes array types create reference cycles

2018-04-12 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

Pretty sure this is a problem with classes in general; classes are 
self-referencing, and using multiplication to create new ctypes array types is 
creating new classes.

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33267>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19993] Pool.imap doesn't work as advertised

2018-04-12 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

Related: issue29842 "Make Executor.map work with infinite/large inputs 
correctly" for a similar problem in concurrent.futures (but worse, since it 
doesn't even allow you to begin consuming results until all inputs are 
dispatched).

A similar approach to my Executor.map patch could probably be used with 
imap/imap_unordered.

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue19993>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33231] Potential memory leak in normalizestring()

2018-04-05 Thread Josh Rosenberg

New submission from Josh Rosenberg <shadowranger+pyt...@gmail.com>:

Patch is good, but while we're at it, is there any reason why this 
multi-allocation design was even used? It PyMem_Mallocs a buffer, makes a 
C-style string in it, then uses PyUnicode_FromString to convert C-style string 
to Python str.

Seems like the correct approach would be to just use PyUnicode_New to 
preallocate the final string buffer up front, then pull out the internal buffer 
with PyUnicode_1BYTE_DATA and populate that directly, saving a pointless 
allocation/deallocation, which also means the failure case means no cleanup 
needed at all, while barely changing the code (aside from removing the need to 
explicitly NUL terminate).

Only reason I can see to avoid this would be if the codec names could contain 
arbitrary Unicode encoded as UTF-8 (and therefore strlen wouldn't tell you the 
final length in Unicode ordinals), but I'm pretty sure that's not the case (if 
it is, we're not normalizing properly, since we only lower case ASCII). If 
Unicode codec names need to be handled, there are other options, though the 
easy savings go away.

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33231>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33229] Documentation - io — Core tools for working with streams - seek()

2018-04-05 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

As indicated in the seek docs ( 
https://docs.python.org/3/library/io.html#io.IOBase.seek ), all three names 
were added to the io module in 3.1:

> New in version 3.1: The SEEK_* constants.

Since they're part of the io module too, there is no need to qualify them on 
the io module docs page.

They're available in os as well, but you don't need to import it to use them. 
The OS-specific additional position flags are explicitly documented to be found 
in the os module.
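
For example, only io needs to be imported ('example.bin' is a throwaway file name):

import io

with open('example.bin', 'wb+') as f:
    f.write(b'0123456789')
    f.seek(-3, io.SEEK_END)  # the constant comes straight from io
    print(f.read())          # b'789'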

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33229>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33200] Optimize the empty set "literal"

2018-04-02 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

I may have immediately latched onto this, dubbing it the "one-eyed monkey 
operator", the moment the generalized unpacking released.

I always hated the lack of an empty set literal, and enjoyed having this exist 
just to fill that asymmetry with the other built-in collection types (that 
said, I never use it in production code, nor teach it in classes I run).
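
For anyone unfamiliar with the spelling in question (presumably the one the issue discusses):

s = {*()}          # unpack an empty tuple into a set display
print(type(s), s)  # <class 'set'> set()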

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33200>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33124] Lazy execution of module bytecode

2018-03-28 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

Serhiy: There is a semi-common case where global constants can be quite 
expensive, specifically, initializing a global full of expensive to 
compute/serialize data so it will be shared post-fork when doing 
multiprocessing on a POSIX system. That said, that would likely be a case where 
lazy initialization would be a problem; you don't want each worker 
independently initializing the global lazily.
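
A sketch of the shared-post-fork pattern just described (relies on the POSIX fork start method; the names are made up):

import multiprocessing as mp

EXPENSIVE = {i: i * i for i in range(10**6)}  # built once, up front, in the parent

def lookup(key):
    return EXPENSIVE[key]  # forked workers read the parent's copy; nothing is re-pickled

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        print(pool.map(lookup, [1, 2, 3]))  # [1, 4, 9]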

Also, for all practical purposes, aren't enums and namedtuples global constants 
too? Since they don't rely on any syntax based support at point of use, they're 
just a "function call" followed by assignment to a global name; you couldn't 
really separate the concept of global constants from enums/namedtuple 
definitions, right?

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33124>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue33087] No reliable clean shutdown method

2018-03-16 Thread Josh Rosenberg

Josh Rosenberg <shadowranger+pyt...@gmail.com> added the comment:

To my knowledge, there is no safe way to do this for other threads for a reason.

If you make all your worker threads daemons, then they will terminate with the 
main thread, but they won't perform cleanup actions.

If you don't make them daemons, any "clean exit" procedure risks the threads 
choosing not to exit (even if you inject a SystemExit into every other thread, 
they might be in a try/except: or try/finally that suppresses it, or blocks 
waiting for something from another thread that has already exited, etc.). 
Exiting the thread that calls sys.exit() this way is considered okay, since you 
control when it is called, and it's up to you to do it at a safe place, but 
doing so asynchronously in other threads introduces all sorts of problems.

Basically, you want a reliable "shut down the process" and a reliable "clean up 
every thread", but anything that allows clean up in arbitrary threads also 
allows them to block your desired "shut down the process". Do you have a 
proposal for handling this?

--
nosy: +josh.r

___
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33087>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com


