[issue47082] No protection: `import numpy` in two different threads can lead to race-condition
Sebastian Berg added the comment:

Thanks, so there should already be a lock in place (sorry, I missed that). But somehow we seem to get around it? Do you know what may cause the locking logic to fail in this case? Recursive imports in NumPy itself? Or Cython using the low-level C API? I.e. can you think of something to investigate that may help NumPy/Cython make sure that locking is successful?

    /* Cython's import code (slightly cleaned up for Python 3 only): */
    static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level) {
        PyObject *empty_list = 0;
        PyObject *module = 0;
        PyObject *global_dict = 0;
        PyObject *empty_dict = 0;
        PyObject *list;

        if (from_list) {
            list = from_list;
        }
        else {
            empty_list = PyList_New(0);
            if (!empty_list)
                goto bad;
            list = empty_list;
        }
        global_dict = PyModule_GetDict(__pyx_m);
        if (!global_dict)
            goto bad;
        empty_dict = PyDict_New();
        if (!empty_dict)
            goto bad;
        {
            if (level == -1) {
                if ((1) && (strchr(__Pyx_MODULE_NAME, '.'))) {
                    module = PyImport_ImportModuleLevelObject(
                        name, global_dict, empty_dict, list, 1);
                    if (!module) {
                        if (!PyErr_ExceptionMatches(PyExc_ImportError))
                            goto bad;
                        PyErr_Clear();
                    }
                }
                level = 0;
            }
            if (!module) {
                module = PyImport_ImportModuleLevelObject(
                    name, global_dict, empty_dict, list, level);
            }
        }
    bad:
        Py_XDECREF(empty_list);
        Py_XDECREF(empty_dict);
        return module;
    }

--
___ Python tracker <https://bugs.python.org/issue47082> ___
___ Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue47082] No protection: `import numpy` in two different threads can lead to race-condition
Sebastian Berg added the comment:

To add to this: it would seem to me that the side effects of importing should be guaranteed to run only once? However, I/O or other operations could be part of the import side effects and release the GIL. So even a simple, pure-Python package could run into this same issue, and probably won't even realize that it can. (Assuming I understand correctly how this is happening.)

So it would seem to me that either:

* Python should lock on the thread level, or maybe on the `sys.modules` dictionary?
* The `threading` module could somehow ensure safety by hooking into the import machinery?
* Packages that may release the GIL, or have side effects that must only run once, have to lock for themselves (i.e. NumPy). (But it seems to me that many packages will not even be aware of possible issues.)

--
nosy: +seberg
___ Python tracker <https://bugs.python.org/issue47082> ___
[issue46451] Tracing causes module globals to be mutated when calling functions from C
Sebastian Berg added the comment:

While I have a repro for Python, I think the pre-release of Cython already fixes it (and I guess I just did not regenerate the C sources when trying. A `git clean` to the rescue...).

--
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Tracing causes module globals to be mutated when calling functions from C
Sebastian Berg added the comment:

Not reopening for now, but I will note again that (AFAIK) Cython uses `PyEval_EvalCodeEx`, and the docs say that it is no longer used internally by CPython. So it seems pretty plausible that the bug is in `PyEval_EvalCodeEx` and not in the generated Cython code?

--
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Tracing causes module globals to be mutated when calling functions from C
Change by Sebastian Berg:

--
stage:  -> resolved
status: open -> closed
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Tracing causes module globals to be mutated when calling functions from C
Sebastian Berg added the comment:

Thanks for having a look. I have confirmed this is related to Cython (no pandas/NumPy involved) – repro at https://github.com/seberg/bpo46451.

What happens under the hood in Cython is probably this: https://github.com/cython/cython/blob/master/Cython/Utility/ObjectHandling.c#L2569-L2611

That path ends up calling `PyEval_EvalCodeEx`, and I could not repro with just a `PyObject_FastCallDict`, so I assume Cython is doing something wrong and will open an issue there. But if you have a quick tip as to what might be wrong, that would be nice :). Otherwise, I will just close this and may reopen if Cython hits a wall.

--
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Tracing causes module globals to be mutated when calling functions from C
Change by Sebastian Berg:

--
title: Possibly bad interaction with tracing and cython? -> Tracing causes module globals to be mutated when calling functions from C
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Possibly bad interaction with tracing and cython?
Sebastian Berg added the comment:

Ahh, a further data point: the name from the module scope that is overwritten IS a parameter name used in the function locals. Strangely, if I modify the tracing to print more:

    stop = 0

    def trace(frame, event, arg):
        global stop
        if stop > 10:
            return None
        if np.core.numeric.dtype is not np.dtype:
            #print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
            print(np.core.numeric.dtype)
            print(frame, event, arg)
            stop += 1
        else:
            print(frame, event, arg)
        return trace

then what I get is:

    None
    call None
    None
    line None
    None
    line None
    None
    line None
    None
    line None
    bool

So, upon entering the function, the value is (already) cleared/set to None (which is of course correct for `dtype=None`), and then, while the function runs, storing into the function locals _mutates_ the module global? The fact that it keeps changing during the function run points very strongly at CPython?

--
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue46451] Possibly bad interaction with tracing and cython?
New submission from Sebastian Berg:

Starting here, but there could be Cython interaction or something else in theory. The setup is:

* Python 3.10.1 (not 3.9.9; debug build or not)
* Setting a tracing function (not setting a trace function makes the issue disappear)
* Running Cython (maybe also plain C code) calling back into Python (as best I can tell)

With that, it can happen that module globals in the called function's scope seem to be modified. Oddly enough, to a value used in the locals of that function?!

The pandas issue https://github.com/pandas-dev/pandas/issues/41935 has a reproducer (sorry that it takes NumPy and pandas for now). I will paste it at the end here also.

I can see that the value is modified by the time the `trace` function is called. No other "trace" triggers are processed before that in this example (Cython calls other functions in NumPy, but all of those are C-implemented, so I guess that is why). The function in question should further (unlike the others?) be called via `__Pyx_PyFunction_FastCall`, so that is probably an important data point: the fastcall protocol seems involved.

(Reproducible with NumPy 1.21.5 and pandas 1.3.5, but except maybe for pandas due to the Cython version, I don't expect version dependency.)

The output of the script is very brief:

    Something happened here, `np.core.numeric.dtype IS np.dtype`
    call None

The full reproducer script is:

    import sys
    import numpy as np
    import pandas as pd
    from numpy.core import numeric

    stop = False

    def trace(frame, event, arg):
        global stop
        if stop:
            return None
        if np.core.numeric.dtype is not np.dtype:
            print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
            print(frame, event, arg)
            stop = True
        else:
            print(frame, event, arg)
        return trace

    sys.settrace(trace)
    pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))

For completeness, the Cython code calling the NumPy function in question is (likely; there is more, this is Cython, I just cut it out a bit :)):

    #if CYTHON_FAST_PYCALL
        if (PyFunction_Check(__pyx_t_5)) {
            PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
            __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8, 2+__pyx_t_8);
            if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
            __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
            __Pyx_GOTREF(__pyx_t_15);
            __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
        } else
    #endif

--
components: Interpreter Core
messages: 411082
nosy: seberg
priority: normal
severity: normal
status: open
title: Possibly bad interaction with tracing and cython?
type: crash
versions: Python 3.10
___ Python tracker <https://bugs.python.org/issue46451> ___
[issue45383] PyType_FromSpec API fails to use metaclass of bases
Sebastian Berg added the comment:

Btw. huge thanks for looking into this! Let me know if I can try to help out (I can make do with static metatypes, but my story seems much clearer if I could say: well, with Python 3.11 you can, and probably should, do it dynamically). I had lost a lot of time chasing "metaclass should just work" followed by "but I can't get it right without bad hacks". And now the solution seems fairly clear, which is amazing :)!

--
___ Python tracker <https://bugs.python.org/issue45383> ___
[issue45383] PyType_FromSpec API fails to use metaclass of bases
Sebastian Berg added the comment:

Well, what we need is a way to say: I am calling `type.__new__` (i.e. PyType_FromSpec) on purpose from (effectively) my own `mytype.__new__`.

That is, because right now I assume we want to protect users from calling PyType_FromSpec thinking it is equivalent to calling `class new(base)`, when it may not be if the base has a metaclass. So calling `PyType_FromSpec` might refuse to work if it finds a custom `metaclass.__new__` (init?).

I don't really see that it matters if we only support, effectively, this from C:

```
class MyMetaClass(type):
    def __new__(cls, *args, **kwargs):
        self = type.__new__(...)  # this is PyType_FromSpec
        # more stuff
```

So, I thought telling `PyType_FromSpec` that we are "inside" a custom `__new__` is sufficient, and that even a flag passed as part of the spec could be enough. But... I agree that I do not quite see that it would be pretty, so it probably was a bad idea :).

Plus, if you add a new method, it also solves the issue of setting the `tp_type` slot to the metaclass explicitly when it is not implied by inheritance (which is the only thing I care about). (PyType_FromSpec and PyType_ApplySpec will still need to do the work of resolving the metaclass from the base classes, though.)

--
___ Python tracker <https://bugs.python.org/issue45383> ___
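Filling in the ellipsis, the Python-level pattern under discussion runs like this (a sketch; `extra_setup` is a hypothetical stand-in for the "more stuff"):

```python
class MyMetaClass(type):
    def __new__(cls, name, bases, namespace, **kwargs):
        # In C, this super().__new__ step is roughly what PyType_FromSpec
        # performs; the metaclass then finishes its own initialization.
        self = super().__new__(cls, name, bases, namespace, **kwargs)
        self.extra_setup = "done by MyMetaClass.__new__"  # hypothetical
        return self

class MyClass(metaclass=MyMetaClass):
    pass
```

Subclasses of `MyClass` inherit the metaclass, so `MyMetaClass.__new__` runs for them too; it is exactly this step that a C metatype cannot currently route through `PyType_FromSpec`.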
[issue45383] PyType_FromSpec API fails to use metaclass of bases
Sebastian Berg added the comment:

Fully agree! In the end, `PyType_FromSpec` replaces `type.__new__()` (and init, I guess) when working in C. In Python, we would call `type.__new__` (maybe via super) from the `metatype.__new__`; but right now, in C, the metatype cannot reliably use `PyType_FromSpec` in its own `metatype.__new__` to do the same.

I agree with the scenarios:

* If we do not have a custom `metatype.__new__` (init?), then `PyType_FromSpec` has no reason to refuse doing the work, because nothing can go wrong.
* If we do have a custom `tp_new`, the user has to provide C API to create the metaclass instance. But they still need a way to call `type.__new__` in C (i.e. get what `PyType_FromSpec` does, while promising to do the rest themselves).

`PyType_ApplySpec` would provide that way to write a custom `metatype.__new__` in C when `PyType_FromSpec()` would otherwise reject it to make the first scenario safe. A flag could probably do the same. I have no preference; `ApplySpec` seems great to me.

--
___ Python tracker <https://bugs.python.org/issue45383> ___
[issue45383] PyType_FromSpec API fails to use metaclass of bases
Sebastian Berg added the comment:

It is probably early, but I am starting to like the idea of a "C MetaClass factory" helper/indicator. It seems to me that yes, at least `tp_new` cannot be called reasonably for a class that is created in C; it is just too confusing/awkward to try to push the C stuff _through_ the Python API. (And the Python API is the typical one that should not be inconvenienced.)

Which gives two possibilities if I want this capability:

1. Force use of a custom class factory in Python (i.e. not through `__new__`). But that seems hard, since I would need to refuse `__new__()`!
2. Use a class factory in C which never calls `__new__` and knows that this is OK.

This is not my turf, so I am fine with unholy, but maybe pragmatic, things :). Would a slot/flag indicating `Py_using_metaclass_cinit_pretty_promise_please` "solve" these issues?

I mean, we have two ways to create classes (C and Python), and I am not sure it is plausible to untangle this on the MetaClass level, so maybe the only way forward is to embrace it: some Python MetaClasses just can't be instantiated from C, because we don't know that it will work. If we want to allow instantiating the MetaClass from C, we need some way to set this up. And either we cannot use `PyType_FromSpec` then, or we need to tell it that we know what we are doing.

--
___ Python tracker <https://bugs.python.org/issue45383> ___
[issue45383] PyType_FromSpec API fails to use metaclass of bases
Sebastian Berg added the comment:

Sorry, I need some time to dive back into this, so some things might be garbled :).

Yes, I do agree that supporting a custom `tp_new` here seems incredibly tricky. I have not thought about the implications of this, though.

> guess the safest option is to fail when the metaclass has a custom tp_new

That would seem OK, but I can't quite judge it. It may mean that I have to write a custom factory to create new metaclass instances from Python, but that is probably for the better anyway. Now, calling `tp_new` is a bit useless, since from C we don't have a dict anyway (at least not really). So yeah, this does not make sense at all for most Python-defined metaclasses... (they may want to inspect/mutate the dict).

> But at that point, this is duplicating a lot of existing functionality, and
> I'm starting to wonder if this wouldn't all be better with calling the
> metaclass instead.

I do not think I am following :(. My worry is that I want people to create a MetaClass instance through C (but not be locked into static types forever). My current thought is that I would like it to be possible to do something like:

    /* Create a new type object */
    type_spec = {stuff};
    newtype = PyType_FromSpec(&type_spec);

    /* Finalize the MetaClass */
    metatype_spec = {more_stuff};
    PyArray_InitDTypeFromSpec(newtype, &metatype_spec);

Of course I can move this into a single function and create the class for the user. But I am uncertain that piping everything through `tp_new` will help?

At some point I thought that the user should create a subclass, and I would create another class inheriting from it. But that also seems convoluted and complicated.

I have no idea right now, but maybe there could also be a way to make creating a metaclass factory in C easier, rather than supporting `PyType_FromSpec` for metaclasses. (I.e. a `PyType_InitFromSpec()` doing most of what `PyType_FromSpec` does, but really meant to be used only by such metaclass factories.)

> - basicsize/itemsize could be allowed with __basicsize__/__itemsize__ in the
> dict.

Do we need this? I need the basicsize of the metaclass, but that of the class seems fixed/OK?

> - flags: we could add mechanisms to set individual flags after the type is
> created, as needed.

Seems fine, yeah.

> - slots can usually be applied after the class is created; maybe there should
> be a public function for this.
> - members could theoretically be copied to individual descriptors; there
> doesn't seem much need for keeping tp_members around.

But a Python MetaClass (whose author may not even know about any of this) might need access to these to work properly? A bit far-fetched, but...

--
___ Python tracker <https://bugs.python.org/issue45383> ___
[issue45383] PyType_FromSpec API fails to use metaclass of bases
New submission from Sebastian Berg:

PyType_FromSpec fails to take care of metaclasses. https://bugs.python.org/issue15870 asks to create a new API to pass in the metaclass; this issue is only about "inheriting" the metaclass of the bases correctly. Currently, Python fails to take into account that the bases may be instances of a metaclass rather than of `PyType_Type`.

--
components: C API, Interpreter Core
messages: 403273
nosy: petr.viktorin, seberg
priority: normal
pull_requests: 27093
severity: normal
status: open
title: PyType_FromSpec API fails to use metaclass of bases
type: crash
versions: Python 3.11
___ Python tracker <https://bugs.python.org/issue45383> ___
[issue15870] PyType_FromSpec should take metaclass as an argument
Sebastian Berg added the comment:

Yeah, I will try to have a look. I had posted the patch because the test looked like a bit of a larger chunk of work ;).

> And I'm surprised that you're surprised :)

:). I am probably coming from a completely different angle. Just if you are curious: I use a from-spec-like API (not public yet) in NumPy, and dynamic use seemed natural/logical in that use case, e.g.: https://github.com/numpy/numpy/blob/c92864091e5928d92bc109d1505febe35f3909f1/numpy/core/src/multiarray/convert_datatype.c#L2434

But, I am trying to better understand the preference for static specs. There is probably something to learn, or even something important I am missing.

> And I have some plans to make static specs useful in type checking, since we
> can assume that types made from the same spec share the memory layout.

(I don't understand how that is useful, unless you reuse slot structs?) It sounds interesting, but even a static spec does not guarantee it is constant, unless the user indicates that through a flag? Maybe you could achieve this by inspecting/comparing content rather than using the spec pointer(s)? (More complex, but also more powerful?)

--
___ Python tracker <https://bugs.python.org/issue15870> ___
[issue15870] PyType_FromSpec should take metaclass as an argument
Sebastian Berg added the comment:

> But if tp_name is created dynamically, it could lead to a dangling pointer.

I will guess this is probably just an oversight/bug, since the main aim was to move towards heap types, and have opened an issue: https://bugs.python.org/issue45315

--
___ Python tracker <https://bugs.python.org/issue15870> ___
[issue45315] `PyType_FromSpec` does not copy the name
New submission from Sebastian Berg:

As noted in https://bugs.python.org/issue15870#msg402800, `PyType_FromSpec` assumes that the `name` passed is persistent for the program's lifetime. This seems wrong/unnecessary: we are creating a heap type, and a heap type's name is stored as a unicode object anyway! So there is no reason to require the string to be persistent, and a program that decides to build a custom name dynamically (for whatever reason) would run into a crash when the name is accessed.

The code at https://github.com/python/cpython/blob/0c50b8c0b8274d54d6b71ed7bd21057d3642f138/Objects/typeobject.c#L3427 should be modified to point to the `ht_name` (utf8) data instead. My guess is the simplest solution is calling `type_set_name`, even if that runs some unnecessary checks.

I understand that the FromSpec API is mostly designed to replace the static type definition rather than extend it, but this seems an unintentional/strange requirement.

--
components: Interpreter Core
messages: 402804
nosy: seberg
priority: normal
severity: normal
status: open
title: `PyType_FromSpec` does not copy the name
type: crash
versions: Python 3.11
___ Python tracker <https://bugs.python.org/issue45315> ___
[issue15870] PyType_FromSpec should take metaclass as an argument
Sebastian Berg added the comment:

Just to note that there are two – somewhat distinct – issues here, in my opinion:

1. `FromSpec` does not scan `bases` for the correct metaclass, which it could; this could even be considered a bug?
2. You cannot pass in a desired metaclass, which may require a new API function.

My patch tries to address the first (the class creator has to take care that this is reasonable for the metaclass). I had hoped the `slot` mechanism could avoid the API discussion for the second one, but I guess not.

On the discussion about `tp_type/meta` being incorrect usage: I am slightly surprised we actually care about static C definitions? There is no reason that a `spec` should be declared static (aside from the fact that otherwise you have to move it into the function)? Everything is copied by `_FromSpec` after all. However, I suppose that would replace a safe-by-design API with a "best practice" to never define the spec/slots statically (a best practice that is probably not generally followed, or even advertised, currently, I guess).

--
___ Python tracker <https://bugs.python.org/issue15870> ___
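For the first point, Python-level class creation already performs exactly this scan: `type()` computes the most derived metaclass from the bases and uses it, which is the behavior `PyType_FromSpec` lacks. A minimal illustration:

```python
class Meta(type):
    """A metaclass with no custom __new__."""

class Base(metaclass=Meta):
    pass

# Even though we call plain type() here, Python resolves the most derived
# metaclass from the bases, so Derived comes out as a Meta instance.
Derived = type("Derived", (Base,), {})
assert type(Derived) is Meta
```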
[issue15870] PyType_FromSpec should take metaclass as an argument
Sebastian Berg added the comment:

I can make a PR from the patch (and add the `Py_tp_metaclass` slot, if desired) with a basic test here, if that is what is blocking things. Fixing the type and size of the allocation (as the patch does) would allow me to give people a way to create a new NumPy DType dynamically. I only need the user to call my initialization function as soon as the type has been created (with `PyType_FromSpec` or otherwise). (And I can avoid any internal acrobatics creating the type for the user; this stuff tends to be super confusing, even if the principles are fairly straightforward...)

Happy to pursue other avenues, but I am not clear which...

> IIRC, mixing function pointers and data pointers doesn't work on some
> platforms?

... I guess it is too late to do some weird thing like (not sure it would be reasonable, or even valid, anyway):

    typedef union {
        void *pdata;
        void (*pfunc)(void);
    } slot_value;

I am a bit interested in it, because I want to use a `FromSpec` API in NumPy and it would be nice to be sure I can grow it to include data without too much hassle. But the easier thing may just be to add one or two `void *reserved` slots to the spec struct that must be NULL for now...

--
___ Python tracker <https://bugs.python.org/issue15870> ___
[issue15870] PyType_FromSpec should take metaclass as an argument
Sebastian Berg added the comment:

I am still fighting with this (and the issues surrounding it) for NumPy. The main point is that my new DTypes in NumPy are metaclasses that extend the (heap)type struct. That just feels right and matches the structure perfectly, but it seems to get awkward when you want (users) to dynamically create new MetaClass instances.

It also works fine: I could allow creating new instances from Python (although I don't need it now) and allow users to create static types (for this I have to fix the size of the additional fields for a stable ABI, but that is fine, I can allocate a new opaque slot). But Cython or heap types (stable API) seem inaccessible without annoying hacks...

For this concrete issue: would it be acceptable to use the base classes to correctly find the metaclass and use its alloc? I personally don't see why a new signature would be needed even if the metaclass were to be passed (I do not need this); it seems like you could pass a `Py_tp_meta` slot and not add a new function?

I have attached a potential patch (it is a bit large because it needs to move the `bases` discovery code to before the `res` allocation). Doing this would allow providing a `FromSpec` function which internally calls `PyType_FromSpec`. That may not make things much neater for Cython, but it feels like this is really just a bug fix? Unless you want to consider any extension of the type struct an unofficial hack to begin with :).

(Python metaclasses won't get their `__new__` called, but that is already the case; presumably a metaclass author will provide their own `InitFromSpec` function that must be called immediately after type creation, or will just create the type completely. I can probably do the second even now, but this tiny change would make it a lot cleaner.)

Trying more desperate angles, I was even wondering today if I should instead petition for Python to "donate" a `tp_metaslot` type slot... A single `void *` unused by `PyType_Type` but available to metatypes to use as they wish. While a bit strange, that might even simplify some ABI concerns or Cython compatibility ;).

--
nosy: +seberg
Added file: https://bugs.python.org/file50285/typeobject.patch
___ Python tracker <https://bugs.python.org/issue15870> ___
[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow
Sebastian Berg added the comment:

Thanks for looking into it! `cpow` is indeed complicated. We had some discussion in NumPy about the `0**y` branch cuts (I did not yet finish it, because thinking about the branch cuts is tricky). It is almost reassuring that the C standard also hedges here :). Although it means that we cannot rely on the math libraries if we want to treat the special cases more carefully :(.

--
___ Python tracker <https://bugs.python.org/issue44698> ___
[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow
Sebastian Berg added the comment:

(Sorry for the spam. I think we can/should just hard-code the expected values in the NumPy test suite. So this is not actually an issue for NumPy, and probably just warrants a double-check that the behaviour change is desirable.)

--
___ Python tracker <https://bugs.python.org/issue44698> ___
[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow
Sebastian Berg added the comment:

Hmm, sorry, I overshot/misread :(. The thing that the NumPy test suite trips over is that:

    c_powi(inf+0j, 3)

seems not to raise, but:

    _Py_c_pow(inf+0j, 3.+0j)

(or nan+0j rather than inf+0j) does seem to raise (returning `nan+nanj` in both cases). If this is the `inf` code path, raising an error may actually be a fix, and the integer code path should maybe also do it.

--
___ Python tracker <https://bugs.python.org/issue44698> ___
[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow
Sebastian Berg added the comment:

The fix broke NumPy (see also https://github.com/numpy/numpy/pull/19612). It seems incorrect: after all, it doesn't matter much whether the float can be converted to an integer correctly (or even whether it returns an undefined value), so long as `int_value == float_value` remains sensible. The old code cast integers to complex even when they were out of range (which is fine); the new code raises an error instead.

--
nosy: +seberg
___ Python tracker <https://bugs.python.org/issue44698> ___
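The `int_value == float_value` point can be seen at the Python level: mixed int/float comparisons are exact and never overflow, even when an explicit conversion of the same integer would:

```python
big = 10**400  # far beyond the range of a C double

# Explicit conversion overflows...
converted = True
try:
    float(big)
except OverflowError:
    converted = False

# ...but mixed int/float comparison stays sensible, because Python
# compares the exact values instead of converting the int to float.
assert not (big == 1e308)
assert big > 1e308
```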
[issue43608] `bytes_concat` and Buffer cleanup
New submission from Sebastian Berg:

`bytes_concat` currently uses the following code to get the data:

    va.len = -1;
    vb.len = -1;
    if (PyObject_GetBuffer(a, &va, PyBUF_SIMPLE) != 0 ||
            PyObject_GetBuffer(b, &vb, PyBUF_SIMPLE) != 0) {
        PyErr_Format(PyExc_TypeError, "can't concat %.100s to %.100s",
                     Py_TYPE(b)->tp_name, Py_TYPE(a)->tp_name);
        goto done;
    }

I don't actually know whether it is realistically possible to trigger issues here (I ended up here by chasing the wrong thing). But this (and the identical code in `bytearray`) strictly relies on `view->len` not being modified on error (or else it may not clean up `va`)! That seems wrong to me.

Although, I now saw that `PyBuffer_GetBuffer` says:

> If the exporter cannot provide a buffer of the exact type, it MUST raise
> PyExc_BufferError, set view->obj to NULL and return -1.

Pretty much all code in NumPy (and CPython, as far as I can tell) will guarantee that `obj` (and probably `len`) is untouched on error, but it will not set `obj` to NULL! I can see some wisdom in NULLing `view->obj`, since it means the caller can call `PyBuffer_Release` unconditionally (but then we have to explicitly do that!). Realistically, though, it seems to me that a caller must never release a buffer that was not exported, and must make no assumptions about its contents. (Which doesn't mean I won't ensure NumPy keeps `len` and `obj` unmodified, or NULLs `obj` on error.)

--
components: C API
messages: 389428
nosy: seberg
priority: normal
severity: normal
status: open
title: `bytes_concat` and Buffer cleanup
versions: Python 3.10, Python 3.9
___ Python tracker <https://bugs.python.org/issue43608> ___
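The "only release what you actually acquired" rule at issue here looks like this at the Python level with `memoryview` (a sketch of the cleanup pattern, not CPython's actual code):

```python
def concat_buffers(a, b):
    """Concatenate two buffer-exporting objects, releasing exactly
    the views that were successfully acquired."""
    va = memoryview(a)          # if this raises, there is nothing to release
    try:
        vb = memoryview(b)      # if this raises, only va must be released
    except TypeError:
        va.release()
        raise
    try:
        return va.tobytes() + vb.tobytes()
    finally:
        vb.release()
        va.release()
```

The C code quoted above collapses both acquisitions into one `if`, which is why it has to infer from `view->len` which of the two views actually needs releasing.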
[issue40522] [subinterpreters] Get the current Python interpreter state from Thread Local Storage (autoTSSkey)
Change by Sebastian Berg:

--
nosy: +seberg
___ Python tracker <https://bugs.python.org/issue40522> ___
[issue15751] [subinterpreters] Make the PyGILState API compatible with subinterpreters
Sebastian Berg added the comment:

In NumPy ufuncs and datatype casting (iteration) we have the following setup:

    user code (releasing the GIL) -> NumPy API -> 3rd-party C function

(In the ufunc code, NumPy is the one releasing the GIL, although others, such as Numba, probably hook in and replace that.)

Now, I want the 3rd-party C function to be able to report errors in a reasonable way. In my opinion, that means it must be able to grab the GIL, set an error (then potentially release the GIL again), and return an error result. In theory, setting the error could be deferred to some later "cleanup" step run while the GIL is held, but that just does not seem practical to me.

One thing which may make this less problematic is that all of this can be expected to run within a Python-created thread, so I somewhat think the initial proposal here would effectively fix the current NumPy usage (I am not sure).

The reason I ask/post this is that this is all part of public API which requires a complete re-implementation very soon. While extensions to that new API may be possible, that is always a bit painful, so it would be good to know how that API should be designed right now.

At this point, it seems that I should design the 3rd-party C-function API so that it is passed a `void *reserved = NULL` (always NULL), so that a `PyThreadState *` or `PyInterpreterState *` (or a `PyThreadState **`/`PyInterpreterState **`?) can be passed safely at some point. In the majority of cases I could already pass this right now, but I currently have no clear idea of what best practice is. I also need to take such an argument (requiring new API) in at least one place. If we know how this will pan out, adding it sooner rather than later would be good, since it will make future transition/adoption faster, or at least easier.

As far as I understand, `PyGILState_Ensure()` not working is probably the single most blocking issue for correct subinterpreter support in NumPy right now. Since this is baked into public API, true support in the above sense may take years, but we may be able to go a large part of the way now. My problem is that I can only do that if I am clear on what is needed.

--
nosy: +seberg
___ Python tracker <https://bugs.python.org/issue15751> ___
[issue41037] Add (optional) threadstate to: PyOS_InterruptOccurred()
New submission from Sebastian Berg :

In https://bugs.python.org/issue40826 it was defined that `PyOS_InterruptOccurred()` may only be called with the GIL held. NumPy had a few places with very unsafe sigint handling (not thread-safe). But generally, when we are in a situation where catching sigints would be a nice enhancement, we of course do not hold the GIL. So I am wondering whether we can find some kind of thread-safe solution, or even just a recipe.

Briefly looking at the code, it seemed to me that adding a new function with an additional `tstate` argument:

    PyOS_InterruptOccurred(PyThreadState *tstate)

could work, and be a simple way to allow this in the future. It is probably not high priority for us (NumPy had only one place where it tried to stop long-running computations). But right now I am unsure the function has much use if it requires the GIL to be held, and a `tstate=NULL` argument seemed reasonable (if it works), except that it adds a C-API name.

-- components: C API messages: 371885 nosy: seberg, vstinner priority: normal severity: normal status: open title: Add (optional) threadstate to: PyOS_InterruptOccurred() type: enhancement versions: Python 3.10 ___ Python tracker <https://bugs.python.org/issue41037> ___
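[Editor's note] A pure-Python analogue of the recipe being asked for here (all names are illustrative, not NumPy or CPython API): the main thread owns the SIGINT handler and sets a flag, and worker code polls that flag instead of calling into the interpreter while the GIL would be released:

```python
import signal
import threading

# Flag set from the main thread's SIGINT handler; safe to poll from
# any thread without holding interpreter state.
interrupted = threading.Event()

def _sigint_handler(signum, frame):
    interrupted.set()

def long_computation(steps):
    """Simulated long-running loop that checks for an interrupt each step."""
    done = 0
    for _ in range(steps):
        if interrupted.is_set():
            break
        done += 1
    return done

# Only the main thread may install signal handlers.
previous = signal.signal(signal.SIGINT, _sigint_handler)
try:
    result = long_computation(1000)
finally:
    signal.signal(signal.SIGINT, previous)  # restore the old handler
```

This sidesteps the GIL question entirely for pure-Python code; the C-level question in this report is whether an equivalent check can be made safe via an explicit `tstate`.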
[issue39471] Meaning and clarification of PyBuffer_Release()
Sebastian Berg added the comment:

Ok, I will just close it. It is painfully clear that e.g. `mmap` uses it this way to prohibit closing, and also that `memoryview` has all the machinery necessary to count how many exports exist, etc.

I admit this still rubs me the wrong way, and I think it means that we may need to check `bf_releasebuffer != NULL` in NumPy as well. We still have the issue of not being able to use `bf_releasebuffer` easily in NumPy. But there is nothing to be done, unless we can consider limiting the `"#s"`, etc. type of argument parsing, which is likely not an option.

-- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue39471> ___
[issue39471] Meaning and clarification of PyBuffer_Release()
Sebastian Berg added the comment:

I went through Python: `array` seems not to break the logic. Pickling has a comment which specifically wants to run into the argument-parsing corner case above (I am not sure that it is really important). However, `Modules/_testbuffer.c` (which is just a test module) actually does allocate new memory for the buffer!

I can be convinced that this is how it is supposed to be, but right now I am still not quite convinced. The main idea is efficient no-copy data sharing. It is always possible to create a `to_memoryview()` method if a realloc were desired/necessary...

-- ___ Python tracker <https://bugs.python.org/issue39471> ___
[issue39471] Meaning and clarification of PyBuffer_Release()
Sebastian Berg added the comment:

Hmmm, it seems I had missed this chunk of PEP 3118 before:

> Exporters will need to define a bf_releasebuffer function if they can
> re-allocate their memory, strides, shape, suboffsets, or format variables
> which they might share through the struct bufferinfo.

which reads like the opposite of what I would like to see, unfortunately. I guess this means that the parsing issue should be addressed differently. Opinions still welcome, though.

-- ___ Python tracker <https://bugs.python.org/issue39471> ___
[issue39471] Meaning and clarification of PyBuffer_Release()
New submission from Sebastian Berg :

The current documentation of `PyBuffer_Release()` and the PEP are a bit fuzzy about what the function can and cannot do.

When an object exposes the buffer interface, I believe it should always return a `view` (in NumPy speak) of its own data, i.e. the data exposed by the object is also owned by it directly. On the other hand, the buffer view _itself_ has fields such as `strides`, etc. which may need allocating.

In other words, I think `PyBuffer_Release()` should be documented to deallocate/invalidate the `Py_buffer`. But it *must not* invalidate the actual memory the buffer points to. If I copy all information out of the `Py_buffer` and then free it, the copy must still be valid.

I think this is the intention, but it is not spelled out clearly enough. It is also the reason for the "#s", etc. keyword argument parsers failing, due to the code:

    if (pb != NULL && pb->bf_releasebuffer != NULL) {
        *errmsg = "read-only bytes-like object";
        return -1;
    }

which in turn currently means NumPy decides to _not_ implement bf_releasebuffer at all (leading to very ugly workarounds).

I am happy to make a PR; if we can get to a point where everyone is absolutely certain that the above interpretation was always correct, we could clean up a lot of code inside NumPy as well!

-- components: C API messages: 360809 nosy: seberg priority: normal severity: normal status: open title: Meaning and clarification of PyBuffer_Release() type: behavior versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue39471> ___
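[Editor's note] The intended semantics can be illustrated from pure Python with `memoryview` (a sketch; `memoryview.release()` plays the role of `PyBuffer_Release()`): releasing invalidates the *view*, not the exporter's memory, and a copy taken before the release stays valid.

```python
ba = bytearray(b"hello")
view = memoryview(ba)

# While the buffer is exported, the exporter must keep the memory
# stable, so resizing operations are refused.
try:
    ba.extend(b"!")
except BufferError:
    resize_refused = True

copied = bytes(view)  # copy the information out of the buffer
view.release()        # invalidate the view (the Py_buffer equivalent)

ba.extend(b"!")       # the exporter's memory is usable (and resizable) again
```

The `mmap` behavior mentioned in the follow-up comment works the same way: as long as a buffer is exported, the exporter refuses operations that would invalidate the shared memory.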
[issue39274] Conversion from fractions.Fraction to bool
Change by Sebastian Berg : -- keywords: +patch pull_requests: +17412 stage: -> patch review pull_request: https://github.com/python/cpython/pull/18017 ___ Python tracker <https://bugs.python.org/issue39274> ___
[issue39347] Use of argument clinic like parsing and `METH_FASTCALL` support in extension modules
New submission from Sebastian Berg :

This is mainly an information request, so sorry if it is a bit besides the point (I do not mind if you just close it). But it seemed a bit too specific to get answers in most places...

In Python you use Argument Clinic, which supports `METH_FASTCALL`; that seems pretty awesome. For extension modules, I am not sure that Argument Clinic is a straightforward choice, since it probably generates code specific to a single Python version, while we need to support multiple versions (including versions that do not support `METH_FASTCALL`).

So the question would be whether there is some idea for providing such C-API, or for example exposing `_PyArg_UnpackKeywords` (which is at the heart of kwarg parsing). My use case is that some NumPy functions do see a nice speedup when using `METH_FASTCALL` and faster, Argument Clinic style arg-parsing. Which is why, looking at these things, I practically reimplemented a slightly dumbed-down version of `PyArg_ParseTupleAndKeywords` working much like Argument Clinic (except less smart, and only using macros and no code generation).

That seems viable, but also feels a bit wrong, so I am curious whether there may be a better solution, or whether it would be plausible to expose `_PyArg_UnpackKeywords` to reduce code duplication (although I suppose, due to old Python version support, that would effectively take years).

For completeness, my code in question is here: https://github.com/numpy/numpy/pull/15269 with the actual usage pattern being:

    static PyObject *
    my_method(PyObject *self, NPY_ARGUMENTS_DEF)
    {
        NPY_PREPARE_ARGPARSER;
        PyObject *argument1;
        int argument2 = -1;
        if (!npy_parse_arguments("method", 1, -1, NPY_ARGUMENTS_PASS,
                "argument1", NULL, &argument1,
                "argument2", &PyArray_PythonPyIntFromInt, &argument2,
                NULL, NULL, NULL)) {
            return NULL;
        }
    }

-- components: Argument Clinic messages: 360073 nosy: larry, seberg priority: normal severity: normal status: open title: Use of argument clinic like parsing and `METH_FASTCALL` support in extension modules type: enhancement ___ Python tracker <https://bugs.python.org/issue39347> ___
[issue39274] Conversion from fractions.Fraction to bool
Sebastian Berg added the comment:

Thanks for the quick responses. @Victor Stinner, I suppose you could change `numbers.Complex.__bool__()` by adding the no-op bool to make it `bool(self != 0)`. But I am not sure I feel it is necessary. NumPy is a bit strange in that it uses its own boolean scalar, and it is easy to override the slot for such objects.

-- nosy: +seberg ___ Python tracker <https://bugs.python.org/issue39274> ___
[issue39028] ENH: Fix performance issue in keyword extraction
Sebastian Berg added the comment:

Fair enough; we had code that does it the other way around, so it seemed "too obvious", since the current check seems mainly useful with few kwargs. However, a single kwarg is super common in Python, while many kwargs seem super rare (in any Argument Clinic function), which is why I even had trouble finding a function where it could make a difference, since it needs many kwargs.

With the changes:

    sebastian@seberg-x1 ~/python-dev/bin % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
    100 loops, best of 5: 205 nsec per loop
    sebastian@seberg-x1 ~/python-dev/bin % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
    100 loops, best of 5: 207 nsec per loop

On master:

    sebastian@seberg-x1 ~/python-dev/bin % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
    100 loops, best of 5: 221 nsec per loop
    sebastian@seberg-x1 ~/python-dev/bin % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
    100 loops, best of 5: 218 nsec per loop

At close to 5% that is barely noticeable; I suppose with more kwargs it could start to make a difference, but with fewer than 3 it just does not matter, I guess. Also, I am currently not sure how likely non-interned strings are in practice, but I am not sure it matters to optimize for them (unless PyArg_From* functions use them).

In any case, sorry if this was noise; happy to fix things up, or just drop it if many kwargs are considered non-existent.

-- ___ Python tracker <https://bugs.python.org/issue39028> ___
[issue39028] ENH: Fix performance issue in keyword extraction
New submission from Sebastian Berg :

The keyword argument extraction/finding function seems to have a performance bug/enhancement opportunity (unless I am missing something here). It reads:

```
for (i=0; i < nkwargs; i++) {
    PyObject *kwname = PyTuple_GET_ITEM(kwnames, i);

    /* ptr==ptr should match in most cases since keyword keys
       should be interned strings */
    if (kwname == key) {
        return kwstack[i];
    }
    assert(PyUnicode_Check(kwname));
    if (_PyUnicode_EQ(kwname, key)) {
        return kwstack[i];
    }
}
```

However, it should be split into two separate for loops, using the `_PyUnicode_EQ` check only if the identity check failed for _all_ arguments. I will open a PR for this (it seemed like a bpo number is wanted for almost everything).

-- components: C API messages: 358287 nosy: seberg priority: normal severity: normal status: open title: ENH: Fix performance issue in keyword extraction versions: Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue39028> ___
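[Editor's note] A Python model of the proposed two-pass structure (illustrative only; the real change is in C): run the cheap identity check for *every* name first, and fall back to full string equality only if no identity check matched.

```python
def find_keyword(kwnames, kwstack, key):
    # Pass 1: pointer comparison only. Keyword names are usually
    # interned, so in the common case identity matches immediately.
    for i, kwname in enumerate(kwnames):
        if kwname is key:
            return kwstack[i]
    # Pass 2: full string equality, reached only after *all*
    # identity checks have failed (e.g. a non-interned key).
    for i, kwname in enumerate(kwnames):
        if kwname == key:
            return kwstack[i]
    return None
```

The win over the interleaved version is that the expensive equality comparison is no longer executed once per non-matching name on every lookup that eventually hits via identity.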
[issue39028] ENH: Fix performance issue in keyword extraction
Change by Sebastian Berg : -- keywords: +patch pull_requests: +17049 stage: -> patch review pull_request: https://github.com/python/cpython/pull/17576 ___ Python tracker <https://bugs.python.org/issue39028> ___
[issue37980] regression when passing numpy bools to sorted(..., reverse=r)
Sebastian Berg added the comment:

I applaud the stricter rules in general. As Mark noted nicely, the issue is that `__index__` is maybe a strange way to achieve that for bools (it is not as though `123` is a clean bool)? `__nonzero__` coerces to bools; there is no `__bool__`-like protocol to convert to a bool safely. Basically, this seems to force NumPy to back down from saying that `list[np.True_]` will be invalid in the future. (And we cannot just get rid of our bools, unfortunately.)

-- ___ Python tracker <https://bugs.python.org/issue37980> ___
[issue37980] regression when passing numpy bools to sorted(..., reverse=r)
Change by Sebastian Berg : -- nosy: +seberg ___ Python tracker <https://bugs.python.org/issue37980> ___
[issue26568] Add a new warnings.showwarnmsg() function taking a warnings.WarningMessage object
Sebastian Berg added the comment:

To make warning testing saner, we added to NumPy basically my own version of catch_warnings on steroids, which needed/will need changing because of this. Unless I missed it somewhere, this change should maybe be put into the release notes, to make it a bit easier to track down what is going on.

-- nosy: +seberg ___ Python tracker <http://bugs.python.org/issue26568> ___
[issue23352] PyBuffer_IsContiguous() sometimes returns wrong result if suboffsets != NULL
Sebastian Berg added the comment:

I do not have an opinion either way, but I do think there is a small difference here compared to the other issue. With the other issue, there are cases where we cannot set the strides correctly: if you ask NumPy directly whether the array is contiguous (or create it from NumPy) and it says "yes", and you then fetch the buffer without asking for a contiguous array, it is, without the change, possible to get an array that would *not* be considered contiguous. On the other hand, setting suboffsets to NULL when all are -1 is always possible, if tedious, I admit.

-- ___ Python tracker <http://bugs.python.org/issue23352> ___
[issue23352] PyBuffer_IsContiguous() sometimes returns wrong result if suboffsets != NULL
Sebastian Berg added the comment:

NumPy does not understand suboffsets; the buffers we create will always have them NULL. The other way around (consuming buffers that have suboffsets): to be honest, I think NumPy is probably ignoring the whole fact that they might exist at all :/, which really needs to be fixed if it is the case.

-- ___ Python tracker <http://bugs.python.org/issue23352> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

Antoine, sounds good to me. I don't mind this being in Python sooner rather than later; for NumPy itself it does not matter, I think. I just wanted to warn that there were problems when we first tried to switch in NumPy, which, if I remember correctly, is now maybe 2 years ago (in a dev version).

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

@pitrou, yes, of course. This would make Python do the same thing as NumPy does (currently only with that compile flag given). About the time schedule, I think I will try to see whether some other NumPy dev has an opinion. Plus, I should look into documenting it for the moment, so that someone who reads up on the buffer protocol gets these things right.

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

NumPy 1.9 was only released recently, so 1.10 might be a while. If no problems show up during the release or until then, we will likely switch by then. But that could end up being a year from now, so I am not sure whether 3.6 might not fit better. The problems should be mostly mitigated on our side, so bug-wise it shouldn't be a big issue, I would guess. I will try to look at it more soon, but I am completely overloaded at least for the next few days; maybe some other NumPy devs can chip in.

Not sure I get your last point: slicing should give the "organic" values even for the mangled-up thing with relaxed strides on (currently)?!

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

Yeah, the code does much the same as the old NumPy code (at least most of the same funny little things, though I seem to remember the old NumPy code had something yet a bit weirder; I would have to check). To be honest, I do not know. It isn't implausible that the original NumPy code dates back 15 years or more, to Numeric. I doubt whoever originally wrote it thought much about it, but there may be some good reason, and there are the safety considerations that people use the strides in a way they should not, which may trip us up here in any case.

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Changes by Sebastian Berg : Added file: http://bugs.python.org/file36680/contiguous.py ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Changes by Sebastian Berg : Added file: http://bugs.python.org/file36678/contiguous.py ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Changes by Sebastian Berg : Added file: http://bugs.python.org/file36677/relaxed-strides-checking.patch ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

I am very sorry. The attached patch fixes this (not sure it is quite right, but if anything it should be more general than necessary). One test fails, but it looks like exactly the intended change.

-- Added file: http://bugs.python.org/file36676/relaxed-strides-checking.patch ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

An extra dimension is certainly not irrelevant! The strides *are* valid, and NumPy currently actually commonly creates such arrays when slicing. The question is whether or not we want to ignore them for contiguity checks even if they have no effect on the memory layout. So there are three options I currently see:

1. Python also generalizes like I would like NumPy to end up in the future (the current patch should do that) and just doesn't care about such strides, because the actual memory layout is what matters.

2. We say it is either too dangerous (which may very well be) or you want to preserve Fortran/C-order information even when it does not matter to the memory layout. This leads to:

   2a) We just keep it as it is and live with minor inconsistencies (or never do the relaxed strides in NumPy).

   2b) We let these buffers return False on checking for contiguity but *allow* fetching a buffer when C-/F-contiguous is explicitly asked for when getting the buffer. Which is a weird middle way, but it might actually be a pretty sane solution (I have to think about it).

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

Well, the 9223372036854775807 is certainly no good for production code and we would never have it in a release version; it is just there currently to expose whether there are more problems. However, I don't care what happens on overflow (as long as it is not an error).

Note that the stride here is on a dimension with shape 1. The only valid index is thus always 0, and 0*9223372036854775807 = 0, so the stride value does not actually matter when calculating offsets into the array. You could simply set it to 80 to get something that would be considered C-contiguous, or to 8 to get something that is considered F-contiguous. But both are the case in a way, so just "cleaning up" the strides does not actually get you all the way.

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
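[Editor's note] The arithmetic can be checked with a small sketch: for a dimension of length 1, the only valid index is 0, so the stride of that dimension never contributes to any byte offset.

```python
from itertools import product

def offsets(shape, strides):
    """All byte offsets reachable by valid indices, given shape/strides."""
    return sorted(sum(i * s for i, s in zip(idx, strides))
                  for idx in product(*(range(n) for n in shape)))

# A 1 x 10 array of 8-byte items: the length-1 dimension's stride
# is multiplied by 0 in every offset, so its value is irrelevant.
a = offsets((1, 10), (80, 8))                   # C-contiguous strides
b = offsets((1, 10), (8, 8))                    # F-contiguous strides
c = offsets((1, 10), (9223372036854775807, 8))  # "garbage" stride
```

All three stride choices address exactly the same ten contiguous 8-byte slots, which is the core of the argument that such buffers can be treated as both C- and F-contiguous.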
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

To be clear, the important part here is that all elements *can* be accessed using that scheme. It is not correct to assume that `strides[-1]` or `strides[0]` is actually equal to `itemsize`. In other words, you have to be able to pass the pointer to the start of a C-contiguous array into some C library that knows nothing about strides, without any further thinking. The 0-strides badly violate that.

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
Sebastian Berg added the comment:

#12845 should be closed; it seems like a bug in some old version. The definition now is simply that the array is contiguous if you can legally access it in a contiguous fashion. Which means the first stride is itemsize and the second is itemsize*shape[0] for Fortran order, and inverted for C-order.

-- ___ Python tracker <http://bugs.python.org/issue22445> ___
[issue22445] Memoryviews require more strict contiguous checks then necessary
New submission from Sebastian Berg:

In NumPy we decided some time ago that if you have a multi-dimensional buffer, shaped for example 1x10, then this buffer should be considered both C- and F-contiguous. Currently, some buffers which can validly be used in a contiguous fashion are rejected. CPython does not support this currently, possibly creating a small nuisance if/once we change it fully in NumPy; see for example https://github.com/numpy/numpy/issues/5085

I have attached a patch which should (sorry, I did not test this at all yet) relax the checks as much as possible. I think this is right, but we saw some subtle breaks in user code (mostly Cython code) when we first tried changing it, and while NumPy arrays may more prominently be C/F-contiguous, compatibility issues with libraries checking for contiguity explicitly and then requesting a strided buffer are very possible.

If someone could give me a hint about adding tests, that would be great. Also, I would like to add a small note to the PEP in any case regarding this subtlety, in the hope that more code will take care about such subtleties.

-- components: Library (Lib) files: relaxed-strides-checking.patch keywords: patch messages: 227113 nosy: seberg priority: normal severity: normal status: open title: Memoryviews require more strict contiguous checks then necessary type: enhancement Added file: http://bugs.python.org/file36663/relaxed-strides-checking.patch ___ Python tracker <http://bugs.python.org/issue22445> ___
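[Editor's note] The 1x10 case can be observed from Python via `memoryview`'s contiguity flags (a sketch; what is reported for the F-contiguity of the 1x10 view is exactly the relaxed-check question this report is about, so it is printed rather than asserted):

```python
buf = bytes(80)
# A 1 x 10 view of 8-byte items; C-order strides are (80, 8).
m = memoryview(buf).cast('q', (1, 10))
print(m.c_contiguous)  # True: the cast produces a C-ordered view
# Whether this also counts as F-contiguous depends on whether the
# length-1 dimension's stride is ignored by the contiguity check.
print(m.f_contiguous)
```

With the relaxed check, a dimension of extent 1 is skipped entirely, so the same view is reported as both C- and F-contiguous.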
[issue7279] decimal.py: == and != comparisons involving NaNs
Sebastian Berg added the comment:

Thanks, yes, you are right; I should have googled a bit more anyway. Though I did not find much on the hashable-vs-unhashable question itself, so if I ever stumble across it again, I will write a mail...

-- ___ Python tracker <http://bugs.python.org/issue7279> ___
[issue7279] decimal.py: == and != comparisons involving NaNs
Sebastian Berg added the comment:

This is closed, and maybe I am missing something. But from a general point of view, why does hashing of NaN not raise an error as it did for decimals, i.e. why was this not resolved exactly the other way around? I am mostly just wondering about this; it is not a problem for me.

Hashing NaNs seems dangerous and surprising, because it might work in dicts/sets but normally doesn't. (The only way for it to work right would be if NaN were a singleton, but that is impossible for subclasses, etc.) The thing is:

    In [16]: s = {float('nan'): 1, float('nan'): 2, float('nan'): 3}
    In [17]: s
    Out[17]: {nan: 1, nan: 2, nan: 3}
    In [18]: s[float('nan')]
    KeyError: nan

    In [19]: n = float('nan')
    In [20]: s = {n: 1, n: 2, n: 3}
    In [21]: s
    Out[21]: {nan: 3}

This is because `n is n`, and PyObject_RichCompareBool assumes that if `a is b` then `a == b`, which is simply wrong for NaNs and also makes comparisons of iterables including NaNs an impossible business. NaNs have their unavoidable weirdness, but at least for dictionaries/sets it would seem clearer to me if they raised an error.

-- nosy: +seberg ___ Python tracker <http://bugs.python.org/issue7279> ___
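[Editor's note] The identity shortcut described above can be demonstrated directly: a NaN is findable in a container only through the very same object, because equality always fails but identity short-circuits the comparison.

```python
nan = float("nan")

assert nan != nan                 # NaN never compares equal, even to itself
assert hash(nan) == hash(nan)     # ...yet hashing it is allowed

d = {nan: 1}
assert d[nan] == 1                # same object: identity shortcut hits
try:
    d[float("nan")]               # equal-looking but distinct object
except KeyError:
    missed = True                 # lookup fails despite "equal" key
```

The same shortcut is what makes `nan in [nan]` true while `float("nan") in [nan]` is false.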
[issue16845] warnings.simplefilter should validate input
New submission from Sebastian Berg:

`warnings.simplefilter` does not validate that the category passed in is actually a class. This means that an invalid category leads to a `TypeError` whenever a warning would otherwise occur, due to the `issubclass` check failing. It is a very small thing, but for usability it would be clearer if this was checked right away.

-- messages: 178867 nosy: seberg priority: normal severity: normal status: open title: warnings.simplefilter should validate input type: enhancement ___ Python tracker <http://bugs.python.org/issue16845> ___
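[Editor's note] A sketch of the failure mode, written to tolerate either behavior (a Python that validates the category eagerly, as this report requests, or one where the `TypeError` only surfaces when a warning is actually issued and the filter's `issubclass` check runs):

```python
import warnings

validated_eagerly = False
deferred_failure = False

with warnings.catch_warnings():
    try:
        # An invalid category: a string instead of a Warning subclass.
        warnings.simplefilter("ignore", category="not a class")
    except TypeError:
        validated_eagerly = True   # the behavior this report asks for
    else:
        try:
            warnings.warn("anything")  # filter matching hits issubclass()
        except TypeError:
            deferred_failure = True    # error surfaces far from the cause
```

The point of the report is that the deferred failure is hard to debug, since the traceback points at the `warn()` call site rather than at the bad `simplefilter` call.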