[issue47082] No protection: `import numpy` in two different threads can lead to race-condition

2022-03-21 Thread Sebastian Berg


Sebastian Berg  added the comment:

Thanks, so there should already be a lock in place (sorry, I missed that).  But 
somehow we seem to get around it?

Do you know what may cause the locking logic to fail in this case?  Recursive 
imports in NumPy itself?  Or Cython using low-level C-API?

I.e. can you think of something to investigate that may help NumPy/Cython to 
make sure that locking is successful?


/* Cython's import code (slightly cleaned up for Python 3 only): */
static PyObject *__Pyx_Import(PyObject *name, PyObject *from_list, int level) {
    PyObject *empty_list = 0;
    PyObject *module = 0;
    PyObject *global_dict = 0;
    PyObject *empty_dict = 0;
    PyObject *list;
    if (from_list)
        list = from_list;
    else {
        empty_list = PyList_New(0);
        if (!empty_list)
            goto bad;
        list = empty_list;
    }
    global_dict = PyModule_GetDict(__pyx_m);
    if (!global_dict)
        goto bad;
    empty_dict = PyDict_New();
    if (!empty_dict)
        goto bad;
    {
        if (level == -1) {
            if ((1) && (strchr(__Pyx_MODULE_NAME, '.'))) {
                module = PyImport_ImportModuleLevelObject(
                    name, global_dict, empty_dict, list, 1);
                if (!module) {
                    if (!PyErr_ExceptionMatches(PyExc_ImportError))
                        goto bad;
                    PyErr_Clear();
                }
            }
            level = 0;
        }
        if (!module) {
            module = PyImport_ImportModuleLevelObject(
                name, global_dict, empty_dict, list, level);
        }
    }
bad:
    Py_XDECREF(empty_list);
    Py_XDECREF(empty_dict);
    return module;
}

--

Python tracker <https://bugs.python.org/issue47082>



[issue47082] No protection: `import numpy` in two different threads can lead to race-condition

2022-03-21 Thread Sebastian Berg


Sebastian Berg  added the comment:

To add to this: it would seem to me that the side-effects of importing should 
be guaranteed to run only once?

However, IO or other operations could be part of the import side-effects and 
release the GIL.  So even a simple, pure-Python package could run into this 
same issue and probably will not even realize it.
(Assuming I understand correctly how this is happening.)

So it would seem to me that either:
* Python should lock on the thread level, or maybe on the `sys.modules` dictionary?
* The `threading` module could somehow ensure safety by hooking into
  the import machinery?
* Packages that may release the GIL, or have side-effects that must
  only be run once, have to lock themselves (e.g. NumPy; a minimal C
  sketch of this option follows below).
  (But it seems to me that many packages will not even be aware of
  possible issues.)
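
To illustrate the last option, here is a minimal sketch of what package-level
locking could look like in a C extension (all names are made up; the pattern is
a static flag plus a `PyThread_type_lock`, dropping the GIL around the blocking
acquire so that a thread which released the GIL inside the side effects can
still finish):

```c
#include <Python.h>
#include "pythread.h"

static PyThread_type_lock side_effect_lock;  /* created on first import */
static int side_effects_done;

/* Called from the module's exec/init function, with the GIL held. */
static int
run_side_effects_once(void)
{
    if (side_effect_lock == NULL) {
        /* no GIL release between check and assignment, so not racy */
        side_effect_lock = PyThread_allocate_lock();
        if (side_effect_lock == NULL) {
            return -1;
        }
    }
    /* Drop the GIL while blocking: the thread holding the lock may have
     * released the GIL inside the side effects and needs it back to
     * finish; holding both here would deadlock. */
    Py_BEGIN_ALLOW_THREADS
    PyThread_acquire_lock(side_effect_lock, WAIT_LOCK);
    Py_END_ALLOW_THREADS

    if (!side_effects_done) {
        /* ... IO or other one-time work that may release the GIL ... */
        side_effects_done = 1;
    }
    PyThread_release_lock(side_effect_lock);
    return 0;
}
```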

--
nosy: +seberg

Python tracker <https://bugs.python.org/issue47082>



[issue46451] Tracing causes module globals to be mutated when calling functions from C

2022-02-01 Thread Sebastian Berg


Sebastian Berg  added the comment:

While I have a repro for Python, I think the pre-release of Cython already 
fixes it (and I just did not regenerate the C sources when trying, I guess.  A 
`git clean` to the rescue...).

--

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Tracing causes module globals to be mutated when calling functions from C

2022-02-01 Thread Sebastian Berg


Sebastian Berg  added the comment:

Not reopening for now, but I will note again that (AFAIK) Cython uses 
`PyEval_EvalCodeEx`, and the docs say that it is not used internally by CPython 
anymore.

So it seems pretty plausible that the bug is in `PyEval_EvalCodeEx` and not the 
generated Cython code?

--

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Tracing causes module globals to be mutated when calling functions from C

2022-02-01 Thread Sebastian Berg


Change by Sebastian Berg :


--
stage:  -> resolved
status: open -> closed

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Tracing causes module globals to be mutated when calling functions from C

2022-02-01 Thread Sebastian Berg

Sebastian Berg  added the comment:

Thanks for having a look. I have confirmed this is related to Cython (no 
pandas/NumPy involved) – repro at https://github.com/seberg/bpo46451.  What 
happens under the hood in Cython is probably:

https://github.com/cython/cython/blob/master/Cython/Utility/ObjectHandling.c#L2569-L2611

This generates a call to `PyEval_EvalCodeEx`, and I could not repro with just a 
`PyObject_FastCallDict`, so I assume Cython is doing something wrong and will 
open an issue there.  But if you have a quick tip as to what might be wrong, 
that would be nice :).

Otherwise, I will just close this, and may reopen if Cython hits a wall.

--

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Tracing causes module globals to be mutated when calling functions from C

2022-01-31 Thread Sebastian Berg


Change by Sebastian Berg :


--
title: Possibly bad interaction with tracing and cython? -> Tracing causes 
module globals to be mutated when calling functions from C

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Possibly bad interaction with tracing and cython?

2022-01-20 Thread Sebastian Berg


Sebastian Berg  added the comment:

Ahh, a further data point.  The name from the module scope that is overwritten 
IS a parameter name used in the function locals.  Strangely, if I modify the 
tracing to print more:

stop = 0
def trace(frame, event, arg):
    global stop
    if stop > 10:
        return None

    if np.core.numeric.dtype is not np.dtype:
        #print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(np.core.numeric.dtype)
        print(frame, event, arg)
        stop += 1
    else:
        print(frame, event, arg)

    return trace

Then what I get is:

None
 call None
None
 line None
None
 line None
None
 line None
None
 line None
bool


So, upon entering the function, the value is (already) cleared/set to None 
(which is of course correct for `dtype=None`), and then, while the function 
runs, storing into the function locals _mutates_ the module global?

The fact that it keeps changing during the function run points very strongly 
at CPython?

--

Python tracker <https://bugs.python.org/issue46451>



[issue46451] Possibly bad interaction with tracing and cython?

2022-01-20 Thread Sebastian Berg


New submission from Sebastian Berg :

Starting here, but there could be Cython interaction or something else in 
theory.  But, when running the following:

* Python 3.10.1 (not 3.9.9, debug version or not)
* Setting a tracing function  (not setting a trace-function will fix the issue)
* Running Cython (maybe also C code) calling back into Python (as best I can 
tell)

It can happen that module globals in the called function's scope seem to be 
modified.  Oddly enough, to a value used in the locals of that function?!

The pandas issue:

https://github.com/pandas-dev/pandas/issues/41935

has a reproducer (sorry that it takes NumPy and pandas for now).  I will paste 
it at the end here also.

I can find that the value is modified by the time the `trace` function is 
called.  No other "trace" triggers are processed before in this example (Cython 
calls other functions in NumPy, but all of those are C implemented, so I guess 
that is why).
The function in question (unlike all?) should further be called with 
`__Pyx_PyFunction_FastCall`, so that is probably an important data point: 
Fastcall protocol seems involved.

(Reproducible using NumPy 1.21.5 and Pandas 1.3.5; except maybe for pandas due 
to the Cython version, I don't expect a version dependency.)

The output of the script is very brief:

Something happened here, `np.core.numeric.dtype IS np.dtype`
 call None


The full reproducer script is:


import sys
import numpy as np
import pandas as pd

from numpy.core import numeric

stop = False
def trace(frame, event, arg):
    global stop
    if stop:
        return None

    if np.core.numeric.dtype is not np.dtype:
        print("Something happened here, `np.core.numeric.dtype IS np.dtype`")
        print(frame, event, arg)
        stop = True
    else:
        print(frame, event, arg)

    return trace

sys.settrace(trace)

pd._libs.lib.maybe_convert_objects(np.array([None], dtype=object))


For completeness, the Cython code calling the NumPy function in question is 
(likely there is more, this is Cython, I just cut it out a bit :)):

#if CYTHON_FAST_PYCALL
if (PyFunction_Check(__pyx_t_5)) {
    PyObject *__pyx_temp[3] = {__pyx_t_2, __pyx_t_6, Py_False};
    __pyx_t_15 = __Pyx_PyFunction_FastCall(__pyx_t_5, __pyx_temp+1-__pyx_t_8,
                                           2+__pyx_t_8);
    if (unlikely(!__pyx_t_15)) __PYX_ERR(0, 2441, __pyx_L1_error)
    __Pyx_XDECREF(__pyx_t_2); __pyx_t_2 = 0;
    __Pyx_GOTREF(__pyx_t_15);
    __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
} else
#endif

--
components: Interpreter Core
messages: 411082
nosy: seberg
priority: normal
severity: normal
status: open
title: Possibly bad interaction with tracing and cython?
type: crash
versions: Python 3.10

Python tracker <https://bugs.python.org/issue46451>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-12-17 Thread Sebastian Berg


Sebastian Berg  added the comment:

Btw. huge thanks for looking into this!  Let me know if I can try to help out 
(I can make do with static metatypes, but my story seems much clearer if I 
could say: well, with Py 3.11 you can, and probably should, do it dynamically).
I had lost a lot of time chasing "metaclass should just work" followed by "but 
I can't get it right without bad hacks".

And now the solution seems fairly clear, which is amazing :)!

--

Python tracker <https://bugs.python.org/issue45383>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-12-17 Thread Sebastian Berg


Sebastian Berg  added the comment:

Well, what we need is a way to say: I am calling `type.__new__` (i.e. 
PyType_FromSpec) on purpose from (effectively) my own `mytype.__new__`?

That is because, right now, I assume we want to protect users from calling 
PyType_FromSpec thinking that it is equivalent to calling `class new(base)` 
when it may not be if base is a metaclass.  So calling `PyType_FromSpec` might 
refuse to work if it finds a custom `metaclass.__new__` (init?).

I don't really see that it matters if we only support effectively this from C:
```
class MyMetaClass(type):
    def __new__(cls, *args, **kwargs):
        self = type.__new__(...)  # this is PyType_FromSpec
        # more stuff
```
So, I thought telling `PyType_FromSpec` that we are "inside" a custom `__new__` 
is sufficient, and that even a flag passed as part of the spec could be enough.
But... I agree that I do not quite see how it would be pretty, so it probably 
was a bad idea :).

Plus, if you add a new method, it should also solve the issue of setting the 
`tp_type` slot to the metaclass explicitly when it is not implicit from 
inheritance (which is the only thing I care about).
(PyType_FromSpec and PyType_ApplySpec will still need to do the work of 
resolving the metaclass from the base classes, though.)

--

Python tracker <https://bugs.python.org/issue45383>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-12-17 Thread Sebastian Berg


Sebastian Berg  added the comment:

Fully agree!  In the end, `PyType_FromSpec` replaces `type.__new__()` (and 
init, I guess) when working in C.  In Python, we would call `type.__new__` 
(maybe via super) from the `metatype.__new__`; but right now, in C, the 
metatype cannot reliably use `PyType_FromSpec` in its own `metatype.__new__` 
to do the same.

I agree with the scenarios:
* If we do not have a custom `metatype.__new__` (init?) then `PyType_FromSpec` 
should have no reason to refuse doing the work, because nothing can go wrong.
* If we do have a custom `tp_new` the user has to provide C API to create the 
metaclass instance.  But they still need a way to call `type.__new__` in C 
(i.e. get what `PyType_FromSpec` does, and promising to do the rest).

`PyType_ApplySpec` would provide that way to create a custom `metatype.__new__` 
in C when `PyType_FromSpec()` would otherwise reject it to make the first 
scenario safe (a rough sketch of what I imagine follows below).
A flag could probably do the same.  I have no preference; `ApplySpec` seems 
great to me.
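
For illustration, roughly the shape I imagine (a sketch only: `PyType_ApplySpec`
is the proposed, not yet existing, function, and `MyMetaTypeObject` /
`mymeta_new_from_spec` are made-up names):

```c
#include <Python.h>

/* Hypothetical: the metatype allocates its own instance ("my own
 * __new__") and then asks CPython to apply the spec, i.e. to do the
 * part that type.__new__ / PyType_FromSpec normally does. */
typedef struct {
    PyHeapTypeObject super;  /* regular heap-type layout comes first */
    void *extra;             /* metatype-specific state */
} MyMetaTypeObject;

static PyObject *
mymeta_new_from_spec(PyTypeObject *metatype, PyType_Spec *spec, PyObject *bases)
{
    PyObject *new_type = metatype->tp_alloc(metatype, 0);
    if (new_type == NULL) {
        return NULL;
    }
    /* proposed API: fill the already-allocated type from the spec */
    if (PyType_ApplySpec((PyTypeObject *)new_type, spec, bases) < 0) {
        Py_DECREF(new_type);
        return NULL;
    }
    ((MyMetaTypeObject *)new_type)->extra = NULL;  /* metatype init here */
    return new_type;
}
```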

--

Python tracker <https://bugs.python.org/issue45383>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-12-16 Thread Sebastian Berg


Sebastian Berg  added the comment:

It is probably early, but I am starting to like the idea of a "C MetaClass 
factory" helper/indicator.

It seems to me that yes, at least `tp_new` cannot reasonably be called for a 
class that is created in C; it is just too confusing/awkward to try to push the 
C stuff _through_ the Python API.  (And the Python API is the typical one that 
should not be inconvenienced.)

Which gives two possibilities if I want this capability:
1. Force use of a custom class factory in Python (i.e. not through `__new__`).
But that seems hard, since I need to refuse `__new__()`!
2. Use a class factory in C which never calls `__new__` and knows that this is 
OK.

This is not my turf, so I like unholy, but maybe pragmatic things :).  Would a 
slot/flag indicating `Py_using_metaclass_cinit_pretty_promise_please` "solve" 
these issues?

I mean, we have two ways to create classes (C and Python), I am not sure it is 
plausible to untangle this on the MetaClass level, so maybe the only way 
forward is to embrace it: Some Python MetaClasses just can't be instantiated 
from C, because we don't know it will work.  If we want to allow instantiating 
the MetaClass from C, we need some way to set this up.  And either we cannot 
use `PyType_FromSpec` then, or we need to tell it that we know what we are 
doing.

--

Python tracker <https://bugs.python.org/issue45383>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-12-16 Thread Sebastian Berg


Sebastian Berg  added the comment:

Sorry, I need some time to dive back into this, so some things might be garbled 
:).  Yes, I do agree supporting a custom `tp_new` here seems incredibly tricky. 
 I have not thought about the implications of this, though.

> guess the safest option is to fail when the metaclass has a custom tp_new

That would seem OK, but I can't quite judge it.  It may mean that I have to do 
a custom factory to create new metaclass instances from Python, but that is 
probably for the better anyway.

Now, calling `tp_new` is a bit useless, since from C we don't have a dict 
anyway (at least not really).  So yeah, this does not make sense at all for 
most Python defined metaclasses...  (they may want to inspect/mutate the dict)

> But at that point, this is duplicating a lot of existing functionality, and 
> I'm starting to wonder if this wouldn't all be better with calling the 
> metaclass instead.

I do not think I am following :(.  My worry is that I want people to create a 
MetaClass instance through C (but not be locked into static types forever).

My current thought is that I would like it to be possible to do something like:

/* Create a new type object */
type_spec = {stuff};
newtype = PyType_FromSpec(&type_spec);
/* Finalize the MetaClass */
metatype_spec = {more_stuff};
PyArray_InitDTypeFromSpec(newtype, &metatype_spec);

Of course I can move this into a single function and create the class for the 
user.  But I am uncertain that piping everything through `tp_new` will help?  
At some point I thought that the user should create a subclass, and I create 
another class inheriting it.  But it also seems convoluted and complicated.

I have no idea right now, but maybe there could also be a way to make creating 
a metaclass-factory in C easier, rather than supporting `PyType_FromSpec` for 
metaclasses.
(I.e. a `PyType_InitFromSpec()` doing most of what `PyType_FromSpec` does, but 
really meant to be used only by such metaclass factories.)

> - basicsize/itemsize could be allowed with __basicsize__/__itemsize__ in the 
> dict.

Do we need this?  I need the basicsize of the metaclass, but that of the class 
seems fixed/OK?

> - flags: we could add mechanisms to set individual flags after the type is 
> created, as needed.

Seems fine, yeah.

> - slots can usually be applied after the class is created; maybe there should 
> be a public function for this.
> - members could theoretically be copied to individual descriptors; there 
> doesn't seem much need for keeping tp_members around.

But a Python MetaClass (that the author may not even know about) might need 
access to these to work properly?
A bit far-fetched, but...

--

Python tracker <https://bugs.python.org/issue45383>



[issue45383] PyType_FromSpec API fails to use metaclass of bases

2021-10-05 Thread Sebastian Berg


New submission from Sebastian Berg :

PyType_FromSpec fails to take MetaClasses into account.

 https://bugs.python.org/issue15870

asks to create a new API to pass in the MetaClass.  This issue is only about 
"inheriting" the metaclass of the bases correctly.  Currently, Python fails to 
take into account that the bases may be MetaClass instances and not plain 
`PyType_Type` instances.

--
components: C API, Interpreter Core
messages: 403273
nosy: petr.viktorin, seberg
priority: normal
pull_requests: 27093
severity: normal
status: open
title: PyType_FromSpec API fails to use metaclass of bases
type: crash
versions: Python 3.11

Python tracker <https://bugs.python.org/issue45383>



[issue15870] PyType_FromSpec should take metaclass as an argument

2021-10-05 Thread Sebastian Berg


Sebastian Berg  added the comment:

Yeah, I will try and have a look.  I had posted the patch, because the test 
looked like a bit of a larger chunk of work ;).

> And I'm surprised that you're surprised :)

:).  I am coming from a completely different angle, probably.  Just if you are 
curious, I use a from-spec like API (not public yet) in NumPy and dynamic use 
seemed natural/logical in that use-case, e.g.: 
https://github.com/numpy/numpy/blob/c92864091e5928d92bc109d1505febe35f3909f1/numpy/core/src/multiarray/convert_datatype.c#L2434

But I am trying to understand the preference for static better.  There is 
probably something to learn, or even something important I am missing.


> And I have some plans to make static specs useful in type checking, since we 
> can assume that types made from the same spec share the memory layout.

(I don't understand how it is useful, unless you reuse slot structs?)
It sounds interesting, but even a static spec does not guarantee constant 
content unless the user indicates it through a flag?
Maybe you could achieve this by inspecting/comparing content rather than using 
the spec pointer(s)?  (More complex, but also more powerful?)

--

Python tracker <https://bugs.python.org/issue15870>



[issue15870] PyType_FromSpec should take metaclass as an argument

2021-09-28 Thread Sebastian Berg


Sebastian Berg  added the comment:

> But if tp_name is created dynamically, it could lead to a dangling pointer.

I will guess this is probably just an oversight/bug since the main aim was to 
move towards heap-types, and opened an issue: https://bugs.python.org/issue45315

--

Python tracker <https://bugs.python.org/issue15870>



[issue45315] `PyType_FromSpec` does not copy the name

2021-09-28 Thread Sebastian Berg


New submission from Sebastian Berg :

As noted in the issue: https://bugs.python.org/issue15870#msg402800 
`PyType_FromSpec` assumes that the `name` passed is persistent for the program 
lifetime.  This seems wrong/unnecessary: We are creating a heap-type, a 
heap-type's name is stored as a unicode object anyway!

So, there is no reason to require the string be persistent, and a program that 
decides to build a custom name dynamically (for whatever reasons) would run 
into a crash when the name is accessed.

The code:

https://github.com/python/cpython/blob/0c50b8c0b8274d54d6b71ed7bd21057d3642f138/Objects/typeobject.c#L3427

Should be modified to point to the `ht_name` (utf8) data instead.  My guess 
is, the simplest solution is calling `type_set_name`, even if that runs some 
unnecessary checks.

I understand that the FromSpec API is mostly designed to replace the static 
type definition rather than extend it, but it seems an unintentional/strange 
requirement.

--
components: Interpreter Core
messages: 402804
nosy: seberg
priority: normal
severity: normal
status: open
title: `PyType_FromSpec` does not copy the name
type: crash
versions: Python 3.11

Python tracker <https://bugs.python.org/issue45315>



[issue15870] PyType_FromSpec should take metaclass as an argument

2021-09-28 Thread Sebastian Berg

Sebastian Berg  added the comment:

Just to note that there are two somewhat distinct issues here, in my 
opinion:

1. `FromSpec` does not scan `bases` for the correct metaclass, which it could; 
this could even be considered a bug?
2. You cannot pass in a desired metaclass, which may require a new API function.

My patch tries to address the first (the class creator has to take care that 
this is reasonable for the metaclass).  I had hoped the `slot` mechanism could 
avoid the API discussion for the second one, but I guess not.


On the discussion on `tp_type/meta` being incorrect usage:
I am slightly surprised we actually care about static C-definitions?

There is no reason that a `spec` should be declared static (aside that you have 
to move it into the function otherwise)?  Everything is copied by `_FromSpec` 
after all.
However, I suppose that would replace a safe-by-design API with a "best 
practice" to never define the spec/slots statically (a best practice that is 
probably not generally followed or even advertised currently, I guess).

--

Python tracker <https://bugs.python.org/issue15870>



[issue15870] PyType_FromSpec should take metaclass as an argument

2021-09-23 Thread Sebastian Berg


Sebastian Berg  added the comment:

I can make a PR from the patch (and add the `Py_tp_metaclass` slot if desired) 
with a basic test here, if that is what is blocking things.

Fixing the type and size of the allocation (as the patch does) would allow me 
to give people a way to create a new NumPy DType dynamically.  I only need the 
user to call my initialization function as soon as the type was created (with 
`PyType_FromSpec` or otherwise).
(And I can avoid any internal acrobatics creating the type for the user; this 
stuff tends to be super confusing even if the principles are fairly 
straightforward...)

Happy to pursue other avenues, but I am not clear which...


> IIRC, mixing function pointers and data pointers doesn't work on some 
> platforms?

... I guess it is too late to do some weird thing like this (not sure it would 
be reasonable or even valid anyway, though):
typedef union {
    void *pdata;
    void (*pfunc)(void);
} slot_value;

I am a bit interested in it, because I want to use a `FromSpec` API in NumPy 
and it would be nice to be sure I can grow it to include data without too much 
hassle.  But the easier thing may just be to add one or two `void *reserved` 
slots to the spec struct that must be NULL for now...

--

Python tracker <https://bugs.python.org/issue15870>



[issue15870] PyType_FromSpec should take metaclass as an argument

2021-09-17 Thread Sebastian Berg


Sebastian Berg  added the comment:

I am still fighting with this (and the issues surrounding it) for NumPy.  The 
main point is that my new DTypes in NumPy are metaclasses that extend the 
(heap)type struct.
That just feels right and matches the structure perfectly, but seems to get 
awkward when you want (users) to dynamically create new MetaClass instances.


It also works fine: I could allow creating new instances from Python (although 
I don't need it now) and allow users to create static types (for this I have to 
fix the size of the additional fields for a stable ABI, but that is fine, I can 
allocate a new opaque slot).

But cython or heaptypes (stable API) seem not accessible without annoying 
hacks...


For this concrete issue, would it be acceptable to use the base classes to 
correctly find the metaclass and use its alloc?
I personally don't see why a new signature would be needed even if the 
metaclass was to be passed (I do not need this), it seems like you could pass a 
`Py_tp_meta` slot and not add a new function?


I have attached a potential patch (it is a bit large because it needs to move 
the `bases` discovery code to before the `res` allocation).


Doing this would allow providing a `FromSpec` function which internally calls 
`PyType_FromSpec`.
That may not make things much neater for cython, but it feels like this is 
really just a bug fix?  Unless you want to consider any extension of the type 
struct an unofficial hack to begin with :).


(Python metaclasses won't get their `__new__` called, but that is already the 
case, presumably a metaclass author will provide their own `InitFromSpec` 
function that must be called immediately after type creation, or just create 
the type completely.  I can do the second even now probably, but this tiny 
change would make it a lot cleaner.)


Trying more desperate angles, I was even wondering today if I should instead 
petition for Python to "donate" a `tp_metaslot` type slot...  A single `void *` 
unused by `PyType_Type` but available to metatypes to use as they wish.

While a bit strange, that might even simplify some ABI concerns or cython 
compatibility ;).

--
nosy: +seberg
Added file: https://bugs.python.org/file50285/typeobject.patch

Python tracker <https://bugs.python.org/issue15870>



[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow

2021-08-04 Thread Sebastian Berg


Sebastian Berg  added the comment:

Thanks for looking into it!

`cpow` is indeed complicated.  We had some discussion in NumPy about `0**y` 
branch cuts (I did not yet finish it, because thinking about the branch cuts 
is tricky).

It is almost reassuring that the C standard also hedges here :).  Although it 
means that we cannot rely on math libraries if we want to treat the special 
cases more carefully :(.

--

Python tracker <https://bugs.python.org/issue44698>



[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow

2021-08-04 Thread Sebastian Berg


Sebastian Berg  added the comment:

(Sorry for the spam.  I think we can/should just hard-code the expected values 
in the NumPy test-suite.  So this is not actually an issue for NumPy and 
probably just warrants a double-check that the behaviour change is desirable.)

--

Python tracker <https://bugs.python.org/issue44698>



[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow

2021-08-04 Thread Sebastian Berg


Sebastian Berg  added the comment:

Hmm, sorry, I overshot/misread :(.

The thing that the NumPy test-suite trips over is that:

c_powi(inf+0j, 3)

seems to not raise, but:

_Py_c_pow(inf+0j, 3.+0j)

(or nan+0.j rather than inf+0j)

does seem to raise (returning `nan+nanj` in both cases).  If this is the `inf` 
code path, raising an error may actually be the fix, and the integer code path 
should maybe also do it.

--

Python tracker <https://bugs.python.org/issue44698>



[issue44698] Undefined behaviour in Objects/complexobject.c's complex_pow

2021-08-04 Thread Sebastian Berg


Sebastian Berg  added the comment:

The fix broke NumPy (see also https://github.com/numpy/numpy/pull/19612)

It seems incorrect.  After all, it doesn't matter much whether the float can be 
converted to an integer correctly (or even if it returns an undefined value), 
as long as `int_value == float_value` remains sensible.

The old code cast integers to complex when they were out of range (which is 
fine); the new code raises an error instead.
--
nosy: +seberg

Python tracker <https://bugs.python.org/issue44698>



[issue43608] `bytes_concat` and Buffer cleanup

2021-03-23 Thread Sebastian Berg


New submission from Sebastian Berg :

`bytes_concat` currently uses the following code to get the data:

va.len = -1;
vb.len = -1;
if (PyObject_GetBuffer(a, &va, PyBUF_SIMPLE) != 0 ||
    PyObject_GetBuffer(b, &vb, PyBUF_SIMPLE) != 0) {
    PyErr_Format(PyExc_TypeError, "can't concat %.100s to %.100s",
                 Py_TYPE(b)->tp_name, Py_TYPE(a)->tp_name);
    goto done;
}

I don't actually know if it is realistically possible to trigger issues here 
(I ended up here by chasing the wrong thing).  But this (and the identical 
code in `bytearray`) strictly relies on `view->len` not being modified on 
error (or else it may not clean up `va`)!

That seems wrong to me?  Although, I now saw that `PyBuffer_GetBuffer` says:

If the exporter cannot provide a buffer of the exact type, it MUST raise 
PyExc_BufferError, set view->obj to NULL and return -1.

Pretty much all code in NumPy (and CPython, as far as I can tell) will 
guarantee that `obj` (and probably `len`) is untouched on error, but it will 
not set it to NULL!

I can see some wisdom in NULL'ing `view->obj`, since it means the caller can 
call `PyBuffer_Release` unconditionally (but then we have to explicitly do 
that!).  But it seems to me the realistic thing is to say that a caller must 
never release an unexported buffer and make no assumptions about its content?

(Which doesn't mean I won't ensure NumPy will keep `len` and `obj` unmodified 
or NULL `obj` on error.)

--
components: C API
messages: 389428
nosy: seberg
priority: normal
severity: normal
status: open
title: `bytes_concat` and Buffer cleanup
versions: Python 3.10, Python 3.9

Python tracker <https://bugs.python.org/issue43608>



[issue40522] [subinterpreters] Get the current Python interpreter state from Thread Local Storage (autoTSSkey)

2020-11-05 Thread Sebastian Berg


Change by Sebastian Berg :


--
nosy: +seberg

Python tracker <https://bugs.python.org/issue40522>



[issue15751] [subinterpreters] Make the PyGILState API compatible with subinterpreters

2020-08-10 Thread Sebastian Berg


Sebastian Berg  added the comment:

In NumPy ufuncs and datatype casting (iteration) we have the following setup:

   user-code (releasing GIL) -> NumPy API -> 3rd-party C-function 

(in the ufunc code, numpy is the one releasing the GIL, although others, such 
as numba, probably hook in and replace that.)
Now, I want the 3rd-party C-function to be able to report errors in a 
reasonable way.  Which, in my opinion, means it must be able to grab the GIL, 
set an error, potentially release the GIL again, and return an error result.
In theory, setting the error could be deferred to some later "cleanup" when 
the GIL is held, but that just does not seem practical to me.
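
Concretely, the pattern I mean is the usual `PyGILState` dance (a minimal
sketch; `report_error_from_callback` is a made-up name, and this is exactly the
API whose behaviour is at issue for subinterpreters):

```c
#include <Python.h>

/* Called while the GIL is released (e.g. from a ufunc inner-loop);
 * grabs the GIL just long enough to set the error. */
static int
report_error_from_callback(const char *msg)
{
    PyGILState_STATE gstate = PyGILState_Ensure();
    PyErr_SetString(PyExc_RuntimeError, msg);
    PyGILState_Release(gstate);
    return -1;  /* error code travels back up the C call chain */
}
```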

One thing which may make this less problematic is that all of this can be 
expected to run within a Python-created thread, so I somewhat think the 
initial proposal here would effectively fix the current NumPy usage (I am not 
sure).


The reason I ask/post this is, that this is all part of public-API which 
requires a complete re-implementation very soon. While extensions to that new 
API may be possible, that is always a bit painful, so it would be good to know 
how that API should be designed right now.

At this point, it seems that I should design the 3rd-party C-function API so 
that it is passed a `void *reserved = NULL` (always NULL) to be able to pass a 
`PyThreadState *` or `PyInterpreterState *` at some point safely. (Or a 
`PyThreadState **`/ `PyInterpreterState **`?)

In the majority of cases, I could already pass this right now, but I have 
currently no clear idea of what is the best practice. I also need to take such 
an argument (requiring new API) in at least one place. If we know how this will 
pan out, adding it sooner rather than later would be good, since it will make 
future transition/adoption faster or at least easier.

As far as I understand, `PyGILState_Ensure()` not working is probably the 
single most blocking issue for correct subinterpreter support in NumPy right 
now.  Since this is baked into public API, true support in the above way may 
take years, but we may be able to go a large part of the way now.  My problem 
is that I can only do that if I am clear on what is needed.

--
nosy: +seberg

Python tracker <https://bugs.python.org/issue15751>



[issue41037] Add (optional) threadstate to: PyOS_InterruptOccurred()

2020-06-19 Thread Sebastian Berg


New submission from Sebastian Berg :

In https://bugs.python.org/issue40826 it was defined that 
`PyOS_InterruptOccurred()` can only be called with the GIL held.  NumPy had a 
few places with very unsafe sigint handling (not thread-safe).  But generally, 
when we are in a situation where catching sigints would be a nice enhancement, 
we do of course not hold the GIL.

So I am wondering if we can find some kind of thread-safe solution, or even 
just a recipe.  Briefly looking at the code, it seemed to me that adding a new 
function whose signature takes an additional `tstate`:

PyOS_InterruptOccurred(PyThreadState *tstate)

could work, and be a simple way to allow this in the future (a hypothetical 
usage sketch follows below)?  It is probably not high priority for us (NumPy 
had only one place where it tried to stop long-running computations).
But right now I am unsure the function has much use if it requires the GIL to 
be held, and a `tstate=NULL` argument seemed reasonable (if it works), except 
that it adds a C-API name.
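
For illustration, hypothetical usage of the proposed signature (this variant
does not exist today, and `square_all` is made up): a GIL-released loop that
occasionally checks for SIGINT and bails out so the caller can raise
KeyboardInterrupt.

```c
/* Hypothetical: assumes a tstate-taking PyOS_InterruptOccurred existed
 * and were callable without holding the GIL. */
static int
square_all(PyThreadState *tstate, double *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        data[i] *= data[i];
        if (i % 10000 == 0 && PyOS_InterruptOccurred(tstate)) {
            return -1;  /* interrupted */
        }
    }
    return 0;
}
```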

--
components: C API
messages: 371885
nosy: seberg, vstinner
priority: normal
severity: normal
status: open
title: Add (optional) threadstate to: PyOS_InterruptOccurred()
type: enhancement
versions: Python 3.10

Python tracker <https://bugs.python.org/issue41037>



[issue39471] Meaning and clarification of PyBuffer_Release()

2020-04-22 Thread Sebastian Berg


Sebastian Berg  added the comment:

Ok, I will just close it.  It is painfully clear that e.g. `mmap` uses it this 
way to prohibit closing, and also `memoryview` has all the machinery necessary 
to count how many exports, etc., exist.

I admit, this still rubs me the wrong way, and I think it means that we may 
need to check `bf_releasebuffer != NULL` also in NumPy.

We still have the issue of not being able to use `releasebuffer` easily in 
NumPy. But there is nothing to be done, unless we can consider limiting the 
`"#s"`, etc. type argparsing, which is likely not an option.

--
stage:  -> resolved
status: open -> closed

Python tracker <https://bugs.python.org/issue39471>



[issue39471] Meaning and clarification of PyBuffer_Release()

2020-01-27 Thread Sebastian Berg


Sebastian Berg  added the comment:

I went through Python: `array` seems to not break the logic; pickling has a 
comment which specifically wants to run into the argument-parsing corner case 
above (I am not sure that it is really important).  However, 
`Modules/_testbuffer.c` (which is just a test) actually does allocate new 
memory for the buffer!

I can be convinced that this is how it is supposed to be, but right now I am 
still not quite convinced.  The main idea is efficient no-copy data sharing.  
It is always possible to create a `to_memoryview()` method if a realloc was 
desired/necessary...

--

Python tracker <https://bugs.python.org/issue39471>



[issue39471] Meaning and clarification of PyBuffer_Release()

2020-01-27 Thread Sebastian Berg


Sebastian Berg  added the comment:

Hmmm, it seems I had missed this chunk of PEP 3118 before:

> Exporters will need to define a bf_releasebuffer function if they can 
> re-allocate their memory, strides, shape, suboffsets, or format variables 
> which they might share through the struct bufferinfo.

This reads like the opposite of what I would like to see, unfortunately; which 
I guess means that the parsing issue should be addressed differently.  
Opinions still welcome, though.

--

Python tracker <https://bugs.python.org/issue39471>



[issue39471] Meaning and clarification of PyBuffer_Release()

2020-01-27 Thread Sebastian Berg


New submission from Sebastian Berg :

The current documentation of ``PyBuffer_Release()`` and the PEP is a bit fuzzy 
about what the function can and cannot do.

When an object exposes the buffer interface, I believe it should always return 
a `view` (in NumPy speak) of its own data, i.e. the data exposed by the object 
is also owned by it directly.

On the other hand the buffer view _itself_ has fields such as `strides`, etc. 
which may need allocating.

In other words, I think `PyBuffer_Release()` should be documented to 
deallocate/invalidate the `Py_buffer`. But, it *must not* invalidate the actual 
memory it points to. If I copy all information out of the `Py_buffer` and then 
free it, the copy must still be valid.

I think this is the intention, but it is not spelled out clearly enough; it is 
also the reason for the behaviour of the "#s", etc. keyword argument parsers 
failing due to the code:

if (pb != NULL && pb->bf_releasebuffer != NULL) {
    *errmsg = "read-only bytes-like object";
    return -1;
}

which in turn currently means NumPy decides to _not_ implement bf_releasebuffer 
at all (leading to very ugly workarounds).

I am happy to make a PR, if we can get to a point where everyone is absolutely 
certain that the above interpretation was always correct, we could clean up a 
lot of code inside NumPy as well!

--
components: C API
messages: 360809
nosy: seberg
priority: normal
severity: normal
status: open
title: Meaning and clarification of PyBuffer_Release()
type: behavior
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue39471>



[issue39274] Conversion from fractions.Fraction to bool

2020-01-15 Thread Sebastian Berg


Change by Sebastian Berg :


--
keywords: +patch
pull_requests: +17412
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/18017

Python tracker <https://bugs.python.org/issue39274>



[issue39347] Use of argument clinic like parsing and `METH_FASTCALL` support in extension modules

2020-01-15 Thread Sebastian Berg


New submission from Sebastian Berg :

This is mainly an information request, so sorry if it's a bit beside the point 
(I do not mind if you just close it).  But it seemed a bit too specific to get 
answers in most places...
In CPython you use Argument Clinic, which supports `METH_FASTCALL`; that seems 
pretty awesome.

For extension modules, I am not sure that Argument Clinic is a straightforward 
choice, since it probably generates code specific to a single Python version, 
while we need to support multiple versions (including versions that do not 
support `METH_FASTCALL`).

So the question would be if there is some idea for providing such C-API, or for 
example exposing `_PyArg_UnpackKeywords` (which is at the heart of kwarg 
parsing).

My use-case is that some NumPy functions do get a nice speedup when using 
`METH_FASTCALL` and Argument Clinic style faster arg-parsing.  Which is why, 
looking at these things, I practically reimplemented a slightly dumbed-down 
version of `PyArg_ParseTupleAndKeywords` working much like Argument Clinic 
(except less smart, only using macros, and with no code generation).

That seems viable, but also feels a bit wrong, so I am curious if there may be 
a better solution or whether it would be plausible to expose 
`_PyArg_UnpackKeywords` to reduce code duplication. (although I suppose due to 
old python version support that would effectively take years)

For completeness, my code in question is here: 
https://github.com/numpy/numpy/pull/15269 with the actual usage pattern being:

static PyObject *my_method(PyObject *self, NPY_ARGUMENTS_DEF)
{
    NPY_PREPARE_ARGPARSER;
    PyObject *argument1;
    int argument2 = -1;
    if (!npy_parse_arguments("method", 1, -1, NPY_ARGUMENTS_PASS,
            "argument1", NULL, &argument1,
            "argument2", &PyArray_PythonPyIntFromInt, &argument2,
            NULL, NULL, NULL)) {
        return NULL;
    }
}

--
components: Argument Clinic
messages: 360073
nosy: larry, seberg
priority: normal
severity: normal
status: open
title: Use of argument clinic like parsing and `METH_FASTCALL` support in 
extension modules
type: enhancement

Python tracker <https://bugs.python.org/issue39347>



[issue39274] Conversion from fractions.Fraction to bool

2020-01-09 Thread Sebastian Berg


Sebastian Berg  added the comment:

Thanks for the quick responses.

@Victor Stinner, I suppose you could change `numbers.Complex.__bool__()` by 
adding the no-op bool to make it: `bool(self != 0)`.

But I am not sure I feel it is necessary.  NumPy is a bit strange in that it 
uses its own boolean scalar, and it is easy to override the slot for such 
objects.

--
nosy: +seberg

Python tracker <https://bugs.python.org/issue39274>



[issue39028] ENH: Fix performance issue in keyword extraction

2019-12-12 Thread Sebastian Berg


Sebastian Berg  added the comment:

Fair enough; we had code that does it the other way, so it seemed "too 
obvious", since the current check seems mainly useful with few kwargs.  
However, a single kwarg is super common in Python, while many seem super rare 
(in any Argument Clinic function).

Which is why I even had trouble finding a function where it could make a 
difference, since it needs many kwargs.

With changes:

sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
100 loops, best of 5: 205 nsec per loop
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
100 loops, best of 5: 207 nsec per loop

On master:

sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
100 loops, best of 5: 221 nsec per loop
sebastian@seberg-x1 ~/python-dev/bin
 % ./python3 -m timeit -s 'i = 4' 'i.to_bytes(length=5, byteorder="big", signed=True)'
100 loops, best of 5: 218 nsec per loop

Which, at close to 5%, is barely noticeable; I suppose with more kwargs it 
could start to make a difference.  With fewer than 3, it just does not matter, 
I guess.  Also, I am currently not sure how likely non-interned strings are to 
actually happen, but I am not sure it matters to optimize for them (unless 
PyArg_From* functions use them).

In any case, sorry if this was noise; happy to fix things up, or just drop it 
if many kwargs are considered non-existent.

--

Python tracker <https://bugs.python.org/issue39028>



[issue39028] ENH: Fix performance issue in keyword extraction

2019-12-11 Thread Sebastian Berg


New submission from Sebastian Berg :

The keyword argument extraction/finding function seems to have a performance 
bug / possible enhancement (unless I am missing something here).  It reads:

```
for (i=0; i < nkwargs; i++) {
    PyObject *kwname = PyTuple_GET_ITEM(kwnames, i);

    /* ptr==ptr should match in most cases since keyword keys
       should be interned strings */
    if (kwname == key) {
        return kwstack[i];
    }
    assert(PyUnicode_Check(kwname));
    if (_PyUnicode_EQ(kwname, key)) {
        return kwstack[i];
    }
}
```

However, it should be split into two separate for loops, using the 
`_PyUnicode_EQ` check only if the pointer comparison failed for _all_ 
arguments (a sketch of the split follows below).
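
A sketch of the split I mean, mirroring the fragment above (wrapped in a
made-up helper for readability):

```c
#include <Python.h>
#include <assert.h>

static PyObject *
find_keyword(PyObject *kwnames, PyObject *const *kwstack, PyObject *key)
{
    Py_ssize_t i, nkwargs = PyTuple_GET_SIZE(kwnames);

    /* first pass: pointer comparison only, which matches in most cases
     * since keyword keys should be interned strings */
    for (i = 0; i < nkwargs; i++) {
        if (PyTuple_GET_ITEM(kwnames, i) == key) {
            return kwstack[i];
        }
    }
    /* second pass: full unicode comparison, only reached when no
     * pointer matched any of the keyword names */
    for (i = 0; i < nkwargs; i++) {
        PyObject *kwname = PyTuple_GET_ITEM(kwnames, i);
        assert(PyUnicode_Check(kwname));
        if (_PyUnicode_EQ(kwname, key)) {
            return kwstack[i];
        }
    }
    return NULL;
}
```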

I will open a PR for this (it seemed like a bpo number is wanted for almost 
everything).

--
components: C API
messages: 358287
nosy: seberg
priority: normal
severity: normal
status: open
title: ENH: Fix performance issue in keyword extraction
versions: Python 3.8, Python 3.9

Python tracker <https://bugs.python.org/issue39028>



[issue39028] ENH: Fix performance issue in keyword extraction

2019-12-11 Thread Sebastian Berg


Change by Sebastian Berg :


--
keywords: +patch
pull_requests: +17049
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17576

Python tracker <https://bugs.python.org/issue39028>



[issue37980] regression when passing numpy bools to sorted(..., reverse=r)

2019-08-30 Thread Sebastian Berg


Sebastian Berg  added the comment:

I applaud the stricter rules in general; as Mark noted nicely, the issue is 
that `__index__` is maybe a strange way to achieve that for bools (it is not 
like `123` is a clean bool)?  `__nonzero__` coerces to bools; there is no 
`__bool__` to convert to bool safely.

Basically: this seems to force numpy to back down from saying that 
`list[np.True_]` will be invalid in the future.  (And we cannot just get rid 
of our bools, unfortunately.)

--

Python tracker <https://bugs.python.org/issue37980>



[issue37980] regression when passing numpy bools to sorted(..., reverse=r)

2019-08-29 Thread Sebastian Berg


Change by Sebastian Berg :


--
nosy: +seberg

Python tracker <https://bugs.python.org/issue37980>



[issue26568] Add a new warnings.showwarnmsg() function taking a warnings.WarningMessage object

2016-10-19 Thread Sebastian Berg

Sebastian Berg added the comment:

To make warning testing saner, in numpy we added basically my own version of 
catch_warnings on steroids, which needed/will need changing because of this.

Unless I missed it somewhere, this change should maybe be mentioned in the 
release notes, to make it a bit easier to track down what is going on.

--
nosy: +seberg

Python tracker <http://bugs.python.org/issue26568>



[issue23352] PyBuffer_IsContiguous() sometimes returns wrong result if suboffsets != NULL

2015-02-01 Thread Sebastian Berg

Sebastian Berg added the comment:

I do not have an opinion either way, but I do think there is a small difference 
here compared to the other issue. With the other issue, there are cases where 
we cannot set the strides correctly.

If you ask numpy directly whether the array is contiguous (or create it from 
numpy) and it says "yes", and you then fetch the buffer without asking for a 
contiguous array, it is, without the change, possible to get an array that 
would *not* be considered contiguous.

On the other hand, setting suboffsets to NULL when all are -1 is always 
possible, if tedious, I admit.

--

Python tracker <http://bugs.python.org/issue23352>



[issue23352] PyBuffer_IsContiguous() sometimes returns wrong result if suboffsets != NULL

2015-01-30 Thread Sebastian Berg

Sebastian Berg added the comment:

Numpy does not understand suboffsets.  The buffers we create will always have 
them NULL.  The other way around?  To be honest, I think numpy is probably 
ignoring the whole fact that they might exist at all :/; that really needs to 
be fixed if it is the case.

--

Python tracker <http://bugs.python.org/issue23352>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-10-20 Thread Sebastian Berg

Sebastian Berg added the comment:

Antoine, sounds good to me; I don't mind this being in python rather sooner 
than later, for NumPy itself it does not matter, I think.  I just wanted to 
warn that there were problems when we first tried to switch in NumPy, which, 
if I remember correctly, was maybe 2 years ago now (in a dev version).

--

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-10-15 Thread Sebastian Berg

Sebastian Berg added the comment:

@pitrou, yes of course.  This would make python do the same thing as numpy 
does (currently only with that compile flag given).
About the time schedule, I think I will try to see if some other numpy dev has 
an opinion.  Plus, I should look into documenting it for the moment, so that 
someone who reads up on the buffer protocol gets things right.

--

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-10-02 Thread Sebastian Berg

Sebastian Berg added the comment:

Numpy 1.9 was only released recently, so 1.10 might be a while.  If no
problems show up during release or until then, we will likely switch it
by then.  But that could end up being a year from now, so I am not sure
if 3.6 might not fit better.  The problems should be mostly mitigated on
our side, so bug-wise it shouldn't be a big issue, I would guess.

I will try to look at it more soon, but am completely overloaded at least
for the next few days; maybe some other numpy devs can chip in.  Not
sure I get your last point: slicing should give the "organic" values
even for the mangled-up thing with relaxed strides on (currently)?!

--

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-09-22 Thread Sebastian Berg

Sebastian Berg added the comment:

Yeah, the code does much the same as the old numpy code (at least most of the 
same funny little things, though I seem to remember the old numpy code had 
something yet a bit weirder; I would have to check).

To be honest, I do not know.  It isn't implausible that the original numpy 
code dates back 15 years or more to Numeric.  I doubt whoever originally wrote 
it thought much about it, but there may be some good reason, and there is the 
safety consideration that people use the strides in a way they should not, 
which may trip us here in any case.

--

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-09-21 Thread Sebastian Berg

Changes by Sebastian Berg :


Added file: http://bugs.python.org/file36680/contiguous.py

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-09-21 Thread Sebastian Berg

Changes by Sebastian Berg :


Added file: http://bugs.python.org/file36678/contiguous.py

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-09-21 Thread Sebastian Berg

Changes by Sebastian Berg :


Added file: http://bugs.python.org/file36677/relaxed-strides-checking.patch

Python tracker <http://bugs.python.org/issue22445>



[issue22445] Memoryviews require more strict contiguous checks then necessary

2014-09-21 Thread Sebastian Berg

Sebastian Berg added the comment:

I am very sorry.  The attached patch fixes this (not sure if it is quite 
right, but if anything it should be more general than necessary).  One test 
fails, but it looks like exactly the intended change.

--
Added file: http://bugs.python.org/file36676/relaxed-strides-checking.patch




[issue22445] Memoryviews require more strict contiguous checks than necessary

2014-09-20 Thread Sebastian Berg

Sebastian Berg added the comment:

An extra dimension is certainly not irrelevant! The strides *are* valid,
and NumPy currently quite commonly creates such arrays when slicing.
The question is whether or not we want to ignore them for contiguity
checks even when they have no effect on the memory layout.

So there are three options I currently see:

1. Python also generalizes the way I would like NumPy to end up in the
future (the current patch should do that) and just does not care about
such strides, because the actual memory layout is what matters.
2. We say it is either too dangerous (which may very well be) or that
you want to preserve Fortran/C-order information even when it does not
matter to the memory layout.

The second option then splits into:
2a) We just keep it as it is and live with minor inconsistencies (or
never do the relaxed strides in NumPy).
2b) We let these buffers return False when checking for contiguity, but
*allow* fetching a buffer when C-/F-contiguity is explicitly asked for
when getting the buffer. That is a weird middle way, but it might
actually be a pretty sane solution (I have to think about it).

--




[issue22445] Memoryviews require more strict contiguous checks than necessary

2014-09-19 Thread Sebastian Berg

Sebastian Berg added the comment:

Well, the 9223372036854775807 is certainly no good for production code, and we
would never have it in a release version; it is just there currently to expose
whether there are more problems. However, I don't care what happens on overflow
(as long as it is not an error).

Note that the stride here is on a dimension with shape 1. The only valid index
is thus always 0, and 0*9223372036854775807=0, so the stride value does not
actually matter when calculating offsets into the array. You could simply set
it to 80 to get something that would be considered C-contiguous, or to 8 to get
something that is considered F-contiguous. But both are the case in a way, so
just "cleaning up" the strides does not actually get you all the way.

--




[issue22445] Memoryviews require more strict contiguous checks than necessary

2014-09-19 Thread Sebastian Berg

Sebastian Berg added the comment:

To be clear, the important part here is that, to me, all elements *can* be
accessed using that scheme. It is not correct to assume that `strides[-1]` or
`strides[0]` is actually equal to `itemsize`.

In other words, you have to be able to pass the pointer to the start of a
C-contiguous array into some C library that knows nothing about strides,
without any further thinking. The 0-strides badly violate that.
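
To spell out the invariant such a library relies on, a small sketch
(illustrative only):

import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)  # C-contiguous

# For a truly C-contiguous buffer, the strided offset of element (i, j)
# must equal its offset in the flat C layout -- that is exactly what a
# C library that knows nothing about strides will assume:
i, j = 2, 1
strided_offset = i * a.strides[0] + j * a.strides[1]
flat_offset = (i * a.shape[1] + j) * a.itemsize
assert strided_offset == flat_offset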

--




[issue22445] Memoryviews require more strict contiguous checks than necessary

2014-09-19 Thread Sebastian Berg

Sebastian Berg added the comment:

#12845 should be closed; it seems like a bug in some old version. The definition
now is simply that the array is contiguous if you can legally access it in a
contiguous fashion, which means the first stride is itemsize and the second is
itemsize*shape[0] for Fortran, inverted for C order.
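
For reference, a small sketch of that definition for the C-order case
(the Fortran case is the mirror image; illustrative only):

import numpy as np

def c_contiguous_strides(shape, itemsize):
    # C order: the last axis varies fastest.
    strides = [itemsize] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)

a = np.empty((3, 4, 5), dtype=np.float64)
assert a.strides == c_contiguous_strides(a.shape, a.itemsize)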

--




[issue22445] Memoryviews require more strict contiguous checks than necessary

2014-09-19 Thread Sebastian Berg

New submission from Sebastian Berg:

In NumPy we decided some time ago that if you have a multi-dimensional buffer,
shaped for example 1x10, then this buffer should be considered both C- and
F-contiguous. Currently, some buffers which can validly be used in a contiguous
fashion are rejected.
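
For illustration, a minimal sketch of the check in question (what it prints
depends on the CPython and NumPy versions in use, and on whether relaxed
strides checking is enabled):

import numpy as np

a = np.arange(10).reshape(1, 10)  # shape (1, 10), one contiguous block
m = memoryview(a)
# With the strict checks, the stride on the length-1 axis can make one
# of these report False even though the memory layout is contiguous:
print(m.c_contiguous, m.f_contiguous)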

CPython currently does not support this, which may create a minor nuisance
if/once we change it fully in NumPy; see for example
https://github.com/numpy/numpy/issues/5085

I have attached a patch which should (sorry, I did not test this at all yet)
relax the checks as much as possible. I think this is right, but we caused some
subtle breakage in user code (mostly Cython code) when we first tried changing
it, and while NumPy arrays may be more prominently C/F-contiguous,
compatibility issues with libraries that check for contiguity explicitly and
then request a strided buffer are very possible.

If someone could give me a hint about adding tests, that would be great.
Also, I would like to add a small note to the PEP in any case regarding this
subtlety, in the hope that more code will take care of such subtleties.

--
components: Library (Lib)
files: relaxed-strides-checking.patch
keywords: patch
messages: 227113
nosy: seberg
priority: normal
severity: normal
status: open
title: Memoryviews require more strict contiguous checks than necessary
type: enhancement
Added file: http://bugs.python.org/file36663/relaxed-strides-checking.patch




[issue7279] decimal.py: == and != comparisons involving NaNs

2013-02-13 Thread Sebastian Berg

Sebastian Berg added the comment:

Thanks, yes, you are right; I should have googled a bit more anyway. Though I
did not find much on the hashable-vs-unhashable question itself, so if I ever
stumble across it again, I will write a mail...

--




[issue7279] decimal.py: == and != comparisons involving NaNs

2013-02-13 Thread Sebastian Berg

Sebastian Berg added the comment:

This is closed, and maybe I am missing something. But from a general point of
view, why does hashing of NaN not raise an error as it did for decimals, i.e.
why was this not resolved exactly the other way around? I am mostly just
wondering about this; it is not a problem for me.

Hashing NaNs seems dangerous and surprising because it might work in
dicts/sets, but normally doesn't. (The only way for it to work right would be
if NaN were a singleton, but that is impossible for subclasses, etc.).

The thing is:

In [16]: s = {float('nan'): 1, float('nan'): 2, float('nan'): 3}
In [17]: s
Out[17]: {nan: 1, nan: 2, nan: 3}

In [18]: s[float('nan')]
KeyError: nan

In [19]: n = float('nan')
In [20]: s = {n: 1, n: 2, n: 3}
In [21]: s
Out[21]: {nan: 3}

This is because `n is n`, and PyObject_RichCompareBool assumes that if `a is b`
then `a == b`, which is simply wrong for NaNs and also makes comparisons of
iterables containing NaNs an impossible business. NaNs have their unavoidable
weirdness, but at least for dictionaries/sets it would seem clearer to me if
they raised an error.
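
A small sketch of the identity shortcut in action:

n = float('nan')
print(n == n)      # False -- IEEE 754 comparison semantics
print([n] == [n])  # True  -- the list comparison short-circuits on identity
print(n in [n])    # True  -- the containment check does the same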

--
nosy: +seberg




[issue16845] warnings.simplefilter should validate input

2013-01-02 Thread Sebastian Berg

New submission from Sebastian Berg:

`warnings.simplefilter` does not validate that the category passed in is
actually a class. This means that an invalid category leads to a `TypeError`
(from the failing `issubclass` check) whenever a warning would otherwise be
issued.

It is a very small thing, but for usability it would be clearer if this were
checked right away.
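
A minimal sketch of the behaviour as reported (against the Python version of
the time):

import warnings

# Accepted without complaint, although the category is not a class:
warnings.simplefilter('error', category="not a class")

# The failure only surfaces later, when a warning is actually issued and
# the filter matching calls issubclass() with the bogus category:
warnings.warn("boom")  # TypeError from the issubclass() check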

--
messages: 178867
nosy: seberg
priority: normal
severity: normal
status: open
title: warnings.simplefilter should validate input
type: enhancement
