[lxml] Re: Broken EXSLT link in docs

2024-09-29 Thread Stefan Behnel via lxml - The Python XML Toolkit

Hi Jens,

Jens Tröger via lxml - The Python XML Toolkit wrote on 28.09.24 at 09:45:

I think the EXSLT link here:

   https://lxml.de/xpathxslt.html#regular-expressions-in-xpath

or source here:

   
https://github.com/lxml/lxml/blob/9818374770aedc96f8f1e77943f45dea8e7fb4a8/doc/xpathxslt.txt#L319

should change from http://www.exslt.org/ to https://exslt.github.io/ or
some other valid URL.


Thanks, fixed.

Stefan

___
lxml - The Python XML Toolkit mailing list -- lxml@python.org
To unsubscribe send an email to lxml-le...@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/
Member address: arch...@mail-archive.com


[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6

2024-05-16 Thread Stefan Behnel

James Belchamber wrote on 15.05.24 at 22:37:

Would you be able to do the same thing for aarch64?


manylinux1 never supported aarch64:

https://github.com/pypa/manylinux?tab=readme-ov-file#manylinux1-centos-5-based---eol

Stefan



[lxml] nested CDATA - was Re: Building on Windows

2024-05-02 Thread Stefan Behnel

Hi,

Gertjan Klein wrote on 02.05.24 at 17:52:

On 25-04-2024 at 16:58, Stefan Behnel wrote:
I'm trying to write a conversion program that 
outputs XML[2]. It must match the output of an existing program. 
Semantically it already does, but I'd like it to match the way CDATA is 
handled. To this end, I'd like to allow "wrapped" CDATA. The CDATA class 
currently disallows this: it checks for the presence of ']]>', and raises 
if found.


The exception probably comes from a time where libxml2 didn't handle this 
itself.


I added a parameter to turn off this check. I expected to need to do the 
escaping myself, but it seems lxml handles this just fine out of the box. 
For example, this tester code:


from lxml import etree
from lxml.etree import CDATA
def main():
     root = etree.Element("dummy")
     txt = ''
     root.text = CDATA(txt, False)


Such a flag would need to be a keyword-only argument to make this readable. 
It's entirely unclear what the "False" refers to, unless you know the call 
signature by heart.



     out = etree.tostring(root).decode()
     print(out)
if __name__ == '__main__':
     main()

...prints this:




Looks good to me. According to the XML spec (both 1.0 and 1.1), "CDATA 
sections cannot nest":


https://www.w3.org/TR/REC-xml/#sec-cdata-sect

But splitting the CDATA section makes perfect sense. This does not even 
need an option, we can just remove the check and add a test for it.
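
To illustrate the splitting described above, here is a minimal sketch in plain Python. It does not use lxml's CDATA class; split_cdata is a hypothetical helper name, and the standard library parser is only there to confirm that the split sections read back as the original text.

```python
import xml.etree.ElementTree as ET

CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def split_cdata(text):
    """Wrap text in CDATA, splitting the section wherever ']]>' occurs."""
    # Close the section after ']]' and reopen it before '>', so the
    # terminator never appears inside a single CDATA section.
    safe = text.replace(CDATA_END, "]]" + CDATA_END + CDATA_START + ">")
    return CDATA_START + safe + CDATA_END


original = "a <![CDATA[nested]]> b"
xml = "<dummy>" + split_cdata(original) + "</dummy>"
print(xml)

# The parser joins the adjacent CDATA sections back into one text value.
roundtrip = ET.fromstring(xml).text
print(roundtrip == original)
```

The text itself is left unescaped; only the section boundary moves, which is why no option is needed on the caller's side.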


Do you want to propose a PR?

The Python "xml.etree.ElementTree" package can also parse this correctly, 
but escapes this on output since it doesn't support CDATA sections 
directly. Thus, it seems best to add the test in "test_etree.py" rather 
than "test_elementtree.py" since the behaviour of both differ here.


Stefan



[lxml] Re: Building on Windows

2024-04-25 Thread Stefan Behnel

Hi,

Gertjan Klein wrote on 21.04.24 at 16:12:
I'd like to try a tiny change to the CDATA class. In order to try, I have 
to be able to build lxml. Unfortunately, on Windows.


Yeah, supporting Windows is anything but trivial due to the general lack 
of platform provided build support. Thus, all libraries have to do their 
own thing, and bringing that together is not easy. I'm happy myself that 
there is a working build setup at all.


If it's a somewhat straightforward change that doesn't need tons of 
back-and-forth testing and debugging, and you have a github account, you 
could also use their CI service (Github Actions), either on your own 
account or in lxml's account via a pull request.



I've downloaded Visual Studio 2019 CE. I created a (Python 3.12) virtual 
environment, where I installed Cython (latest version). I cloned lxml 
sources from GitHub. I then opened a "Developer command prompt for VS 
2019", activated the virtual environment, and typed:


(.venv) C:\Temp\lxml\lxml>python setup.py build_ext -i --with-cython --static-deps


This downloads the dependencies like libxml2 etc.; this goes without 
problems. Then compilation starts, and gives errors:


[...]
    Creating library build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.lib and object build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.exp

etree.obj : error LNK2001: unresolved external symbol _xmlStrchr
etree.obj : error LNK2001: unresolved external symbol _xmlIOParseDTD
etree.obj : error LNK2001: unresolved external symbol _xmlMemShow
[...]

There are in total 503 unresolved externals. I checked the first one, and 
find that it is present in the downloaded libxml2_a.lib, but without the 
underscore. The directories of the downloaded libraries are correctly added 
to the compiler command line.


It might help to see the command line.

Stefan



[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6

2024-04-04 Thread Stefan Behnel

I've uploaded a simple Py3.6 manylinux1 wheel for x86_64.

https://files.pythonhosted.org/packages/b8/93/768dabd4032e15dc6e7ca6767c132685545b7b0e12549dfa923fd2bd/lxml-5.2.1-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl

Please try it out.

Stefan







[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6

2024-04-03 Thread Stefan Behnel

Hi,

thanks for the report.

Miro Hrončok wrote on 03.04.24 at 15:55:

I've noticed that lxml 5.1+ upgraded the manylinux wheels to a newer tag.


That came from the migration to cibuildwheel and was only partly intended.


The default ensurepip-bundled pip version in Python 3.6 does not support 
newer manylinuxes, hence it is likely that many CI systems that still test 
3.6 now attempt to build lxml from sources. Since 5.2, this also fails with 
the old pip due to the old bundled pytoml, as indicated in a previous 
thread on this list.


$ python3.6 -m venv venv3.6
$ venv3.6/bin/pip list
Package    Version
---------- -------
pip        18.1
setuptools 40.6.2


5.0.2 has a manylinux1 wheel:

$ venv3.6/bin/pip install lxml==5.0.2
... lxml-5.0.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl
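
To make the tag problem concrete, here is a small sketch that splits a wheel filename into its platform tags and checks them against the tags an installer recognises. The tag set attributed to the old pip is an assumption for illustration (manylinux1 only), the helper names are hypothetical, and the second filename is made up to show a wheel that carries only newer tags.

```python
def wheel_platform_tags(filename):
    """Return the platform tags encoded in a wheel filename."""
    stem = filename[: -len(".whl")]
    # The platform tag is the last dash-separated field of the name.
    platform = stem.rsplit("-", 1)[-1]
    # Compressed tag sets are joined with dots.
    return platform.split(".")


def installable(filename, supported):
    """True if at least one platform tag is in the supported set."""
    return any(tag in supported for tag in wheel_platform_tags(filename))


# Assumption: an old pip such as 18.1 only recognises these tags.
OLD_PIP_TAGS = {"manylinux1_x86_64", "linux_x86_64"}

old_wheel = "lxml-5.0.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl"
# Hypothetical filename carrying only newer tags, for comparison.
new_wheel = "example-1.0-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"

print(wheel_platform_tags(old_wheel))
print(installable(old_wheel, OLD_PIP_TAGS))
print(installable(new_wheel, OLD_PIP_TAGS))
```

A wheel that lists the legacy manylinux1 alias next to its newer tag stays installable for such a pip, while a wheel with only newer tags makes it fall back to the source distribution.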


5.1.0 builds from source but uses setup.py and works (with devel deps):

$ venv3.6/bin/pip install lxml==5.1.0
... lxml-5.1.0.tar.gz
   Running setup.py install for lxml ...


5.2.1 builds from source and will outright blow up when parsing 
pyproject.toml:


$ venv3.6/bin/pip install lxml==5.2.1
... lxml-5.1.0.tar.gz
...
pip._vendor.pytoml.core.TomlError: /tmp/.../lxml/pyproject.toml(40, 1): msg


Hmm, right, that's annoying.


If support for Python 3.6 is still desired, would it maybe make sense to 
keep building and uploading manylinux1 wheels to make it easier?


I'll see what I can do.

Stefan



[lxml] Re: 5.2.0 doesn't build

2024-04-01 Thread Stefan Behnel

Hi,
thanks for the report.

da.ve.k.gu...@...com wrote on 01.04.24 at 20:10:

I'm attempting to develop a project that has been operational for a while. The 
project makes use of mixbox which references this library. It seems that 
version 5.2.0 of lxml was released yesterday. Strangely, pytoml is encountering 
an error related to the pyproject.toml file. Can someone investigate this issue?

#21 9.785   Saved /wheels/tox-2.7.0-py2.py3-none-any.whl
12:18:26
#21 9.806 Collecting lxml (from mixbox==1.0.5->-r requirements.txt (line 7))
12:18:26
#21 10.88   Downloading https://.../lxml-5.2.0.tar.gz (3.7MB)


Could you state the platform/architecture that you're running? And which 
Python version? I wonder why it picks up the source distribution instead of 
a ready-made binary wheel. lxml takes a while to build and requires 
external system libraries, so building from source is discouraged for 
"normal" use.




#21 15.22   File "/usr/share/python-wheels/pytoml-0.1.2-py2.py3-none-any.whl/pytoml/parser.py", line 253, in error
12:18:26
#21 15.22 raise TomlError(message, self.pos[0][0], self.pos[0][1], self._filename)
12:18:26
#21 15.22 pytoml.core.TomlError: /tmp/pip-wheel-29w0tw8j/lxml/pyproject.toml(26, 8): expected_equals


This seems to use an old version of pytoml, a library which (apparently) 
has been deprecated in favour of other tools.


https://pypi.org/project/pytoml/

I'd try upgrading your build environment (pip, setuptools, wheel, etc.).

Stefan



[lxml] Re: What replaced xpath.evaluate() ?

2024-02-16 Thread Stefan Behnel

lpsm...@uw.edu wrote on 16.02.24 at 00:38:

I'm maintaining older code, which just broke because lxml took out 
xpath.evaluate().  The only note in the lxml changelog about it says it was 
'redundant', meaning (I assume) that there's a better way to do the same thing, 
but there's no documentation about what that other way might be.

Does anyone know what the new code should be?  The code in question looks like:

    xpath = lxml.etree.XPath(target, namespaces=namespaces)
    root = lxml.etree.Element("root")
    try:
        xpath.evaluate(root)


You can simply call the XPath object. Thus, it's common to write something like

find_config = lxml.etree.XPath("//config[1]")
config_element = find_config(root)

https://lxml.de/xpathxslt.html#the-xpath-class

Stefan



[lxml] Re: Streaming read/write

2024-01-21 Thread Stefan Behnel

Charlie Clark wrote on 19.01.24 at 15:00:

On 18 Jan 2024, at 18:10, Charlie Clark wrote:


Apart from the fact that this currently doesn't work, I imagine that both 
Elements and their children would happily be passed to the write, which could 
lead to an almighty mess. Getting this to work properly, possibly rewritten for 
async to avoid the awfully awful (yield) hack could be a nice addition to the 
documentation.


Thinking about this again, I think a pull parser is probably the way to go as I 
really don't want or need to create elements, it's probably fine if I just make 
the changes to what's coming through and stream the text straight back into 
another file. I'll give that a go.


If you want to avoid creating element objects altogether, and maybe don't 
even need a full (sub-)tree structure to get all relevant information, I 
suggest you try the low-level SAX interface.


https://lxml.de/parsing.html#the-target-parser-interface

It's quite efficient and usable for locally constrained XML 
transformations, e.g. filtering elements or attributes.


And you can still parse input chunk by chunk, if you need that:

https://lxml.de/parsing.html#the-feed-parser-interface
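
The two interfaces linked above can be combined. The sketch below uses the standard library's xml.etree.ElementTree so that it runs without lxml; lxml documents its target parser as following the same start/end/data/close protocol, so passing the same target to lxml.etree.XMLParser is assumed to behave alike. TagCounter is a hypothetical example target.

```python
import xml.etree.ElementTree as ET


class TagCounter:
    """Parser target that counts start tags without building a tree."""

    def __init__(self):
        self.counts = {}

    def start(self, tag, attrib):
        # Called for every opening tag; no element object is kept.
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def end(self, tag):
        pass

    def data(self, data):
        pass

    def close(self):
        # Whatever this returns becomes the result of parser.close().
        return self.counts


parser = ET.XMLParser(target=TagCounter())
# Feed the input chunk by chunk, as it might arrive from a file or socket.
for chunk in ("<root><a/><b>", "text</b><a/></root>"):
    parser.feed(chunk)
counts = parser.close()
print(counts)
```

All state lives in the target object, so the memory used depends on what the callbacks choose to keep rather than on the size of the document.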

Stefan



[lxml] Re: Streaming read/write

2024-01-18 Thread Stefan Behnel

Hi Charlie,

Charlie Clark wrote on 18.01.24 at 12:13:

I was recently wondering about the best way to edit XML documents using
both a streaming reader and writer. I'm sure this is possible using
iterparse and xmlfile but I seem to remember that iterparse produces the
full tree so that parent elements and their children are returned.


You might want to look into the more general XMLPullParser, but yes, both 
that and iterparse() generate a full XML tree in the back. The idea is that 
you actively delete parts of it when you're done with them, but you gain 
easy tree navigation for that. If you need to do somewhat complex and 
non-local tree transformations, the additional tree building and cleanup 
work is a price you might want to pay.
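
As a rough sketch of the "build a tree, then delete what you are done with" approach, the following uses the standard library's XMLPullParser so that it runs without lxml; lxml.etree.XMLPullParser offers the same feed() and read_events() methods. The item/title document layout and the iter_titles name are made up for illustration.

```python
import xml.etree.ElementTree as ET


def iter_titles(chunks):
    """Yield the title of each <item>, clearing items once processed."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if elem.tag == "item":
                # The subtree below <item> is complete at its end event,
                # so normal tree navigation works here.
                yield elem.findtext("title")
                # Drop children and text to keep memory use bounded.
                elem.clear()
    parser.close()


chunks = [
    "<root><item><title>a</ti",
    "tle></item><item><title>b</title></item></root>",
]
titles = list(iter_titles(chunks))
print(titles)
```

The cleared elements stay attached to the root as empty shells; for very large inputs they would also need to be removed from their parent.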


Alternatively, for the parsing side, there's also still SAX (i.e. pass a 
"target" object into the parser). It matches somewhat well with xmlfile(), 
at the cost of requiring separate callback methods and thus, probably, some 
state keeping on your side. But depending on the kind of "editing" that 
you're doing on your XML documents, it might not be too bad.


Basically, lxml can do all the state keeping for you if you let it build a 
tree (but then you have to clean up after yourself to save memory), or you 
choose to do all the state keeping yourself and take the bare parse events, 
and then have full control over the amount of state that you keep. Whatever 
is better for your use case.


Stefan


Re: [Cython] Ready for Cython 3.1 ?

2023-11-06 Thread Stefan Behnel

Stefan Behnel wrote on 05.11.23 at 23:06:
I'd like to ease our feature development by using more modern Python 
features in our code base and by targeting fewer Python versions in Cython 
3.1 compared to the "all things supported" Cython 3.0.


I created a 3.0.x maintenance branch and removed the Py<3.7 test jobs from 
the master branch. That should make the CI response visibly faster.


Happy code cleaning :)

Stefan

___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Ready for Cython 3.1 ?

2023-11-06 Thread Stefan Behnel

Lisandro Dalcin wrote on 06.11.23 at 09:05:

On Mon, 6 Nov 2023 at 01:19, Stefan Behnel wrote:

it looks like Cython 3.0.6 is going to be a "most things fixed" kind of
release for the 3.0.x release series.


I'm having issues using CYTHON_LIMITED_API with some Python versions
(<=3.9).
If you are not in a rush to release 3.0.6, I would like to have some time
to properly investigate what's going on.


I'd rather postpone these things to 3.1. They are not critical for 3.0, and 
as I wrote, I think it's actually helpful for users to target 3.1 rather 
than 3.0.


Stefan



Re: [Cython] Ready for Cython 3.1 ?

2023-11-06 Thread Stefan Behnel

da-woods wrote on 06.11.23 at 08:48:

 > I also consider Cython 3.1 a prime target for better Limited API support.

Yes - but I wouldn't treat complete support as a blocker (I don't think 
this is what you meant though).


It's experimental in 3.0 and I don't expect it to "fully" work in 3.1.


There's a separate question about what we consider the minimum viable 
Limited API version we want to support. I imagine that'll ultimately be 
decided by "what we can make work", but I don't think it'll be less than 
3.4 (when PyType_GetSlot was added). It's probably something to decide later.


That's another thing that moving the support to 3.1 would solve. If we can 
target Py3.7/3.8+ instead of older versions, then also the Limited API will 
be more usable.


Stefan



Re: [Cython] Should we start using the internal CPython APIs?

2023-11-05 Thread Stefan Behnel

da-woods wrote on 04.11.23 at 14:45:

I'm a bit late in replying to this but here are some unordered thoughts.

* I'm fairly relaxed about using `Py_BUILD_CORE` if useful - I think we 
mostly do have good fallback paths for most things so can adapt quickly 
when stuff changes.


I'm not entirely relaxed about it, but I agree that the fallbacks should 
usually make it easy to keep things working also after larger changes in 
CPython.



* CYTHON_USE_CPYTHON_CORE_DETAILS sounds reasonable, but it's yet another 
variation to test.


True.


* I wonder if fixing up the limited API implementation should be higher 
priority than creating a third level between "full" and "limited API".


I think there's potential for all three. Basically modes "aggressively 
fast", "highly compatible" and "version independent". The latter is what 
the Stable ABI together with the Limited API should give you.



* I recall we were planning to ditch c89 as a strict requirement after 3.0? 
Incompatibility with C++ might be more of an issue though.


Yes. C++ is not an issue for CPython, so their internal header files are 
not tested with C++ at all. That's the highest potential for breakage, if 
we accept to generate C99 from Cython 3.1 onwards.


We should make sure that we use "-std=c89" in at least one Cython 3.0 test 
setup, BTW.



* Even so, if there's a good way of turning it off then we could say: "if 
you want strict c89 support then you can't use 
CYTHON_USE_CPYTHON_CORE_DETAILS" and people would always have options.


That could be part of it, yes.



* Waiting and seeing may be a good option for now.


I agree. This still seems best for now, especially given the amount of 
recent changes in the C-API. Let's wait for those to settle down, at least.


Thanks everyone for your opinions and comments!

Stefan



[Cython] Ready for Cython 3.1 ?

2023-11-05 Thread Stefan Behnel

Hi all,

it looks like Cython 3.0.6 is going to be a "most things fixed" kind of 
release for the 3.0.x release series. Given the work that lies ahead of us 
for Cython 3.1, I think we're at a point to get started on that, making the 
future 3.0.x releases stable and "boring".


As a reminder, Cython 3.1 will remove support for Python 2.7 and Python 
3.[567], i.e. all Python versions that are now EOL. Python 3.8 will 
continue to receive security fixes for another year. Python 3.7 is EOL but 
still up for debate since it's probably not hard to support and still 
maintained in some Linux distributions for another couple of years. But I'm 
fine with considering it legacy. We'll probably notice if it gets in the 
way while preparing Cython 3.1, and can leave support in until there's a 
reason to remove it.


https://github.com/cython/cython/issues/2800

I'd like to ease our feature development by using more modern Python 
features in our code base and by targeting fewer Python versions in Cython 
3.1 compared to the "all things supported" Cython 3.0.


I also consider Cython 3.1 a prime target for better Limited API support. 
Users probably won't care both for that and for outdated Python versions at 
the same time. Or, they can use Cython 3.0.x for continued legacy support.


Since Cython 3.1 is mostly about ripping out old code, we can try to keep 
the development cycle short, so that new features don't have to wait that 
long. Certainly not as long as for Cython 3.0…


Is everyone and everything ready to start working on Cython 3.1?

Stefan


Re: [Cython] Should we start using the internal CPython APIs?

2023-10-30 Thread Stefan Behnel

Thank you for your comments so far.

Stefan Behnel wrote on 29.10.23 at 22:06:
I seriously start wondering if we shouldn't just define 
"Py_BUILD_CORE" (or have our own "CYTHON_USE_CPYTHON_CORE_DETAILS" macro 
guard that triggers its #define) and include the internal "pycore_*.h" 
CPython header files from here:


https://github.com/python/cpython/tree/main/Include/internal


I just remembered that there's one major technical issue with this. 
CPython now requires C99 for its own code base (Py3.13 actually uses 
"-std=c11" on my side). While they care about keeping public header files 
compatible with C89 and C++, their internal header files may not always 
have that quality, and won't be tested for it.


So, governance is one argument, but technical reasons can also make this 
appear less appealing overall.


I'll let things settle some more and see in what direction Py3.13 will 
eventually be moving.


Stefan



[Cython] Should we start using the internal CPython APIs?

2023-10-29 Thread Stefan Behnel

Hi all,

given the latest blow against exposing implementation details of CPython in 
their C-API (see https://github.com/cython/cython/pull/5767 for the endless 
story), I seriously start wondering if we shouldn't just define 
"Py_BUILD_CORE" (or have our own "CYTHON_USE_CPYTHON_CORE_DETAILS" macro 
guard that triggers its #define) and include the internal "pycore_*.h" 
CPython header files from here:


https://github.com/python/cpython/tree/main/Include/internal

This would give us greater freedom in accessing all the implementation 
details, so that we could directly integrate with those. We'd obviously 
still need one or more fallback implementations for "stable CPython", 
Limited API, PyPy and friends.


There's a risk, clearly, that these internals change even during point 
releases. Maybe not a big risk, but not impossible either. We'd have to 
deal with that and so would our users.


OTOH, having a single macro switch would make it easy for users to adapt if 
something breaks on their side, and also easy to benchmark if it makes a 
difference for their code.


We could also leave it off by default and simply allow users with high 
performance needs to enable it manually. Or start by leaving it off until a 
new CPython X.Y release has stabilised and its (used-by-us) internals have 
proven not to change, and then switch it on for that release series. In any 
case, having a single switch for this feels like it could be easy to handle.


What do you think?

Stefan


Re: [Cython] Can we remove the FastGIL implementation?

2023-09-20 Thread Stefan Behnel

da-woods wrote on 19.09.23 at 21:38:
I think the detail that was missing is you need to add the `#cython: 
fast_gil = True` to enable it.

[...]
So my conclusion is that from 3.11 onwards Python sped up their own GIL 
handling to about the same as we used to have, and fastgil has turned into 
a pessimization.


I tried the benchmark with the master branch on my side again, this time 
with correct configuration. :)


Turns out that enabling the FastGIL feature makes it much slower for me (on 
Ubuntu Linux 20.04) in both Py3.8 and 3.10:


"""
* Python 3.10 (-DCYTHON_FAST_GIL=0)
Running the test (already held)...
took 1.2482502460479736
Running the test (released)...
took 6.444956541061401
Running the test (already held)...
took 1.2358744144439697
Running the test (released)...
took 6.4064109325408936

* Python 3.10 (-DCYTHON_FAST_GIL=1)
Running the test (already held)...
took 2.243091583251953
Running the test (released)...
took 7.32707667350769
Running the test (already held)...
took 2.4065449237823486
Running the test (released)...
took 7.50264573097229
"""
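
The numbers above can be condensed into one slowdown factor per scenario. The snippet below is only arithmetic on the timings quoted in this message; the dictionary layout and the slowdown name are illustrative.

```python
from statistics import mean

# Timings in seconds, copied from the two runs per configuration above.
# Keys are (scenario, value of CYTHON_FAST_GIL).
timings = {
    ("held", 0): [1.2482502460479736, 1.2358744144439697],
    ("released", 0): [6.444956541061401, 6.4064109325408936],
    ("held", 1): [2.243091583251953, 2.4065449237823486],
    ("released", 1): [7.32707667350769, 7.50264573097229],
}


def slowdown(scenario):
    """Mean time with FastGIL enabled divided by mean time without it."""
    return mean(timings[(scenario, 1)]) / mean(timings[(scenario, 0)])


for scenario in ("held", "released"):
    print("%s: %.2fx" % (scenario, slowdown(scenario)))
```

With these figures the "already held" case takes a bit under twice as long with FastGIL enabled, and the "released" case roughly 15% longer.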

I also tried it with PGO enabled and got more or less the same result. The 
Python installations that I tried it with were both PGO builds.


It's probably mixed across platforms, different configurations and C 
compilers. I looked through the "What's new" document for Py3.10 and 3.11 
but couldn't find mentions of GIL improvements. Just that some other things 
have become faster.


So – disable the feature in Python 3.11 and later? (Currently it's disabled 
in 3.12+.)


Py3.11+ would suggest that we keep the code in Cython 3.1, since that will 
support older Python versions that still seem to benefit from it.


Stefan



[Cython] Can we remove the FastGIL implementation?

2023-09-19 Thread Stefan Behnel

Hi,

I've seen reports that Cython's "FastGIL" implementation (which basically 
keeps the GIL state in a thread-local variable) is no longer faster than 
CPython's plain GIL implementation in recent Python 3.x versions. 
Potentially even slower. See the report in


https://github.com/cython/cython/issues/5703

It would be helpful to get user feedback on this.

If you have GIL-heavy Cython code, especially with nested 
with-nogil/with-gil sections across functions, and a benchmark that 
exercises it, could you please run the benchmark with and without the 
feature enabled and report the results?


You can add "-DCYTHON_FAST_GIL=0" to your CFLAGS to disable it (and "=1" 
to enable it explicitly). It's enabled by default in CPython 3.6-3.11 (but 
disabled in Cython 0.29.x on Python 3.11).
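
For a benchmark script that builds the same module twice, the flag can be set from Python before the build step runs. This is a sketch only: set_fast_gil is a hypothetical helper, and just the macro name and its two values come from the message above.

```python
import os


def set_fast_gil(enabled, environ=os.environ):
    """Append the CYTHON_FAST_GIL define to CFLAGS in the given mapping."""
    flag = "-DCYTHON_FAST_GIL=%d" % (1 if enabled else 0)
    current = environ.get("CFLAGS", "")
    # Keep existing flags and add ours at the end.
    environ["CFLAGS"] = (current + " " + flag).strip()
    return environ["CFLAGS"]


# Use a plain dict here so the example does not touch the real environment.
env = {"CFLAGS": "-O2"}
print(set_fast_gil(False, env))
```

Calling it without the second argument changes os.environ, which a build started from the same process would then inherit.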


Thanks,
Stefan


[Cython] Cython 3.0.2 released

2023-08-27 Thread Stefan Behnel

Hi all,

Cython 3.0.2 is released. It fixes two major regressions in 3.0.1, so 
please upgrade if that failed for you.


https://cython.readthedocs.io/en/latest/src/changes.html

Have fun,
Stefan


[Cython] Cython 3.0 final released

2023-07-17 Thread Stefan Behnel

Hi all,

after close to five long years, I'm proud to announce the release of
Cython 3.0. It's done. It's out. Finally!

The full list of improvements compared to the 0.29.x release series is 
entirely incredible.


https://cython.readthedocs.io/en/latest/src/changes.html

Cython 3.0 is better than any other Cython release before, in all aspects. 
It's much more Python, integrates better with C and C++, supports more 
Python implementations and configurations, and provides many great new 
language features. It's faster, safer and easier to use. It's simply better.

New language features include:

- Python 3 syntax and semantics by default
- Cython type annotations in plain Python code
- automatic NumPy ufunc generation
- fast @dataclass and @total_ordering extension types
- safe exception propagation in C functions by default
- Unicode identifiers in Cython code

All of this wouldn't have been possible without the help of the many, many 
people who contributed code and documentation, tested features, found and 
described bugs, helped debug problems. Those who started using Cython 
in new environments, new build systems, new use cases, and helped to get it 
working there. Who proposed new features or found mismatches and gaps in 
the existing set of features.


Thank you all, you helped make Cython 3.0 an awesome language!

Along the way, we added two people to the list of Cython developers.

* David Woods has contributed a tremendous list of features and fixes to 
this release. It would honestly not have been possible without his efforts.


* Matúš Valo has put a lot of work into the documentation and the pure 
Python mode. He found many issues that make Cython now easier and more 
consistent to use from Python code.


Thank you both for your contributions. I'm happy to work together with you.

Everyone, have fun using Cython 3.0, and whatever good comes after it.

Best,
Stefan


[Cython] Cython 3.0 RC 2 released

2023-07-12 Thread Stefan Behnel

Hi all,

after close to five long years, we're almost there – I've pushed a release 
candidate for Cython 3.0 with a long list of bug fixes (followed by a 
second one with one important fix).


https://cython.readthedocs.io/en/latest/src/changes.html

Please give it some final testing. Unless we find something really serious 
in the RC2 release, the changes for the final release will be very limited 
and safe, hopefully none at all.


The RC is just in time for this week's US-SciPy, and I'll make sure we have 
a final release for next week's EuroPython in Praha.


Have fun,
Stefan


[Cython] Current CI crashes in Py3.12

2023-06-04 Thread Stefan Behnel

Hi,

just a note that the current CI crashes in Py3.12b1 are due to

https://github.com/python/cpython/issues/104614

They fixed it and Py3.12b2 will hopefully support multiple inheritance of 
extension types again. It's expected next week (June 6th).


Stefan


Re: [Cython] cython 3 migration update and next releases

2023-05-21 Thread Stefan Behnel

Dima Pasechnik wrote on 21.05.23 at 11:38:

On Sun, 21 May 2023, 10:21 Stefane Fermigier wrote:

AFAIK, 15k lines of Cython makes it one of the largest Cython
projects I'm aware of (I did some research a couple of years ago):

https://github.com/sfermigier/awesome-cython#some-projects-with-more-that-10-000-lines-of-cython-code



> SageMath has 700K Cython lines, yet not mentioned.

Certainly worth mentioning, yes.

Looking at the numbers, I also noticed that lxml is listed in the 5-10k 
lines range. It actually has about 18k lines of Cython code (.pyx/.pxi 
files) and another 1.5k lines in compiled Python (.py) files, according to 
pygount [1]. I tried sloccount first, but that doesn't seem to have Cython 
support.


Might be worth redoing that count for the other projects as well.
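
For a quick sanity check of such numbers, a rough count can be done in a few lines of Python. This is not how pygount counts: the sketch simply counts non-blank lines that do not start with "#" in .pyx/.pxi files, and count_code_lines is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def count_code_lines(root, suffixes=(".pyx", ".pxi")):
    """Count non-blank, non-comment lines in files with the given suffixes."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            for line in text.splitlines():
                stripped = line.strip()
                if stripped and not stripped.startswith("#"):
                    total += 1
    return total


# Small self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.pyx").write_text(
        "# comment\n\ncdef int x = 1\nprint(x)\n", encoding="utf-8"
    )
    (Path(tmp) / "notes.txt").write_text("ignored\n", encoding="utf-8")
    demo_count = count_code_lines(tmp)

print(demo_count)
```

Docstrings and multi-line strings are counted as code here, so the result will differ from what a dedicated tool reports.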

Stefan


[1] https://pypi.org/project/pygount/



Re: [Cython] cython 3 migration update and next releases

2023-05-20 Thread Stefan Behnel

matus valo wrote on 16.05.23 at 21:09:

I would like to inform you about recent porting of projects to Cython 3.
Recently, I participated in migration of 3 bigger projects to Cython 3:


Thanks a lot for doing this, Matúš. It helps Cython as much as it helps 
these projects.




When migrating to Cython 3, I found several issues in Cython itself; the
fixes for all of them are merged in master now. Hence, I would like to ask
about next steps. It would help greatly to release Cython 3 beta 3. This
would allow me to pin the scipy CI to a real pre-release instead of the master branch.


I'll try to get beta 3 released soon, but need to find a bit of consecutive 
time to get it out. There are still a couple of PRs that I'd like to look 
through.




Moreover, I would like to ask whether we can do the final Cython 3 release
after beta 3. The rationale is that the projects won't start really using
Cython 3 until we do the final release. Now, we have 3 big users of Cython
migrated, hence I think we have some confidence that Cython 3 is ready.
What do you think?


It's probably a good time to have a final call for merges. Promoting and 
voting for PRs is welcome.


Stefan



[Cython] Cython 3.0 beta 2 is released

2023-03-27 Thread Stefan Behnel

Hi everyone,

we received a lot of feedback for our first beta release (thank you, 
everyone!) and were able to (hopefully) resolve all blockers that prevented 
some of you from making good use of it.


Let's hear what you think about the second beta. It's up on PyPI.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-2-2023-03-26

Have fun,
Stefan



Stefan Behnel schrieb am 26.02.23 um 11:31:

Hi all,

Cython 3.0 has left the alpha status – the first beta release is available 
from PyPI.


The changes in this release are huge – and the full list of improvements 
compared to the 0.29.x release series is entirely incredible. Cython 3.0 is 
better than any other Cython release before, in all aspects. It's much more 
Python, integrates better with C++, supports more Python implementations 
and configurations, provides many great new language features – it's faster, 
safer and easier to use. It's simply better.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25

The development of the Cython 3.0 release series started all the way back 
in 2018, with the first branch commit happening on October 27, 2018.


https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e

List of Milestones along the way, and a long list of contributors:
https://github.com/cython/cython/issues/4022#issuecomment-1404305257

Thank you to everyone who contributed. Especially to David Woods, who 
contributed a tremendous amount of changes, both fixes and new features. 
Thank you, David!


A couple of people have also joined in an effort to make the documentation 
reflect what this great new Cython has to offer. Thank you all, our users 
will love you for your help.


https://github.com/cython/cython/issues/4187

https://cython.readthedocs.io/en/latest/

Now, go and give it a try. We've taken great care to make the transition 
from Cython 0.29.x as smooth as possible, which was not easy given the 
large amount of changes, including some well-motivated breaking changes. We 
wanted to let all users benefit from this new release.


Let us know how it works for you, and tell others about it. :)

Have fun,
Stefan




[lxml] Re: When is a number not a number

2023-03-03 Thread Stefan Behnel

Stefan Behnel schrieb am 03.03.23 um 09:00:

Stefan Behnel schrieb am 02.03.23 um 08:50:

Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:

Probably a bug in _checkNumber():
https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974


Ah, yes, it might be the isdigit() check, actually. That could be too 
broad. Not every digit is a valid part of a number.


Thanks for the report and the investigation. I'll try a fix when I get to 
it.


According to the XML Schema 1.1 spec, it's really just [0-9] that we should 
detect.


https://www.w3.org/TR/xmlschema11-2/#decimal

I'll remove the ".isdigit()" check altogether and only leave the '0-9' 
comparison in there. Even when we're parsing Unicode strings, we should 
only care about XML numbers, not everything that Python accepts.


https://github.com/lxml/lxml/commit/3d4e60f2835e4d85fd357c182656d3eca534f2ff
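
To illustrate why `str.isdigit()` is too broad here, the following minimal sketch compares it with an ASCII-only check. The helper name `is_ascii_digits` is made up for this example and is not the code in `objectify.pyx`:

```python
import re

# Hypothetical helper, not lxml's implementation: accept only ASCII [0-9].
_ASCII_DIGITS = re.compile(r"[0-9]+")


def is_ascii_digits(text):
    """Return True if text is a non-empty run of ASCII digits."""
    return _ASCII_DIGITS.fullmatch(text) is not None


# ASCII digits, Arabic-Indic digits, and SUPERSCRIPT TWO.
for sample in ("123", "\u0663\u0664", "\u00b2"):
    print(ascii(sample), sample.isdigit(), is_ascii_digits(sample))
```

All three samples pass `isdigit()`, but only the first consists of the `[0-9]` characters that the XML Schema decimal type allows.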

Stefan

___
lxml - The Python XML Toolkit mailing list -- lxml@python.org
To unsubscribe send an email to lxml-le...@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/
Member address: arch...@mail-archive.com


[lxml] Re: When is a number not a number

2023-03-03 Thread Stefan Behnel

Stefan Behnel schrieb am 02.03.23 um 08:50:

Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:

Probably a bug in _checkNumber():
https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974


Ah, yes, it might be the isdigit() check, actually. That could be too broad. 
Not every digit is a valid part of a number.

Thanks for the report and the investigation. I'll try a fix when I get to it.


According to the XML Schema 1.1 spec, it's really just [0-9] that we should 
detect.


https://www.w3.org/TR/xmlschema11-2/#decimal

I'll remove the ".isdigit()" check altogether and only leave the '0-9' 
comparison in there. Even when we're parsing Unicode strings, we should 
only care about XML numbers, not everything that Python accepts.


Stefan



[lxml] Re: When is a number not a number

2023-03-01 Thread Stefan Behnel
Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:
>Probably a bug in _checkNumber():
>https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974

Ah, yes, it might be the isdigit() check, actually. That could be too broad. 
Not every digit is a valid part of a number.

Thanks for the report and the investigation. I'll try a fix when I get to it.

Stefan


[Python-announce] Cython 3.0 beta 1 is released

2023-02-26 Thread Stefan Behnel

Hi all,

Cython 3.0 has left the alpha status – the first beta release is available 
from PyPI.


https://cython.org/

https://pypi.org/project/Cython/

The changes in this release are huge – and the full list of improvements 
compared to the 0.29.x release series is entirely incredible. Cython 3.0 is 
better than any other Cython release before, in all aspects. It's much more 
Python, integrates better with C++, supports more Python implementations 
and configurations, provides many great new language features – it's faster, 
safer and easier to use. It's simply better.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25

What is Cython?

In case you didn't hear about Cython before, it's the most widely used
statically optimising Python compiler out there. It translates Python (2/3)
code to C, and makes it as easy as Python itself to tune the code all the
way down into fast native code. If you have any non-trivial Python 
application running, chances are you'll find some piece of Cython generated 
package in it.


The development of the Cython 3.0 release series started all the way back 
in 2018, with the first branch commit happening on October 27, 2018.


https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e

A list of Milestones along the way, and a long list of contributors:
https://github.com/cython/cython/issues/4022#issuecomment-1404305257

Thank you to everyone who contributed. A couple of people have also joined 
in an effort to make the documentation reflect what this great new Cython 
has to offer.


https://cython.readthedocs.io/en/latest/

Now, go and give it a try. We've taken great care to make the transition 
from Cython 0.29.x as smooth as possible, which was not easy given the 
large amount of changes, including some well-motivated breaking changes. We 
wanted to let all users benefit from this new release.


Let us know how it works for you, and tell others about it. :)

Have fun,
Stefan
___
Python-announce-list mailing list -- python-announce-list@python.org
To unsubscribe send an email to python-announce-list-le...@python.org
https://mail.python.org/mailman3/lists/python-announce-list.python.org/
Member address: arch...@mail-archive.com


[Cython] Cython 3.0 beta 1 is released

2023-02-26 Thread Stefan Behnel

Hi all,

Cython 3.0 has left the alpha status – the first beta release is available 
from PyPI.


The changes in this release are huge – and the full list of improvements 
compared to the 0.29.x release series is entirely incredible. Cython 3.0 is 
better than any other Cython release before, in all aspects. It's much more 
Python, integrates better with C++, supports more Python implementations 
and configurations, provides many great new language features – it's faster, 
safer and easier to use. It's simply better.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25

The development of the Cython 3.0 release series started all the way back 
in 2018, with the first branch commit happening on October 27, 2018.


https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e

List of Milestones along the way, and a long list of contributors:
https://github.com/cython/cython/issues/4022#issuecomment-1404305257

Thank you to everyone who contributed. Especially to David Woods, who 
contributed a tremendous amount of changes, both fixes and new features. 
Thank you, David!


A couple of people have also joined in an effort to make the documentation 
reflect what this great new Cython has to offer. Thank you all, our users 
will love you for your help.


https://github.com/cython/cython/issues/4187

https://cython.readthedocs.io/en/latest/

Now, go and give it a try. We've taken great care to make the transition 
from Cython 0.29.x as smooth as possible, which was not easy given the 
large amount of changes, including some well-motivated breaking changes. We 
wanted to let all users benefit from this new release.


Let us know how it works for you, and tell others about it. :)

Have fun,
Stefan


[lxml] Re: Question about inheritance in cssselect.py

2022-12-23 Thread Stefan Behnel

Dani Litovsky Alcala schrieb am 10.06.22 um 17:34:

lxml v.4.9

cssselect.py:CSSSelector.__init__ calls on `etree.XPath.__init__(self, path, 
namespaces=namespaces)` to initialize the parent class.

Is there a reason why `super()` or even `super(CSSSelector, self).__init__(...)` 
is not used?


Probably the age of the code.



I bring this up as if I attempt to monkey patch (private project) 
`etree.XPath.__init__` the current code causes an error
```
TypeError: super(type, obj): obj must be an instance or subtype of type
```

while replacing it with the suggested use of `super()` fixes my error.


I'll change it to use super(). Thanks for the suggestion.
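
As a rough illustration of the difference (with invented class names, not the actual `cssselect.py` code, and without reproducing the reported `TypeError`), the sketch below shows that an explicit parent call bypasses classes that are inserted into the inheritance chain, while `super()` follows the method resolution order:

```python
class Base:
    def __init__(self, path):
        self.path = path


class ExplicitChild(Base):
    def __init__(self, css):
        # Hard-codes the parent class and ignores the MRO.
        Base.__init__(self, "xpath:" + css)


class SuperChild(Base):
    def __init__(self, css):
        # Delegates to whatever class comes next in the MRO.
        super().__init__("xpath:" + css)


class Tracing(Base):
    """Hypothetical extension that wants to see every __init__ call."""

    def __init__(self, path):
        self.traced = path
        super().__init__(path)


class TracedExplicit(ExplicitChild, Tracing):
    pass


class TracedSuper(SuperChild, Tracing):
    pass


print(hasattr(TracedExplicit("a"), "traced"))  # Tracing.__init__ was skipped
print(hasattr(TracedSuper("a"), "traced"))     # Tracing.__init__ was called
```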

Stefan



[lxml] Re: %s formatting in documentation (in stead of f-string)

2022-12-20 Thread Stefan Behnel

Hi,

t.r...@247interfaces.nl schrieb am 19.12.22 um 17:58:

Switching to lxml for XML parsing and generating, I was somewhat puzzled by 
lines like

XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"
XHTML = "{%s}" % XHTML_NAMESPACE

With more modern f-strings, this could also be written as

XHTML = f"{{{XHTML_NAMESPACE}}}"

This may be because I only started with Python at 3.7 and have never worked with 
any 2.x version.

I quite understand that there is no time to rework all the docs, given the low 
income of this project. Just two questions:

- Is there (another) good reason not to use f-string formatting? And if not,
- is there a way to assist with reworking the docs?


It's reasonable to update the docs to Py3 style by now, and a bit of that 
has already been done. The question is whether


XHTML = f"{{{XHTML_NAMESPACE}}}"

is really more readable than

XHTML = "{%s}" % XHTML_NAMESPACE

given the amount of curly braces with different meanings that a reader has 
to go through. To me, personally, the second seems quicker and more obvious 
to read, whereas it takes me a while to understand what the equivalent 
f-string does.


I think this is a case where we should keep the (IMHO) simpler non-f-string 
variant.
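
For reference, a small sketch showing that both spellings build the same namespace prefix. The name `body_tag` is only for illustration:

```python
XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

# %-formatting: the literal braces need no escaping.
percent_style = "{%s}" % XHTML_NAMESPACE

# f-string: the outer two brace pairs are escaped literals,
# the innermost pair is the actual substitution.
fstring_style = f"{{{XHTML_NAMESPACE}}}"

assert percent_style == fstring_style == "{http://www.w3.org/1999/xhtml}"

# Either prefix can be used to build a qualified tag name.
body_tag = percent_style + "body"
print(body_tag)
```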


Stefan



[lxml] Re: zlib error

2022-12-15 Thread Stefan Behnel

Hi,

Ajayi, Temitope schrieb am 14.12.22 um 17:21:

It seems the version of zlib used in lxml is outdated. It currently shows up as 
zlib 1.2.11 instead of zlib 1.2.13 on scan reports and therefore vulnerable to 
CVE-2018-25032 and CVE-2022-37434.

Can I get some help on whether this is correct or whether I am doing something wrong?


What lxml version are you using on which operating system? Are you using 
pre-built binary wheels or building locally?


The binary wheels of lxml 4.9.2 should be using zlib 1.2.13 on Linux/macOS 
and 1.2.12 on Windows.


Stefan



[Cython] Cython 3.0 planning - Re: [cython-users] Re: Anything missing for 0.29.33 ?

2022-12-07 Thread Stefan Behnel

Hi Matúš,

Matúš Valo schrieb am 06.12.22 um 15:58:

I have a thought about Cython 3.0. Based on discussion in [1] we should be
done with all breaking changes. There is also [2] but PR is already there
[3] (I am not sure what is the state of the PR though).

Is it possible to make final release (In case we postpone [2] to Cython 3.1
or 3.0.X)?

Or, at least, can we move closer to final release and release beta or RC
version? I think it would be great to communicate to the community how far
we are from final release (not in time but e.g. this is beta/RC release and
will be followed by final release).

Additional reason for release of Cython 3.0 is that in near future (Python
3.12) two important components used by Cython will be removed: imp module
and distutils. In my opinion, Cython 3.0 should be released early to ensure
users transition period so we can avoid back-porting this changes to 0.29.X
releases.

Any thoughts?

[1] https://github.com/cython/cython/issues/4022
[2] https://github.com/cython/cython/issues/4936
[3] https://github.com/cython/cython/pull/5016


Let's see that we get [3] merged to close [2], I think then we're ready for 
a new release, once 0.29.33 is out.


As you wrote, we're through with the breaking changes for 3.0 then, so yes, 
a first beta release might be appropriate.


Stefan



[Cython] Anything missing for 0.29.33 ?

2022-12-06 Thread Stefan Behnel

Hi,

I'll try to push out the next 0.29.x (and hopefully also 3.0alpha) release 
before Christmas. If you think I might have forgotten anything that's ready 
to be included in 0.29.33, please comment in the relevant ticket or PR, or 
reply to this message on cython-users.


Stefan


[lxml] Re: Turn three-line block into single?

2022-08-10 Thread Stefan Behnel

Gilles schrieb am 10.08.22 um 15:20:

for row in tree.iter("wpt"):
     lat,lon = row.attrib.values()


Note that this assignment depends on the order of the two attributes in the 
XML document, i.e. in data that you may not control yourself. It will break 
if the provider of your input documents ever decides to change the order.


I'd probably just use

 lat, lon = row.get('lat'), row.get('lon')
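
A minimal sketch of the problem, using the standard library's `xml.etree.ElementTree` so that it runs without lxml (the `.get()` and `.attrib` calls used here work the same way on lxml elements). The sample coordinates and helper names are made up:

```python
import xml.etree.ElementTree as ET

# Same waypoint, but the attributes are written in a different order.
first = ET.fromstring('<wpt lat="48.1" lon="11.5"/>')
second = ET.fromstring('<wpt lon="11.5" lat="48.1"/>')


def by_position(row):
    # Depends on the attribute order in the document.
    lat, lon = row.attrib.values()
    return lat, lon


def by_name(row):
    # Independent of the attribute order.
    return row.get("lat"), row.get("lon")


print(by_position(first), by_position(second))
print(by_name(first), by_name(second))
```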


Also:

> #remove dups
> no_dups = []
> for row in tree.iter("wpt"):
>     lat,lon = row.attrib.values()
>     if lat not in no_dups:
>         no_dups.append(lat)
>     else:
>         row.getparent().remove(row)

You're using a list here instead of a set. It might be that a list is 
faster for very small amounts of data, but I'd expect a set to win quite 
quickly. Regardless of my guessing, you shouldn't be using a list here 
unless benchmarking tells you that it's faster. And if you do, you'd better 
add a comment for the reasoning. It's just too surprising to see this 
implemented with a list, so readers will end up wasting their time thinking 
more into it than there is.
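
A set-based variant could look like the sketch below. It uses the standard library's `xml.etree.ElementTree`, which has no `getparent()`, so it assumes that all `wpt` elements are direct children of the root; the sample document is invented:

```python
import xml.etree.ElementTree as ET

GPX = """<gpx>
  <wpt lat="48.1" lon="11.5"/>
  <wpt lat="48.1" lon="11.5"/>
  <wpt lat="52.5" lon="13.4"/>
</gpx>"""

root = ET.fromstring(GPX)

seen = set()  # membership tests on a set do not slow down as it grows
for row in root.findall("wpt"):  # findall() returns a list, so removing is safe
    lat = row.get("lat")
    if lat in seen:
        root.remove(row)
    else:
        seen.add(lat)

remaining = [row.get("lat") for row in root.iter("wpt")]
print(remaining)
```

Like the quoted code, this treats two waypoints with the same latitude as duplicates; using the `(lat, lon)` pair as the set key would be stricter.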


Stefan


[Cython] Welcome David Woods as a Cython core developer

2022-07-31 Thread Stefan Behnel

Hi everyone,

with the release of the first 3.0 alpha that supports Python 3.11 (aptly 
named "alpha 11"), I'm happy to announce that David Woods has been promoted 
to a Cython core developer.


David has shown an extraordinary commitment and dedication over the last 
years. His first merged commits were already back in 2015, mostly related 
to the C++ support. But within the last two years, he voluntarily took over 
more and more responsibility for bugs and issues and developed several 
major new features for the project. This includes the Walrus operator (PEP 
572), cdef dataclasses (modelled after PEP 557), internal "std::move()" 
usage in C++ mode or support for Unicode identifiers and module names, all 
of which form a major part of the 3.0 feature set. David has more than 
deserved a place in the circle of present and prior core devs.


David, thank you for your impressive work on Cython,
and welcome to the core team!

Stefan


[Python-Dev] Re: Switching to Discourse

2022-07-20 Thread Stefan Behnel

h.vetin...@gmx.com schrieb am 18.07.22 um 18:04:

One of the comments in the retro was:

Searching the archives is much easier and have found me many old threads that I 
probably would have problem finding before since I haven’t been subscribed for 
that long.


I'm actually reading python-dev, c.l.py etc. through Gmane, and have done 
that ever since I joined. Simply because it's a mailing list of which I 
don't need a local (content) copy, and wouldn't want one. Gmane seems to 
have a complete archive that's searchable, regardless of "when I subscribed".


It's really sad that Discourse lacks an NNTP interface. There's an 
unmaintained bridge to NNTP servers [1], but not an emulating interface 
that would serve the available discussions via NNTP messages, so that users 
can get them into their NNTP/Mail clients to read them in proper discussion 
threads. I think adding that next to the existing web interface would serve 
everyone's needs just perfectly.


Anyone up for giving that a try? It can't be *that* difficult. ;-)

Stefan


[1] https://github.com/sman591/discourse-nntp-bridge

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/USPYYNP24UYQQ64YBBTHNOEDNGX46LVM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Switching to Discourse

2022-07-16 Thread Stefan Behnel

Petr Viktorin schrieb am 15.07.22 um 13:18:
The discuss.python.org experiment has been going on for quite a while, and 
while the platform is not without its issues, we consider it a success. The 
Core Development category is busier than python-dev. According to staff, 
discuss.python.org is much easier to moderate. If you're following 
python-dev but not discuss.python.org, you're missing out.


That's one of the reasons then why I pretty much lost track of the CPython 
development since d.p.o was introduced. It's sad, but it was just too much 
work for me (compared to threaded Newsgroups) to follow the discussions 
there, definitely more than I wanted to invest.


It's not the only reason, though, so please take a decision for the home of 
CPython discussions that suits the (currently) more active part of the 
development community.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TA5YNMEJURKMJHTSYTM5Z6G2YQ6UM5TP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Cython] Nested prange loops - (was: [cython-users] Converting to Python objects with nogil (inside prange for loop))

2022-07-15 Thread Stefan Behnel

Hi,

nested prange loops seem to be a common gotcha for users. I can't say if 
there is ever a reason to do this, but at least I can't think of any. For 
me, this sounds like we should turn it into a compile time error – unless 
someone can think of a use case? Even in that case, I'd still emit a 
warning since it seems so unlikely to be intended.


Please reply to the cython-users list to facilitate user feedback.

Stefan



-------- Forwarded Message --------
Subject: Re: [cython-users] Converting to Python objects with nogil (inside 
prange for loop)

Date: Fri, 15 Jul 2022 07:43:26 +0100



with nogil, parallel():
  for i in prange(N):
    for j in prange(km.BatchSize):


You usually only want one loop in a set of nested loops to be prange. 
Typically the outer loop, but in this case it might be easier to 
parallelize the inner loop.




[lxml] Re: Iterparse raises TypeError on attempt to clean up preceding siblings

2022-06-25 Thread Stefan Behnel
Am June 23, 2022 11:20:59 PM UTC schrieb Parfait G:
>I see one fix is to also check if `elem.getparent() is not None`.
>Thoughts?
>
>elem.clear()
>while elem.getprevious() is not None and elem.getparent() is not None:
>    del elem.getparent()[0]

The parent won't change during the loop, so it's enough to check it once before 
the loop.

Also, there is only one element without parent, that's the root element. Maybe 
you can skip that altogether in your processing? It should be the first item 
returned by the iterator that you got through .iter(). Just call next() on it 
once.
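
Checking the parent once before the loop could be sketched as follows. The function name, the `item` tag and the sample data are assumptions made for this example:

```python
from io import BytesIO

from lxml import etree


def collect_ids(source, tag="item"):
    """Stream over `tag` elements and free the processed ones as we go."""
    ids = []
    for _event, elem in etree.iterparse(source, events=("end",), tag=tag):
        ids.append(elem.get("id"))
        elem.clear()
        parent = elem.getparent()  # looked up once, it cannot change in the loop
        if parent is not None:     # only the root element has no parent
            while elem.getprevious() is not None:
                del parent[0]
    return ids


data = BytesIO(b'<root><item id="1"/><item id="2"/><item id="3"/></root>')
ids = collect_ids(data)
print(ids)
```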

Stefan



[lxml] Re: Build problems on Python 3.11

2022-05-31 Thread Stefan Behnel

Charlie Clark schrieb am 31.05.22 um 17:54:
while I don't see this locally, I'm getting problems on my CI with the 
Docker Image:


```
  Compile failed: command '/usr/bin/gcc' failed with exit code 1
   cc -I/usr/include/libxml2 -I/usr/include/libxml2 -c 
/tmp/xmlXPathInitw7u6s7rr.c -o tmp/xmlXPathInitw7u6s7rr.o

   cc tmp/xmlXPathInitw7u6s7rr.o -lxml2 -o a.out
   error: command '/usr/bin/gcc' failed with exit code 1
   [end of output]

   note: This error originates from a subprocess, and is likely not a 
problem with pip.

error: legacy-install-failure
× Encountered error while trying to install package.
╰─> lxml
```

I'm wondering if there is anything that can be done about this? Presumably 
inform the maintainer?


I've never seen this either, but at least there's an lxml 4.9.0 release now 
that should work with Py3.11. I didn't upload wheels for 3.11, though.


Stefan


[Python-ideas] Re: Less is more? Smaller code and data to fit more into the CPU cache?

2022-04-05 Thread Stefan Behnel

Barry Scott schrieb am 27.03.22 um 22:23:

On 22 Mar 2022, at 15:57, Jonathan Fine wrote:
As you may have seen, AMD has recently announced CPUs that have much larger L3 
caches. Does anyone know of any work that's been done to research or make 
critical Python code and data smaller so that more of it fits in the CPU cache? 
I'm particularly interested in measured benefits.


A few years ago (5? 10?) there was a blog post about making the Python eval loop fit 
into L1 cache.
The author gave up on the work as he claimed it was too hard to contribute any 
changes to Python at the time.
I have not kept a link to the blog post, sadly.

What I recall is that the author found that GCC was producing far more code 
than was required to implement sections of ceval.c.
Fixing that would shrink the ceval code by 50%, I recall, was the claim. He had a 
PoC that showed the improvements.


Might be worth trying out if "gcc -Os" changes anything for ceval.c. Can 
also be enabled temporarily with a pragma (and MSVC has a similar option).


We use it in Cython for the (run once) module init code to reduce the 
binary module size, but it might have an impact on cache usage as well.


Stefan

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QQVYUUKOKN472N4OLNCAA76HLVFXMKLB/
Code of Conduct: http://python.org/psf/codeofconduct/


The Cython compiler is 20 years old today !

2022-04-04 Thread Stefan Behnel

Dear Python community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, 
the tool that is now known and used under the name Cython.


https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, 
you're likely to have some piece of code downloaded into your venv that was 
built with Cython. Or many of them.


I'm proud of what we have achieved. And I'm happy to see and talk to the 
many, many users out there whom we could help to help their users get their 
work done.


Happy anniversary, Cython!

Stefan



PS: The list of Cython implemented packages on PyPI is certainly 
incomplete, so please add the classifier to yours if it's missing. With 
almost 3000 dependent packages on Github (and almost 100,000 related 
repos), I'm sure we can crack the number of 1000 Cython built packages on 
PyPI as a birthday present. (No Spam, please, just honest classifiers.)


https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython

https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE
--
https://mail.python.org/mailman/listinfo/python-list


[Python-announce] The Cython compiler is 20 years old today !

2022-04-04 Thread Stefan Behnel

Dear Python community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, 
the tool that is now known and used under the name Cython.


https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, 
you're likely to have some piece of code downloaded into your venv that was 
built with Cython. Or many of them.


I'm proud of what we have achieved. And I'm happy to see and talk to the 
many, many users out there whom we could help to help their users get their 
work done.


Happy anniversary, Cython!

Stefan



PS: The list of Cython implemented packages on PyPI is certainly 
incomplete, so please add the classifier to yours if it's missing. With 
almost 3000 dependent packages on Github (and almost 100,000 related 
repos), I'm sure we can crack the number of 1000 Cython built packages on 
PyPI as a birthday present. (No Spam, please, just honest classifiers.)


https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython

https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE


[Cython] The Cython compiler is 20 years old today !

2022-04-04 Thread Stefan Behnel

Dear Cython community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, 
the tool that is now known and used under the name Cython.


https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, 
you're likely to have some piece of code downloaded into your venv that was 
built with Cython. Or many of them.


I'm proud of what we have achieved. And I'm happy to see and talk to the 
many, many users out there whom we could help to help their users get their 
work done.


Happy anniversary, Cython!

Stefan



PS: The list of Cython implemented packages on PyPI is certainly 
incomplete, so please add the classifier to yours if it's missing. With 
almost 3000 dependent packages on Github (and almost 100,000 related 
repos), I'm sure we can crack the number of 1000 Cython built packages on 
PyPI as a birthday present. (No Spam, please, just honest classifiers.)


https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython

https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE


Re: [xml] libxml2 2.9.23 download

2022-03-16 Thread Stefan Behnel

Hi,

Jeffrey Walton via xml schrieb am 16.03.22 um 05:45:

libxml2 2.9.13 seems to be missing from ftp://xmlsoft.org/libxml2/.


As mentioned in the release announcement:

https://mail.gnome.org/archives/xml/2022-February/msg9.html

the releases have moved to

https://download.gnome.org/sources/libxml2/2.9/

Stefan
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


[lxml] Re: Is there an ElementTree class lookup hook?

2022-03-08 Thread Stefan Behnel

Salut encore,

Xavier Morel schrieb am 04.03.22 um 12:58:
lxml provides support for custom Element classes (as well as element-ish 
e.g. Comment or PI) via the `ElementDefaultClassLookup` registry, and the 
ability to hook it into a parser.


But that registry does not seem to have a slot for the root tree of the 
elements. Is there a hook somewhere to set *that*? I tried looking around 
the API docs but nothing really jumped out.


Do you really need something like that? Can't you just inherit from the 
ElementTree class? (Assuming that's what you meant.)


The reason why you can register your own Element classes is because they 
can appear all over the place in the API. The ElementTree class is either 
instantiated by the user or returned from the parse() function. That's 
mostly it. Ok, maybe XSLT. But still easy enough to wrap yourself.



PS: the documentation for `set_default_parser` explains that it sets the 
default parser *for the current thread* and that "You can create a separate 
parser for each thread explicitly or use a parser pool.", does
it mean that in a "don't call any API which gets an implicit parser and 
manage your parsers by hand" sense or something else?


Parsers are really only used where an explicit "parser" argument is 
accepted. Everything else just inherits them. If you want to use your own 
parser, write a wrapper function for parse() that always passes it in, and 
then use that function instead.
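
A minimal sketch of such a wrapper, assuming lxml is installed. The module-level name `_PARSER` and the `remove_comments=True` option are only illustrative choices, not something prescribed by lxml:

```python
from io import BytesIO

from lxml import etree

# Hypothetical project-wide parser; the options are just an example.
_PARSER = etree.XMLParser(remove_comments=True)


def parse(source, **kwargs):
    """Like etree.parse(), but always passes the project parser."""
    return etree.parse(source, parser=_PARSER, **kwargs)


tree = parse(BytesIO(b"<root><!-- note --><child/></root>"))
print(etree.tostring(tree))
```

Code that calls this `parse()` instead of `etree.parse()` never depends on the per-thread default parser.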


Stefan


[lxml] Re: Compatibility issues between `lxml.etree.set_element_class_lookup` and `lxml.html`

2022-03-08 Thread Stefan Behnel

Hi,

Xavier Morel wrote on 07.03.22 at 13:27:
Sorry for the bother, but I've been looking at 
`lxml.etree.set_element_class_lookup`[0] as a way to add validation and 
features to lxml usage without having to ban "standard" lxml constructs 
(and to control usage by dependencies as well).


I consider the function fine for what it does, but if it gets in the way, 
don't use it. It's a global setting, which means that it can break stuff 
elsewhere, unintentionally and without warning.


Just create your own parser instance and configure the class lookup only there.
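
As a rough sketch of the parser-local setup (assuming lxml is installed; the class name `CheckedElement` and its method are made up for illustration):

```python
from lxml import etree


class CheckedElement(etree.ElementBase):
    """Hypothetical custom element class with one extra helper."""

    def child_count(self):
        return len(self)


# Configure the lookup on one parser only, not globally.
lookup = etree.ElementDefaultClassLookup(element=CheckedElement)
parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

custom_root = etree.fromstring(b"<a><b/><c/></a>", parser)
plain_root = etree.fromstring(b"<a><b/><c/></a>")

print(type(custom_root).__name__, custom_root.child_count())
print(type(plain_root).__name__)
```

Only trees built with `parser` use the custom class. Everything that goes through the default parser, including other libraries, is left alone.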


Is there a "proper" way to make these things collaborate? I looked at 
lxml.html and it looked like it might have to be rebuilt from the HTMLMixin 
(which already seems icky) but `objectify` is a cython module so there 
doesn't seem to be a good way to interact with it.


Cython modules are mostly just compiled Python modules and behave pretty 
much the same, from a user perspective. If you can read Python, you can 
probably read Cython code, and if you know how to use Python modules, you 
can probably also work with Cython compiled modules.


Stefan


[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-03-05 Thread Stefan Behnel


Change by Stefan Behnel :


--
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46798>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node

2022-03-04 Thread Stefan Behnel

Dr. Volker Jaenisch wrote on 04.03.22 at 00:02:

On 03.03.22 at 23:54, Stefan Behnel wrote:
this reads like something you could implement on top of lxml.objectify, 
via subclassing and an appropriate element class lookup.


This could really be a plain Python package that you could distribute on 
PyPI to give users an easy choice which interface they prefer. Not 
everything needs to be part of lxml itself.


My prototype is still glued to lxml since I use internal Cython functions 
of lxml that are not exported to Python space. But with a little help from 
the kind lxml people it may be possible to completely separate it from lxml.


The idea is to do pretty much what objectify currently does, using (I 
guess) the same element lookup, but to use a Python subclass of the 
ObjectifiedElement class for the tree structure that implements your 
different attribute lookup scheme in "__getattr__".


The general mechanism for selecting element class implementations is 
described here:


https://lxml.de/element_classes.html

Stefan


[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node

2022-03-03 Thread Stefan Behnel
Hi Volker,

this reads like something you could implement on top of lxml.objectify, via 
subclassing and an appropriate element class lookup.

This could really be a plain Python package that you could distribute on PyPI 
to give users an easy choice which interface they prefer. Not everything needs 
to be part of lxml itself.

Stefan


[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node

2022-03-03 Thread Stefan Behnel

Dr. Volker Jaenisch wrote on 03.03.22 at 18:19:
Therefore I am currently working on enabling LXML to have _ 
properties in objectify. The changes are not too complicated since the 
source code quality is good. I am hopeful that after the weekend I will 
have full functional prototype.


As Holger wrote, the issue with prefixes is that they are provided by the 
input document. There are well-known prefixes for a handful of 
namespaces, but that is a pure naming convention and in no way an obligation.


While I can see that it might be helpful for debugging purposes to see that 
there are attributes like "html_image", no-one keeps them from ending up as 
"s_image" or just "image" (with a default namespace and no prefix), if the 
creator of the specific document at hand decides so.


Aside from debugging, I fail to see a use case for this. And it increases 
the risk for innocent users to write code that seems to work with most 
documents (that use "standard" prefixes) but fail for others (which tend to 
be missing from the test suite).


So … I think keeping prefixes generally out of the interface is a good 
decision.


Stefan


[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node

2022-03-03 Thread Stefan Behnel

Dr. Volker Jaenisch wrote on 01.03.22 at 16:06:
To find the desired sibling the code loops over all children and matches 
(parentNamespace, propertyName) against them.


The correct operation of _findFollowingSibling should IMHO be:

Make a lookup on all children (with the python property name only). If one 
match is found then return this match. If none or more than one match is 
found then no answer is possible.


I see a major drawback with this behaviour, and that is non-local 
dependencies. If you have this XML:







then "root.ch1" would give you the first child. Great, so you use that in 
your code. Now, someone decides to send you an input document that looks 
like this:








And your code will suddenly fail to find "root.ch1". Depending on what your 
code does and how it does it, it may fail with an exception, or it may fail 
silently to find the desired data and just keep working without it.


Note that the content of the XML file that your code is designed to process 
did not change at all. It's just that some entirely unrelated content was 
added, in a completely different and unrelated namespace. And it was just 
externally added to the input data, or maybe just some tiny portion of it, 
without telling you or your code about it. Especially in places with 
optional content, where different namespaces are already a little more 
common than elsewhere, this is fairly likely to go unnoticed.


I find this kind of behaviour dangerous enough to restrict the "magic" in 
the API to what is easy to understand and predict.
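
The non-local dependency can be reproduced with a small sketch. The XML documents and namespace URIs below are hypothetical stand-ins, not the ones from the original message, and `find_by_local_name()` is an illustrative re-implementation of the proposed "property name only" lookup using the standard library:

```python
import xml.etree.ElementTree as ET


def find_by_local_name(parent, name):
    """Hypothetical lookup: ignore namespaces, require a unique match."""
    matches = [c for c in parent if c.tag.rpartition("}")[2] == name]
    return matches[0] if len(matches) == 1 else None


# Document the code was written for.
doc1 = ET.fromstring('<root xmlns="http://example.com/a"><ch1/></root>')

# Same content, plus an unrelated element from another namespace.
doc2 = ET.fromstring(
    '<root xmlns="http://example.com/a" xmlns:x="http://example.com/x">'
    "<ch1/><x:ch1/></root>"
)

print(find_by_local_name(doc1, "ch1"))   # an element
print(find_by_local_name(doc2, "ch1"))   # None: the match became ambiguous
print(doc2.find("{http://example.com/a}ch1"))  # still found
```

The namespace-qualified lookup in the last line keeps working because it does not depend on what else happens to be in the document.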


Stefan


[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-27 Thread Stefan Behnel


Change by Stefan Behnel :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-27 Thread Stefan Behnel


Stefan Behnel  added the comment:


New changeset 345572a1a0263076081020524016eae867677cac by Jannis Vajen in 
branch 'main':
bpo-46786: Make ElementTree write the HTML tags embed, source, track, wbr as 
empty tags (GH-31406)
https://github.com/python/cpython/commit/345572a1a0263076081020524016eae867677cac
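
For context, a small sketch of what "empty tag" means for ElementTree's HTML serialiser. Only the `br` and `p` results are stable across versions. How `wbr` is written depends on whether the interpreter includes this changeset, so the last line is printed without a claimed result:

```python
import xml.etree.ElementTree as ET

# "br" has long been in ElementTree's list of empty HTML tags.
br_html = ET.tostring(ET.Element("br"), method="html")

# "p" is a normal element and always gets a closing tag.
p_html = ET.tostring(ET.Element("p"), method="html")

# Version dependent: written as an empty tag only with the change above.
wbr_html = ET.tostring(ET.Element("wbr"), method="html")

print(br_html, p_html, wbr_html)
```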


--

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46389] 3.11: unused generator comprehensions cause f_lineno==None

2022-02-25 Thread Stefan Behnel


Stefan Behnel  added the comment:

Possibly also related, so I thought I'd mention it here (sorry if this is 
hijacking the ticket, seems difficult to tell). We're also seeing None values 
in f_lineno in Cython's test suite with 3.11a5:

  File "", line 1, in 
run_trace(py_add, 1, 2)
^^^
  File "tests/run/line_trace.pyx", line 231, in line_trace.run_trace 
(line_trace.c:7000)
func(*args)
  File "tests/run/line_trace.pyx", line 60, in line_trace.trace_trampoline 
(line_trace.c:3460)
raise
  File "tests/run/line_trace.pyx", line 54, in line_trace.trace_trampoline 
(line_trace.c:3359)
result = callback(frame, what, arg)
  File "tests/run/line_trace.pyx", line 81, in 
line_trace._create_trace_func._trace_func (line_trace.c:3927)
trace.append((map_trace_types(event, event), frame.f_lineno - 
frame.f_code.co_firstlineno))
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'

https://github.com/cython/cython/blob/7ab11ec473a604792bae454305adece55cd8ab37/tests/run/line_trace.pyx

No generator expressions involved, though. (Much of that test was written while 
trying to get the debugger in PyCharm to work with Cython compiled modules.)

There is a chance that Cython is doing something wrong in its own line tracing 
code, obviously.
(I also remember seeing other tracing issues before, where the line reported 
was actually in the trace function itself rather than the code to be traced. We 
haven't caught up with the frame-internal changes yet.)
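
The failing expression in the traceback is the subtraction `frame.f_lineno - frame.f_code.co_firstlineno`. A plain-Python sketch of a trace helper that tolerates a missing line number is shown below. This is not the code from Cython's test suite, only an illustration of the guard, and the helper names are made up:

```python
import sys


def relative_line(frame):
    """Line offset within the function, or None if f_lineno is unset."""
    lineno = frame.f_lineno
    if lineno is None:
        return None
    return lineno - frame.f_code.co_firstlineno


def run_trace(func, *args):
    """Call func(*args) and record (event, relative line) pairs for it."""
    trace = []

    def tracer(frame, event, arg):
        if frame.f_code is func.__code__:
            trace.append((event, relative_line(frame)))
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return trace


def py_add(a, b):
    return a + b


events = run_trace(py_add, 1, 2)
print(events)
```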

--
nosy: +scoder

___
Python tracker 
<https://bugs.python.org/issue46389>
___



Re: [xml] Release of libxml2 2.9.13

2022-02-23 Thread Stefan Behnel

Nick Wellnhofer wrote on 23.02.22 at 11:36:
I asked on GNOME infra if it is possible to offer .tar.gz downloads, but 
this would require changes to the upload script.


Thanks for asking.

Stefan


[issue46836] [C API] Move PyFrameObject to the internal C API

2022-02-23 Thread Stefan Behnel


Stefan Behnel  added the comment:

I haven't looked fully into this yet, but I *think* that Cython can get rid of 
most of the direct usages of PyFrameObject by switching to the new 
InterpreterFrame struct instead. It looks like the important fields have now 
been moved over to that.

That won't improve the situation regarding the usage of CPython internals, but 
it's probably worth keeping in mind before we start adding new API functions 
that work on frame objects.

--

___
Python tracker 
<https://bugs.python.org/issue46836>
___



[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-02-23 Thread Stefan Behnel


Stefan Behnel  added the comment:

> IMHO if the developer doesn't manage the XML itself it is VERY unreasonable 
> to use the document value and not the developer one.

I disagree. If the document says "this is the default if no explicit value is 
given", then I consider that just as good as providing a value each time. 
Meaning, the attribute *is* in fact present, just not explicitly spelled out on 
the element.

I would specifically like to avoid adding a new option just to override the way 
the document distributes its attribute value spelling across DTD and document 
structure. In particular, the .get() method is the wrong place to deal with 
this.

You can probably configure the parser to ignore the internal DTD subset, if 
that's what you want.
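
A short sketch of the behaviour under discussion, using a made-up document whose internal DTD subset declares a default for the attribute `mode`. The element and attribute names are hypothetical:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the DTD subset supplies a default for "mode".
DOC = '<!DOCTYPE root [<!ATTLIST root mode CDATA "strict">]><root/>'

root = ET.fromstring(DOC)

# The parser reports the defaulted attribute as if it were written out,
# so the Python-side default passed to get() is not used.
mode = root.get("mode", "lenient")

# An attribute that is neither written nor declared falls back as usual.
other = root.get("other", "fallback")

print(mode, other)
```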

--

___
Python tracker 
<https://bugs.python.org/issue46798>
___



Re: [xml] Release of libxml2 2.9.13

2022-02-22 Thread Stefan Behnel

Nick Wellnhofer via xml wrote on 20.02.22 at 13:53:

Version 2.9.13 of libxml2 is available at:

     https://download.gnome.org/sources/libxml2/2.9/


Thank you for the release, Nick!


Note that starting with this release, libxml2 tarballs are published on 
download.gnome.org instead of ftp.xmlsoft.org.


I noticed that they now use xz compression, whereas they were simply gzip 
compressed before. libxslt also changed the compression. That makes it more 
difficult to download them automatically, because scripts that want to list 
the available files now have to search for different file names. Also, 
Python 2.7 does not have built-in lzma compression support and needs an 
external module in order to handle it. (Both gz and bz2 have been supported 
essentially forever, OTOH.)
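
On Python 3, where `lzma` is part of the standard library, a download script can at least read all three formats through one code path. The following is a sketch under that assumption; the archive content and member name are invented so that the example needs no network access:

```python
import io
import tarfile


def make_archive(mode):
    """Build a tiny in-memory tarball; mode is e.g. 'w:gz' or 'w:xz'."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tar:
        data = b"placeholder"
        info = tarfile.TarInfo("libxml2-x.y.z/README")  # hypothetical name
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(raw):
    """Open a tarball with transparent compression detection ('r:*')."""
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:*") as tar:
        return tar.getnames()


names = {mode: list_members(make_archive(mode))
         for mode in ("w:gz", "w:bz2", "w:xz")}
print(names)
```

This does not help with finding the file in the first place, since the file name extension still differs between releases.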


And it seems that xz is not considered safe for long-term storage by everyone:

https://www.nongnu.org/lzip/xz_inadequate.html

Could you make the archives available in a (second) format that matches all 
(previous) releases? Apparently, both libxml2 and libxslt were made 
available with gz and bz2 compression before. Either of them would probably 
be fine. bz2 seems to compress equally well as xz here. (And compression 
speed, where bz2 suffers a bit, was never an issue for downloads anyway, 
just decompression speed, where all three are fine.)


Thanks,
Stefan


[lxml] Re: v4.8.0 breaking regression?

2022-02-22 Thread Stefan Behnel

Charlie Clark wrote on 22.02.22 at 17:51:

On 22 Feb 2022, at 17:26, Stefan Behnel wrote:


If you set STATIC_BUILD=true, and LIBXML_VERSION=2.9.12, lxml will use the git 
version instead of the release version.


I just tried this but got the same result. Presumably, I did something wrong 
but ENVVARs are not my strength anyway.

However, it sounds very much like a known issue that will hopefully disappear 
once 2.9.13 is released. MacPorts is normally pretty up to date, but I see that 
this hasn't been updated for nine months but 2.9.13 was only released on the 
19th of February.


Yes, 2.9.13 was freshly released. That may explain why it works for Bob. A 
static build would pick up the latest version.


Stefan


[lxml] Re: v4.8.0 breaking regression?

2022-02-22 Thread Stefan Behnel

Bob Kline wrote on 22.02.22 at 17:29:

On Tue, Feb 22, 2022 at 11:20 AM Stefan Behnel wrote:

...
Help with building more universal macOS wheels would be appreciated.


What would that involve?

Finding a good way to do it. :)

As I wrote, cibuildwheels probably has a way to do it from Github Actions, 
but changing build systems (or replacing the build configuration) isn't 
exactly something I'd like to put work into right now. I'm not saying that 
it would be difficult, just that it needs doing and testing, and probably a 
couple of iterations until everything runs smoothly again.


Stefan


[lxml] Re: v4.8.0 breaking regression?

2022-02-22 Thread Stefan Behnel

Charlie Clark wrote on 22.02.22 at 09:48:

On 21 Feb 2022, at 20:37, Jens Tröger wrote:
Yes, when I installed lxml it built locally on my Intel Mac 10.14.6 with 
Python 3.9.10, and in another email I actually wanted to ask for a 
pre-compiled whl:


Collecting lxml

Using cached lxml-4.8.0.tar.gz (3.2 MB)

Using legacy 'setup.py install' for lxml, since package 'wheel' is not 
installed.


Installing collected packages: lxml

Running setup.py install for lxml ... done

Successfully installed lxml-4.8.0


FWIW I can confirm that this happens if lxml is built on the machine but 
not with the wheel


This is locally built lxml

Python  : sys.version_info(major=3, minor=9, micro=10, 
releaselevel='final', serial=0)

lxml.etree  : (4, 8, 0, 0)
libxml used : (2, 9, 12)
libxml compiled : (2, 9, 12)
libxslt used    : (1, 1, 34)
libxslt compiled    : (1, 1, 34)
3
 b'\n    baz\n 
\n  \n    baz\n  \n  id="b-3">\n    baz\n  \n\n  '
 b'\n    baz\n 
\n  \n    baz\n  \n\n  '
 b'\n    baz\n 
\n\n'


And this with the wheel

Python  : sys.version_info(major=3, minor=9, micro=10, 
releaselevel='final', serial=0)

lxml.etree  : (4, 8, 0, 0)
libxml used : (2, 9, 12)
libxml compiled : (2, 9, 12)
libxslt used    : (1, 1, 34)
libxslt compiled    : (1, 1, 34)
3
 b'\n    baz\n 
\n  '
 b'\n    baz\n 
\n  '

 b'\n    baz\n \n'

All libraries have the same version so it must be something else. I use 
MacPorts to keep libraries up to date.


Sadly, libxml2 2.9.12 is not libxml2 2.9.12 here. On your machine, you 
probably have the latest release version installed. The lxml wheels are 
built with a newer git version that has a fix for this issue. Or a 
work-around, if you want.


If you set STATIC_BUILD=true, and LIBXML_VERSION=2.9.12, lxml will use the 
git version instead of the release version.


It would probably be worth adding a runtime detection for this issue, so 
that lxml can fail to import if it finds an incompatible libxml2 version. 
The broken behaviour seems heavy enough to fail hard instead of issuing 
just a warning (which the build currently does, but you normally won't see 
that in pip installations).
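
The version report quoted above can be produced with lxml's public version constants, as in the sketch below. Note its limit: a pure version comparison cannot tell a patched git build of 2.9.12 from the 2.9.12 release, which is exactly the problem here, so a real runtime detection would have to test the behaviour itself. The helper name `library_report` is made up:

```python
from lxml import etree


def library_report():
    """Collect the library versions lxml runs with and was built against."""
    return {
        "lxml.etree": etree.LXML_VERSION,
        "libxml used": etree.LIBXML_VERSION,
        "libxml compiled": etree.LIBXML_COMPILED_VERSION,
        "libxslt used": etree.LIBXSLT_VERSION,
        "libxslt compiled": etree.LIBXSLT_COMPILED_VERSION,
    }


report = library_report()
for name, version in report.items():
    print(f"{name:<18}: {version}")
```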


Stefan


[lxml] Re: v4.8.0 breaking regression?

2022-02-22 Thread Stefan Behnel

Bob Kline wrote on 21.02.22 at 17:14:

got the expected (correct) output. This is on macOS 12.2.1 (M1).
Another interesting data point is that although
https://pypi.org/project/lxml/ claims that there are builds of 4.8.0
for Python 3.10, pip on this machine concluded that it needed to build
lxml from code. Perhaps an M1 thing? I will see what happens on Linux
and Windows.


The macOS wheels are not currently compatible with M1, so you end up with a 
local build instead.


Help with building more universal macOS wheels would be appreciated. I 
guess a switch to cibuildwheels would help, but I doubt that that's done 
lightly.


Stefan


[issue46786] embed, source, track, wbr HTML elements not considered empty

2022-02-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

Makes sense. That list hasn't been updated in 10 years.

--
versions:  -Python 3.10, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue46786>
___



[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value

2022-02-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

The question here is simply, which is considered more important: the default 
provided by the document, or the default provided by Python. I don't think it's 
a clear choice, but the way it is now does not seem unreasonable. Changing it 
would mean deliberate breakage of existing code that relies on the existing 
behaviour, and I do not see a reason to do that.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46798>
___



[issue24053] Define EXIT_SUCCESS and EXIT_FAILURE constants in sys

2022-02-18 Thread Stefan Behnel


Stefan Behnel  added the comment:

> Any reasons the PR still not merged?

There was dissent about whether these constants should be added or not. It 
doesn't help to merge a PR that is not expected to provide a benefit.

--

___
Python tracker 
<https://bugs.python.org/issue24053>
___



[lxml] Re: Undefined symbol error when using lxml from within OBS on Linux machine

2022-02-10 Thread Stefan Behnel

Hi,

Daniel Beiter wrote on 10.02.22 at 14:53:

For a project I am using OBS (Open Broadcaster Software) that provides Python
scripting capabilities to manipulate scenes, objects, etc. (
https://obsproject.com/wiki/Getting-Started-With-OBS-Scripting ). The API is in
C and wrapper functions for Python are built by SWIG (
https://obsproject.com/docs/scripting.html ). When loading a Python script from
within the OBS software containing nothing else but 'from lxml import etree', it
throws an import error because of an undefined symbol. Outside of OBS lxml works
as expected and no errors occur.
from lxml import etree

ImportError:
/home/[USER]/.local/lib/python3.8/site-packages/lxml/etree.cpython-38-x86_64-linux-gnu.so:
undefined symbol: PyExc_ImportError


Can you import other binary packages that you install with pip? E.g. pyyaml 
or numpy?


That symbol is part of Python. It's definitely there. The question is how 
OBS integrates with Python. Does the application (or the Python library, if 
it provides one) export the Python symbols? You can list the exported 
symbols of a library with "nm -D the_library.so". There should be loads of 
"Py..." symbols in there, including the one above.
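
Besides `nm -D`, a similar check can be attempted from inside the running interpreter. The sketch below uses `ctypes.pythonapi` to ask whether a C-API symbol is resolvable in the current process. It is only an approximation of the `nm` check, and since `ctypes` is itself a binary extension module, it may fail to import in exactly the broken embedding it is meant to diagnose:

```python
import ctypes


def python_symbol_visible(name):
    """Return True if the C-API symbol can be resolved in this process."""
    try:
        ctypes.c_void_p.in_dll(ctypes.pythonapi, name)
    except ValueError:
        # in_dll() raises ValueError when the symbol is not found.
        return False
    return True


visible = python_symbol_visible("PyExc_ImportError")
print("PyExc_ImportError visible:", visible)
```

In a regular interpreter this is expected to print `True`. Inside an embedding application the result depends on how that application loads and exports the Python library.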




When Cythonizing
src/lxml/etree.pyx warnings occur that the local variable 'args' is referenced
before assigned


That's unrelated. (And actually a false positive.)

Stefan


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)

2022-02-10 Thread Stefan Behnel

Petr Viktorin wrote on 10.02.22 at 11:22:
So, should there be a mechanism to set source/lineno/position on 
tracebacks/exceptions, rather than always requiring a frame for it?


There's "_PyTraceback_Add()" currently, but it's incomplete in terms of 
what Cython would need.


As it stands, Cython could make use of a function that accepted

- string object arguments for filename and function name
- (optionally) a 'globals' dict (or a reference to the current module)
- (optionally) a 'locals' mapping
- (optionally) a code object
- a C integer source line
- a C integer position, probably start and end lines and columns

to add a traceback level to the current exception.

I'm not sure about the code object since that's a rather heavy thing, but 
given that Cython needs to create code objects in order for its functions 
to be introspectible, that seems like a worthwhile option to have.


However, with the recent frame stack refactoring and frame object now being 
lazily created, according to


https://bugs.python.org/issue44032
https://bugs.python.org/issue44590

I guess Cython should rather integrate with the new stack frame 
infrastructure in general. That shifts the requirements a bit.


An API function like the above would then still be helpful for the reduced 
API compile mode, I guess. But as soon as Cython uses InterpreterFrame 
structs internally, it would no longer be helpful for the fast mode.


InterpreterFrame objects are based on byte code instructions again, which 
brings us back to co_positions.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YSP36JL5SRSPEG4X67G5RMWUWLVXSDC5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Andrew Svetlov wrote on 09.02.22 at 19:40:

Stefan, do you really need to emulate call stack with positions?
Could the __note__ string with generated Cython part of exception traceback
solve your needs (https://www.python.org/dev/peps/pep-0678/) ?


Thanks for the link, but I think it would be surprising for users if a 
traceback displayed some code positions differently than others, when all 
code lines refer to Python code.


Stefan

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BSDVX7MJFDZ6PFB7FG7Z3R4IO56FZ47T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Guido van Rossum wrote on 09.02.22 at 19:36:

On Wed, Feb 9, 2022 at 9:41 AM Pablo Galindo Salgado wrote:

On Wed, 9 Feb 2022 at 17:38, Stefan Behnel wrote:

Pablo Galindo Salgado wrote on 09.02.22 at 17:40:

Should there be a getter/setter for co_positions?


We consider the representation of co_positions private


Yes, and that's the issue.


I can only say that currently, I am not confident to expose such an API,
at least for co_positions, as the internal implementation is very likely to
heavily change and we want to have the possibility of changing it between
patch versions if required (to address bugs and other things like that).

>
> It might require a detailed API design proposal coming from outside
> CPython
> (e.g. from Cython) to get this to change. I imagine for co_positions in
> particular this would have to use a "builder" pattern.
>
> I am unclear on how this would work though, given that Cython generates C
> code, not CPython bytecode. How would the synthesized co_positions be
> used?
> Would Cython just generate a co_positions fragment at the moment an
> exception is raised, pointing at the .pyx file from which the code was
> generated?

So, what we currently do is to update the line number (which IIRC is really 
the start line number of the current function) on the current frame when an 
exception is raised, and the byte code offset to 0. That's a hack but shows 
the correct code line in the traceback. Probably conflicts with pdb, but 
there are still other issues with that anyway.


I remember looking into the old lnotab mapping at some point and trying to 
implement that with fake byte code offsets but never got it finished.


The idea is pretty simple, though. Instead of byte code offsets, we'd count 
our syntax tree nodes and just store the code position range of each syntax 
node at the "byte code offset" of the node's counter number. That's 
probably fairly easy to do in C code, maybe even with a statically 
allocated data structure. Then, instead of setting the frame function's 
line number, we'd set the frame's byte code instruction counter to the 
number of the failing syntax node, and CPython would retrieve the code 
position from that offset.
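
Expressed in Python rather than C, the data structure could look roughly like this. It is only a sketch of the idea, not Cython's implementation: the type name `CodeRange`, the table contents and the node numbering are all invented for illustration.

```python
from typing import NamedTuple


class CodeRange(NamedTuple):
    """Source range of one syntax tree node."""
    start_line: int
    end_line: int
    start_col: int
    end_col: int


# Hypothetical table emitted at compile time: index = node counter,
# playing the role that a byte code offset plays in CPython.
NODE_POSITIONS = (
    CodeRange(1, 1, 0, 9),    # node 0
    CodeRange(2, 2, 4, 15),   # node 1
    CodeRange(2, 3, 4, 20),   # node 2
)


def position_of(node_number):
    """Look up the source range stored for a syntax node number."""
    return NODE_POSITIONS[node_number]


# On an exception, the "instruction counter" would be set to the number
# of the failing node, and the traceback code would do this lookup.
failing_node = 1
print(position_of(failing_node))
```

Because the table is constant, it could live in static storage in generated C code, which is the appeal mentioned in the message.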


That sounds simple enough, probably simpler than any API usage – but 
depends on implementation details.


Especially the idea of storing all this statically in the data segment of 
the shared library sounds very tempting.


Stefan

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/GAJFB6ABFYXF3RFXFDQ3YUZD23FMXPEY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)

2022-02-09 Thread Stefan Behnel

Pablo Galindo Salgado wrote on 09.02.22 at 17:40:

Should there be a getter/setter for co_positions?


We consider the representation of co_positions private


Yes, and that's the issue.



so we don't want (for now) to add
getters/setters. If you want to get the position of an instruction, you can
use PyCode_Addr2Location


What Cython needs is the other direction. How can we provide the current 
source position range for a given piece of code to an exception?


As it stands, the way to do this is to copy the implementation details of 
CPython into Cython in order to let it expose the specific data structures 
that CPython uses for its internal representation of code positions.


I would prefer using an API instead that allows exposing this mapping 
directly to CPython's traceback handling, rather than having to emulate 
byte code positions. While that would probably be quite doable, it's far 
from a nice interface for something that is not based on byte code.


And that's not just a Cython issue. The same applies to Domain Specific 
Languages or other programming languages that integrate with Python and 
want to show users code positions for their source code.


Stefan

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VQSWX6MFKIA3RYPSX7O6RTVC422LTJH4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-08 Thread Stefan Behnel

Inada Naoki wrote on 08.02.22 at 06:15:

On Tue, Feb 8, 2022 at 1:47 PM Guido van Rossum wrote:


Thanks for trying it! I'm curious why it would be slower (perhaps less 
locality? perhaps the ...Id... APIs have some other trick up their sleeve?) but 
since it's also messier and less backwards compatible than just leaving 
_Py_IDENTIFIER alone and just not using it, I'd say let's not spend more time 
on that alternative and just focus on the two other horses still in the race: 
immortal objects or what you have now.



I think it's because statically allocated strings are not interned.


That would explain such a difference.



I think deepfreeze should stop using statically allocated strings for
interned strings too.


… or consider the statically allocated strings the interned string value. 
Unless another one already exists, but that shouldn't be the case for 
CPython internal strings.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5NE7EI3TVW4C3ZZI6LO5HNPIZRQNPMHG/


[issue45948] Unexpected instantiation behavior for xml.etree.ElementTree.XMLParser(target=None)

2022-02-08 Thread Stefan Behnel


Stefan Behnel  added the comment:

This is a backwards incompatible change, but unlikely to have a wide impact.

I was wondering for a second whether this makes the change in the right direction 
because it's not unreasonable to pass "None" for saying "I want no target". But 
it's documented this way and lxml does it the same, so I agree that this should 
be changed to make "None" behave the same as no argument.
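
The documented semantics can be sketched without relying on any particular interpreter version: a small wrapper that treats an explicit None like an omitted target. The helper names make_parser() and parse_root() are hypothetical and only serve the illustration; they are not part of ElementTree.

```python
import xml.etree.ElementTree as ET


def make_parser(target=None):
    """Return an XMLParser, treating target=None like an omitted target."""
    if target is None:
        # Documented default: a standard TreeBuilder collects the elements.
        target = ET.TreeBuilder()
    return ET.XMLParser(target=target)


def parse_root(parser, data):
    """Feed a complete document and return what the target's close() yields."""
    parser.feed(data)
    return parser.close()


root_default = parse_root(make_parser(), "<a><b/></a>")
root_none = parse_root(make_parser(target=None), "<a><b/></a>")
```

Both calls end up with a TreeBuilder target, so both return the root element.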

--

___
Python tracker 
<https://bugs.python.org/issue45948>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Eric Snow schrieb am 04.02.22 um 17:35:

On Fri, Feb 4, 2022 at 8:21 AM Stefan Behnel wrote:

Correct. We (intentionally) have our own way to intern strings and do not
depend on CPython's identifier framework.


You're talking about __Pyx_StringTabEntry (and __Pyx_InitString())?


Yes, that's what we generate. The C code parsing is done here:

https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L531-L550

The deduplication is a bit complex on our side because it needs to handle 
Python source encodings, and also distinguishes between identifiers (that 
become 'str' in Py2), plain Unicode strings and byte strings. You don't 
need most of that for plain C code. But it's done here:


https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L1009-L1088

And then there's a whole bunch of code that helps in getting Unicode 
character code points and arbitrary byte values in very long strings pushed 
through C compilers, while keeping it mostly readable for interested users. :)


https://github.com/cython/cython/blob/master/Cython/Compiler/StringEncoding.py

You probably don't need that either, as long as you only deal with ASCII 
strings.


Anyway, have fun. Feel free to ask if I can help.

Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QHJBAKIQUKFPIM6GZ7DYNJF3HDMDQQUH/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Ronald Oussoren via Python-Dev schrieb am 03.02.22 um 14:46:

On 2 Feb 2022, at 23:41, Eric Snow wrote:
* a little less convenient: adding a global string requires modifying
a separate file from the one where you actually want to use the string
* strings can get "orphaned" (I'm planning on checking in CI)
* some strings may never get used for any given ./python invocation
(not that big a difference though)


The first two cons can probably be fixed by adding some indirection, with some
markers at the place of use and a script that uses those to generate the
C definitions.

Although my gut feeling is that adding the CI check you mention is good
enough and adding the tooling for generating code isn’t worth the additional
complexity.


It's what we do in Cython, and it works really well there. It's very 
straightforward, you just write something like


PYUNICODE("some text here")
PYIDENT("somename")

in your C code and Cython creates a deduplicated global string table from 
them and replaces the string constants with the corresponding global 
variables. (We have two different names because an identifier in Py2 is 
'str', not 'unicode'.)


Now, the thing with CPython is that the C sources where the replacement 
would take place are VCS controlled. And a script that replaces the 
identifiers would have to somehow make sure that the new references do not 
get renamed, which would lead to non-local changes when strings are added.


What you could try is to number the identifiers, i.e. use a macro like

_Py_STR(123, "some text here")

where you manually add a new identifier as

_Py_STR("some text here")

and the number is filled in automatically by a script that finds all of 
them, deduplicates, and adds new identifiers at the end, adding 1 to the 
maximum number that it finds. That makes sure that identifiers that already 
have an ID number will not be touched, deleted strings disappear 
automatically, and non-local changes are prevented.


Defining the _Py_STR() macro as

   #define _Py_STR(id, text)  (_Py_global_string_table[id])

or

   #define _Py_STR(id, text)  (_Py_global_string_table##id)

would also give you a compile error if someone forgets to run the script.
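
The numbering script described above could look roughly like the following. This is a minimal sketch under stated assumptions, not an existing CPython tool: the function name number_strings() and the regular expression are made up for illustration, and the sketch works on a single source string instead of a set of files.

```python
import re

# Matches _Py_STR("text") and _Py_STR(123, "text"); escaped quotes are allowed.
_PY_STR = re.compile(r'_Py_STR\(\s*(?:(\d+)\s*,\s*)?("(?:[^"\\]|\\.)*")\s*\)')


def number_strings(source):
    """Fill in missing ID numbers, keeping every existing ID untouched."""
    ids = {}  # string literal (including its quotes) -> ID
    for match in _PY_STR.finditer(source):
        number, literal = match.groups()
        if number is not None:
            ids.setdefault(literal, int(number))
    # New identifiers go at the end: one more than the maximum number found.
    next_id = max(ids.values(), default=-1) + 1

    def replace(match):
        nonlocal next_id
        number, literal = match.groups()
        if literal not in ids:
            ids[literal] = next_id
            next_id += 1
        if number is not None:
            return match.group(0)  # never touch an already numbered use
        return "_Py_STR(%d, %s)" % (ids[literal], literal)

    return _PY_STR.sub(replace, source), ids


before = 'a = _Py_STR(0, "x"); b = _Py_STR("y"); c = _Py_STR("x"); d = _Py_STR(4, "z");'
after, table = number_strings(before)
```

Strings that no longer occur in the source simply never make it into the table, and running the script a second time changes nothing.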

Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LD3JM2NQ5ZUZDK63RH4IVZPCZ7HC4X3G/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-04 Thread Stefan Behnel

Petr Viktorin schrieb am 03.02.22 um 13:47:

On 02. 02. 22 11:50, Stefan Behnel wrote:
Maybe we should advertise the two modes more. And make sure that both 
work. There are certainly issues with the current state of the "limited 
API" implementation, but that just needs work and testing.


I wonder if it can be renamed? "Limited API" has a specific meaning 
since PEP 384, and using it for the public API is adding to the general 
confusion in this area :(


I was more referring to it as an *existing* compilation mode of Cython that 
avoids the usage of CPython implementation details. The fact that the 
implementation is incomplete just means that we spill over into non-limited 
API code when no limited API is available for a certain feature. That will 
usually be public API code, unless that is really not available either.


One recent example is the new error locations in tracebacks, where PEP 657 
explicitly lists the new "co_positions" field in code objects as an 
implementation detail of CPython. If we want to implement this in Cython, 
then there is no other way than to copy these implementation details pretty 
much verbatim from CPython and to depend on them.


https://www.python.org/dev/peps/pep-0657/

In this specific case, we're lucky that this can be considered an entirely 
optional feature that we can separately disable when users request "public 
API" mode (let's call it that). Not sure if that's what users want, though.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A55HYBIFBOTAX5IB4YUYWUHI3IDLRD2F/


[Python-Dev] Re: Moving away from _Py_IDENTIFIER().

2022-02-04 Thread Stefan Behnel

Victor Stinner schrieb am 03.02.22 um 22:46:

Oh right, Cython seems to be a false positive.

A code search found 3 references to __Pyx_PyObject_LookupSpecial():

PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz:
Cython-0.29.26/Cython/Compiler/ExprNodes.py: lookup_func_name =
'__Pyx_PyObject_LookupSpecial'
PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz:
Cython-0.29.26/Cython/Compiler/Nodes.py: code.putln("%s =
__Pyx_PyObject_LookupSpecial(%s, %s); %s" % (
PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz:
Cython-0.29.26/Cython/Utility/ObjectHandling.c: static CYTHON_INLINE
PyObject* __Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject*
attr_name) {

Oh, that's not "_PyObject_LookupSpecial()", it doesn't use the
_Py_Identifier type:

static CYTHON_INLINE PyObject*
__Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject* attr_name)
{ ... }


Correct. We (intentionally) have our own way to intern strings and do not 
depend on CPython's identifier framework.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4ATP4FSVRNI5CLAJDN43QRDH5IHW7BW2/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Victor Stinner schrieb am 02.02.22 um 23:23:

On Wed, Feb 2, 2022 at 3:54 PM Stefan Behnel wrote:

So people using stable Python versions like Python 3.10 would not need
Cython, but people testing the "next Python" (Python 3.11) would not
have to manually remove generated C code.


That sounds like an environment variable might help?


Something like CYTHON_FORCE_REGEN=1 would be great :-)


https://github.com/cython/cython/commit/b859cf2bd72d525a724149a6e552abecf9cd9d89

Note that this only applies when cythonize() is actually called. Some 
setup.py scripts may not do that unless requested to.
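
A setup.py that only cythonizes on request could honour the same variable itself. The sketch below assumes that only the exact value "1" switches regeneration on; the helper name force_regen_requested() is made up for illustration.

```python
import os


def force_regen_requested(environ=os.environ):
    """Return True if the user asked for regeneration of the C files."""
    # Assumption: only the exact value "1" switches the behaviour on.
    return environ.get("CYTHON_FORCE_REGEN") == "1"


# In a setup.py, the result could be passed on explicitly, e.g.
#     cythonize(modules, force=force_regen_requested())
```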




My use case is to use a project on the "next Python" version (the main
branch) when the project contains outdated generated C code, whereas I
have a more recent Cython version installed.


That use case would probably be covered by the Cython version check now, in 
case that stays in (the decision is pending user feedback).


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N6R5BE4GVNYRUTOET5QRQ5N2ZCJYZC7X/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Ronald Oussoren via Python-Dev schrieb am 02.02.22 um 16:44:

On 2 Feb 2022, at 11:50, Stefan Behnel wrote:
Petr Viktorin schrieb am 02.02.22 um 10:22:

- "normal" public API, covered by the backwards compatibility policy (users 
need to recompile for every minor release, and watch for deprecation warnings)


That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it stands. I can 
see that being a nice feature that just deserves a more suitable name. (The name was chosen because 
it was meant to also internally define "Py_LIMITED_API" at some point. Not sure if it 
will ever do that.)



- internal API (underscore-prefixed names, `internal` headers, things 
documented as private)
AFAIK, only the last one is causing trouble here.


Yeah, and that's the current default mode on CPython.


Is it possible to automatically pick a different default version when building 
with a too new CPython version?  That way projects can at least be used and 
tested with pre-releases of CPython, although possibly with less performance.


As I already wrote elsewhere, that is making the assumption (or at least 
optimising for the case) that a new CPython version always breaks Cython. 
And it has the drawback that we'd get less feedback on the "normal" 
integration and may thus end up noticing problems only later in the CPython 
development cycle.


I don't think this really solves a problem.

In any case, before we start playing with the default settings, I'd rather 
let users see what *they* can make of the available options. Then we can 
still come back and see which use cases there are and how to support them 
better.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2SIGLMW4HNF5BDF2DTFZFXCHNSR4VAGB/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Petr Viktorin schrieb am 02.02.22 um 10:22:
Moving off the internal (unstable) API would be great, but I don't think 
Cython needs to move all the way to the limited API.

There are three "levels" in the C API:

- limited API, with long-term ABI compatibility guarantees


That's what "-DCYTHON_LIMITED_API -DPy_LIMITED_API=..." is supposed to do, 
which currently fails for much if not most code.



- "normal" public API, covered by the backwards compatibility policy (users 
need to recompile for every minor release, and watch for deprecation warnings)


That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it 
stands. I can see that being a nice feature that just deserves a more 
suitable name. (The name was chosen because it was meant to also internally 
define "Py_LIMITED_API" at some point. Not sure if it will ever do that.)



- internal API (underscore-prefixed names, `internal` headers, things 
documented as private)


AFAIK, only the last one is causing trouble here.


Yeah, and that's the current default mode on CPython.

Maybe we should advertise the two modes more. And make sure that both work. 
There are certainly issues with the current state of the "limited API" 
implementation, but that just needs work and testing.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ESEPW36K3PH4RM7OFVKAOE4QMBI2WYVU/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-02 Thread Stefan Behnel

Victor Stinner schrieb am 02.02.22 um 11:35:

I wish that there would be a 3rd option: ship C code generated by
Cython *but* run Cython if this C code "looks" outdated, for example
if building the C code fails with a compiler error.


So, one thing I did yesterday was to make sure that .c files get 
regenerated when a different Cython version is used at build time than what 
was used to generate them originally.


Thinking about this some more now, I'm no longer sure that this is really a 
good idea, because it can lead to "random" build failures when a package 
does not pin its Cython version and a newer (or, probably worse, older) one 
happens to be installed at build time.
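
The version comparison behind such a check can be sketched as follows. This is not the code that went into Cython; it assumes that a generated file starts with a comment of the form "/* Generated by Cython 0.29.26 */", and the function names are made up for illustration.

```python
import re

# Assumed header format, e.g. "/* Generated by Cython 0.29.26 */".
_HEADER = re.compile(r"/\* Generated by Cython (\S+) \*/")


def generated_with(c_source):
    """Return the Cython version named in the first line, or None."""
    first_line = c_source.split("\n", 1)[0]
    match = _HEADER.match(first_line)
    return match.group(1) if match else None


def needs_regeneration(c_source, installed_version):
    """True if the file was generated by a different (or unknown) Cython."""
    return generated_with(c_source) != installed_version


header = "/* Generated by Cython 0.29.26 */\n#include <Python.h>\n"
```

Note that the comparison is deliberately "different", not "older", which is exactly why an unpinned Cython version at build time can trigger a regeneration in either direction.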


Not sure how to best deal with this. I'm open to suggestions, although this 
might be the wrong forum.


Let's discuss it in a ticket:

https://github.com/cython/cython/issues/4611

Note that what you propose sounds more like a setuptools feature than a 
Cython feature, though.




So people using stable Python versions like Python 3.10 would not need
Cython, but people testing the "next Python" (Python 3.11) would not
have to manually remove generated C code.


That sounds like an environment variable might help?

I don't really want to add something like a "last supported CPython 
version". There is no guarantee that the code breaks between CPython 
versions, so that would just introduce an artificial support blocker.




In Fedora RPM packages of Python projects, we have to force manually
running Cython. For example, the numpy package does: "rm PKG-INFO"
with the comment: "Force re-cythonization (ifed for PKG-INFO presence
in setup.py)".
https://src.fedoraproject.org/rpms/numpy/blob/rawhide/f/numpy.spec#_107

In my pythonci project, I use a worse hack, I search for generated C
files and remove them manually with this shell command:

 rm -f -v $(grep -rl '/\* Generated by Cython') PKG-INFO

This command searches for the pattern "/* Generated by Cython".


Right. Hacks like these are just awful. There must be a better way.
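
For reference, the same search can be written in Python so that it only reports files instead of deleting them. This is a sketch, not a recommended workflow: unlike the grep command above, it looks only at .c and .cpp files and only at their first line, and the function name find_generated_c_files() is made up for illustration.

```python
import os

MARKER = "/* Generated by Cython"


def find_generated_c_files(root):
    """Yield paths of .c/.cpp files below root whose first line has the marker."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".c", ".cpp")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                if f.readline().startswith(MARKER):
                    yield path
```

Something like list(find_generated_c_files(".")) then gives the candidates for regeneration.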

Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V76GA5DRWPEJ7PRBSPRQX335WARZLUHJ/


[lxml] Re: Missing PDFs from lxml.de/lxml-VERSION.pdf?

2022-02-02 Thread Stefan Behnel

Hi,

Thomas Schraitle schrieb am 02.02.22 um 08:20:

I don't remember exactly when, but some time ago it was possible to download the
latest PDF from lxml.de/lxml-VERSION.pdf. This worked quite well, but now I get
a 404.

Is there a replacement that I can use? If not, would it be possible to build
the PDF from the sources and upload it to the assets on a GitHub release?


Yeah, that was based on LaTeX PDF generation, which broke on my machine at 
some point and hasn't been repaired since.


PR welcome, especially if it gets the machinery running as a GitHub Actions 
job in the wheel build workflow.


Stefan


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Guido van Rossum schrieb am 02.02.22 um 01:43:

It may be hard to imagine if you're working on Cython, which only exists
because of performance needs, but there are other things that people want
to test with the upcoming CPython release in addition to performance


I know. Cython (and originally Pyrex) has come a long way from a tool to 
get stuff done to a dependency that a large number of packages depend on. 
Maintainer decisions these days are quite different from those 10 years 
ago. Let alone 20.


Let's just try to keep things working in general, and fix stuff that needs 
to be broken.




On Tue, Feb 1, 2022 at 4:14 PM Stefan Behnel wrote:

I'd rather make it more obvious to users what their intentions are. And
there is already a way to do that – the Limited API. (and similarly, HPy)


Your grammar confuses me. Do you want users to be clearer in expressing
their intentions?


Erm, sort of. They should be able to choose and express what they prefer, 
in a simple way.




For Cython, support for the Limited API is still work in progress, although
many things are in place already. Getting it to work completely would give
users a simple way to decide whether they want to opt in for a) speed,
lots of wheels and adaptations for each CPython version, or b) less
performance, less hassle.


But until that work is complete, we're stuck with the unlimited API, right?
And by its own statements in a recent post here, HPy is still not ready for
all use cases, so it's also still a pipe dream.


Yes. HPy is certainly far from ready for anything real, but even for the 
Limited API, it's still unclear whether it's actually complete enough to 
cover Cython's needs. Basically, the API that Cython uses must really be 
able to implement CPython on top of itself. And at the same time interact 
not with the reimplementation but with the underlying original, at the C 
level. The C-API, and especially the Limited API, were never really meant 
for that.




As it looks now, that switch can be done after the code generation, by
defining a simple C define in their build script. That also makes both
modes easily comparable. I think that is as good as it can get.


Do you have specific instructions for package developers here? I could
imagine that the scikit-learn maintainer (sorry to pick on you guys :-)
might not know where to start with this if until now they've always been
able to rely on either numpy wheels or building everything from source with
default settings.


It's not well documented yet, since the implementation isn't complete, and 
so, a bunch of things simply won't work. I don't remember if the buffer 
protocol is part of the Limited API by now, but last I checked it was still 
missing, so the scikit-learn (or NumPy) people would be fairly unhappy with 
the current state of affairs.


But it's mostly just passing "-DCYTHON_LIMITED_API" to your C compiler. 
That's the part that will still work but won't do (yet) what you think. 
Because then, you currently also have to define "-DPy_LIMITED_API=..." and 
that's when your C compiler will get angry with you.
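
In a build script, the choice between the modes boils down to a list of C defines. The following is a minimal sketch: the mode names ("default", "public", "limited"), the helper name api_mode_macros() and the example Py_LIMITED_API value are assumptions for illustration only, and, as said above, the "limited" combination is not expected to compile for most code yet.

```python
def api_mode_macros(mode, limited_api_version=0x03090000):
    """Return (name, value) pairs for the chosen compilation mode."""
    if mode == "default":
        # Fast mode: Cython may use CPython implementation details.
        return []
    macros = [("CYTHON_LIMITED_API", "1")]
    if mode == "public":
        # CYTHON_LIMITED_API by itself, without Py_LIMITED_API.
        return macros
    if mode == "limited":
        # Hypothetical target version; also asks CPython to hide non-limited API.
        macros.append(("Py_LIMITED_API", "0x%08X" % limited_api_version))
        return macros
    raise ValueError("unknown mode: %r" % (mode,))


# With setuptools, the result would go into the extension definition, e.g.
#     Extension("pkg.mod", ["pkg/mod.c"], define_macros=api_mode_macros("public"))
```

Since the defines are applied when the C code is compiled, the same generated .c file can be built in either mode.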


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2UFG7IPKR77HQG36BZAUEUDJJKIGBSLE/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Thomas Caswell schrieb am 01.02.22 um 23:15:

I think it would be better to discourage projects from including the output
of cython in their sdists.  They should either have cython as a build-time
requirement or provide built wheels (which are specific to a platform and
CPython version).  The middle ground of not expecting the user to have
cython while expecting them to have a working C compiler is a very narrow
case and I think asking those users to install cython is worth the forward
compatibility for Python versions you get by requiring people installing
from source to re-cythonize.


I agree. Shipping the generated C sources was a very good choice as long as 
CPython's C-API was very stable and getting a build time dependency safely 
installed on user side was very difficult.


These days, it's the opposite way.

Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KTWDJGHPQW7AIKDQQYV4IFHAKQZVXACL/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Guido van Rossum schrieb am 02.02.22 um 00:21:

On Tue, Feb 1, 2022 at 3:07 David wrote:

Greg Ewing wrote:

To address this there could be an option to choose between
"compatible code" and "fast code", with the former restricting
itself to the stable API.


To some extent, that exists at the moment - many of the real abuses of the
CPython internals can be controlled by setting C defines. For the
particular feature that caused this discussion the majority of the uses can
be turned off by defining CYTHON_USE_EXC_INFO_STACK=0 and
CYTHON_FAST_THREAD_STATE=0. (There's still a few uses relating to
coroutines, but those two flags are sufficient to get Cython to build
itself and Numpy on Python 3.11a4).

Obviously it could still be better. But the desire to support PyPy (and
the beginnings of the limited API) mean that Cython does actually have
alternate "clean" code-paths for a lot of cases.


Hm... So maybe the issue is either with Cython's default settings (perhaps
traditionally it defaults to "as fast as possible but relies on internal
APIs a lot"?) or with the Cython settings selected by default by projects
*using* Cython?

I wonder if a solution during CPython's rocky alpha release cycle could be
to default (either in Cython or in projects using it) to the "not quite as
fast but not relying on a lot of internal APIs" mode, and to switch to
Cython's faster mode only once (a) beta is entered and (b) Cython has been
fixed to work with that beta?


This seems tempting – with the drawback that it would make Cython modules 
less comparable between final and alpha/beta CPython releases. So users 
would start reporting ghost performance regressions because it 
(understandably) feels important to them that the slow-down they witness 
needs to be resolved before the final release, and they just won't know 
that this will happen automatically, triggered by the version switch. :)


Feels a bit like car manufacturers who switch their exhaust cleaners on and 
off based on the test mode detection.


More importantly, though, we'd get fewer bug reports during the alpha/beta 
cycle ourselves, because things may look like they work but can still stop 
working when we switch back to fast mode.


I'd rather make it more obvious to users what their intentions are. And 
there is already a way to do that – the Limited API. (and similarly, HPy)


For Cython, support for the Limited API is still work in progress, although 
many things are in place already. Getting it to work completely would give 
users a simple way to decide whether they want to opt in for a) speed, lots 
of wheels and adaptations for each CPython version, or b) less performance, 
less hassle.


As it looks now, that switch can be done after the code generation, by 
defining a simple C define in their build script. That also makes both 
modes easily comparable. I think that is as good as it can get.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FXSNX7UCQWNXXC7OWG4LBLILAYXQEOUB/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Hi Irit,

Irit Katriel via Python-Dev schrieb am 01.02.22 um 23:04:

There are two separate issues here. One is the timing of committing changes into
cython, and the other is the process by which the cython devs learn about
cpython development.

On the first issue, you wrote:

I'm reluctant to work on adapting Cython during alphas, because it
happened more than once that incompatible changes in CPython were rolled
back or modified again during alpha, beta and rc phases. That means more
work for me and the Cython project, and its users. Code that Cython users
generate and release on their side with a release version of Cython will
then be broken, and sometimes even more broken than with an older Cython
release.


I saw in your patch that you make changes such that they impact only the
new cpython version. So for old versions the generated code should not be
broken. Surely you don't guarantee that cython code generated for an alpha
version of cpython will work on later versions as well?  Users who generate
code for an alpha version should regenerate it for the next alpha and for
beta, right?


I'd just like to note that we are talking about three different projects 
and dependency levels here (CPython, Cython and a project that uses 
Cython), all three have different release cycles, and not all projects can 
afford to go through a new release with a new Cython version regularly or 
on the "emergency" event of a new CPython release. Some even don't provide 
wheels and require their users to do a source build on their side. Often 
with a fixed Cython version dependency, or even with pre-generated and 
shipped C sources, which makes it harder for the end users to upgrade 
Cython as a work-around.


But at least it should be as easy for the maintainers as updating their 
Cython version and pushing a new release. In most cases. And things are 
also becoming easier these days with improvements in the packaging 
ecosystem. It can just take a bit until everyone has had the chance to 
upgrade along the food chain.




On the second issue:


I don't have the capacity to follow all relevant changes in CPython,
incompatible or not.


We get that, and this is why we're asking to work with you on cython updates
so that this will be easier for all of us. There are a number of cpython
core devs who would like to help with cython maintenance. We realise how important and
thinly resourced cython is, and we want to reduce your maintenance burden.
With better communication we could find ways to do that.


I'm sure we will. Thanks for your help. It is warmly appreciated.



Returning to the issue that started this thread - how do you suggest we
proceed with the exc_info change?


I'm not done sorting out the options yet. Regarding CPython, I think it's 
best to keep the current changes in there. It should be easier for us to 
continue from where we are now than to adapt again to a revert in CPython.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BHIQL4P6F7OPMCAP6U24XEZUPQKI62UT/


[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Greg Ewing schrieb am 01.02.22 um 23:33:

On 2/02/22 8:48 am, Guido van Rossum wrote:
It seems to me that a big part of the problem is that Cython feels 
entitled to use arbitrary CPython internals.


I think the reason for this is that Cython is trying to be two
things at once: (1) an interface between Python and C, (2) a
compiler that turns Python code into fast C code.

To address this there could be an option to choose between
"compatible code" and "fast code", with the former restricting
itself to the stable API.


There is even more than such an option. We use a relatively large set of 
feature flags that allow us to turn the usage of certain implementation 
details of the C-API on and off. We use this to adapt to different Python 
C-API implementations (currently CPython, PyPy, GraalPython and the Limited 
C-API), although with different levels of support and reliability.


Here's the complete list of feature sets for the different targets:

https://github.com/cython/cython/blob/5a76c404c803601b6941525cb8ec8096ddb10356/Cython/Utility/ModuleSetupCode.c#L56-L311

This can also be used to enable and disable certain dependencies on CPython 
implementation details, e.g. PyList, PyLong or PyUnicode, but also type 
specs versus PyTypeObject structs.


Most of these feature flags can be disabled by users. There is no hard 
guarantee that this always works, because it's impossible to test all 
combinations, and then there are bugs as well, but most of the flags are 
independent, which should usually allow disabling them independently.
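
As a sketch of how a user might switch individual flags off, the helper below turns flag settings into -D options for the C compiler. The helper name feature_flag_args() is made up for illustration; CYTHON_USE_EXC_INFO_STACK and CYTHON_FAST_THREAD_STATE are the two flags named elsewhere in this thread, and whether a given combination actually builds is not guaranteed, as noted above.

```python
def feature_flag_args(**flags):
    """Turn CYTHON_* feature flags into -D options for the C compiler."""
    args = []
    for name, enabled in sorted(flags.items()):
        if not name.startswith("CYTHON_"):
            raise ValueError("not a Cython feature flag: %s" % name)
        args.append("-D%s=%d" % (name, 1 if enabled else 0))
    return args


cflags = feature_flag_args(
    CYTHON_USE_EXC_INFO_STACK=False,
    CYTHON_FAST_THREAD_STATE=False,
)
```

The resulting list can be joined into CFLAGS or passed as extra compile arguments by the build script.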


So, one of the tools that we have up our sleeves when it comes to 
supporting new CPython versions is also to selectively disable the 
dependency on a certain C-API feature that changed, at least until we have 
a way to adapt to the change itself.


In the specific case of the "exc_info" changes, however, that didn't quite 
work, because that change was really not anticipated at that level of 
impact. But there is an implementation for Cython 3.0 alpha now, and we'll 
eventually have a legacy 0.29.x release out that will also adapt in one way 
or another. Just takes a bit more time.


Stefan

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QPAWLCS2FINPLVSDFFQCMVIELXETKQ3W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python

2022-02-01 Thread Stefan Behnel

Christian Heimes schrieb am 01.02.22 um 16:42:

On 01/02/2022 16.08, Victor Stinner wrote:

I would prefer to introduce C API incompatible changes differently:
first fix Cython, and *then* introduce the change.

- (1) Propose a Cython PR and get it merged
- (2) Wait until a new Cython version is released
- (3) If possible, wait until numpy is released with regenerated Cython code
- (4) Introduce the incompatible change in Python

Note: Fedora doesn't need (3) since we always regenerate Cython code in 
numpy.


this is a reasonable request for beta releases, but IMHO it is not feasible 
for alphas. During alphas we want to innovate fast and play around. Your 
proposal would slow down innovation and impose additional burden on core 
developers.


Let's at least try not to run into a catch-22.

I'm reluctant to work on adapting Cython during alphas, because it 
happened more than once that incompatible changes in CPython were rolled 
back or modified again during alpha, beta and rc phases. That means more 
work for me and the Cython project, and its users. Code that Cython users 
generate and release on their side with a release version of Cython will 
then be broken, and sometimes even more broken than with an older Cython 
release.


But Victor is right, OTOH, that the longer we wait with adapting Cython, 
the longer users have to wait with testing their code in upcoming CPython 
versions, and the higher the chance of post-beta and post-rc rollbacks and 
changes in CPython.


I don't have the capacity to follow all relevant changes in CPython, 
incompatible or not. Even a Cython CI breakage of the CPython-dev job 
doesn't always mean that there is something to do on our side, so that job 
is silenced to avoid breaking our own project workflows and is only looked 
at irregularly. Additionally, since Cython is a crucial part of the Python 
ecosystem, breakage of Cython by CPython sometimes stalls the build 
pipelines of CI images, which means that new CPython dev versions don't 
reach the CI servers for a while, during which the breakage will go even 
more unnoticed.


I think you should generally appreciate Cython (and the few other C-API 
abstraction tools) as an opportunity to get a large number of extensions 
adapted to CPython's now faster development all at once. The quicker these 
tools adapt, the quicker you can get user feedback on your own changes, and 
the more time you have to validate and refine them during the alpha and 
beta cycles.


You can even see the adaptation as a way to validate your own changes in 
the real world. It's cool to write new code, but difficult to find out 
whether it behaves the way you want for the intended audience. So – be part 
of your own audience.


Stefan

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LJDI74V4IOHPCMQUEGH6VIQWHLM3MADG/


[lxml] Re: Restricting third party access for lxml github org?

2022-01-25 Thread Stefan Behnel

Hi Martijn!

Martijn Faassen schrieb am 25.01.22 um 11:11:

Hey lxmlers,

I recently found out that older organizations by default grant third party
access to any github OAuth application that a user has enabled. This means
that if any of such applications is compromised, this organization is open
for attack. I therefore would recommend we go amend that here:

https://github.com/organizations/lxml/settings/oauth_application_policy

I don't think it has huge consequences as you can selectively enable those
applications you trust after that, but I figured people using this org
should be aware before it's enabled.


Good call. I enabled that setting. If anything stops working unexpectedly, 
that was me. :)


Stefan


Re: [Cython] question on submitting a possibly massive bug report

2022-01-25 Thread Stefan Behnel

website.reader via cython-devel schrieb am 25.01.22 um 01:09:

I am not familiar with Cython, but have spent a few weeks looking at compiler warnings 
posted when the mathematical package called "sage v9.4" is compiled, which 
takes several hours to build, since hundreds of code units are involved in this massive 
build project.

I logged 341 errors during the cythonizing part of the compile run, and found 
110 code units (C packages) which I was able to fix so that the recompile would 
have no warnings. The warnings were legitimate.

There are 4 categories of these warnings.

1. Using an uninitialized variable with an unknown value
2. Comparing signed and unsigned variables
3. Discarding a const specifier to a variable upon use elsewhere in the code
4. Coercing a pointer to a variable of the wrong type (or vice versa)

I did speak to one knowledgeable person about this, but my question is this:

a) do I submit 341 bug reports covering all the warnings?
b) since 110 code units were affected do I file 110 bug reports, one for each code 
unit?
c) do I submit just one bug report for each of the 4 categories above, thus 
just 4 bug reports?
d) do I just list all the warning messages obtained from the massive build run 
so everyone can get some idea of the problems being faced?

I did look at the C code and the pyx code generating it and Cython is definitely 
the origin of these issues.

Since I am NOT yet familiar with Cython, at the moment I am at a 
loss to write little tiny programs illustrating the problem.


Cython is a code generator, so there probably are only a few places where a 
larger bunch of issues originate from. You already grouped them by type 
(1-4), and those likely belong to one cause (or a few related causes). Just 
open one issue for each of the four. Then please list a few source code 
examples in each, together with the C code that Cython generated for them, 
and the warning that the C compiler gave you.
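
To pick representative examples per category, it may help to count the 
warnings by the compiler option that triggered them. The following is only 
a sketch: it assumes GCC/Clang style log lines that end in "[-W...]", and 
the sample log (file names, messages) is made up for illustration.

```python
import re
from collections import Counter

# GCC and Clang append the responsible option to each warning,
# e.g. "[-Wsign-compare]". Lines without such a suffix are ignored.
WARNING_FLAG = re.compile(r"warning: .*\[-W([\w=-]+)\]")


def count_warnings(log_lines):
    """Count compiler warnings per -W option in an iterable of log lines."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_FLAG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log excerpt, not taken from a real build.
SAMPLE_LOG = [
    "mod_a.c:101:9: warning: 'x' may be used uninitialized [-Wmaybe-uninitialized]",
    "mod_a.c:230:17: warning: comparison of integer expressions of different signedness [-Wsign-compare]",
    "mod_b.c:57:12: warning: passing argument 1 of 'f' discards 'const' qualifier [-Wdiscarded-qualifiers]",
    "mod_b.c:88:5: warning: comparison of integer expressions of different signedness [-Wsign-compare]",
    "mod_c.c:12:3: note: declared here",
]

for flag, number in count_warnings(SAMPLE_LOG).most_common():
    print(f"{number:4d}  -W{flag}")
```

For a real build, the log file would be passed in place of SAMPLE_LOG, 
e.g. count_warnings(open("build.log")).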


If we later find that not all warnings can be resolved this way, we'll see 
what we can do about the rest.


Please make sure to provide the Cython version that you are using. The 
latest release is 3.0.0a10 (and the main development goes there), although 
there is a legacy stable version series 0.29.x that most projects are still 
using and where we will continue to fix bugs for another while. But new 
reports should best target 3.0 in order to avoid chasing zombies.
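
A small sketch for looking up the installed version from Python, assuming 
Cython is importable in the environment that runs the build (the function 
name is just for illustration):

```python
def cython_version():
    """Return the installed Cython version string, or None if Cython
    cannot be imported in this environment."""
    try:
        import Cython
    except ImportError:
        return None
    return Cython.__version__


print("Cython:", cython_version() or "not installed")
```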


Thanks,
Stefan
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


[issue45569] Drop support for 15-bit PyLong digits?

2022-01-12 Thread Stefan Behnel

Stefan Behnel  added the comment:

Cython should be happy with whatever CPython uses (as long as CPython's header 
files agree with CPython's build ;-) ).

I saw the RasPi benchmarks on the ML. That would have been my suggested trial 
platform as well.
https://mail.python.org/archives/list/python-...@python.org/message/5RJGI6THWCDYTTEPXMWXU7CK66RQUTD4/

The results look ok. Maybe the slowdown for pickling is really the increased 
data size of integers. And it's visible that some compute-heavy benchmarks 
like pyaes did get a little slower. I doubt that they represent a real use case 
on such a platform, though. Doing any kind of number crunching on a RasPi 
without NumPy would appear like a rather strange adventure.

That said, if we decide to keep 15-bit digits in the end, I wonder if 
"SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably 
fast 64-bit multiply or not" kind of decision – however that translates into 
code. I'm sure there are 32-bit platforms that would actually benefit from 
30-bit digits today.

If we find a platform that would be fine with 30-bits but lacks a fast 64-bit 
multiply, then we could still try to add a platform specific value size check 
for smaller numbers. Since those are the common case, branch prediction might 
help us more often than not.

But then, I wonder how much complexity this is even worth, given that the goal 
is to reduce the complexity. Platform maintainers can still decide to configure 
the digit size externally for the time being, if it makes a difference for 
them. Maybe switching off 15-bits by default is just good enough for the next 
couple of years to come. :)
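
For reference, the digit size of a given build can be inspected from 
Python via sys.int_info. The sketch below also shows how the digit count 
of the same value differs between the two sizes; the to_digits() helper is 
an illustration only and not how CPython implements it internally.

```python
import sys


def to_digits(value, bits):
    """Split a non-negative integer into little-endian digits of `bits`
    bits each (illustration only)."""
    mask = (1 << bits) - 1
    digits = []
    while True:
        digits.append(value & mask)
        value >>= bits
        if value == 0:
            return digits


# What the running interpreter was built with.
print("bits per digit: ", sys.int_info.bits_per_digit)
print("bytes per digit:", sys.int_info.sizeof_digit)

# The same number needs more digits when each digit holds fewer bits.
n = 2 ** 64
print(len(to_digits(n, 15)), "digits of 15 bits")
print(len(to_digits(n, 30)), "digits of 30 bits")
```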

--

___
Python tracker 
<https://bugs.python.org/issue45569>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Re: [xml] Resuming maintenance

2022-01-10 Thread Stefan Behnel

Nick Wellnhofer via xml schrieb am 10.01.22 um 15:20:
Thanks to a donation from Google, I'm able to resume maintenance of libxml2 
(and libxslt) for the remainder of 2022.


I'm very happy to read this, Nick. All the best for 2022.

Stefan
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


[lxml] Re: question about a bug when lxml runs in a conda environment

2022-01-06 Thread Stefan Behnel

Martin Mueller schrieb am 31.12.21 um 18:06:

I have used lxml extensively in a PyCharm environment that calls on a conda 
environment.  Lately I encountered an odd error. The correct output of a 
marylamb.py script goes like this:

http://www.tei-c.org/ns/1.0";> Mary had a little lamb,
http://www.tei-c.org/ns/1.0";> Its fleece was white as snow, yeah.
http://www.tei-c.org/ns/1.0";> Everywhere the child went,
http://www.tei-c.org/ns/1.0";>  The little lamb was sure to go, 
yeah.
http://www.tei-c.org/ns/1.0";> He followed her to school one day,
http://www.tei-c.org/ns/1.0";> And broke the teacher's rule.
http://www.tei-c.org/ns/1.0";> What a time did they have,
http://www.tei-c.org/ns/1.0";>  That day at school.
http://www.tei-c.org/ns/1.0";> Tisket, tasket,
http://www.tei-c.org/ns/1.0";>  A green and yellow basket.
http://www.tei-c.org/ns/1.0";>  Sent a letter to my baby,
http://www.tei-c.org/ns/1.0";>   On my way I passed it.

In the buggy output the script runs amok and prints the current line plus the 
rest of the text. I print it out at the end of this memo. The PyCharm folks 
were able to identify the conda environment as the likely culprit. If I run the 
script outside it doesn’t happen. The problem seems to be limited to lxml 
running in a conda environment, because scripts that don’t use lxml are not 
plagued by that bug.


It's most likely an issue with the libxml2 version. You probably have 
2.9.12 installed in your conda environment. If you go back to 2.9.10, then 
it would probably work.


  conda install libxml2=2.9.10

You can find the version that lxml uses with

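"""
from lxml import etree
# version of lxml itself
print("%-20s: %s" % ('lxml.etree',   etree.LXML_VERSION))
# libxml2 version that is loaded at runtime
print("%-20s: %s" % ('libxml used',  etree.LIBXML_VERSION))
# libxml2 version that lxml was built against
print("%-20s: %s" % ('libxml compiled',  etree.LIBXML_COMPILED_VERSION))
"""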

The "LIBXML_VERSION" is what is currently used.

Stefan


[issue44394] [security] CVE-2013-0340 "Billion Laughs" fixed in Expat >=2.4.0: Update vendored copy to expat 2.4.1

2022-01-01 Thread Stefan Behnel

Stefan Behnel  added the comment:

I'd like to ask for clarification regarding issue 45321, which adds the missing 
error constants to the `expat` module. I consider those new features – it seems 
inappropriate to add new module constants in the middle of a release series. 
However, in this ticket here, the libexpat version was updated all the way back 
to Py3.6, to solve a security issue.

Should we also backport the error constants then?
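
For context, a short sketch of how these constants are used from Python. 
It relies only on the documented `xml.parsers.expat` API; the two helper 
functions are named here for illustration. How many constants it lists 
depends on the Python version, which is exactly what the backport question 
is about.

```python
from xml.parsers import expat
from xml.parsers.expat import errors


def error_constant_names():
    """List the XML_ERROR_* constants this Python version exposes."""
    return sorted(name for name in dir(errors) if name.startswith("XML_ERROR_"))


def parse_error_message(data):
    """Parse `data` and return expat's error message, or None on success."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except expat.ExpatError as exc:
        # errors.messages maps numeric error codes to message strings,
        # which are the values of the XML_ERROR_* constants.
        return errors.messages[exc.code]
    return None


print("libexpat in use:", expat.EXPAT_VERSION)
print(len(error_constant_names()), "error constants available")
print(parse_error_message("<a><b></a>") == errors.XML_ERROR_TAG_MISMATCH)
```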

--
nosy: +scoder

___
Python tracker 
<https://bugs.python.org/issue44394>
___



[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0

2021-12-31 Thread Stefan Behnel


Change by Stefan Behnel :


--
components: +XML
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
type:  -> enhancement
versions:  -Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue45321>
___



[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0

2021-12-31 Thread Stefan Behnel


Stefan Behnel  added the comment:


New changeset e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e by Sebastian Pipping in 
branch 'main':
bpo-45321: Add missing error codes to module `xml.parsers.expat.errors` 
(GH-30188)
https://github.com/python/cpython/commit/e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e


--

___
Python tracker 
<https://bugs.python.org/issue45321>
___



[issue45711] Simplify the interpreter's (type, val, tb) exception representation

2021-12-17 Thread Stefan Behnel


Stefan Behnel  added the comment:

FYI, we track the Cython side of this in
https://github.com/cython/cython/issues/4500

--

___
Python tracker 
<https://bugs.python.org/issue45711>
___


