[lxml] Re: Broken EXSLT link in docs
Hi Jens,

Jens Tröger via lxml - The Python XML Toolkit schrieb am 28.09.24 um 09:45:
> I think the EXSLT link here:
> https://lxml.de/xpathxslt.html#regular-expressions-in-xpath
> or source here:
> https://github.com/lxml/lxml/blob/9818374770aedc96f8f1e77943f45dea8e7fb4a8/doc/xpathxslt.txt#L319
> should change from http://www.exslt.org/ to https://exslt.github.io/ or some other valid URL.

Thanks, fixed.

Stefan

___
lxml - The Python XML Toolkit mailing list -- lxml@python.org
To unsubscribe send an email to lxml-le...@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/
Member address: arch...@mail-archive.com
[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6
James Belchamber schrieb am 15.05.24 um 22:37:
> Would you be able to do the same thing for aarch64?

manylinux1 never supported aarch64:
https://github.com/pypa/manylinux?tab=readme-ov-file#manylinux1-centos-5-based---eol

Stefan
[lxml] nested CDATA - was Re: Building on Windows
Hi,

Gertjan Klein schrieb am 02.05.24 um 17:52:
> Op 25-04-2024 om 16:58 schreef Stefan Behnel:
> I'm trying to write a conversion program that outputs XML[2]. It must match the output of an existing program. Semantically it already does, but I'd like it to match the way CDATA is handled. To this end, I'd like to allow "wrapped" CDATA. The CDATA class currently disallows this: it checks for the presence of ']]>', and raises if found.

The exception probably comes from a time where libxml2 didn't handle this itself.

> I added a parameter to turn off this check. I expected to need to do the escaping myself, but it seems lxml handles this just fine out of the box. For example, this tester code:
>
>     from lxml import etree
>     from lxml.etree import CDATA
>
>     def main():
>         root = etree.Element("dummy")
>         txt = ''
>         root.text = CDATA(txt, False)

Such a flag would need to be a keyword-only argument to make this readable. It's entirely unclear what the "False" refers to, unless you know the call signature by heart.

>         out = etree.tostring(root).decode()
>         print(out)
>
>     if __name__ == '__main__':
>         main()
>
> ...prints this:

Looks good to me. According to the XML spec (both 1.0 and 1.1), "CDATA sections cannot nest":

https://www.w3.org/TR/REC-xml/#sec-cdata-sect

But splitting the CDATA section makes perfect sense. This does not even need an option, we can just remove the check and add a test for it. Do you want to propose a PR?

The Python "xml.etree.ElementTree" package can also parse this correctly, but escapes this on output since it doesn't support CDATA sections directly. Thus, it seems best to add the test in "test_etree.py" rather than "test_elementtree.py", since the behaviour of the two differs here.

Stefan
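[Editorial note: the splitting technique referred to above can be sketched in a few lines of plain Python. This is an illustration of the idea, not lxml's actual implementation; the helper name is made up.]

```python
def split_cdata(text):
    # "]]>" may not appear literally inside a CDATA section, but the section
    # can be split: "]]" ends the first section and ">" starts inside the next.
    # "a]]>b" becomes <![CDATA[a]]]]><![CDATA[>b]]> which parses back to "a]]>b".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

print(split_cdata("a]]>b"))
```

Any conforming XML parser concatenates the adjacent CDATA sections back into the original character data.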
[lxml] Re: Building on Windows
Hi,

Gertjan Klein schrieb am 21.04.24 um 16:12:
> I'd like to try a tiny change to the CDATA class. In order to try, I have to be able to build lxml. Unfortunately, on Windows.

Yeah, supporting Windows is everything but trivial due to the general lack of platform-provided build support. Thus, all libraries have to do their own thing, and bringing that together is not easy. I'm happy myself that there is a working build setup at all.

If it's a somewhat straightforward change that doesn't need tons of back-and-forth testing and debugging, and you have a GitHub account, you could also use their CI service (GitHub Actions), either on your own account or in lxml's account via a pull request.

> I've downloaded Visual Studio 2019 CE. I created a (Python 3.12) virtual environment, where I installed Cython (latest version). I cloned the lxml sources from GitHub. I then opened a "Developer command prompt for VS 2019", activated the virtual environment, and typed:
>
>     (.venv) C:\Temp\lxml\lxml>python setup.py build_ext -i --with-cython --static-deps
>
> This downloads the dependencies like libxml2 etc.; this goes without problems. Then compilation starts, and gives errors:
>
>     [...]
>     Creating library build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.lib and object build\temp.win32-cpython-312\Release\src\lxml\etree.cp312-win_amd64.exp
>     etree.obj : error LNK2001: unresolved external symbol _xmlStrchr
>     etree.obj : error LNK2001: unresolved external symbol _xmlIOParseDTD
>     etree.obj : error LNK2001: unresolved external symbol _xmlMemShow
>     [...]
>
> There are in total 503 unresolved externals. I checked the first one, and found that it is present in the downloaded libxml2_a.lib, but without the underscore. The directories of the downloaded libraries are correctly added to the compiler command line.

It might help to see the command line.
Stefan
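[Editorial note, not part of the original thread: the leading underscore in symbols like `_xmlStrchr` is the decoration MSVC applies to cdecl symbols only in 32-bit x86 builds; 64-bit symbols are undecorated. Together with the mixed `build\temp.win32-...` directory and `win_amd64` artefact names in the log, this hints at a 32-bit toolchain or interpreter linking against 64-bit libraries. A quick way to check which bitness of Python is driving the build:]

```python
import struct

# The size of a pointer ("P") reveals the interpreter's bitness:
# 4 bytes on a 32-bit Python, 8 bytes on a 64-bit one.
print(struct.calcsize("P") * 8)
```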
[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6
I've uploaded a simple Py3.6 manylinux1 wheel for x86_64.

https://files.pythonhosted.org/packages/b8/93/768dabd4032e15dc6e7ca6767c132685545b7b0e12549dfa923fd2bd/lxml-5.2.1-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl

Please try it out.

Stefan

Stefan Behnel schrieb am 03.04.24 um 21:46:
> Hi, thanks for the report.
>
> Miro Hrončok schrieb am 03.04.24 um 15:55:
>> I've noticed that lxml 5.1+ upgraded the manylinux wheels to a newer tag.
>
> That came from the migration to cibuildwheel and was only partly intended.
>
>> The default ensurepip-bundled pip version in Python 3.6 does not support newer manylinuxes, hence it is likely that many CI systems that still test 3.6 now attempt to build lxml from sources. Since 5.2, this also fails with the old pip due to the old bundled pytoml, as indicated in a previous thread on this list.
>>
>>     $ python3.6 -m venv venv3.6
>>     $ venv3.6/bin/pip list
>>     Package    Version
>>     ---------- -------
>>     pip        18.1
>>     setuptools 40.6.2
>>
>> 5.0.2 has a manylinux1 wheel:
>>
>>     $ venv3.6/bin/pip install lxml==5.0.2
>>     ...
>>     lxml-5.0.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl
>>
>> 5.1.0 builds from source but uses setup.py and works (with devel deps):
>>
>>     $ venv3.6/bin/pip install lxml==5.1.0
>>     ...
>>     lxml-5.1.0.tar.gz
>>     Running setup.py install for lxml ...
>>
>> 5.2.1 builds from source and will outright blow up when parsing pyproject.toml:
>>
>>     $ venv3.6/bin/pip install lxml==5.2.1
>>     ...
>>     lxml-5.1.0.tar.gz
>>     ...
>>     pip._vendor.pytoml.core.TomlError: /tmp/.../lxml/pyproject.toml(40, 1): msg
>
> Hmm, right, that's annoying.
>
>> If support for Python 3.6 is still desired, would it maybe make sense to keep building and uploading manylinux1 wheels to make it easier?
>
> I'll see what I can do.
>
> Stefan
[lxml] Re: Consider keeping manylinux1 wheels for Python 3.6
Hi, thanks for the report.

Miro Hrončok schrieb am 03.04.24 um 15:55:
> I've noticed that lxml 5.1+ upgraded the manylinux wheels to a newer tag.

That came from the migration to cibuildwheel and was only partly intended.

> The default ensurepip-bundled pip version in Python 3.6 does not support newer manylinuxes, hence it is likely that many CI systems that still test 3.6 now attempt to build lxml from sources. Since 5.2, this also fails with the old pip due to the old bundled pytoml, as indicated in a previous thread on this list.
>
>     $ python3.6 -m venv venv3.6
>     $ venv3.6/bin/pip list
>     Package    Version
>     ---------- -------
>     pip        18.1
>     setuptools 40.6.2
>
> 5.0.2 has a manylinux1 wheel:
>
>     $ venv3.6/bin/pip install lxml==5.0.2
>     ...
>     lxml-5.0.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl
>
> 5.1.0 builds from source but uses setup.py and works (with devel deps):
>
>     $ venv3.6/bin/pip install lxml==5.1.0
>     ...
>     lxml-5.1.0.tar.gz
>     Running setup.py install for lxml ...
>
> 5.2.1 builds from source and will outright blow up when parsing pyproject.toml:
>
>     $ venv3.6/bin/pip install lxml==5.2.1
>     ...
>     lxml-5.1.0.tar.gz
>     ...
>     pip._vendor.pytoml.core.TomlError: /tmp/.../lxml/pyproject.toml(40, 1): msg

Hmm, right, that's annoying.

> If support for Python 3.6 is still desired, would it maybe make sense to keep building and uploading manylinux1 wheels to make it easier?

I'll see what I can do.

Stefan
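[Editorial note: wheel filenames encode the compatibility tags that pip matches against its list of supported tags (PEP 427 naming). The 5.0.2 filename above decomposes like this; an old pip only recognises the `manylinux1_x86_64` part of the compressed tag set, which is why the dual-tagged wheel still installs there.]

```python
# Decompose a wheel filename into its PEP 427 components.
name = "lxml-5.0.2-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.whl"
dist, version, python_tag, abi_tag, platform_tags = name[:-len(".whl")].split("-")

# Compressed platform tag sets are dot-separated; pip accepts the wheel
# if it supports any one of them.
print(platform_tags.split("."))
```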
[lxml] Re: 5.2.0 doesn't build
Hi, thanks for the report.

da.ve.k.gu...@...com schrieb am 01.04.24 um 20:10:
> I'm attempting to develop a project that has been operational for a while. The project makes use of mixbox which references this library. It seems that version 5.2.0 of lxml was released yesterday. Strangely, pytoml is encountering an error related to the pyproject.toml file. Can someone investigate this issue?
>
>     #21 9.785 Saved /wheels/tox-2.7.0-py2.py3-none-any.whl 12:18:26
>     #21 9.806 Collecting lxml (from mixbox==1.0.5->-r requirements.txt (line 7)) 12:18:26
>     #21 10.88 Downloading https://.../lxml-5.2.0.tar.gz (3.7MB)

Could you state the platform/architecture that you're running? And which Python version? I wonder why it picks up the source distribution instead of a ready-made binary wheel. lxml takes a while to build and requires external system libraries, so building from source is discouraged for "normal" use.

>     #21 15.22 File "/usr/share/python-wheels/pytoml-0.1.2-py2.py3-none-any.whl/pytoml/parser.py", line 253, in error 12:18:26
>     #21 15.22 raise TomlError(message, self.pos[0][0], self.pos[0][1], self._filename) 12:18:26
>     #21 15.22 pytoml.core.TomlError: /tmp/pip-wheel-29w0tw8j/lxml/pyproject.toml(26, 8): expected_equals

This seems to use an old version of pytoml, a library which (apparently) has been deprecated in favour of other tools.

https://pypi.org/project/pytoml/

I'd try upgrading your build environment (pip, setuptools, wheel, etc.).

Stefan
[lxml] Re: What replaced xpath.evaluate() ?
lpsm...@uw.edu schrieb am 16.02.24 um 00:38:
> I'm maintaining older code, which just broke because lxml took out xpath.evaluate(). The only note in the lxml changelog about it says it was 'redundant', meaning (I assume) that there's a better way to do the same thing, but there's no documentation about what that other way might be. Does anyone know what the new code should be? The code in question looks like:
>
>     xpath = lxml.etree.XPath(target, namespaces=namespaces)
>     root = lxml.etree.Element("root")
>     try:
>         xpath.evaluate(root)

You can simply call the XPath object. Thus, it's common to write something like

    find_config = lxml.etree.XPath("//config[1]")
    config_element = find_config(root)

https://lxml.de/xpathxslt.html#the-xpath-class

Stefan
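[Editorial note: a runnable version of the pattern above, with made-up element names for illustration. Note that an XPath expression selecting elements returns a list, so the result usually needs indexing.]

```python
from lxml import etree

root = etree.fromstring("<root><config name='a'/><config name='b'/></root>")

# Compile the expression once; the XPath object is then directly callable.
find_config = etree.XPath("//config[1]")

results = find_config(root)   # element XPaths return a list of matches
print(results[0].get("name"))
```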
[lxml] Re: Streaming read/write
Charlie Clark schrieb am 19.01.24 um 15:00:
> On 18 Jan 2024, at 18:10, Charlie Clark wrote:
>> Apart from the fact that this currently doesn't work, I imagine that both Elements and their children would happily be passed to the write, which could lead to an almighty mess. Getting this to work properly, possibly rewritten for async to avoid the awfully awful (yield) hack, could be a nice addition to the documentation.
>
> Thinking about this again, I think a pull parser is probably the way to go as I really don't want or need to create elements, it's probably fine if I just make the changes to what's coming through and stream the text straight back into another file. I'll give that a go.

If you want to avoid creating element objects altogether, and maybe don't even need a full (sub-)tree structure to get all relevant information, I suggest you try the low-level SAX interface:

https://lxml.de/parsing.html#the-target-parser-interface

It's quite efficient and usable for locally constrained XML transformations, e.g. filtering elements or attributes. And you can still parse input chunk by chunk, if you need that:

https://lxml.de/parsing.html#the-feed-parser-interface

Stefan
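[Editorial note: lxml's target parser interface mirrors the one in the stdlib's `xml.etree.ElementTree`, so the idea can be sketched without lxml installed; the same target class also works with `lxml.etree.XMLParser(target=...)`.]

```python
import xml.etree.ElementTree as ET

class TagCollector:
    # SAX-style "target": receives parse events, builds no element tree.
    def __init__(self):
        self.tags = []

    def start(self, tag, attrib):   # called for each opening tag
        self.tags.append(tag)

    def close(self):                # its return value becomes the parse result
        return self.tags

parser = ET.XMLParser(target=TagCollector())
result = ET.fromstring("<a><b/><c/></a>", parser)
print(result)
```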
[lxml] Re: Streaming read/write
Hi Charlie,

Charlie Clark schrieb am 18.01.24 um 12:13:
> I was recently wondering about the best way to edit XML documents using both a streaming reader and writer. I'm sure this is possible using iterparse and xmlfile but I seem to remember that iterparse produces the full tree so that parent elements and their children are returned.

You might want to look into the more general XMLPullParser, but yes, both that and iterparse() build a full XML tree in the background. The idea is that you actively delete parts of it when you're done with them, but in exchange you gain easy tree navigation. If you need to do somewhat complex and non-local tree transformations, the additional tree building and cleanup work is a price you might want to pay.

Alternatively, for the parsing side, there's also still SAX (i.e. pass a "target" object into the parser). It matches somewhat well with xmlfile(), at the cost of requiring separate callback methods and thus, probably, some state keeping on your side. But depending on the kind of "editing" that you're doing on your XML documents, it might not be too bad.

Basically, lxml can do all the state keeping for you if you let it build a tree (but then you have to clean up after yourself to save memory), or you choose to do all the state keeping yourself and take the bare parse events, and then have full control over the amount of state that you keep. Whatever is better for your use case.

Stefan
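[Editorial note: the delete-as-you-go pattern mentioned above can be sketched with the stdlib's `iterparse`; lxml's `iterparse` works the same way and additionally accepts a `tag=` filter plus `getparent()`/`getprevious()` navigation for cleanup.]

```python
import io
import xml.etree.ElementTree as ET

data = b"<root><item>1</item><item>2</item></root>"
values = []

events = ET.iterparse(io.BytesIO(data), events=("start", "end"))
_, root = next(events)            # grab the root from the first 'start' event
for event, elem in events:
    if event == "end" and elem.tag == "item":
        values.append(int(elem.text))
        root.remove(elem)         # drop the processed element to bound memory
print(values)
```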
Re: [Cython] Ready for Cython 3.1 ?
Stefan Behnel schrieb am 05.11.23 um 23:06:
> I'd like to ease our feature development by using more modern Python features in our code base and by targeting fewer Python versions in Cython 3.1 compared to the "all things supported" Cython 3.0.

I created a 3.0.x maintenance branch and removed the Py<3.7 test jobs from the master branch. That should make the CI response visibly faster.

Happy code cleaning :)

Stefan

___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Ready for Cython 3.1 ?
Lisandro Dalcin schrieb am 06.11.23 um 09:05:
> On Mon, 6 Nov 2023 at 01:19, Stefan Behnel wrote:
>> it looks like Cython 3.0.6 is going to be a "most things fixed" kind of release for the 3.0.x release series.
>
> I'm having issues using CYTHON_LIMITED_API with some Python versions (<=3.9). If you are not in a rush to release 3.0.6, I would like to have some time to properly investigate what's going on.

I'd rather postpone these things to 3.1. They are not critical for 3.0, and as I wrote, I think it's actually helpful for users to target 3.1 rather than 3.0.

Stefan
Re: [Cython] Ready for Cython 3.1 ?
da-woods schrieb am 06.11.23 um 08:48:
>> I also consider Cython 3.1 a prime target for better Limited API support.
>
> Yes - but I wouldn't treat complete support as a blocker (I don't think this is what you meant though). It's experimental in 3.0 and I don't expect it to "fully" work in 3.1. There's a separate question about what we consider the minimum viable Limited API version we want to support. I imagine that'll ultimately be decided by "what we can make work", but I don't think it'll be less than 3.4 (when PyType_GetSlot was added). It's probably something to decide later.

That's another thing that moving the support to 3.1 would solve. If we can target Py3.7/3.8+ instead of older versions, then the Limited API will also be more usable.

Stefan
Re: [Cython] Should we start using the internal CPython APIs?
da-woods schrieb am 04.11.23 um 14:45:
> I'm a bit late in replying to this but here are some unordered thoughts.
>
> * I'm fairly relaxed about using `Py_BUILD_CORE` if useful - I think we mostly do have good fallback paths for most things so can adapt quickly when stuff changes.

I'm not entirely relaxed about it, but I agree that the fallbacks should usually make it easy to keep things working even after larger changes in CPython.

> * CYTHON_USE_CPYTHON_CORE_DETAILS sounds reasonable, but it's yet another variation to test.

True.

> * I wonder if fixing up the limited API implementation should be higher priority than creating a third level between "full" and "limited API".

I think there's potential for all three. Basically the modes "aggressively fast", "highly compatible" and "version independent". The latter is what the Stable ABI together with the Limited API should give you.

> * I recall we were planning to ditch c89 as a strict requirement after 3.0? Incompatibility with C++ might be more of an issue though.

Yes. C++ is not an issue for CPython, so their internal header files are not tested with C++ at all. That's the highest potential for breakage, if we accept to generate C99 from Cython 3.1 onwards. We should make sure that we use "-std=c89" in at least one Cython 3.0 test setup, BTW.

> * Even so, if there's a good way of turning it off then we could say: "if you want strict c89 support then you can't use CYTHON_USE_CPYTHON_CORE_DETAILS" and people would always have options.

That could be part of it, yes.

> * Waiting and seeing may be a good option for now.

I agree. This still seems best for now, especially given the amount of recent changes in the C-API. Let's wait for those to settle down, at least.

Thanks everyone for your opinions and comments!

Stefan
[Cython] Ready for Cython 3.1 ?
Hi all,

it looks like Cython 3.0.6 is going to be a "most things fixed" kind of release for the 3.0.x release series. Given the work that lies ahead of us for Cython 3.1, I think we're at a point to get started on that, making the future 3.0.x releases stable and "boring".

As a reminder, Cython 3.1 will remove support for Python 2.7 and Python 3.[567], i.e. all Python versions that are now EOL. Python 3.8 will continue to receive security fixes for another year. Python 3.7 is EOL but still up for debate since it's probably not hard to support and still maintained in some Linux distributions for another couple of years. But I'm fine with considering it legacy. We'll probably notice if it gets in the way while preparing Cython 3.1, and can leave support in until there's a reason to remove it.

https://github.com/cython/cython/issues/2800

I'd like to ease our feature development by using more modern Python features in our code base and by targeting fewer Python versions in Cython 3.1 compared to the "all things supported" Cython 3.0. I also consider Cython 3.1 a prime target for better Limited API support. Users probably won't care both for that and for outdated Python versions at the same time. Or, they can use Cython 3.0.x for continued legacy support.

Since Cython 3.1 is mostly about ripping out old code, we can try to keep the development cycle short, so that new features don't have to wait that long. Certainly not as long as for Cython 3.0…

Is everyone and everything ready to start working on Cython 3.1?

Stefan
Re: [Cython] Should we start using the internal CPython APIs?
Thank you for your comments so far.

Stefan Behnel schrieb am 29.10.23 um 22:06:
> I seriously start wondering if we shouldn't just define "Py_BUILD_CORE" (or have our own "CYTHON_USE_CPYTHON_CORE_DETAILS" macro guard that triggers its #define) and include the internal "pycore_*.h" CPython header files from here: https://github.com/python/cpython/tree/main/Include/internal

I just remembered that there's one major technical issue with this. CPython now requires C99 for its own code base (Py3.13 actually uses "-std=c11" on my side). While they care about keeping public header files compatible with C89 and C++, their internal header files may not always have that quality, and won't be tested for it.

So, governance is one argument, but technical reasons can also make this appear less appealing overall. I'll let things settle some more and see in what direction Py3.13 will eventually be moving.

Stefan
[Cython] Should we start using the internal CPython APIs?
Hi all,

given the latest blow against exposing implementation details of CPython in their C-API (see https://github.com/cython/cython/pull/5767 for the endless story), I seriously start wondering if we shouldn't just define "Py_BUILD_CORE" (or have our own "CYTHON_USE_CPYTHON_CORE_DETAILS" macro guard that triggers its #define) and include the internal "pycore_*.h" CPython header files from here:

https://github.com/python/cpython/tree/main/Include/internal

This would give us greater freedom in accessing all the implementation details, so that we could directly integrate with those. We'd obviously still need one or more fallback implementations for "stable CPython", the Limited API, PyPy and friends.

There's a risk, clearly, that these internals change even during point releases. Maybe not a big risk, but not impossible either. We'd have to deal with that, and so would our users. OTOH, having a single macro switch would make it easy for users to adapt if something breaks on their side, and also easy to benchmark if it makes a difference for their code.

We could also leave it off by default and simply allow users with high performance needs to enable it manually. Or start by leaving it off until a new CPython X.Y release has stabilised and its (used-by-us) internals have proven not to change, and then switch it on for that release series.

In any case, having a single switch for this feels like it could be easy to handle.

What do you think?

Stefan
Re: [Cython] Can we remove the FastGIL implementation?
da-woods schrieb am 19.09.23 um 21:38:
> I think the detail that was missing is you need to add the `#cython: fast_gil = True` to enable it. [...] So my conclusion is that from 3.11 onwards Python sped up their own GIL handling to about the same as we used to have, and fastgil has turned into a pessimization.

I tried the benchmark with the master branch on my side again, this time with correct configuration. :) Turns out that enabling the FastGIL feature makes it much slower for me (on Ubuntu Linux 20.04) in both Py3.8 and 3.10:

"""
* Python 3.10 (-DCYTHON_FAST_GIL=0)

Running the test (already held)... took 1.2482502460479736
Running the test (released)... took 6.444956541061401
Running the test (already held)... took 1.2358744144439697
Running the test (released)... took 6.4064109325408936

* Python 3.10 (-DCYTHON_FAST_GIL=1)

Running the test (already held)... took 2.243091583251953
Running the test (released)... took 7.32707667350769
Running the test (already held)... took 2.4065449237823486
Running the test (released)... took 7.50264573097229
"""

I also tried it with PGO enabled and got more or less the same result. The Python installations that I tried it with were both PGO builds. It's probably mixed across platforms, different configurations and C compilers.

I looked through the "What's new" documents for Py3.10 and 3.11 but couldn't find mentions of GIL improvements. Just that some other things have become faster.

So – disable the feature in Python 3.11 and later? (Currently it's disabled in 3.12+.) Py3.11+ would suggest that we keep the code in Cython 3.1, since that will support older Python versions that still seem to benefit from it.

Stefan
[Cython] Can we remove the FastGIL implementation?
Hi,

I've seen reports that Cython's "FastGIL" implementation (which basically keeps the GIL state in a thread-local variable) is no longer faster than CPython's plain GIL implementation in recent Python 3.x versions. Potentially even slower. See the report in

https://github.com/cython/cython/issues/5703

It would be helpful to get user feedback on this. If you have GIL-heavy Cython code, especially with nested with-nogil/with-gil sections across functions, and a benchmark that exercises it, could you please run the benchmark with and without the feature enabled and report the results?

You can add "-DCYTHON_FAST_GIL=0" to your CFLAGS to disable it (and "=1" to enable it explicitly). It's enabled by default in CPython 3.6-3.11 (but disabled in Cython 0.29.x on Python 3.11).

Thanks,
Stefan
[Cython] Cython 3.0.2 released
Hi all,

Cython 3.0.2 is released. It fixes two major regressions in 3.0.1, so please upgrade if that failed for you.

https://cython.readthedocs.io/en/latest/src/changes.html

Have fun,
Stefan
[Cython] Cython 3.0 final released
Hi all,

after close to five long years, I'm proud to announce the release of Cython 3.0. It's done. It's out. Finally!

The full list of improvements compared to the 0.29.x release series is entirely incredible.

https://cython.readthedocs.io/en/latest/src/changes.html

Cython 3.0 is better than any other Cython release before, in all aspects. It's much more Python, integrates better with C and C++, supports more Python implementations and configurations, provides many great new language features – it's faster, safer and easier to use. It's simply better.

New language features include:

- Python 3 syntax and semantics by default
- Cython type annotations in plain Python code
- automatic NumPy ufunc generation
- fast @dataclass and @total_ordering extension types
- safe exception propagation in C functions by default
- Unicode identifiers in Cython code

All of this wouldn't have been possible without the help of the many, many people who contributed code and documentation, tested features, found and described bugs, and helped debugging problems. Those who started using Cython in new environments, new build systems, new use cases, and helped to get it working there. Who proposed new features or found mismatches and gaps in the existing set of features. Thank you all, you helped making Cython 3.0 an awesome language!

Along the way, we added two people to the list of Cython developers.

* David Woods has contributed a tremendous list of features and fixes to this release. It would honestly not have been possible without his efforts.

* Matúš Valo has put a lot of work into the documentation and the pure Python mode. He found many issues that make Cython now easier and more consistent to use from Python code.

Thank you both for your contributions. I'm happy to work together with you.

Everyone, have fun using Cython 3.0, and whatever good comes after it.

Best,
Stefan
[Cython] Cython 3.0 RC 2 released
Hi all,

after close to five long years, we're almost there – I've pushed a release candidate for Cython 3.0 with a long list of bug fixes (followed by a second one with one important fix).

https://cython.readthedocs.io/en/latest/src/changes.html

Please give it some final testing. Unless we find something really serious in the RC2 release, the changes for the final release will be very limited and safe, hopefully none at all.

The RC is just in time for this week's US-SciPy, and I'll make sure we have a final release for next week's EuroPython in Praha.

Have fun,
Stefan
[Cython] Current CI crashes in Py3.12
Hi,

just a note that the current CI crashes in Py3.12b1 are due to
https://github.com/python/cpython/issues/104614

They fixed it, and Py3.12b2 will hopefully support multiple inheritance of extension types again. It's expected next week (June 6th).

Stefan
Re: [Cython] cython 3 migration update and next releases
Dima Pasechnik schrieb am 21.05.23 um 11:38:
> On Sun, 21 May 2023, 10:21 Stefane Fermigier, wrote:
>> IFAIK, 15k lines of Cython makes it among one of the largest Cython projects I'm aware of (I did some research a couple of years ago):
>> https://github.com/sfermigier/awesome-cython#some-projects-with-more-that-10-000-lines-of-cython-code
>
> SageMath has 700K Cython lines, yet not mentioned.

Certainly worth mentioning, yes.

Looking at the numbers, I also noticed that lxml is listed in the 5-10k lines range. It actually has about 18k lines of Cython code (.pyx/.pxi files) and another 1.5k lines in compiled Python (.py) files, according to pygount [1]. I tried sloccount first, but that doesn't seem to have Cython support.

Might be worth redoing that count for the other projects as well.

Stefan

[1] https://pypi.org/project/pygount/
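[Editorial note: a rough count in the same spirit can be sketched in a few lines. Unlike pygount, this counts raw lines rather than source lines of code; the suffix choice is the only lxml-specific assumption, and the helper name is made up.]

```python
from pathlib import Path

def count_lines(root, suffixes=(".pyx", ".pxi")):
    # Sum raw line counts over all files below 'root' with the given suffixes.
    return sum(
        sum(1 for _ in path.open(encoding="utf-8", errors="ignore"))
        for path in Path(root).rglob("*")
        if path.suffix in suffixes
    )
```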
Re: [Cython] cython 3 migration update and next releases
matus valo schrieb am 16.05.23 um 21:09:
> I would like to inform you about the recent porting of projects to Cython 3. Recently, I participated in the migration of 3 bigger projects to Cython 3.

Thanks a lot for doing this, Matúš. It helps Cython as much as it helps these projects.

> When migrating to Cython 3, I was able to find several issues in Cython; all of them are merged in master now. Hence, I would like to ask about the next steps. It would help greatly to release Cython 3 beta 3. This will allow me to pin the scipy CI to a real pre-release instead of the master branch.

I'll try to get beta 3 released soon, but need to find a bit of consecutive time to get it out. There are still a couple of PRs that I'd like to look through.

> Moreover, I would like to ask whether we can do the final Cython 3 release after beta 3. The rationale is that the projects won't really start using Cython 3 until we do the final release. Now, we have 3 big users of Cython migrated, hence I think we have some confidence that Cython 3 is ready. What do you think?

It's probably a good time to have a final call for merges. Promoting and voting for PRs is welcome.

Stefan
[Cython] Cython 3.0 beta 2 is released
Hi everyone,

we received a lot of feedback for our first beta release (thank you, everyone!) and were able to (hopefully) resolve all blockers that prevented some of you from making good use of it. Let's hear what you think about the second beta. It's up on PyPI.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-2-2023-03-26

Have fun,
Stefan

Stefan Behnel schrieb am 26.02.23 um 11:31:
> Hi all,
>
> Cython 3.0 has left the alpha status – the first beta release is available from PyPI. The changes in this release are huge – and the full list of improvements compared to the 0.29.x release series is entirely incredible. Cython 3.0 is better than any other Cython release before, in all aspects. It's much more Python, integrates better with C++, supports more Python implementations and configurations, provides many great new language features – it's faster, safer and easier to use. It's simply better.
>
> https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25
>
> The development of the Cython 3.0 release series started all the way back in 2018, with the first branch commit happening on October 27, 2018.
>
> https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e
>
> List of milestones along the way, and a long list of contributors:
>
> https://github.com/cython/cython/issues/4022#issuecomment-1404305257
>
> Thank you to everyone who contributed. Especially to David Woods, who contributed a tremendous amount of changes, both fixes and new features. Thank you, David!
>
> A couple of people have also joined in an effort to make the documentation reflect what this great new Cython has to offer. Thank you all, our users will love you for your help.
>
> https://github.com/cython/cython/issues/4187
> https://cython.readthedocs.io/en/latest/
>
> Now, go and give it a try. We've taken great care to make the transition from Cython 0.29.x as smooth as possible, which was not easy given the large amount of changes, including some well-motivated breaking changes. We wanted to let all users benefit from this new release.
>
> Let us know how it works for you, and tell others about it. :)
>
> Have fun,
> Stefan
[lxml] Re: When is a number not a number
Stefan Behnel schrieb am 03.03.23 um 09:00:
> Stefan Behnel schrieb am 02.03.23 um 08:50:
>> Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:
>>> Probably a bug in _checkNumber():
>>> https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974
>>
>> Ah, yes, it might be the isdigit() check, actually. That could be too broad. Not every digit is a valid part of a number.
>>
>> Thanks for the report and the investigation. I'll try a fix when I get to it.
>
> According to the XML Schema 1.1 spec, it's really just [0-9] that we should detect.
> https://www.w3.org/TR/xmlschema11-2/#decimal
>
> I'll remove the ".isdigit()" check altogether and only leave the '0-9' comparison in there. Even when we're parsing Unicode strings, we should only care about XML numbers, not everything that Python accepts.

https://github.com/lxml/lxml/commit/3d4e60f2835e4d85fd357c182656d3eca534f2ff

Stefan

___
lxml - The Python XML Toolkit mailing list -- lxml@python.org
To unsubscribe send an email to lxml-le...@python.org
https://mail.python.org/mailman3/lists/lxml.python.org/
Member address: arch...@mail-archive.com
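The point about isdigit() being too broad is easy to demonstrate (a small illustration, not lxml's actual _checkNumber() code): Python's str.isdigit() accepts any Unicode digit character, while an XML Schema decimal only allows the ASCII range.

```python
# Python's str.isdigit() accepts any Unicode digit character,
# while XML Schema 1.1 decimals only allow ASCII [0-9].
arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE

print(arabic_three.isdigit())        # True: Python counts it as a digit
print(arabic_three in "0123456789")  # False: not a valid XML decimal digit
print(int(arabic_three))             # 3: int() accepts it as well
```

So a value containing such characters would pass an isdigit()-based check even though it is not a valid XML Schema number, which is exactly the mismatch the fix removes.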
[lxml] Re: When is a number not a number
Stefan Behnel schrieb am 02.03.23 um 08:50:
> Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:
>> Probably a bug in _checkNumber():
>> https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974
>
> Ah, yes, it might be the isdigit() check, actually. That could be too broad. Not every digit is a valid part of a number.
>
> Thanks for the report and the investigation. I'll try a fix when I get to it.

According to the XML Schema 1.1 spec, it's really just [0-9] that we should detect.
https://www.w3.org/TR/xmlschema11-2/#decimal

I'll remove the ".isdigit()" check altogether and only leave the '0-9' comparison in there. Even when we're parsing Unicode strings, we should only care about XML numbers, not everything that Python accepts.

Stefan
[lxml] Re: When is a number not a number
Am March 1, 2023 3:15:22 PM UTC schrieb holger.jo...@lbbw.de:
> Probably a bug in _checkNumber():
> https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974

Ah, yes, it might be the isdigit() check, actually. That could be too broad. Not every digit is a valid part of a number.

Thanks for the report and the investigation. I'll try a fix when I get to it.

Stefan
[Python-announce] Cython 3.0 beta 1 is released
Hi all,

Cython 3.0 has left the alpha status – the first beta release is available from PyPI.

https://cython.org/
https://pypi.org/project/Cython/

The changes in this release are huge – and the full list of improvements compared to the 0.29.x release series is entirely incredible. Cython 3.0 is better than any other Cython release before, in all aspects. It's much more Python, integrates better with C++, supports more Python implementations and configurations, provides many great new language features – it's faster, safer and easier to use. It's simply better.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25

What is Cython?

In case you didn't hear about Cython before, it's the most widely used statically optimising Python compiler out there. It translates Python (2/3) code to C, and makes it as easy as Python itself to tune the code all the way down into fast native code. If you have any non-trivial Python application running, chances are you'll find some piece of Cython generated package in it.

The development of the Cython 3.0 release series started all the way back in 2018, with the first branch commit happening on October 27, 2018.

https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e

A list of Milestones along the way, and a long list of contributors:

https://github.com/cython/cython/issues/4022#issuecomment-1404305257

Thank you to everyone who contributed.

A couple of people have also joined in an effort to make the documentation reflect what this great new Cython has to offer.

https://cython.readthedocs.io/en/latest/

Now, go and give it a try. We've taken great care to make the transition from Cython 0.29.x as smooth as possible, which was not easy given the large amount of changes, including some well-motivated breaking changes. We wanted to let all users benefit from this new release.

Let us know how it works for you, and tell others about it. :)

Have fun,
Stefan

___
Python-announce-list mailing list -- python-announce-list@python.org
To unsubscribe send an email to python-announce-list-le...@python.org
https://mail.python.org/mailman3/lists/python-announce-list.python.org/
Member address: arch...@mail-archive.com
[Cython] Cython 3.0 beta 1 is released
Hi all,

Cython 3.0 has left the alpha status – the first beta release is available from PyPI.

The changes in this release are huge – and the full list of improvements compared to the 0.29.x release series is entirely incredible. Cython 3.0 is better than any other Cython release before, in all aspects. It's much more Python, integrates better with C++, supports more Python implementations and configurations, provides many great new language features – it's faster, safer and easier to use. It's simply better.

https://cython.readthedocs.io/en/latest/src/changes.html#beta-1-2023-02-25

The development of the Cython 3.0 release series started all the way back in 2018, with the first branch commit happening on October 27, 2018.

https://github.com/cython/cython/commit/c2de8efb67f80bff59975641aac387d652324e4e

List of Milestones along the way, and a long list of contributors:

https://github.com/cython/cython/issues/4022#issuecomment-1404305257

Thank you to everyone who contributed. Especially to David Woods, who contributed a tremendous amount of changes, both fixes and new features. Thank you, David!

A couple of people have also joined in an effort to make the documentation reflect what this great new Cython has to offer. Thank you all, our users will love you for your help.

https://github.com/cython/cython/issues/4187
https://cython.readthedocs.io/en/latest/

Now, go and give it a try. We've taken great care to make the transition from Cython 0.29.x as smooth as possible, which was not easy given the large amount of changes, including some well-motivated breaking changes. We wanted to let all users benefit from this new release.

Let us know how it works for you, and tell others about it. :)

Have fun,
Stefan
[lxml] Re: Question about inheritance in cssselect.py
Dani Litovsky Alcala schrieb am 10.06.22 um 17:34:
> In lxml v4.9, cssselect.py's CSSSelector.__init__ calls `etree.XPath.__init__(self, path, namespaces=namespaces)` to initialize the parent class. Is there a reason why `super()` or even `super(CSSSelector, self).__init__(...)` is not used?

Probably the age of the code.

> I bring this up because, if I attempt to monkey patch (private project) `etree.XPath.__init__`, the current code causes an error
>
>     TypeError: super(type, obj): obj must be an instance or subtype of type
>
> while replacing it with the suggested use of `super()` fixes my error.

I'll change it to use super(). Thanks for the suggestion.

Stefan
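The TypeError reported above is what zero-argument super() raises when a method ends up attached to a class it was not defined in, which is a typical hazard of monkey patching. A minimal stand-in with plain classes (not lxml's actual ones) reproduces the exact message:

```python
class A:
    def __init__(self):
        # zero-argument super() is bound to the class this method was
        # defined in (A), via the implicit __class__ closure cell
        super().__init__()

class B:
    pass

# monkey patch B with a method that was defined in class A
B.__init__ = A.__init__

try:
    B()
except TypeError as exc:
    print(exc)  # super(type, obj): obj must be an instance or subtype of type
```

Calling B() runs A's method with a B instance as `self`; since B is not a subtype of A, the super() call inside fails with the error quoted in the message above.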
[lxml] Re: %s formatting in documentation (instead of f-string)
Hi,

t.r...@247interfaces.nl schrieb am 19.12.22 um 17:58:
> Switching to lxml for XML parsing and generating, I was somewhat puzzled by usage lines like
>
>     XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"
>     XHTML = "{%s}" % XHTML_NAMESPACE
>
> With more modern f-strings this could also be written as
>
>     XHTML = f"{{{XHTML_NAMESPACE}}}"
>
> This may be because I only started on Python with 3.7, and have never worked with any 2.x. I quite understand there is no time to rework all the docs given the limited resources of this project. Just two questions:
>
> - is there (another) good reason not to use f-string formatting? And if not
> - is there a way to assist in reworking the docs?

It's reasonable to update the docs to Py3 style by now, and a bit of that has already been done. The question is whether

    XHTML = f"{{{XHTML_NAMESPACE}}}"

is really more readable than

    XHTML = "{%s}" % XHTML_NAMESPACE

given the number of curly braces with different meanings that a reader has to go through. To me, personally, the second seems quicker and more obvious to read, whereas it takes me a while to understand what the equivalent f-string does. I think this is a case where we should keep the (IMHO) simpler non-f-string variant.

Stefan
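Whatever the readability verdict, the two spellings from the thread are equivalent; a quick check:

```python
XHTML_NAMESPACE = "http://www.w3.org/1999/xhtml"

old_style = "{%s}" % XHTML_NAMESPACE  # %-formatting, as used in the docs
new_style = f"{{{XHTML_NAMESPACE}}}"  # f-string: {{ and }} are literal braces

assert old_style == new_style == "{http://www.w3.org/1999/xhtml}"
print(old_style)  # {http://www.w3.org/1999/xhtml}
```

In the f-string, the doubled braces are escapes for literal `{`/`}`, and the inner pair is the interpolation, which is exactly the brace overload the reply objects to.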
[lxml] Re: zlib error
Hi,

Ajayi, Temitope schrieb am 14.12.22 um 17:21:
> It seems the version of zlib used in lxml is outdated. It currently shows up as zlib 1.2.11 instead of zlib 1.2.13 on scan reports and is therefore vulnerable to CVE-2018-25032 and CVE-2022-37434. Can I get some help on whether this is correct or I am doing something wrong?

What lxml version are you using, on which operating system? Are you using pre-built binary wheels or building locally?

The binary wheels of lxml 4.9.2 should be using zlib 1.2.13 on Linux/macOS and 1.2.12 on Windows.

Stefan
[Cython] Cython 3.0 planning - Re: [cython-users] Re: Anything missing for 0.29.33 ?
Hi Matúš,

Matúš Valo schrieb am 06.12.22 um 15:58:
> I have a thought about Cython 3.0. Based on the discussion in [1], we should be done with all breaking changes. There is also [2], but a PR is already there [3] (I am not sure what the state of the PR is, though). Is it possible to make a final release (in case we postpone [2] to Cython 3.1 or 3.0.X)? Or, at least, can we move closer to a final release and release a beta or RC version? I think it would be great to communicate to the community how far we are from a final release (not in time, but e.g. "this is a beta/RC release and will be followed by a final release").
>
> An additional reason to release Cython 3.0 is that in the near future (Python 3.12) two important components used by Cython will be removed: the imp module and distutils. In my opinion, Cython 3.0 should be released early to give users a transition period, so that we can avoid back-porting these changes to the 0.29.X releases. Any thoughts?
>
> [1] https://github.com/cython/cython/issues/4022
> [2] https://github.com/cython/cython/issues/4936
> [3] https://github.com/cython/cython/pull/5016

Let's see that we get [3] merged to close [2]. I think then we're ready for a new release, once 0.29.33 is out. As you wrote, we're through with the breaking changes for 3.0 then, so yes, a first beta release might be appropriate.

Stefan
[Cython] Anything missing for 0.29.33 ?
Hi,

I'll try to push out the next 0.29.x (and hopefully also 3.0alpha) release before Christmas. If you think I might have forgotten anything that's ready to be included in 0.29.33, please comment in the relevant ticket or PR, or reply to this message on cython-users.

Stefan
[lxml] Re: Turn three-line block into single?
Gilles schrieb am 10.08.22 um 15:20:
> for row in tree.iter("wpt"):
>     lat,lon = row.attrib.values()

Note that this assignment depends on the order of the two attributes in the XML document, i.e. in data that you may not control yourself. It will break if the provider of your input documents ever decides to change the order. I'd probably just use

    lat, lon = row.get('lat'), row.get('lon')

Also:

> # remove dups
> no_dups = []
> for row in tree.iter("wpt"):
>     lat,lon = row.attrib.values()
>     if lat not in no_dups:
>         no_dups.append(lat)
>     else:
>         row.getparent().remove(row)

You're using a list here instead of a set. It might be that a list is faster for very small amounts of data, but I'd expect a set to win quite quickly. Regardless of my guessing, you shouldn't be using a list here unless benchmarking tells you that it's faster. And if you do, you'd better add a comment with the reasoning. It's just too surprising to see this implemented with a list, so readers will end up wasting their time thinking more into it than there is.

Stefan
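A set-based version of the deduplication could look like the sketch below (stdlib only, with plain (lat, lon) tuples standing in for the wpt elements; the coordinate data is made up for illustration):

```python
# Deduplicate by latitude, keeping the first occurrence of each value.
# A set gives O(1) membership tests; "lat in no_dups" on a list is O(n),
# so the list version degrades as the number of waypoints grows.
rows = [("52.10", "4.30"), ("48.85", "2.35"), ("52.10", "9.99")]

seen = set()
unique = []
for lat, lon in rows:
    if lat not in seen:
        seen.add(lat)
        unique.append((lat, lon))

print(unique)  # the third row is dropped: its latitude was already seen
```

With real lxml elements, the `else` branch would call `row.getparent().remove(row)` as in the quoted code; only the `no_dups` list changes to a `seen` set.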
[Cython] Welcome David Woods as a Cython core developer
Hi everyone,

with the release of the first 3.0 alpha that supports Python 3.11 (aptly named "alpha 11"), I'm happy to announce that David Woods has been promoted to a Cython core developer.

David has shown an extraordinary commitment and dedication over the last years. His first merged commits were already back in 2015, mostly related to the C++ support. But within the last two years, he voluntarily took over more and more responsibility for bugs and issues and developed several major new features for the project. This includes the walrus operator (PEP 572), cdef dataclasses (modelled after PEP 557), internal "std::move()" usage in C++ mode, and support for Unicode identifiers and module names, all of which form a major part of the 3.0 feature set.

David has more than deserved a place in the circle of present and prior core devs.

David, thank you for your impressive work on Cython, and welcome to the core team!

Stefan
[Python-Dev] Re: Switching to Discourse
h.vetin...@gmx.com schrieb am 18.07.22 um 18:04:
> One of the comments in the retro was:
>> Searching the archives is much easier and has found me many old threads that I probably would have had problems finding before, since I haven't been subscribed for that long.

I'm actually reading python-dev, c.l.py etc. through Gmane, and have done that ever since I joined. Simply because it's a mailing list of which I don't need a local (content) copy, and wouldn't want one. Gmane seems to have a complete archive that's searchable, regardless of "when I subscribed".

It's really sad that Discourse lacks an NNTP interface. There's an unmaintained bridge to NNTP servers [1], but not an emulating interface that would serve the available discussions via NNTP messages, so that users can get them into their NNTP/mail clients to read them in proper discussion threads. I think adding that next to the existing web interface would serve everyone's needs just perfectly.

Anyone up for giving that a try? It can't be *that* difficult. ;-)

Stefan

[1] https://github.com/sman591/discourse-nntp-bridge

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/USPYYNP24UYQQ64YBBTHNOEDNGX46LVM/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Switching to Discourse
Petr Viktorin schrieb am 15.07.22 um 13:18:
> The discuss.python.org experiment has been going on for quite a while, and while the platform is not without its issues, we consider it a success. The Core Development category is busier than python-dev. According to staff, discuss.python.org is much easier to moderate. If you're following python-dev but not discuss.python.org, you're missing out.

That's one of the reasons then why I pretty much lost track of the CPython development since d.p.o was introduced. It's sad, but it was just too much work for me (compared to threaded newsgroups) to follow the discussions there, definitely more than I wanted to invest.

It's not the only reason, though, so please take a decision for the home of CPython discussions that suits the (currently) more active part of the development community.

Stefan
[Cython] Nested prange loops - (was: [cython-users] Converting to Python objects with nogil (inside prange for loop))
Hi,

nested prange loops seem to be a common gotcha for users. I can't say if there is ever a reason to do this, but at least I can't think of any.

For me, this sounds like we should turn it into a compile time error – unless someone can think of a use case? Even in that case, I'd still emit a warning since it seems so unlikely to be intended.

Please reply to the cython-users list to facilitate user feedback.

Stefan

 Forwarded Message 
Subject: Re: [cython-users] Converting to Python objects with nogil (inside prange for loop)
Date: Fri, 15 Jul 2022 07:43:26 +0100

> with nogil, parallel():
>     for i in prange(N):
>         for j in prange(km.BatchSize):

You usually only want one loop in a set of nested loops to be prange. Typically the outer loop, but in this case it might be easier to parallelize the inner loop.
[lxml] Re: Iterparse raises TypeError on attempt to clean up preceding siblings
Am June 23, 2022 11:20:59 PM UTC schrieb Parfait G:
> I see one fix is to also check if `elem.getparent() is not None`.
> Thoughts?
>
>     elem.clear()
>     while elem.getprevious() is not None and elem.getparent() is not None:
>         del elem.getparent()[0]

The parent won't change during the loop, so it's enough to check it once before the loop.

Also, there is only one element without a parent, and that's the root element. Maybe you can skip that altogether in your processing? It should be the first item returned by the iterator that you got through .iter(). Just call next() on it once.

Stefan
[lxml] Re: Build problems on Python 3.11
Charlie Clark schrieb am 31.05.22 um 17:54:
> while I don't see this locally, I'm getting problems on my CI with the Docker image:
>
> ```
> Compile failed: command '/usr/bin/gcc' failed with exit code 1
> cc -I/usr/include/libxml2 -I/usr/include/libxml2 -c /tmp/xmlXPathInitw7u6s7rr.c -o tmp/xmlXPathInitw7u6s7rr.o
> cc tmp/xmlXPathInitw7u6s7rr.o -lxml2 -o a.out
> error: command '/usr/bin/gcc' failed with exit code 1
> [end of output]
>
> note: This error originates from a subprocess, and is likely not a problem with pip.
> error: legacy-install-failure
>
> × Encountered error while trying to install package.
> ╰─> lxml
> ```
>
> I'm wondering if there is anything that can be done about this? Presumably inform the maintainer?

I've never seen this either, but at least there's an lxml 4.9.0 release now that should work with Py3.11. I didn't upload wheels for 3.11, though.

Stefan
[Python-ideas] Re: Less is more? Smaller code and data to fit more into the CPU cache?
Barry Scott schrieb am 27.03.22 um 22:23:
> On 22 Mar 2022, at 15:57, Jonathan Fine wrote:
>> As you may have seen, AMD has recently announced CPUs that have much larger L3 caches. Does anyone know of any work that's been done to research or make critical Python code and data smaller so that more of it fits in the CPU cache? I'm particularly interested in measured benefits.
>
> A few years ago (5? 10?) there was a blog post about making the Python eval loop fit into the L1 cache. The author gave up on the work as he claimed it was too hard to contribute any changes to Python at the time. I have not kept a link to the blog post, sadly.
>
> What I recall is that the author found that GCC was producing far more code than was required to implement sections of ceval.c. Fixing that would shrink the ceval code by 50%, I recall was the claim. He had a PoC that showed the improvements.

Might be worth trying out if "gcc -Os" changes anything for ceval.c. It can also be enabled temporarily with a pragma (and MSVC has a similar option). We use it in Cython for the (run once) module init code to reduce the binary module size, but it might have an impact on cache usage as well.

Stefan

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/QQVYUUKOKN472N4OLNCAA76HLVFXMKLB/
Code of Conduct: http://python.org/psf/codeofconduct/
The Cython compiler is 20 years old today !
Dear Python community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, the tool that is now known and used under the name Cython.

https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, you're likely to have some piece of code downloaded into your venv that was built with Cython. Or many of them.

I'm proud of what we have achieved. And I'm happy to see and talk to the many, many users out there whom we could help to help their users get their work done.

Happy anniversary, Cython!

Stefan

PS: The list of Cython implemented packages on PyPI is certainly incomplete, so please add the classifier to yours if it's missing. With almost 3000 dependent packages on Github (and almost 100,000 related repos), I'm sure we can crack the number of 1000 Cython built packages on PyPI as a birthday present. (No spam, please, just honest classifiers.)

https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython
https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE

--
https://mail.python.org/mailman/listinfo/python-list
[Python-announce] The Cython compiler is 20 years old today !
Dear Python community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, the tool that is now known and used under the name Cython.

https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, you're likely to have some piece of code downloaded into your venv that was built with Cython. Or many of them.

I'm proud of what we have achieved. And I'm happy to see and talk to the many, many users out there whom we could help to help their users get their work done.

Happy anniversary, Cython!

Stefan

PS: The list of Cython implemented packages on PyPI is certainly incomplete, so please add the classifier to yours if it's missing. With almost 3000 dependent packages on Github (and almost 100,000 related repos), I'm sure we can crack the number of 1000 Cython built packages on PyPI as a birthday present. (No spam, please, just honest classifiers.)

https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython
https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE
[Cython] The Cython compiler is 20 years old today !
Dear Cython community,

it's now 20 years since Greg Ewing posted his first announcement of Pyrex, the tool that is now known and used under the name Cython.

https://mail.python.org/pipermail/python-list/2002-April/126661.html

It was a long way, and I've written up some of it in a blog post:

http://blog.behnel.de/posts/cython-is-20/

Today, if you're working on any kind of larger application in Python, you're likely to have some piece of code downloaded into your venv that was built with Cython. Or many of them.

I'm proud of what we have achieved. And I'm happy to see and talk to the many, many users out there whom we could help to help their users get their work done.

Happy anniversary, Cython!

Stefan

PS: The list of Cython implemented packages on PyPI is certainly incomplete, so please add the classifier to yours if it's missing. With almost 3000 dependent packages on Github (and almost 100,000 related repos), I'm sure we can crack the number of 1000 Cython built packages on PyPI as a birthday present. (No spam, please, just honest classifiers.)

https://pypi.org/search/?q=&o=-created&c=Programming+Language+%3A%3A+Cython
https://github.com/cython/cython/network/dependents?dependent_type=PACKAGE
Re: [xml] libxml2 2.9.23 download
Hi,

Jeffrey Walton via xml schrieb am 16.03.22 um 05:45:
> libxml2 2.9.13 seems to be missing from ftp://xmlsoft.org/libxml2/.

As mentioned in the release announcement:
https://mail.gnome.org/archives/xml/2022-February/msg9.html

the releases have moved to
https://download.gnome.org/sources/libxml2/2.9/

Stefan

___
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
[lxml] Re: Is there an ElementTree class lookup hook?
Salut encore,

Xavier Morel schrieb am 04.03.22 um 12:58:
> lxml provides support for custom Element classes (as well as element-ish e.g. Comment or PI) via the `ElementDefaultClassLookup` registry, and the ability to hook it into a parser. But that registry does not seem to have a slot for the root tree of the elements. Is there a hook somewhere to set *that*? I tried looking around the API docs but nothing really jumped out.

Do you really need something like that? Can't you just inherit from the ElementTree class? (Assuming that's what you meant.)

The reason why you can register your own Element classes is because they can appear all over the place in the API. The ElementTree class is either instantiated by the user or returned from the parse() function. That's mostly it. Ok, maybe XSLT. But still easy enough to wrap yourself.

> PS: the documentation for `set_default_parser` explains that it sets the default parser *for the current thread* and that "You can create a separate parser for each thread explicitly or use a parser pool." Does it mean that in a "don't call any API which gets an implicit parser and manage your parsers by hand" sense, or something else?

Parsers are really only used where an explicit "parser" argument is accepted. Everything else just inherits them. If you want to use your own parser, write a wrapper function for parse() that always passes it in, and then use that function instead.

Stefan
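The suggested parse() wrapper is a one-liner. It is sketched here with the stdlib xml.etree API, which accepts a `parser` argument the same way lxml's etree.parse() does; the parser configuration is just a placeholder for whatever options the application needs.

```python
import io
import xml.etree.ElementTree as ET

def parse(source):
    """parse() that always uses our own, explicitly configured parser."""
    # configure the parser here as needed; with lxml this would be
    # etree.XMLParser(...) with the desired options, reused across calls
    parser = ET.XMLParser()
    return ET.parse(source, parser=parser)

tree = parse(io.BytesIO(b"<root><child/></root>"))
print(tree.getroot().tag)  # root
```

Callers then use this wrapper everywhere instead of the module-level parse(), so the implicit default parser never comes into play.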
[lxml] Re: Compatibility issues between `lxml.etree.set_element_class_lookup`` and `lxml.html`
Salut,

Xavier Morel schrieb am 07.03.22 um 13:27:
> Sorry for the bother, but I've been looking at `lxml.etree.set_element_class_lookup`[0] as a way to add validation and features to lxml usage without having to ban "standard" lxml constructs (and to control usage by dependencies as well).

I consider the function fine for what it does, but if it gets in the way, don't use it. It's a global setting, which means that it can break stuff elsewhere, unintentionally and without warning. Just create your own parser instance and configure the class lookup only there.

> Is there a "proper" way to make these things collaborate? I looked at lxml.html and it looked like it might have to be rebuilt from the HTMLMixin (which already seems icky), but `objectify` is a Cython module, so there doesn't seem to be a good way to interact with it.

Cython modules are mostly just compiled Python modules and behave pretty much the same, from a user perspective. If you can read Python, you can probably read Cython code, and if you know how to use Python modules, you can probably also work with Cython compiled modules.

Stefan
[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value
Change by Stefan Behnel:

-- status: open -> closed

___
Python tracker <https://bugs.python.org/issue46798>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node
Dr. Volker Jaenisch schrieb am 04.03.22 um 00:02:
> Am 03.03.22 um 23:54 schrieb Stefan Behnel:
>> this reads like something you could implement on top of lxml.objectify, via subclassing and an appropriate element class lookup. This could really be a plain Python package that you could distribute on PyPI to give users an easy choice which interface they prefer. Not everything needs to be part of lxml itself.
>
> My prototype is still glued to lxml since I use internal Cython functions of lxml that are not exported to Python space. But with a little help of the kind lxml people it may be possible to completely separate it from lxml.

The idea is to do pretty much what objectify currently does, using (I guess) the same element lookup, but to use a Python subclass of the ObjectifiedElement class for the tree structure that implements your different attribute lookup scheme in "__getattr__".

The general mechanism for selecting element class implementations is described here:
https://lxml.de/element_classes.html

Stefan
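The "__getattr__" mechanism can be sketched without lxml at all: a subclass intercepts unknown attribute names and translates them before delegating to the base class's lookup. The class names, the prefix table, and the child storage below are made up for illustration; with lxml, the base class would be objectify.ObjectifiedElement and instances would come from the registered element class lookup.

```python
class BaseElement:
    """Stand-in for an objectify element with a namespaced child lookup."""
    def __init__(self, children):
        self._children = children  # maps "{namespace}localname" -> value

    def find_child(self, qname):
        return self._children[qname]

class PrefixedElement(BaseElement):
    # hypothetical scheme: well-known prefixes map to namespace URIs
    PREFIXES = {"gco": "http://www.isotc211.org/2005/gco"}

    def __getattr__(self, name):
        # called only for names not found by normal lookup;
        # "gco_CharacterString" -> "{http://...}CharacterString"
        prefix, _, local = name.partition("_")
        if prefix in self.PREFIXES and local:
            return self.find_child("{%s}%s" % (self.PREFIXES[prefix], local))
        raise AttributeError(name)

e = PrefixedElement({"{http://www.isotc211.org/2005/gco}CharacterString": "text"})
print(e.gco_CharacterString)  # text
```

Because __getattr__ only fires for attributes that the normal lookup misses, the subclass adds the prefixed access path without disturbing the base class's existing behaviour.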
[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node
Hi Volker,

this reads like something you could implement on top of lxml.objectify, via subclassing and an appropriate element class lookup. This could really be a plain Python package that you could distribute on PyPI to give users an easy choice which interface they prefer. Not everything needs to be part of lxml itself.

Stefan
[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node
Dr. Volker Jaenisch schrieb am 03.03.22 um 18:19: Therefore I am currently working on enabling LXML to have _ properties in objectify. The changes are not too complicated since the source code quality is good. I am hopeful that after the weekend I will have a fully functional prototype. As Holger wrote, the issue with prefixes is that they are provided by the input document. There are well-known prefixes for a handful of namespaces, but that is a pure naming convention and in no way an obligation. While I can see that it might be helpful for debugging purposes to see that there are attributes like "html_image", no-one keeps them from ending up as "s_image" or just "image" (with a default namespace and no prefix), if the creator of the specific document at hand decides so. Aside from debugging, I fail to see a use case for this. And it increases the risk for innocent users to write code that seems to work with most documents (that use "standard" prefixes) but fails for others (which tend to be missing from the test suite). So … I think keeping prefixes generally out of the interface is a good decision. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[lxml] Re: python lxml.objectify gives no attribute access to gco:CharacterString node
Dr. Volker Jaenisch schrieb am 01.03.22 um 16:06: To find the desired sibling the code loops over all childern and matches (parentNamespace, propertyName) against them. The correct operation of _findFollowingSibling should IMHO be: Make a lookup on all children (with the python property name only). If one match is found then return this match. If none or more than one match is found then no answer is possible. I see a major drawback with this behaviour, and that is non-local dependencies. If you have this XML: then "root.ch1" would give you the first child. Great, so you use that in your code. Now, someone decides to send you an input document that looks like this: And your code will suddenly fail to find "root.ch1". Depending on what your code does and how it does it, it may fail with an exception, or it may fail silently to find the desired data and just keep working without it. Note that the content of the XML file that your code is designed to process did not change at all. It's just that some entirely unrelated content was added, in a completely different and unrelated namespace. And it was just externally added to the input data, or maybe just some tiny portion it, without telling you or your code about it. Especially in places with optional content, where different namespaces are already a little more common than elsewhere, this is fairly likely to go unnoticed. I find this kind of behaviour dangerous enough to restrict the "magic" in the API to what is easy to understand and predict. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
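The non-local dependency Stefan describes is easy to reproduce with the standard library. The sketch below (hypothetical helper, made-up documents) implements the proposed "match on the property name only" lookup: it works on the first document, then breaks as soon as an unrelated namespaced sibling is added.

```python
import xml.etree.ElementTree as ET

doc_a = '<root xmlns:a="urn:a"><a:ch1>data</a:ch1></root>'
doc_b = ('<root xmlns:a="urn:a" xmlns:x="urn:x">'
         '<x:ch1>unrelated</x:ch1><a:ch1>data</a:ch1></root>')

def find_by_local_name(parent, local):
    # the proposed lookup: ignore namespaces, match the local name only,
    # and refuse to answer unless the match is unique
    matches = [c for c in parent if c.tag.rsplit('}', 1)[-1] == local]
    if len(matches) != 1:
        raise LookupError("no unique match for %r" % local)
    return matches[0]

print(find_by_local_name(ET.fromstring(doc_a), 'ch1').text)  # data
try:
    find_by_local_name(ET.fromstring(doc_b), 'ch1')
except LookupError as exc:
    # adding entirely unrelated content broke the previously working lookup
    print("doc_b:", exc)
```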
[issue46786] embed, source, track, wbr HTML elements not considered empty
Change by Stefan Behnel : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue46786> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46786] embed, source, track, wbr HTML elements not considered empty
Stefan Behnel added the comment: New changeset 345572a1a0263076081020524016eae867677cac by Jannis Vajen in branch 'main': bpo-46786: Make ElementTree write the HTML tags embed, source, track, wbr as empty tags (GH-31406) https://github.com/python/cpython/commit/345572a1a0263076081020524016eae867677cac -- ___ Python tracker <https://bugs.python.org/issue46786> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
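The effect of the change is easy to observe with the HTML output method of the stdlib serializer. A quick sketch (behaviour differs by version: with the fix in Python 3.11+, the void element is written without a closing tag; earlier versions emit an explicit `</source>`):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<video><source src="movie.webm"/></video>')
html = ET.tostring(root, method="html").decode()
# Python 3.11+: <video><source src="movie.webm"></video>
# earlier:      <video><source src="movie.webm"></source></video>
print(html)
```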
[issue46389] 3.11: unused generator comprehensions cause f_lineno==None
Stefan Behnel added the comment: Possibly also related, so I thought I'd mention it here (sorry if this is hijacking the ticket, seems difficult to tell). We're also seeing None values in f_lineno in Cython's test suite with 3.11a5: File "", line 1, in run_trace(py_add, 1, 2) ^^^ File "tests/run/line_trace.pyx", line 231, in line_trace.run_trace (line_trace.c:7000) func(*args) File "tests/run/line_trace.pyx", line 60, in line_trace.trace_trampoline (line_trace.c:3460) raise File "tests/run/line_trace.pyx", line 54, in line_trace.trace_trampoline (line_trace.c:3359) result = callback(frame, what, arg) File "tests/run/line_trace.pyx", line 81, in line_trace._create_trace_func._trace_func (line_trace.c:3927) trace.append((map_trace_types(event, event), frame.f_lineno - frame.f_code.co_firstlineno)) TypeError: unsupported operand type(s) for -: 'NoneType' and 'int' https://github.com/cython/cython/blob/7ab11ec473a604792bae454305adece55cd8ab37/tests/run/line_trace.pyx No generator expressions involved, though. (Much of that test was written while trying to get the debugger in PyCharm to work with Cython compiled modules.) There is a chance that Cython is doing something wrong in its own line tracing code, obviously. (I also remember seeing other tracing issues before, where the line reported was actually in the trace function itself rather than the code to be traced. We haven't caught up with the frame-internal changes yet.) -- nosy: +scoder ___ Python tracker <https://bugs.python.org/issue46389> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
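The failing expression above, `frame.f_lineno - frame.f_code.co_firstlineno`, can be reproduced in pure Python with a `sys.settrace` tracer (simplified sketch; on a correctly working interpreter `f_lineno` is always an int for line events, so the subtraction never fails):

```python
import sys

events = []

def tracer(frame, event, arg):
    if event == "line":
        # the same relative-offset computation as in the Cython test;
        # the bug report is about f_lineno unexpectedly being None here
        events.append(frame.f_lineno - frame.f_code.co_firstlineno)
    return tracer

def add(a, b):
    result = a + b
    return result

sys.settrace(tracer)
add(1, 2)
sys.settrace(None)
print(events)  # [1, 2]: one offset per traced line of add()
```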
Re: [xml] Release of libxml2 2.9.13
Nick Wellnhofer schrieb am 23.02.22 um 11:36: I asked on GNOME infra if it is possible to offer .tar.gz downloads, but this would require changes to the upload script. Thanks for asking. Stefan ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[issue46836] [C API] Move PyFrameObject to the internal C API
Stefan Behnel added the comment: I haven't looked fully into this yet, but I *think* that Cython can get rid of most of the direct usages of PyFrameObject by switching to the new InterpreterFrame struct instead. It looks like the important fields have now been moved over to that. That won't improve the situation regarding the usage of CPython internals, but it's probably worth keeping in mind before we start adding new API functions that work on frame objects. -- ___ Python tracker <https://bugs.python.org/issue46836> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value
Stefan Behnel added the comment: > IMHO if the developer doesn't manage the XML itself it is VERY unreasonable > to use the document value and not the developer one. I disagree. If the document says "this is the default if no explicit value is given", then I consider that just as good as providing a value each time. Meaning, the attribute *is* in fact present, just not explicitly spelled out on the element. I would specifically like to avoid adding a new option just to override the way the document distributes its attribute value spelling across DTD and document structure. In particular, the .get() method is the wrong place to deal with this. You can probably configure the parser to ignore the internal DTD subset, if that's what you want. -- ___ Python tracker <https://bugs.python.org/issue46798> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
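The behaviour in question is straightforward to demonstrate. In the sketch below (made-up document), the internal DTD subset declares a default for `item/@kind`; expat applies it during parsing, so the attribute is effectively present on the element and the Python-side default passed to `.get()` never applies:

```python
import xml.etree.ElementTree as ET

doc = '''<?xml version="1.0"?>
<!DOCTYPE root [
  <!ATTLIST item kind CDATA "widget">
]>
<root><item/></root>'''

item = ET.fromstring(doc).find('item')
print(item.get('kind', 'fallback'))    # widget: the document's default wins
print(item.get('missing', 'fallback'))  # fallback: no DTD default declared
```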
Re: [xml] Release of libxml2 2.9.13
Nick Wellnhofer via xml schrieb am 20.02.22 um 13:53: Version 2.9.13 of libxml2 is available at: https://download.gnome.org/sources/libxml2/2.9/ Thank you for the release, Nick! Note that starting with this release, libxml2 tarballs are published on download.gnome.org instead of ftp.xmlsoft.org. I noticed that they now use xz compression, whereas they were simply gzip compressed before. libxslt also changed the compression. That makes it more difficult to download them automatically, because scripts that want to list the available files now have to search for different file names. Also, Python 2.7 does not have built-in lzma compression support and needs an external module in order to handle it. (Both gz and bz2 have been supported essentially forever, OTOH.) And it seems that xz is not considered safe for long-term storage by everyone: https://www.nongnu.org/lzip/xz_inadequate.html Could you make the archives available in a (second) format that matches all (previous) releases? Apparently, both libxml2 and libxslt were made available with gz and bz2 compression before. Either of them would probably be fine. bz2 seems to compress equally well as xz here. (And compression speed, where bz2 suffers a bit, was never an issue for downloads anyway, just decompression speed, where all three are fine.) Thanks, Stefan ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
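For what it's worth, all three formats are trivial to handle from Python 3 itself (the `lzma` module only appeared in Python 3.3, which is why Python 2.7 needs an external package for xz). A toy comparison on a made-up repetitive payload:

```python
import bz2
import gzip
import lzma  # stdlib since Python 3.3; Python 2.7 needs a third-party module

payload = b"<libxml2 release tarball contents/>" * 200  # toy stand-in data
sizes = {
    "gz": len(gzip.compress(payload)),
    "bz2": len(bz2.compress(payload)),
    "xz": len(lzma.compress(payload)),
}
print(sizes)  # all three shrink the highly repetitive payload
```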
[lxml] Re: v4.8.0 breaking regression?
Charlie Clark schrieb am 22.02.22 um 17:51: On 22 Feb 2022, at 17:26, Stefan Behnel wrote: If you set STATIC_BUILD=true, and LIBXML_VERSION=2.9.12, lxml will use the git version instead of the release version. I just tried this but got the same result. Presumably, I did something wrong but ENVVARs are not my strength anyway. However, it sounds very much like a known issue that will hopefully disappear once 2.9.13 is released. MacPorts is normally pretty up to date, but I see that this hasn't been updated for nine months but 2.9.13 was only released on the 19th of February. Yes, 2.9.13 was freshly released. That may explain why it works for Bob. A static build would pick up the latest version. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[lxml] Re: v4.8.0 breaking regression?
Bob Kline schrieb am 22.02.22 um 17:29: On Tue, Feb 22, 2022 at 11:20 AM Stefan Behnel wrote: ... Help with building more universal macOS wheels would be appreciated. What would that involve? Finding a good way to do it. :) As I wrote, cibuildwheel probably has a way to do it from GitHub Actions, but changing build systems (or replacing the build configuration) isn't exactly something I'd like to put work into right now. I'm not saying that it would be difficult, just that it needs doing and testing, and probably a couple of iterations until everything runs smoothly again. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[lxml] Re: v4.8.0 breaking regression?
Charlie Clark schrieb am 22.02.22 um 09:48: On 21 Feb 2022, at 20:37, Jens Tröger wrote: Yes, when I installed lxml it built locally on my Intel Mac 10.14.6 with Python 3.9.10, and in another email I actually wanted to ask for a pre-compiled whl: Collecting lxml Using cached lxml-4.8.0.tar.gz (3.2 MB) Using legacy 'setup.py install' for lxml, since package 'wheel' is not installed. Installing collected packages: lxml Running setup.py install for lxml ... done Successfully installed lxml-4.8.0 FWIW I can confirm that this happens if lxml is built on the machine but not with the wheel This is locally built lxml Python : sys.version_info(major=3, minor=9, micro=10, releaselevel='final', serial=0) lxml.etree : (4, 8, 0, 0) libxml used : (2, 9, 12) libxml compiled : (2, 9, 12) libxslt used : (1, 1, 34) libxslt compiled : (1, 1, 34) 3 b'\n baz\n \n \n baz\n \n id="b-3">\n baz\n \n\n ' b'\n baz\n \n \n baz\n \n\n ' b'\n baz\n \n\n' And this with the wheel Python : sys.version_info(major=3, minor=9, micro=10, releaselevel='final', serial=0) lxml.etree : (4, 8, 0, 0) libxml used : (2, 9, 12) libxml compiled : (2, 9, 12) libxslt used : (1, 1, 34) libxslt compiled : (1, 1, 34) 3 b'\n baz\n \n ' b'\n baz\n \n ' b'\n baz\n \n' All libraries have the same version so it must be something else. I use MacPorts to keep libraries up to date. Sadly, libxml2 2.9.12 is not libxml2 2.9.12 here. On your machine, you probably have the latest release version installed. The lxml wheels are built with a newer git version that has a fix for this issue. Or a work-around, if you want. If you set STATIC_BUILD=true, and LIBXML_VERSION=2.9.12, lxml will use the git version instead of the release version. It would probably be worth adding a runtime detection for this issue, so that lxml can fail to import if it finds an incompatible libxml2 version. 
The broken behaviour seems heavy enough to fail hard instead of issuing just a warning (which the build currently does, but you normally won't see that in pip installations). Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[lxml] Re: v4.8.0 breaking regression?
Bob Kline schrieb am 21.02.22 um 17:14: got the expected (correct) output. This is on macOS 12.2.1 (M1). Another interesting data point is that although https://pypi.org/project/lxml/ claims that there are builds of 4.8.0 for Python 3.10, pip on this machine concluded that it needed to build lxml from code. Perhaps an M1 thing? I will see what happens on Linux and Windows. The macOS wheels are not currently compatible with M1, so you end up with a local build instead. Help with building more universal macOS wheels would be appreciated. I guess a switch to cibuildwheel would help, but I doubt that that's done lightly. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[issue46786] embed, source, track, wbr HTML elements not considered empty
Stefan Behnel added the comment: Makes sense. That list hasn't been updated in 10 years. -- versions: -Python 3.10, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue46786> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue46798] xml.etree.ElementTree: get() doesn't return default value, always ATTLIST value
Stefan Behnel added the comment: The question here is simply, which is considered more important: the default provided by the document, or the default provided by Python. I don't think it's a clear choice, but the way it is now does not seem unreasonable. Changing it would mean deliberate breakage of existing code that relies on the existing behaviour, and I do not see a reason to do that. -- resolution: -> not a bug stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue46798> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue24053] Define EXIT_SUCCESS and EXIT_FAILURE constants in sys
Stefan Behnel added the comment: > Any reasons the PR still not merged? There was dissent about whether these constants should be added or not. It doesn't help to merge a PR that is not expected to provide a benefit. -- ___ Python tracker <https://bugs.python.org/issue24053> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[lxml] Re: Undefined symbol error when using lxml from within OBS on Linux machine
Hi, Daniel Beiter schrieb am 10.02.22 um 14:53: For a project I am using OBS (Open Broadcaster Software) that provides Python scripting capabilities to manipulate scenes, objects, etc. ( https://obsproject.com/wiki/Getting-Started-With-OBS-Scripting ). The API is in C and wrapper functions for Python are built by SWIG ( https://obsproject.com/docs/scripting.html ). When loading a Python script from within the OBS software containing nothing else but 'from lxml import etree', it throws an import error because of an undefined symbol. Outside of OBS lxml works as expected and no errors occur. from lxml import etree ImportError: /home/[USER]/.local/lib/python3.8/site-packages/lxml/etree.cpython-38-x86_64-linux-gnu.so: undefined symbol: PyExc_ImportError Can you import other binary packages that you install with pip? E.g. pyyaml or numpy? That symbol is part of Python. It's definitely there. The question is how OBS integrates with Python. Does the application (or the Python library, if it provides one) export the Python symbols? You can list the exported symbols of a library with "nm -D the_library.so". There should be loads of "Py..." symbols in there, including the one above. When Cythonizing src/lxml/etree.pyx warnings occur that the local variable 'args' is referenced before assigned That's unrelated. (And actually a false positive.) Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
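The symbol named in the traceback is an ordinary data export of libpython. In a normally linked interpreter process it resolves fine, which a quick `ctypes` check can demonstrate (an embedding application like OBS must export these symbols for extension modules such as lxml.etree to load):

```python
import ctypes

# Look up the PyExc_ImportError data symbol in the running interpreter's
# exported symbols. in_dll() raises ValueError if the symbol is missing --
# the ctypes analogue of the dynamic linker error from the report.
sym = ctypes.py_object.in_dll(ctypes.pythonapi, "PyExc_ImportError")
print(sym.value is ImportError)  # True when the symbol resolves
```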
[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)
Petr Viktorin schrieb am 10.02.22 um 11:22: So, should there be a mechanism to set source/lineno/position on tracebacks/exceptions, rather than always requiring a frame for it? There's "_PyTraceback_Add()" currently, but it's incomplete in terms of what Cython would need. As it stands, Cython could make use of a function that accepted - string object arguments for filename and function name - (optionally) a 'globals' dict (or a reference to the current module) - (optionally) a 'locals' mapping - (optionally) a code object - a C integer source line - a C integer position, probably start and end lines and columns to add a traceback level to the current exception. I'm not sure about the code object since that's a rather heavy thing, but given that Cython needs to create code objects in order for its functions to be introspectible, that seems like a worthwhile option to have. However, with the recent frame stack refactoring and frame objects now being lazily created, according to https://bugs.python.org/issue44032 https://bugs.python.org/issue44590 I guess Cython should rather integrate with the new stack frame infrastructure in general. That shifts the requirements a bit. An API function like the above would then still be helpful for the reduced API compile mode, I guess. But as soon as Cython uses InterpreterFrame structs internally, it would no longer be helpful for the fast mode. InterpreterFrame objects are based on byte code instructions again, which brings us back to co_positions. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YSP36JL5SRSPEG4X67G5RMWUWLVXSDC5/ Code of Conduct: http://python.org/psf/codeofconduct/
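The effect being discussed, i.e. making a traceback level point at a source file of one's choosing, has a pure-Python analogue: compile the code object with the original file name. This is a simplified sketch (the `module.pyx` file name is made up for illustration), not what generated C code actually does:

```python
# Compile with a chosen file name so the innermost traceback level
# reports "module.pyx" instead of the real source of the string.
code = compile("raise RuntimeError('boom')", "module.pyx", "exec")
try:
    exec(code)
except RuntimeError as exc:
    tb = exc.__traceback__
    while tb.tb_next:  # walk to the innermost traceback level
        tb = tb.tb_next
    filename = tb.tb_frame.f_code.co_filename
    lineno = tb.tb_lineno
    print(filename, lineno)  # module.pyx 1
```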
[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)
Andrew Svetlov schrieb am 09.02.22 um 19:40: Stefan, do you really need to emulate call stack with positions? Could the __note__ string with generated Cython part of exception traceback solve your needs (https://www.python.org/dev/peps/pep-0678/) ? Thanks for the link, but I think it would be surprising for users if a traceback displayed some code positions differently than others, when all code lines refer to Python code. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BSDVX7MJFDZ6PFB7FG7Z3R4IO56FZ47T/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)
Guido van Rossum schrieb am 09.02.22 um 19:36: On Wed, Feb 9, 2022 at 9:41 AM Pablo Galindo Salgado wrote: On Wed, 9 Feb 2022 at 17:38, Stefan Behnel wrote: Pablo Galindo Salgado schrieb am 09.02.22 um 17:40: Should there be a getter/setter for co_positions? We consider the representation of co_postions private Yes, and that's the issue. I can only say that currently, I am not confident to expose such an API, at least for co_positions, as the internal implementation is very likely to heavily change and we want to have the possibility of changing it between patch versions if required (to address bugs and other things like that). > > It might require a detailed API design proposal coming from outside > CPython > (e.g. from Cython) to get this to change. I imagine for co_positions in > particular this would have to use a "builder" pattern. > > I am unclear on how this would work though, given that Cython generates C > code, not CPython bytecode. How would the synthesized co_positions be > used? > Would Cython just generate a co_positions fragment at the moment an > exception is raised, pointing at the .pyx file from which the code was > generated? So, what we currently do is to update the line number (which IIRC is really the start line number of the current function) on the current frame when an exception is raised, and the byte code offset to 0. That's a hack but shows the correct code line in the traceback. Probably conflicts with pdb, but there are still other issues with that anyway. I remember looking into the old lnotab mapping at some point and trying to implement that with fake byte code offsets but never got it finished. The idea is pretty simple, though. Instead of byte code offsets, we'd count our syntax tree nodes and just store the code position range of each syntax node at the "byte code offset" of the node's counter number. That's probably fairly easy to do in C code, maybe even with a statically allocated data structure. 
Then, instead of setting the frame function's line number, we'd set the frame's byte code instruction counter to the number of the failing syntax node, and CPython would retrieve the code position from that offset. That sounds simple enough, probably simpler than any API usage – but depends on implementation details. Especially the idea of storing all this statically in the data segment of the shared library sounds very tempting. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GAJFB6ABFYXF3RFXFDQ3YUZD23FMXPEY/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] PEP-657 and co_positions (was: Please update Cython *before* introcuding C API incompatible changes in Python)
Pablo Galindo Salgado schrieb am 09.02.22 um 17:40: Should there be a getter/setter for co_positions? We consider the representation of co_positions private Yes, and that's the issue. so we don't want (for now) to add getters/setters. If you want to get the position of an instruction, you can use PyCode_Addr2Location What Cython needs is the other direction. How can we provide the current source position range for a given piece of code to an exception? As it stands, the way to do this is to copy the implementation details of CPython into Cython in order to let it expose the specific data structures that CPython uses for its internal representation of code positions. I would prefer using an API instead that allows exposing this mapping directly to CPython's traceback handling, rather than having to emulate byte code positions. While that would probably be quite doable, it's far from a nice interface for something that is not based on byte code. And that's not just a Cython issue. The same applies to Domain Specific Languages or other programming languages that integrate with Python and want to show users code positions for their source code. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VQSWX6MFKIA3RYPSX7O6RTVC422LTJH4/ Code of Conduct: http://python.org/psf/codeofconduct/
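For reference, the direction that *is* exposed (offset to location, the PyCode_Addr2Location direction) can be inspected from Python on 3.11+; the sketch below is guarded by a version check since `co_positions` and `Instruction.positions` do not exist earlier:

```python
import dis
import sys

def divide(a, b):
    return a / b

if sys.version_info >= (3, 11):
    # Each instruction offset maps to a (start_line, end_line,
    # start_col, end_col) source range. The reverse mapping --
    # source range to "position" -- is what the mail asks for.
    for instr in dis.get_instructions(divide):
        print(instr.offset, instr.positions)
```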
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
Inada Naoki schrieb am 08.02.22 um 06:15: On Tue, Feb 8, 2022 at 1:47 PM Guido van Rossum wrote: Thanks for trying it! I'm curious why it would be slower (perhaps less locality? perhaps the ...Id... APIs have some other trick up their sleeve?) but since it's also messier and less backwards compatible than just leaving _Py_IDENTIFIER alone and just not using it, I'd say let's not spend more time on that alternative and just focus on the two other horses still in the race: immortal objects or what you have now. I think it's because statically allocated strings are not interned. That would explain such a difference. I think deepfreeze should stop using statically allocated strings for interned strings too. … or consider the statically allocated strings the interned string value. Unless another one already exists, but that shouldn't be the case for CPython internal strings. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5NE7EI3TVW4C3ZZI6LO5HNPIZRQNPMHG/ Code of Conduct: http://python.org/psf/codeofconduct/
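The point about interning is visible from Python as well: interned strings are deduplicated process-wide, so lookups can rely on fast pointer comparison, while a string built at runtime is a distinct object until it is interned. A small sketch (the attribute name is just an example):

```python
import sys

parts = ["tp_", "richcompare"]
runtime_name = "".join(parts)        # fresh object, built at runtime
interned = sys.intern(runtime_name)  # canonical, shared object

# Both intern() calls return the very same object, so identity holds.
print(interned is sys.intern("tp_richcompare"))  # True
```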
[issue45948] Unexpected instantiation behavior for xml.etree.ElementTree.XMLParser(target=None)
Stefan Behnel added the comment: This is a backwards incompatible change, but unlikely to have a wide impact. I wondered for a second whether the change goes in the right direction, because it's not unreasonable to pass "None" for saying "I want no target". But it's documented this way and lxml does it the same way, so I agree that this should be changed to make "None" behave the same as no argument. -- ___ Python tracker <https://bugs.python.org/issue45948> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
Eric Snow schrieb am 04.02.22 um 17:35: On Fri, Feb 4, 2022 at 8:21 AM Stefan Behnel wrote: Correct. We (intentionally) have our own way to intern strings and do not depend on CPython's identifier framework. You're talking about __Pyx_StringTabEntry (and __Pyx_InitString())? Yes, that's what we generate. The C code parsing is done here: https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L531-L550 The deduplication is a bit complex on our side because it needs to handle Python source encodings, and also distinguishes between identifiers (that become 'str' in Py2), plain Unicode strings and byte strings. You don't need most of that for plain C code. But it's done here: https://github.com/cython/cython/blob/79637b23da77732e753b1e1ab5669b3e29978be3/Cython/Compiler/Code.py#L1009-L1088 And then there's a whole bunch of code that helps in getting Unicode character code points and arbitrary byte values in very long strings pushed through C compilers, while keeping it mostly readable for interested users. :) https://github.com/cython/cython/blob/master/Cython/Compiler/StringEncoding.py You probably don't need that either, as long as you only deal with ASCII strings. Anyway, have fun. Feel free to ask if I can help. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QHJBAKIQUKFPIM6GZ7DYNJF3HDMDQQUH/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
Ronald Oussoren via Python-Dev schrieb am 03.02.22 um 14:46: On 2 Feb 2022, at 23:41, Eric Snow wrote: * a little less convenient: adding a global string requires modifying a separate file from the one where you actually want to use the string * strings can get "orphaned" (I'm planning on checking in CI) * some strings may never get used for any given ./python invocation (not that big a difference though) The first two cons can probably be fixed by adding some indirection, with some markers at the place of use and a script that uses those to generate the C definitions. Although my gut feeling is that adding the CI check you mention is good enough and adding the tooling for generating code isn’t worth the additional complexity. It's what we do in Cython, and it works really well there. It's very straightforward, you just write something like PYUNICODE("some text here") PYIDENT("somename") in your C code and Cython creates a deduplicated global string table from them and replaces the string constants with the corresponding global variables. (We have two different names because an identifier in Py2 is 'str', not 'unicode'.) Now, the thing with CPython is that the C sources where the replacement would take place are VCS controlled. And a script that replaces the identifiers would have to somehow make sure that the new references do not get renamed, which would lead to non-local changes when strings are added. What you could try is to number the identifiers, i.e. use a macro like _Py_STR(123, "some text here") where you manually add a new identifier as _Py_STR("some text here") and the number is filled in automatically by a script that finds all of them, deduplicates, and adds new identifiers at the end, adding 1 to the maximum number that it finds. That makes sure that identifiers that already have an ID number will not be touched, deleted strings disappear automatically, and non-local changes are prevented. 
Defining the _Py_STR() macro as #define _Py_STR(id, text) (_Py_global_string_table[id]) or #define _Py_STR(id, text) (_Py_global_string_table##id) would also give you a compile error if someone forgets to run the script. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LD3JM2NQ5ZUZDK63RH4IVZPCZ7HC4X3G/ Code of Conduct: http://python.org/psf/codeofconduct/
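The numbering script Stefan sketches could look roughly like this (a hypothetical sketch, not CPython tooling): find unnumbered `_Py_STR("text")` uses, reuse ids already assigned to known strings, and append new ids after the current maximum so existing identifiers are never renumbered.

```python
import re

# Matches only unnumbered uses: _Py_STR("...") -- a leading id like
# _Py_STR(123, "...") does not match, so numbered uses stay untouched.
UNNUMBERED = re.compile(r'_Py_STR\(\s*"((?:[^"\\]|\\.)*)"\s*\)')

def number_strings(source, table):
    next_id = max(table.values(), default=-1) + 1

    def repl(match):
        nonlocal next_id
        text = match.group(1)
        if text not in table:
            table[text] = next_id  # new string: append after the maximum
            next_id += 1
        return '_Py_STR(%d, "%s")' % (table[text], text)

    return UNNUMBERED.sub(repl, source)

table = {"__init__": 0}  # ids assigned by an earlier run
source = 'x = _Py_STR("__main__"); y = _Py_STR("__init__")'
print(number_strings(source, table))
# x = _Py_STR(1, "__main__"); y = _Py_STR(0, "__init__")
```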
[Python-Dev] Re: Please update Cython *before* introcuding C API incompatible changes in Python
Petr Viktorin schrieb am 03.02.22 um 13:47: On 02. 02. 22 11:50, Stefan Behnel wrote: Maybe we should advertise the two modes more. And make sure that both work. There are certainly issues with the current state of the "limited API" implementation, but that just needs work and testing. I wonder if it can be renamed? "Limited API" has a specific meaning since PEP 384, and using it for the public API is adding to the general confusion in this area :( I was more referring to it as an *existing* compilation mode of Cython that avoids the usage of CPython implementation details. The fact that the implementation is incomplete just means that we spill over into non-limited API code when no limited API is available for a certain feature. That will usually be public API code, unless that is really not available either. One recent example is the new error locations in tracebacks, where PEP 657 explicitly lists the new "co_positions" field in code objects as an implementation detail of CPython. If we want to implement this in Cython, then there is no other way than to copy these implementation details pretty much verbatim from CPython and to depend on them. https://www.python.org/dev/peps/pep-0657/ In this specific case, we're lucky that this can be considered an entirely optional feature that we can separately disable when users request "public API" mode (let's call it that). Not sure if that's what users want, though. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/A55HYBIFBOTAX5IB4YUYWUHI3IDLRD2F/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Moving away from _Py_IDENTIFIER().
Victor Stinner schrieb am 03.02.22 um 22:46: Oh right, Cython seems to be a false positive. A code search found 3 references to __Pyx_PyObject_LookupSpecial(): PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Compiler/ExprNodes.py: lookup_func_name = '__Pyx_PyObject_LookupSpecial' PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Compiler/Nodes.py: code.putln("%s = __Pyx_PyObject_LookupSpecial(%s, %s); %s" % ( PYPI-2022-01-26-TOP-5000/Cython-0.29.26.tar.gz: Cython-0.29.26/Cython/Utility/ObjectHandling.c: static CYTHON_INLINE PyObject* __Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject* attr_name) { Oh, that's not "_PyObject_LookupSpecial()", it doesn't use the _Py_Identifier type: static CYTHON_INLINE PyObject* __Pyx_PyObject_LookupSpecial(PyObject* obj, PyObject* attr_name) { ... } Correct. We (intentionally) have our own way to intern strings and do not depend on CPython's identifier framework. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4ATP4FSVRNI5CLAJDN43QRDH5IHW7BW2/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Victor Stinner schrieb am 02.02.22 um 23:23: On Wed, Feb 2, 2022 at 3:54 PM Stefan Behnel wrote: So people using stable Python versions like Python 3.10 would not need Cython, but people testing the "next Python" (Python 3.11) would not have to manually remove generated C code. That sounds like an environment variable might help? Something like CYTHON_FORCE_REGEN=1 would be great :-) https://github.com/cython/cython/commit/b859cf2bd72d525a724149a6e552abecf9cd9d89 Note that this only applies when cythonize() is actually called. Some setup.py scripts may not do that unless requested to. My use case is to use a project on the "next Python" version (the main branch) when the project contains outdated generated C code, whereas I have a more recent Cython version installed. That use case would probably be covered by the Cython version check now, in case that stays in (the decision is pending user feedback). Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/N6R5BE4GVNYRUTOET5QRQ5N2ZCJYZC7X/ Code of Conduct: http://python.org/psf/codeofconduct/
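As a minimal sketch of the decision logic behind such an environment variable (the CYTHON_FORCE_REGEN name comes from the commit linked above; the helper function itself is hypothetical, not Cython's API):

```python
import os

def needs_cython_run(c_file_exists: bool, env=os.environ) -> bool:
    """Illustrative helper: regenerate when CYTHON_FORCE_REGEN is set,
    or when the pre-generated C file is missing."""
    force = env.get("CYTHON_FORCE_REGEN", "0") not in ("0", "")
    return force or not c_file_exists
```

A build script could call this before deciding whether to invoke cythonize() at all, which sidesteps the "only applies when cythonize() is actually called" caveat.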
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Ronald Oussoren via Python-Dev schrieb am 02.02.22 um 16:44: On 2 Feb 2022, at 11:50, Stefan Behnel wrote: Petr Viktorin schrieb am 02.02.22 um 10:22: - "normal" public API, covered by the backwards compatibility policy (users need to recompile for every minor release, and watch for deprecation warnings) That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it stands. I can see that being a nice feature that just deserves a more suitable name. (The name was chosen because it was meant to also internally define "Py_LIMITED_API" at some point. Not sure if it will ever do that.) - internal API (underscore-prefixed names, `internal` headers, things documented as private) AFAIK, only the last one is causing trouble here. Yeah, and that's the current default mode on CPython. Is it possible to automatically pick a different default version when building with a too new CPython version? That way projects can at least be used and tested with pre-releases of CPython, although possibly with less performance. As I already wrote elsewhere, that is making the assumption (or at least optimising for the case) that a new CPython version always breaks Cython. And it has the drawback that we'd get less feedback on the "normal" integration and may thus end up noticing problems only later in the CPython development cycle. I don't think this really solves a problem. In any case, before we start playing with the default settings, I'd rather let users see what *they* can make of the available options. Then we can still come back and see which use cases there are and how to support them better. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2SIGLMW4HNF5BDF2DTFZFXCHNSR4VAGB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Petr Viktorin schrieb am 02.02.22 um 10:22: Moving off the internal (unstable) API would be great, but I don't think Cython needs to move all the way to the limited API. There are three "levels" in the C API: - limited API, with long-term ABI compatibility guarantees That's what "-DCYTHON_LIMITED_API -DPy_LIMITED_API=..." is supposed to do, which currently fails for much if not most code. - "normal" public API, covered by the backwards compatibility policy (users need to recompile for every minor release, and watch for deprecation warnings) That's probably close to what "-DCYTHON_LIMITED_API" does by itself as it stands. I can see that being a nice feature that just deserves a more suitable name. (The name was chosen because it was meant to also internally define "Py_LIMITED_API" at some point. Not sure if it will ever do that.) - internal API (underscore-prefixed names, `internal` headers, things documented as private) AFAIK, only the last one is causing trouble here. Yeah, and that's the current default mode on CPython. Maybe we should advertise the two modes more. And make sure that both work. There are certainly issues with the current state of the "limited API" implementation, but that just needs work and testing. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ESEPW36K3PH4RM7OFVKAOE4QMBI2WYVU/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Victor Stinner schrieb am 02.02.22 um 11:35: I wish that there would be a 3rd option: ship C code generated by Cython *but* run Cython if this C code "looks" outdated, for example if building the C code fails with a compiler error. So, one thing I did yesterday was to make sure that .c files get regenerated when a different Cython version is used at build time than what was used to generate them originally. Thinking about this some more now, I'm no longer sure that this is really a good idea, because it can lead to "random" build failures when a package does not pin its Cython version and a newer (or, probably worse, older) one happens to be installed at build time. Not sure how to best deal with this. I'm open to suggestions, although this might be the wrong forum. Let's discuss it in a ticket: https://github.com/cython/cython/issues/4611 Note that what you propose sounds more like a setuptools feature than a Cython feature, though. So people using stable Python versions like Python 3.10 would not need Cython, but people testing the "next Python" (Python 3.11) would not have to manually remove generated C code. That sounds like an environment variable might help? I don't really want to add something like a "last supported CPython version". There is no guarantee that the code breaks between CPython versions, so that would just introduce an artificial support blocker. In Fedora RPM packages of Python projects, we have to force manually running Cython. For example, the numpy package does: "rm PKG-INFO" with the comment: "Force re-cythonization (ifed for PKG-INFO presence in setup.py)". https://src.fedoraproject.org/rpms/numpy/blob/rawhide/f/numpy.spec#_107 In my pythonci project, I use a worse hack, I search for generated C files and remove them manually with this shell command: rm -f -v $(grep -rl '/\* Generated by Cython') PKG-INFO This command searches for the pattern "/* Generated by Cython". Right. Hacks like these are just awful. There must be a better way.
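The version-based regeneration check can be sketched from the banner that the grep above relies on. This is an illustrative parser, not Cython's actual implementation, and it assumes the generated file starts with a comment of the form "/* Generated by Cython X.Y.Z */":

```python
import re

def generated_cython_version(first_line: str):
    """Return the Cython version recorded in a generated C file, or None.

    Assumes a banner like '/* Generated by Cython 0.29.26 */' as the
    first line, matching the grep pattern shown above.
    """
    m = re.match(r"/\* Generated by Cython (\S+)", first_line)
    return m.group(1) if m else None
```

A build tool could compare this against the installed Cython version and regenerate on mismatch, which is exactly where the "random build failure" concern about unpinned Cython versions comes from.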
Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/V76GA5DRWPEJ7PRBSPRQX335WARZLUHJ/ Code of Conduct: http://python.org/psf/codeofconduct/
[lxml] Re: Missing PDFs from lxml.de/lxml-VERSION.pdf?
Hi, Thomas Schraitle schrieb am 02.02.22 um 08:20: I do not really remember, but some time ago it was possible to download the latest PDF from lxml.de/lxml-.pdf. This worked quite well, but now I get a 404. Is there a replacement that I can use? If not, would it be possible to build the PDF from the sources and upload it to the assets on a GitHub release? Yeah, that was based on LaTeX PDF generation, broke on my machine at some point and wasn't repaired since. PR welcome, especially if it gets the machinery running as a GitHub Actions job in the wheel build workflow. Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Guido van Rossum schrieb am 02.02.22 um 01:43: It may be hard to imagine if you're working on Cython, which only exists because of performance needs, but there are other things that people want to test with the upcoming CPython release in addition to performance I know. Cython (and originally Pyrex) has come a long way from a tool to get stuff done to a dependency that a large number of packages depend on. Maintainer decisions these days are quite different from those 10 years ago. Let alone 20. Let's just try to keep things working in general, and fix stuff that needs to be broken. On Tue, Feb 1, 2022 at 4:14 PM Stefan Behnel wrote: I'd rather make it more obvious to users what their intentions are. And there is already a way to do that – the Limited API. (and similarly, HPy) Your grammar confuses me. Do you want users to be clearer in expressing their intentions? Erm, sort of. They should be able to choose and express what they prefer, in a simple way. For Cython, support for the Limited API is still work in progress, although many things are in place already. Getting it to work completely would give users a simple way to decide whether they want to opt in for a) speed, lots of wheels and adaptations for each CPython version, or b) less performance, less hassle. But until that work is complete, we're stuck with the unlimited API, right? And by its own statements in a recent post here, HPy is still not ready for all use cases, so it's also still a pipe dream. Yes. HPy is certainly far from ready for anything real, but even for the Limited API, it's still unclear whether it's actually complete enough to cover Cython's needs. Basically, the API that Cython uses must really be able to implement CPython on top of itself. And at the same time interact not with the reimplementation but with the underlying original, at the C level. The C-API, and especially the Limited API, were never really meant for that.
As it looks now, that switch can be done after the code generation, by defining a simple C define in their build script. That also makes both modes easily comparable. I think that is as good as it can get. Do you have specific instructions for package developers here? I could imagine that the scikit-learn maintainer (sorry to pick on you guys :-) might not know where to start with this if until now they've always been able to rely on either numpy wheels or building everything from source with default settings. It's not well documented yet, since the implementation isn't complete, and so, a bunch of things simply won't work. I don't remember if the buffer protocol is part of the Limited API by now, but last I checked it was still missing, so the scikit-learn (or NumPy) people would be fairly unhappy with the current state of affairs. But it's mostly just passing "-DCYTHON_LIMITED_API" to your C compiler. That's the part that will still work but won't do (yet) what you think. Because then, you currently also have to define "-DPy_LIMITED_API=..." and that's when your C compiler will get angry with you. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2UFG7IPKR77HQG36BZAUEUDJJKIGBSLE/ Code of Conduct: http://python.org/psf/codeofconduct/
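The "simple C define in the build script" can be illustrated with a setuptools sketch. The module and file names below are hypothetical; the two macro names are the ones discussed above, and this only shows where such defines would go, not a working limited-API build:

```python
from setuptools import Extension

# Hypothetical extension module; "mymod.c" is assumed to be
# pre-generated by Cython.
ext = Extension(
    "mymod",
    sources=["mymod.c"],
    define_macros=[
        ("CYTHON_LIMITED_API", "1"),
        # Also defining ("Py_LIMITED_API", "0x030B0000") is the part that
        # currently makes the C compiler "angry", so it is left out here.
    ],
)
```

The same effect can be had without touching setup.py by exporting the define through CFLAGS, which is what makes the two modes easy to compare on identical generated code.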
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Thomas Caswell schrieb am 01.02.22 um 23:15: I think it would be better to discourage projects from including the output of cython in their sdists. They should either have cython as a build-time requirement or provide built wheels (which are specific to a platform and CPython version). The middle ground of not expecting the user to have cython while expecting them to have a working C compiler is a very narrow case and I think asking those users to install cython is worth the forward compatibility for Python versions you get by requiring people installing from source to re-cythonize. I agree. Shipping the generated C sources was a very good choice as long as CPython's C-API was very stable and getting a build time dependency safely installed on user side was very difficult. These days, it's the opposite way. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KTWDJGHPQW7AIKDQQYV4IFHAKQZVXACL/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Guido van Rossum schrieb am 02.02.22 um 00:21: On Tue, Feb 1, 2022 at 3:07 David wrote: Greg Ewing wrote: To address this there could be an option to choose between "compatible code" and "fast code", with the former restricting itself to the stable API. To some extent, that exists at the moment - many of the real abuses of the CPython internals can be controlled by setting C defines. For the particular feature that caused this discussion the majority of the uses can be turned off by defining CYTHON_USE_EXC_INFO_STACK=0 and CYTHON_FAST_THREAD_STATE=0. (There's still a few uses relating to coroutines, but those two flags are sufficient to get Cython to build itself and Numpy on Python 3.11a4). Obviously it could still be better. But the desire to support PyPy (and the beginnings of the limited API) mean that Cython does actually have alternate "clean" code-paths for a lot of cases. Hm... So maybe the issue is either with Cython's default settings (perhaps traditionally it defaults to "as fast as possible but relies on internal APIs a lot"?) or with the Cython settings selected by default by projects *using* Cython? I wonder if a solution during CPython's rocky alpha release cycle could be to default (either in Cython or in projects using it) to the "not quite as fast but not relying on a lot of internal APIs" mode, and to switch to Cython's faster mode only once (a) beta is entered and (b) Cython has been fixed to work with that beta? This seems tempting – with the drawback that it would make Cython modules less comparable between final and alpha/beta CPython releases. So users would start reporting ghost performance regressions because it (understandably) feels important to them that the slow-down they witness needs to be resolved before the final release, and they just won't know that this will happen automatically triggered by the version switch. :) Feels a bit like car manufacturers who switch their exhaust cleaners on and off based on the test mode detection.
More importantly, though, we'd get less bug reports during the alpha/beta cycle ourselves, because things may look like they work but can still stop working when we switch back to fast mode. I'd rather make it more obvious to users what their intentions are. And there is already a way to do that – the Limited API. (and similarly, HPy) For Cython, support for the Limited API is still work in progress, although many things are in place already. Getting it to work completely would give users a simple way to decide whether they want to opt in for a) speed, lots of wheels and adaptations for each CPython version, or b) less performance, less hassle. As it looks now, that switch can be done after the code generation, by defining a simple C define in their build script. That also makes both modes easily comparable. I think that is as good as it can get. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FXSNX7UCQWNXXC7OWG4LBLILAYXQEOUB/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Hi Irit, Irit Katriel via Python-Dev schrieb am 01.02.22 um 23:04: There are two separate issues here. One is the timing of committing changes into cython, and the other is the process by which the cython devs learn about cpython development. On the first issue, you wrote: I'm reluctant to work on adapting Cython during alphas, because it happened more than once that incompatible changes in CPython were rolled back or modified again during alpha, beta and rc phases. That means more work for me and the Cython project, and its users. Code that Cython users generate and release on their side with a release version of Cython will then be broken, and sometimes even more broken than with an older Cython release. I saw in your patch that you make changes such that they impact only the new cpython version. So for old versions the generated code should not be broken. Surely you don't guarantee that cython code generated for an alpha version of cpython will work on later versions as well? Users who generate code for an alpha version should regenerate it for the next alpha and for beta, right? I'd just like to note that we are talking about three different projects and dependency levels here (CPython, Cython and a project that uses Cython), all three have different release cycles, and not all projects can afford to go through a new release with a new Cython version regularly or on the "emergency" event of a new CPython release. Some even don't provide wheels and require their users to do a source build on their side. Often with a fixed Cython version dependency, or even with pre-generated and shipped C sources, which makes it harder for the end users to upgrade Cython as a work-around. But at least it should be as easy for the maintainers as updating their Cython version and pushing a new release. In most cases. And things are also becoming easier these days with improvements in the packaging ecosystem.
It can just take a bit until everyone has had the chance to upgrade along the food chain. On the second issue: I don't have the capacity to follow all relevant changes in CPython, incompatible or not. We get that, and this is why we're asking to work with you on cython updates so that this will be easier for all of us. There are a number of cpython core devs who would like to help cython maintenance. We realise how important and thinly resourced cython is, and we want to reduce your maintenance burden. With better communication we could find ways to do that. I'm sure we will. Thanks for your help. It is warmly appreciated. Returning to the issue that started this thread - how do you suggest we proceed with the exc_info change? I'm not done sorting out the options yet. Regarding CPython, I think it's best to keep the current changes in there. It should be easier for us to continue from where we are now than to adapt again to a revert in CPython. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BHIQL4P6F7OPMCAP6U24XEZUPQKI62UT/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Greg Ewing schrieb am 01.02.22 um 23:33: On 2/02/22 8:48 am, Guido van Rossum wrote: It seems to me that a big part of the problem is that Cython feels entitled to use arbitrary CPython internals. I think the reason for this is that Cython is trying to be two things at once: (1) an interface between Python and C, (2) a compiler that turns Python code into fast C code. To address this there could be an option to choose between "compatible code" and "fast code", with the former restricting itself to the stable API. There is even more than such an option. We use a relatively large set of feature flags that allow us to turn the usage of certain implementation details of the C-API on and off. We use this to adapt to different Python C-API implementations (currently CPython, PyPy, GraalPython and the Limited C-API), although with different levels of support and reliability. Here's the complete list of feature sets for the different targets: https://github.com/cython/cython/blob/5a76c404c803601b6941525cb8ec8096ddb10356/Cython/Utility/ModuleSetupCode.c#L56-L311 This can also be used to enable and disable certain dependencies on CPython implementation details, e.g. PyList, PyLong or PyUnicode, but also type specs versus PyTypeObject structs. Most of these feature flags can be disabled by users. There is no hard guarantee that this always works, because it's impossible to test all combinations, and then there are bugs as well, but most of the flags are independent, which should usually allow disabling them independently. So, one of the tools that we have in our sleeves when it comes to supporting new CPython versions is also to selectively disable the dependency on a certain C-API feature that changed, at least until we have a way to adapt to the change itself. In the specific case of the "exc_info" changes, however, that didn't quite work, because that change was really not anticipated at that level of impact.
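A hedged sketch of how such feature flags can be switched off from a build environment. The macro names are the real Cython feature flags mentioned in this thread; the install command at the end is illustrative only:

```shell
# Compose CFLAGS that turn off two implementation-detail code paths.
FLAGS="-DCYTHON_USE_EXC_INFO_STACK=0 -DCYTHON_FAST_THREAD_STATE=0"
echo "$FLAGS"
# A source build would then pick them up, e.g. (hypothetical package name):
#   CFLAGS="$FLAGS" pip install --no-binary :all: some-package
```

Because the defines only affect C compilation, this works on already-generated C sources without re-running Cython.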
But there is an implementation for Cython 3.0 alpha now, and we'll eventually have a legacy 0.29.x release out that will also adapt in one way or another. Just takes a bit more time. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QPAWLCS2FINPLVSDFFQCMVIELXETKQ3W/ Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Please update Cython *before* introducing C API incompatible changes in Python
Christian Heimes schrieb am 01.02.22 um 16:42: On 01/02/2022 16.08, Victor Stinner wrote: I would prefer to introduce C API incompatible changes differently: first fix Cython, and *then* introduce the change. - (1) Propose a Cython PR and get it merged - (2) Wait until a new Cython version is released - (3) If possible, wait until numpy is released with regenerated Cython code - (4) Introduce the incompatible change in Python Note: Fedora doesn't need (3) since we always regenerated Cython code in numpy. this is a reasonable request for beta releases, but IMHO it is not feasible for alphas. During alphas we want to innovate fast and play around. Your proposal would slow down innovation and impose additional burden on core developers. Let's at least try not to run into a catch-22. I'm reluctant to work on adapting Cython during alphas, because it happened more than once that incompatible changes in CPython were rolled back or modified again during alpha, beta and rc phases. That means more work for me and the Cython project, and its users. Code that Cython users generate and release on their side with a release version of Cython will then be broken, and sometimes even more broken than with an older Cython release. But Victor is right, OTOH, that the longer we wait with adapting Cython, the longer users have to wait with testing their code in upcoming CPython versions, and the higher the chance of post-beta and post-rc rollbacks and changes in CPython. I don't have the capacity to follow all relevant changes in CPython, incompatible or not. Even a Cython CI breakage of the CPython-dev job doesn't always mean that there is something to do on our side and is therefore silenced to avoid breakage of our own project workflows, and to be looked at irregularly.
Additionally, since Cython is a crucial part of the Python ecosystem, breakage of Cython by CPython sometimes stalls the build pipelines of CI images, which means that new CPython dev versions don't reach the CI servers for a while, during which the breakage will go even more unnoticed. I think you should generally appreciate Cython (and the few other C-API abstraction tools) as an opportunity to get a large number of extensions adapted to CPython's now faster development all at once. The quicker these tools adapt, the quicker you can get user feedback on your own changes, and the more time you have to validate and refine them during the alpha and beta cycles. You can even see the adaptation as a way to validate your own changes in the real world. It's cool to write new code, but difficult to find out whether it behaves the way you want for the intended audience. So – be part of your own audience. Stefan ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LJDI74V4IOHPCMQUEGH6VIQWHLM3MADG/ Code of Conduct: http://python.org/psf/codeofconduct/
[lxml] Re: Restricting third party access for lxml github org?
Hi Martijn! Martijn Faassen schrieb am 25.01.22 um 11:11: Hey lxmlers, I recently found out that older organizations by default grant third party access to any github OAuth application that a user has enabled. This means that if any of such applications is compromised, this organization is open for attack. I therefore would recommend we go amend that here: https://github.com/organizations/lxml/settings/oauth_application_policy I don't think it has huge consequences as you can selectively enable those applications you trust after that, but I figured people using this org should be aware before it's enabled. Good call. I enabled that setting. If anything stops working unexpectedly, that was me. :) Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
Re: [Cython] question on submitting a possibly massive bug report
website.reader via cython-devel schrieb am 25.01.22 um 01:09: I am not familiar with Cython, but have spent a few weeks looking at compiler warnings posted when the mathematical package called "sage v9.4" is compiled, which takes several hours to build, since hundreds of code units are involved in this massive build project. I logged 341 errors during the cythonizing part of the compile run, and found 110 code units (C packages) which I was able to fix so that the recompile would have no warnings. The warnings were legitimate. There are 4 categories of these warnings. 1. Using an uninitialized variable with an unknown value 2. Comparing signed and unsigned variables 3. Discarding a const specifier to a variable upon use elsewhere in the code 4. Coercing a pointer to a variable of the wrong type (or vice versa) I did speak to one knowledgeable person about this, but my question is this: a) do I submit 341 bug reports covering all the warnings? b) since 110 code units were affected do I file 110 bug reports for each code unit? c) do I submit just one bug report for each of the 4 categories above, thus just 4 bug reports? d) do I just list all the warning messages obtained from the massive build run so everyone can get some idea of the problems being faced? I did look at the C code and the pyx code generating it and definitely Cython is the origin of these issues. Since I am NOT yet familiar with Cython from scratch, at the moment I am at a loss to write little tiny programs illustrating the problem. Cython is a code generator, so there probably are only a few places where a larger bunch of issues originate from. You already grouped them by type (1-4), and those likely belong to one cause (or a few related causes). Just open one issue for each of the four. Then please list a few source code examples in each, together with the C code that Cython generated for them, and the warning that the C compiler gave you.
If we later find that not all warnings can be resolved this way, we'll see what we can do about the rest. Please make sure to provide the Cython version that you are using. The latest release is 3.0.0a10 (and the main development goes there), although there is a legacy stable version series 0.29.x that most projects are still using and where we will continue to fix bugs for another while. But new reports should best target 3.0 in order to avoid chasing zombies. Thanks, Stefan ___ cython-devel mailing list cython-devel@python.org https://mail.python.org/mailman/listinfo/cython-devel
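Bucketing hundreds of warnings into the four categories is mechanical. A small sketch, assuming typical gcc warning wording (the regexes are approximations, not an exhaustive classifier):

```python
import re

# One (name, pattern) pair per warning category from the report above.
# The regex wording is an assumption based on common gcc messages.
CATEGORIES = [
    ("uninitialized", re.compile(r"used uninitialized")),
    ("sign-compare", re.compile(r"different signedness|sign-compare")),
    ("discarded-const", re.compile(r"discards .const. qualifier")),
    ("pointer-type", re.compile(r"incompatible pointer type")),
]

def categorize(warning: str) -> str:
    """Map a single C compiler warning line to one of the four buckets."""
    for name, pattern in CATEGORIES:
        if pattern.search(warning):
            return name
    return "other"
```

Running something like this over the full build log would give one example list per category, ready to paste into the four issues.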
[issue45569] Drop support for 15-bit PyLong digits?
Stefan Behnel added the comment: Cython should be happy with whatever CPython uses (as long as CPython's header files agree with CPython's build ;-) ). I saw the RasPi benchmarks on the ML. That would have been my suggested trial platform as well. https://mail.python.org/archives/list/python-...@python.org/message/5RJGI6THWCDYTTEPXMWXU7CK66RQUTD4/ The results look ok. Maybe the slowdown for pickling is really the increased data size of integers. And it's visible that some compute-heavy benchmarks like pyaes did get a little slower. I doubt that they represent a real use case on such a platform, though. Doing any kind of number crunching on a RasPi without NumPy would appear like a rather strange adventure. That said, if we decide to keep 15-bit digits in the end, I wonder if "SIZEOF_VOID_P" is the right decision point. It seems more of a "has reasonably fast 64-bit multiply or not" kind of decision – however that translates into code. I'm sure there are 32-bit platforms that would actually benefit from 30-bit digits today. If we find a platform that would be fine with 30-bits but lacks a fast 64-bit multiply, then we could still try to add a platform specific value size check for smaller numbers. Since those are the common case, branch prediction might help us more often than not. But then, I wonder how much complexity this is even worth, given that the goal is to reduce the complexity. Platform maintainers can still decide to configure the digit size externally for the time being, if it makes a difference for them. Maybe switching off 15-bits by default is just good enough for the next couple of years to come. :) -- ___ Python tracker <https://bugs.python.org/issue45569> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
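The data-size effect behind the pickling slowdown is easy to quantify: 30-bit digits roughly halve the number of digits a given int occupies (the running interpreter's own setting is visible as sys.int_info.bits_per_digit). A small sketch of the digit count at either size:

```python
def ndigits(value: int, digit_bits: int) -> int:
    """How many PyLong digits a non-negative int needs at the given digit size."""
    n = 0
    while value:
        value >>= digit_bits
        n += 1
    return max(n, 1)  # zero still occupies one digit

# e.g. a 61-bit value: 5 digits at 15 bits/digit, but only 3 at 30 bits/digit
```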
Re: [xml] Resuming maintenance
Nick Wellnhofer via xml schrieb am 10.01.22 um 15:20: Thanks to a donation from Google, I'm able to resume maintenance of libxml2 (and libxslt) for the remainder of 2022. I'm very happy to read this, Nick. All the best for 2022. Stefan ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[lxml] Re: question about a bug when lxml runs in a conda environment
Martin Mueller schrieb am 31.12.21 um 18:06: I have used lxml extensively in a PyCharm setup that calls on a conda environment. Lately I encountered an odd error. The correct output of a marylamb.py script is one element per lyric line, each in the TEI namespace "http://www.tei-c.org/ns/1.0": Mary had a little lamb, / Its fleece was white as snow, yeah. / Everywhere the child went, / The little lamb was sure to go, yeah. / He followed her to school one day, / And broke the teacher's rule. / What a time did they have, / That day at school. / Tisket, tasket, / A green and yellow basket. / Sent a letter to my baby, / On my way I passed it. In the buggy output the script runs amok and prints the current line plus the rest of the text. I print it out at the end of this memo. The PyCharm folks were able to identify the conda environment as the likely culprit. If I run the script outside of it, the problem doesn't happen. The problem seems to be limited to lxml running in a conda environment, because scripts that don't use lxml are not plagued by that bug. It's most likely an issue with the libxml2 version. You probably have 2.9.12 installed in your conda env. If you go back to 2.9.10, then it would probably work: conda install libxml2=2.9.10 You can find the version that lxml uses with """ from lxml import etree print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION)) print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION)) print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION)) """ The "LIBXML_VERSION" is what is currently used.
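Building on that, one can also flag a mismatch between the libxml2 that lxml was compiled against and the one loaded at runtime, which is a common source of such surprises in mixed environments (a hedged sketch; the `try/except` only guards against lxml being absent):

```python
try:
    from lxml import etree
    # LIBXML_VERSION is the runtime library, LIBXML_COMPILED_VERSION is
    # the one lxml was built against; a mismatch can change behaviour.
    if etree.LIBXML_VERSION != etree.LIBXML_COMPILED_VERSION:
        print("libxml2 mismatch: runtime %s vs. compile-time %s"
              % (etree.LIBXML_VERSION, etree.LIBXML_COMPILED_VERSION))
    else:
        print("libxml2 versions match:", etree.LIBXML_VERSION)
except ImportError:
    print("lxml is not installed in this environment")
```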
Stefan ___ lxml - The Python XML Toolkit mailing list -- lxml@python.org To unsubscribe send an email to lxml-le...@python.org https://mail.python.org/mailman3/lists/lxml.python.org/ Member address: arch...@mail-archive.com
[issue44394] [security] CVE-2013-0340 "Billion Laughs" fixed in Expat >=2.4.0: Update vendored copy to expat 2.4.1
Stefan Behnel added the comment: I'd like to ask for clarification regarding issue 45321, which adds the missing error constants to the `expat` module. I consider those new features, and it seems inappropriate to add new module constants in the middle of a release series. However, in this ticket here, the libexpat version was updated all the way back to Py3.6 to solve a security issue. Should we also backport the error constants then? -- nosy: +scoder ___ Python tracker <https://bugs.python.org/issue44394> ___
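For reference, the constants in question live in `xml.parsers.expat.errors`: each `XML_ERROR_*` name is bound to libexpat's human-readable message string, and the `codes` dictionary maps those strings back to the numeric error codes (a minimal sketch using a constant that exists in all supported versions):

```python
from xml.parsers.expat import errors

# Each XML_ERROR_* constant holds the message string ("out of memory" here)...
print(errors.XML_ERROR_NO_MEMORY)
# ...and errors.codes maps the message back to the numeric expat error code.
print(errors.codes[errors.XML_ERROR_NO_MEMORY])
```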
[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0
Change by Stefan Behnel: -- components: +XML resolution: -> fixed stage: patch review -> resolved status: open -> closed type: -> enhancement versions: -Python 3.10, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue45321> ___
[issue45321] Module xml.parsers.expat.errors misses error code constants of libexpat >=2.0
Stefan Behnel added the comment: New changeset e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e by Sebastian Pipping in branch 'main': bpo-45321: Add missing error codes to module `xml.parsers.expat.errors` (GH-30188) https://github.com/python/cpython/commit/e18d81569fa0564f3bc7bcfd2fce26ec91ba0a6e -- ___ Python tracker <https://bugs.python.org/issue45321> ___
[issue45711] Simplify the interpreter's (type, val, tb) exception representation
Stefan Behnel added the comment: FYI, we track the Cython side of this in https://github.com/cython/cython/issues/4500 -- ___ Python tracker <https://bugs.python.org/issue45711> ___