[lxml] Re: When is a number not a number

2023-03-03 Thread Stefan Behnel
Stefan Behnel wrote on 03.03.23 at 09:00:
Stefan Behnel wrote on 02.03.23 at 08:50:
On March 1, 2023 3:15:22 PM UTC, holger.jo...@lbbw.de wrote:
Probably a bug in _checkNumber():
https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974
Ah,

[lxml] Re: When is a number not a number

2023-03-03 Thread Stefan Behnel
Stefan Behnel wrote on 02.03.23 at 08:50:
On March 1, 2023 3:15:22 PM UTC, holger.jo...@lbbw.de wrote:
Probably a bug in _checkNumber():
https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974
Ah, yes, it might be the isdigit() check,

[lxml] Re: When is a number not a number

2023-03-02 Thread Holger.Joukl
> > ValueError: invalid literal for int() with base 10: '²²'
> >
> > Probably a bug in _checkNumber():
> > https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974
>
> str.isdigit() accepts many Unicode characters classified as digits that
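
The mismatch described in this message can be reproduced with the standard library alone. The following is a minimal sketch, not lxml code: the helper name `survey` and the sample strings are illustrative assumptions. It compares what `str.isdigit()` and `str.isdecimal()` report with what `int()` actually accepts.

```python
def survey(text):
    """Return (isdigit, isdecimal, parsed int or None) for a string.

    Hypothetical helper for illustration; not part of lxml.
    """
    try:
        value = int(text)
    except ValueError:
        # int() rejects characters that are digits but not decimal digits.
        value = None
    return text.isdigit(), text.isdecimal(), value


# ASCII digits, superscript twos (U+00B2), Arabic-Indic twos (U+0662).
for sample in ("22", "\u00b2\u00b2", "\u0662\u0662"):
    print(repr(sample), survey(sample))
```

For the superscript sample, `isdigit()` is true while `isdecimal()` is false and `int()` raises, which is consistent with the `ValueError` quoted above.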

[lxml] Re: When is a number not a number

2023-03-02 Thread Marius Gedminas
On Wed, Mar 01, 2023 at 03:15:22PM +, holger.jo...@lbbw.de wrote:
> ValueError: invalid literal for int() with base 10: '²²'
>
> Probably a bug in _checkNumber():
> https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974

str.isdigit()

[lxml] Re: When is a number not a number

2023-03-01 Thread Stefan Behnel
On March 1, 2023 3:15:22 PM UTC, holger.jo...@lbbw.de wrote:
>Probably a bug in _checkNumber():
>https://github.com/lxml/lxml/blob/d01872ccdf7e1e5e825b6c6292b43e7d27ae5fc4/src/lxml/objectify.pyx#L974

Ah, yes, it might be the isdigit() check, actually. That could be too broad. Not every digit is
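
To illustrate what a narrower check than `isdigit()` could look like, here is a hedged sketch that only accepts ASCII digits with an optional sign. The function name `looks_like_int` is hypothetical, and this is neither the actual `_checkNumber()` implementation nor the fix that went into lxml.

```python
ASCII_DIGITS = frozenset("0123456789")


def looks_like_int(text):
    """Return True only for an optionally signed run of ASCII digits.

    Illustrative assumption: surrounding whitespace is ignored and
    non-ASCII digits are deliberately rejected.
    """
    body = text.strip()
    if body[:1] in ("+", "-"):
        body = body[1:]
    # An empty body (e.g. "" or a lone sign) is not a number.
    return bool(body) and all(ch in ASCII_DIGITS for ch in body)


print(looks_like_int("22"))            # plain ASCII digits
print(looks_like_int("\u00b2\u00b2"))  # superscript twos
```

Restricting to ASCII is stricter than `int()` itself, which also parses other Unicode decimal digits; whether that trade-off suits a given use case is a design choice.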

[lxml] Re: When is a number not a number

2023-03-01 Thread Holger.Joukl
> Hi,
> Does anyone think this needs to be posted to the bug tracker?
>
> lxml seems to identify superscripts as an integer but then throws an
> exception.
>
> Thanks
>
> Alex
>
>
> from lxml import objectify
> xml = """
>
> ²²
>
> """
> doc = objectify.fromstring(xml)
>

[lxml] Re: When is a number not a number

2023-03-01 Thread Alex Hippel
Sorry, I forgot to include the version numbers:

Python : sys.version_info(major=3, minor=11, micro=1, releaselevel='final', serial=0)
lxml.etree : (4, 9, 2, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used: (1, 1, 35)
libxslt