On Wed, Oct 23, 2019 at 6:04 PM David Mertz <me...@gnosis.cx> wrote:

> Is this the same code points identified by `str.isspace`?
>

I haven't checked -- so I will:

and the answer is no:

$ python weird_spaces.py
x x x x᠎x x x x x x x x x x x xx x x xx
['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
41
18
[False, True, False, True, False, True, False, False, False, True, False,
True, False, True, False, True, False, True, False, True, False, True,
False, True, False, True, False, True, False, True, False, False, False,
True, False, True, False, True, False, False, False]

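Careful reading that list, though: it covers every character, including
the 'x' fillers, which are always False. Counting only the 20 separators,
the three that didn't split (U+180E, U+200B, U+FEFF) are exactly the three
that fail .isspace, so for this set of characters the answer is actually
yes.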
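
Since the isspace list above mixes the 'x' fillers in with the separators,
it may be easier to compare the two methods one candidate at a time. Here
is a minimal sketch of that; the names CANDIDATES, splits_on and compare
are made up for illustration, and the character list is copied from
weird_spaces.py. As far as I know CPython uses the same whitespace test
for both methods, so the two columns should match, but that is worth
confirming on your own build:

```python
# Hypothetical helper names; the candidates mirror weird_spaces.py.
CANDIDATES = ("\u0020\u00A0\u1680\u180E\u2000\u2001\u2002"
              "\u2003\u2004\u2005\u2006\u2007\u2008\u2009"
              "\u200A\u200B\u202F\u205F\u3000\uFEFF")


def splits_on(ch):
    """True if str.split() with no arguments treats ch as a separator."""
    return len(f"x{ch}x".split()) == 2


def compare(chars=CANDIDATES):
    """Return (char, splits, isspace) for each candidate character."""
    return [(ch, splits_on(ch), ch.isspace()) for ch in chars]


if __name__ == "__main__":
    for ch, splits, space in compare():
        flag = "" if splits == space else "  <-- disagree"
        print(f"U+{ord(ch):04X}  split={splits!s:5}  isspace={space!s:5}{flag}")
```

Any line flagged "disagree" would be a character where the two methods
differ.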

> Thanks for doing that. I would have soon otherwise. Still, "most of them"
> isn't actually a precise answer for an uncertain string. :-)
>

nope.

But the exact set of whitespace characters could be defined somewhere, and
presumably is, though maybe not consistently.
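
For str.isspace at least, the Python docs do give a definition: a
character is whitespace if its Unicode general category is Zs, or its
bidirectional class is one of WS, B, or S. Here is a rough sketch that
checks that rule against the same candidates using the stdlib unicodedata
module. The names CANDIDATES, is_whitespace_by_docs and describe are
invented for this example, and the results depend on the Unicode database
version shipped with your Python:

```python
import unicodedata

# Same candidates as weird_spaces.py, without the "x" fillers.
CANDIDATES = ("\u0020\u00A0\u1680\u180E\u2000\u2001\u2002"
              "\u2003\u2004\u2005\u2006\u2007\u2008\u2009"
              "\u200A\u200B\u202F\u205F\u3000\uFEFF")


def is_whitespace_by_docs(ch):
    """Apply the rule from the str.isspace docs: general category Zs,
    or bidirectional class WS, B, or S."""
    return (unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in ("WS", "B", "S"))


def describe(chars=CANDIDATES):
    """Return (code point, category, bidi class, isspace) per character."""
    return [(f"U+{ord(ch):04X}",
             unicodedata.category(ch),
             unicodedata.bidirectional(ch),
             ch.isspace())
            for ch in chars]


if __name__ == "__main__":
    for code, category, bidi, space in describe():
        print(f"{code}  category={category}  bidi={bidi:3}  isspace={space}")
```

Characters whose category is something other than Zs (and whose
bidirectional class is not WS, B, or S) are the ones to expect neither
method to treat as whitespace.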

-CHB

On Wed, Oct 23, 2019, 8:57 PM Christopher Barker <python...@gmail.com>
wrote:

> On Wed, Oct 23, 2019 at 5:53 PM Andrew Barnert via Python-ideas <
> python-ideas@python.org> wrote:
>
>> > To be fair, I also don't know which of those split on str.split() with
>> > no arguments to the method either.
>>
>
> I couldn't resist -- the answer is most of them:
>
> #!/usr/bin/env python
> weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
>                 "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
>                 "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")
> print(weird_spaces)
> splitted = weird_spaces.split()
> print(splitted)
>
> print(len(weird_spaces))
> print(len(splitted))
>
> $ python weird_spaces.py
> x x x x᠎x x x x x x x x x x x xx x x xx
> ['x', 'x', 'x', 'x\u180ex', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x', 'x',
> 'x', 'x\u200bx', 'x', 'x', 'x\ufeffx']
> 41
> 18
>
> -CHB
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
>


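#!/usr/bin/env python
# weird_spaces.py: which Unicode "space-like" characters does str.split()
# treat as separators, and which does str.isspace() accept?

# 20 candidate characters, each sandwiched between "x" fillers.
weird_spaces = ("x\u0020x\u00A0x\u1680x\u180Ex\u2000x\u2001x\u2002"
                "x\u2003x\u2004x\u2005x\u2006x\u2007x\u2008x\u2009"
                "x\u200Ax\u200Bx\u202Fx\u205Fx\u3000x\uFEFFx")

print(weird_spaces)
splitted = weird_spaces.split()  # no arguments: split on runs of whitespace
print(splitted)

print(len(weird_spaces))  # 41: 21 "x" fillers + 20 candidates
print(len(splitted))      # 21 would mean every candidate split

# One result per character, fillers included (every "x" reports False).
isspace = [c.isspace() for c in weird_spaces]

print(isspace)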


_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ICM2RIS7EA3RXCRVRYTSDALFUQUEDM35/
Code of Conduct: http://python.org/psf/codeofconduct/
