Neil Hodgson wrote:
> Glenn Linderman:
> 
>> and perhaps other things (and
>> are there new Unicode control characters that could be used for line
>> endings?),
> 
>    Unicode includes Line Separator U+2028 and Paragraph Separator
> U+2029 but they are rarely supported and very rarely used. They are a
> pain to work with since they are 3 byte sequences in UTF-8. Visual
> Studio does support them.
> 
>    Python does not currently support these line separators. For
> example, the following reads only 2 lines rather than 3:
> 
> with open("x.txt", "wb") as f:
>     f.write("a\nb\u2029c\n".encode("utf-8"))
> # Read back as UTF-8 text; U+2029 is not treated as a line ending
> with open("x.txt", "r", encoding="utf-8") as f:
>     for n, line in enumerate(f.readlines(), 1):
>         print(n, repr(line))

Please file a bug report for this. f.readlines() (or rather
the io layer) should be using Py_UNICODE_ISLINEBREAK(ch)
for detecting line break characters.
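
To make the mismatch concrete, here is a minimal sketch comparing
the io layer with str.splitlines(), which (as far as I know) follows
the Unicode line-break definition and so gives the 3 lines one would
expect. The variable names are only for illustration, and
io.StringIO stands in for a real file so the snippet needs no disk
access.

```python
import io

text = "a\nb\u2029c\n"

# U+2029 (Paragraph Separator) occupies three bytes in UTF-8.
sep_bytes = "\u2029".encode("utf-8")

# str.splitlines() treats U+2028/U+2029 as line boundaries.
unicode_lines = text.splitlines(keepends=True)

# The io text layer only recognises \n, \r and \r\n as line endings.
io_lines = io.StringIO(text).readlines()

print(len(sep_bytes))      # 3
print(len(unicode_lines))  # 3
print(len(io_lines))       # 2
```

The difference between the last two counts is the behaviour a bug
report would need to describe.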

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 06 2009)