Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset c34772013c53 by Ezio Melotti in branch '3.2':
#12266: Fix str.capitalize() to correctly uppercase/lowercase titlecased and
cased non-letter characters.
http://hg.python.org/cpython/rev/c34772013c53
New changeset
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 1ea72da11724 by Ezio Melotti in branch 'default':
#12266: merge with 3.2.
http://hg.python.org/cpython/rev/1ea72da11724
--
___
Python tracker rep...@bugs.python.org
Ezio Melotti ezio.melo...@gmail.com added the comment:
Fixed, thanks for the report!
--
resolution: duplicate - fixed
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12266
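For illustration, the kind of input #12266 concerns can be sketched as follows. This is a minimal check that assumes a current Python 3 interpreter; the sample characters are chosen here and do not come from the tracker messages.

```python
import unicodedata

# U+01C5 is a titlecased letter (category Lt); U+2160..U+2162 are Roman
# numerals, i.e. cased characters that are not letters (category Nl).
titlecased = "\u01c5"
roman = "\u2160\u2161\u2162"

print(unicodedata.category(titlecased))
print(unicodedata.category(roman[1]))

# capitalize() should lowercase everything after the first character,
# including titlecased letters and cased non-letter characters.
after_titlecased = ("a" + titlecased).capitalize()
after_roman = roman.capitalize()
print(ascii(after_titlecased))
print(ascii(after_roman))
```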
Tom Christiansen tchr...@perl.com added the comment:
Ezio Melotti rep...@bugs.python.org wrote on Mon, 15 Aug 2011 04:56:55 -:
Another thing I noticed is that (at least on wide builds) surrogate pairs are
not joined on the fly:
>>> p
'\ud800\udc00'
>>> len(p)
2
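The relationship between a non-BMP code point and its surrogate pair can be sketched with plain arithmetic. The helper names below are hypothetical and are not part of CPython; the sketch assumes Python 3.3 or later, where two escaped surrogates in a literal stay two separate code points.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def from_surrogate_pair(high, low):
    """Join a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x10000)
print(hex(high), hex(low))
print(hex(from_surrogate_pair(high, low)))

# The two surrogates written as escapes are not joined automatically.
p = "\ud800\udc00"
print(len(p))
```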
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset d3816fa1bcdf by Ezio Melotti in branch '2.7':
#12266: move the tests in test_unicode.
http://hg.python.org/cpython/rev/d3816fa1bcdf
--
___
Python tracker
Ezio Melotti ezio.melo...@gmail.com added the comment:
Fixed in http://hg.python.org/devguide/rev/c9dd231b0940
--
resolution: - fixed
stage: patch review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
STINNER Victor victor.stin...@haypocalc.com added the comment:
See also #12737.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12746
___
STINNER Victor victor.stin...@haypocalc.com added the comment:
See also #12746.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12737
___
Changes by Tom Christiansen tchr...@perl.com:
--
nosy: +tchrist
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12746
___
Marc-Andre Lemburg m...@egenix.com added the comment:
Keep in mind that we should be able to access and use lone surrogates too,
therefore:
s = '\ud800' # should be valid
len(s) # should this raise an error? (or return 0.5 ;)?
s[0] # error here too?
list(s) # here too?
p = s +
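As a point of comparison, the questions above can be tried on a current interpreter. This is a sketch assuming Python 3.3 or later, not the narrow and wide builds under discussion; it only shows how a lone surrogate behaves there.

```python
s = "\ud800"  # a lone high surrogate is accepted as a str

print(len(s))
print(ascii(s[0]))
print([ascii(ch) for ch in s])

# Strict UTF-8 encoding rejects lone surrogates ...
try:
    s.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
print(strict_ok)

# ... while the 'surrogatepass' error handler lets them through.
raw = s.encode("utf-8", "surrogatepass")
print(raw)
```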
New submission from STINNER Victor victor.stin...@haypocalc.com:
A lot of code is duplicated in unicodeobject.c to manipulate (encode/decode)
surrogates. Each function has from one to three different implementations. The
new decode_ucs4() function adds a new implementation. Attached patch
STINNER Victor victor.stin...@haypocalc.com added the comment:
We may use the following unlikely macro for IS_SURROGATE, IS_HIGH_SURROGATE and
IS_LOW_SURROGATE:
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
I suppose that we should use
Ezio Melotti ezio.melo...@gmail.com added the comment:
So the issue here is that while using combining chars, str.title() fails to
titlecase the string properly.
The algorithm implemented by str.title() [0] is quite simple: it loops through
the code units, and uppercases all the chars that
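One way to avoid restarting a word at a combining mark is to skip such marks when tracking word state. The function below is a hypothetical sketch, not the str.title() implementation and not the fix applied in CPython; it uses unicodedata.combining() and str.isalpha() as simplifying assumptions.

```python
import unicodedata


def title_skipping_marks(text):
    """Titlecase words, ignoring combining marks when tracking word state."""
    out = []
    in_word = False
    for ch in text:
        if unicodedata.combining(ch):
            # A combining mark belongs to the preceding character,
            # so it must not end the current word.
            out.append(ch)
            continue
        if ch.isalpha():
            out.append(ch.lower() if in_word else ch.title())
            in_word = True
        else:
            out.append(ch)
            in_word = False
    return "".join(out)


sample = "de\u0301ja\u0300 vu"
print(ascii(title_skipping_marks(sample)))
```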
Ezio Melotti ezio.melo...@gmail.com added the comment:
This has been proposed already in #10542 (the issue also has patches).
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12751
___
Ezio Melotti ezio.melo...@gmail.com added the comment:
If the regex module works fine here, I think it's better to leave the re module
alone and include the regex module in 3.3.
--
___
Python tracker rep...@bugs.python.org
Ezio Melotti ezio.melo...@gmail.com added the comment:
This indeed should be fixed by replacing 're' with 'regex'. So I would
suggest to focus your tests on 'regex' and report them there so that possible
bugs get fixed and tested before we include the module in the stdlib.
--
Ezio Melotti ezio.melo...@gmail.com added the comment:
As I said on #12734 and #12731, if the 'regex' module addresses this issue, we
should just wait until we include it in the stdlib.
--
___
Python tracker rep...@bugs.python.org
Ezio Melotti ezio.melo...@gmail.com added the comment:
This is actually a duplicate of #9200.
@Terry
Besides which, all I see (on Windows) in Firefox is things like
ð¼ð¯ð‘…ð¨ð‘‰ð¯ð».
Encoding problem. Firefox thinks this is some iso-8859-*. You can fix this
selecting
Ezio Melotti ezio.melo...@gmail.com added the comment:
I closed #12730 as a duplicate of this and updated the title of this issue.
--
title: str.isprintable() is always False for large code points - Make str
methods work with non-BMP chars on narrow builds
Ezio Melotti ezio.melo...@gmail.com added the comment:
See also #12751.
--
nosy: +tchrist
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10542
___
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +tchrist
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9200
___
New submission from Julian Taylor jtaylor.deb...@googlemail.com:
using unicode strings for locale.normalize gives following traceback with
python2.7:
~$ python2.7 -c 'import locale; locale.normalize(u"en_US")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
File
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
nosy: +ezio.melotti
stage: - test needed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12752
___
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 16edc5cf4a79 by Ezio Melotti in branch '3.2':
#12204: document that str.upper().isupper() might be False and add a note about
cased characters.
http://hg.python.org/cpython/rev/16edc5cf4a79
New changeset
Ezio Melotti ezio.melo...@gmail.com added the comment:
Fixed, thanks for the report!
--
resolution: - fixed
stage: commit review - committed/rejected
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12204
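The documented caveat is easy to reproduce with strings that contain no cased characters. A minimal sketch with sample strings chosen here:

```python
# isupper() needs at least one cased character, so upper() of a string
# without any cased characters is still not "upper".
samples = ["abc", "abc123", "123", ""]
results = {s: s.upper().isupper() for s in samples}

for s, value in results.items():
    print(repr(s), value)
```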
Matthew Barnett pyt...@mrabarnett.plus.com added the comment:
For what it's worth, I've had an idea about string storage, roughly based on how
*nix stores data on disk.
If a string is small, point to a block of codepoints.
If a string is medium-sized, point to a block of pointers to codepoint
Julian Taylor jtaylor.deb...@googlemail.com added the comment:
this is a regression introduced by fixing http://bugs.python.org/issue1813
This breaks some user code, e.g. wx.Locale.GetCanonicalName returns unicode.
Example bugs:
Marc-Andre Lemburg m...@egenix.com added the comment:
Julian Taylor wrote:
New submission from Julian Taylor jtaylor.deb...@googlemail.com:
using unicode strings for locale.normalize gives following traceback with
python2.7:
~$ python2.7 -c 'import locale; locale.normalize(u"en_US")'
Raymond Hettinger raymond.hettin...@gmail.com added the comment:
Are you sure this should have been backported? Are there any apps that may be
working now but won't be after the next point release?
--
nosy: +rhettinger
___
Python tracker
Antoine Pitrou pit...@free.fr added the comment:
HIGH_SURROGATE and LOW_SURROGATE require that their ordinal argument
has been preprocessed to fit in [0; 0x]. I added this requirement in
the comment of these macros.
The macros should preprocess the argument themselves. It will make the
Ezio Melotti ezio.melo...@gmail.com added the comment:
This is only a doc patch, maybe you are confusing this issue with #12266?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12204
___
Raymond Hettinger raymond.hettin...@gmail.com added the comment:
Right. I was looking at the other patches that went in in the last 24 hours.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12204
Ezio Melotti ezio.melo...@gmail.com added the comment:
It's unlikely that #12266 might break apps. The behavior changed only for
fairly unusual characters, and the old behavior was clearly wrong.
FWIW the str.capitalize() implementation of PyPy doesn't have the bug, and
after the fix both
R. David Murray rdmur...@bitdance.com added the comment:
In what way does 'replace' not satisfy your need to set the tzinfo?
As for utcnow, we can't change what it returns for backward compatibility
reasons, but you can get a non-naive utc datetime by doing
datetime.now(timezone.utc). (I
Daniel O'Connor dar...@dons.net.au added the comment:
On 15/08/2011, at 23:39, R. David Murray wrote:
R. David Murray rdmur...@bitdance.com added the comment:
In what way does 'replace' not satisfy your need to set the tzinfo?
Ahh that would work, although it is pretty clumsy since you have
R. David Murray rdmur...@bitdance.com added the comment:
Ah. Well, pre-3.2 datetime itself did not generate *any* non-naive datetimes.
Nor do you need to specify everything for replace. dt.replace(tzinfo=tz)
should work just fine.
--
resolution: - invalid
stage: -
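The two suggestions, replace(tzinfo=...) and datetime.now(timezone.utc), can be sketched as below. This assumes Python 3.2 or later, where datetime.timezone exists; the date and the fixed offset are arbitrary values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2011, 8, 15, 23, 39)

# Only tzinfo needs to be passed to replace(); other fields are kept.
aware_utc = naive.replace(tzinfo=timezone.utc)

# Hypothetical fixed offset, used here only as an example of another tzinfo.
tz = timezone(timedelta(hours=9, minutes=30))
aware_local = naive.replace(tzinfo=tz)

# An aware "now" in UTC, as opposed to the naive value from utcnow().
now_utc = datetime.now(timezone.utc)

print(naive.tzinfo, aware_utc.tzinfo, aware_local.utcoffset())
print(now_utc.utcoffset())
```

Note that replace() only attaches the tzinfo and does not shift the clock fields.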
Ezio Melotti ezio.melo...@gmail.com added the comment:
Here are some benchmarks:
Commands:
# half of the bytes are invalid
./python -m timeit -s 'b = bytes(range(256)); b_dec = b.decode' 'b_dec("utf-8", "surrogateescape")'
./python -m timeit -s 'b = bytes(range(256)); b_dec = b.decode'
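What the benchmarked call does to its input can be sketched without timing anything. The snippet below decodes the same 256-byte test string with the surrogateescape handler and checks that it round-trips; the variable names are chosen here.

```python
b = bytes(range(256))

# Invalid bytes are mapped to lone surrogates U+DC80..U+DCFF
# instead of raising UnicodeDecodeError.
text = b.decode("utf-8", "surrogateescape")
escaped = [ch for ch in text if "\udc80" <= ch <= "\udcff"]

# Encoding with the same handler restores the original bytes.
round_trip = text.encode("utf-8", "surrogateescape")

print(len(text), len(escaped), round_trip == b)
```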
Martin v. Löwis mar...@v.loewis.de added the comment:
A PEP 393 draft implementation is available at
https://bitbucket.org/t0rsten/pep-393/ (branch pep-393); if this gets into 3.3,
this issue will be outdated: there won't be narrow builds of Python anymore
(nor will there be wide builds).
Ezio Melotti ezio.melo...@gmail.com added the comment:
That's really good news.
Some Unicode issues can still be fixed on 2.7 and 3.2 though.
FWIW I was planning to look at this and #9200 in the following days and see if
I can fix them.
--
___
Terry J. Reedy tjre...@udel.edu added the comment:
My Firefox is already set at utf-8. More likely a font limitation. I will look
again after installing one of the fonts Tom suggested.
The pair of boxes on IDLE are for the surrogate pairs. Perhaps tk does not even
try to display a single
New submission from Tom Christiansen tchr...@perl.com:
Unicode character names share a common namespace with formal aliases and with
named sequences, but Python recognizes only the original name. That means not
everything in the namespace is accessible from Python. (If this is construed
to
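The name lookup in question goes through the unicodedata module. The sketch below shows the round trip for an ordinary character name and probes one alias and one named sequence; whether those two resolve depends on the Python version, so the helper (a hypothetical name) returns None rather than assuming either outcome.

```python
import unicodedata

ch = "\u00e9"
name = unicodedata.name(ch)
print(name)
print(unicodedata.lookup(name) == ch)


def lookup_or_none(char_name):
    """Return the looked-up string, or None if the name is not recognized."""
    try:
        return unicodedata.lookup(char_name)
    except KeyError:
        return None


# Sample names chosen for illustration: a formal alias and a named sequence.
alias = lookup_or_none("LATIN CAPITAL LETTER GHA")
sequence = lookup_or_none("LATIN SMALL LETTER R WITH TILDE")
print(ascii(alias), ascii(sequence))
```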
Ezio Melotti ezio.melo...@gmail.com added the comment:
My Firefox is already set at utf-8.
Every page can specify the encoding it uses (in HTTP headers, meta tag and/or
xml prologue). If none of these are specified, afaik Firefox tries to detect
the encoding, and sometimes fails. What
Tom Christiansen tchr...@perl.com added the comment:
Terry J. Reedy tjre...@udel.edu added the comment:
My Firefox is already set at utf-8. More likely a font limitation. I
will look again after installing one of the fonts Tom suggested.
Symbola is best for exotic glyphs, especially astral
Changes by Ezio Melotti ezio.melo...@gmail.com:
--
components: +Unicode
nosy: +ezio.melotti
stage: - test needed
versions: -Python 2.7, Python 3.1, Python 3.2
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12753
Terry J. Reedy tjre...@udel.edu added the comment:
You are right, FF switched on me without notice. Bad FF.
Thank you! What I now see makes much more sense.
[ мЯхШщЯл, мЯхШщЯл, ДЯхШщЯл, ДЇНЀСЇГ ],
and I now know to check on other pages (although Tom's Unicode talk slides
still have boxes even
Tom Christiansen tchr...@perl.com added the comment:
Terry J. Reedy tjre...@udel.edu added the comment:
You are right, FF switched on me without notice. Bad FF. Thank you! What
I now see makes much more sense.
[ мЯхШщЯл, мЯхШщЯл, ДЯхШщЯл, ДЇНЀСЇГ ],
and I now know to check on other
Tom Christiansen tchr...@perl.com added the comment:
Sorry I didn't include a test case. Hope this makes up for it. If not, please
tell me how to write better test cases. :(
Yeah ok, so I'm a bit persnickety or even unorthodox about my vertical
alignment, but it really helps to make what is
Tom Christiansen tchr...@perl.com added the comment:
Oh whoops, that was the long ticket. Shall I reupload to the right number?
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12734
___
Terry J. Reedy tjre...@udel.edu added the comment:
Adding Symbola filled in the symbols and emoticons lines.
The gothic chars are still missing even with Alfios.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12730
Tom Christiansen tchr...@perl.com added the comment:
Terry J. Reedy tjre...@udel.edu added the comment:
Adding Symbola filled in the symbols and emoticons lines.
The gothic chars are still missing even with Alfios.
That's too bad, as the Gothic paternoster is kinda cute. :)
Hm, I wonder where
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12746
___
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue9200
___
Tom Christiansen tchr...@perl.com added the comment:
Here’s the right test file for the right ticket.
--
Added file: http://bugs.python.org/file22903/nametests.py
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12753
Changes by Tom Christiansen tchr...@perl.com:
Removed file: http://bugs.python.org/file22902/nametests.py
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12734
___
Barry A. Warsaw ba...@python.org added the comment:
A cheap way of fixing this would be to test for str-ness of localename and if
it's a unicode, just localename.encode('ascii')
Or is that completely insane?
--
nosy: +barry
___
Python tracker
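The coercion idea can be sketched as a small wrapper. The version below is a Python 3 analogue written for illustration, not the 2.7 patch: Python 3 has no separate unicode type, so the hypothetical helper decodes bytes to str instead of encoding unicode to str before calling locale.normalize().

```python
import locale


def normalize_any(localename):
    """Normalize a locale name given as either str or ASCII bytes."""
    if isinstance(localename, bytes):
        # Coerce to the text type that locale.normalize() works with.
        localename = localename.decode("ascii")
    return locale.normalize(localename)


print(normalize_any("en_US"))
print(normalize_any(b"en_US"))
```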
Barry A. Warsaw ba...@python.org added the comment:
For example:
diff -r fb49394f75ed Lib/locale.py
--- a/Lib/locale.py Mon Aug 15 14:24:15 2011 +0300
+++ b/Lib/locale.py Mon Aug 15 16:47:23 2011 -0400
@@ -355,6 +355,8 @@
# Normalize the locale name and extract the
Changes by Barry A. Warsaw ba...@python.org:
--
keywords: +patch
Added file: http://bugs.python.org/file22904/issue12752.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12752
___
Changes by STINNER Victor victor.stin...@haypocalc.com:
--
nosy: +belopolsky
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12751
___
STINNER Victor victor.stin...@haypocalc.com added the comment:
This has been proposed already in #10542 (the issue also has patches).
The two issues are different: this issue is only a refactoring, whereas #10542
adds a new feature (function/macro: Py_UNICODE_NEXT).
--
Antoine Pitrou pit...@free.fr added the comment:
The proposed resolution looks ok. Another possibility is simply to use .lower()
if the string is a unicode string, since that will bypass the C locale.
--
nosy: +pitrou
stage: test needed - patch review
Daniel O'Connor dar...@dons.net.au added the comment:
On 16/08/2011, at 1:06, R. David Murray wrote:
R. David Murray rdmur...@bitdance.com added the comment:
Ah. Well, pre-3.2 datetime itself did not generate *any* non-naive datetimes.
Nor do you need to specify everything for replace.
New submission from Raymond Hettinger raymond.hettin...@gmail.com:
While keeping the MT generator as the default, add new alternative random
number generators as drop-in replacements. Since MT was first introduced, PRNG
technology has continued to advance.
I'm opening this feature request to
Changes by Barry A. Warsaw ba...@python.org:
--
assignee: - barry
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12752
___
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset 0d64fe6c737f by Barry Warsaw in branch '2.7':
The simplest possible fix for the regression in bug 12752 by encoding unicodes
http://hg.python.org/cpython/rev/0d64fe6c737f
--
nosy: +python-dev
Changes by Barry A. Warsaw ba...@python.org:
--
resolution: - fixed
status: open - closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12752
___
Ned Deily n...@acm.org added the comment:
Interesting, I didn't know the Dvorak - Qwerty ⌘ input method existed. In
just some casual experimentation with it, it seems pretty clear that the input
method is not being consistently followed by Tk and there seem to be
differences between Tk 8.4
Sturla Molden stu...@molden.no added the comment:
George Marsaglia's latest random number generator KISS4691 is worth
considering, though I am not sure the performance is that different from
MT19937.
Here is a link to Marsaglia's post on comp.lang.c. Marsaglia passed away
shortly after
Sturla Molden stu...@molden.no added the comment:
I'm posting the code for comparison of KISS4691 and MT19937. I do realize
KISS4691 might not be sufficiently different from MT19937 in characteristics
for Raymond Hettinger to consider it. But at least here it is for reference
should it be of
Sturla Molden stu...@molden.no added the comment:
Another (bug fix) post by Marsaglia on KISS4691:
http://www.phwinfo.com/forum/comp-lang-c/460292-ensuring-long-period-kiss4691-rng.html
--
___
Python tracker rep...@bugs.python.org
Changes by Sturla Molden stu...@molden.no:
Removed file: http://bugs.python.org/file22905/prngtest.zip
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___
Changes by Sturla Molden stu...@molden.no:
Added file: http://bugs.python.org/file22906/prngtest.zip
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___
Meador Inge mead...@gmail.com added the comment:
On Sun, Aug 14, 2011 at 1:03 PM, Stefan Krah rep...@bugs.python.org wrote:
Stefan Krah stefan-use...@bytereef.org added the comment:
I like random tests in the stdlib, otherwise the same thing gets tested
over and over again. `make
Eli Bendersky eli...@gmail.com added the comment:
Terry, I'm not 100% sure about what you mean by Python wrapper objects ...
visible from Python, but I think I'll disagree.
There's a big difference between C functions in general and type methods
this document speaks of. Let's leave list aside
Sturla Molden stu...@molden.no added the comment:
Further suggestions to improve the random module:
** Object-oriented PRNG: Let it be an object which stores the random state
internally, so we can create independent PRNG objects. I.e. not just one global
generator.
** Generator for
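For reference, the random module already exposes a random.Random class whose instances carry their own state. The sketch below shows two such generators staying independent of each other and of the module-level generator; the seed is arbitrary.

```python
import random

gen_a = random.Random(12345)
gen_b = random.Random(12345)

seq_a = [gen_a.random() for _ in range(3)]

# Drawing from the module-level generator does not touch gen_b's state.
random.random()

seq_b = [gen_b.random() for _ in range(3)]
print(seq_a == seq_b)
```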
Raymond Hettinger raymond.hettin...@gmail.com added the comment:
Please focus your thoughts.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12754
___
Terry J. Reedy tjre...@udel.edu added the comment:
the type object determines which (C) functions get called when, for instance,
an attribute gets looked up on an object or it is multiplied by another object.
These C functions are called “type methods”
These C functions are any of the C
Eli Bendersky eli...@gmail.com added the comment:
[].append is a Python-level method object that wraps a C function.
What makes you think that? There's no Python implementation of .append that I
know of. Neither is there a Python implementation of the Noddy.name method that
is discussed in
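The objects being argued about can be inspected directly. This sketch assumes CPython; the type names it prints are implementation details and may differ on other implementations.

```python
bound = [].append      # attribute looked up on an instance
unbound = list.append  # attribute looked up on the type

print(type(bound).__name__)
print(type(unbound).__name__)

# The bound object remembers the list it was looked up on.
print(bound.__self__)
```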