On Apr 2, 11:22 pm, jmfauth <wxjmfa...@gmail.com> wrote:
> On 2 avr, 18:57, rusi <rustompm...@gmail.com> wrote:
>
> > On Apr 2, 8:17 pm, Ethan Furman <et...@stoneleaf.us> wrote:
>
> > > Simmons (too many Steves!), I know you're new so don't have all the
> > > history with jmf that many of us do, but consider that the original
> > > post was about numbers, had nothing to do with characters or unicode
> > > *in any way*, and yet jmf still felt the need to bring unicode up.
>
> > Just for reference, here is the starting para of Chris' original mail
> > that started this thread.
>
> > > The Python 3 merge of int and long has effectively penalized
> > > small-number arithmetic by removing an optimization. As we've seen
> > > from PEP 393 strings (jmf aside), there can be huge benefits from
> > > having a single type with multiple representations internally. Is
> > > there value in making the int type have a machine-word optimization in
> > > the same way?
>
> > ie it mentions numbers, strings, PEP 393 *AND jmf.* So while it is
> > true that jmf has been butting in with trollish behavior into
> > completely unrelated threads with his unicode rants, that cannot be
> > said for this thread.
>
> -----
>
> That's because you did not understand the analogy, int/long <-> FSR.
>
> One another illustration,
>
> >>> def AddOne(i):
> ...     if 0 < i <= 100:
> ...         return i + 10 + 10 + 10 - 10 - 10 - 10 + 1
> ...     elif 100 < i <= 1000:
> ...         return i + 100 + 100 + 100 + 100 - 100 - 100 - 100 - 100 + 1
> ...     else:
> ...         return i + 1
> ...
>
> Do it work? yes.
> Is is "correct"? this can be discussed.
>
> Now replace i by a char, a representent of each "subset"
> of the FSR, select a method where this FST behave badly
> and take a look of what happen.
>
> >>> timeit.repeat("'a' * 1000 + 'z'")
> [0.6532032148133153, 0.6407248807756699, 0.6407264561239894]
> >>> timeit.repeat("'a' * 1000 + '9'")
> [0.6429508479509245, 0.6242782443215589, 0.6240490311410927]
>
> >>> timeit.repeat("'a' * 1000 + '€'")
> [1.095694927496563, 1.0696347279235603, 1.0687741939041082]
> >>> timeit.repeat("'a' * 1000 + 'ẞ'")
> [1.0796421281222877, 1.0348612767961853, 1.035325216876231]
> >>> timeit.repeat("'a' * 1000 + '\u2345'")
> [1.0855414137412112, 1.0694677410017164, 1.0688096392412945]
>
> >>> timeit.repeat("'œ' * 1000 + '\U00010001'")
> [1.237314015362017, 1.2226262553064657, 1.21994619397816]
> >>> timeit.repeat("'œ' * 1000 + '\U00010002'")
> [1.245773635836997, 1.2303978424029651, 1.2258257877430765]
>
> Where does it come from? Simple, the FSR breaks the
> simple rules used in all coding schemes (unicode or not).
> 1) a unique set of chars
> 2) the "same" algorithm for all chars.

Can you give me a source for this requirement?
Numbers are, after all, numbers. So should we use the same
code/algorithms/machine-instructions for floating-point and integers?

> And again that's why utf-8 is working very smoothly.

How wonderful. Here's a suggestion.
Code up UTF-8 and any of the Python string representations in C and
profile them. Please come back and tell us if UTF-8 outperforms any of
the Python representations for strings on any operation (except
straight copy).

> The "corporates" which understood this very well and
> wanted to incorporate, let say, the used characters
> of the French language had only the choice to
> create new coding schemes (eg mac-roman, cp1252).
>
> In unicode, the "latin-1" range is real plague.
>
> After years of experience, I'm still fascinated to see
> the corporates has solved this issue easily and the "free
> software" is still relying on latin-1.
> I never succeed to find an explanation.
>
> Even, the TeX folks, when they shifted to the Cork
> encoding in 199?, were aware of this and consequently
> provides special package(s).
>
> No offense, this is in my mind why "corporate software"
> will always be "corporate software" and "hobbyist software"
> will always stay at the level of "hobbyist software".
>
> A French windows user, understanding nothing in the
> coding of characters, assuming he is aware of its
> existence (!), has certainly no problem.
>
> Fascinating how it is possible to use Python to teach,
> to illustrate, to explain the coding of the characters. No?
>
> jmf

You troll with eclat and elan!
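
For anyone who wants to poke at the UTF-8 comparison without going all the way to C, here is a rough Python sketch. It is not a benchmark and not how CPython implements anything. The helper names `bytes_per_char` and `utf8_char_at` are made up for illustration, and the size estimate assumes CPython 3.3 or later, where `sys.getsizeof` reflects the PEP 393 storage width. The first helper shows the 1, 2 or 4 bytes per character behind the timing steps quoted above; the second shows that fetching the i-th character from UTF-8 bytes means scanning from the start, whereas a fixed-width representation can index directly.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> int:
    """Estimate storage per character for strings made only of `ch`.

    Assumption: CPython 3.3+ (PEP 393). Comparing two strings of the
    same kind cancels the fixed object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


def utf8_char_at(data: bytes, index: int) -> str:
    """Return the code point at position `index` in UTF-8 encoded `data`.

    UTF-8 is variable-width, so we have to walk the bytes and count
    lead bytes (anything that is not a 0b10xxxxxx continuation byte).
    """
    if index < 0:
        raise IndexError(index)
    seen = -1      # index of the most recent code point we have entered
    start = None   # byte offset where the wanted code point begins
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:          # lead byte: a new code point starts
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:      # next code point: the wanted one ended
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return data[start:].decode("utf-8")  # wanted code point was the last one
```

With these, `bytes_per_char('a')`, `bytes_per_char('€')` and `bytes_per_char('\U00010001')` can be compared directly, and `utf8_char_at(s.encode('utf-8'), i)` can be checked against plain `s[i]` for any string `s`.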