Re: The Cost of Dynamism (was Re: Pyhon 2.x or 3.x, which is faster?)

BartC Sat, 12 Mar 2016 05:22:43 -0800

On 12/03/2016 12:13, Marko Rauhamaa wrote:

BartC <b...@freeuk.com>:

If you're looking at fast processing of language source code (in a
thread partly about efficiency), then you cannot ignore the fact that
the vast majority of characters being processed are going to have
ASCII codes.


I don't know why you would optimize for inputting program source code.
Text in general has left ASCII behind a long time ago. Just go to
Wikipedia and click on any of the other languages.

Why, look at the *English* page on Hillary Clinton:

    Hillary Diane Rodham Clinton /ˈhɪləri daɪˈæn ˈrɒdəm ˈklɪntən/ (born
    October 26, 1947) is an American politician.
    <URL: https://en.wikipedia.org/wiki/Hillary_Clinton>

You couldn't get past the first sentence in ASCII.

I saved that page locally as a .htm file in UTF-8 encoding. I ran a modified version of my benchmark, and it appeared that 99.7% of the bytes had ASCII codes. The other 0.3% presumably were multi-byte sequences, so that the actual proportion of Unicode characters would be even less.

I then saved the Arabic version of the page, which visually, when rendered, consists of 99% Arabic script. But the .htm file was still 80% ASCII!
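
For anyone wanting to repeat this kind of measurement, here is a minimal sketch in Python. It is not the benchmark used for the figures above; the function name ascii_stats and the sample string are made up for illustration, and it assumes the input is valid UTF-8.

```python
from typing import Tuple


def ascii_stats(data: bytes) -> Tuple[float, float]:
    """Return (ASCII share of bytes, ASCII share of characters).

    Hypothetical helper: assumes `data` is valid UTF-8.
    """
    if not data:
        return 0.0, 0.0
    text = data.decode("utf-8")
    # In UTF-8, every byte below 0x80 is a complete ASCII character;
    # all bytes of a multi-byte sequence are 0x80 or above.
    ascii_bytes = sum(1 for b in data if b < 0x80)
    ascii_chars = sum(1 for ch in text if ord(ch) < 0x80)
    return ascii_bytes / len(data), ascii_chars / len(text)


if __name__ == "__main__":
    # Made-up sample; for a saved page, read it with open(path, "rb").read().
    sample = "<p>na\u00efve caf\u00e9</p>".encode("utf-8")
    byte_share, char_share = ascii_stats(sample)
    print(f"ASCII bytes: {byte_share:.1%}, ASCII characters: {char_share:.1%}")
```

Because each non-ASCII character takes two or more bytes in UTF-8, the non-ASCII share counted per character comes out lower than the share counted per byte, which is the point made above about the 0.3%.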


So what were you saying about ASCII being practically obsolete ... ?

--
Bartc
--
https://mail.python.org/mailman/listinfo/python-list
