Re: IDLE "Codepage" Switching?

2023-01-17 Thread Thomas Passin

On 1/17/2023 8:46 PM, rbowman wrote:

> On Tue, 17 Jan 2023 12:47:29 +, Stephen Tucker wrote:
>
>> 2. Does the IDLE in Python 3.x behave the same way?
>
> fwiw
>
> Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
> Type "help", "copyright", "credits" or "license()" for more information.
> str = ""
> for c in range(157, 169):
>     str += chr(c) + ""
>
> print(str)
> žŸ ¡¢£¤¥¦§¨
> str = ""
> for c in range(140, 169):
>     str += chr(c) + " "
>
> print(str)
> Œ  Ž   ‘ ’ “ ” • – — ˜ ™ š › œ  ž Ÿ   ¡ ¢ £ ¤ ¥
> ¦ § ¨
>
> I don't know how this will appear since Pan is showing the icon for a
> character not in its set.  However, even with more undefined characters
> the printable ones do not change. I get the same output running Python3
> from the terminal so it's not an IDLE thing.


I'm not sure what explanation is being asked for here.  Let's take 
Python3, so we can be sure that the strings are in unicode.  The font 
being used by the console isn't mentioned, but there's no reason it 
should have glyphs for any random unicode character.  In my case, I see 
the same missing and printable characters as in the previous post 
(above).  The font is Source Code Pro Medium.
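
One way to tell a gap in the font from a character that has no glyph at all is to ask Python what kind of character each code point is. The sketch below uses the standard unicodedata module on the same range as the quoted session; the helper name describe is made up for illustration. Code points 128 to 159 are C1 control characters, so no font is expected to draw them.

```python
import unicodedata

def describe(code_point):
    """Return (code point in hex, Unicode general category, name or placeholder)."""
    ch = chr(code_point)
    # Control characters have no name in the Unicode database, so supply a default.
    name = unicodedata.name(ch, "<no name>")
    return f"U+{code_point:04X}", unicodedata.category(ch), name

for cp in range(140, 169):
    print(*describe(cp))
```

Category "Cc" marks a control character; anything else in this range is a character that a font may or may not cover.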


Changing the console's code page won't magically provide the missing glyphs.

I wrote these characters to a file using utf-8 encoding and opened it in 
an editor that recognized the content as utf-8 (EditPlus).  It displayed 
the same characters but had fewer leading spaces (i.e., missing glyphs), 
and did not show any default "missing-character" glyphs.  The editor is 
using the Cousine font.
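
For anyone who wants to repeat that test, here is a minimal sketch of writing the characters out with an explicit encoding. The file name is hypothetical and the range is the one from the quoted session; this is not necessarily the exact script that was used.

```python
import os
import tempfile

# Same code points as in the session quoted above.
text = " ".join(chr(c) for c in range(140, 169))

# Hypothetical file name; any path works.
path = os.path.join(tempfile.gettempdir(), "codepoints_utf8.txt")

# Spell out the encoding so the result does not depend on the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading it back with the same encoding returns the identical string.
with open(path, encoding="utf-8") as f:
    assert f.read() == text

# Code points from 128 up to U+07FF take two bytes each in UTF-8.
print(len(text), "characters,", os.path.getsize(path), "bytes")
```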


The second factor that could be in play is what the default character 
encoding is, which is set by Windows and could be different in different 
places (locales).  I don't recall just now how Python3 handles this. 
Since Python2 strings are not unicode unless specified, and Python2 
probably handles the locale/default encoding differently from Python3, 
it would not be a surprise if the two give different results.
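
A quick way to see what a given Python3 installation is actually using is to print the relevant settings. This is only a sketch; the values depend on the platform, the locale, and whether UTF-8 mode is enabled.

```python
import locale
import sys

# Encoding Python falls back to for open() when none is given
# (locale-dependent unless UTF-8 mode is enabled).
print("preferred encoding: ", locale.getpreferredencoding(False))

# Encoding used for text written to standard output.
print("stdout encoding:    ", sys.stdout.encoding)

# Encoding used for file names.
print("filesystem encoding:", sys.getfilesystemencoding())
```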


If you print such a Python2 string, the non-ASCII characters (those with 
ord(ch) > 127) get their glyphs from the Windows code page table, which 
will be different from what Python3 will display.
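
The difference can be imitated in Python3 by decoding the same byte values two ways. This sketch assumes the code page is Windows-1252, which is only a guess since the thread never says which one is active; Latin-1 stands in for what chr() does in Python3.

```python
# The byte values from the original question, 157 through 168.
raw = bytes(range(157, 169))

# Latin-1 maps every byte to the code point with the same number,
# which matches what chr() gives in Python 3.
as_latin1 = raw.decode("latin-1")

# Windows-1252 assigns printable characters to most of 128-159;
# byte 157 is unassigned there, so errors="replace" substitutes U+FFFD.
as_cp1252 = raw.decode("cp1252", errors="replace")

for byte, a, b in zip(raw, as_latin1, as_cp1252):
    print(byte, f"U+{ord(a):04X}", f"U+{ord(b):04X}")
```

From byte 160 upward the two columns agree; below that they do not.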


Python3 uses Windows Unicode API functions, and isn't subject to the 
same limitations as Python2 was - Python2 had to go through the Windows 
code page apparatus and didn't use the Unicode API (see PEP 528, 
https://peps.python.org/pep-0528/).


IDLE sets up its own window itself, and probably uses a different font 
from the default Windows console, so there could be some differences 
there too, especially as to whether missing glyphs show a visible symbol 
or not.


Code Page 65001 was often claimed to be for utf-8.  It's not really 
correct in general, but it's OK for many utf-8 characters.  But in 
Python2, the codecs module does not know about code page 65001 - unless 
you apply a simple patch - so if you try to set the console to cp65001, 
you cannot get anything printed.  You get an exception raised instead.
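
Whether a given interpreter knows a code page name can be checked without touching the console. The helper below is a hypothetical sketch for Python3; what it reports for cp65001 depends on the Python version and platform, so no output is shown.

```python
import codecs

def codec_name(label):
    """Return the canonical codec name for label, or None if Python has no such codec."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None

for label in ("utf-8", "cp1252", "cp65001"):
    print(label, "->", codec_name(label))
```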


Yes, it's all confusing, and especially with Python2.


--
https://mail.python.org/mailman/listinfo/python-list


[Python-announce] python-build-standalone 3.11 distributions

2023-01-17 Thread Gregory Szorc
python-build-standalone
(https://github.com/indygreg/python-build-standalone) is a project
that produces standalone, highly portable builds of CPython that are
designed to run on as many machines as possible with no additional
dependencies. I created the project for PyOxidizer
(https://github.com/indygreg/PyOxidizer) but it is now being used for
various other tools that want to easily "install" a working Python
interpreter, such as Bazel's rules_python
(https://github.com/bazelbuild/rules_python) and various applications
embedding Python.

Read more at https://gregoryszorc.com/docs/python-build-standalone/main/

I'm pleased to announce the latest 20230116 release
(https://github.com/indygreg/python-build-standalone/releases/tag/20230116)
of the project. This release is the first providing Python 3.11
distributions, joining existing support for Python 3.8, 3.9, and 3.10.

Distributions are available for Windows, Linux, and macOS covering
multiple machine architectures and levels of optimization.
___
Python-announce-list mailing list -- python-announce-list@python.org
To unsubscribe send an email to python-announce-list-le...@python.org
https://mail.python.org/mailman3/lists/python-announce-list.python.org/


Re: IDLE "Codepage" Switching?

2023-01-17 Thread rbowman
On Tue, 17 Jan 2023 12:47:29 +, Stephen Tucker wrote:

> 2. Does the IDLE in Python 3.x behave the same way?

fwiw

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license()" for more information.
str = ""
for c in range(157, 169):
    str += chr(c) + ""


print(str)
žŸ ¡¢£¤¥¦§¨
str = ""
for c in range(140, 169):
    str += chr(c) + " "


print(str)
Œ  Ž   ‘ ’ “ ” • – — ˜ ™ š › œ  ž Ÿ   ¡ ¢ £ ¤ ¥ 
¦ § ¨ 


I don't know how this will appear since Pan is showing the icon for a 
character not in its set.  However, even with more undefined characters 
the printable ones do not change. I get the same output running Python3 
from the terminal so it's not an IDLE thing.


Re: Fast lookup of bulky "table"

2023-01-17 Thread Dino


Thanks a lot, Edmondo. Or better... Grazie mille.

On 1/17/2023 5:42 AM, Edmondo Giovannozzi wrote:


> Sorry,
> I was just creating an array of 400x100000 elements that I fill with random
> numbers:
>
>   a = np.random.randn(400,100_000)
>
> Then I pick one element at random: it is just a stupid sort on one row, and
> then I take an element from another row, but it doesn't matter, I'm just
> taking a random element. I could have used other ways to get one, but this
> was the first that came to my mind.
>
>   ia = np.argsort(a[0,:])
>   a_elem = a[56, ia[0]]
>
> Then I find that element in the whole matrix a (of course I know where it
> is, but I want to test the speed of a linear search done at the C level):
>
> %timeit isel = a == a_elem
>
> Actually isel is a boolean array that is True where a[i,j] == a_elem and
> False where a[i,j] != a_elem. It may find more than one element but, of
> course, in our case it will find only the element that we selected at the
> beginning. So it gives the speed of a linear search plus the time needed to
> allocate the boolean array. The search is over the whole matrix of 40
> million elements, not just over one of its rows of 100k elements.
>
> On a single row (which, I should say, I have chosen to be contiguous) it is
> much faster.
>
> %timeit isel = a[56,:] == a_elem
> 26 µs ± 588 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
>
> The matrix holds double-precision numbers, that is, 8 bytes each; I haven't
> tested it on strings of characters.
>
> This was meant as an estimate of the speed that one can get by going to the
> C level.
> Of course you lose the possibility of having a relational database, you need
> to have everything in memory, etc...
>
> A package that implements tables based on numpy is pandas:
> https://pandas.pydata.org/
>
> I hope that it can be useful.






Re: Fast lookup of bulky "table"

2023-01-17 Thread Edmondo Giovannozzi
On Tuesday, 17 January 2023 at 00:18:04 UTC+1, Dino wrote:
> On 1/16/2023 1:18 PM, Edmondo Giovannozzi wrote: 
> > 
> > As a comparison with numpy. Given the following lines: 
> > 
> > import numpy as np 
> > a = np.random.randn(400,100_000) 
> > ia = np.argsort(a[0,:]) 
> > a_elem = a[56, ia[0]] 
> > 
> > I have just taken an element randomly in a numeric table of 400x100000 
> > elements. 
> > To find it with numpy: 
> > 
> > %timeit isel = a == a_elem 
> > 35.5 ms ± 2.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) 
> > 
> > And 
> > %timeit a[isel] 
> > 9.18 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 
> > 
> > As data are not ordered it is searching it one by one but at C level. 
> > Of course it depends on a lot of things...
> thank you for this. It's probably my lack of experience with Numpy, 
> but... can you explain what is going on here in more detail? 
> 
> Thank you 
> 
> Dino

Sorry, 
I was just creating an array of 400x100000 elements that I fill with random 
numbers:

  a = np.random.randn(400,100_000) 

Then I pick one element at random: it is just a stupid sort on one row, and 
then I take an element from another row, but it doesn't matter, I'm just 
taking a random element. I could have used other ways to get one, but this 
was the first that came to my mind.

 ia = np.argsort(a[0,:]) 
 a_elem = a[56, ia[0]] 

Then I find that element in the whole matrix a (of course I know where it 
is, but I want to test the speed of a linear search done at the C level):

%timeit isel = a == a_elem 

Actually isel is a boolean array that is True where a[i,j] == a_elem and 
False where a[i,j] != a_elem. It may find more than one element but, of 
course, in our case it will find only the element that we selected at the 
beginning. So it gives the speed of a linear search plus the time needed to 
allocate the boolean array. The search is over the whole matrix of 40 
million elements, not just over one of its rows of 100k elements. 
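
To make the mask concrete, here is a small sketch of the same comparison on a reduced array. The shape, the seed and the chosen indices are illustrative assumptions, not the values from the timing runs.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
a = rng.standard_normal((400, 1_000))   # smaller than the 400x100_000 original

a_elem = a[56, 123]                     # arbitrary element; indices are illustrative

isel = a == a_elem                      # boolean array, same shape as a
print(isel.shape, isel.dtype, isel.sum())

# Row and column of every match.
print(np.argwhere(isel))

# Boolean indexing pulls out the matching values themselves.
print(a[isel])
```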

On a single row (which, I should say, I have chosen to be contiguous) it is 
much faster.

%timeit isel = a[56,:] == a_elem
26 µs ± 588 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
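
%timeit is an IPython magic. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The array here is reduced to 400x10_000 to keep the run short, and best_time is a hypothetical helper, so the absolute numbers will not match the ones above.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((400, 10_000))  # reduced from 400x100_000 to keep the run short
a_elem = a[56, 0]

def best_time(func, repeat=5, number=10):
    """Best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

# Compare every element of the matrix against a_elem.
whole = best_time(lambda: a == a_elem)

# Compare only row 56, which is contiguous in memory.
one_row = best_time(lambda: a[56, :] == a_elem)

print(f"whole matrix: {whole * 1e3:.3f} ms per call")
print(f"single row:   {one_row * 1e6:.3f} µs per call")
```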

The matrix holds double-precision numbers, that is, 8 bytes each; I haven't 
tested it on strings of characters.

This was meant as an estimate of the speed that one can get by going to the 
C level. 
Of course you lose the possibility of having a relational database, you need 
to have everything in memory, etc...

A package that implements tables based on numpy is pandas: 
https://pandas.pydata.org/

I hope that it can be useful.




IDLE "Codepage" Switching?

2023-01-17 Thread Stephen Tucker
I have four questions.

1. Can anybody explain the behaviour in IDLE (Python version 2.7.10)
reported below? (It seems that the way it renders a given sequence of bytes
depends on the sequence.)

2. Does the IDLE in Python 3.x behave the same way?

3. If it does, is this as it should behave?

4. If it is, then why is it as it should behave?
==
>>> mylongstr = ""
>>> for thisCP in range (157, 169):
        mylongstr += chr (thisCP) + " "


>>> print mylongstr
ン ゙ ゚ ᅠ ᄀ ᄁ ᆪ ᄂ ᆬ ᆭ ᄃ ᄄ
>>> mylongstr = ""
>>> for thisCP in range (158, 169):
        mylongstr += chr (thisCP) + " "


>>> print mylongstr
ž Ÿ   ¡ ¢ £ ¤ ¥ ¦ § ¨
>>> mylongstr = ""
>>> for thisCP in range (157, 169):
        mylongstr += chr (thisCP) + " "


>>> print mylongstr
ン ゙ ゚ ᅠ ᄀ ᄁ ᆪ ᄂ ᆬ ᆭ ᄃ ᄄ
==

Stephen Tucker.