[issue47000] Make encoding="locale" uses locale encoding even in UTF-8 mode is enabled.

2022-03-30 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Please see https://bugs.python.org/issue47000#msg415769 for what Victor
suggested.

In particular, the locale module uses the "no underscore" convention.
Not sure whether it's good to start using snake case now, but I'm also
not against it.

I would like to reiterate my concern with the "locale" encoding, though.

As mentioned earlier, I believe it adds too much magic. It would be better
to leave this in the hands of the applications and not try to guess
the correct encoding.

It's better to expose easy to use APIs to access the various different
settings and point users to those rather than try to do a best effort
guess... explicit is better than implicit.

After all, Mojibake potentially corrupts important data without
alerting the user, and that's not really what we should be after (e.g.
UTF-8 is valid Latin-1 in most cases and this is a real problem we often
run into in Germany with our Umlauts).
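
To illustrate the Umlaut problem, here is a minimal sketch. The sample name is made up for illustration; the point is only that decoding UTF-8 bytes as Latin-1 never raises an error, so the corruption goes unnoticed.

```python
# UTF-8 encoded text that is later decoded with the wrong (Latin-1) codec.
original = "Müller"                       # illustrative sample value
utf8_bytes = original.encode("utf-8")     # b'M\xc3\xbcller'

# Latin-1 maps every byte to a character, so this decode cannot fail;
# the two bytes of the UTF-8 "ü" simply become two unrelated characters.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the explicitly correct encoding round-trips cleanly.
restored = utf8_bytes.decode("utf-8")

print(repr(garbled), repr(restored))
```

An application which states its encoding explicitly gets either the right text or a decoding error; a guessed encoding can give silently wrong text.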

--

___
Python tracker 
<https://bugs.python.org/issue47000>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue47072] Database corruption with the shelve module

2022-03-27 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 27.03.2022 09:56, Hubert Tournier wrote:
> 
> The storage format used under Windows is completely different from the one 
> used under Unix (or *BSD).

The shelve module uses the dbm module underneath and this will pick
its storage mechanism based on what's available on the platform:

https://docs.python.org/3/library/dbm.html
https://github.com/python/cpython/blob/3.10/Lib/dbm/__init__.py

It's likely that you'll get the dbm.dumb interface on Windows.
On Linux, you typically have one of gdbm or the Berkeley DB installed.

dbm.whichdb() will tell you which type of dbm implementation your
files are likely using.
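
A small sketch of how this can be checked. The shelf name and the temporary directory are placeholders for illustration; the reported backend depends on which dbm implementations are available on the platform.

```python
import dbm
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical shelf location, only used for this demonstration.
    path = os.path.join(tmpdir, "example_shelf")

    # shelve picks whatever dbm backend is available on this platform.
    with shelve.open(path) as shelf:
        shelf["answer"] = 42

    # whichdb() returns the module name of the backend that most likely
    # wrote the file, e.g. "dbm.gnu" or "dbm.dumb".
    backend = dbm.whichdb(path)
    print(backend)
```

Running this on both systems should show whether the Windows and Unix files were written by different backends.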

More on the differences of DBM style libs:
http://www.ccl.net/cca/software/UNIX/apache/apacheRH7.0/local-copies/dbm.html

Aside: You are probably better off using SQLite with a pickle
layer to store arbitrary objects. This is much more mature than
the dbm modules.
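
A minimal sketch of such a pickle layer on top of SQLite. The class name `PickleStore` and the table layout are assumptions made for this example, not an existing API.

```python
import pickle
import sqlite3


class PickleStore:
    """Tiny key/value store: string keys, pickled Python objects as values."""

    def __init__(self, path):
        self._con = sqlite3.connect(path)
        self._con.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            "key TEXT PRIMARY KEY, value BLOB NOT NULL)"
        )
        self._con.commit()

    def __setitem__(self, key, obj):
        blob = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        # The connection context manager commits on success and rolls
        # back on error, so each write is atomic.
        with self._con:
            self._con.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
                (key, blob),
            )

    def __getitem__(self, key):
        row = self._con.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        # Only unpickle data from files you trust.
        return pickle.loads(row[0])

    def close(self):
        self._con.close()
```

The resulting database file uses the same format on all platforms, which avoids the backend differences described above.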

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue47072>
___



[issue32642] add support for path-like objects in sys.path

2022-03-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.03.2022 08:59, Nick Coghlan wrote:
> 
> The import system is already complex, so I think the hesitation about 
> allowing arbitrary Path-like objects is warranted. (For example: are 
> importlib's caching semantics really valid for *arbitrary* path-like objects? 
> An object can be path-like without being immutable)
> 
> Coercion on input (as Noam suggests) would have a much lower risk of unwanted 
> side effects, as any dynamic behaviour would be resolved at insertion time.

This is not only about the import system. Lots of Python code out there
manipulates sys.path or reads sys.path for various reasons and does not
expect Path objects as list members, since only strings and bytes
are allowed:

https://docs.python.org/3/library/sys.html#sys.path

Conversion to strings sounds like a good way to get the best out of
both worlds.

I'm just curious about how this would work. You'd likely have to create a
list subclass for use with sys.path which applies the conversion
whenever a non-string member gets added. Or perhaps add helper methods
to Path objects to safely add their value to sys.path.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue32642>
___



[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Here's a wheel which only includes the portable code (I disabled
all the special cases as you suggested).

Archive:  dist/blake3_experimental_c-0.0.1-cp310-cp310-linux_x86_64.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
   297680  2022-03-23 19:26   blake3.cpython-310-x86_64-linux-gnu.so
     3183  2022-03-23 19:26   blake3_experimental_c-0.0.1.dist-info/METADATA
      105  2022-03-23 19:26   blake3_experimental_c-0.0.1.dist-info/WHEEL
        7  2022-03-23 19:26   blake3_experimental_c-0.0.1.dist-info/top_level.txt
      451  2022-03-23 19:26   blake3_experimental_c-0.0.1.dist-info/RECORD
---------                     -------
   301426                     5 files

I didn't run any benchmarks, but it's clear that the SIMD code was
used in my initial build and this adds some 50kB to the .so file.
This is on an older Linux x64 box with Intel i7-4770k CPU.

Could be that the Rust version adds several such SIMD variants and
then branches based on the platform running the code.

In any case, the C extension is indeed very easy to build and
install with a standard compiler setup.

--

___
Python tracker 
<https://bugs.python.org/issue39298>
___



[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

With "lean" I meant: doesn't use much code and is easy to compile
and install.

I built a wheel from Jack's experimental package and it comes out to
just under 100kB on Linux x64, compared to the roughly 1.1MB the
Rust wheel needs:

Archive:  blake3_experimental_c-0.0.1-cp310-cp310-linux_x86_64.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
   348528  2022-03-23 18:38   blake3.cpython-310-x86_64-linux-gnu.so
     3183  2022-03-23 18:38   blake3_experimental_c-0.0.1.dist-info/METADATA
      105  2022-03-23 18:38   blake3_experimental_c-0.0.1.dist-info/WHEEL
        7  2022-03-23 18:38   blake3_experimental_c-0.0.1.dist-info/top_level.txt
      451  2022-03-23 18:38   blake3_experimental_c-0.0.1.dist-info/RECORD
---------                     -------
   352274                     5 files

Archive:  blake3-0.3.1-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
     3800  2022-01-13 01:26   blake3-0.3.1.dist-info/METADATA
      133  2022-01-13 01:26   blake3-0.3.1.dist-info/WHEEL
       48  2022-01-13 01:26   blake3/__init__.py
  4195392  2022-01-13 01:26   blake3/blake3.cpython-310-x86_64-linux-gnu.so
      382  2022-01-13 01:26   blake3-0.3.1.dist-info/RECORD
---------                     -------
  4199755                     5 files

I don't know why there is such a significant difference in size. Perhaps
the Rust version includes multiple variants for different CPU
optimizations ?!
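
Note that the listings show uncompressed sizes, while the wheel sizes quoted above refer to the compressed archive. A small sketch for comparing both with the zipfile module; the helper name `wheel_sizes` is made up for this example.

```python
import zipfile


def wheel_sizes(path):
    """Return (uncompressed, compressed) total byte counts of a wheel.

    Wheels are ZIP archives, so the zipfile module can read them.
    `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        entries = archive.infolist()
        uncompressed = sum(entry.file_size for entry in entries)
        compressed = sum(entry.compress_size for entry in entries)
    return uncompressed, compressed
```

Calling `wheel_sizes()` on both wheel files should reproduce the totals from the listings as the first value of each result.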

--

___
Python tracker 
<https://bugs.python.org/issue39298>
___



[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 23.03.2022 17:53, Larry Hastings wrote:
> 
> Ok, I give up.

Sorry to spoil the fun, but there's no need to throw
everything in the bin ;-)

A lean and fast blake3 C package would still be a great thing
to have on PyPI, e.g. to provide support for platforms which
Jack's blake3 Rust package doesn't cover, for example:

Raspis:
https://www.piwheels.org/project/blake3/

Android (e.g. via termux):
https://wiki.termux.com/wiki/Main_Page
https://wiki.termux.com/wiki/Python

etc.

--

___
Python tracker 
<https://bugs.python.org/issue39298>
___



[issue39298] add BLAKE3 to hashlib

2022-03-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 23.03.2022 02:12, Gregory P. Smith wrote:
> 
> I view the NIST standard hashes as important enough to attempt to guarantee 
> as present (all the SHAs and MD5) as built-in. Others should really 
> demonstrate practical application popularity to gain included battery status 
> rather than just using PyPI.

+1 on this. I also think the topic deserves a wider discussion.

IMO, Python's stdlib should only provide a basic set of hash algorithms
and not try to add every single new algorithm out there.

PyPI is a much better way to add support for new hash algorithms:
packages there can move much faster than the stdlib, provide specialized
builds for added performance and also add exotic features, which are not
always needed.

Here's the list of Python 3.10 algos on a typical Linux system:

>>> hashlib.algorithms_available
{'sha512_256', 'mdc2', 'md5-sha1', 'md4', 'ripemd160', 'shake_128', 'sha3_384',
'blake2s', 'sha3_512', 'sha3_256', 'sha256', 'sha1', 'sm3', 'sha512_224',
'whirlpool', 'sha384', 'shake_256', 'sha224', 'sha512', 'sha3_224', 'md5',
'blake2b'}

This already is more than enough. Since we're using OpenSSL in Python
anyway, exposing some of the often used algos from OpenSSL is fine,
since it doesn't add much extra bloat. The above list already goes
way beyond this, IMO.

The longer the list gets, the more confusion it causes among users,
since Python's stdlib doesn't provide any guidance on
basic questions such as "Which hash algo should I use for my
application".

Most applications today will only need these basic hash algos:

{'ripemd160', 'sha3_512', 'sha3_256', 'sha256', 'sha1', 'sha512', 'md5'}
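
For reference, a short sketch of how to tell the always-present algorithms from those which merely happen to be exposed by the linked OpenSSL build. The variable names are illustrative only.

```python
import hashlib

# Algorithms which hashlib promises on every platform.
guaranteed = hashlib.algorithms_guaranteed

# Algorithms which are only there because the local OpenSSL provides them;
# code relying on these may fail on other systems.
openssl_only = hashlib.algorithms_available - guaranteed

# Using one of the basic algorithms.
digest = hashlib.sha256(b"hello").hexdigest()

print(sorted(guaranteed))
print(sorted(openssl_only))
print(digest)
```

Portable code should stick to names from `hashlib.algorithms_guaranteed`.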

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue39298>
___



[issue47000] Make encoding="locale" uses locale encoding even in UTF-8 mode is enabled.

2022-03-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

FWIW: I don't think the "locale" encoding is a good idea. Instead of
trying to fix this to make it more usable, I'd suggest to deprecate
and remove it again.

When it comes to encodings, explicit is better than implicit.

If an application wants to work with some user defined locale settings,
it's better for the application to decide where to pick the locale
settings from, e.g. the OS, the UI, an application config setting,
etc.

There are too many ways this can be done and trying to build
magic to determine the "right" one is bound to fail in one way or
another.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue47000>
___



[issue46659] Deprecate locale.getdefaultlocale() function

2022-02-24 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Thanks, Victor.

--

___
Python tracker 
<https://bugs.python.org/issue46659>
___



[issue46662] Lib/sqlite3/dbapi2.py: convert_timestamp function failed to correctly parse timestamp

2022-02-09 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 08.02.2022 11:54, Erlend E. Aasland wrote:
> 
> The sqlite3 timestamp converter is buggy, as already noted in the docs[^1]. 
> Adding timezone support is out of the question[^2][^3][^4][^5], but fixing it 
> to be able to discard any attached timezone info _may_ be ok; at first sight, 
> I don't see how this could break existing applications (like, for example 
> adding time zone support could do). I need to think it through.

I think it's better to deprecate these converters and let users implement
their own.
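
A sketch of what a user-defined replacement could look like, assuming timestamps are stored as ISO 8601 text. The function names, table name and sample value are made up for this example.

```python
import datetime
import sqlite3


def adapt_datetime(value):
    """Store datetimes as ISO 8601 text, including any UTC offset."""
    return value.isoformat(" ")


def convert_timestamp(raw):
    """Parse the stored text back, keeping an attached UTC offset."""
    return datetime.datetime.fromisoformat(raw.decode("ascii"))


# Registration is global for the sqlite3 module and replaces the
# default handling for the declared column type "timestamp".
sqlite3.register_adapter(datetime.datetime, adapt_datetime)
sqlite3.register_converter("timestamp", convert_timestamp)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE events (ts timestamp)")

sample = datetime.datetime(2022, 2, 9, 12, 0, tzinfo=datetime.timezone.utc)
con.execute("INSERT INTO events (ts) VALUES (?)", (sample,))
loaded = con.execute("SELECT ts FROM events").fetchone()[0]
print(loaded)
```

This way the application decides how timezone information is handled, rather than the sqlite3 module.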

--

___
Python tracker 
<https://bugs.python.org/issue46662>
___



[issue46659] Deprecate locale.getdefaultlocale() function

2022-02-06 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

> For these reasons, I propose to deprecate locale.getdefaultlocale(): 
> setlocale(), getpreferredencoding() and getlocale() should be used instead.

Please see the discussion on https://bugs.python.org/issue43552: 
locale.getpreferredencoding() needs to be deprecated as well. Instead we should 
have a single locale.getencoding() as outlined there... perhaps in a separate 
ticket ?! Thanks.
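
For context: a locale.getencoding() function was later added in Python 3.11. A small sketch which uses it where available and falls back otherwise; the helper name is an assumption for this example.

```python
import locale


def current_locale_encoding():
    """Return the encoding of the current locale as a string."""
    getencoding = getattr(locale, "getencoding", None)
    if getencoding is not None:
        # Python 3.11+: single, well-defined API.
        return getencoding()
    # Older versions: do not call setlocale() as a side effect.
    return locale.getpreferredencoding(False)


print(current_locale_encoding())
```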

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue46659>
___



[issue45382] platform() is not able to detect windows 11

2022-01-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.01.2022 01:29, Eryk Sun wrote:
> 
> Eryk Sun  added the comment:
> 
>> Bit wmic seems nice solution.
>> Is still working for windows lower than 11?
> 
> wmic.exe is still included in Windows 10 and 11, but it's officially 
> deprecated [1], which means it's no longer being actively developed, and it 
> might be removed in a future update. PowerShell is the preferred way to use 
> WMI.

All of these require shelling out to the OS, so why not stick to
`ver` as we've done in the past? This has existed for ages and
will most likely not disappear anytime soon.

Is there a good reason to prefer wmic or PowerShell (which are
less likely to be available or reliable) ?
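
A rough sketch of the `ver` approach. The helper names and the sample output line are assumptions for illustration; the build number threshold of 22000 for Windows 11 is also an assumption of this sketch and not something the platform module is known to use in this form.

```python
import re
import subprocess

# Matches e.g. "[Version 10.0.22000.434]" in the output of `ver`.
_VER_RE = re.compile(r"\[.*?(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?\]")


def run_ver():
    """Return the raw output of the `ver` command (Windows only)."""
    # `ver` is a cmd.exe builtin, so it has to be run via the shell.
    result = subprocess.run(
        ["cmd", "/c", "ver"], capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_ver_output(text):
    """Extract the version numbers as a tuple of ints, or None."""
    match = _VER_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups() if part is not None)


def looks_like_windows_11(version):
    """Assumption: Windows 11 reports 10.0.x with a build >= 22000."""
    return version is not None and version[0] == 10 and version[2] >= 22000


# Hypothetical sample output, not captured from a real system.
sample = "Microsoft Windows [Version 10.0.22000.434]"
print(parse_ver_output(sample), looks_like_windows_11(parse_ver_output(sample)))
```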

> ---
> [1] 
> https://docs.microsoft.com/en-us/windows/deployment/planning/windows-10-deprecated-features
-- 
Marc-Andre Lemburg
eGenix.com

--

___
Python tracker 
<https://bugs.python.org/issue45382>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop and enable it for executemany()

2022-01-12 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 11.01.2022 21:30, Erlend E. Aasland wrote:
> 
>> I'd suggest to deprecate the cursor.lastrowid attribute and
>> instead point people to the much more useful [...]
> 
> Yes, I think mentioning the RETURNING ROWID trick in the sqlite3 docs is a 
> very nice improvement. Mentioning the last_insert_rowid SQL function is 
> probably also worth consideration.
> 
> I'm reluctant to deprecate cursor.lastrowid, though. ATM, I'm leaning towards 
> just keeping the current behaviour.

Fair enough :-)

Perhaps just documenting that the value is not necessarily what
people may expect, when coming from other databases due to the
different semantics with SQLite, is enough.
 --
Marc-Andre Lemburg
eGenix.com

--

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop and enable it for executemany()

2022-01-11 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

On 11.01.2022 20:46, Erlend E. Aasland wrote:
> 
> If we are to revert to this behaviour, we'll have to start examining the SQL 
> we are given (search for INSERT and REPLACE keywords, determine if they are 
> valid (i.e. not a comment, not part of a column or table name, etc.), which 
> will lead to a noticeable performance hit for every new statement (not for 
> statements reused via the LRU cache though). I'm not sure this is a good 
> idea. However I will give it a good thought.
>
> My first thought now, is that it would be better for the sqlite3 module to 
> align lastrowid with the behaviour of the C API sqlite3_last_insert_rowid() 
> (also available as an SQL function: last_insert_rowid). OTOH, the SQLite API 
> is tied to the _connection_ object, so it may not make sense to align it with 
> lastrowid which is a _cursor_ attribute.

I've had a look at the API description and find it less than useful,
to be honest:

https://sqlite.org/c3ref/last_insert_rowid.html

You don't know on which cursor the last row was inserted, and it's
possible that the insert was or is done by a trigger. The last row ID
is also not updated if the INSERT does not succeed for some reason; it
is simply left unchanged, without the user getting a notification of
this failure, since the .execute() call itself will succeed for
e.g. "INSERT INTO table SELECT ...;".

It also seems that the function really only works for INSERTs and
not for UPDATEs.

> Perhaps the Right Thing To Do™ is to be conservative and just leave it as it 
> is. I still want to apply the optimisation, though. It does not alter the 
> behaviour in any kind of way, and it speeds up executemany().

I'd suggest to deprecate the cursor.lastrowid attribute and
instead point people to the much more useful

"INSERT INTO t (name) VALUES ('two'), ('three') RETURNING ROWID;"

https://sqlite.org/lang_insert.html
https://sqlite.org/forum/forumpost/058ac49cc3

(good to know that SQLite has adopted this PostgreSQL variant as
well)

RETURNING is also available for UPDATEs:

https://sqlite.org/lang_update.html

If people really want to use the sqlite3_last_insert_rowid()
functionality, they can use the SQL function of the same name:

https://www.sqlite.org/lang_corefunc.html#last_insert_rowid

which then has known semantics and doesn't conflict with the DB-API
specs.
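
A sketch of both variants from Python. The helper name and table are made up for this example, and the version check assumes that RETURNING needs SQLite 3.35.0 or later.

```python
import sqlite3


def insert_names(con, names):
    """Insert the (non-empty) list of names and return the new row ids.

    With RETURNING, all row ids are returned. On older SQLite versions
    only the id of the last inserted row is available, via the SQL
    function last_insert_rowid().
    """
    placeholders = ", ".join("(?)" for _ in names)
    sql = f"INSERT INTO t (name) VALUES {placeholders}"
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        rows = con.execute(sql + " RETURNING rowid", names).fetchall()
        # The order of RETURNING rows is not guaranteed, so sort them.
        return sorted(row[0] for row in rows)
    con.execute(sql, names)
    return [con.execute("SELECT last_insert_rowid()").fetchone()[0]]


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (name TEXT)")
ids = insert_names(con, ["two", "three"])
con.commit()
print(ids)
```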

But this is your call :-)

--

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop and enable it for executemany()

2022-01-11 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

On 08.01.2022 21:56, Erlend E. Aasland wrote:
>  
> Marc-André: since Python 3.6, the sqlite3.Cursor.lastrowid attribute does no 
> longer comply with the recommendations of PEP 249:
> 
> Previously, lastrowid was set to None for operations other than INSERT or 
> REPLACE. This changed with ab994ed8b97e1b0dac151ec827c857f5e7277565 (in 
> Python 3.6), so that lastrowid is _unchanged_ for operations other than 
> INSERT or REPLACE, and it is set to 0 after the first valid SQL (that is not 
> INSERT/REPLACE) is executed on the cursor.
> 
> Now, PEP 249 only _recommends_ that lastrowid is set to None for operations 
> that do not modify a row, so it's probably not a big deal. No-one has ever 
> mentioned this change in behaviour; there have been no bug reports.
> 
> FTR, here is the relevant quote from PEP 249:
> 
> If the operation does not set a rowid or if the database does not support
> rowids, this attribute should be set to None.
> 
> (I interpret "should" as understood by RFC 2119.)

Well, it may be a little stronger than the SHOULD in the RFC, but then
again the whole DB-API is about conventions and if they don't make sense
for a database backend, it is possible to deviate from the spec, esp. for
optional extensions such as .lastrowid.

> So, my follow-up question becomes:
> I see no point in reverting to pre Python 3.6 behaviour. I would rather 
> change the default value to be 0 (to get rid of the dirty flag in GH-30380), 
> and to make the behaviour more consistent with how the actual SQLite API 
> behaves.
> 
> 
> Do you have an opinion about such a change (in behaviour)?

Is 0 a valid row ID in SQLite ? If not, then I guess this would be
an alternative to None as suggested by the DB-API.

If it is a valid row ID, I'd suggest to go back to resetting to None,
since otherwise code might get confused: if an UPDATE does not get
applied (e.g. a condition is false), code could then still take
.lastrowid as referring to the UPDATE and not a previous
operation, since code will not know whether the condition was met
or not.
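
A small sketch of the situation. The table and values are made up; the example deliberately does not rely on the value of .lastrowid after the UPDATE, since that differs between Python versions, and uses .rowcount to detect whether the UPDATE applied.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (name TEXT)")
cur = con.cursor()

cur.execute("INSERT INTO t (name) VALUES ('one')")
inserted_id = cur.lastrowid          # refers to the INSERT

# This UPDATE matches no rows, but .execute() still succeeds.
cur.execute("UPDATE t SET name = 'uno' WHERE name = 'missing'")

# .rowcount tells us whether any row was modified; .lastrowid does not.
update_applied = cur.rowcount > 0
print(inserted_id, update_applied)
```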
 --
Marc-Andre Lemburg
eGenix.com

--
title: [sqlite3] lastrowid improvements -> [sqlite3] move set lastrowid out of 
the query loop and enable it for executemany()

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue46338] libc_ver() runtime error when sys.executable is empty

2022-01-11 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 10.01.2022 23:01, Allie Hammond wrote:
> 
> libc_ver() in platform.py (called from platform()) causes a runtime error if 
> sys.executable returns null. In my case, FreeRADIUS offers a module 
> rlm_python3 which allows you to run python code from the C based FreeRADIUS 
> server - since this module doesn't use a python binary to execute 
> sys.executable returns null trigering this error.

Interesting. I guess rlm_python3 embeds Python. Is sys.executable an
empty string or None ?

--

___
Python tracker 
<https://bugs.python.org/issue46338>
___



[issue12756] datetime.datetime.utcnow should return a UTC timestamp

2022-01-10 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Hi Tony,

from practical experience, it is a whole lot better to not deal with
timezones in data processing code at all, but instead only use
naive UTC datetime values everywhere, except when you have to
prepare reports or output which has a requirement to show datetime
values in local time or some specific timezone.

You convert all datetime values into UTC upon input, possibly
store the timezone somewhere, if that is relevant for later reporting,
and then forget about timezones.

Your code will run faster, become a lot easier to understand
and you avoid many pitfalls that TZs have, esp. when TZs are
silently dropped interfacing to e.g. numeric code, databases or
other external code.
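
A minimal sketch of this pattern. The helper names and the fixed +01:00 offset are assumptions for illustration; real code would typically use a proper timezone database for the reporting step.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(dt):
    """Convert an aware datetime to a naive UTC value on input."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime on input")
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


def for_report(naive_utc, tz):
    """Convert a naive UTC value to the given timezone for output."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(tz)


# Illustrative fixed offset of +01:00.
report_tz = timezone(timedelta(hours=1))

received = datetime(2022, 1, 10, 13, 30, tzinfo=report_tz)
stored = to_naive_utc(received)        # naive UTC, used everywhere internally
shown = for_report(stored, report_tz)  # only converted back for output
print(stored, shown)
```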

There's a reason why cloud code (and a lot of other code, such
as data science code) has standardized on UTC :-)

Cheers,
-- 
Marc-Andre Lemburg
eGenix.com

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue12756>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop

2022-01-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 04.01.2022 21:02, Erlend E. Aasland wrote:
> 
> Erlend E. Aasland  added the comment:
> 
>> If possible, it's usually better to have the .executemany() create a
>> cursor with an output array providing the row ids, e.g. using "INSERT ...
>> RETURNING ..." (PostgreSQL). That way you can access all row ids and
>> can also provide the needed detail in case the INSERTs happen out of
>> order to map them to the input data.
> 
> Hm, maybe so. But it will add a lot of overhead and complexity to 
> executemany(), and there haven't been requests for this feature for sqlite3. 
> AFAIK, there hasn't been request for lastrowid for executemany() at all. 
> OTOH, my proposal of modifying lastrowid to always show the rowid of the 
> actual last inserted row is a very cheap operation, _and_ it simplifies the 
> code (=> increased maintainability), so I think I'll go for that.

Sorry, I wasn't suggesting this for SQLite; it's just a better
and more flexible option than using cursor.lastrowid where
available. Sadly, the PG extension is not standards-conforming SQL.

>> For cases where you don't need sequence IDs, it's often better to
>> not rely on auto-increment columns for IDs, but instead use random
>> pre-generated IDs. Saves roundtrips to the database and works nicely
>> with cluster databases as well.
> 
> Yes, but in those cases you keep track of the row id yourself, so you 
> probably won't need lastrowid ;)

Indeed, and that's the point :-)

Those auto-increment ID fields are
not really such a good idea to begin with. Either you know that you
will need to manipulate the rows after inserting them (in which case
you can set an ID) or you don't care about the individual rows and
only want to aggregate or search in them based on other fields.
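
A sketch of the pre-generated ID approach, using random UUIDs as one possible choice. The table layout and names are made up for this example.

```python
import sqlite3
import uuid

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")

names = ["two", "three"]

# The IDs are generated by the application before the insert ...
rows = [(uuid.uuid4().hex, name) for name in names]
con.executemany("INSERT INTO items (id, name) VALUES (?, ?)", rows)
con.commit()

# ... so they are already known afterwards, without lastrowid and
# without an extra roundtrip to the database.
ids = [row_id for row_id, _ in rows]
print(ids)
```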

--

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop

2022-01-04 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

On 04.01.2022 11:02, Erlend E. Aasland wrote:
> 
> Erlend E. Aasland  added the comment:
> 
> Thank you for your input Marc-André.
> 
> For SQLite, it's pretty simple: we use an API called 
> sqlite3_last_insert_rowid() which takes the database connection as it's 
> argument, not a statement pointer. This function returns "the rowid of the 
> most recent successful INSERT into a rowid table or virtual table on database 
> connection" (quote from SQLite docs). IMO, it would make sense to also use 
> this post executemany().

Sounds like a plan.

If possible, it's usually better to have the .executemany() create a
cursor with an output array providing the row ids, e.g. using "INSERT ...
RETURNING ..." (PostgreSQL). That way you can access all row ids and
can also provide the needed detail in case the INSERTs happen out of
order to map them to the input data.

For cases where you don't need sequence IDs, it's often better to
not rely on auto-increment columns for IDs, but instead use random
pre-generated IDs. Saves roundtrips to the database and works nicely
with cluster databases as well.

--

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue46249] [sqlite3] move set lastrowid out of the query loop

2022-01-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 04.01.2022 00:49, Erlend E. Aasland wrote:
> 
> Erlend E. Aasland  added the comment:
> 
> I see that PEP 249 does not define the semantics of lastrowid for 
> executemany(). What's the precedence here, MAL? IMO, it would be nice to be 
> able to fetch the last row id after you've done a 1000 inserts using 
> executemany().

The DB-API deliberately leaves this undefined, since there are many ways you
could implement this, e.g.

- return the last row id for the last entry in the array passed to 
.executemany()
- return the last row id of the last actually modified/inserted row after
running .executemany()
- return an array of row ids, one for each row modified/inserted
- return a row id of one of the modified/inserted rows, without defining which
- always return None for .executemany()

Note that in some cases, the order of actions taken by the database is not
predefined (e.g. some databases run such inserts in chunks across
a cluster), so even the "last" semantics are not clear.

> So, another option would be to keep "set-lastrowid" in the query loop, and 
> just remove the condition; we set it every time (but of course only build a 
> PyObject of it when the loop is done).

Since the DB-API leaves this undefined, you are free to add your own
meaning, which makes most sense for SQLite, to always return None or
not implement it at all.

DB-API extensions such as Cursor.lastrowid are optional and don't need
to be implemented if they don't make sense for a particular use case:

https://www.python.org/dev/peps/pep-0249/#optional-db-api-extensions

--

___
Python tracker 
<https://bugs.python.org/issue46249>
___



[issue41921] REDoS in parseentities

2021-12-06 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Interesting that the tool still exists. It uses mxTextTools, but in a 
non-packaged version, so it's been broken for two decades now :-)

I think it's safe to remove it from Tools\scripts.

--

___
Python tracker 
<https://bugs.python.org/issue41921>
___



[issue45927] timeit accepts -c/--clock and -t/--time without any functionality

2021-11-29 Thread Marc-Andre Lemburg


New submission from Marc-Andre Lemburg :

From the code:

opts, args = getopt.getopt(args, "n:u:s:r:tcpvh",
                           ["number=", "setup=", "repeat=",
                            "time", "clock", "process",
                            "verbose", "unit=", "help"])

but the options -c and -t are not used in the subsequent code.

This can lead to situations where if you mistype e.g. -s as -c, you get 
completely wrong results (see https://bugs.python.org/issue45902 for an 
example).
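
The effect can be reproduced with getopt alone. This sketch reuses the option strings quoted above; the `parse` helper is only there for the demonstration.

```python
import getopt

SHORT_OPTS = "n:u:s:r:tcpvh"
LONG_OPTS = ["number=", "setup=", "repeat=",
             "time", "clock", "process",
             "verbose", "unit=", "help"]


def parse(argv):
    """Parse argv the same way the quoted timeit code does."""
    return getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)


# Intended call: -s takes the import line as its setup argument.
intended = parse(["-s", "from bytes_sort import bytes_sort",
                  "bytes_sort(b'')"])

# Mistyped call: -c is accepted as a flag without argument, so the
# import line is left over as an ordinary positional argument.
mistyped = parse(["-c", "from bytes_sort import bytes_sort",
                  "bytes_sort(b'')"])

print(intended)
print(mistyped)
```

In the mistyped case no error is raised and the import line ends up among the statements to be timed.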

--
components: Library (Lib)
messages: 407276
nosy: lemburg
priority: normal
severity: normal
status: open
title: timeit accepts -c/--clock and -t/--time without any functionality
versions: Python 3.10, Python 3.11

___
Python tracker 
<https://bugs.python.org/issue45927>
___



[issue45902] Bytes and bytesarrays can be sorted with a much faster count sort.

2021-11-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.11.2021 10:56, Ruben Vorderman wrote:
> 
> $ python -m timeit -c "from bytes_sort import bytes_sort" "bytes_sort(b'')"
> 50 loops, best of 5: 495 nsec per loop

Shouldn't this read:

$ python -m timeit -s "from bytes_sort import bytes_sort" "bytes_sort(b'')"

(-s instead of -c) ?

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45902>
___



[issue45921] codecs module doesn't support iso-8859-6-i, iso-8859-6-e, iso-8859-8-i or iso-8859-8-i

2021-11-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Even though these are IANA recognized encodings, we need to apply the same logic 
as we do for all new encodings, which essentially boils down to: Are these 
encodings in widespread use today?

Reading through the RFC 1556, it seems that the added -i or -e are just 
indications for applications on how to interpret BIDI information: either 
implicit by looking at the order of characters in the stream or explicit via 
control characters embedded in the stream. They are not new encodings, with new 
mappings.

If that's a correct interpretation, we could add those as aliases for the 
non-annotated encodings.
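
Until such aliases exist, an application can register them itself. A minimal sketch, assuming the interpretation above is correct and the -i/-e variants use the same mappings; the names `_BIDI_ALIASES` and `_bidi_alias_search` are made up for this example.

```python
import codecs

# Assumption: the annotated names map to the plain ISO 8859 codecs.
_BIDI_ALIASES = {
    "iso_8859_6_e": "iso8859_6",
    "iso_8859_6_i": "iso8859_6",
    "iso_8859_8_e": "iso8859_8",
    "iso_8859_8_i": "iso8859_8",
}


def _bidi_alias_search(name):
    """Codec search function resolving the annotated names."""
    # Normalize hyphens ourselves, since the normalization applied
    # before calling search functions differs between Python versions.
    target = _BIDI_ALIASES.get(name.lower().replace("-", "_"))
    if target is None:
        return None
    return codecs.lookup(target)


# Note: this changes the codec registry for the whole process.
codecs.register(_bidi_alias_search)

print(b"\xe0".decode("iso-8859-8-i"))
```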

After more than 20 years with Unicode support in Python and the world moving 
towards UTF-8, I have become fairly reluctant towards adding more encoding 
support to Python.

If people are still using unsupported encodings, it's probably better to point 
them to other dedicated tools for converting text to UTF-8, e.g. iconv, than 
extending the pretty extensive support we already have in Python.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45921>
___



[issue45476] [C API] Disallow using PyFloat_AS_DOUBLE() as l-value

2021-11-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 15.11.2021 10:54, STINNER Victor wrote:
> 
> I don't understand what you are trying to prove about compilers not inlining 
> when you explicitly ask them... not to inline.

I'm not trying to prove anything, Victor.

I'm only stating the fact that by switching from macros to inline
functions we are giving away control to the compilers and should not
be surprised that Python now suddenly runs a lot slower on systems
which either have inlining optimizations switched off or where the
compiler (wrongly) assumes that creating more assembler would result
in slower code.

I've heard all your arguments against macros, but don't believe the
blanket approach to convert to inline functions is warranted in all
cases, in particular not for code which is private to the interpreter
and where we know that we need the code inlined to not result in
unexpected performance regressions.

I also don't believe that we should assume that Python C extension
authors will unintentionally misuse Python API macros. If they do,
it's their business to sort out any issues, not ours. If we document
that macros may not be used as l-values, that's clear enough. We don't
need to use compiler restrictions to impose such limitations.

IMO, conversion to inline functions should only happen, when

a) the core language implementation has a direct benefit, and

b) it is very unlikely that compilers will not inline the code
   with -O2 settings, e.g. perhaps using a threshold of LOCs
   or by testing with the website Oleg mentioned.

Overall, I think PEP 670 should get some more attention from the
SC to have a guideline to use as a basis for deciding whether or not
to use the static inline function approach. That way we could avoid
these discussions :-)

BTW: Thanks for the details about -O0 vs. -Og.

--

___
Python tracker 
<https://bugs.python.org/issue45476>
___



[issue45476] [C API] Disallow using PyFloat_AS_DOUBLE() as l-value

2021-11-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 15.11.2021 08:54, Oleg Iarygin wrote:
> 
> Oleg Iarygin  added the comment:
> 
> Marc-Andre:
>> Inlining is something that is completely under the control of the
> used compilers. Compilers are free to not inline function marked for
> inlining [...]
> 
> I checked the following C snippet on gcc.godbolt.org using GCC 4.1.2 and 
> Clang 3.0.0 with /-O0/-O1/-Os, and both compilers inline a function 
> marked as static inline:
> 
> static inline int foo(int a)
> {
>     return a * 2;
> }
> 
> int bar(int a)
> {
>     return foo(a) < 0;
> }
> 
> So even with -O0, GCC from 2007 and Clang from 2011 perform inlining. Though, 
> old versions of CLang leave a dangling original copy of foo for some reason. 
> I hope a linker removes it later.

That's a great website :-) Thanks for sharing.

However, even with x86-64 gcc 11.2, I get assembler which does not inline
foo() without compiler options or with -O0: https://gcc.godbolt.org/z/oh6qnffh7

Only with -O1, the site reports inlining foo().

> As for other compilers, I believe that if somebody specifies -O0, that person 
> has a sound reason to do so (like per-line debugging, building precise flame 
> graphs, or other specific scenario where execution performance does not 
> matter), so inlining interferes here anyway.

Sure, but my point was a different one: even with higher optimization
levels, the compiler can decide whether or not to inline. We expect
the compiler to inline, but cannot be sure.

With macros the compiler has no choice and we are in control and even
when using -O0, you will still want e.g. Py_INCREF() and Py_DECREF()
inlined.

--

___
Python tracker 
<https://bugs.python.org/issue45476>
___



[issue45698] Error on importing getopt

2021-11-03 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Could you please provide more details regarding the OS, whether you compiled
Python yourself and the env settings you used ?

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45698>
___



[issue45653] Freeze the encodings module.

2021-10-30 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

On 30.10.2021 17:54, Filipe Laíns wrote:
> 
> I just tested partially freezing the package, and it seems to be working fine :)

FWIW: I think it's best not to bother and simply freeze the whole thing.

It's mostly char mappings which compress well and there's a benefit
in sharing these using mmap (which the OS does for you with static
C data).

--

___
Python tracker 
<https://bugs.python.org/issue45653>
___



[issue45548] Update Modules/Setup

2021-10-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 27.10.2021 22:58, Brett Cannon wrote:
> 
> Brett Cannon  added the comment:
> 
>> Could Brett or you please add those notes back ? There's no other place
> where such details are documented.
> 
> It really depends on what "details" you're referring to.

I had already listed some of those details.

> Most of what I removed were things like "Module by ", or saying 
> _json.c is for "json accelerator" which is obvious to me. Anything that 
> seemed pertinent to compilation I left in.
> 
> So if there's something you specifically want to add back in that you think 
> is important then please feel free to as I'm done editing the file for my 
> purposes at the moment.

Will do.

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45653] Freeze the encodings module.

2021-10-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

encodings is a package. I think you first have to check whether mixing
frozen and non-frozen submodules is even supported. I've never tried
having only part of a package frozen.

Freezing the whole package certainly works.
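
A quick way to see which parts of a package an interpreter provides as frozen modules is to ask the frozen importer directly. This is only a sketch: the helper names are made up, and the result for encodings depends on the Python version and build:

```python
import importlib.machinery


def is_frozen(name):
    """Return True if the frozen importer can provide module `name`."""
    return importlib.machinery.FrozenImporter.find_spec(name) is not None


def frozen_status(package, submodules):
    """Map the package and each listed submodule to its frozen status."""
    names = [package] + [f"{package}.{sub}" for sub in submodules]
    return {name: is_frozen(name) for name in names}


# Which of these are frozen differs between builds; a mixed result would
# be the "partially frozen package" case discussed here.
status = frozen_status("encodings", ["aliases", "utf_8", "cp1252"])
```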

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45653>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Thanks, Pablo :-)

--

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-28 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Hi Pablo,

I think you missed one instance:

   print_escape(stdout, tok->cur, tok->inp - tok->cur);

Cheers

--

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue42738] subprocess: don't close all file descriptors by default (close_fds=False)

2021-10-27 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Gregory P. Smith wrote:
> A higher level "best practices for launching child processes module" with 
> APIs reflecting explicit intents (performance vs security vs simplicity) 
> rather than requiring users to understand subprocess platform specific 
> details may be a good idea at this point (on PyPI I assume).

Interesting that you say that. subprocess was originally written with exactly 
this idea in mind - to create a module which deals with all the platform 
details, so that the user doesn't have to know about them: 
https://www.python.org/dev/peps/pep-0324/#motivation

On the topic itself: I believe Python should come with safe and usable 
defaults, but provide ways to enable alternative approaches which have more 
performance if the user so decides (by e.g. passing in a parameter). In such 
cases, the user then becomes responsible for understanding the implications.

Since there already is a parameter, expert users can already gain more 
performance by using it and then explicitly only closing FDs the user knows 
can/should be closed in the new process.

Perhaps there's a middle ground: have subprocess only loop over the first 64k 
FDs per default and not all possible ones.
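
As a rough sketch of the expert path: with close_fds=False the sweep over all descriptors is skipped, and the caller decides per descriptor what the child may see. The example assumes a POSIX platform; CHILD_CODE and run_child are illustrative names, not part of subprocess:

```python
import os
import subprocess
import sys

# Child snippet: report whether each descriptor number passed on the
# command line is open in the child process.
CHILD_CODE = """
import os, sys
for arg in sys.argv[1:]:
    try:
        os.fstat(int(arg))
        print(arg, "open")
    except OSError:
        print(arg, "closed")
"""


def run_child(fds):
    """Start a child with close_fds=False and return its report lines."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, *map(str, fds)],
        close_fds=False,  # skip the loop over all possible descriptors
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


keep_r, keep_w = os.pipe()
# Descriptors created by Python are non-inheritable by default (PEP 446),
# so only the one explicitly marked inheritable is shared with the child.
os.set_inheritable(keep_r, True)
report = run_child([keep_r])
os.close(keep_r)
os.close(keep_w)
```

Descriptors that were made inheritable elsewhere (e.g. by C libraries) are not covered by this and remain the caller's responsibility.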

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue42738>
___



[issue45548] Update Modules/Setup

2021-10-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.10.2021 20:47, Marc-Andre Lemburg wrote:
>> Brett removed a lot of stuff in 01cf4fb9c1aa567484c2ffb1b11f9b3fe9949b05 to 
>> make the file more readable. I removed unnecessary -D, -I, and -L to make 
>> the file even more readable. You can pass custom flags to ./configure.
> 
> Could Brett or you please add those notes back ? There's no other place
> where such details are documented. We've lost important information and
> I would like to get this back into Setup and ideally add more
> information to make it easier for users or admins to customize
> their build.

I could also edit the file and add those back, after you're done
with the refactoring.

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45548] Update Modules/Setup

2021-10-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.10.2021 19:42, Christian Heimes wrote:
> 
> Brett removed a lot of stuff in 01cf4fb9c1aa567484c2ffb1b11f9b3fe9949b05 to 
> make the file more readable. I removed unnecessary -D, -I, and -L to make the 
> file even more readable. You can pass custom flags to ./configure.

Could Brett or you please add those notes back ? There's no other place
where such details are documented. We've lost important information and
I would like to get this back into Setup and ideally add more
information to make it easier for users or admins to customize
their build.

Regarding the -I and -L flags, my question was whether these are
now added elsewhere, since the code would not compile without
e.g. -I$(srcdir)/Include/internal.

I've gone through the Makefile and found that these are already
added via PY_CFLAGS_NODIST, so they are indeed not needed in Setup.
That's great :-)
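
For an installed interpreter, the recorded build variables can be inspected via sysconfig. A small sketch; include_flags is a made-up helper, and the variable is typically only available on Makefile based (POSIX) builds:

```python
import sysconfig


def include_flags(var="PY_CFLAGS_NODIST"):
    """Return the -I options recorded in a build configuration variable."""
    value = sysconfig.get_config_var(var) or ""  # None on some platforms
    return [flag for flag in value.split() if flag.startswith("-I")]


flags = include_flags()
```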

Please be careful when moving e.g. -I or -L options which point to
non-Python directories. If such options get moved into e.g.
PY_CFLAGS_NODIST, there's no way to override them via options
given in Setup.

Having the -D defined in the relevant code is a lot better. Thanks
for that :-)

> Setup should not be edited by hand. Customizations go to Setup.local.

Editing Setup.local instead of Setup is not documented anywhere ?!
Both files are read, so I guess it's not all that relevant which
of the two is edited.

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45548] Update Modules/Setup

2021-10-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.10.2021 16:17, Christian Heimes wrote:
> 
> Christian Heimes  added the comment:
> 
> Brett, we can use AM_CONDITIONAL() to conditionally enable/disable a feature 
> and AC_CONFIG_FILES() to create a Modules/Setup from a template:
> 
> Example:
> 
> The conditional
> 
> AM_CONDITIONAL([HAVE_SCPROXY], [test "$ac_sys_system" = "Darwin"])
> 
> sets HAVE_SCPROXY_FALSE and HAVE_NIS_SCPROXY based on the check.

I think it would be more helpful (and generate a lot fewer such
macros), if we'd just test for platforms and then use those in the
Setup.in template.

Some other things:

I saw that you stripped off lots of -I and -L options from the Setup
lines. Are those now unconditionally taken from somewhere else,
without possibility to override them ?

You also removed the structure of the listings, which makes things
like dependencies between e.g. _multibytecodec and the CJK codecs
unclear, the notes about audio, the comments on building the _md5
and sha* modules, etc.

Since Setup is a (potentially) hand edited file, it's good to leave
as much useful information in that file as possible, to not
accidentally create setups which don't work and to have references
which may help in finding the right compile options.

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 26.10.2021 16:02, Eric Snow wrote:
> 
> FYI, I figured out the problem on my end.  I wasn't using an installed 
> python.  Once I did it worked fine.

Oh, you mean you tried using it directly from the source tree ?
I don't think I ever tried that direct route.

When building PyRun, I first install to a temporary directory and
then use this to run the freeze.py tool, generate the frozen .c
files and run make to have the executable built.

I've pretty much finished the port to 3.10.

I'll try the main version in the next couple of days. There's currently
a lot of work going on for the makesetup / Setup files
(https://bugs.python.org/issue45548). I'm waiting for that to stabilize.

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45613] [sqlite3] set threadsafety attribute based on default SQLite threaded mode

2021-10-26 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

+1 on setting the attributes dynamically. Other DB-API modules use a
similar approach.
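
A sketch of how such a dynamic value could be derived at runtime. The mapping from SQLite's THREADSAFE compile option to DB-API levels is an assumption made here for illustration, and detect_threadsafety is a hypothetical helper, not part of the sqlite3 module:

```python
import sqlite3

# Assumed mapping: SQLite THREADSAFE mode -> DB-API 2.0 threadsafety level
# (0 = single-thread, 2 = multi-thread, 1 = serialized).
SQLITE_TO_DBAPI = {0: 0, 2: 1, 1: 3}


def detect_threadsafety(conn):
    """Return the DB-API threadsafety level, or None if it cannot be read."""
    for (option,) in conn.execute("PRAGMA compile_options"):
        if option.startswith("THREADSAFE="):
            mode = int(option.split("=", 1)[1])
            return SQLITE_TO_DBAPI.get(mode)
    return None  # compile option diagnostics may be omitted from the library


conn = sqlite3.connect(":memory:")
try:
    level = detect_threadsafety(conn)
finally:
    conn.close()
```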

--

___
Python tracker 
<https://bugs.python.org/issue45613>
___



[issue45548] Update Modules/Setup

2021-10-24 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 23.10.2021 21:30, Christian Heimes wrote:
> 
> The trick would move the math function back into the core. Mark moved the 
> math functions out of the core on purpose, see bpo-7518.

I don't follow you. With the _math.o target in Makefile.pre.in,
_math.c was always compiled into the main Python interpreter,
even with math and cmath built as shared libs.

And yes, it does export a symbol, but _Py_log1p is not going to
conflict with anything else out there :-)

The trick is essentially not changing the 3.10 status quo. It
only makes sure that _math.o is compiled in the same way as all
other Setup managed modules and moves the target from Makefile.pre.in
to the makesetup section of the final Makefile.

And it removes the warning of having multiple _math.o targets
ending up in the Makefile, which isn't problematic, since make
will always use the last definition (from the makesetup section),
but doesn't look nice either.

OTOH, _math.h and .c are really small, so perhaps it's better
to merge both into a single _math.h file and include that directly
into the modules. I believe that's what Brett's patch does, right ?

Today, only the _Py_log1p() code is actually used and then only
to address a very special case in a platform independent way
(log1p(-0.0) == -0.0), so the whole code boils down to some 10
lines of C being incorporated.
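
In Python terms, the special case amounts to something like the following sketch (log1p_keep_zero_sign is an illustrative name, not the C function itself):

```python
import math


def log1p_keep_zero_sign(x):
    """log1p(x) that returns x unchanged for +0.0 and -0.0."""
    if x == 0.0:
        # Covers both zeros; returning x itself preserves the sign bit.
        return x
    return math.log1p(x)


result = log1p_keep_zero_sign(-0.0)
# copysign() exposes the sign bit, which == cannot distinguish for zeros.
is_negative_zero = result == 0.0 and math.copysign(1.0, result) == -1.0
```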

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45548] Update Modules/Setup

2021-10-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 23.10.2021 20:04, Christian Heimes wrote:
> 
> PR GH-29179 or GH-29181 address the issue with _math.o

I think those patches are both taking things a bit too far.

This is a build problem, not a code problem. It's perfectly
good style to link a single file to multiple other object files
instead of copying the code into those object files.

The catch is that the makesetup logic is not smart enough to only
include the necessary Makefile line once.

The entry in Makefile.pre.in should not be needed, since the logic
for building math and cmath modules already exists in setup.py.

BTW: There's a simple trick to avoid the makesetup issue: simply
add the _math.c entry to some other module which is always linked
statically, e.g. _stat, and remove it from both math and cmath
entries in Setup, as well as the Makefile.pre.in.

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45548] Update Modules/Setup

2021-10-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I'm using a very simple conditional logic in Setup, which is based
on sed, mostly to add platform specific variables:

In Setup:

# @if macosx: TIME_DEFS=
# @if not macosx: TIME_DEFS=-lrt
time -DPy_BUILD_CORE_BUILTIN -I$(srcdir)/Include/internal timemodule.c $(TIME_DEFS) # -lm # time operations and variables

In Makefile:

# Install the custom Modules/Setup file
if test "$(MACOSX_PLATFORM)"; then \
    sed -e 's/# @if macosx: *//' \
        $(PYRUNDIR)/$(MODULESSETUP) > $(PYTHONDIR)/Modules/Setup; \
elif test "$(FREEBSD_PLATFORM)"; then \
    sed -e 's/# @if freebsd: *//' \
        $(PYRUNDIR)/$(MODULESSETUP) > $(PYTHONDIR)/Modules/Setup; \
else \
    sed -e 's/# @if not macosx: *//' \
        -e 's/# @if not freebsd: *//' \
        $(PYRUNDIR)/$(MODULESSETUP) > $(PYTHONDIR)/Modules/Setup; \
fi;

Setup used to be templated as well in the past, but that logic was removed
at some point. Perhaps it's time to reintroduce it.
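
For illustration, the sed logic above could be expressed in Python roughly as follows. This is a sketch that mirrors the three branches of the Makefile snippet; render_setup and the shortened TEMPLATE are made up for the example:

```python
import re


def render_setup(template, platform=None):
    """Activate '# @if ...:' lines in a Setup template for one platform."""
    if platform == "macosx":
        markers = [r"# @if macosx: *"]
    elif platform == "freebsd":
        markers = [r"# @if freebsd: *"]
    else:
        markers = [r"# @if not macosx: *", r"# @if not freebsd: *"]
    rendered = []
    for line in template.splitlines(keepends=True):
        for marker in markers:
            # Like sed's s/marker//: drop the first match on each line.
            line = re.sub(marker, "", line, count=1)
        rendered.append(line)
    return "".join(rendered)


TEMPLATE = (
    "# @if macosx: TIME_DEFS=\n"
    "# @if not macosx: TIME_DEFS=-lrt\n"
    "time timemodule.c $(TIME_DEFS)\n"
)
linux_setup = render_setup(TEMPLATE)
macos_setup = render_setup(TEMPLATE, "macosx")
```

Lines whose marker is not selected stay commented out, so the rendered file remains a valid Setup file.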

--

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45548] Update Modules/Setup

2021-10-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

FYI: I've been working with a fixed Setup file in PyRun for a long while. There 
are indeed a number of modules missing from Setup, since the whole logic was 
left behind a bit after things moved to setup.py.

The issue with _math.o is actually in the main Makefile.pre.in. The version 
listed there does not match the Makefile lines added via Setup. For PyRun, I 
had to comment out the one in Makefile.pre.in and then only add one instance to 
the math module and not the cmath one. This avoids a (harmless) warning during 
the build.

I'm not sure why the _math.o entry exists in Makefile.pre.in. It's only needed 
by those two modules, AFAIK.

Here's the list of modules I had to add in the past (taken from the 3.10 port):

"""
### Built-in extensions for which there are no entries in Setup.dist/Setup:

# _decimal needs more complex setup, punting on this for now
#DECIMAL_DEFS=-DCONFIG_64=1 -DASM=1
#_decimal \
#   _decimal/_decimal.c \
#   _decimal/libmpdec/basearith.c \
#   _decimal/libmpdec/constants.c \
#   _decimal/libmpdec/context.c \
#   _decimal/libmpdec/convolute.c \
#   _decimal/libmpdec/crt.c \
#   _decimal/libmpdec/difradix2.c \
#   _decimal/libmpdec/fnt.c \
#   _decimal/libmpdec/fourstep.c \
#   _decimal/libmpdec/io.c \
#   _decimal/libmpdec/memory.c \
#   _decimal/libmpdec/mpdecimal.c \
#   _decimal/libmpdec/numbertheory.c \
#   _decimal/libmpdec/sixstep.c \
#   _decimal/libmpdec/transpose.c \
#   $(DECIMAL_DEFS) \
#   -I$(srcdir)/Modules/_decimal \
#   -I$(srcdir)/Modules/_decimal/libmpdec \
#   -I$(prefix)/include -L$(exec_prefix)/lib

# _opcode
_opcode _opcode.c

# _ctypes needs to build libffi first - punting on this

# _lsprof
_lsprof _lsprof.c rotatingtree.c

# _sqlite3
SQLITE_DEFS=-DMODULE_NAME='"sqlite3"' -DSQLITE_OMIT_LOAD_EXTENSION
# @if freebsd: SQLITE_LIBS=-I/usr/local/include -L/usr/local/lib
# @if not freebsd: SQLITE_LIBS=
_sqlite3 \
_sqlite/module.c \
_sqlite/cache.c \
_sqlite/connection.c \
_sqlite/cursor.c \
_sqlite/microprotocols.c \
_sqlite/prepare_protocol.c \
_sqlite/row.c \
_sqlite/statement.c \
_sqlite/util.c \
$(SQLITE_DEFS) -I$(srcdir)/Modules/_sqlite \
$(SQLITE_LIBS) \
-I$(prefix)/include -L$(exec_prefix)/lib \
-lsqlite3

# bz2
_bz2 _bz2module.c -lbz2

# lzma
#
# Note: Adding this can cause serious issues, since the needed lib isn't
# universally installed everywhere. See #1793 and #1794.
#
#_lzma _lzmamodule.c -llzma

# multiprocessing
_multiprocessing \
_multiprocessing/semaphore.c \
_multiprocessing/multiprocessing.c \
-I$(srcdir)/Modules/_multiprocessing

# Optional add-on for multiprocessing to use shared memory
#POSIXSHMEM_LIBS=rt
POSIXSHMEM_LIBS=
_posixshmem \
_multiprocessing/posixshmem.c \
-I$(srcdir)/Modules/_multiprocessing \
$(POSIXSHMEM_LIBS)

# queue
_queue _queuemodule.c
"""

Not all modules are included, since I did not need all missing ones.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45548>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-23 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Hmm, perhaps I should reopen the ticket, even though I now found the cause.

After all, it is possible that lineno is None and inspect.getframeinfo() cannot 
handle it :-)

And it may be worthwhile investigating why recreation of a code object using:

return types.CodeType(co.co_argcount,
  co.co_posonlyargcount,
  co.co_kwonlyargcount,
  co.co_nlocals, co.co_stacksize,
  co.co_flags, co.co_code, co.co_consts,
  co.co_names, co.co_varnames,
  co.co_filename, co.co_name,
  co.co_firstlineno, co.co_lnotab,
  co.co_freevars, co.co_cellvars)

does not necessarily create a valid copy of a code object co.
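
For comparison, a sketch of the same kind of copy using the .replace() method of code objects (Python 3.8+), which carries over all fields that are not explicitly changed. The helper name and the file paths are made up for the example:

```python
import types


def with_filename(code, filename):
    """Return a copy of `code` and its nested code objects with a new filename."""
    consts = tuple(
        with_filename(const, filename) if isinstance(const, types.CodeType) else const
        for const in code.co_consts
    )
    # replace() copies every field that is not passed explicitly.
    return code.replace(co_filename=filename, co_consts=consts)


original = compile("def f():\n    return 41 + 1\n", "/build/tmp/mod.py", "exec")
patched = with_filename(original, "<pyrun>/mod.py")

namespace = {}
exec(patched, namespace)
```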

--
resolution: not a bug -> 
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Turns out this was a bug in the freeze.py script I was using. I had added a 
work-around for a bug in the modulefinder module and even though it should work 
as advertised, it seems to be missing some code object attributes when recreating 
the objects with fixed file paths.

The stdlib version uses the .replace() method which was added in Python 3.8 and 
that appears to work better.

Is it possible that code objects now have some extra attributes in 3.10 which 
aren't exposed ? E.g. things placed into ce_extras ?

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I've looked at how the importlib freeze logic works, compared to 
Tools/freeze/freeze.py.

The only difference I can spot is that importlib uses C to build the C array 
and does a compile followed by a marshal.dumps, whereas freeze.py loads all 
modules into memory and then runs marshal on module.__code__.

Could this cause issues with the line number calculations in 3.10 ?

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I see this in modules frozen by the Tools/freeze/ tool.

The line numbers shown for such frozen modules in the frameinfo stack are a bit 
off as well, compared to normal Python. Could this be an indication that there's 
something not working quite right, which then leads to 
_PyCode_CheckLineNumber() returning -1 ?

The Tools/freeze/ doesn't do anything special, BTW. All it does is load the 
module and then store the marshal'ed code objects in C arrays. The information 
read from those C arrays should be the same as what Python reads from PYC files.
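
The basic round trip can be sketched in a few lines. This is a simplified illustration, not the actual Tools/freeze code; the function names and the M_demo symbol are made up:

```python
import marshal


def freeze_source(source, name):
    """Compile `source` and return the marshal'ed code object as bytes."""
    code = compile(source, f"<frozen {name}>", "exec")
    return marshal.dumps(code)


def as_c_array(data, symbol, per_line=16):
    """Format bytes as a C array definition, similar to a frozen module."""
    lines = [f"static unsigned char {symbol}[] = {{"]
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("    " + ",".join(str(byte) for byte in chunk) + ",")
    lines.append("};")
    return "\n".join(lines)


blob = freeze_source("answer = 6 * 7\n", "demo")
c_source = as_c_array(blob, "M_demo")

# Loading the bytes again should give an equivalent code object, just as
# when the interpreter reads them from a PYC file.
namespace = {}
exec(marshal.loads(blob), namespace)
```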

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Update: I've been trying hard to find a short version which triggers the issue, 
but so far it seems to only trigger when using exec() from a frozen Python 
module.

There don't appear to be many ways frame->f_lineno can end up being -1 (which 
then gets translated into None in Python). _PyCode_CheckLineNumber() is one 
source I found.

Any hints where to look for the cause of this weird effect ?

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

In the case of setuptools, this would be the file setup.py, but I think the 
specific file is not relevant. I can try to come up with a shorter example.

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45562] python -d creates lots of tokenizer messages

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I had left a comment on Github about using stderr instead of stdout, to make 
the output more consistent (other parser error messages go to stderr).

Not sure whether that's something you still want to change before closing the 
issue.

--
stage: patch review -> 

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

To add some more context:

This came up while porting eGenix PyRun to Python 3.10. While installing 
setuptools 58.2.0 via "pyrun setup.py install", an exception was raised in 
getframeinfo().

PyRun uses exec() to run Python code:

def pyrun_exec_code_file(filename, globals_dict, locals_dict=None):
    with open(filename, 'r', encoding='utf-8') as file:
        source = file.read()
    code = compile(source, filename, 'exec', optimize=pyrun_optimized)
    exec(code, globals_dict, locals_dict)

Using pdb, I then found that the top frame does not have f_lineno set in Python 
3.10.

--

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Yes, I know that (at the moment) it's only documented to work in the parser, 
but since Py_DebugFlag is a general purpose flag, this use could easily be 
extended to other parts of the interpreter as well, e.g. for parsing the 
command line or instrumenting the interpreter to collect debug stats.
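
At the Python level, the flag is visible as sys.flags.debug, so debug output in Python code could be gated in a similar way. A minimal sketch; debug_log is a hypothetical helper and it writes to stderr by default:

```python
import io
import sys


def debug_log(message, stream=None):
    """Write `message` to stderr (or `stream`) only when run with -d."""
    if not sys.flags.debug:
        return False
    print(message, file=sys.stderr if stream is None else stream)
    return True


# Without -d, nothing is written and the call returns False.
buffer = io.StringIO()
written = debug_log("tokenizer state ...", stream=buffer)
```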

In any case, thanks for the quick fix, Pablo.

--

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-21 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I have an initial version of PyRun for Python 3.10 running as well.
This created a few more headaches in order to make it work with
setuptools and some glitches which appear to be bugs in 3.10
(https://bugs.python.org/issue45563 and https://bugs.python.org/issue45562).
Nothing major, though.

I'll have to check my version of the freeze tool against the one
in Python 3.9 and 3.10 to see whether there's anything in the
core versions which could cause the tool not to work.

BTW: (My) freeze.py uses this startup code as main():

int
main(int argc, char **argv)
{
    extern int Py_FrozenMain(int, char **);

    /* Disabled, since we want to default to non-optimized mode: */
    /* Py_OptimizeFlag++; */
    Py_NoSiteFlag++;    /* Don't import site.py */

    PyImport_FrozenModules = _PyImport_FrozenModules;
    return Py_FrozenMain(argc, argv);
}

I still have to dig through the changes you have made, but this
suggests that it replaces PyImport_FrozenModules completely
with its own version, so the default freeze that you are
implementing gets overridden.

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-21 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
components: +Interpreter Core, Library (Lib), Parser
nosy: +lys.nikolaou, pablogsal

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-21 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
components: +Parser
nosy: +lys.nikolaou, pablogsal

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-21 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

What's even worse is that those debug lines get written to stdout, not stderr.

--

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45563] inspect.getframeinfo() doesn't handle frames without lineno

2021-10-21 Thread Marc-Andre Lemburg


New submission from Marc-Andre Lemburg :

In Python 3.10, it seems that top-level frames generated by running exec() have 
their f_lineno attribute set to None.

inspect.getframeinfo() tries to build context lines and fails on this line in 
such a case:

start = lineno - 1 - context//2

because lineno is None.

It's not clear whether this is a bug in inspect or the way such frames get 
their f_lineno attribute initialized.

The same code works just fine in Python 3.9.
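
Until this is sorted out, calling code can guard against the missing line number. A sketch of such a guard; frame_summary is a made-up helper, not a proposed fix for inspect:

```python
import inspect


def frame_summary(frame, context=1):
    """Return (filename, lineno, function), tolerating f_lineno being None."""
    if frame.f_lineno is None:
        # Skip getframeinfo(), which needs an integer line number.
        code = frame.f_code
        return (code.co_filename, None, code.co_name)
    info = inspect.getframeinfo(frame, context)
    return (info.filename, info.lineno, info.function)


def probe():
    return frame_summary(inspect.currentframe())


summary = probe()
```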

--
messages: 404674
nosy: lemburg
priority: normal
severity: normal
status: open
title: inspect.getframeinfo() doesn't handle frames without lineno
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue45563>
___



[issue45562] python -d creates lots of tokenizer messages

2021-10-21 Thread Marc-Andre Lemburg


New submission from Marc-Andre Lemburg :

python3.9 -d:

Python 3.9.7 (default, Oct 21 2021, 20:51:19)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Loaded pyinteractive.py.

>>>

python3.10 -d:

Python 3.10.0 (default, Oct 21 2021, 23:13:32) [GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
line[1] = "\"\"\" pyinteractive.py\n"  tok->done = 10
line[2] = "\n"  tok->done = 10
line[3] = "This file is executed on interactive startup by Python\n"  tok->done = 10
...

(note that in both cases a PYTHONINTERACTIVE script is loaded)

Is that intended ?

--
messages: 404673
nosy: lemburg
priority: normal
severity: normal
status: open
title: python -d creates lots of tokenizer messages
versions: Python 3.10

___
Python tracker 
<https://bugs.python.org/issue45562>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-20 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 20.10.2021 16:25, Eric Snow wrote:
> 
> Eric Snow  added the comment:
> 
> On Wed, Oct 20, 2021 at 6:01 AM Marc-Andre Lemburg
>  wrote:
>> I have PyRun mostly working with Python 3.9.
>> ...
>> No changes were necessary to Tools/freeze/.
> 
> Great!  Thanks for getting to that so quickly.  Are you going to take
> a look at 3.10 after you're happy with 3.9?

Yes, 3.10 is next, once I have 3.9 ironed out. And then I'll give
3.11 a try.

>> BTW: Why is test_embed even used for the PGO target ?
> 
> Perhaps I've missed something, but I'm not clear on why PGO would be a
> problem for test_embed.  Are you talking about a specific test in
> test_embed?

Sorry, I wasn't clear. PGO is not a problem for test_embed. I just
wonder why the test_embed tests are run for creating the PGO profile
files. test_embed is far from being a regular work load for Python
applications.

Well, I guess using the test suite for PGO is questionable anyway.
It's just that we don't have anything else handy to create those
profiles at Python build time :-)

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-20 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 19.10.2021 18:47, Marc-Andre Lemburg wrote:
> 
>> On Sat, Oct 16, 2021 at 5:01 AM Marc-Andre Lemburg
>>  wrote:
>>> I can try to port PyRun to 3.9 and 3.10 to see whether I run into any 
>>> issues.
>>> Would that help ?
>>
>> Yeah, that would totally help.
> 
> Ok, I'll start looking into this and post updates here.

I have PyRun mostly working with Python 3.9. Still need to add a few
new C modules, but the basics work.

No changes were necessary to Tools/freeze/. The PGO build complains
about test_embed not working - no surprise there. I'll patch the suite
to ignore the test.

BTW: Why is test_embed even used for the PGO target ?

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-19 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 16.10.2021 21:20, Eric Snow wrote:
> 
> On Sat, Oct 16, 2021 at 5:01 AM Marc-Andre Lemburg
>  wrote:
>> I can try to port PyRun to 3.9 and 3.10 to see whether I run into any issues.
>> Would that help ?
> 
> Yeah, that would totally help.

Ok, I'll start looking into this and post updates here.

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue19459] Python does not support the GEORGIAN-PS charset

2021-10-19 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 19.10.2021 10:44, Serhiy Storchaka wrote:
> 
> Possible solutions (they can be combined):
> 
> 1. Add support for the GEORGIAN-PS charset and all other encodings used in 
> libc (issue22679). The problem is that it is difficult to get the official 
> information about these encodings.

As with all encodings we add: there has to be a real need to support
them natively in Python (as opposed to installing codecs via PyPI)
and we need a definite source for the encoding, e.g. a standards
document from an official body.

IMO, we should not really add more encodings to the stdlib, but instead
point people to e.g. the iconv package:

https://pypi.org/project/python-iconv/

Perhaps we ought to make it easier for such packages to provide
additional codecs even during the startup phase, e.g. via a special
env var which points Python to a list of codec packages to load
prior to initializing the I/O encoding... not sure whether this is
possible, though.

> 2. Fall back to utf-8 or ascii+surrogateescape in case of unsupported locale 
> encoding. But typos can slip unnoticed.

I think this would be a more general solution to such cases, provided
the startup logic issues a visible warning about the fallback.
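
A minimal sketch of what such a fallback could look like at the Python level, assuming a hypothetical helper resolve_encoding(); the real startup logic lives in C and would look different.

```python
import codecs
import warnings


def resolve_encoding(name, fallback="utf-8"):
    """Return the codec name for `name`, or `fallback` with a visible warning."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Make the fallback visible, so that typos do not slip by unnoticed.
        warnings.warn(
            f"unsupported locale encoding {name!r}; falling back to {fallback!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return codecs.lookup(fallback).name


print(resolve_encoding("latin-1"))
print(resolve_encoding("x-no-such-codec"))  # made-up name; warns and falls back
```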

--

___
Python tracker 
<https://bugs.python.org/issue19459>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-16 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 16.10.2021 01:31, Eric Snow wrote:
> 
> @MAL, who's maintaining Tools/freeze?  I'm not aware of who's using it (other 
> than you, of course).  It looks like PyRun isn't compatible with anything 
> newer than 3.5, so it seems like that isn't verifying that Tools/freeze still 
> works.  Neither does it have tests that run in the test suite (nor on 
> buildbots).
> 
> So could Tools/freeze have been broken for a while?  I ask because I haven't 
> been able to get it to work on the master branch (or on 3.10).

I don't know who maintains it, but it's been working fine up until Python 3.8,
which is the last version I ported PyRun to.

There have also been a couple of patches going into the freeze tool, so this is
still on the radar of at least some people other than me. It's also one of the
oldest tools we have in Python and dates back to the early days of Python. Guido
wrote the initial version.

I can try to port PyRun to 3.9 and 3.10 to see whether I run into any issues.
Would that help ?

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45490] [meta][C API] Avoid C macro pitfalls and usage of static inline functions

2021-10-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Meta comment :-) ... wouldn't it be better to enable the Github wiki feature for
such collections ?

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45490>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 14.10.2021 22:56, Eric Snow wrote:
> 
> @MAL, what's the best way to make sure Tools/freeze is still working?  I 
> don't see any tests for it in the test suite.  I tried running the test in 
> Tools/freeze/test, but I can't get that to work on main (or on the 3.10 
> branch).

You'd have to create a frozen binary using the standard way freeze
works. I have never run those tests, so don't know whether they work,
but, of course, made sure that the freeze works as basis for PyRun
and patched it slightly to add features we needed.

One of these days, I need to refactor PyRun into a standalone project
and put it on Github (it's currently integrated into our internal
single repo setup). Then it'll be easier to see the changes I made.
For now, I can only reference the tar file:

https://www.egenix.com/products/python/PyRun/#Download
https://www.egenix.com/products/python/PyRun/#Installation

I can send you an updated version for Python 3.8, if there's
interest.

Essentially, you need to create a Python module which runs your
application, then point freeze.py to it and then compile the
generated .c files using the generated Makefile.
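
Roughly, the steps can be written down as the following sketch. The helper freeze_steps() and all paths are hypothetical, and I'm assuming freeze.py is called with its -o option for the output directory. The function only builds the command lines, it doesn't run anything.

```python
from pathlib import Path


def freeze_steps(cpython_src, app_script, build_dir):
    """Return the two command lines for freezing `app_script`.

    cpython_src, app_script and build_dir are placeholders for your own setup.
    """
    freeze_py = Path(cpython_src) / "Tools" / "freeze" / "freeze.py"
    return [
        # Step 1: let freeze.py generate the .c files and a Makefile.
        ["python3", str(freeze_py), "-o", str(build_dir), str(app_script)],
        # Step 2: compile the generated sources with the generated Makefile.
        ["make", "-C", str(build_dir)],
    ]


for cmd in freeze_steps("/path/to/cpython", "app.py", "build"):
    print(" ".join(cmd))
```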

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45476] [C API] Convert "AS" functions, like PyFloat_AS_DOUBLE(), to static inline functions

2021-10-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 15.10.2021 11:43, STINNER Victor wrote:
> Again, I'm not aware of any performance issue caused by short static inline 
> functions like Py_TYPE() or the proposed PyFloat_AS_DOUBLE(). If there is a 
> problem, it should be addressed, since Python uses more and more static 
> inline functions.
> 
> static inline functions are a common feature of the C language. I'm not sure 
> where your doubts of bad performance come from.

Inlining is something that is completely under the control of the
used compilers. Compilers are free to not inline functions marked for
inlining, which can result in significant slowdowns on platforms
which are e.g. restricted in RAM and thus emphasize small code size,
or where the CPUs have small caches or not enough registers (think
micro-controllers).

The reason why we have those macros is because we want the developers to be
able to make a conscious decision "please inline this code unconditionally
and regardless of platform or compiler". The developer will know better
what to do than the compiler.

If the developer wants to pass control over to the compiler s/he can use
the corresponding C function, which is usually available (and then, in many
cases, also provides error handling).

> Using static inline functions has other advantages. It helps debugging and 
> profiling, since the function name can be retrieved by debuggers and 
> profilers when analysing the machine code. It also avoids macro pitfalls 
> (like abusing a macro to use it as an l-value ;-)).

Perhaps, but then I never had to profile macro use in the past. Instead,
what I typically found was that using macros results in faster code when
used in inner loops, so profiling usually guided me to use macros instead
of functions.

That said, the macros you have inlined so far were all really trivial,
so a compiler will most likely always inline them (the number of machine
code instructions for the call would be more than needed for
the actual operation).

Perhaps we ought to have a threshold for making such decisions, e.g.
number of machine code instructions generated for the macro or so, to
not get into discussions every time :-)

A blanket "static inline is always better than a macro" is not good
enough as an argument, though.

Esp. in PGO driven optimizations the compiler could opt for using
the function call rather than inlining if it finds that the code
in question is not used much and it needs to save space to have
loops fit into CPU caches.

--

___
Python tracker 
<https://bugs.python.org/issue45476>
___



[issue45476] [C API] Convert "AS" functions, like PyFloat_AS_DOUBLE(), to static inline functions

2021-10-15 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I am with Raymond on this one.

If "protecting against wrong use" is the only reason to go down the slippery 
path of starting to rely on compiler optimizations for performance critical 
operations, the argument is not good enough.

If people do use macros in l-value mode, it's their problem when their code 
breaks, not ours. Please don't forget that we are operating under the 
consenting adults principle: we expect users of the CPython API to use it as 
documented and expect them to take care of the fallout, if things break when 
they don't.

We don't need to police developers into doing so.

--

___
Python tracker 
<https://bugs.python.org/issue45476>
___



[issue45382] platform() is not able to detect windows 11

2021-10-08 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 08.10.2021 02:15, Eryk Sun wrote:
> 
>> use the build number as reference instead of the major.minor
> 
> It could check the (major, minor, build) tuple, which allows reporting 10.1+ 
> as "post11" and minimizes hard coding of build numbers. For example, given 
> win32_ver() iterates by (major, minor, build) thresholds:

Great idea.

Could you prepare a PR for this ?

--

___
Python tracker 
<https://bugs.python.org/issue45382>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 07.10.2021 16:40, Eric Snow wrote:
> 
> On Thu, Oct 7, 2021 at 1:17 AM Marc-Andre Lemburg
>  wrote:
>> I'm not sure I follow, but in any case, please make sure that
>> the freeze tool in Tools/ continues to work with the new mechanism.
>>
>> The freeze tool would also need to know which modules are already
>> frozen via the new script, so that modules don't get included twice.
> 
> Will do.

Great, thanks, Eric.
 --
Marc-Andre Lemburg
eGenix.com

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 07.10.2021 15:29, Christian Heimes wrote:
> 
> Christian Heimes  added the comment:
> 
> JP got back to me
> 
> On 07/10/2021 14.34, Jean-Philippe Aumasson wrote:
>> xxHash is much faster indeed, but collisions seem trivial to find, which 
>> might allow hash-flood DoS again (see for example 
>> https://github.com/Cyan4973/xxHash/issues/180). It's however unclear 
>> whether exploitable multicollisions can also be trivially found.
>>
>> If collisions don't matter and if the ~10x speed-up makes a difference, 
>> then probably a good option, but guess you'll need to keep SipHash (or 
>> some other safe hash) when DoS resistance is needed?
> 
> This information disqualifies xxHash for our use case.

The quoted issue was for an early version of XXH3.

Please see
https://github.com/Cyan4973/xxHash/wiki/Collision-ratio-comparison
as reference for collision analysis of the current xxHash versions.

The numbers are close to the expected case, meaning that
collisions are not more frequent than statistically to be
expected given the hash and the test sample size.

Looking at this older comparison, it would also make sense to
revisit the hybrid approach, e.g. use FNV for strings
up to 16 bytes, XXH128 for longer strings:

https://cglab.ca/~abeinges/blah/hash-rs/
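
To make the hybrid idea concrete, here is a pure-Python sketch of the dispatch. Two assumptions: the 16 byte threshold is taken from the text above, and since xxHash is not in the stdlib, a truncated BLAKE2b digest stands in for XXH128 on the long-string path. This only shows the structure and says nothing about performance.

```python
import hashlib

FNV64_OFFSET = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
FNV64_PRIME = 0x100000001B3        # FNV 64-bit prime
MASK64 = (1 << 64) - 1
SHORT_LIMIT = 16                   # bytes; threshold assumed from the text


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: xor each byte into the state, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def long_hash_64(data: bytes) -> int:
    """Stand-in for XXH128 (NOT xxHash): first 8 bytes of a BLAKE2b digest."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")


def hybrid_hash(data: bytes) -> int:
    """Use the cheap hash for short keys and the stronger one for long keys."""
    if len(data) <= SHORT_LIMIT:
        return fnv1a_64(data)
    return long_hash_64(data)


print(hex(hybrid_hash(b"short_key")))
print(hex(hybrid_hash(b"a considerably longer dictionary key")))
```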

Given that dictionaries often use relatively short string keys,
this should have a significant effect on applications where the
dictionaries don't just use interned strings.

It would also have an effect on Python startup time, since all
those interned strings need to have their hash calculated
during startup.

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 07.10.2021 12:48, Christian Heimes wrote:
> 
>> I don't quite follow. Why is it fine that you discuss DoS, but it's not
> fine when others discuss DoS ?
> 
> But this BPO is not about discussing mitigations against DoS attacks in 
> general. It's about adding SipHash-1-3 and following the example of Rust and 
> Ruby.
> 
> If you like to discuss DoS attacks on hashing of numeric types or other 
> mitigations, then please do this in a dedicated ticket. I like to keep this 
> BPO focused on a single topic.

The point that both Victor and I wanted to make is that we have
different views on the relevance of DoS attack mitigations
on selecting the default hash algorithm to use with Python strings
(and other objects which use pyhash.c).

The motivation for moving to siphash 1-3 is performance and we can
potentially get even better performance by looking at today's hash
algorithms and revisiting the decision to go with siphash.

This broadens the discussion, yes, but that can easily be addressed
by changing the title to e.g. "Revisiting the default hash algorithm
for strings".

Since siphash is a crypto hash function, whereas xxhash (and other
faster hash algorithms) are non-crypto hash functions, the topic of
hash collisions which can be used for DoS becomes relevant, so I
don't see why such discussions are off-topic.

With non-crypto hash algorithms available which exhibit good
collision stats and taking into account that DoS can be mitigated
using other ways (which is essential anyway, since Python doesn't
protect against hash-based DoS in all cases), we get to a better Python.

More details on xxhash collision stats:
https://github.com/Cyan4973/xxHash/wiki/Collision-ratio-comparison#collision-study

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

BTW: We already use (a slight variant of) xxHash for tuples: 
https://bugs.python.org/issue34751

The issue is an interesting read, in particular on how xxHash was eventually 
chosen, with a whole set of other hash algorithms in between.

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 07.10.2021 12:16, Christian Heimes wrote:
> 
>> That's certainly true, but at the same time, just focusing on string
> hashes only doesn't really help either, e.g. it is very easy to
> create a DoS with numeric keys or other objects which use trivial
> hashing algorithms.
> 
> Marc-Andre, Victor, your postings are off-topic. Please move your discussion 
> to a new BPO.

I don't quite follow. Why is it fine that you discuss DoS, but it's not
fine when others discuss DoS ?

Your statement comes across as an attempt to shut down a possibly
fruitful discussion. Not sure if you intended it that way.

We could open a new issue to discuss faster alternatives to siphash
and then close this one, or simply rename this issue, if it bothers you
that we're not strictly focusing on siphash 1-3 :-)

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 07.10.2021 11:49, Inada Naoki wrote:
> Hash DoS is not only for HTTP headers. Everywhere creating dict from 
> untrusted source can be attack vector.
> For example, many API servers receive JSON as HTTP request body. Limiting 
> HTTP header don't protect it.

That's certainly true, but at the same time, just focusing on string
hashes only doesn't really help either, e.g. it is very easy to
create a DoS with numeric keys or other objects which use trivial
hashing algorithms.
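
As an illustration of the numeric case: on CPython, the hash of a non-negative int is the value reduced modulo sys.hash_info.modulus, so multiples of that modulus all hash to 0. The helper name below is made up for this sketch.

```python
import sys


def colliding_int_keys(count):
    """Return `count` distinct ints which all share the same hash value (0)."""
    modulus = sys.hash_info.modulus
    return [k * modulus for k in range(1, count + 1)]


keys = colliding_int_keys(5)
print([hash(k) for k in keys])  # all keys land in the same hash bucket chain
```

No string hashing is involved at all, so changing the string hash algorithm does not help against input like this.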

I wouldn't focus too much on this at the Python core level.
Server implementations have other ways to protect themselves against
DoS, e.g. by monitoring process memory, CPU load or runtime, applying
limits on incoming data.

IMO, it's much better to use application and use case specific methods
for this, than trying to fix basic data types in Python to address
the issue and making all Python applications suffer as a result.

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue29410] Moving to SipHash-1-3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Since the days this was discussed, a lot of new and faster hash algorithms have 
been developed. It may be worthwhile looking at those instead.

E.g. xxHash is a lot more performant than siphash: 
https://github.com/Cyan4973/xxHash (the link also has a comparison of hash 
algorithms)

--

___
Python tracker 
<https://bugs.python.org/issue29410>
___



[issue45382] platform() is not able to detect windows 11

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

It's probably time to extend the marketing version detection mechanism to use
the build number as reference instead of the major.minor system version numbers.

Here's a good reference for this:

https://en.wikipedia.org/wiki/List_of_Microsoft_Windows_versions

MS resources:

https://docs.microsoft.com/en-us/windows/win32/sysinfo/operating-system-version
https://docs.microsoft.com/en-us/windows/release-health/release-information

--

___
Python tracker 
<https://bugs.python.org/issue45382>
___



[issue26651] Deprecate register_adapter() and register_converter() in sqlite3

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

FWIW: I'm -1 on removing the possibility to register conversion or adapter 
hooks in sqlite3. Such mechanisms have become a standard with Python database 
modules and are widely used to adapt them to applications or middleware using 
the modules.

The database module defaults don't always work well everywhere and there needs 
to be an efficient way to modify this behavior. Fixing all input to .execute() 
et al. and all output from .fetch*() is not efficient.

I'd suggest closing this as rejected. The deprecation won't do anyone any good.

Related to the few such implementations in dbapi2.py of sqlite3, which I 
believe triggered this issue:

Those are not necessarily ideal, since they don't handle all possible cases.

The adapters (conversion from Python data type to SQLite data type) are always 
used, but the converters (conversion from SQLite data type to Python type) are 
not, since those rely on SQLite providing type information, which it only does 
if you pass in detect_types to the .connect() method. Many people don't notice 
the missing time offset support in the converters due to this (sqlite3 returns 
strings instead).
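
The asymmetry is easy to see with a custom type. In this sketch, the Point class, the "point" column type and the roundtrip() helper are made up for illustration; register_adapter(), register_converter() and detect_types are the regular sqlite3 APIs.

```python
import sqlite3


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


def adapt_point(point):
    """Python -> SQLite: always applied when a Point is passed as parameter."""
    return f"{point.x};{point.y}"


def convert_point(raw):
    """SQLite -> Python: only applied if type detection is enabled."""
    x, y = map(float, raw.split(b";"))
    return Point(x, y)


sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)


def roundtrip(detect_types):
    """Store a Point and read it back using the given detect_types setting."""
    con = sqlite3.connect(":memory:", detect_types=detect_types)
    try:
        con.execute("CREATE TABLE t (p point)")
        con.execute("INSERT INTO t VALUES (?)", (Point(1.0, 2.0),))
        return con.execute("SELECT p FROM t").fetchone()[0]
    finally:
        con.close()


print(type(roundtrip(0)))                        # plain string comes back
print(type(roundtrip(sqlite3.PARSE_DECLTYPES)))  # converter kicks in
```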

Modifying those preconfigured adapters / converters would need to be a separate 
issue, though. E.g. deprecating the date/timestamp converters would be a way 
forward, since applications will know better what to do in their particular use 
case. Such a deprecation would have to run over a longer period, though, for the 
reasons stated by Raymond.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue26651>
___



[issue45395] Frozen stdlib modules are discarded if custom frozen modules added.

2021-10-07 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

I'm not sure I follow, but in any case, please make sure that
the freeze tool in Tools/ continues to work with the new mechanism.

The freeze tool would also need to know which modules are already
frozen via the new script, so that modules don't get included twice.

--

___
Python tracker 
<https://bugs.python.org/issue45395>
___



[issue45382] platform() is not able to detect windows 11

2021-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 05.10.2021 22:30, Steve Dower wrote:
> The version number for "Windows 11" still starts with 10.0. Just like how 
> Windows 5.x and 6.x were around for a very long time each ;)
> 
> There are tables in platform module that map the specific version to the 
> release name. These probably need to be updated to return "11" for versions 
> 10.0.22000 and greater.

Hmm, but the "ver" output seems to have more information than those
APIs.

Note: The tables for mapping to releases for Windows only take the
major.minor versions as key. Unfortunately, those did not change. It's
actually the build version which provides the indicator, it seems.
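
A sketch of how a lookup keyed on (major, minor, build) could work, using the 10.0.22000 threshold Steve mentioned. The function and table names are hypothetical, this is not the platform module's code, and server editions are ignored.

```python
# Thresholds ordered from newest to oldest; the first match wins.
_RELEASE_THRESHOLDS = [
    ((10, 0, 22000), "11"),
    ((10, 0, 0), "10"),
    ((6, 3, 0), "8.1"),
    ((6, 2, 0), "8"),
    ((6, 1, 0), "7"),
    ((6, 0, 0), "Vista"),
]


def marketing_release(major, minor, build):
    """Map a Windows (major, minor, build) version to its marketing name."""
    version = (major, minor, build)
    for threshold, name in _RELEASE_THRESHOLDS:
        if version >= threshold:  # tuple comparison, element by element
            return name
    # Older than anything in the table: fall back to the raw version.
    return f"{major}.{minor}"


print(marketing_release(10, 0, 19043))  # a Windows 10 build
print(marketing_release(10, 0, 22000))  # first Windows 11 build
```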

Any idea, whether a patch will fix this on Windows soonish ?

--

___
Python tracker 
<https://bugs.python.org/issue45382>
___



[issue45382] platform() is not able to detect windows 11

2021-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

win32_ver() should be using the internal Windows APIs to figure out the 
version. I do wonder why those don't return the same version as the "ver" 
command line tool.

Adding our Windows experts to the nosy list.

--
nosy: +lemburg, paul.moore, steve.dower, tim.golden, zach.ware

___
Python tracker 
<https://bugs.python.org/issue45382>
___



[issue45372] Unwarranted "certificate has expired" when urlopen-ing R3 sites

2021-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 05.10.2021 12:48, Aivar Annamaa wrote:
> 
> I can list the root certs with certmgr, but I'm not sure which piece to 
> investigate further. 

Check the certs in the LE chain as listed on the page you quoted
and compare them to the working installation.

> Even if there is problem with installed certs, it's interesting, why doesn't 
> it bother the browsers and requests? Maybe this is opportunity to make 
> something better in urllib?

Browsers and requests use their own list of trusted CAs.
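
urllib goes through the ssl module's default context instead, so it can help to look at what that context has loaded on the failing machine. A small sketch (the helper name is made up; the ssl calls are the standard ones):

```python
import ssl


def describe_default_trust_store():
    """Return where the ssl module looks for CA certs and what it has loaded."""
    ctx = ssl.create_default_context()
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Certificates found via capath are loaded on demand and are
        # therefore not included in these counts.
        "stats": ctx.cert_store_stats(),
    }


info = describe_default_trust_store()
print(info["cafile"], info["capath"])
print(info["stats"])
```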

--

___
Python tracker 
<https://bugs.python.org/issue45372>
___



[issue45372] Unwarranted "certificate has expired" when urlopen-ing R3 sites

2021-10-05 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Are you sure that all updates on the failing machine have been correctly
installed ? It's possible that the list of CA root certs is not up to date
on the machine.

You can use certmgr.msc to check the list of installed CA root certs.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45372>
___



[issue36819] Crash during encoding using UTF-16/32 and custom error handler

2021-09-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 29.09.2021 10:41, Serhiy Storchaka wrote:
> 
> Restricting the returned position to be strictly larger than start would 
> solve the problem with infinite loop and OOM. But this is a different issue.

Yes, this would make sense, since having the codec process
the same error location over and over again will not resolve the
error, so it's clearly a bug in the error handler.

--

___
Python tracker 
<https://bugs.python.org/issue36819>
___



[issue36819] Crash during encoding using UTF-16/32 and custom error handler

2021-09-29 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
nosy: +doerwalter

___
Python tracker 
<https://bugs.python.org/issue36819>
___



[issue36819] Crash during encoding using UTF-16/32 and custom error handler

2021-09-29 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
components: +Unicode
nosy: +ezio.melotti

___
Python tracker 
<https://bugs.python.org/issue36819>
___



[issue36819] Crash during encoding using UTF-16/32 and custom error handler

2021-09-29 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Looking at the specs in PEP 293 (https://www.python.org/dev/peps/pep-0293/), it 
is certainly possible for the error handler to return a newpos outside the 
range start - end, meaning in most cases: a value >= end.

There's a good reason for this: the codec may not be able to correctly 
determine the end of the sequence and so the end value presented by the codec 
is not necessarily a valid start to continue encoding/decoding. The error 
handler can e.g. choose to skip more input characters by trying to find the 
next valid sequence.
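
For reference, a minimal well-behaved handler in the sense of PEP 293. The handler name "sketch-skip" is made up; the point is merely that it returns exc.end as the new position, so the codec always makes progress.

```python
import codecs


def skip_handler(exc):
    """Replace the unencodable run with '?' and continue right after it."""
    if isinstance(exc, UnicodeEncodeError):
        # Returning exc.end (rather than exc.start) guarantees that the
        # codec never sees the same error position twice.
        return ("?", exc.end)
    raise exc


codecs.register_error("sketch-skip", skip_handler)

print("a\u00e4b".encode("ascii", "sketch-skip"))
```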

In the example script, the handler returns start, so the value is within the 
range. A limit would not solve the problem.

It seems that the reallocation logic of the codecs is the main problem here.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue36819>
___



[issue45020] Freeze all modules imported during startup.

2021-09-25 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 25.09.2021 18:20, STINNER Victor wrote:
> 
> STINNER Victor  added the comment:
> 
> Marc-Andre: I suppose that you're talking about LANDMARK in Modules/getpath.c 
> and PC/getpathp.c.

Now that you mention it: yes, that as well :-) But os.py is used in the
Python stdlib code as well, just search for "os.__file__" to see a few
such uses.

If you search for ".__file__" you'll find that there are quite a few
cases in the test suite expecting that attribute on other stdlib modules
as well. The attribute may officially be optional, but in reality a
lot of code expects to find it.

--

___
Python tracker 
<https://bugs.python.org/issue45020>
___



[issue45020] Freeze all modules imported during startup.

2021-09-25 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Eric, I noticed that you are freezing os.py. Please be aware that
this module is often being used as indicator for where the stdlib
was installed (the stdlib itself does this in site.py to read the
LICENSE and the test suite also uses os.__file__ in a couple of
places).

It may be worth changing the stdlib to pick a different module
as landmark for this purpose.

Also: Unless you have added the .__file__ attribute to frozen
modules, much of this landmark checking code will fail... which is
the reason I added the attribute to frozen modules in PyRun.
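For code which only needs the stdlib location, a more defensive lookup is possible. The following is a sketch, not what site.py or the test suite do; the function name stdlib_dir is an assumption, and sysconfig.get_path("stdlib") is used as the fallback when os.__file__ is missing or does not point to a real file.

```python
import os
import sysconfig


def stdlib_dir():
    """Best-effort stdlib location that tolerates a frozen os module."""
    landmark = getattr(os, "__file__", None)
    if landmark and os.path.isfile(landmark):
        # Classic approach: os.py is the landmark for the stdlib directory.
        return os.path.dirname(os.path.abspath(landmark))
    # Fallback when os has no __file__ or the path does not exist on disk.
    return sysconfig.get_path("stdlib")


print(stdlib_dir())
```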

--

___
Python tracker 
<https://bugs.python.org/issue45020>
___



[issue40116] Regression in memory use of shared key dictionaries for "compact dicts"

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 22.09.2021 21:02, Raymond Hettinger wrote:
>> The language specification says that the dicts maintain insertion 
>> order, but the wording implies that this applies only to explicit 
>> dictionaries, not instance attribute or other namespace dicts.
> 
> That is a quite liberal reading of the spec.  I would object to making 
> instance and namespace dicts behave differently.  That would be a behavior 
> regression and we would forever have to wrestle with the difference.

I agree. Keeping the insertion order is essential for many common
use cases, including those where a class or instance dict is used,
e.g. namespaces used for data records, data caches, field
definitions in data records, etc. (and yes, those often can be
dynamically extended as well :-)).

I think for the case you mention, a documentation patch would be
better and more helpful for the programmers. Point them to slots
and the sharing problem should go away in most cases :-)
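A small sketch of the difference, with made-up class names: the plain class keeps its attributes in a per-instance dict (in insertion order, and extensible at runtime), while the slots variant has a fixed attribute set and no per-instance dict at all.

```python
class Record:
    """Plain class: attributes live in a per-instance __dict__."""


class SlottedRecord:
    """Slots class: fixed attribute set, no per-instance __dict__."""

    __slots__ = ("name", "value")


record = Record()
record.value = 1
record.name = "a"
record.extra = True          # dynamic extension is fine here
order = list(vars(record))   # attribute names in insertion order

slotted = SlottedRecord()
slotted.name = "a"
slotted.value = 1
has_dict = hasattr(slotted, "__dict__")

print(order, has_dict)
```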

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue40116>
___



[issue45020] Freeze all modules imported during startup.

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 22.09.2021 20:47, Brett Cannon wrote:
> What about if there isn't a pre-computed location for __file__? I could 
> imagine a self-contained CPython build where there is no concept of a file 
> location on disk for anything using this.

This does work and is enough to make most code out there happy.

I use e.g. "/os.py" in PyRun. There is no os.py file to load,
but tracebacks and inspection tools work just fine with this.
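The effect can be sketched without PyRun by compiling code under a file name that does not exist. The module name and path below are hypothetical; the point is only that the traceback machinery reports the name and line number and simply omits the source line.

```python
import traceback
import types

FAKE_PATH = "/example_frozen_module.py"  # hypothetical; no such file on disk

source = "def boom():\n    raise ValueError('demo')\n"

module = types.ModuleType("example_frozen_module")
module.__file__ = FAKE_PATH
exec(compile(source, FAKE_PATH, "exec"), module.__dict__)

try:
    module.boom()
except ValueError:
    report = traceback.format_exc()

print(report)
```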

--

___
Python tracker 
<https://bugs.python.org/issue45020>
___



[issue45261] Unreliable (?) results from timeit (cache issue?)

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On the topic of average vs. minimum, it's interesting to see this pop up
every now and then. When I originally wrote pybench late in 1997, I used
average, since it gave good results on my PC at the time.

Later on, before pybench was added to Tools/ in Python 2.5 in 2006, people
started complaining about sometimes getting weird results (e.g. negative
times due to the calibration runs not being stable enough). A lot of noise
was made from the minimum fans, so I switched to minimum, which then made
the results more stable, but I left in the average figures as well.

Then some years later, people complained about pybench not being
good enough for comparing to e.g. PyPy, since those other implementations
optimized away some of the micro-benchmarks which were used in pybench.
It was then eventually removed, to not confuse people not willing
to try to understand what the benchmark suite was meant for, nor
understand the issues around running such benchmarks on more modern
CPUs.

CPUs have advanced a lot since the days pybench was written and so
reliable timings are not easy to get unless you invest in dedicated
hardware, custom OS and CPU settings and lots of time to calibrate
everything. See Victor's research for more details.

What we have here is essentially the same issue. timeit() is mostly
being used for micro-benchmarks, but those need to be run in dedicated
environments. timeit() is good for quick checks, but not really up
to the task of providing reliable timing results.
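For such quick checks, it helps to look at both figures side by side. A minimal sketch using timeit.repeat(); the statement being timed and the repeat/number values are arbitrary choices for illustration, and the numbers will vary from run to run and machine to machine.

```python
import statistics
import timeit

# Five independent runs of 1000 executions each.
timings = timeit.repeat("sum(range(100))", repeat=5, number=1000)

best = min(timings)                # least disturbed run
mean = statistics.mean(timings)    # includes noise from the system
spread = max(timings) - best       # rough indicator of how noisy the runs were

print(f"min={best:.6f}s mean={mean:.6f}s spread={spread:.6f}s")
```

A large spread relative to the minimum is a hint that the environment is too noisy for the result to mean much.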

One of these days, we should ask the PSF or one of its sponsors to
provide funding and devtime to set up such a reliable testing
environment. One which runs not only on high-end machines, but also
on average and lower-end machines, and uses different OSes,
so that we can detect performance regressions early and easily
on different platforms.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45261>
___



[issue45213] Frozen modules are looked up using a linear search.

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Perhaps a frozen dict could be used instead of the linear search.

This could then also be made available as sys.frozen_modules for inspection by 
applications and tools such as debuggers or introspection tools trying to find 
source code (and potentially failing at this).
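In pure Python terms, the idea looks roughly like this. It is a sketch only: the table contents and function names are invented, sys.frozen_modules does not exist, and the real frozen table is a C array inside the interpreter. types.MappingProxyType stands in for a read-only ("frozen") dict.

```python
from types import MappingProxyType

# Invented stand-in for the table of frozen modules.
FROZEN_TABLE = [
    ("alpha", b"code-a"),
    ("beta", b"code-b"),
    ("gamma", b"code-c"),
]


def find_linear(name):
    """Linear search: cost grows with the number of frozen modules."""
    for entry_name, payload in FROZEN_TABLE:
        if entry_name == name:
            return payload
    return None


# Read-only mapping built once; lookups are hash based.
FROZEN_INDEX = MappingProxyType(dict(FROZEN_TABLE))


def find_indexed(name):
    """Dict lookup: cost is independent of the table size."""
    return FROZEN_INDEX.get(name)


print(find_linear("gamma"), find_indexed("gamma"), sorted(FROZEN_INDEX))
```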

Not urgent, though.

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45213>
___



[issue45255] sqlite3.connect() should check if the sqlite file exists and throw a FileNotFoundError if it doesn't

2021-09-22 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Such a change would be backwards incompatible and no longer in line with PEP 
249.

I also don't understand what you regard as confusing about the message "unable 
to open database file". The message could be extended to include the path, but 
apart from that, it's as clear as it can get :-)
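Applications which want the path in the message can add it themselves without any change to the module. A minimal sketch; the wrapper name connect_with_path and the directory name are assumptions, and the database path deliberately points into a directory that does not exist so that the open fails.

```python
import os
import sqlite3
import tempfile


def connect_with_path(path):
    """Like sqlite3.connect(), but include the path in the error message."""
    try:
        return sqlite3.connect(path)
    except sqlite3.OperationalError as exc:
        raise sqlite3.OperationalError(f"{exc}: {path!r}") from exc


with tempfile.TemporaryDirectory() as tmp:
    # The parent directory is missing, so SQLite cannot create the file.
    missing = os.path.join(tmp, "no_such_dir", "data.db")
    try:
        connect_with_path(missing)
    except sqlite3.OperationalError as exc:
        message = str(exc)
    else:
        message = ""

print(message)
```

The exception type stays the same, so existing PEP 249 style error handling keeps working.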

--
nosy: +lemburg
status: pending -> open

___
Python tracker 
<https://bugs.python.org/issue45255>
___



[issue45116] Performance regression 3.10b1 and later on Windows: Py_DECREF() not inlined in PGO build

2021-09-17 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

FWIW: Back in the days of Python 1.5.2, the ceval loop was too big for CPU 
caches as well and one of the things I experimented with at the time was 
rearranging the opcodes based on how often they were used and splitting the 
whole switch statement we had back then in two parts. This resulted in a 10-20% 
speedup.

CPU caches have since gotten much larger, but the size of the loop still is 
something to keep in mind and optimize for, as more and more logic gets added 
to the inner loop of Python.

IMO, we should definitely keep forced inlines / macros where they are used 
inside hot loops, perhaps even in all of the CPython code, since the conversion 
to inline functions is mostly for hiding internals from extensions, not to hide 
them from CPython itself.

@neonene: Could you provide more details about the CPU you're using to run the 
tests?

BTW: Perhaps the PSF could get a few sponsors to add more hosts to 
speed.python.org, to provide a better overview. It looks as if the system is 
only compiling on Ubuntu 14.04 and running on an 11 year old system 
(https://speed.python.org/about/). If that's the case, the system uses a server 
CPU with 12MB cache 
(https://www.intel.com/content/www/us/en/products/sku/47916/intel-xeon-processor-x5680-12m-cache-3-33-ghz-6-40-gts-intel-qpi/specifications.html).

--

___
Python tracker 
<https://bugs.python.org/issue45116>
___



[issue45232] ascii codec is used by default when LANG is not set

2021-09-17 Thread Marc-Andre Lemburg


Change by Marc-Andre Lemburg :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue45232>
___



[issue45232] ascii codec is used by default when LANG is not set

2021-09-17 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

On 17.09.2021 15:45, Olivier Delhomme wrote:
> 
> Olivier Delhomme  added the comment:
> 
> Hi Marc-Andre,
> 
> Please note that setting PYTHONUTF8 with "export PYTHONUTF8=1":
> 
> * Is external to the program and user dependent
> * It does not seem to work for my use case:
> 
>   $ unset LANG
>   $ export PYTHONUTF8=1
>   $ python3 
>   Python 3.6.4 (default, Jan 11 2018, 16:45:55) 
>   [GCC 4.8.5] on linux
>   Type "help", "copyright", "credits" or "license" for more information.
>   >>> machaine='help me if you can'
>  File "", line 0
> 
>^
>SyntaxError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal 
> not in range(128)

UTF-8 mode is only supported in Python 3.7 and later:

   https://docs.python.org/3/whatsnew/3.7.html#whatsnew37-pep540
-- 
Marc-Andre Lemburg
eGenix.com

--

___
Python tracker 
<https://bugs.python.org/issue45232>
___



[issue45232] ascii codec is used by default when LANG is not set

2021-09-17 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Yes, this is intended. ASCII is used as fallback in case Python
cannot determine the I/O encoding to use during startup. This is
also the reason why later changes to the environment have no
effect on this - the determination of the encoding has already
been applied.

You can force UTF-8 by enabling the UTF-8 mode:

export PYTHONUTF8=1

This will then have Python use UTF-8 regardless of the LANG
env var setting.
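To check this from a script, one can start a child interpreter with PYTHONUTF8=1 and without LANG, then ask it what it decided at startup. This is a sketch which assumes Python 3.7 or later (sys.flags.utf8_mode does not exist before that); PYTHONIOENCODING is removed from the child environment as well, since it would override the stdout encoding.

```python
import os
import subprocess
import sys

# Child environment: UTF-8 mode on, LANG unset.
env = dict(os.environ, PYTHONUTF8="1")
env.pop("LANG", None)
env.pop("PYTHONIOENCODING", None)

probe = "import sys; print(sys.flags.utf8_mode, sys.stdout.encoding)"
result = subprocess.run(
    [sys.executable, "-c", probe],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)

fields = result.stdout.split()
print(fields)
```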

--
nosy: +lemburg

___
Python tracker 
<https://bugs.python.org/issue45232>
___



[issue45120] Windows cp encodings "UNDEFINED" entries update

2021-09-17 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

Just to be clear: The Python code page encodings are (mostly) taken from the 
unicode.org set of mappings (ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/). 
This is our standards body for such mappings, where possible. In some cases, 
the Unicode consortium does not provide such mappings and we resort to other 
standards (ISO, commonly used mapping files in OSes, Wikipedia, etc).

Changes to the existing mapping codecs should only be done in case corrections 
are applied to the mappings under those names by the standard bodies.

If you want to add variants such as the best fit ones from MS, we'd have to add 
them under a different name, e.g. bestfit1252 (see 
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/).

Otherwise, interop with other systems would no longer work.
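To make the "different name" idea concrete, here is a rough sketch of a separately named variant registered via codecs.register(). Everything in it is an assumption for illustration: the codec name "example_bestfit1252", the two-entry mapping table (a tiny subset, not the MS best fit file) and the approach of pre-mapping characters before handing over to cp1252. The existing cp1252 codec is left untouched and stays strict.

```python
import codecs

# Illustrative subset only; the real best fit table is much larger.
BEST_FIT = {"\u0100": "A", "\u0101": "a"}


def _encode(text, errors="strict"):
    fitted = "".join(BEST_FIT.get(ch, ch) for ch in text)
    return fitted.encode("cp1252", errors), len(text)


def _decode(data, errors="strict"):
    return codecs.decode(bytes(data), "cp1252", errors), len(data)


def _search(name):
    # Hypothetical codec name; only this name is handled here.
    if name == "example_bestfit1252":
        return codecs.CodecInfo(encode=_encode, decode=_decode, name=name)
    return None


codecs.register(_search)

fitted = "\u0100bc".encode("example_bestfit1252")

try:
    "\u0100bc".encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

print(fitted, strict_failed)
```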

From Eryk's description it sounds like we should always add 
WC_NO_BEST_FIT_CHARS as an option to WideCharToMultiByte() in order to make 
sure it doesn't use best fit variants unless explicitly requested.

--

___
Python tracker 
<https://bugs.python.org/issue45120>
___



[issue45020] Freeze all modules imported during startup.

2021-09-04 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

FWIW, I've not found the importer for frozen modules to be lacking
features. When using frozen modules, you don't expect to see source
code, so the whole part about finding source code is not really
relevant for that use case.

The only lacking part I found regarding frozen modules is support
for these in pkgutil.py. But that's easy to add:

--- /home/lemburg/egenix/projects/PyRun/Python-3.8.0/Lib/pkgutil.py 2019-10-14 1>
+++ ./Lib/pkgutil.py    2019-11-17 11:36:38.404752218 +0100
@@ -315,20 +315,27 @@
         return self.etc[2]==imp.PKG_DIRECTORY

     def get_code(self, fullname=None):
+        # eGenix PyRun needs pkgutil to also work for frozen modules,
+        # since pkgutil is used by the runpy module, which is needed
+        # to implement the -m command line switch.
+        if self.code is not None:
+            return self.code
         fullname = self._fix_name(fullname)
-        if self.code is None:
-            mod_type = self.etc[2]
-            if mod_type==imp.PY_SOURCE:
-                source = self.get_source(fullname)
-                self.code = compile(source, self.filename, 'exec')
-            elif mod_type==imp.PY_COMPILED:
-                self._reopen()
-                try:
-                    self.code = read_code(self.file)
-                finally:
-                    self.file.close()
-            elif mod_type==imp.PKG_DIRECTORY:
-                self.code = self._get_delegate().get_code()
+        mod_type = self.etc[2]
+        if mod_type == imp.PY_FROZEN:
+            self.code = imp.get_frozen_object(fullname)
+            return self.code
+        elif mod_type==imp.PY_SOURCE:
+            source = self.get_source(fullname)
+            self.code = compile(source, self.filename, 'exec')
+        elif mod_type==imp.PY_COMPILED:
+            self._reopen()
+            try:
+                self.code = read_code(self.file)
+            finally:
+                self.file.close()
+        elif mod_type==imp.PKG_DIRECTORY:
+            self.code = self._get_delegate().get_code()
         return self.code

     def get_source(self, fullname=None):
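The patch relies on the imp module, which was removed in Python 3.12. As a rough sketch of the same lookup with importlib, the code object of a frozen module can be fetched via importlib.machinery.FrozenImporter. The module name "_frozen_importlib" is used here on the assumption that it is frozen in the running CPython build; this is not a replacement for the pkgutil patch, just an illustration of the lookup.

```python
import types
from importlib.machinery import FrozenImporter

# Assumption: the import bootstrap module is frozen in this build.
name = "_frozen_importlib"

spec = FrozenImporter.find_spec(name)
# Only ask for the code object if the module really is frozen.
code = FrozenImporter.get_code(name) if spec is not None else None

print(spec is not None, type(code).__name__)
```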

--

___
Python tracker 
<https://bugs.python.org/issue45020>
___



[issue45020] Freeze all modules imported during startup.

2021-08-31 Thread Marc-Andre Lemburg


Marc-Andre Lemburg  added the comment:

On 31.08.2021 20:14, Brett Cannon wrote:
> 
> Brett Cannon  added the comment:
> 
>> set __file__ (and __path__) on frozen modules?
> 
> See https://bugs.python.org/issue21736

The patch on that ticket is straight from PyRun, where the
__file__ location is set in a way which signals that the file
does not exist, but instead is baked into the executable:

>>> import os
>>> os.__file__
'/os.py'

Not doing this breaks too many tests in the test suite for no
good reason, which is why I mentioned "practicality beats
purity" in the ticket.

--

___
Python tracker 
<https://bugs.python.org/issue45020>
___


