[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-10 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset d914596a671c4b0f13641359cf43aa0d6fc05070 by Raymond Hettinger 
(Christian Heimes) in branch 'master':
bpo-36559: random module: optimize sha512 import (GH-12742)
https://github.com/python/cpython/commit/d914596a671c4b0f13641359cf43aa0d6fc05070


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-10 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-10 Thread Inada Naoki


Inada Naoki  added the comment:

* Is hashlib large, slow to import library? -- Yes.
* random module use hashlib only for specific (rare) use case? -- Yes.
* Does user of random module needs hashlib in most case? -- No.  For example, 
tmpfile user may not need hashlib.

I'm +1 on PR 12728.

--
nosy: +inada.naoki

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> You could also use the internal _sha512 module. 
> It's always present, small, lean and provides a SHA512
> implementation with sufficient performance.

I suppose we could do this but it borders on telling folks that we're worried 
about using our own public APIs, that importing hashlib is bad for them.  It 
shouldn't be that way, hashlib is a collection of hash functions -- it is clear 
why this import isn't small and fast.  It suggests that something is wrong with 
the implementation.

The focus on client code in random seems like the wrong focus.  That is just 
typical of what other clients would do.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread STINNER Victor


STINNER Victor  added the comment:

Note: *Technically*, you can disable the compilation of the _sha512 module 
using "*disabled*" in Modules/Setup, but I'm not sure if it's a common use 
case. At least, it makes sense to me when we are sure that OpenSSL and _hashlib 
are available ;-) I didn't want to mention that since I'm not sure that it's 
really relevant in this discussion. (I was aware of the regression, and 
hopefully it's now fixed!)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread Christian Heimes


Christian Heimes  added the comment:

Thanks for pointing to the other issue. This is clearly a regression and should 
be fixed ASAP. I have ACKed your PR and pushed it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread Xavier de Gaye


Xavier de Gaye  added the comment:

> You could also use the internal _sha512 module. It's always present

This is not true at the moment, the _sha512 module is not present when openssl 
is missing. This is a bug in setup.py that prevents building the _sha512 module 
when openssl is missing. See issue 36544.

It is possible that this issue (since it started because of hashlib failures) 
and also issue 36414 are caused by this problem.

--
nosy: +xdegaye

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread Christian Heimes


Change by Christian Heimes :


--
pull_requests: +12665

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-09 Thread Christian Heimes


Christian Heimes  added the comment:

You could also use the internal _sha512 module. It's always present, small, 
lean and provides a SHA512 implementation with sufficient performance.

--
nosy: +christian.heimes

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

One other thought. This code has been present for over a decade.  There is no 
evidence that anyone has ever wanted random to defer one of its imports. This 
seems like an invented problem.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

In general, deferred imports are code smell that should avoided unless really 
necessary. They create an on-going maintenance burden (there's a reason most 
modules don't do this and put their imports at the top).

FWIW, a broken hashlib is a localized bug, not an optimization problem. It 
doesn't affect any user with a build that passes the test suite.

Running "python -v" shows that "random" is not part of the normal startup, so 
deferring the import saves zero for normal startup. It only affects modules 
that specifically import random.

IIRC, Mercurial uses hashing extensively, so deferring the import doesn't help 
them at all.

This is minor change, so I suppose we could let it go through; however, it 
seems somewhat arbitrary and the reasons offered seem dubious. For the most 
part, it isn't a good practice.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread STINNER Victor


STINNER Victor  added the comment:

Raymond:
> Why do we care about this particular import?  It doesn't see slow in any way.

I ran a benchmark:

$ ./python -m perf command -o ref.json -v -- ./python -c 'import random'
$ # apply patch
$ ./python -m perf command -o patch.json -v -- ./python -c 'import random'
$ ./python -m perf compare_to ref.json patch.json 
Mean +- std dev: [ref] 13.8 ms +- 0.2 ms -> [patch] 11.8 ms +- 0.2 ms: 1.18x 
faster (-15%)

My PR 12728 makes Python startup 2 ms faster which I consider as quite 
significant on 13.8 ms.

I know that some users are fighting to get a faster Python startup. Mercurial 
is just one example ;-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread STINNER Victor


STINNER Victor  added the comment:

Raymond:
> In general, we don't do deferred imports unless there is a compelling reason 
> (i.e. it is very slow or it is sometimes unavailable).

While I was working on adding OpenSSL 1.1.1 to Python 3.4, my _hashopenssl 
module was broken. In that case, "import random" fails with ImportError because 
of hashlib failures. I was surprised, since random doesn't need hashlib at 
startup.

I would like to be able to use "import random" even if hashlib is broken. For 
me, random is a key component, whereas I see hashlib more as optional. (Even if 
in practice, it should always be available).


Raymond:
> Otherwise, it is a false optimization.

Brett:
> Could you explain a bit more, Victor, about why you want to avoid importing 
> hashlib and OpenSSL so much?

Well, OpenSSL is not a random tiny library. For example, loading it increases 
Python RSS of around 2.7 MiB.

Example with script x.py:
---
import os
os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
import random
os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
---

Output without the change on Fedora 29:

VmRSS:  7396 kB
VmRSS: 11796 kB  # +4.4 MiB

With the change:

VmRSS:  7272 kB
VmRSS:  8988 kB  # +1.7 MiB


Another example is that OpenSSL loads the libz.so dynamic library by dependency.

I would prefer to minimize Python footprint if possible.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread Brett Cannon


Brett Cannon  added the comment:

Could you explain a bit more, Victor, about why you want to avoid importing 
hashlib and OpenSSL so much?

--
nosy: +brett.cannon

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

Why do we care about this particular import?  It doesn't see slow in any way.  
In general, we don't do deferred imports unless there is a compelling reason 
(i.e. it is very slow or it is sometimes unavailable).  Otherwise, it is a 
false optimization.

--
nosy: +rhettinger

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread STINNER Victor


STINNER Victor  added the comment:

In the past, some developers complained when an import has been removed in a 
stdlib module. I vaguely recall code using "import os" to get the "errno" 
module from "os.errno". That's "What's New in Python 3.7" contains:

"Several undocumented internal imports were removed. One example is that 
os.errno is no longer available; use import errno directly instead. Note that 
such undocumented internal imports may be removed any time without notice, even 
in micro version releases."

For this reason, I don't think that stable versions (2.7 and 3.7) should be 
modified.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +12651
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36559] "import random" should import hashlib on demand (nor load OpenSSL)

2019-04-08 Thread STINNER Victor


New submission from STINNER Victor :

Currently, when the random module is imported, the hashlib module is always 
imported which loads the OpenSSL library, whereas hashlib is only needed when a 
Random() instance is created with a string seed.

For example, "rnd = random.Random()" and "rnd = random.Random(12345)" don't 
need hashlib.

Example on Linux:

$ python3
Python 3.7.2 (default, Mar 21 2019, 10:09:12) 
>>> import os, sys

>>> 'hashlib' in sys.modules
False
>>> res=os.system(f"grep ssl /proc/{os.getpid()}/maps")

>>> import random
>>> 'hashlib' in sys.modules
True
>>> res=os.system(f"grep ssl /proc/{os.getpid()}/maps")
7f463ec38000-7f463ec55000 r--p  00:2a 5791335
/usr/lib64/libssl.so.1.1.1b
7f463ec55000-7f463eca5000 r-xp 0001d000 00:2a 5791335
/usr/lib64/libssl.so.1.1.1b
...


Attached PR only imports hashlib on demand.

Note: I noticed this issue while working on adding OpenSSL 1.1.1 support to 
Python 3.4 :-)

--
components: Library (Lib)
messages: 339637
nosy: vstinner
priority: normal
severity: normal
status: open
title: "import random" should import hashlib on demand (nor load OpenSSL)
versions: Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com