[Python-Dev] Re: Question about code in test_email/test_message.py

2022-07-07 Thread Stefan Ring
> In that file there is a variable named message_params that is initialized 
> with about 15 different sets of test data, in class TestEmailMessageBase, but 
> is never referenced again. I even grepped for the variable name in all test 
> py files to confirm that it isn't somehow imported somewhere.
>
> Is there something about the test framework that uses that automatically? I'm 
> hesitant to add any testing code for my issue if I'm missing something so 
> fundamental.

Have a look at the "parameterize" decorator (in __init__.py in the
same directory). It looks for things that .endswith('_params').
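
To make the mechanism concrete, here is a simplified sketch of such a class decorator. It is not the code from test_email/__init__.py; the "<stem>_as_<name>" pairing rule and the TestSquares class are assumptions made up for illustration, and the real helper may differ in detail.

```python
import unittest


def parameterize(cls):
    """Simplified sketch: expand '<stem>_params' dicts into test methods."""
    for attr, params in list(vars(cls).items()):
        if not attr.endswith("_params"):
            continue
        stem = attr[: -len("_params")]
        # Pair the parameter dict with every method named '<stem>_as_<name>'.
        for name, func in list(vars(cls).items()):
            if not name.startswith(stem + "_as_"):
                continue
            test_stem = name[len(stem) + len("_as_"):]
            for label, args in params.items():
                # Bind func and args as defaults so each test keeps its own values.
                def test(self, _func=func, _args=args):
                    _func(self, *_args)
                test_name = f"test_{test_stem}_{label}"
                test.__name__ = test_name
                setattr(cls, test_name, test)
    return cls


@parameterize
class TestSquares(unittest.TestCase):
    # Never referenced by name below; the decorator finds it by its suffix.
    square_params = {
        "two": (2, 4),
        "three": (3, 9),
    }

    def square_as_value(self, n, expected):
        self.assertEqual(n * n, expected)


generated = sorted(n for n in vars(TestSquares) if n.startswith("test_"))
print(generated)
```

This is why a grep for the variable name finds nothing: the attribute is discovered by its suffix when the class is decorated, and the generated methods are then collected by unittest like any hand-written test.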
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PJOT32I5SMOHEGMTIOI5RNM6ZT2IDBR7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Can't sync cpython main to my fork

2021-05-06 Thread Stefan Ring
On Thu, May 6, 2021 at 2:46 PM Skip Montanaro  wrote:
>
> I looked at the fast-forward stuff in 'git push --help' but couldn't
> decipher what it told me, or more importantly, how it related to my
> problem. It's not clear to me how python/cpython:main can be behind
> smontanaro/cpython:main.

Just open it in a browser, and it says: "This branch is 186 commits
ahead, 120 commits behind python:main"
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OBC4G2XIRS56CVEW37IKRBVXLUDL27PW/


[Python-Dev] Re: pth file encoding

2021-03-17 Thread Stefan Ring
On Wed, Mar 17, 2021 at 6:37 PM Steve Dower  wrote:
>
> On 3/17/2021 8:00 AM, Michał Górny wrote:
> > How about writing paths as bytestrings in the long term?  I think this
> > should eliminate the necessity of knowing the correct encoding for
> > the filesystem.
>
> That's what we're trying to do, the problem is that they start as
> strings, and so we need to convert them to a bytestring.
>
> That conversion is the encoding ;)
>
> And yeah, for reading, I'd use a UTF-8 reader that falls back to locale
> on failure (and restarts reading the file). But for writing, we need the
> tools that create these files (including Notepad!) to use the encoding
> we want.

A somewhat radical idea carrying this to the extreme would be to use
UTF-16 (LE) on Windows. After all, this _is_ the native file system
encoding, and Notepad will happily read and write it.
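
A rough sketch of how a reader could combine both ideas: accept UTF-16 LE when a byte order mark is present, try UTF-8 otherwise, and fall back to the locale encoding on failure. The helper name decode_pth and its fallback_encoding parameter are hypothetical; this is not what site.py does.

```python
import codecs
import locale


def decode_pth(data, fallback_encoding=None):
    """Decode the raw bytes of a .pth file (hypothetical helper)."""
    # Notepad marks UTF-16 LE files with a byte order mark.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    try:
        # "utf-8-sig" strips a UTF-8 BOM if present and accepts plain UTF-8 too.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Last resort: whatever the locale says, as suggested in the quoted mail.
        if fallback_encoding is None:
            fallback_encoding = locale.getpreferredencoding(False)
        return data.decode(fallback_encoding)


print(decode_pth("C:\\libs\n".encode("utf-8")))
print(decode_pth(codecs.BOM_UTF16_LE + "C:\\libs\n".encode("utf-16-le")))
```

The reading side is the easy half; as noted above, the harder part is getting the tools that write these files to agree on one encoding.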
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WRAW4UI3X3WYMQ3FMIERDKTVD6WKD5S2/


[Python-Dev] Re: Python 0.9.1

2021-02-18 Thread Stefan Ring
On Thu, Feb 18, 2021 at 10:10 AM Larry Hastings  wrote:
> Call me crazy, but... shouldn't they be checked in?  I thought we literally 
> had every revision going back to day zero.  It should be duck soup to 
> recreate the original sources--all you need is the correct revision number.

It seems to be mostly there, but the directory structure is completely
different. And the demo directory, which constitutes a significant
part of the original "distribution", is absent.

> CVS to SVN to HG to GIT, oh my,

Yeah, back in CVS times it was customary to move files around in the
repository. I’m guilty of this myself. ;)
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WT6VDEQANVFLDNXHGJVFJ232MQS6EJQ2/


[Python-Dev] Re: Python 0.9.1

2021-02-17 Thread Stefan Ring
On Wed, Feb 17, 2021 at 7:33 AM Steven D'Aprano  wrote:
>
> On Tue, Feb 16, 2021 at 05:49:49PM -0600, Skip Montanaro wrote:
>
> > If someone knows how to get the original Usenet messages from what Google
> > published, let me know.
>
> I don't have those, but I do have a copy of Python 0.9.1 with unmangled
> scripts.
>
> $ ls -lh Python-0.9.1.tar.gz
> -rwxr-xr-x 1 steve steve 379K Nov  5  2009 Python-0.9.1.tar.gz
>
> I don't remember where I got it from, but it compiled on CentOS release
> 5.11, I'm not sure if it will compile on anything newer.

I guess you got it from here: https://www.python.org/download/releases/early/

Compared to the original, this one has a lot of whitespace changes.
Mostly tabs -> spaces.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WOGPEM2JABEG6ZSCD63YGCMTRBVR4BWZ/


[Python-Dev] Re: Python 0.9.1

2021-02-16 Thread Stefan Ring
It was not that bad, though:
https://github.com/smontanaro/python-0.9.1/compare/main...Ringdingcoder:original
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DYL6AVU6IN4FNGONQO3MSGTFIUENWJWF/


[Python-Dev] Re: Python 0.9.1

2021-02-16 Thread Stefan Ring
> When I see diffs like this (your git vs. the unshar result) I tend to
> trust unshar more:

Sorry, it was not you. I meant the github repo from this e-mail thread.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CZFFUBW52HBIGMNLJZME46EQLMKZVUZA/


[Python-Dev] Re: Python 0.9.1

2021-02-16 Thread Stefan Ring
On Wed, Feb 17, 2021 at 7:33 AM Steven D'Aprano  wrote:
>
> On Tue, Feb 16, 2021 at 05:49:49PM -0600, Skip Montanaro wrote:
>
> > If someone knows how to get the original Usenet messages from what Google
> > published, let me know.
>
> I don't have those, but I do have a copy of Python 0.9.1 with unmangled
> scripts.
>
> $ ls -lh Python-0.9.1.tar.gz
> -rwxr-xr-x 1 steve steve 379K Nov  5  2009 Python-0.9.1.tar.gz
>
> I don't remember where I got it from, but it compiled on CentOS release
> 5.11, I'm not sure if it will compile on anything newer.
>
> Skip, if you would like me to email it to you privately, let me know.
> (Likewise for anyone else.)

The original ones are here:
http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/volume91/Feb/
Look at http://ftp.fi.netbsd.org/pub/misc/archive/alt.sources/index.gz
for the associating subjects with file names. As far as I can tell,
they extract flawlessly using unshar.

When I see diffs like this (your git vs. the unshar result) I tend to
trust unshar more:

--- a/README
+++ b/README
@@ -41,7 +41,7 @@ I am the author of Python:
1098 SJ  Amsterdam
The Netherlands

-   E-mail: gu...@cwi.nl
+   E-mail: gu...@cwi.nl

--- a/doc/mod.tex
+++ b/doc/mod.tex
@@ -17,7 +17,7 @@
 \itembreak
 }

-   itle{\bf
+\title{\bf
Python Library Reference \\
(DRAFT)
 }
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/M22TFWZRACVXGLNHDHLWJ5FHUZBAYDEL/


[Python-Dev] Re: Speeding up CPython

2020-10-21 Thread Stefan Ring
On Wed, Oct 21, 2020 at 3:51 AM Gregory P. Smith  wrote:
>
> meta: i've written too many words and edited so often i can't see my own 
> typos and misedits anymore.  i'll stop now. :)

Haha! Very interesting background, thank you for writing down all of this!
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CFC4XSS5AR2MBEDSQTNTL4GKH63RKVRZ/


[Python-Dev] Re: REPL output bug

2020-06-15 Thread Stefan Ring
> Now run the same code inside the REPL:
>
> Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32
> bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import sys, time
>  >>> for i in range(1,11):
> ...     sys.stdout.write('\r%d' % i)
> ...     time.sleep(1)
> ...
> 12
> 22
> 32
> 42
> 52
> 62
> 72
> 82
> 92
> 103
>  >>>
>
> It appears that the requested characters are output, *followed by* the
> number of characters output
> (which is the value returned by sys.stdout.write) and a newline.
> Surely this is not the intended behaviour.
> sys.stderr behaves the same as sys.stdout.

Why not? I suppose it's intended this way. A behavior change like this
does not happen by accident.

>>> for i in range(3):
...  (lambda: 2)()
...
2
2
2
>>>
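
The same effect can be reproduced outside the REPL, which may make it clearer where the extra output comes from. A small sketch, assuming only the documented behavior of compile() in "single" mode (expression statements are passed to sys.displayhook) versus "exec" mode (their values are discarded):

```python
import io
from contextlib import redirect_stdout

source = "for i in range(3):\n    (lambda: 2)()\n"

# "single" is the mode the interactive prompt uses: the value of every
# expression statement is handed to sys.displayhook, even inside a loop.
interactive = io.StringIO()
with redirect_stdout(interactive):
    exec(compile(source, "<repl>", "single"), {})

# "exec" is the mode used for scripts: the values are simply dropped.
script = io.StringIO()
with redirect_stdout(script):
    exec(compile(source, "<script>", "exec"), {})

print(repr(interactive.getvalue()))
print(repr(script.getvalue()))
```

sys.stdout.write() returns the number of characters written, so at the prompt that return value is echoed just like the 2 from the lambda. Assigning the result (for example `_ = sys.stdout.write(...)`) avoids the echo.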
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YP4YEXASEWEBS5DEPJKX76QVLPUSNHUX/


Re: [Python-Dev] SEC: Spectre variant 2: GCC: -mindirect-branch=thunk -mindirect-branch-register

2018-09-20 Thread Stefan Ring
On Tue, Sep 18, 2018 at 8:38 AM INADA Naoki  wrote:

> I think this topic should split to two topics: (1) Guard Python
> process from Spectre/Meltdown
> attack from other process, (2) Prohibit Python code attack other
> processes by using
> Spectre/Meltdown.

(3) Guard Python from performance degradation by overly aggressive
Spectre "mitigation".
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Why is Python for Windows compiled with MSVC?

2018-02-01 Thread Stefan Ring
> As much as Steve is unlikely to do the work to initiate and
> maintain support of these other tools—whether due to his employer's
> interests or his own—I too was unlikely to do work like this thread is
> asking. In fact, the chances I would have done it were zero because I was
> sitting on my couch upgrading our Visual Studio versions because it let me
> do better stuff at my day job, though I was always open to review patches
> that supported alternatives without major disruption. However, they never
> came. I suspect the same could be said of Martin and anyone else working in
> this area prior to that, because nothing has really changed.

It would be cool though if Microsoft started providing a
cross-compiler running on Linux. This could even be the only compiler
shipped with Visual Studio, now that Windows can run Linux userland.
Cross-compilers from Microsoft would not be totally unheard of. IIRC,
the last DOS versions (Visual C++ 1.5x) were Win32 binaries building
for DOS 16 bit. Technically speaking, using a 32 bit compiler for
building for 64 bit Windows or the other way around would probably
count as cross-compilation anyway.


Re: [Python-Dev] Handle errors in cleanup code

2017-06-13 Thread Stefan Ring
On Tue, Jun 13, 2017 at 2:26 AM, Nathaniel Smith <n...@pobox.com> wrote:
> On Mon, Jun 12, 2017 at 6:29 AM, Stefan Ring <stefan...@gmail.com> wrote:
>>
>> > Yury in the comment for PR 2108 [1] suggested more complicated code:
>> >
>> > do_something()
>> > try:
>> >     do_something_other()
>> > except BaseException as ex:
>> >     try:
>> >         undo_something()
>> >     finally:
>> >         raise ex
>>
>> And this is still bad, because it loses the back trace. The way we do it is:
>>
>> do_something()
>> try:
>>     do_something_other()
>> except BaseException as ex:
>>     tb = sys.exc_info()[2]
>>     try:
>>         undo_something()
>>     finally:
>>         raise ex, None, tb
>
> Are you testing on python 2? On Python 3 just plain 'raise ex' seems
> to give a sensible traceback for me...

Yes, on Python 2.

Interesting to know that this has changed in Python 3. I'll check this
out immediately.
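
For reference, a minimal Python 3 sketch of the pattern under discussion. The function bodies are placeholders invented for the example. In Python 3 the exception object carries its own traceback in __traceback__, so a plain 'raise ex' keeps the frame where the error originated:

```python
import traceback


def do_something_other():
    raise ValueError("original failure")


def undo_something():
    pass  # placeholder cleanup step


def run():
    try:
        do_something_other()
    except BaseException as ex:
        try:
            undo_something()
        finally:
            # Python 3: ex.__traceback__ is still attached, so the three-argument
            # form 'raise ex, None, tb' from Python 2 is no longer needed.
            raise ex


try:
    run()
except ValueError as caught:
    frames = [frame.name for frame in traceback.extract_tb(caught.__traceback__)]
    print(frames)
```

The innermost frame reported is still do_something_other(), which is the information the Python 2 version lost without the explicit traceback argument.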


Re: [Python-Dev] Handle errors in cleanup code

2017-06-12 Thread Stefan Ring
> Yury in the comment for PR 2108 [1] suggested more complicated code:
>
> do_something()
> try:
>     do_something_other()
> except BaseException as ex:
>     try:
>         undo_something()
>     finally:
>         raise ex

And this is still bad, because it loses the back trace. The way we do it is:

do_something()
try:
    do_something_other()
except BaseException as ex:
    tb = sys.exc_info()[2]
    try:
        undo_something()
    finally:
        raise ex, None, tb


Re: [Python-Dev] User + sys time bigger than real time, in case of no real parallelism

2017-02-05 Thread Stefan Ring
> That is usually what I can expect in case of tasks executed in parallel on
> different CPUs. But my example should not be the case, due to the GIL. What
> am I missing? Thank you very much, and sorry again for the OT :(

With such finely intermingled thread activity, there might be a fair
bit of spinlock spinning involved. Additionally, I suspect that the
kernel does not track CPU time at microsecond precision and will tend
to round the used times up.

Obviously, this is not a reasonable way to use threads. The example is
only effective at producing lots of overhead.


Re: [Python-Dev] Have problem when building python3.5.1 rpm with default SPEC file

2017-01-22 Thread Stefan Ring
> now that the SPEC file of fedora is open source, how about redhat's, how 
> could I get it?

Fedora's spec files live here:
http://pkgs.fedoraproject.org/cgit/rpms/python3.git


Re: [Python-Dev] Convert from unsigned long long to PyLong

2016-07-22 Thread Stefan Ring
So to sum this up, you claim that PyLong_FromUnsignedLongLong can
somehow produce a number larger than the value range of a 64 bit
number (0x10000000000000180). I have a hard time believing this.

Most likely you are looking in the wrong place: mysql_affected_rows
returns 2^64-1, and some Python code later adds 0x181 to that number.
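
The arithmetic behind this guess is easy to check. A short sketch (the variable names are made up for the example):

```python
# Largest value a 64-bit unsigned long long can hold; a conversion from
# that C type can never yield anything bigger.
UINT64_MAX = 2**64 - 1

# What Python-level arithmetic would produce if 0x181 were added afterwards.
observed = UINT64_MAX + 0x181

print(hex(UINT64_MAX))
print(hex(observed))
print(observed.bit_length())
```

A result that needs 65 bits therefore has to come from arithmetic done on the Python side, where ints are unbounded, and not from the C conversion itself.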


Re: [Python-Dev] under what circumstances can python still exhibit "high water mark" memory usage?

2015-10-14 Thread Stefan Ring
On Wed, Oct 14, 2015 at 3:11 PM, Chris Withers  wrote:
> I'm having trouble with some python processes that are using 3GB+ of memory
> but when I inspect them with either heapy or meliae, injected via pyrasite,
> those tools only report total memory usage to be 119Mb.
>
> This feels like the old "python high water mark" problem, but I thought that
> was fixed in 2.6/3.0?
> Under what circumstances can a Python process still exhibit high memory
> usage that tools like heapy don't know about?

Which Python version are you experiencing this with? I know that in
Python 2.7, having many floats (and I think also ints) active at once
creates a high water situation. Python 2.7 is what I have experience
with -- with heap sizes around 40 GB sometimes.


Re: [Python-Dev] How do I ensure that my code is being executed?

2015-01-20 Thread Stefan Ring
On Tue, Jan 20, 2015 at 3:35 PM, Neil Girdhar <mistersh...@gmail.com> wrote:
> I get error:
>
> TypeError: init_builtin() takes exactly 1 argument (0 given)
>
> The only source file that can generate that error is
> Modules/_ctypes/_ctypes.c, but when I make changes to that file such as:
>
> PyErr_Format(PyExc_TypeError,
>              "call takes exactly %d arguments XYZABC (%zd given)",
>              inargs_index, actual_args);
>
> I do not see any difference after make clean and a full rebuild.  How is
> this possible?  I need to debug the arguments passed.

The message says "argument", the source code says "arguments" (I
suppose that you only added the "XYZABC"), so this cannot be the source
of this exception.

grep for "given" in ceval.c


Re: [Python-Dev] Issue 22619 at bugs.python.org

2015-01-06 Thread Stefan Ring
On Tue, Jan 6, 2015 at 8:52 AM, Dmitry Kazakov <jsb...@gmail.com> wrote:
> Greetings.
>
> I'm sorry if I'm too insistent, but it's not truly rewarding to
> constantly improve a patch that no one appears to need. Again, I
> understand people are busy working and/or reviewing critical patches,
> but 2 months of inactivity is not right. Yes, I posted a message
> yesterday, but no one seemed to be bothered. In any case, I'll respect
> your decision about this patch and will never ask for a review of this
> patch again.

The later patches seem to miss the Mercurial header that would allow
the integrated review functionality on bugs.python.org to kick in (I
presume) and thus make it much easier to review.


Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Stefan Ring
On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan <ncogh...@gmail.com> wrote:
> On 10 January 2014 13:32, Lennart Regebro <rege...@gmail.com> wrote:
>> No, because your environment have a default language. And Python has a
>> default encoding. You only get problems when some file doesn't use the
>> default encoding.
>
> The reason Python 3 currently tries to rely on the POSIX locale
> encoding is that during the Python 3 development process it was
> pointed out that ShiftJIS, ISO-2022 and various CJK codec are in
> widespread use in Asia, since Asian users needed solutions to the
> problem of representing kana, ideographs and other non-Latin
> characters long before the Unicode Consortium existed.
>
> This creates a problem for Python 3, as assuming utf-8 means we have a
> high risk of corrupting user's data at least in Asian locales, as well
> as anywhere else where non-UTF-8 encodings are common (especially when
> encodings that aren't ASCII compatible are involved).

From my experience, the concept of a default locale is deeply flawed.
What if I log into a (Linux) machine using an old latin-1 putty from
the Windows XP era, have most file names and contents in UTF-8
encoding, except for one directory where people from eastern Europe
upload files via FTP in whatever encoding they choose. What should the
default encoding be now?

That's why I make it a principle to always unset all LC_* and LANG
variables, except when working locally, which happens rather rarely.


Re: [Python-Dev] Python3 complexity

2014-01-09 Thread Stefan Ring
> just became harder to use for that purpose.

The entire discussion reminds me very much of the situation with file
names in OS X. Whenever I want to look at an old zip file or tarball
which happens to have been lying around on my hard drive for a decade
or more, I can't, because OS X insists that file names be encoded in
UTF-8 and just throws errors if that requirement is not met. And
certainly I cannot be required to continually re-encode all files to
the then-favored encoding. Favors don't change often, and I'm willing
to bet that UTF-8 is here to stay, but it has already happened twice
in my active computer life (DOS -> latin-1 -> UTF-8).

Going back to the old tarballs, OS X is completely useless for
handling them as a result of their encoding decision, and I have to
move to a Linux machine which just does not care about encodings.

PS I was very relieved to find out that os.listdir() (just to pick one
file name-related function) will still return bytes if requested, as
it is not at all uncommon (at least for me) to have conflicting file
name encodings in different parts of a filesystem.
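
A brief sketch of what "bytes if requested" looks like in practice. The file names are invented for the example, and the second half works on an in-memory name so that it does not depend on what the local filesystem accepts:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "report.txt"), "w").close()
    names_as_str = os.listdir(tmp)                 # str path in, str names out
    names_as_bytes = os.listdir(os.fsencode(tmp))  # bytes path in, bytes names out

print(names_as_str, names_as_bytes)

# A latin-1 encoded name is not valid UTF-8. The "surrogateescape" error
# handler lets it survive a round trip through str without losing bytes.
latin1_name = "caf\u00e9.txt".encode("latin-1")
as_text = latin1_name.decode("utf-8", "surrogateescape")
round_tripped = as_text.encode("utf-8", "surrogateescape")
print(round_tripped == latin1_name)
```

Working on bytes sidesteps the encoding question entirely, at the cost of having to decode names explicitly wherever they are shown to a user.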


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Stefan Ring
> Yup, in fact, if I hadn't come up with the __read[gf]sword() trick,
> my only other option would have been TLS (or the GetCurrentThreadId
> /pthread_self() approach in the presentation).  TLS is fantastic,
> and it's definitely an intrinsic part of the solution (the "Y" part
> of "if we're a parallel thread, do Y"), but it definitely more
> costly than a simple FS/GS register read.

I think you should be able to just take the address of a static
__thread variable to achieve the same thing in a more portable way.


Re: [Python-Dev] Reworking the GIL

2009-11-23 Thread Stefan Ring
Hello,

I built something very similar for my company last year, and it’s been running
flawlessly in production at a few customer sites since, with avg. CPU usage ~50%
around the clock. I even posted about it on the Python mailing list [1] where
there was almost no resonance at that time. I never posted code, though --
nobody seemed to be too interested.

I am well aware that your current work is a lot more far-reaching than what I’ve
done, which is basically just a FIFO scheduler. I even added scheduling
priorities later which don’t work too great because the amount of time used for
a tick can vary by several orders of magnitude, as you know.

Thought you might be interested.

Regards
Stefan

[1] http://mail.python.org/pipermail/python-dev/2008-March/077814.html
[2] http://www.bestinclass.dk/index.php/2009/10/python-vs-clojure-evolving/
[3] www.dabeaz.com/python/GIL.pdf

PS On a slightly different note, I came across some Python bashing [2] yesterday
and somehow from there to David Beazley’s presentation about the GIL [3]. While
I don’t mind the bashing, the observations about the GIL seem quite unfair to me
because David’s measurements have been made on Mac OS X with its horribly slow
pthreads functions. I was not able to measure any slowdown on Linux.




Re: [Python-Dev] Reworking the GIL

2009-11-23 Thread Stefan Ring
> I built something very similar for my company last year, and it’s been running
> flawlessly in production at a few customer sites since, with avg. CPU usage ~50%
> around the clock. I even posted about it on the Python mailing list [1] where
> there was almost no resonance at that time. I never posted code, though --
> nobody seemed to be too interested.

I've never bothered to make this tidy and nice; the function naming
(PySpecial_*) in particular leaves some things to be desired. It's
not too bad, though; it just doesn't have commit-ready quality. I
don't worry about this anymore, so I'll just post what I have. Maybe
someone can make use of it.

--- Python-2.5.2/Include/pythread.h.scheduling	2006-06-13 17:04:24.0 +0200
+++ Python-2.5.2/Include/pythread.h	2008-10-16 14:46:07.0 +0200
@@ -40,6 +40,20 @@
 PyAPI_FUNC(void *) PyThread_get_key_value(int);
 PyAPI_FUNC(void) PyThread_delete_key_value(int key);
 
+#ifndef _POSIX_THREADS
+#error Requires POSIX threads
+#endif
+
+PyAPI_FUNC(void *) PyThread_mutex_alloc(void);
+PyAPI_FUNC(void) PyThread_mutex_free(void *);
+PyAPI_FUNC(void) PyThread_mutex_lock(void *);
+PyAPI_FUNC(void) PyThread_mutex_unlock(void *);
+
+PyAPI_FUNC(void *) PyThread_cond_alloc(void);
+PyAPI_FUNC(void) PyThread_cond_free(void *);
+PyAPI_FUNC(void) PyThread_cond_wait(void *, void *);
+PyAPI_FUNC(void) PyThread_cond_signal(void *);
+
 #ifdef __cplusplus
 }
 #endif
--- Python-2.5.2/Python/thread.c.scheduling	2006-07-21 09:59:47.0 +0200
+++ Python-2.5.2/Python/thread.c	2008-10-16 14:46:07.0 +0200
@@ -155,6 +155,56 @@
 #endif
 */
 
+void *PyThread_mutex_alloc(void)
+{
+	pthread_mutex_t *m = malloc(sizeof(pthread_mutex_t));
+	if (pthread_mutex_init(m, NULL))
+		Py_FatalError("PyThread_mutex_alloc: pthread_mutex_init failed");
+	return m;
+}
+
+void PyThread_mutex_free(void *m)
+{
+	if (pthread_mutex_destroy(m))
+		Py_FatalError("PyThread_mutex_free: pthread_mutex_destroy failed");
+	free(m);
+}
+
+void PyThread_mutex_lock(void *m)
+{
+	pthread_mutex_lock(m);
+}
+
+void PyThread_mutex_unlock(void *m)
+{
+	pthread_mutex_unlock(m);
+}
+
+void *PyThread_cond_alloc(void)
+{
+	pthread_cond_t *c = malloc(sizeof(pthread_cond_t));
+	if (pthread_cond_init(c, NULL))
+		Py_FatalError("PyThread_cond_alloc: pthread_cond_init failed");
+	return c;
+}
+
+void PyThread_cond_free(void *c)
+{
+	if (pthread_cond_destroy(c))
+		Py_FatalError("PyThread_cond_free: pthread_cond_destroy failed");
+	free(c);
+}
+
+void PyThread_cond_wait(void *c, void *m)
+{
+	pthread_cond_wait(c, m);
+}
+
+void PyThread_cond_signal(void *c)
+{
+	pthread_cond_signal(c);
+}
+
 /* return the current thread stack size */
 size_t
 PyThread_get_stacksize(void)
--- Python-2.5.2/Python/ceval.c.scheduling	2008-01-23 21:09:39.0 +0100
+++ Python-2.5.2/Python/ceval.c	2008-10-16 14:47:07.0 +0200
@@ -210,7 +210,31 @@
 #endif
 #include "pythread.h"
 
-static PyThread_type_lock interpreter_lock = 0; /* This is the GIL */
+typedef void *PySpecial_cond_type;
+
+struct special_linkstruct {
+	PySpecial_cond_type wait;
+	struct special_linkstruct *queue_next, *free_next;
+	int in_use;
+};
+
+typedef void *PySpecial_lock_type;
+
+typedef struct {
+	PySpecial_lock_type the_lock;
+	struct special_linkstruct *wait_queue, *wait_last, *free_queue;
+} PySpecialSemaphore;
+
+void
+PySpecial_init(PySpecialSemaphore *s)
+{
+	s->the_lock = PyThread_mutex_alloc();
+	s->wait_queue = NULL;
+	s->wait_last = NULL;
+	s->free_queue = NULL;
+}
+
+static PySpecialSemaphore *interpreter_lock = NULL; /* This is the GIL */
 static long main_thread = 0;
 
 int
@@ -219,26 +243,100 @@
 	return interpreter_lock != 0;
 }
 
+static PySpecialSemaphore *allocate_special(void)
+{
+	PySpecialSemaphore *s = malloc(sizeof(PySpecialSemaphore));
+	PySpecial_init(s);
+	return s;
+}
+
+static struct special_linkstruct *allocate_special_linkstruct(void)
+{
+	struct special_linkstruct *ls = malloc(sizeof(struct special_linkstruct));
+	ls->wait = PyThread_cond_alloc();
+	ls->queue_next = NULL;
+	ls->free_next = NULL;
+	ls->in_use = 0;
+	return ls;
+}
+
+static void PySpecial_Lock(PySpecialSemaphore *s)
+{
+	struct special_linkstruct *ls;
+
+	PyThread_mutex_lock(s->the_lock);
+
+	if (!s->free_queue)
+		s->free_queue = allocate_special_linkstruct();
+
+	ls = s->free_queue;
+	s->free_queue = ls->free_next;
+
+	if (!s->wait_queue)
+	{
+		ls->in_use = 1;
+		s->wait_queue = ls;
+		s->wait_last = ls;
+		PyThread_mutex_unlock(s->the_lock);
+		return;
+	}
+
+	assert(s->wait_queue != ls);
+	assert(s->wait_last != ls);
+	assert(s->wait_last->queue_next == NULL);
+	assert(!ls->in_use);
+	s->wait_last->queue_next = ls;
+	s->wait_last = ls;
+	ls->in_use = 1;
+
+	while (s->wait_queue != ls)
+		PyThread_cond_wait(ls->wait, s->the_lock);
+
+	PyThread_mutex_unlock(s->the_lock);
+}
+
+static void PySpecial_Unlock(PySpecialSemaphore *s)
+{
+	struct special_linkstruct *ls;
+
+	PyThread_mutex_lock(s->the_lock);
+	ls = s->wait_queue;
+	assert(ls->in_use);
+
+	s->wait_queue = ls->queue_next;
+	if (s->wait_queue)
+	{
+		

[Python-Dev] Improved thread switching

2008-03-19 Thread Stefan Ring

The company I work for has over the last couple of years created an
application server for use in most of our customer projects. It embeds Python
and most project code is written in Python by now. It is quite resource-hungry
(several GB of RAM, MySQL databases of 50-100GB). And of course it is
multi-threaded and, at least originally, we hoped to make it utilize multiple
processor cores. Which, as we all know, doesn't sit very well with Python. Our
application runs heavy background calculations most of the time (in Python)
and has to service multiple (few) GUI clients at the same time, also using
Python. The problem was that a single background thread would increase the
response time of the client threads by a factor of 10 or (usually) more.

This led me to add a dirty hack to the Python core to make it switch threads
more frequently. While this hack greatly improved response time for the GUI
clients, it also slowed down the background threads quite a bit. top would
often show significantly less CPU usage -- 80% instead of the more usual 100%.

The problem with thread switching in Python is that the global semaphore used
for the GIL is regularly released and immediately reacquired. Unfortunately,
most of the time this leads to the very same thread winning the race on the
semaphore again and thus more wait time for the other threads. This is where
my dirty patch intervened and just did a nanosleep() for a short amount of
time (I used 1000 nsecs).

I have then created a better scheduling scheme and written a small test
program that nicely mimics what Python does for some statistics. I call the
scheduling algorithm the round-robin semaphore because threads can now run in
a more or less round-robin fashion. Actually, it's just a semaphore with FIFO
semantics.
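
To illustrate the idea in a few lines, here is a Python sketch of a lock with FIFO hand-off. It is only an illustration of the scheduling scheme, not the C++ test program or the CPython patch; the class name FifoLock and the little demo around it are made up for this example.

```python
import threading
import time
from collections import deque


class FifoLock:
    """Mutex that passes ownership to waiters strictly in arrival order."""

    def __init__(self):
        self._mutex = threading.Lock()   # protects the two fields below
        self._waiters = deque()          # one Event per waiting thread, oldest first
        self._held = False

    def acquire(self):
        with self._mutex:
            if not self._held:
                self._held = True
                return
            ticket = threading.Event()
            self._waiters.append(ticket)
        ticket.wait()                    # release() hands ownership over directly

    def release(self):
        with self._mutex:
            if self._waiters:
                # Wake exactly the oldest waiter; _held stays True, so the
                # releasing thread cannot win the lock back ahead of it.
                self._waiters.popleft().set()
            else:
                self._held = False


lock = FifoLock()
order = []


def worker(i):
    lock.acquire()
    order.append(i)
    lock.release()


lock.acquire()
threads = []
for i in range(3):
    t = threading.Thread(target=worker, args=(i,))
    t.start()
    while len(lock._waiters) <= i:       # wait until worker i has queued up
        time.sleep(0.001)
    threads.append(t)
lock.release()
for t in threads:
    t.join()
print(order)
```

The important difference from release-and-reacquire is the direct hand-off in release(): the lock never becomes free while someone is waiting, so the thread that just gave it up has to queue behind everyone else.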

The implementation problem with the round-robin semaphore is the __thread
variable I had to use because I did not want to change the signature of the
Enter() and Leave() methods. For CPython, I have replaced this thread-local
allocation with an additional field in the PyThreadState. Because of that, the
patch for CPython I have already created is a bit more involved than the
simple nanosleep() hack. Consequently, it's not very polished yet and not at
all as portable as the rest of the Python core.

I now show you the results from the test program, which compares all three
scheduling mechanisms -- standard Python, my dirty hack and the new
round-robin semaphore. I also show you the test program containing the three
implementations nicely encapsulated.

The program was run on a quad-core Xeon 1.86 GHz on Fedora 5 x86_64. The first
three lines of the output (including the name of the algorithm) should be
self-explanatory. The fourth and fifth show the distribution of wait times
for the individual threads. The ideal distribution would have everything in the
column for the number of threads (2 in this case) and zero everywhere else. As
you can see, the round-robin semaphore is pretty close to that. Also, because
of the high thread switching frequency, we could lower Python's checkinterval
-- the jury is still out on the actual value, likely something between 1000
and 1.

I can post my Python patch if there is enough interest.

Thanks for your attention.


Synch: Python lock
iteration count: 24443
thread switches: 10
     1     2     3     4     5     6     7     8     9    10   -10   -50  -100   -1k  more
 24433     0     0     0     0     0     0     0     0     0     0     1     1     6     0

Synch: Dirty lock
iteration count: 25390
thread switches: 991
     1     2     3     4     5     6     7     8     9    10   -10   -50  -100   -1k  more
 24399    10     0     0     0     0     1     0     1     0   975     1     1     0     0

Synch: round-robin semaphore
iteration count: 23023
thread switches: 22987
     1     2     3     4     5     6     7     8     9    10   -10   -50  -100   -1k  more
    36 22984     0     0     0     0     0     0     0     0     1     0     0     0     0
// compile with g++ -g -O0 -pthread -Wall p.cpp

#include <pthread.h>
#include <semaphore.h>

#include <stdio.h>
#include <stdlib.h>

#include <string.h>
#include <errno.h>
#include <assert.h>

//
// posix stuff

class TMutex {
    pthread_mutex_t mutex;

    static pthread_mutex_t initializer_normal;
    static pthread_mutex_t initializer_recursive;
    TMutex(const TMutex &);
    TMutex &operator=(const TMutex &);
public:
    TMutex(bool recursive = true);
    ~TMutex() { pthread_mutex_destroy(&mutex); }
    void Lock() { pthread_mutex_lock(&mutex); }
    bool TryLock() { return pthread_mutex_trylock(&mutex) == 0; }
    void Unlock() { pthread_mutex_unlock(&mutex); }

    friend class TCondVar;
};

class TCondVar {
    pthread_cond_t cond;

    static pthread_cond_t initializer;
    TCondVar(const TCondVar &);
    TCondVar &operator=(const TCondVar &);
public:
    TCondVar();
    ~TCondVar() { pthread_cond_destroy(&cond); }
    void Wait(TMutex *mutex) { pthread_cond_wait(&cond,

Re: [Python-Dev] Improved thread switching

2008-03-19 Thread Stefan Ring
Adam Olsen <rhamph at gmail.com> writes:

> Can you try with a call to sched_yield(), rather than nanosleep()?  It
> should have the same benefit but without as much performance hit.
>
> If it works, but is still too much hit, try tuning the checkinterval
> to see if you can find an acceptable throughput/responsiveness
> balance.

I tried that, and it had no effect whatsoever. I suppose it would have an effect
on a single CPU or an otherwise heavily loaded SMP system, but that's not the
scenario we care about.
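
For reference, a sketch of the variant being discussed, under the same assumptions as any illustration of this kind: `gil_stand_in` is a plain pthread mutex standing in for the interpreter lock, and `yield_gil_with_sched_yield` is a hypothetical name, not code from the actual patch.

```cpp
#include <pthread.h>
#include <sched.h>
#include <cassert>

// Stand-in for the interpreter lock (assumption, not the real GIL).
static pthread_mutex_t gil_stand_in = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical helper: release the lock, offer the CPU to other runnable
// threads, then take the lock again. Returns the result of sched_yield().
int yield_gil_with_sched_yield() {
    int rc = pthread_mutex_unlock(&gil_stand_in);
    assert(rc == 0);

    // With no other runnable thread queued on this CPU the call returns
    // at once, so the releasing thread can grab the lock straight back.
    int yielded = sched_yield();

    rc = pthread_mutex_lock(&gil_stand_in);
    assert(rc == 0);
    return yielded;
}
```

On a lightly loaded multi-core machine the waiting threads are asleep on other cores rather than queued behind the caller, which would explain why this made no measurable difference here.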



Re: [Python-Dev] Improved thread switching

2008-03-19 Thread Stefan Ring
Adam Olsen <rhamph at gmail.com> writes:

> On Wed, Mar 19, 2008 at 10:09 AM, Stefan Ring <s.r at visotech.at> wrote:
> > Adam Olsen <rhamph at gmail.com> writes:
> >
> > > Can you try with a call to sched_yield(), rather than nanosleep()?  It
> > > should have the same benefit but without as much performance hit.
> > >
> > > If it works, but is still too much hit, try tuning the checkinterval
> > > to see if you can find an acceptable throughput/responsiveness
> > > balance.
> >
> > I tried that, and it had no effect whatsoever. I suppose it would have an
> > effect on a single CPU or an otherwise heavily loaded SMP system, but
> > that's not the scenario we care about.
>
> So you've got a lightly loaded SMP system?  Multiple threads all
> blocked on the GIL, multiple CPUs to run them, but only one CPU is
> active?  In that case I can imagine how sched_yield() might finish
> before the other CPUs wake up a thread.
>
> A FIFO scheduler would be the right thing here, but it's only a short
> term solution.  Care for a long term solution? ;)
>
> http://code.google.com/p/python-safethread/


I've already seen that, but it would not help us in our current
situation. The performance penalty really is too heavy. Our system is
slow enough already ;). And it would be very difficult, bordering on
impossible, to parallelize. Plus, I can imagine that all extension modules
(and our own code) would have to be adapted.

The FIFO scheduler is perfect for us because the load is typically quite
low. It's mostly at those times when someone runs a lengthy calculation
that all other users suffer greatly increased response times.




Re: [Python-Dev] Improved thread switching

2008-03-19 Thread Stefan Ring
Adam Olsen <rhamph at gmail.com> writes:

> So you want responsiveness when idle but throughput when busy?

Exactly ;)

> Are those calculations primarily python code, or does a C library do
> the grunt work?  If it's a C library you shouldn't be affected by
> safethread's increased overhead.

It's Python code all the way. Frankly, it's a huge mess, but it would be very,
very hard to come up with a scalable solution that would allow us to optimize
certain hotspots and redo them in C or C++. There isn't even anything left to
optimize in particular, because all the low-hanging fruit has already been
taken care of. So it's just ~30 kLOC of Python code over which the total time
spent is quite uniformly distributed :(.
