Re: [Python-Dev] Variant of removing GIL.

2005-09-17 Thread Luis P Caamano
On 9/17/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 
 Message: 9
 Date: Fri, 16 Sep 2005 21:07:23 -0500
 From: [EMAIL PROTECTED]
 Subject: Re: [Python-Dev] Variant of removing GIL.
 Message-ID: [EMAIL PROTECTED]
 
 
 Martin> However, this is really hard to do correctly - if it were
 Martin> simple, it would have been done long ago.
 
 I don't believe difficulty is the only (or primary) barrier.  I think
 *someone* would have tackled it since Greg Stein did back in 1.4(?) or his
 free-threading changes would have been incorporated into the core had they
 yielded speedups on multiprocessors and not hurt performance on
 uniprocessors.
 
 Skip
 

It did yield speedups on multiprocessors.  The uniprocessor part
could've been solved the way most kernels do it: one binary for
UP and another for MP.  That's what IBM, Red Hat, Sun, and almost
every other vendor of modern kernels that support SMP machines
do.

In theory, if we had those changes in the CPython interpreter, we
could be running at 1.6 times the speed on dual-processor
machines today (according to Greg's benchmark data), and at the
same speed on UP machines running a UP-compiled CPython
interpreter, which would omit the locking calls that a UP machine
doesn't need and that would otherwise hurt its performance.

By now, we probably could also have improved the scalability of
MP performance on machines with more than three processors.

Mind you, though, I'm not trying to oversimplify the issue.
I was not using Python yet at that time (I started around
1.5/1.6) and I didn't see all the information involved in the
decision-making process, so I'm sure there were other issues
that contributed to the decision not to keep Greg's free-threading
changes.

My point is that not yielding speedups on multiprocessors
and hurting performance on uniprocessors is not a good
or valid reason to drop free-threading.

-- 
Luis P Caamano
Atlanta, GA USA


Re: [Python-Dev] Variant of removing GIL.

2005-09-17 Thread Phillip J. Eby
At 12:32 PM 9/17/2005 -0400, Luis P Caamano wrote:
My point is that not yielding speedups on multiprocessors
and hurting performance on uniprocessors is not a good
or valid reason to drop free-threading.

It is if you have only volunteers to maintain the code base, and the 
changes significantly increase maintenance complexity.  Also, a significant 
number of third-party C extensions would need to be modified for 
compatibility, as has already been pointed out.

Note also that Jython and IronPython exist, and run on VMs that address 
these issues, and that the PyPy project can generate code for many kinds of 
backends.  There's nothing stopping anybody from creating a 
multiprocessor-friendly backend for PyPy, for example.



Re: [Python-Dev] Variant of removing GIL.

2005-09-17 Thread Martin v. Löwis
Luis P Caamano wrote:
 Mind you, though, I'm not trying to oversimplify the issue.
 I was not using Python yet at that time (I started around
 1.5/1.6) and I didn't see all the information involved in the
 decision-making process, so I'm sure there were other issues
 that contributed to the decision not to keep Greg's free-threading
 changes.

For historical correctness, I believe there never was a
decision not to keep Greg's free-threading changes.
I believe Greg never actually contributed them (at least
not in a publicly visible manner). This, in turn, appears
to be the result of the problem that nobody (including
Greg) was able to tell whether the patches were actually
correct (for extension modules, it appears there was
agreement that the patches are *not* correct).

(More precisely, it appears that some of the code did make
 it into Python 1.5.)

Instead, the issue mainly died because nobody provided
working code (along with a strategy on what to do with
the existing extension modules).

Regards,
Martin


Re: [Python-Dev] Variant of removing GIL.

2005-09-17 Thread Luis P Caamano
On 9/17/05, Phillip J. Eby [EMAIL PROTECTED] wrote:
 At 12:32 PM 9/17/2005 -0400, Luis P Caamano wrote:
 My point is that not yielding speedups on multiprocessors
 and hurting performance on uniprocessors is not a good
 or valid reason to drop free-threading.
 
 It is 

No, it's not, because it's not true.

 if you have only volunteers to maintain the code base, and the
 changes significantly increase maintenance complexity.  

This is one very valid reason.

 Also, a significant
 number of third-party C extensions would need to be modified for
 compatibility, as has already been pointed out.

And another.

 
 Note also that Jython and IronPython exist, and run on VMs that address
 these issues, and that the PyPy project can generate code for many kinds of
 backends.  There's nothing stopping anybody from creating a
 multiprocessor-friendly backend for PyPy, for example.

Yes, eventually when we have CPython running on more SMP
or equivalent machines than UP machines, the majority will
have an itch and it will get scratched.

-- 
Luis P Caamano
Atlanta, GA USA


Re: [Python-Dev] Variant of removing GIL.

2005-09-17 Thread Luis P Caamano
On 9/17/05, Martin v. Löwis [EMAIL PROTECTED] wrote:

 
 Instead, the issue mainly died because nobody provided
 working code (along with a strategy on what to do with
 the existing extension modules).
 

When I first started writing Python code I had just come out
of about six years of kernel development (1994-2000).  Those years
spanned the period when we were changing UP kernels to run
correctly and efficiently on SMP machines, implementing
kernel threads, and thread-safing libc, along with the other changes
needed to move from the now-defunct user-space DCE threads to POSIX
threads, including kernel threads and even MxN support.  So I was very
familiar with threaded programming.

I architected, designed, and developed a bunch of distributed
servers and services using Python and Pyro, all of it built on
solid threaded-programming practices and techniques.
It was nice to see that Python supported threads.

Close to a year into this, when we had a lot of functionality
implemented, I started running scalability tests and was shocked
and appalled to find that our architecture did not scale well on
SMP machines because it depended on threads.  That's when I
discovered the GIL, and Twisted, and the other process-based Python
techniques used to get around the GIL.  It was too late for us to
redo things, and we've been in a holding pattern since then, waiting
to see which hits us first: the need to scale a lot, which means
we'd have to spend some time implementing process-based scalability,
or the removal of the GIL.  I have my money on process-based
scalability. :(

I'm sure that has happened to a lot of people, because nobody finds
out about the GIL at the beginning of their Python development
experience.  If I were starting a new complex app from scratch
now, I would know exactly what to do.

One big problem is that a lot of people defend the GIL on the
premise that it is OK because C extensions release it around
blocking calls and I/O.  That is true; the GIL is not as
bad as it would have been if the interpreter and its extensions
didn't do that, but it is also true that it hurts performance on
SMP machines in more than one way.  When I found out about
this, it dawned on me that performance could have been worse,
although it was still way below what we expected.
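
(For readers who haven't seen it, this is the standard idiom being
referred to: a C extension wraps a blocking call in
Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS so other threads can run.
Just a minimal sketch; the function itself is made up:)

    #include <Python.h>
    #include <unistd.h>

    /* Hypothetical extension function: read from a file descriptor
     * without holding the GIL during the blocking read(2) call. */
    static PyObject *
    ext_read(PyObject *self, PyObject *args)
    {
        int fd;
        ssize_t n;
        char buf[4096];

        if (!PyArg_ParseTuple(args, "i", &fd))
            return NULL;

        Py_BEGIN_ALLOW_THREADS          /* release the GIL ...          */
        n = read(fd, buf, sizeof(buf)); /* ... around the blocking call */
        Py_END_ALLOW_THREADS            /* reacquire it before touching
                                           any Python objects           */

        if (n < 0)
            return PyErr_SetFromErrno(PyExc_OSError);
        return PyString_FromStringAndSize(buf, (int)n);
    }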

I don't remember exactly how, but what I remember is that when
I presented the problem to Guido he just told me to put up or
shut up.  At first I was insulted, but then I realized it was
a fair response given the complexity of the problem, the previous
history I didn't know about, the fairly good compromise of
releasing the GIL in extensions, and the fact that only a small
percentage of Python developers were asking for better performance
on SMP machines.

I couldn't put up so ...  I think I've gone beyond my quota :-)

PS  

The GIL is probably my only peeve about the CPython
interpreter.  So don't get me wrong: I love the language, and I'm
always grateful for all the hard work you guys put into developing
such a great language and implementation.

-- 
Luis P Caamano
Atlanta, GA USA


Re: [Python-Dev] Variant of removing GIL.

2005-09-16 Thread Martin v. Löwis
Sokolov Yura wrote:
 I think I know how to remove the GIL. Obviously I am an idiot.

Not an idiot, just lazy :-) Please try to implement your ideas,
and I predict that you will find:
1. it is a lot of work to implement
2. it requires changes to all C files, in particular to extension
   modules outside the Python source tree proper.
3. performing the conversion, even in a semi-mechanical way, will
   introduce many new bugs, in the form of race conditions because
   of missing locks.

Optionally, you may also find that the performance of the
interpreter will decrease.

I haven't really tried to completely understand your proposal, but
you are right, in principle, that a global lock can be replaced with
more fine-grained locks. However, this is really hard to do
correctly - if it were simple, it would have been done long ago.

Regards,
Martin


Re: [Python-Dev] Variant of removing GIL.

2005-09-16 Thread skip

Martin> However, this is really hard to do correctly - if it were
Martin> simple, it would have been done long ago.

I don't believe difficulty is the only (or primary) barrier.  I think
*someone* would have tackled it since Greg Stein did back in 1.4(?) or his
free-threading changes would have been incorporated into the core had they
yielded speedups on multiprocessors and not hurt performance on
uniprocessors.

Skip


[Python-Dev] Variant of removing GIL.

2005-09-15 Thread Sokolov Yura
Excuse my English.

I think I know how to remove the GIL. Obviously I am an idiot.

First, about Py_INCREF and Py_DECREF.

We should not remove the GIL at all.  We should change it.

It must be a one-writer/many-readers lock with the following semantics:

The lock has a read-counter and a write-counter.  Initially both are 0.

When a reader tries to acquire the lock for reading, it sleeps until
the write-counter is 0.
When the reader acquires the lock, it increases the read-counter.
When the reader releases the lock, it decreases the read-counter.
One reader does not block another, since readers never touch the
write-counter.
A reader will sleep if there are any waiting writers, since they
increase the write-counter.

When a writer tries to acquire the lock for writing, it increases the
write-counter and sleeps until the read-counter reaches 0.  Between
writers, the write lock behaves like a simple lock.
When a writer releases the lock, it decreases the write-counter.
When there are no more waiting writers, the readers wake up.
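
Roughly, in C with pthreads, such a writer-preference lock could look
like this (just a sketch; the names are made up and none of this is
CPython code):

    #include <pthread.h>

    typedef struct {
        pthread_mutex_t mtx;
        pthread_cond_t  cond;
        int readers;   /* read-counter: threads currently reading        */
        int writers;   /* write-counter: threads writing or waiting to   */
    } rw_lock;

    void rw_init(rw_lock *l) {
        pthread_mutex_init(&l->mtx, NULL);
        pthread_cond_init(&l->cond, NULL);
        l->readers = l->writers = 0;
    }

    /* A reader sleeps while any writer is active or waiting. */
    void rw_rdlock(rw_lock *l) {
        pthread_mutex_lock(&l->mtx);
        while (l->writers > 0)
            pthread_cond_wait(&l->cond, &l->mtx);
        l->readers++;
        pthread_mutex_unlock(&l->mtx);
    }

    void rw_rdunlock(rw_lock *l) {
        pthread_mutex_lock(&l->mtx);
        if (--l->readers == 0)
            pthread_cond_broadcast(&l->cond);   /* wake waiting writers */
        pthread_mutex_unlock(&l->mtx);
    }

    /* A writer announces itself first (blocking new readers), then waits
     * for the current readers to drain; mtx stays held while writing, so
     * writer vs. writer behaves like a simple lock. */
    void rw_wrlock(rw_lock *l) {
        pthread_mutex_lock(&l->mtx);
        l->writers++;
        while (l->readers > 0)
            pthread_cond_wait(&l->cond, &l->mtx);
    }

    void rw_wrunlock(rw_lock *l) {
        l->writers--;
        pthread_cond_broadcast(&l->cond);   /* wake readers and writers */
        pthread_mutex_unlock(&l->mtx);
    }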

Excuse me for stating obvious things.  I am really reinventing the
wheel in my head, since I was a bad student.

I think this kind of lock is native to Linux (I saw it in the kernel
source, but I do not know whether a waiting writer blocks new readers
or not).

Now, every thread keeps a queue of objects to decref.  It can be
implemented as an array, since it will all be freed at once.

Initially, every object acquires the GIL for reading.
Py_INCREF works as usual;
Py_DECREF places a reference into the queue.
When the queue becomes full, or after 100 instructions ( :-) , it is
useful), the thread releases the GIL for reading and acquires it for
writing.  Once it has acquired it, it decrefs all objects stored in
the queue and clears the queue.
After that, it reacquires the GIL for reading.
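
In code it could look something like this (a sketch built on the
rw_lock above; 'obj' and 'decref' stand in for PyObject and Py_DECREF,
and the queue size is arbitrary):

    #include <stdlib.h>

    #define QUEUE_SIZE 128

    typedef struct { int refcnt; /* ... */ } obj;  /* stand-in for PyObject */
    static void decref(obj *o) { if (--o->refcnt == 0) free(o); }

    static rw_lock gil;                  /* the modified GIL                */
    static __thread struct {             /* one queue per thread (gcc ext)  */
        obj *objs[QUEUE_SIZE];
        int  n;
    } dq;

    static void flush_decrefs(void)
    {
        rw_rdunlock(&gil);               /* give up the read hold           */
        rw_wrlock(&gil);                 /* become the single writer        */
        for (int i = 0; i < dq.n; i++)
            decref(dq.objs[i]);
        dq.n = 0;                        /* whole queue is freed at once    */
        rw_wrunlock(&gil);
        rw_rdlock(&gil);                 /* back to reading                 */
    }

    /* What Py_DECREF would now do instead of decrementing immediately. */
    static void deferred_decref(obj *o)
    {
        dq.objs[dq.n++] = o;
        if (dq.n == QUEUE_SIZE)          /* or after every 100 instructions */
            flush_decrefs();
    }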


But what can we do about changing objects (dicts, lists, and others)?

There should be a secondary one-writer/many-readers public-write GIL,
the PWGIL.
The PWGIL ought to be more complicated, since its write lock should
work with RLock semantics.
Let's call this kind of lock ROWMR (reentrant one-writer/many-readers).

So the semantics for ROWMR can be:

When a thread acquires the ROWMR lock, it acquires it at the read level.
Let's call that write-level = 0.
While a thread's write-level is 0, it is a reader.
A thread can increase its write-level.
When it turns its write-level from 0 to 1, it becomes a writer.
While write-level > 0, the thread is a writer.
A thread can decrease its write-level.
When its write-level turns from 1 to 0, the thread becomes a reader again.
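
As a sketch on top of the rw_lock above, with the write-level kept in a
per-thread counter (again, nothing here is real CPython code, and the
upgrade is done the simple way, by dropping the read hold first):

    static rw_lock pwgil;
    static __thread int write_level = 0;   /* per-thread (gcc extension) */

    void rowmr_enter_write(void)
    {
        if (write_level++ == 0) {
            rw_rdunlock(&pwgil);   /* give up the read hold ...           */
            rw_wrlock(&pwgil);     /* ... and retake the lock as a writer */
        }
    }

    void rowmr_exit_write(void)
    {
        if (--write_level == 0) {
            rw_wrunlock(&pwgil);
            rw_rdlock(&pwgil);     /* back to being a plain reader        */
        }
    }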

With the PWGIL:
We can mark every _mutable_ object with the number of the thread that
created it.
If the mark matches the current thread number, the object is private
to the thread.
If the mark is 0 (or another impossible thread number), the object is
public.
If the mark is != 0 and != the current thread number, the object is alien.
When we access a _mutable_ object, we check whether it is private.
If it is, we can do anything without locking.
If it is not, and we access it for reading, we check whether it is public.
   If yes (a read of a public object), we can read it without locking.
   If no, we increase the write-level,
make the object public if it is alien,
change it if we need to change it,
and decrease the write-level.
Of course, when we append an object to a public collection, we should
make it public too; the write-level is already increased, so we do not
take many separate locks, and when we later access those objects for
reading we will not have to lock just to make them public.
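
A sketch of that check for the write path, with a made-up 'owner' field
(0 = public, otherwise the creating thread's id) and the ROWMR helpers
from above (the cast of pthread_self() to an integer is an assumption):

    typedef struct {
        unsigned long owner;        /* 0 = public, else creator thread id */
        /* ... object payload ... */
    } mobj;

    static void write_mobj(mobj *o, void (*mutate)(mobj *))
    {
        unsigned long me = (unsigned long)pthread_self();
        if (o->owner == me) {          /* private: no locking needed       */
            mutate(o);
            return;
        }
        rowmr_enter_write();           /* public or alien: become a writer */
        if (o->owner != 0 && o->owner != me)
            o->owner = 0;              /* alien: make it public first      */
        mutate(o);
        rowmr_exit_write();
    }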

I don't know how nested scopes are implemented, but I think they should
be treated as mutable objects.

So there is a small overhead for a single-threaded application
(only a comparison of two numbers), and for a large part of
multithreaded ones as well, since we lock only writes to _mutable_
_public_ objects.  Most public objects are not written to often: they
are numbers, classes, and mostly-read collections.
One can also optimize a program by accumulating results in a private
collection and then flushing them to a public one.
There could also be a statement for explicitly increasing the
write-level around a big update of a public object and decreasing it
afterwards.

The PWGIL must also be released and reacquired every 100 instructions,
but only if write-level = 0; that conforms to the current GIL semantics.
I think it must not be released while the decref queue is being
flushed, since that can happen while we are in C code.
And blocking IO needs serious thought.

The most awful situation (from my point of view):
Object O is private to thread A.
Thread B accesses O and tries to mark it public, so it blocks trying
to increase its write-level.
Thread A starts to change O (it is at write-level 0), and in C code it
releases the PWGIL (around blocking IO, for example).
Thread B becomes a writer, changes the object to public, becomes a
reader again, and starts to read O.
Thread A returns and continues to change O, remaining at write-level 0.

But I think well-written C code should not attempt to do blocking IO
in the middle of changing non-local objects (and it does not attempt
to at the moment, as far as I can tell; am I mistaken?).  Or/and, when
it returns and continues to change O, it must check whether it is
still private or not.

I think, big part of checks 

Re: [Python-Dev] Variant of removing GIL.

2005-09-15 Thread Sokolov Yura
Corrections:

Now, every thread keeps one queue of objects to incref and a second
queue of objects to decref.
Both can be implemented as arrays, since they will be freed all at once.

Initially, every thread acquires the GIL for reading.
Py_INCREF places a reference into the thread's incref queue;
Py_DECREF places a reference into the thread's decref queue.
When a queue becomes full, or after 100 instructions ( :-) , it is
useful), the thread releases the GIL for reading and acquires it for
writing.  The first thread to acquire it:
  walks through the incref queues of all threads, increfs all those
references, and clears the queues;
  then walks through the decref queues of all threads, decrefs all
those references, and clears the queues.
After that it reacquires the GIL for reading.  Other threads can
simply repeat the procedure and will find the queues already empty.
Since only one thread at a time works as the garbage collector, we
will not lose any increfs or decrefs.
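
The flush could then look roughly like this (a sketch with a fixed-size
table standing in for "all threads"; incref()/decref() stand in for
Py_INCREF/Py_DECREF, and all names are made up):

    #define MAX_THREADS 64
    #define QUEUE_SIZE  128

    typedef struct {
        void *incs[QUEUE_SIZE]; int n_inc;
        void *decs[QUEUE_SIZE]; int n_dec;
    } ref_queues;

    static ref_queues queues[MAX_THREADS]; /* one entry per registered thread */

    extern void incref(void *o);           /* stand-ins for the real macros   */
    extern void decref(void *o);

    /* Called by whichever thread acquires the GIL for writing first;
     * threads that get the write lock later just find empty queues. */
    static void drain_all_queues(void)
    {
        for (int t = 0; t < MAX_THREADS; t++) {
            ref_queues *q = &queues[t];
            for (int i = 0; i < q->n_inc; i++)   /* increfs first ... */
                incref(q->incs[i]);
            q->n_inc = 0;
            for (int i = 0; i < q->n_dec; i++)   /* ... then decrefs  */
                decref(q->decs[i]);
            q->n_dec = 0;
        }
    }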
