Re: [Python-Dev] Proposed PEP on concurrent programming support

2012-01-12 Thread PJ Eby
On Wed, Jan 11, 2012 at 7:01 PM, Mike Meyer  wrote:

> On Wed, 4 Jan 2012 00:07:27 -0500
> PJ Eby  wrote:
>  > On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer  wrote:
> > > For
> > > instance, combining STM with explicit locking would allow explicit
> > > locking when IO was required,
> > I don't think this idea makes any sense, since STM's don't really
> > "lock", and to control I/O in an STM system you just STM-ize the
> > queues. (Generally speaking.)
>
> I thought about that. I couldn't convince myself that STM by itself
> sufficient. If you need to make irreversible changes to the state of
> an object, you can't use STM, so what do you use? Can every such
> situation be handled by creating "safe" values then using an STM to
> update them?
>

If you need to do something irreversible, you just need to use an
STM-controlled queue, with something that reads from it to do the
irreversible things.  The catch is that your queue design has to support
guaranteed-successful item removal, since if the dequeue transaction fails,
it's too late.  Alternately, the queue reader can commit removal first,
then perform the irreversible operation...  but leave open a short window
for failure.  It depends on the precise semantics you're looking for.

In either case, though, the STM is pretty much sufficient, given a good
enough queue data structure.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposed PEP on concurrent programming support

2012-01-11 Thread Matt Joiner
On Thu, Jan 12, 2012 at 11:01 AM, Mike Meyer  wrote:
> On Wed, 4 Jan 2012 00:07:27 -0500
> PJ Eby  wrote:
>
>> On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer  wrote:
>> > A suite is marked
>> > as a `transaction`, and then when an unlocked object is modified,
>> > instead of indicating an error, a locked copy of it is created to be
>> > used through the rest of the transaction. If any of the originals
>> > are modified during the execution of the suite, the suite is rerun
>> > from the beginning. If it completes, the locked copies are copied
>> > back to the originals in an atomic manner.
>> I'm not sure if "locked" is really the right word here.  A private
>> copy isn't "locked" because it's not shared.
>
> Do you have a suggestion for a better word? Maybe the "safe" state
> used elsewhere?
>
>> > For
>> > instance, combining STM with explicit locking would allow explicit
>> > locking when IO was required,
>> I don't think this idea makes any sense, since STM's don't really
>> "lock", and to control I/O in an STM system you just STM-ize the
>> queues. (Generally speaking.)
>
> I thought about that. I couldn't convince myself that STM by itself
> sufficient. If you need to make irreversible changes to the state of
> an object, you can't use STM, so what do you use? Can every such
> situation be handled by creating "safe" values then using an STM to
> update them?
>
>        ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/anacrolix%40gmail.com

IMHO STM by itself isn't sufficient. Either immutability, or careful
use of references protected by STM amounting to the same are the only
reasonable ways to do it. Both also perform much better than the
alternatives.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposed PEP on concurrent programming support

2012-01-11 Thread Mike Meyer
On Wed, 4 Jan 2012 00:07:27 -0500
PJ Eby  wrote:

> On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer  wrote:
> > A suite is marked
> > as a `transaction`, and then when an unlocked object is modified,
> > instead of indicating an error, a locked copy of it is created to be
> > used through the rest of the transaction. If any of the originals
> > are modified during the execution of the suite, the suite is rerun
> > from the beginning. If it completes, the locked copies are copied
> > back to the originals in an atomic manner.
> I'm not sure if "locked" is really the right word here.  A private
> copy isn't "locked" because it's not shared.

Do you have a suggestion for a better word? Maybe the "safe" state
used elsewhere?

> > For
> > instance, combining STM with explicit locking would allow explicit
> > locking when IO was required,
> I don't think this idea makes any sense, since STM's don't really
> "lock", and to control I/O in an STM system you just STM-ize the
> queues. (Generally speaking.)

I thought about that. I couldn't convince myself that STM by itself
sufficient. If you need to make irreversible changes to the state of
an object, you can't use STM, so what do you use? Can every such
situation be handled by creating "safe" values then using an STM to
update them?

   http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Proposed PEP on concurrent programming support

2012-01-04 Thread Jim Jewett
(I've added back python-ideas, because I think that is still the
appropriate forum.)

> A new
> suite type - the ``transaction`` will be added to the language. The
> suite will have the semantics discussed above: modifying an object in
> the suite will trigger creation of a thread-local shallow copy to be
> used in the Transaction. Further modifications of the original will
> cause all existing copies to be discarded and the transaction to be
> restarted. ...

How will you know that an object has been modified?

The only ways I can think of are

(1)  Timestamp every object -- or at least every mutable object -- and
hope that everybody agrees on which modifications should count.

(2)  Make two copies of every object you're using in the suite; at the
end, compare one of them to both the original and the one you were
operating on.  With this solution, you can decide for youself what
counts as a modification, but it still isn't straightforward; I would
consider changing a value to be changing a dict, even though
nothing in the item (header) itself changed.

-jJ
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposed PEP on concurrent programming support

2012-01-03 Thread PJ Eby
On Tue, Jan 3, 2012 at 7:40 PM, Mike Meyer  wrote:

> STM is a relatively new technology being experimented with in newer
> languages, and in a number of 3rd party libraries (both Peak [#Peak]_
> and Kamaelia [#Kamaelia]_ provide STM facilities).


I don't know about Kamaelia, but PEAK's STM (part of the Trellis
event-driven library) is *not* an inter-thread concurrency solution: it's
actually used to sort out the order of events in a co-operative
multitasking scenario.  So, it should not be considered evidence for the
practicality of doing inter-thread co-ordination that way in pure Python.


A suite is marked
> as a `transaction`, and then when an unlocked object is modified,
> instead of indicating an error, a locked copy of it is created to be
> used through the rest of the transaction. If any of the originals are
> modified during the execution of the suite, the suite is rerun from
> the beginning. If it completes, the locked copies are copied back to
> the originals in an atomic manner.
>

I'm not sure if "locked" is really the right word here.  A private copy
isn't "locked" because it's not shared.


The disadvantage is that any code in a transaction must be safe to run
> multiple times.  This forbids any kind of I/O.
>

More precisely, code in a transaction must be *reversible*, so it doesn't
forbid any I/O that can be undone.  If you can seek backward in an input
file, for example, or delete queued output data, then it can still be done.
 Even I/O like re-drawing a screen can be made STM safe by making the
redraw occur after a transaction that reads and empties a buffer written by
other transactions.


For
> instance, combining STM with explicit locking would allow explicit
> locking when IO was required,


I don't think this idea makes any sense, since STM's don't really "lock",
and to control I/O in an STM system you just STM-ize the queues.
 (Generally speaking.)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Proposed PEP on concurrent programming support

2012-01-03 Thread Mike Meyer
PEP: XXX
Title: Interpreter support for concurrent programming
Version: $Revision$
Last-Modified: $Date$
Author: Mike Meyer 
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 11-Nov-2011
Post-History: 


Abstract


The purpose of this PEP is to explore strategies for making concurrent
programming in Python easier by allowing the interpreter to detect and
notify the user about possible bugs in concurrent access. The reason
for doing so is that "Errors should never pass silently".

Such bugs are caused by allowing objects to be accessed simultaneously
from another thread of execution while they are being modified.
Currently, python systems provide no support for such bugs, falling
back on the underlying platform facilities and some tools built on top
of those.  While these tools allow prevention of such modification if
the programmer is aware of the need for them, there are no facilities
to detect that such a need might exist and warn the programmer of it.

The goal is not to prevent such bugs, as that depends on the
programmer getting the logic of the interactions correct, which the
interpreter can't judge.  Nor is the goal to warn the programmer about
any such modifications - the goal is to catch standard idioms making
unsafe modifications.  If the programmer starts tinkering with
Python's internals, it's assumed they are aware of these issues.


Rationale
=

Concurrency bugs are among the hardest bugs to locate and fix.  They
result in corrupt data being generated or used in a computation.  Like
most such bugs, the corruption may not become evident until much later
and far away in the program.  Minor changes in the code can cause the
bugs to fail to manifest.  They may even fail to manifest from run to
run, depending on external factors beyond the control of the
programmer.

Therefore any help in locating and dealing with such bugs is valuable.
If the interpreter is to provide such help, it must be aware of when
things are safe to modify and when they are not. This means it will
almost certainly cause incompatible changes in Python, and may impose
costs so high for non-concurrent operations as to make it untenable.
As such, the final options discussed are destined for Python version 4
or later, and may never be implemented in any mainstream
implementation of Python.

Terminology
===

The word "thread" is used throughout to mean "concurrent thread of
execution".  Nominally, this means a platform thread.  However, it is
intended to include any threading mechanism that allows the
interpreter to change threads between or in the middle of a statement
without the programmer specifically allowing this to happen.

Similarly, the word "interpreter" means any system that processes and
executes Python language files.  While this normally means cPython,
the changes discussed here should be amenable to other
implementations.


Concept
===

Locking object
--

The idea is that the interpreter should indicate an error anytime an
unlocked object is mutated.  For mutable types, this would mean
changing the value of the type. For Python class instances, this would
mean changing the binding of an attribute.  Mutating an object bound
to such an attribute isn't a change in the object the attribute
belongs to, and so wouldn't indicate an error unless the object bound
to the attribute was unlocked.

Locking by name
---

It's also been suggested that locking "names" would be useful.  That
is, to prevent a specific attribute of an object from being rebound,
or a key/index entry in a mapping object. This provides a finer
grained locking than just locking the object, as you could lock a
specific attribute or set of attributes of an object, without locking
all of them.

Unfortunately, this isn't sufficient: a set may need to be locked to
prevent deletions for some period, or a dictionary to prevent adding a
key, or a list to prevent changing a slice, etc.

So some other locking mechanism is required.  If that needs to specify
objects, some way of distinguishing between locking a name and locking
the object bound to the name needs to be invented, or there needs to
be two different locking mechanisms.  It's not clear that the finer
grained locking is worth adding yet another language mechanism.


Alternatives



Explicit locking


These alternatives requires that the programmer explicitly name
anything that is going to be changed to lock it before changing it.
This lets the interpreter gets involved, but makes a number of errors
possible based on the order that locks are applied.

Platform locks
''

The current tool set uses platform locks via a C extension.  The
problem with these is that the interpreter has no knowledge of them,
and so can't do anything about detecting the mutation of unlocked
objects.


A ``locking`` keyword
'

Adding a statement to tell the interpreter to lock objects for the
attached su