Re: Threadpool item mailboxes design problem

2013-04-15 Thread Charles Hixson

On 04/14/2013 07:32 PM, Chris Rebert wrote:


On Apr 14, 2013 4:27 PM, Charles Hixson charleshi...@earthlink.net wrote:


 What is the best approach to implementing actors that accept and 
post messages (and have no other external contacts).


You might look at how some of the existing Python actor libraries are 
implemented (perhaps one of these might even save you from reinventing 
the wheel):


http://www.pykka.org/en/latest/
http://www.kamaelia.org/Docs/Axon/Axon.html
https://pypi.python.org/pypi/pulsar

Kinda old:
http://candygram.sourceforge.net/contents.html
http://osl.cs.uiuc.edu/parley/

Candygram looks interesting.  I'd forgotten about it.  The others look 
either a bit limited (in different ways) or overly general, with the 
costs that generality brings.  I'll need to study Candygram a bit more.  
However, even Candygram seems to have a RAM-centric model that I'd need 
to work around.  (Well, the mailbox synchronization must clearly be 
RAM-centric, but the storage shouldn't be.)


 So far what I've come up with is something like:
 actors = {}
 mailboxs = {}

 Stuff actors with actor instances, mailboxes with 
multiprocessing.Queue instances.   (Actors and mailboxes will have 
identical keys, which are id#, but it's got to be a dict rather than a 
list, because too many are rolled out to disk.)  And I'm planning on 
having the actors running simultaneously and continually in a 
threadpool that just loops through the actors that are assigned to 
each thread of the pool.

snip
 It would, however, be better if the mailbox could be specific to the 
threadpool instance, so less space would be wasted.  Or if the queues 
could dynamically resize.  Or if there was a threadsafe dict.  Or... 
 But I don't know that any of these are feasible.  (I mean, yes, I 
could write all the mail to a database, but is that a better answer, 
or even a good one?)


My recollection is that the built-in collection types are threadsafe 
at least to the limited extent that the operations exposed by their 
APIs (e.g. dict.setdefault) are atomic.

Perhaps someone will be able to chime in with more details.

If list operations were threadsafe, why would multiprocessing.Queue have 
been created?  I don't recall any claim that they were.  Still, I've 
found an assertion on StackOverflow that they are...at least for simple 
assignment and reading.  And, going by that post, the same appears to be 
true of dicts.  This *does* require either that both the index and the 
stored value be constant, or that the code section be locked during the 
access.  Fortunately I'm intending the id#s to be unchangeable and the 
messages to be tuples, and thus constant.


OTOH, to use this approach I'll need to find some way to guarantee that 
removing messages and posting messages don't occur at the same time.  So 
that still means I'll need to lock each access.  The answer that seems 
best is for each thread to have its own mailbox, which the cells within 
that thread post to and read from.  This will automatically take care of 
synchronization within a thread.  Then I'll need a mailman thread that...


This seems a promising approach that avoids the problem of fixed-length 
queues, but I'll still need to do a lot of synchronization.  Still, it's 
a lot less, and each thread would be locked for shorter amounts of time.
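
A rough sketch of how the per-thread mailbox plus mailman idea might look, under several assumptions: the names owner_of, outbound, inboxes, DONE and plans are hypothetical, the actor-to-thread assignment is invented, and the workers run one send/receive round instead of looping continuously. Mail between actors on the same thread goes into a plain deque with no locking; only cross-thread mail passes through a queue.Queue to the mailman.

```python
import queue
import threading
from collections import deque

NUM_WORKERS = 2
DONE = object()  # sentinel: "no more mail from this sender"

# Hypothetical assignment of actor ids to worker threads.
owner_of = {actor_id: actor_id % NUM_WORKERS for actor_id in range(6)}

outbound = queue.Queue()                               # workers -> mailman
inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]  # mailman -> workers
delivered = [None] * NUM_WORKERS                       # filled in by each worker


def mailman():
    """Route cross-thread mail until every worker has reported DONE."""
    finished = 0
    while finished < NUM_WORKERS:
        item = outbound.get()
        if item is DONE:
            finished += 1
            continue
        dest_id, _message = item
        inboxes[owner_of[dest_id]].put(item)
    for inbox in inboxes:
        inbox.put(DONE)  # tell each worker that no more mail is coming


def worker(index, to_send):
    local = deque()  # touched only by this thread, so it needs no lock
    for dest_id, message in to_send:
        if owner_of[dest_id] == index:
            local.append((dest_id, message))  # same thread: direct delivery
        else:
            outbound.put((dest_id, message))  # other thread: via the mailman
    outbound.put(DONE)
    while True:
        item = inboxes[index].get()
        if item is DONE:
            break
        local.append(item)
    delivered[index] = sorted(local)


# Each message is (destination id, (sender id, text)); tuples are immutable.
plans = [
    [(2, (0, "hi")), (1, (0, "hello"))],  # sent by actors on worker 0
    [(4, (1, "hey"))],                    # sent by actors on worker 1
]
threads = [threading.Thread(target=mailman)]
threads += [threading.Thread(target=worker, args=(i, plan))
            for i, plan in enumerate(plans)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because queue.Queue() is unbounded by default, nothing here has to be sized in advance; the cost is that every cross-thread message takes two queue operations instead of one.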


--
Charles Hixson

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Threadpool item mailboxes design problem

2013-04-15 Thread Charles Hixson

On 04/15/2013 10:14 AM, Charles Hixson wrote:

snip

Currently it looks as if Pyro is the best option.  It appears that 
Python threads suffer from heavy contention, so I'll need to run in 
processes rather than in threads, but I'll still need to transfer 
messages back and forth.  I'll probably use Unix sockets rather than IP, 
but this could change...and using Pyro would make changing it easy.  
Still, pickle is used for the transmission, and unpickling data that 
arrives over a network is a security risk, which makes IP a questionable 
choice.
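
To make the pickle concern concrete, here is a small stdlib-only sketch. It does not use Pyro and says nothing about how Pyro frames its messages. It assumes a hypothetical 4-byte length prefix, and it uses socket.socketpair() as a stand-in for a Unix socket connection, with both ends in one process purely so the example is self-contained. The helper names send_message, recv_exact and recv_message are invented for illustration.

```python
import pickle
import socket
import struct


def send_message(sock, message):
    """Pickle a message and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(message)
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    while count:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    # pickle.loads can run arbitrary code while loading, so this is only
    # acceptable when the peer is trusted (e.g. a local socket you own).
    return pickle.loads(recv_exact(sock, length))


left, right = socket.socketpair()
send_message(left, (17, 42, "hello"))   # hypothetical (sender, dest, text)
received = recv_message(right)
left.close()
right.close()
```

The comment in recv_message is the whole point: a socket reachable only by local processes limits who can feed bytes to pickle.loads, while an IP listener does not.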


--
Charles Hixson



Threadpool item mailboxes design problem

2013-04-14 Thread Charles Hixson
What is the best approach to implementing actors that accept and post 
messages (and have no other external contacts)?


So far what I've come up with is something like:
actors = {}
mailboxs = {}

Stuff actors with actor instances, mailboxes with multiprocessing.Queue 
instances.   (Actors and mailboxes will have identical keys, which are 
id#, but it's got to be a dict rather than a list, because too many are 
rolled out to disk.)  And I'm planning on having the actors running 
simultaneously and continually in a threadpool that just loops through 
the actors that are assigned to each thread of the pool.
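
A minimal sketch of that layout, for illustration only. It assumes queue.Queue (the thread-safe queue for threads inside one process) in place of multiprocessing.Queue. The Actor class, its step method, the extra "main" mailbox key and the ping/pong messages are all hypothetical and not part of the design above.

```python
import queue
import threading
import time


class Actor:
    def __init__(self, actor_id):
        self.actor_id = actor_id
        self.received = []

    def step(self, mailboxes):
        """Handle at most one pending message, then return."""
        try:
            sender, text = mailboxes[self.actor_id].get_nowait()
        except queue.Empty:
            return
        self.received.append((sender, text))
        if text == "ping":
            mailboxes[sender].put((self.actor_id, "pong"))


actors = {i: Actor(i) for i in range(4)}
mailboxes = {i: queue.Queue() for i in actors}  # same keys as actors
mailboxes["main"] = queue.Queue()               # lets the main thread get replies


def worker(assigned_ids, stop):
    """Loop over the actors assigned to this thread until told to stop."""
    while not stop.is_set():
        for actor_id in assigned_ids:
            actors[actor_id].step(mailboxes)
        time.sleep(0.001)  # avoid spinning at full speed when idle


stop = threading.Event()
pool = [threading.Thread(target=worker, args=(ids, stop))
        for ids in ([0, 1], [2, 3])]
for t in pool:
    t.start()

mailboxes[3].put(("main", "ping"))        # messages are immutable tuples
reply = mailboxes["main"].get(timeout=5)  # blocks until actor 3 answers

stop.set()
for t in pool:
    t.join()
```

Both dicts are filled before any thread starts and are only read afterwards, so the only shared objects that change while the threads run are the queues, which do their own locking.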


This lets any actor post messages to the mailbox of any other actor 
whose id it has, and lets it read its own mail without multi-processing 
clashes.  But I'm quite uncertain that this is the best way, because, if 
nothing else, it means that each mailbox needs to be allocated large 
enough to handle the maximum amount of mail it could possibly receive.  
(I suppose I could implement some sort of "wait awhile and try again" 
method.)  It would, however, be better if the mailbox could be specific 
to the threadpool instance, so less space would be wasted.  Or if the 
queues could dynamically resize.  Or if there was a threadsafe dict.  
Or...  But I don't know that any of these are feasible.  (I mean, yes, I 
could write all the mail to a database, but is that a better answer, or 
even a good one?)
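
On the sizing worry, a short sketch assuming the standard library's queue.Queue: with the default maxsize=0 the queue is unbounded and grows as needed, so no maximum has to be reserved up front. For a bounded queue, the "wait awhile and try again" idea can be written with a timeout on put. The helper post_with_retry and its parameters are hypothetical.

```python
import queue


def post_with_retry(mailbox, message, attempts=3, wait=0.01):
    """Try to post; wait up to `wait` seconds per attempt if the box is full."""
    for _ in range(attempts):
        try:
            mailbox.put(message, timeout=wait)
            return True
        except queue.Full:
            continue
    return False  # caller decides what to do with undeliverable mail


# A bounded mailbox holding two messages, with nobody reading from it.
bounded = queue.Queue(maxsize=2)
results = [post_with_retry(bounded, ("msg", n)) for n in range(3)]

# The default constructor gives an unbounded mailbox that grows on demand.
unbounded = queue.Queue()
for n in range(1000):
    unbounded.put_nowait(("msg", n))
```

The trade-off is the usual one: an unbounded mailbox never rejects mail but can consume memory without limit if its reader falls behind, while a bounded one pushes that problem back onto the sender.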


--
Charles Hixson



Re: Threadpool item mailboxes design problem

2013-04-14 Thread Chris Rebert
On Apr 14, 2013 4:27 PM, Charles Hixson charleshi...@earthlink.net
wrote:

 What is the best approach to implementing actors that accept and post
messages (and have no other external contacts).

You might look at how some of the existing Python actor libraries are
implemented (perhaps one of these might even save you from reinventing the
wheel):

http://www.pykka.org/en/latest/
http://www.kamaelia.org/Docs/Axon/Axon.html
https://pypi.python.org/pypi/pulsar

Kinda old:
http://candygram.sourceforge.net/contents.html
http://osl.cs.uiuc.edu/parley/

 So far what I've come up with is something like:
 actors = {}
 mailboxs = {}

 Stuff actors with actor instances, mailboxes with multiprocessing.Queue
instances.   (Actors and mailboxes will have identical keys, which are id#,
but it's got to be a dict rather than a list, because too many are rolled
out to disk.)  And I'm planning on having the actors running simultaneously
and continually in a threadpool that just loops through the actors that are
assigned to each thread of the pool.
snip
 It would, however, be better if the mailbox could be specific to the
threadpool instance, so less space would be wasted.  Or if the queues could
dynamically resize.  Or if there was a threadsafe dict.  Or...  But I don't
know that any of these are feasible.  (I mean, yes, I could write all the
mail to a database, but is that a better answer, or even a good one?)

My recollection is that the built-in collection types are threadsafe at
least to the limited extent that the operations exposed by their APIs (e.g.
dict.setdefault) are atomic.
Perhaps someone will be able to chime in with more details.
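
As a small illustration of that point, assuming CPython (where a single dict.setdefault or list.append call with built-in key types is not interleaved with other threads), several threads can lazily create a shared mailbox without an explicit lock. This is an implementation detail and not a language guarantee, and compound operations such as d[k] += 1 still need a lock. The names get_mailbox, grab and seen are hypothetical.

```python
import queue
import threading

mailboxes = {}


def get_mailbox(actor_id):
    # setdefault stores the new Queue only if the key is absent and returns
    # whichever Queue ended up in the dict. A throwaway Queue is still
    # constructed on every call; only the first one stored is kept.
    return mailboxes.setdefault(actor_id, queue.Queue())


seen = []


def grab():
    seen.append(get_mailbox(7))  # list.append is likewise a single call


threads = [threading.Thread(target=grab) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread should have been handed the same Queue object.
all_same = all(box is seen[0] for box in seen)
```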


Re: Threadpool item mailboxes design problem

2013-04-14 Thread 88888 Dihedral
On Monday, April 15, 2013 at 7:12:11 AM UTC+8, Charles Hixson wrote:
 What is the best approach to implementing actors that accept and post 
 messages (and have no other external contacts).
 
 So far what I've come up with is something like:
 actors = {}
 mailboxs = {}
 
 Stuff actors with actor instances, mailboxes with multiprocessing.Queue 
 instances.   (Actors and mailboxes will have identical keys, which are 
 id#, but it's got to be a dict rather than a list, because too many are 
 rolled out to disk.)  And I'm planning on having the actors running 
 simultaneously and continually in a threadpool that just loops through 
 the actors that are assigned to each thread of the pool.
 
 This lets any actor post messages to the mailbox of any other actor that 
 it has the id of, and lets him read his own mail without 
 multi-processing clashes.  But I'm quite uncertain that this is the best 
 way, because, if nothing else, it means that each mailbox needs to be 
 allocated large enough to handle the maximum amount of mail it could 
 possibly receive.  (I suppose I could implement some sort of wait 
 awhile and try again method.)  It would, however, be better if the 
 mailbox could be specific to the threadpool instance, so less space 
 would be wasted.  Or if the queues could dynamically resize.  Or if 
 there was a threadsafe dict.  Or...  But I don't know that any of these 
 are feasible.  (I mean, yes, I could write all the mail to a database, 
 but is that a better answer, or even a good one?)
 
 -- 
 Charles Hixson

Actors can receive and respond to messages, taking actions
accordingly over time on one or more cores.

A timer is required, as are the message read/write operations.

Do you want the actors to gain new methods so that they can evolve 
in the long run?
