Re: multiprocessing and dictionaries

2009-07-13 Thread Bjorn Meyer

On Monday 13 July 2009 13:12:18 Piet van Oostrum wrote:

> > Bjorn Meyer  (BM) wrote:
> >
> >BM> Here is what I have been using as a test.
> >BM> This pretty much mimics what I am trying to do.
> >BM> I put both threading and multiprocessing in the example which shows
> >BM> the output that I am looking for.
> >
> >BM> #!/usr/bin/env python
> >
> >BM> import threading
> >BM> from multiprocessing import Manager, Process
> >
> >BM> name = ('test1','test2','test3')
> >BM> data1 = ('dat1','dat2','dat3')
> >BM> data2 = ('datA','datB','datC')
>
> [snip]
>
> >BM> def multiprocess_test(name,data1,data2, mydict):
> >BM>   for nam in name:
> >BM>     for num in range(0,3):
> >BM>       mydict.setdefault(nam, []).append(data1[num])
> >BM>       mydict.setdefault(nam, []).append(data2[num])
> >BM>   print 'Multiprocess test dic:',mydict
>
> I guess what's happening is this:
>
> d.setdefault(nam, []) returns a list, initially an empty list ([]). This
> list gets appended to. However, the list you get back is a local copy in
> the multiprocess_test process, so the append is not reflected in the
> original list inside the manager, and all your updates get lost.
> You will have to do operations directly on the dictionary itself, not on
> any intermediary objects. Of course with threading the situation is
> different, as all operations are local.
>
> This works:
>
> def multiprocess_test(name,data1,data2, mydict):
>   print name, data1, data2
>   for nam in name:
>     for num in range(0,3):
>       mydict.setdefault(nam, [])
>       mydict[nam] += [data1[num]]
>       mydict[nam] += [data2[num]]
>   print 'Multiprocess test dic:',mydict
>
> If you have more than one process operating on the dictionary
> simultaneously you have to beware of race conditions!!
> --
> Piet van Oostrum 
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Excellent. That works perfectly.

Thank you for your response Piet.

Bjorn


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing and dictionaries

2009-07-13 Thread Piet van Oostrum
> Bjorn Meyer  (BM) wrote:

>BM> Here is what I have been using as a test.
>BM> This pretty much mimics what I am trying to do.
>BM> I put both threading and multiprocessing in the example which shows
>BM> the output that I am looking for.

>BM> #!/usr/bin/env python

>BM> import threading
>BM> from multiprocessing import Manager, Process

>BM> name = ('test1','test2','test3')
>BM> data1 = ('dat1','dat2','dat3')
>BM> data2 = ('datA','datB','datC')

[snip]

>BM> def multiprocess_test(name,data1,data2, mydict):
>BM>   for nam in name:
>BM>     for num in range(0,3):
>BM>       mydict.setdefault(nam, []).append(data1[num])
>BM>       mydict.setdefault(nam, []).append(data2[num])
>BM>   print 'Multiprocess test dic:',mydict

I guess what's happening is this:

d.setdefault(nam, []) returns a list, initially an empty list ([]). This
list gets appended to. However, the list you get back is a local copy in
the multiprocess_test process, so the append is not reflected in the
original list inside the manager, and all your updates get lost.
You will have to do operations directly on the dictionary itself, not on
any intermediary objects. Of course with threading the situation is
different, as all operations are local.

This works:

def multiprocess_test(name,data1,data2, mydict):
  print name, data1, data2
  for nam in name:
    for num in range(0,3):
      mydict.setdefault(nam, [])
      mydict[nam] += [data1[num]]
      mydict[nam] += [data2[num]]
  print 'Multiprocess test dic:',mydict
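
The difference between appending to the returned list and assigning
back into the dictionary can be checked in isolation. What follows is
only a minimal sketch, written in Python 3 syntax rather than the
Python 2 used in this thread. The function name
show_proxy_copy_semantics is made up for illustration; Manager and its
dict() proxy are the real multiprocessing API. It relies on the
behaviour described above, namely that a value read through the proxy
arrives as a copy.

```python
from multiprocessing import Manager


def show_proxy_copy_semantics():
    """Return (after_append, after_reassign) for a Manager dict."""
    with Manager() as mgr:
        shared = mgr.dict()

        # setdefault() stores [] in the manager process but hands back
        # a copy; appending to that copy changes nothing in the manager.
        shared.setdefault('test1', []).append('dat1')
        after_append = list(shared['test1'])

        # Reading the value, building a new list, and assigning it back
        # goes through the proxy's item assignment, so the manager
        # receives the updated list.
        shared['test1'] = shared['test1'] + ['dat1']
        after_reassign = list(shared['test1'])

    return after_append, after_reassign


if __name__ == '__main__':
    print(show_proxy_copy_semantics())
```

Run as a script, this should print ([], ['dat1']): the append is lost,
the reassignment is kept.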

If you have more than one process operating on the dictionary
simultaneously you have to beware of race conditions!!
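
One possible way to guard against that, sketched under assumptions: the
update "read the list, extend it, write it back" is several separate
proxy calls, so two processes can interleave and one can overwrite the
other's write. The sketch below (Python 3 syntax) holds a manager lock
around the whole update. The names append_items and run_workers, the
key 'test1', and the worker and item counts are hypothetical choices
for illustration, not part of the original program.

```python
from multiprocessing import Manager, Process


def append_items(shared, lock, key, items):
    for item in items:
        # Hold the lock across the whole read-modify-write so that no
        # other process can replace the list between the read and the
        # write.
        with lock:
            shared[key] = shared.get(key, []) + [item]


def run_workers(n_workers=2, n_items=50):
    with Manager() as mgr:
        shared = mgr.dict()
        lock = mgr.Lock()
        workers = [
            Process(target=append_items,
                    args=(shared, lock, 'test1', range(n_items)))
            for _ in range(n_workers)
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy the result out before the manager shuts down.
        return list(shared['test1'])


if __name__ == '__main__':
    print(len(run_workers()))
```

With the lock in place every append should survive, so the final list
holds n_workers * n_items entries. Without it, some entries may go
missing depending on timing.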
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: multiprocessing and dictionaries

2009-07-13 Thread Bjorn Meyer

On Monday 13 July 2009 01:56:08 Piet van Oostrum wrote:

> > Bjorn Meyer  (BM) wrote:
> >
> >BM> I am trying to convert a piece of code that I am using the thread
> >BM> module with to the multiprocessing module.
> >
> >BM> The way that I have it set up is a chunk of code reads a text file and
> >BM> assigns a dictionary key multiple values from the text file. I am
> >BM> using locks to write the values to the dictionary.
> >BM> The way that the values are written is as follows:
> >BM>  mydict.setdefault(key, []).append(value)
> >
> >BM> The problem that I have run into is that using multiprocessing, the
> >BM> key gets set, but the values don't get appended.
> >BM> I've even tried the Manager().dict() option, but it doesn't seem to
> >BM> work.
> >
> >BM> Is this not supported at this time or am I missing something?
>
> I think you should give more information. Try to make a *minimal* program
> that shows the problem and include it in your posting or supply a
> download link.
> --
> Piet van Oostrum 
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Here is what I have been using as a test.
This pretty much mimics what I am trying to do.
I put both threading and multiprocessing in the example which shows the output 
that I am looking for.

#!/usr/bin/env python

import threading
from multiprocessing import Manager, Process

name = ('test1','test2','test3')
data1 = ('dat1','dat2','dat3')
data2 = ('datA','datB','datC')

def thread_test(name,data1,data2, d):
  for nam in name:
    for num in range(0,3):
      d.setdefault(nam, []).append(data1[num])
      d.setdefault(nam, []).append(data2[num])
  print 'Thread test dict:',d

def multiprocess_test(name,data1,data2, mydict):
  for nam in name:
    for num in range(0,3):
      mydict.setdefault(nam, []).append(data1[num])
      mydict.setdefault(nam, []).append(data2[num])
  print 'Multiprocess test dic:',mydict

if __name__ == '__main__':
  mgr = Manager()
  md = mgr.dict()
  d = {}

  m = Process(target=multiprocess_test, args=(name,data1,data2,md))
  m.start()
  t = threading.Thread(target=thread_test, args=(name,data1,data2,d))
  t.start()
  
  m.join()
  t.join()
  
  print 'Thread test:',d
  print 'Multiprocess test:',md


Thanks
Bjorn



Re: multiprocessing and dictionaries

2009-07-13 Thread Piet van Oostrum
> Bjorn Meyer  (BM) wrote:

>BM> I am trying to convert a piece of code that I am using the thread module
>BM> with to the multiprocessing module.

>BM> The way that I have it set up is a chunk of code reads a text file and
>BM> assigns a dictionary key multiple values from the text file. I am using
>BM> locks to write the values to the dictionary.
>BM> The way that the values are written is as follows:
>BM>    mydict.setdefault(key, []).append(value)

>BM> The problem that I have run into is that using multiprocessing, the key
>BM> gets set, but the values don't get appended.
>BM> I've even tried the Manager().dict() option, but it doesn't seem to work.

>BM> Is this not supported at this time or am I missing something?

I think you should give more information. Try to make a *minimal* program
that shows the problem and include it in your posting or supply a
download link.
-- 
Piet van Oostrum 
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org


Re: multiprocessing and dictionaries

2009-07-12 Thread Chris Rebert
On Sun, Jul 12, 2009 at 10:16 AM, Bjorn Meyer wrote:
> I am trying to convert a piece of code that I am using the thread module with
> to the multiprocessing module.
>
> The way that I have it set up is a chunk of code reads a text file and assigns
> a dictionary key multiple values from the text file. I am using locks to write
> the values to the dictionary.
> The way that the values are written is as follows:
>        mydict.setdefault(key, []).append(value)
>
> The problem that I have run into is that using multiprocessing, the key gets
> set, but the values don't get appended.

Don't have much concurrency experience, but have you tried using a
defaultdict instead?
It's possible its implementation might solve the problem.
http://docs.python.org/dev/library/collections.html#collections.defaultdict
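
For reference, a minimal sketch of the defaultdict idiom as it would
look in a single process (Python 3 syntax). The sample tuples are
illustrative values only. Note that an ordinary defaultdict lives in
one process's memory, so on its own it would not be expected to make
updates visible across processes.

```python
from collections import defaultdict

name = ('test1', 'test2', 'test3')
data1 = ('dat1', 'dat2', 'dat3')
data2 = ('datA', 'datB', 'datC')

# defaultdict(list) creates the empty list on first access, which
# replaces the setdefault(key, []) call.
d = defaultdict(list)
for nam in name:
    for num in range(3):
        d[nam].append(data1[num])
        d[nam].append(data2[num])

print(dict(d))
```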

Cheers,
Chris
-- 
http://blog.rebertia.com