Re: multiprocessing and dictionaries
On Monday 13 July 2009 13:12:18 Piet van Oostrum wrote:
> > Bjorn Meyer (BM) wrote:
> >
> >BM> Here is what I have been using as a test.
> >BM> This pretty much mimics what I am trying to do.
> >BM> I put both threading and multiprocessing in the example which shows
> >BM> the output that I am looking for.
> >
> >BM> #!/usr/bin/env python
> >
> >BM> import threading
> >BM> from multiprocessing import Manager, Process
> >
> >BM> name = ('test1','test2','test3')
> >BM> data1 = ('dat1','dat2','dat3')
> >BM> data2 = ('datA','datB','datC')
>
> [snip]
>
> >BM> def multiprocess_test(name,data1,data2, mydict):
> >BM>     for nam in name:
> >BM>         for num in range(0,3):
> >BM>             mydict.setdefault(nam, []).append(data1[num])
> >BM>             mydict.setdefault(nam, []).append(data2[num])
> >BM>     print 'Multiprocess test dic:',mydict
>
> I guess what's happening is this:
>
> d.setdefault(nam, []) returns a list, initially an empty list ([]). This
> list gets appended to. However, this list is a local list in the
> multiprocess_test process, therefore the result is not reflected in the
> original list inside the manager. Therefore all your updates get lost.
> You will have to do operations directly on the dictionary itself, not on
> any intermediary objects. Of course with the threading the situation is
> different as all operations are local.
>
> This works:
>
> def multiprocess_test(name,data1,data2, mydict):
>     print name, data1, data2
>     for nam in name:
>         for num in range(0,3):
>             mydict.setdefault(nam, [])
>             mydict[nam] += [data1[num]]
>             mydict[nam] += [data2[num]]
>     print 'Multiprocess test dic:',mydict
>
> If you have more than one process operating on the dictionary
> simultaneously you have to beware of race conditions!!
> --
> Piet van Oostrum
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Excellent. That works perfectly.
Thank you for your response Piet.

Bjorn
--
http://mail.python.org/mailman/listinfo/python-list
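
Piet's closing caveat about race conditions can be illustrated with a small sketch. It is not code from the thread: it uses Python 3 syntax (the thread's code is Python 2), and the helper names add_values and run_workers, the worker counts, and the reuse of the key 'test1' are assumptions made for illustration. The idea is that an update such as mydict[nam] += [...] on a manager dictionary is a read followed by a write, so two processes can overwrite each other's result unless the whole step is guarded, here with a lock obtained from the same manager.

```python
from multiprocessing import Manager, Process


def add_values(shared, lock, key, values):
    """Append each value to shared[key], holding the lock for the whole update."""
    for value in values:
        with lock:
            # Read the current list, extend it, and write it back as one guarded step.
            shared[key] = shared.get(key, []) + [value]


def run_workers(worker_count=4, per_worker=25):
    """Start several processes that all update the same key; return the final list."""
    with Manager() as mgr:
        shared = mgr.dict()
        lock = mgr.Lock()  # a lock proxy that can be handed to child processes
        procs = []
        for i in range(worker_count):
            values = range(i * per_worker, (i + 1) * per_worker)
            procs.append(
                Process(target=add_values, args=(shared, lock, "test1", values))
            )
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        # Copy the result out before the manager shuts down.
        return list(shared["test1"])


if __name__ == "__main__":
    result = run_workers()
    print(len(result), "values stored")
```

With the lock held around each read-and-write, every value from every worker should end up in the list, although the order depends on scheduling. Removing the `with lock:` line would allow updates to be lost intermittently.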
Re: multiprocessing and dictionaries
> Bjorn Meyer (BM) wrote:

>BM> Here is what I have been using as a test.
>BM> This pretty much mimics what I am trying to do.
>BM> I put both threading and multiprocessing in the example which shows
>BM> the output that I am looking for.

>BM> #!/usr/bin/env python

>BM> import threading
>BM> from multiprocessing import Manager, Process

>BM> name = ('test1','test2','test3')
>BM> data1 = ('dat1','dat2','dat3')
>BM> data2 = ('datA','datB','datC')

[snip]

>BM> def multiprocess_test(name,data1,data2, mydict):
>BM>     for nam in name:
>BM>         for num in range(0,3):
>BM>             mydict.setdefault(nam, []).append(data1[num])
>BM>             mydict.setdefault(nam, []).append(data2[num])
>BM>     print 'Multiprocess test dic:',mydict

I guess what's happening is this:

d.setdefault(nam, []) returns a list, initially an empty list ([]). This
list gets appended to. However, this list is a local list in the
multiprocess_test process, therefore the result is not reflected in the
original list inside the manager. Therefore all your updates get lost.
You will have to do operations directly on the dictionary itself, not on
any intermediary objects. Of course with the threading the situation is
different as all operations are local.

This works:

def multiprocess_test(name,data1,data2, mydict):
    print name, data1, data2
    for nam in name:
        for num in range(0,3):
            mydict.setdefault(nam, [])
            mydict[nam] += [data1[num]]
            mydict[nam] += [data2[num]]
    print 'Multiprocess test dic:',mydict

If you have more than one process operating on the dictionary
simultaneously you have to beware of race conditions!!
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
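
The difference Piet describes can be reproduced with a short, self-contained sketch. It is a Python 3 adaptation of the test data from the thread, not the original script; the function names append_to_copy, assign_through_proxy and run_in_child are made up for illustration. The first worker appends to the list returned by setdefault(), which is a copy living in the calling process. The second reads the list, extends it, and assigns it back through the dictionary proxy.

```python
from multiprocessing import Manager, Process

NAMES = ("test1", "test2", "test3")
DATA1 = ("dat1", "dat2", "dat3")
DATA2 = ("datA", "datB", "datC")


def append_to_copy(names, data1, data2, shared):
    """Original pattern: append() changes a local copy, so the update is lost."""
    for nam in names:
        for first, second in zip(data1, data2):
            shared.setdefault(nam, []).append(first)
            shared.setdefault(nam, []).append(second)


def assign_through_proxy(names, data1, data2, shared):
    """Pattern from the reply: extend the list and store it back via the proxy."""
    for nam in names:
        for first, second in zip(data1, data2):
            shared.setdefault(nam, [])
            shared[nam] += [first]   # __getitem__, extend, then __setitem__
            shared[nam] += [second]


def run_in_child(worker):
    """Run worker in a child process and return the shared dict as a plain dict."""
    with Manager() as mgr:
        shared = mgr.dict()
        proc = Process(target=worker, args=(NAMES, DATA1, DATA2, shared))
        proc.start()
        proc.join()
        # Copy the contents out before the manager shuts down.
        return dict(shared)


if __name__ == "__main__":
    print("append on copy: ", run_in_child(append_to_copy))
    print("assign to proxy:", run_in_child(assign_through_proxy))
```

The first call should show every key mapped to an empty list, which matches the symptom reported in the thread (the key gets set, but the values are not appended). The second should show all six values under each key.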
Re: multiprocessing and dictionaries
On Monday 13 July 2009 01:56:08 Piet van Oostrum wrote:
> > Bjorn Meyer (BM) wrote:
> >
> >BM> I am trying to convert a piece of code that I am using the thread
> >BM> module with to the multiprocessing module.
> >
> >BM> The way that I have it set up is a chunk of code reads a text file and
> >BM> assigns a dictionary key multiple values from the text file. I am
> >BM> using locks to write the values to the dictionary.
> >BM> The way that the values are written is as follows:
> >BM>     mydict.setdefault(key, []).append(value)
> >
> >BM> The problem that I have run into is that using multiprocessing, the
> >BM> key gets set, but the values don't get appended.
> >BM> I've even tried the Manager().dict() option, but it doesn't seem to
> >BM> work.
> >
> >BM> Is this not supported at this time or am I missing something?
>
> I think you should give more information. Try to make a *minimal* program
> that shows the problem and include it in your posting or supply a
> download link.
> --
> Piet van Oostrum
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p...@vanoostrum.org

Here is what I have been using as a test.
This pretty much mimics what I am trying to do.
I put both threading and multiprocessing in the example which shows the
output that I am looking for.

#!/usr/bin/env python

import threading
from multiprocessing import Manager, Process

name = ('test1','test2','test3')
data1 = ('dat1','dat2','dat3')
data2 = ('datA','datB','datC')

def thread_test(name,data1,data2, d):
    for nam in name:
        for num in range(0,3):
            d.setdefault(nam, []).append(data1[num])
            d.setdefault(nam, []).append(data2[num])
    print 'Thread test dict:',d

def multiprocess_test(name,data1,data2, mydict):
    for nam in name:
        for num in range(0,3):
            mydict.setdefault(nam, []).append(data1[num])
            mydict.setdefault(nam, []).append(data2[num])
    print 'Multiprocess test dic:',mydict

if __name__ == '__main__':
    mgr = Manager()
    md = mgr.dict()
    d = {}

    m = Process(target=multiprocess_test, args=(name,data1,data2,md))
    m.start()
    t = threading.Thread(target=thread_test, args=(name,data1,data2,d))
    t.start()

    m.join()
    t.join()

    print 'Thread test:',d
    print 'Multiprocess test:',md

Thanks
Bjorn
Re: multiprocessing and dictionaries
> Bjorn Meyer (BM) wrote:

>BM> I am trying to convert a piece of code that I am using the thread module with
>BM> to the multiprocessing module.

>BM> The way that I have it set up is a chunk of code reads a text file and assigns
>BM> a dictionary key multiple values from the text file. I am using locks to write
>BM> the values to the dictionary.
>BM> The way that the values are written is as follows:
>BM>     mydict.setdefault(key, []).append(value)

>BM> The problem that I have run into is that using multiprocessing, the key gets
>BM> set, but the values don't get appended.
>BM> I've even tried the Manager().dict() option, but it doesn't seem to work.

>BM> Is this not supported at this time or am I missing something?

I think you should give more information. Try to make a *minimal* program
that shows the problem and include it in your posting or supply a
download link.
--
Piet van Oostrum
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org
Re: multiprocessing and dictionaries
On Sun, Jul 12, 2009 at 10:16 AM, Bjorn Meyer wrote:
> I am trying to convert a piece of code that I am using the thread module with
> to the multiprocessing module.
>
> The way that I have it set up is a chunk of code reads a text file and assigns
> a dictionary key multiple values from the text file. I am using locks to write
> the values to the dictionary.
> The way that the values are written is as follows:
>     mydict.setdefault(key, []).append(value)
>
> The problem that I have run into is that using multiprocessing, the key gets
> set, but the values don't get appended.

Don't have much concurrency experience, but have you tried using a
defaultdict instead? It's possible its implementation might solve the
problem.
http://docs.python.org/dev/library/collections.html#collections.defaultdict

Cheers,
Chris
--
http://blog.rebertia.com
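
For reference, here is one way a defaultdict could fit into this problem, written as a Python 3 sketch under stated assumptions rather than as anything proposed in the thread. A collections.defaultdict is an ordinary in-process object, so on its own it is not shared between processes. What it can do is replace the setdefault() call while a worker collects values locally. In this sketch the worker then publishes each finished list to the manager dictionary with a single assignment. The names collect_locally and run_collect are hypothetical.

```python
from collections import defaultdict
from multiprocessing import Manager, Process

NAMES = ("test1", "test2", "test3")
DATA1 = ("dat1", "dat2", "dat3")
DATA2 = ("datA", "datB", "datC")


def collect_locally(names, data1, data2, shared):
    """Group values in a local defaultdict, then publish each finished list once."""
    local = defaultdict(list)
    for nam in names:
        for first, second in zip(data1, data2):
            # Ordinary in-process lists, so append() behaves as expected here.
            local[nam].append(first)
            local[nam].append(second)
    for nam, values in local.items():
        # One assignment per key goes through the manager proxy.
        shared[nam] = values


def run_collect():
    """Run the worker in a child process and return the shared dict's contents."""
    with Manager() as mgr:
        shared = mgr.dict()
        proc = Process(target=collect_locally, args=(NAMES, DATA1, DATA2, shared))
        proc.start()
        proc.join()
        # Copy the contents out before the manager shuts down.
        return dict(shared)


if __name__ == "__main__":
    print(run_collect())
```

This keeps the number of round trips to the manager low, but if several workers write the same key, each assignment replaces the previous list, so the workers would need distinct keys or a merge step.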