Re: Multiprocessing takes higher execution time
Sibtey Mehdi sibt...@infotechsw.com wrote:
> I use multiprocessing to compare more than one set of files. To
> compare each set of files (i.e. old file1 vs. new file1) I create a
> process:
>
>     Process(target=compare, args=(oldFile, newFile)).start()
>
> It takes 61 seconds of execution time. When I do the same comparison
> without multiprocessing, it takes 52 seconds.

> oldProjects and newProjects contain zip files (e.g. oldxyz1.zip,
> oldxyz2.zip, newxyz1.zip, newxyz2.zip). The program unzips both zip
> files, compares all the files between old and new (mdb files or txt
> files), and gives the result. I do this comparison for n sets of zip
> files, and I assign each set's comparison to its own process.

I had a brief look at the code and your use of multiprocessing looks fine.

How many projects are you processing at once? And how many MB of zip files is it? As reading zip files does lots of disk IO, I would guess it is disk limited rather than anything else, which explains why doing many at once is actually slower (the disk has to do more seeks).

-- 
Nick Craig-Wood n...@craig-wood.com -- http://www.craig-wood.com/nick
-- 
http://mail.python.org/mailman/listinfo/python-list
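One way to test the disk-limited guess is to compare CPU time with wall-clock time for a single comparison. The sketch below is an illustration only, not code from this thread: `cpu_fraction` is a hypothetical helper, `busy` and `waiting` are stand-in workloads, and it assumes a current Python 3 (`time.perf_counter` and `time.process_time`). A fraction near 1 suggests CPU-bound work that extra processes might speed up; a fraction near 0 suggests the job mostly waits, for example on the disk.

```python
import time


def cpu_fraction(job, *args, **kwargs):
    """Run job once and return CPU time divided by wall-clock time.

    Hypothetical helper for illustration. It measures only the calling
    process, not any child processes the job may start.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    job(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # Guard against a zero-length interval on very fast jobs.
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy(n=2000000):
    """Stand-in for CPU-bound work: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def waiting(seconds=0.2):
    """Stand-in for work that mostly waits, as a disk-bound job does."""
    time.sleep(seconds)
```

To apply it to the case in this thread, something like `cpu_fraction(compare, oldFile, newFile)` could be run in the sequential version, where `compare` is the target function named in the original post.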
Re: Multiprocessing takes higher execution time
On Thu, Jan 8, 2009 at 7:31 PM, Nick Craig-Wood n...@craig-wood.com wrote:
> (...)
> How many projects are you processing at once? And how many MB of zip
> files is it? As reading zip files does lots of disk IO, I would guess
> it is disk limited rather than anything else, which explains why doing
> many at once is actually slower (the disk has to do more seeks).

If this is the case, this problem is not well suited to multiprocessing but rather to distributed processing :)

--JamesMills
RE: Multiprocessing takes higher execution time
Thanks Nick.

It processes 10-15 projects at once (i.e. 10-15 processes are started). Each zip file is 2-3 MB.

When I ran it on a dual-core system, the execution time dropped from 61 seconds to 55 seconds. The dual-core system's configuration is:

    Pentium(R) D CPU 3.00GHz, 2.99GHz
    1 GB RAM

Regards,
Gopal

-----Original Message-----
From: Nick Craig-Wood [mailto:n...@craig-wood.com]
Sent: Thursday, January 08, 2009 3:01 PM
To: python-list@python.org
Subject: Re: Multiprocessing takes higher execution time

Sibtey Mehdi sibt...@infotechsw.com wrote:
> I use multiprocessing to compare more than one set of files. To
> compare each set of files (i.e. old file1 vs. new file1) I create a
> process:
>
>     Process(target=compare, args=(oldFile, newFile)).start()
>
> It takes 61 seconds of execution time. When I do the same comparison
> without multiprocessing, it takes 52 seconds.

> oldProjects and newProjects contain zip files (e.g. oldxyz1.zip,
> oldxyz2.zip, newxyz1.zip, newxyz2.zip). The program unzips both zip
> files, compares all the files between old and new (mdb files or txt
> files), and gives the result. I do this comparison for n sets of zip
> files, and I assign each set's comparison to its own process.

I had a brief look at the code and your use of multiprocessing looks fine.

How many projects are you processing at once? And how many MB of zip files is it? As reading zip files does lots of disk IO, I would guess it is disk limited rather than anything else, which explains why doing many at once is actually slower (the disk has to do more seeks).

-- 
Nick Craig-Wood n...@craig-wood.com -- http://www.craig-wood.com/nick
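With 10-15 processes competing for two cores and one disk, capping the number of workers is a cheap experiment. A minimal sketch, assuming Python 3's `multiprocessing.Pool`; `worker_count`, `compare_pair` and `compare_all` are illustrative names, and `compare_pair` is only a stand-in for the real comparison, not part of the posted code.

```python
import multiprocessing


def worker_count(n_jobs, n_cpus=None):
    """Pick a worker count no larger than the job count or the CPU count."""
    if n_cpus is None:
        n_cpus = multiprocessing.cpu_count()
    return max(1, min(n_jobs, n_cpus))


def compare_pair(pair):
    """Hypothetical stand-in for comparing one (old, new) project pair."""
    old_path, new_path = pair
    return old_path, new_path, old_path == new_path


def compare_all(pairs, n_cpus=None):
    """Compare all pairs with a bounded pool instead of one process each."""
    pairs = list(pairs)
    if not pairs:
        return []
    workers = worker_count(len(pairs), n_cpus)
    with multiprocessing.Pool(processes=workers) as pool:
        # imap_unordered yields each result as soon as a worker finishes it.
        return list(pool.imap_unordered(compare_pair, pairs))
```

`compare_all` should be called from under an `if __name__ == '__main__':` guard, which multiprocessing needs on Windows. If a run with one worker is no slower than a run with two, that points at the disk rather than the CPU.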
Re: Multiprocessing takes higher execution time
Sibtey Mehdi wrote:
> Hi,
>
> I use multiprocessing to compare more than one set of files. To
> compare each set of files (i.e. old file1 vs. new file1) I create a
> process:
>
>     Process(target=compare, args=(oldFile, newFile)).start()
>
> It takes 61 seconds of execution time. When I do the same comparison
> without multiprocessing, it takes 52 seconds. The parallel processing
> time should be less, but I am not able to get any advantage from
> multiprocessing here. Any suggestions would be very helpful.

My first suggestion would be: show us some code. We aren't psychic, you know.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/
Re: Multiprocessing takes higher execution time
On 2009-01-07, Steve Holden st...@holdenweb.com wrote:
>> I use multiprocessing to compare more than one set of files. To
>> compare each set of files (i.e. old file1 vs. new file1) I create a
>> process:
>>
>>     Process(target=compare, args=(oldFile, newFile)).start()
>>
>> It takes 61 seconds of execution time. When I do the same comparison
>> without multiprocessing, it takes 52 seconds.
>
> My first suggestion would be: show us some code. We aren't psychic,
> you know.

I am!

He's only got one processor, and he's just been bit by Amdahl's law when P<1 and S<1.

There you have a perfectly psychic answer: an educated guess camouflaged in plausible-sounding but mostly-bullshit buzzwords. A better psychic would have avoided making that one falsifiable statement (he's only got one processor).

-- 
Grant Edwards (grante at visi.com)
Yow! Hello. Just walk along and try NOT to think about your INTESTINES being almost FORTY YARDS LONG!!
Re: Multiprocessing takes higher execution time
Grant Edwards inva...@invalid wrote:
> On 2009-01-07, Steve Holden st...@holdenweb.com wrote:
>>> I use multiprocessing to compare more than one set of files. To
>>> compare each set of files (i.e. old file1 vs. new file1) I create a
>>> process:
>>>
>>>     Process(target=compare, args=(oldFile, newFile)).start()
>>>
>>> It takes 61 seconds of execution time. When I do the same comparison
>>> without multiprocessing, it takes 52 seconds.
>>
>> My first suggestion would be: show us some code. We aren't psychic,
>> you know.
>
> I am!
>
> He's only got one processor, and he's just been bit by Amdahl's law
> when P<1 and S<1.
>
> There you have a perfectly psychic answer: an educated guess
> camouflaged in plausible-sounding but mostly-bullshit buzzwords. A
> better psychic would have avoided making that one falsifiable
> statement (he's only got one processor).

;-)

My guess would be that the job is IO bound rather than CPU bound, but that is covered by Amdahl's Law too, where P is approximately 0 and N is irrelevant...

Being IO bound explains why it takes longer with multiprocessing: running an IO-bound algorithm in parallel causes more disk seeks than running it sequentially.

-- 
Nick Craig-Wood n...@craig-wood.com -- http://www.craig-wood.com/nick
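For reference, Amdahl's law gives the theoretical speedup as S = 1 / ((1 - P) + P / N), where P is the fraction of the work that can run in parallel and N is the number of processors. The short sketch below (the function name is illustrative) shows the point made above: when P is close to 0, N stops mattering. The formula ignores process start-up cost and extra disk seeks, which is how a real run can end up slower than the sequential one.

```python
def amdahl_speedup(p, n):
    """Theoretical speedup for parallel fraction p on n processors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - p) + p / n)


for n in (1, 2, 15):
    # First column: p = 0 (nothing parallelizable), second: p = 0.9.
    print(n, amdahl_speedup(0.0, n), round(amdahl_speedup(0.9, n), 2))
```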
RE: Multiprocessing takes higher execution time
Hello,

Please see the code I have sent in the attachment. Any suggestions will be highly appreciated.

Thanks and Regards,
Gopal

-----Original Message-----
From: Grant Edwards [mailto:inva...@invalid]
Sent: Wednesday, January 07, 2009 8:58 PM
To: python-list@python.org
Subject: Re: Multiprocessing takes higher execution time

On 2009-01-07, Steve Holden st...@holdenweb.com wrote:
>> I use multiprocessing to compare more than one set of files. To
>> compare each set of files (i.e. old file1 vs. new file1) I create a
>> process:
>>
>>     Process(target=compare, args=(oldFile, newFile)).start()
>>
>> It takes 61 seconds of execution time. When I do the same comparison
>> without multiprocessing, it takes 52 seconds.
>
> My first suggestion would be: show us some code. We aren't psychic,
> you know.

I am!

He's only got one processor, and he's just been bit by Amdahl's law when P<1 and S<1.

There you have a perfectly psychic answer: an educated guess camouflaged in plausible-sounding but mostly-bullshit buzzwords. A better psychic would have avoided making that one falsifiable statement (he's only got one processor).

-- 
Grant Edwards (grante at visi.com)
Yow! Hello. Just walk along and try NOT to think about your INTESTINES being almost FORTY YARDS LONG!!

oldProjects and newProjects contain zip files (e.g. oldxyz1.zip, oldxyz2.zip, newxyz1.zip, newxyz2.zip). The program unzips both zip files, compares all the files between old and new (mdb files or txt files), and gives the result. I do this comparison for n sets of zip files, and I assign each set's comparison to its own process.

import os
import time
from multiprocessing import Process, Queue


class CompareProjects(dict):
    """Compares the set of projects (zip files)."""

    def __init__(self, oldProjects, newProjects, ignoreOidFields, tempDir):
        self.oldProjects = oldProjects
        self.newProjects = newProjects

    def _compare(self, tempDir, ignoreOidFields):
        """Compares each project."""
        projects = set(self.oldProjects.keys()).union(set(self.newProjects.keys()))
        # `progress` is not defined in the posted excerpt.
        progress.totalProjects = len(projects)
        progress.progress = 0
        que = Queue()
        for count, project in enumerate(projects):
            # dict.get() already returns None for a missing key.
            oldProject = self.oldProjects.get(project)
            newProject = self.newProjects.get(project)
            # NOTE: os.path.basename() raises if either value is None,
            # i.e. if a project exists on only one side.
            prj = '_'.join((os.path.basename(oldProject)[:-4],
                            os.path.basename(newProject)[:-4]))
            cmpProj = CompareProject(oldProject, newProject, ignoreOidFields, tempDir)
            # One process is started per project pair.
            p = Process(target=cmpProj._compare,
                        args=(os.path.join(tempDir, prj), ignoreOidFields,
                              False, project, que))
            p.start()
            print('pid %s' % p.pid)
        # Poll the queue until every project has reported a result.
        while progress.totalProjects != len(self):
            if not que.empty():
                proj, cmpCitect = que.get_nowait()  # get()
                self[proj] = cmpCitect
                progress.progress += 1
            else:
                time.sleep(0.001)


class CompareProject(object):
    """Compares two projects."""

    def __init__(self, oldProject, newProject, ignoreOidFields, tempDir,
                 unitCompare=False):
        self.oldProject = oldProject
        self.newProject = newProject

    def _compare(self, tempDir, ignoreOidFields, unitCompare, project=None, que=None):
        """Compares the extracted .mdb files and txt files."""
        oldProjectDir = os.path.join(tempDir, 'oldProject')
        newProjectDir = os.path.join(tempDir, 'newProject')
        # get .mdb files and txt files from the project
        # (getFiles is not shown in the posted excerpt)
        oldmdbFiles, oldTxtFiles = self.getFiles(self.oldProject, oldProjectDir)
        newmdbFiles, newTxtFiles = self.getFiles(self.newProject, newProjectDir)
        # start comparing mdb files and txt files
        self.comparedTables = ComparedTables(oldmdbFiles, newmdbFiles,
                                             ignoreOidFields, tempDir)
        self.comparedTxtFiles = ComparedTextFiles(oldTxtFiles, newTxtFiles, tempDir)
        if que and project:
            que.put_nowait((project, self))


class ComparedTables(dict):
    """Compares two mdb files, table by table, and gives the results."""


class ComparedTextFiles(dict):
    """Compares two txt files line by line and gives the diff results."""

# ..
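
The collection loop in `CompareProjects._compare` polls the queue and sleeps for a millisecond between checks. A blocking `get()` does the same job without spinning. The sketch below is one possible shape, not the poster's code: `collect_results` is a hypothetical helper, the demo data is made up, and it assumes each worker puts exactly one `(project, result)` tuple on the queue.

```python
import queue


def collect_results(que, expected, timeout=None):
    """Block until `expected` (project, result) pairs have arrived.

    Works with multiprocessing.Queue or queue.Queue, since both offer
    get(block, timeout). Raises queue.Empty if a single result takes
    longer than `timeout` seconds (None waits indefinitely).
    """
    results = {}
    for _ in range(expected):
        project, result = que.get(True, timeout)
        results[project] = result
    return results


# Demo with an in-process queue and made-up results.
demo = queue.Queue()
demo.put(("projA", "no differences"))
demo.put(("projB", "3 tables changed"))
print(collect_results(demo, 2))
```

If the worker processes are later joined, the queue should be drained first; the multiprocessing documentation warns that joining a process which has put items on a queue before those items are consumed can deadlock.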