Re: [Tutor] using multiprocessing efficiently to process large data file

2012-09-02 Thread eryksun
On Sun, Sep 2, 2012 at 2:41 AM, Alan Gauld wrote:
>
>> if __name__ == '__main__': # <-- required for Windows
>
> Why?
> What difference does that make in Windows?

It's a hack to get around the fact that Win32 doesn't fork(). Windows calls CreateProcess(), which loads a fresh interpreter; multiprocessing then re-imports the main module in the child, so the guard keeps the spawning code from running again in every child process.
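
A minimal sketch of the guard in action (illustrative, not from the original post; the function and pool size are made up):

    from multiprocessing import Pool

    def square(x):
        # trivial stand-in for the real per-item work
        return x * x

    if __name__ == '__main__':  # child processes re-import this module on Windows
        pool = Pool(processes=4)
        print pool.map(square, range(5))  # [0, 1, 4, 9, 16]
        pool.close()
        pool.join()

Without the guard, each child's re-import would create another Pool and try to spawn more children.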

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-09-01 Thread Alan Gauld
On 02/09/12 06:48, eryksun wrote:

    from multiprocessing import Pool, cpu_count
    from itertools import izip_longest, imap

    FILE_IN = '...'
    FILE_OUT = '...'

    NLINES = 100  # estimate this for a good chunk_size
    BATCH_SIZE = 8

    def func(batch):
        """ test ...
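
The snippet is truncated above; a hedged completion of the same pattern (func's body and the main block below are assumptions, not eryksun's actual code):

    from multiprocessing import Pool, cpu_count
    from itertools import izip_longest

    FILE_IN = '...'
    FILE_OUT = '...'
    NLINES = 100  # estimate this for a good chunk_size
    BATCH_SIZE = 8

    def func(batch):
        # placeholder per-batch computation; izip_longest pads the
        # final batch with None, so filter those out
        return [line.upper() for line in batch if line is not None]

    if __name__ == '__main__':
        chunk_size = max(1, NLINES // (BATCH_SIZE * cpu_count()))
        pool = Pool(cpu_count())
        with open(FILE_IN) as fin, open(FILE_OUT, 'w') as fout:
            # grouper idiom: BATCH_SIZE references to the same iterator
            # yield consecutive BATCH_SIZE-line batches
            batches = izip_longest(*[fin] * BATCH_SIZE)
            for result in pool.imap(func, batches, chunk_size):
                fout.writelines(result)
        pool.close()
        pool.join()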

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-09-01 Thread eryksun
On Sat, Sep 1, 2012 at 9:14 AM, Wayne Werner wrote:
>
> with open('inputfile') as f:
>     for line1, line2, line3, line4 in zip(f, f, f, f):
>         # do your processing here

Use itertools.izip_longest (zip_longest in 3.x) for this. Items in the final batch are set to fillvalue (defaults to None).
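
For example (a sketch; 'inputfile' is a placeholder), batching with izip_longest so a short final group is padded instead of silently dropped, as zip would do:

    from itertools import izip_longest  # zip_longest in Python 3

    with open('inputfile') as f:
        # four references to the same file iterator give tuples of four
        # consecutive lines; the last tuple is padded with None
        for batch in izip_longest(f, f, f, f, fillvalue=None):
            lines = [line for line in batch if line is not None]
            # do your processing on `lines` here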

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-09-01 Thread Wayne Werner
On Thu, 30 Aug 2012, Abhishek Pratap wrote: Hi Guys, I have a file with a few million lines. I want to process each block of 8 lines, and from my estimate my job is not IO bound. In other words, it takes a lot more time to do the computation than it would take for simply reading the file. I am wondering ...

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-08-31 Thread Prasad, Ramit
Please always respond to the list, and avoid top posting.

> -----Original Message-----
> From: Abhishek Pratap [mailto:abhishek@gmail.com]
> Sent: Thursday, August 30, 2012 5:47 PM
> To: Prasad, Ramit
> Subject: Re: [Tutor] using multiprocessing efficiently to process large data file

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-08-30 Thread Alan Gauld
On 30/08/12 23:19, Abhishek Pratap wrote: I am wondering how I can go about reading data from this file at a faster pace and then farming out the jobs to a worker function using the multiprocessing module. I can think of two ways. 1. split the file and read it in parallel (didn't work well for me), primarily ...
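
A rough sketch of the usual alternative (not necessarily Alan's exact suggestion; crunch and the file name are placeholders): keep the reading sequential, since the disk serves data serially anyway, and parallelize only the computation:

    from multiprocessing import Pool, cpu_count

    def crunch(block):
        # hypothetical worker; replace with the real computation
        return sum(len(line) for line in block)

    def blocks(path, size=8):
        # one sequential reader yielding size-line blocks
        with open(path) as f:
            block = []
            for line in f:
                block.append(line)
                if len(block) == size:
                    yield block
                    block = []
            if block:
                yield block

    if __name__ == '__main__':
        pool = Pool(cpu_count())
        # imap_unordered hands results back as workers finish
        for result in pool.imap_unordered(crunch, blocks('inputfile')):
            pass  # collect or write results here
        pool.close()
        pool.join()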

Re: [Tutor] using multiprocessing efficiently to process large data file

2012-08-30 Thread Prasad, Ramit
> I have a file with a few million lines. I want to process each block of 8
> lines, and from my estimate my job is not IO bound. In other words it
> takes a lot more time to do the computation than it would take for
> simply reading the file.
>
> I am wondering how I can go about reading data from this file at a
> faster pace ...

[Tutor] using multiprocessing efficiently to process large data file

2012-08-30 Thread Abhishek Pratap
Hi Guys, I have a file with a few million lines. I want to process each block of 8 lines, and from my estimate my job is not IO bound. In other words, it takes a lot more time to do the computation than it would take for simply reading the file. I am wondering how I can go about reading data from this file at a faster pace and then farm out the jobs to a worker function using the multiprocessing module ...
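
For reference, one minimal way to frame "each block of 8 lines" (a sketch; read_blocks is an illustrative name, not from the thread):

    from itertools import islice

    def read_blocks(f, n=8):
        # yield successive n-line blocks from file object f;
        # the last block may be shorter than n
        while True:
            block = list(islice(f, n))
            if not block:
                break
            yield block

    with open('inputfile') as f:
        for block in read_blocks(f):
            pass  # hand `block` to a worker here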