Re: [melbourne-pug] Joblib question

2018-03-09 Thread Alejandro Dubrovsky
delayed is a decorator, so it takes a function or a method. You are passing it a generator instead.

    def make_links(self):
        Parallel(n_jobs=-2)(
            delayed(scrape_db)(self, create_useful_link(self, Link, db), db)
            for db in databases
        )

should work, but it will only parallelise over the scrape_db ca
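
To illustrate the point about delayed taking a function rather than a generator, here is a minimal sketch of the pattern using only the standard library. The delayed and run_serially functions below are simplified stand-ins written for this note, not joblib's own code, and scrape_db and databases are hypothetical placeholders. The idea shown is that delayed(f)(args) records a call without running it, and the runner consumes a generator of those recorded calls.

```python
import functools


def delayed(function):
    """Simplified stand-in for joblib.delayed: capture a call without running it."""
    @functools.wraps(function)
    def capture(*args, **kwargs):
        # Return the pieces needed to make the call later.
        return function, args, kwargs
    return capture


def run_serially(tasks):
    """Stand-in for Parallel(...)(tasks): runs the captured calls one by one."""
    return [function(*args, **kwargs) for function, args, kwargs in tasks]


def scrape_db(db):
    """Hypothetical placeholder for the real scraping function."""
    return f"scraped {db}"


databases = ["db_a", "db_b", "db_c"]  # hypothetical names

# delayed wraps the function itself; the generator produces one captured call per db.
tasks = (delayed(scrape_db)(db) for db in databases)
results = run_serially(tasks)
```

With real joblib the runner would distribute the captured calls across workers instead of looping over them, but the shape of the call is the same: wrap the function, then call the wrapper once per item inside the generator.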

Re: [melbourne-pug] Joblib question

2018-03-09 Thread paul sorenson
Mike,

Are there unique features of joblib that you need to use? Scraping web pages is often a good candidate for asyncio-based models.

cheers

On 03/08/2018 11:41 PM, Mike Dewhirst wrote:
> https://media.readthedocs.org/pdf/joblib/latest/joblib.pdf
>
> I'm trying to make the following code run
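
As a rough sketch of what an asyncio-based model could look like, the example below runs several "requests" concurrently with asyncio.gather. It assumes a hypothetical fetch_page coroutine that sleeps instead of touching the network, and the URLs are placeholders; a real scraper would swap in an async HTTP client of its choice.

```python
import asyncio


async def fetch_page(url, delay=0.01):
    """Hypothetical stand-in for an HTTP request; sleeps instead of using the network."""
    await asyncio.sleep(delay)
    return f"content of {url}"


async def scrape_all(urls):
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fetch_page(url) for url in urls))


urls = ["https://example.com/a", "https://example.com/b"]  # placeholder URLs
pages = asyncio.run(scrape_all(urls))
```

Because each coroutine spends its time waiting, the total run time is close to the slowest single request rather than the sum of all of them.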

Re: [melbourne-pug] Joblib question

2018-03-09 Thread Mike Dewhirst
On 10/03/2018 12:33 PM, paul sorenson wrote:
> Mike, Are there unique features of joblib that you need to use?

I was seduced by "Parallel". On reading the docs a little more diligently it seems well suited to parallel computation with heavy compute-bound stuff like scientific number crunchin
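
For work that mostly waits on remote sites rather than the CPU, one standard-library alternative is a thread pool. This is a sketch of a different technique from joblib, under the assumption that each database can be scraped independently; scrape_db here is a hypothetical placeholder that sleeps to imitate network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scrape_db(db):
    """Hypothetical placeholder: sleep to imitate waiting on a remote site."""
    time.sleep(0.01)
    return f"links from {db}"


databases = ["db_a", "db_b", "db_c"]  # hypothetical names

# Threads overlap the waiting time; pool.map keeps results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(scrape_db, databases))
```

The max_workers value of 4 is arbitrary; a sensible number depends on how many simultaneous requests the target sites tolerate.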

Re: [melbourne-pug] Joblib question

2018-03-09 Thread Mike Dewhirst
On 9/03/2018 7:30 PM, Alejandro Dubrovsky wrote:
> delayed is a decorator, so it takes a function or a method. You are passing it a generator instead.
>
>     def make_links(self):
>         Parallel(n_jobs=-2)(
>             delayed(scrape_db)(self, create_useful_link(self, Link, db), db)
>             for db in databases
>         )
>
> should wor

Re: [melbourne-pug] Joblib question

2018-03-09 Thread Mike Dewhirst
I've run the process a couple of times and there doesn't seem to be an appreciable difference. Both methods take enough time to boil the kettle. I know that isn't proper testing. It might be difficult to test timing accurately when we are waiting on websites all over the world to respond. Might
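
One way to make such a comparison slightly less anecdotal is to repeat each method a few times and compare the fastest and median durations, which dampens the effect of a single slow website. The sketch below assumes a hypothetical fake_scrape function standing in for either method; time_runs is a helper written for this note.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Run func several times; return (fastest, median) wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return min(durations), statistics.median(durations)


def fake_scrape():
    """Hypothetical stand-in for one full scraping run."""
    time.sleep(0.01)


fastest, median = time_runs(fake_scrape, repeats=3)
print(f"fastest {fastest:.3f}s, median {median:.3f}s")
```

Network variability will still dominate when the runs hit live sites, so the numbers are only indicative unless the responses are cached or mocked.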