On Thu, Apr 04, 2013 at 04:21:54PM -0400, David Larochelle wrote:
> My hope is to split the engine process into two pieces that ran in
> parallel: one to query the database and another to send downloads to
> fetchers. This way it won't matter how long the db query takes as long as
> we can get URLs from the database faster than we can download them.

If this is indeed the bottleneck, then I would think that splitting the
engine into two communicating processes would solve the problem.
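
A rough sketch of that split, in Python for brevity: one side produces URLs
and the other hands them to fetchers, connected by a bounded queue. Threads
stand in for the two processes here to keep the example self-contained.
The names query_database, producer, dispatcher and run_pipeline are made up
for illustration, and the "query" is just a generator of example URLs, not
the real engine code.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the URL stream


def query_database():
    """Hypothetical stand-in for the slow db query; yields URLs to download."""
    for i in range(10):
        yield f"http://example.com/story/{i}"


def producer(pending):
    # Query side. put() blocks while the queue is full, so the query side
    # never runs far ahead of the download side.
    for url in query_database():
        pending.put(url)
    pending.put(DONE)


def dispatcher(pending, dispatched):
    # Download side. A real engine would hand each URL to a fetcher;
    # this sketch only records it.
    while True:
        url = pending.get()
        if url is DONE:
            break
        dispatched.append(url)


def run_pipeline():
    pending = queue.Queue(maxsize=4)  # small buffer between the two sides
    dispatched = []
    workers = [
        threading.Thread(target=producer, args=(pending,)),
        threading.Thread(target=dispatcher, args=(pending, dispatched)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return dispatched
```

As long as the producer keeps the queue non-empty, the dispatcher never
waits on the database, which is the property David is after.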

Using files, as I mentioned in my previous email, may be considered
"old school", but it is probably the simplest communication method.
Unfortunately, it won't work once your system scales to multiple machines.
As described, it sounds like the current system runs on a single machine,
so this may not be an immediate problem. If you are planning to scale across
machines, then I'd second the recommendation to use a message queue instead.
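
For illustration, one common way to do the file handoff (not necessarily
the exact scheme from my earlier message) is a spool directory: the query
side writes each batch to a temporary file and renames it into place, and
the download side only reads files with the finished name. The sketch below
is in Python; write_batch, read_batches, demo and the ".batch" suffix are
all hypothetical names. It relies on both sides sharing one local
filesystem, which is why it stops working across machines.

```python
import os
import tempfile


def write_batch(spool_dir, seq, urls):
    """Query side: write a batch under a temp name, then rename it into place."""
    tmp_path = os.path.join(spool_dir, f".{seq:08d}.tmp")
    final_path = os.path.join(spool_dir, f"{seq:08d}.batch")
    with open(tmp_path, "w") as handle:
        handle.write("\n".join(urls) + "\n")
    # The rename makes the batch visible only once it is fully written.
    os.replace(tmp_path, final_path)
    return final_path


def read_batches(spool_dir):
    """Download side: consume finished batches in order, deleting each one."""
    urls = []
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".batch"):
            continue  # skip temp files that are still being written
        path = os.path.join(spool_dir, name)
        with open(path) as handle:
            urls.extend(line.strip() for line in handle if line.strip())
        os.remove(path)
    return urls


def demo():
    # Both sides run in one process here only to show the round trip.
    with tempfile.TemporaryDirectory() as spool_dir:
        write_batch(spool_dir, 1, ["http://example.com/a", "http://example.com/b"])
        write_batch(spool_dir, 2, ["http://example.com/c"])
        first = read_batches(spool_dir)
        second = read_batches(spool_dir)  # nothing left after the first pass
        return first, second
```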

-Gyepi

_______________________________________________
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm
