Re: Strategy in designing a scalable robot with POE::Component::Client::HTTP?

2009-11-20 Thread Rocco Caputo
Find out how many sites your system and network will let you crawl at  
once.  Limit the number of parallel jobs to that.  Read about tuning  
POE::Component::Client::HTTP, either in the documentation or in this  
mailing list's archives.
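
The "limit the number of parallel jobs" advice might look something like the
following sketch: a fixed pool of in-flight requests fed from a queue of URLs.
This is an illustration, not a recommendation — the URL list, the 'ua' alias,
and the limit of 20 are placeholder assumptions you would tune for your own
system and network.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POE qw(Component::Client::HTTP);
use HTTP::Request::Common qw(GET);

# Hypothetical URL list; substitute your 1000 predefined sites.
my @queue = map { "http://site$_.example.com/" } 1 .. 1000;

my $MAX_PARALLEL = 20;    # tune to what your system and network can handle

POE::Component::Client::HTTP->spawn(
  Alias   => 'ua',
  Timeout => 30,
);

POE::Session->create(
  inline_states => {
    _start => sub {
      # Prime the pool with at most $MAX_PARALLEL requests.
      $_[KERNEL]->yield('next_request') for 1 .. $MAX_PARALLEL;
    },
    next_request => sub {
      my $url = shift @queue or return;    # queue drained; pool shrinks
      $_[KERNEL]->post( ua => request => got_response => GET $url );
    },
    got_response => sub {
      my ( $request_packet, $response_packet ) = @_[ ARG0, ARG1 ];
      my $url      = $request_packet->[0]->uri;
      my $response = $response_packet->[0];
      printf "%s %s\n", $response->code, $url;
      $_[KERNEL]->yield('next_request');   # refill the pool
    },
  },
);

POE::Kernel->run;
```

Each completed response posts next_request again, so the number of
outstanding requests never exceeds the cap while the queue lasts.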


Stay under your system's limits.  Consider how performance plummets  
when a machine overcommits its memory and begins swapping.  Don't let  
that happen to you.


Use fork() with POE to take advantage of both cores.
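
A rough sketch of that fork() split, assuming the actual crawling is factored
into a crawl() routine (stubbed below) and each child process runs its own POE
kernel over its slice of the site list:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical URL list; substitute your 1000 predefined sites.
my @sites   = map { "http://site$_.example.com/" } 1 .. 1000;
my $WORKERS = 2;    # one child per core

my $chunk = ceil( @sites / $WORKERS );
for my $w ( 0 .. $WORKERS - 1 ) {
  my @slice = grep { defined } @sites[ $w * $chunk .. $w * $chunk + $chunk - 1 ];
  defined( my $pid = fork ) or die "fork failed: $!";
  if ( $pid == 0 ) {
    crawl(@slice);    # child crawls only its share of the sites
    exit 0;
  }
}
wait for 1 .. $WORKERS;    # reap the children

sub crawl {
  my @urls = @_;
  # Stub: spawn POE::Component::Client::HTTP and a session over @urls
  # here, with its own parallelism cap, then POE::Kernel->run.
}
```

Because fork() happens before any POE kernel starts, each child gets an
independent event loop and the two processes can run on separate cores.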

Are you looking for a design consultant?

--
Rocco Caputo - rcap...@pobox.com


On Nov 20, 2009, at 23:15, Ryan Chan wrote:


Hello,


Assume I only have a dual-core server with 1 GB of memory. I want to
build a web robot to crawl 1000 predefined web sites.

Can anyone provide a basic strategy for my task?

Should I create 1000 sessions at the same time to achieve the maximum
network throughput?

Thanks.



