I am using 500 threads on a 4-CPU virtual machine. Its load average is about 7 and the context-switch count (as reported by vmstat) is above 4,000, so I want to give the async client a try. Can anyone help me? I don't know how to use the async client: it only returns a Future, which I am not familiar with. What I want is a class that takes URLs from a queue, fetches their content, and puts the results on another queue.
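The class described above can be sketched with JDK classes alone. This is only a rough sketch under stated assumptions: AsyncCrawlPump, pump and fetcher are hypothetical names, not part of HttpAsyncClient, and fetcher stands in for whatever asynchronous call hands back a future (with HttpAsyncClient that would presumably be a small adapter that completes a CompletableFuture from the FutureCallback passed to execute). A Semaphore caps the number of requests in flight at maxConcurrent, and a failed fetch is reported with a null html entry.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

/** Hypothetical pump: URLs in, {url, html} pairs out, at most maxConcurrent in flight. */
class AsyncCrawlPump {
    private final BlockingQueue<String> needCrawlQueue;
    private final BlockingQueue<String[]> htmlQueue;
    // Assumption: any non-blocking fetch that returns a future can be plugged in here.
    private final Function<String, CompletableFuture<String>> fetcher;
    private final Semaphore permits;

    AsyncCrawlPump(BlockingQueue<String> needCrawlQueue,
                   BlockingQueue<String[]> htmlQueue,
                   Function<String, CompletableFuture<String>> fetcher,
                   int maxConcurrent) {
        this.needCrawlQueue = needCrawlQueue;
        this.htmlQueue = htmlQueue;
        this.fetcher = fetcher;
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Dispatches exactly count URLs; blocks while the in-flight limit is reached. */
    void pump(int count) throws InterruptedException {
        for (int i = 0; i < count; i++) {
            permits.acquire();                   // wait for a free slot
            String url = needCrawlQueue.take();  // wait for the next URL
            CompletableFuture<String> future;
            try {
                future = fetcher.apply(url);
            } catch (RuntimeException e) {
                // Treat a synchronous failure like an asynchronous one.
                future = new CompletableFuture<>();
                future.completeExceptionally(e);
            }
            future.whenComplete((html, error) -> {
                // Assumes htmlQueue is unbounded, so offer() never rejects the entry
                // and the completing thread is never blocked.
                htmlQueue.offer(new String[] {url, error == null ? html : null});
                permits.release();               // free the slot for the next URL
            });
        }
    }
}
```

A single dispatcher thread would call pump in a loop. The callbacks run on whichever thread completes the future, so they only enqueue the result and release a permit.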
On Sat, Jul 19, 2014 at 7:32 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
> On Sat, 2014-07-19 at 10:43 +0800, Li Li wrote:
>> I think thread context switches will use a lot of CPU resources.
>
> You would be amazed at how efficient at context switching modern JREs
> have become. The use of NIO starts paying off at approximately a
> thousand concurrent connections or more.
>
>> If I can
>> use an async method (maybe it uses Java NIO, which is epoll on Linux; if it uses
>> NIO.2, it's AIO), it will be more performant.
>> I have hundreds (even thousands) of threads running. Some websites are slow
>> and take half a minute to return a single webpage.
>> I am now using 500 threads and they only use 200KB/s of bandwidth. If I add
>> more threads, they will use more memory (stack) and CPU.
>>
>
> There is no guarantee you would get a better channel saturation with
> NIO. By all means do try out HttpAsyncClient in place of blocking
> HttpClient and let us know the results.
>
> Oleg
>
>
>> On Fri, Jul 18, 2014 at 10:02 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
>> > On Fri, 2014-07-18 at 18:16 +0800, Li Li wrote:
>> >> hi all,
>> >> I used to use HttpComponents Client to crawl webpages. I need to
>> >> improve it by using the async client. What I want is something like:
>> >> Queue<URL> needCrawlQueue;
>> >> Queue<String[]> htmlQueue;
>> >>
>> >> HttpAsyncClient client;
>> >> int maxConcurrent=500;
>> >>
>> >> //if finished a url, then get notified and call back this code
>> >> if(client.currentCrawlingCount<maxConcurrent){
>> >>     URL url=needCrawlQueue.take();
>> >>     //request this url
>> >> }
>> >>
>> >> //if finished a url, then get notified and call back this code
>> >> //String url;String html is call back arguments
>> >> htmlQueue.put(new String[]{url, html});
>> >>
>> >> I mean I have an async client class which takes two queues.
>> >> If the number of unfinished URLs is less than maxConcurrent, it takes a
>> >> URL from one queue and requests it. When a URL succeeds (or fails),
>> >> it adds the result to the other queue.
>> >>
>> >
>> > Why do you think the use of an async client would necessarily be an
>> > improvement? What is it exactly you want to improve? Generally a decent
>> > blocking client with a moderate number of threads is likely to be faster
>> > than an async one.
>> >
>> > Oleg
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: httpclient-users-unsubscr...@hc.apache.org
>> > For additional commands, e-mail: httpclient-users-h...@hc.apache.org