I am using HttpClient 4.3 to crawl web pages.
I start 200 threads that share a PoolingHttpClientConnectionManager with
a total maximum of 1000 connections and a per-host maximum of 5.
I give Java 2 GB of memory. One thread threw the exception below and died;
the other threads are still running.
Exception in thread "Thread-156" java.lang.OutOfMemoryError: Java heap space
    at org.apache.http.util.ByteArrayBuffer.<init>(ByteArrayBuffer.java:56)
    at org.apache.http.util.EntityUtils.toByteArray(EntityUtils.java:133)
    at com.founder.httpclientfetcher.HttpClientFetcher$3.handleResponse(HttpClientFetcher.java:221)
    at com.founder.httpclientfetcher.HttpClientFetcher$3.handleResponse(HttpClientFetcher.java:211)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:218)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:160)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:136)
    at com.founder.httpclientfetcher.HttpClientFetcher.httpGet(HttpClientFetcher.java:233)
    at com.founder.vcfetcher.CrawlWorker.getContent(CrawlWorker.java:198)
    at com.founder.vcfetcher.CrawlWorker.doWork(CrawlWorker.java:134)
    at com.founder.vcfetcher.CrawlWorker.run(CrawlWorker.java:231)
Does this mean my code has a memory leak?
My code:
public String httpGet(String url) throws Exception {
    if (!isValid)
        throw new RuntimeException("not valid now, you should init first");
    HttpGet httpget = new HttpGet(url);
    // Create a custom response handler
    ResponseHandler<String> responseHandler = new ResponseHandler<String>() {
        public String handleResponse(final HttpResponse response)
                throws ClientProtocolException, IOException {
            int status = response.getStatusLine().getStatusCode();
            if (status >= 200 && status < 300) {
                HttpEntity entity = response.getEntity();
                if (entity == null)
                    return null;
                byte[] bytes = EntityUtils.toByteArray(entity);
                String charSet = CharsetDetector.getCharset(bytes);
                return new String(bytes, charSet);
            } else {
                throw new ClientProtocolException(
                        "Unexpected response status: " + status);
            }
        }
    };
    String responseBody = client.execute(httpget, responseHandler);
    return responseBody;
}
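
For reference, the trace shows the allocation failing inside
EntityUtils.toByteArray, which buffers the entire response body in memory, so
a single very large page can take a big share of the heap on its own. Below is
a minimal sketch of a size-capped read that could stand in for that call. It
is an illustration under assumptions, not a confirmed fix: the class name
BoundedBodyReader, the method readAtMost, and the limit value are all
hypothetical, and only java.io is used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper (not part of HttpClient): reads a stream fully,
// but gives up as soon as more than maxBytes have been seen.
public final class BoundedBodyReader {

    private BoundedBodyReader() {
    }

    public static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        // Start small; the buffer grows only as data actually arrives.
        ByteArrayOutputStream out = new ByteArrayOutputStream(Math.min(maxBytes, 8192));
        byte[] chunk = new byte[4096];
        int total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            // Compare against the remaining budget to avoid int overflow.
            if (n > maxBytes - total) {
                throw new IOException("Response body exceeds " + maxBytes + " bytes");
            }
            out.write(chunk, 0, n);
            total += n;
        }
        return out.toByteArray();
    }
}
```

In the handler above, this would be called as
BoundedBodyReader.readAtMost(entity.getContent(), MAX_BODY_BYTES) in place of
EntityUtils.toByteArray(entity), where MAX_BODY_BYTES is an assumed constant
chosen so that 200 concurrent bodies fit comfortably in the heap.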