Re: OOM problem

2014-02-10 Thread Ken Krugler
If you're crawling web pages, you need to have a limit to the amount of data any page returns. Otherwise you'll eventually run into a site that returns an unbounded amount of data, which will kill your JVM. See SimpleHttpFetcher in Bixo for an example of one way to do this type of limiting (th
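
As a rough illustration of the limiting described above, here is a minimal sketch that caps how many bytes are read from a response stream. It is not Bixo's SimpleHttpFetcher code. The class name `BoundedBodyReader`, the method `readAtMost`, and the 64 KB cap are all hypothetical, chosen only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BoundedBodyReader {

    // Hypothetical helper: copies at most maxBytes from the stream and then stops,
    // so a response that never ends cannot grow the buffer without bound.
    public static byte[] readAtMost(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never ask for more than the remaining budget.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the cap
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real crawler would pass the entity's content stream.
        byte[] page = new byte[100_000];
        byte[] truncated = readAtMost(new ByteArrayInputStream(page), 64 * 1024);
        System.out.println("kept " + truncated.length + " of " + page.length + " bytes");
    }
}
```

In a crawler the caller would also need to abort or close the connection once the cap is hit, since the rest of the body is left unread.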

Re: OOM problem

2014-02-10 Thread Li Li
jmap result:
Debugger attached successfully.
Server compiler detected.
JVM version is 22.1-b02
using thread-local object allocation.
Parallel GC with 4 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 2147483648 (2048.0MB)
NewSize =

OOM problem

2014-02-10 Thread Li Li
I am using HttpClient 4.3 to crawl web pages. I start 200 threads and a PoolingHttpClientConnectionManager with totalMax 1000 and perHostMax 5. I give Java 2GB of memory, and one thread throws an exception (the others are still running; this thread is dead): Exception in thread "Thread-156" java.lang.OutOfMemoryErr

Using basic auth produces warnings about NTLM and NEGOTIATE errors.

2014-02-10 Thread Brett Ryan
If a server supports NTLM and Kerberos authentication but I only provide basic credentials when setting up the client, I get a log message for each of the NTLM and NEGOTIATE authentication schemes. Taking the example from: https://hc.apache.org/httpcomponents-client-4.3.x/httpclient/examples/org/apach

writeTo called twice on custom InputStreamEntity class

2014-02-10 Thread Sachin Shetty
Hi, I have a custom entity stream class as below: public class GcsCustomInputStreamEntity extends InputStreamEntity { InputStream sourceStream = null; public GcsCustomInputStreamEntity(InputStream sourceStream) { this.sourceStream = sourceStream; }