With regard to John's patch for unlimited content length: it doesn't seem
to work with chunked HTTP streams or gzip-compressed streams. I have
created this patch so that it does. One caveat: I can't make patches
against CVS due to firewall constraints, so I had to produce this manually
with diff against the nightly build. I hope this isn't too much of a problem.
------
--- src/plugin/protocol-http/src/java/net/nutch/protocol/http/HttpResponse.java.orig 2004-07-01 08:24:52.000000000 +0100
+++ src/plugin/protocol-http/src/java/net/nutch/protocol/http/HttpResponse.java 2004-07-01 11:30:15.140625000 +0100
@@ -150,7 +150,9 @@
Http.LOG.fine("uncompressing....");
byte[] compressed = content;
- content = GZIPUtils.unzipBestEffort(compressed, Http.MAX_CONTENT);
+ int sizeLimit = Http.MAX_CONTENT < 0 ? Integer.MAX_VALUE : Http.MAX_CONTENT;
+
+ content = GZIPUtils.unzipBestEffort(compressed, sizeLimit);
if (content == null)
throw new HttpException("unzipBestEffort returned null");
@@ -237,7 +239,7 @@
break;
}
- if ( (contentBytesRead + chunkLen) > Http.MAX_CONTENT )
+ if ( Http.MAX_CONTENT >= 0 && (contentBytesRead + chunkLen) > Http.MAX_CONTENT )
chunkLen= Http.MAX_CONTENT - contentBytesRead;
// read one chunk
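
For anyone reading the patch without the source to hand, the two changes
boil down to the logic below. This is only a standalone sketch, not the
Nutch code: the class ContentLimit and its method names are made up for
illustration, and it assumes, as the patch does, that a negative
Http.MAX_CONTENT means "no limit".

```java
// Hypothetical helper illustrating the two hunks above; not part of Nutch.
final class ContentLimit {

    private ContentLimit() {}

    // Hunk 1: the size cap handed to the unzip step.
    // Assumption: a negative maxContent means "unlimited".
    static int effectiveLimit(int maxContent) {
        return maxContent < 0 ? Integer.MAX_VALUE : maxContent;
    }

    // Hunk 2: how many bytes of the next chunk may still be read.
    // Assumes contentBytesRead never exceeds a non-negative maxContent.
    static int clampChunk(int maxContent, int contentBytesRead, int chunkLen) {
        if (maxContent < 0) {
            return chunkLen; // unlimited: take the whole chunk
        }
        int remaining = maxContent - contentBytesRead;
        // Compare against the remaining budget instead of summing first,
        // which keeps the arithmetic away from int overflow.
        return chunkLen > remaining ? remaining : chunkLen;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(-1));
        System.out.println(clampChunk(100, 90, 20));
    }
}
```

The sketch compares against the remaining budget rather than computing
contentBytesRead + chunkLen as the patch does; the result is the same for
ordinary sizes.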
_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers