Regarding John's patch for unlimited content length: it doesn't seem to
work with chunked HTTP streams or gzip-compressed streams. I have created
this patch so that it does. One caveat: I can't make patches against CVS
due to firewall constraints, so I had to do this manually with diff against
the nightly build. I hope this isn't too much of a problem.

------

--- src/plugin/protocol-http/src/java/net/nutch/protocol/http/HttpResponse.java.orig    2004-07-01 08:24:52.000000000 +0100
+++ src/plugin/protocol-http/src/java/net/nutch/protocol/http/HttpResponse.java    2004-07-01 11:30:15.140625000 +0100
@@ -150,7 +150,9 @@
         Http.LOG.fine("uncompressing....");
         byte[] compressed = content;
 
-        content = GZIPUtils.unzipBestEffort(compressed, Http.MAX_CONTENT);
+        int sizeLimit = Http.MAX_CONTENT < 0 ? Integer.MAX_VALUE : Http.MAX_CONTENT;
+
+        content = GZIPUtils.unzipBestEffort(compressed, sizeLimit);
 
         if (content == null)
           throw new HttpException("unzipBestEffort returned null");
@@ -237,7 +239,7 @@
         break;
       }
 
-      if ( (contentBytesRead + chunkLen) > Http.MAX_CONTENT )
+      if ( Http.MAX_CONTENT >= 0 && (contentBytesRead + chunkLen) > Http.MAX_CONTENT )
         chunkLen= Http.MAX_CONTENT - contentBytesRead;
 
       // read one chunk
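
To show why the extra guard matters, here is a minimal standalone sketch of the two limit checks. It assumes, as the patch implies, that a negative Http.MAX_CONTENT means "no limit". The class and method names are hypothetical and are not part of Nutch; only the arithmetic is taken from the diff above.

```java
// Standalone sketch; ContentLimitSketch and its methods are hypothetical
// names used for illustration, not Nutch code.
public class ContentLimitSketch {

    // Mirrors the first hunk: a negative (assumed "unlimited") setting is
    // mapped to the largest limit an int can express.
    public static int effectiveLimit(int maxContent) {
        return maxContent < 0 ? Integer.MAX_VALUE : maxContent;
    }

    // Chunk clamp as it was before the patch. With a negative maxContent the
    // comparison is always true and the chunk length goes negative.
    public static int clampChunkOld(int maxContent, int contentBytesRead, int chunkLen) {
        if ((contentBytesRead + chunkLen) > maxContent)
            chunkLen = maxContent - contentBytesRead;
        return chunkLen;
    }

    // Chunk clamp as in the second hunk: only clamp when a limit is set.
    public static int clampChunkNew(int maxContent, int contentBytesRead, int chunkLen) {
        if (maxContent >= 0 && (contentBytesRead + chunkLen) > maxContent)
            chunkLen = maxContent - contentBytesRead;
        return chunkLen;
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(-1));                // 2147483647
        System.out.println(clampChunkOld(-1, 0, 4096));        // -1
        System.out.println(clampChunkNew(-1, 0, 4096));        // 4096
        System.out.println(clampChunkNew(65536, 65000, 4096)); // 536
    }
}
```

With the limit at -1, the old check turns a 4096-byte chunk into a length of -1, while the patched check leaves it alone and still trims the chunk when a non-negative limit is configured.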


_______________________________________________
Nutch-developers mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
