some of Deflate encoded pages not fetched
-----------------------------------------

                 Key: NUTCH-1270
                 URL: https://issues.apache.org/jira/browse/NUTCH-1270
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 1.4
         Environment: software
            Reporter: behnam nikbakht

There is a problem with some web pages: they are fetched, but their content cannot be retrieved. After the change below, the error is fixed.

We changed lib-http/src/java/org/apache/nutch/protocol/http/api/HttpBase.java:

  public byte[] processDeflateEncoded(byte[] compressed, URL url) throws IOException {

    if (LOGGER.isTraceEnabled()) { LOGGER.trace("inflating...."); }

    byte[] content = DeflateUtils.inflateBestEffort(compressed, getMaxContent());
+   if(content==null)
+     content = DeflateUtils.inflateBestEffort(compressed, 200000);

    if (content == null)
      throw new IOException("inflateBestEffort returned null");

    if (LOGGER.isTraceEnabled()) {
      LOGGER.trace("fetched " + compressed.length +
                   " bytes of compressed content (expanded to " +
                   content.length + " bytes) from " + url);
    }
    return content;
  }
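
The retry in the patch can be tried in isolation with the following minimal sketch. It is not Nutch code. The class DeflateRetrySketch, its inflateBestEffort stand-in (built on java.util.zip.Inflater), the rawDeflate helper and the inflateWithFallback method are all hypothetical names introduced for illustration. In particular, the rule that the stand-in returns null for a non-positive limit or when nothing can be inflated is an assumption made so that the fallback path can be exercised. It says nothing about when the real DeflateUtils.inflateBestEffort returns null.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateRetrySketch {

    // Hypothetical stand-in for DeflateUtils.inflateBestEffort (NOT the Nutch code).
    // Returns at most sizeLimit inflated bytes, or null if nothing could be inflated.
    static byte[] inflateBestEffort(byte[] compressed, int sizeLimit) {
        if (sizeLimit <= 0) {
            return null; // assumption: a non-positive limit yields no content
        }
        Inflater inflater = new Inflater(true); // true = raw deflate stream, no zlib header
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished() && out.size() < sizeLimit) {
                int n = inflater.inflate(buf, 0, Math.min(buf.length, sizeLimit - out.size()));
                if (n == 0) {
                    break; // no more output possible: stop with what we have
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            // best effort: keep whatever was inflated before the error
        } finally {
            inflater.end();
        }
        return out.size() == 0 ? null : out.toByteArray();
    }

    // Mirrors the shape of the patched method: try the primary limit first,
    // retry once with the fallback limit, and only then give up.
    static byte[] inflateWithFallback(byte[] compressed, int primaryLimit, int fallbackLimit)
            throws IOException {
        byte[] content = inflateBestEffort(compressed, primaryLimit);
        if (content == null) {
            content = inflateBestEffort(compressed, fallbackLimit);
        }
        if (content == null) {
            throw new IOException("inflateBestEffort returned null");
        }
        return content;
    }

    // Demo helper: produce a raw deflate stream to feed the methods above.
    static byte[] rawDeflate(byte[] data) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder page = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            page.append("some page body ");
        }
        byte[] original = page.toString().getBytes(StandardCharsets.UTF_8);
        byte[] compressed = rawDeflate(original);

        // A primary limit of -1 makes the stand-in return null, so the
        // 200000-byte fallback from the patch is used instead.
        byte[] content = inflateWithFallback(compressed, -1, 200000);
        System.out.println("fetched " + compressed.length
                + " bytes of compressed content (expanded to " + content.length + " bytes)");
    }
}
```

The limits are passed as parameters here only to make the sketch easy to test. In the patch, the primary limit comes from getMaxContent() and the fallback is the literal 200000.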