Tim Allison created NUTCH-3001:
----------------------------------

             Summary: protocol-selenium requires Content-Type header
                 Key: NUTCH-3001
                 URL: https://issues.apache.org/jira/browse/NUTCH-3001
             Project: Nutch
          Issue Type: Bug
            Reporter: Tim Allison

It looks like the selenium protocol requires a Content-Type header to be present. The logic seems to be: if the content type is HTML or XHTML, use Selenium; otherwise just grab the bytes. But if the Content-Type is null, nothing is pulled at all.

My guess is that the logic should be: if the content type is not null and is HTML or XHTML, use Selenium; otherwise grab the bytes. Right?

{noformat}
String contentType = getHeader(Response.CONTENT_TYPE);

// handle with Selenium only if content type in HTML or XHTML
if (contentType != null) {
{noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
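
The behavior proposed in the description can be sketched as a small standalone decision function. This is only an illustration, not the plugin's code: the class `ContentTypeRouting`, the enum `FetchMode`, and the method `chooseFetchMode` are hypothetical names, and the substring checks for `text/html` and `application/xhtml` are an assumption about how HTML and XHTML would be recognized. The point is that a null Content-Type falls through to the raw-bytes path instead of yielding no content.

```java
import java.util.Locale;

/** Hypothetical helper, not part of Nutch: picks a fetch path from the Content-Type header. */
final class ContentTypeRouting {

    /** Hypothetical labels for the two code paths described in the issue. */
    enum FetchMode { SELENIUM, RAW_BYTES }

    private ContentTypeRouting() {}

    /**
     * Returns SELENIUM only when the header is present and looks like HTML or XHTML.
     * Every other case, including a missing (null) header, falls back to RAW_BYTES.
     */
    static FetchMode chooseFetchMode(String contentType) {
        if (contentType != null) {
            // Assumption: a case-insensitive substring match is enough to detect HTML/XHTML.
            String normalized = contentType.toLowerCase(Locale.ROOT);
            if (normalized.contains("text/html") || normalized.contains("application/xhtml")) {
                return FetchMode.SELENIUM;
            }
        }
        return FetchMode.RAW_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(chooseFetchMode("text/html; charset=UTF-8")); // prints SELENIUM
        System.out.println(chooseFetchMode("application/pdf"));          // prints RAW_BYTES
        System.out.println(chooseFetchMode(null));                       // prints RAW_BYTES
    }
}
```

Keeping the null check and the type check in one condition with a single shared fallback means there is no branch in which the response body is silently skipped.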