[ https://issues.apache.org/jira/browse/NUTCH-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152140#comment-15152140 ]
Markus Jelsma commented on NUTCH-2191:
--------------------------------------

Hi - it works indeed. But new problems appear, as usual!

1. SSL does not work due to
{code}
2016-02-18 11:53:21,130 ERROR htmlunit.Http - Failed to get protocol output
java.lang.IllegalArgumentException: Cannot locate declared field org.apache.http.impl.client.HttpClientBuilder.sslContext
	at org.apache.commons.lang3.reflect.FieldUtils.readDeclaredField(FieldUtils.java:382)
	at com.gargoylesoftware.htmlunit.HttpWebConnection.createConnectionManager(HttpWebConnection.java:944)
	at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:161)
	at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1321)
	at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1238)
	at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:346)
	at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:415)
	at org.apache.nutch.protocol.htmlunit.HttpResponse.<init>(HttpResponse.java:103)
{code}

2. I don't know how yet, but since it uses Selenium, every time I try a file a browser opens! This is crazy, I didn't know this was even possible.

Markus

> Add protocol-htmlunit
> ---------------------
>
>                 Key: NUTCH-2191
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2191
>             Project: Nutch
>          Issue Type: New Feature
>          Components: protocol
>    Affects Versions: 1.11
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.12
>
>         Attachments: NUTCH-2191.patch, NUTCH-2191.patch
>
>
> HtmlUnit is, as opposed to other JavaScript-enabled headless browsers, a
> portable library and should therefore be better suited for very large scale
> crawls. This issue is an attempt to implement protocol-htmlunit.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
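
The exception in item 1 comes from a reflective read of a private field by name (`FieldUtils.readDeclaredField`), so it fails whenever the `HttpClientBuilder` class that was actually loaded does not declare a field spelled exactly `sslContext`. One possible cause, which the comment does not confirm, is a different httpclient jar on the classpath than the one HtmlUnit expects. The following is a minimal diagnostic sketch under that assumption, using only the JDK. The names `DeclaredFieldProbe`, `OldStyleBuilder`, `declaresField` and `loadedFrom` are hypothetical and not part of Nutch or HtmlUnit.

```java
import java.security.CodeSource;

public class DeclaredFieldProbe {

    // Hypothetical stand-in for a library class: it declares "sslcontext"
    // (lowercase c) but not "sslContext". Field lookup is case-sensitive.
    static class OldStyleBuilder {
        private Object sslcontext;
    }

    /** Returns true if the class itself (not a superclass) declares a field with exactly this name. */
    static boolean declaresField(Class<?> type, String fieldName) {
        try {
            type.getDeclaredField(fieldName);
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    /** Returns the location (jar or directory) the class was loaded from, or "unknown". */
    static String loadedFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // With no arguments, probe the local stand-in class for "sslContext".
        Class<?> type = args.length > 0 ? Class.forName(args[0]) : OldStyleBuilder.class;
        String field = args.length > 1 ? args[1] : "sslContext";
        System.out.println(type.getName() + " loaded from " + loadedFrom(type));
        System.out.println("declares '" + field + "': " + declaresField(type, field));
    }
}
```

Run with the same classpath as the crawl and the arguments `org.apache.http.impl.client.HttpClientBuilder sslContext`, this would show which jar supplied the class and whether the field is present there.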