[ 
https://issues.apache.org/jira/browse/NUTCH-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215537#comment-15215537
 ] 

ASF GitHub Bot commented on NUTCH-2191:
---------------------------------------

Github user karanjeets commented on a diff in the pull request:

    https://github.com/apache/nutch/pull/100#discussion_r57677200
  
    --- Diff: conf/nutch-default.xml ---
    @@ -1874,6 +1874,72 @@ visit 
https://wiki.apache.org/nutch/SimilarityScoringFilter-->
       </description>
     </property>
     
    +
    +<!-- lib-htmlunit plugin properties; applies to protocol-htmlunit -->
    +
    +<property>
    +  <name>htmlunit.page.load.delay</name>
    --- End diff --
    
    I can share them, but that would mean renaming the properties defined 
for selenium, which in short affects the selenium code as well. Are we okay 
with that?
    
    Currently, all of these properties in nutch-default.xml have the 
"selenium" prefix, hence the change is required.
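    
    As a rough sketch of the trade-off (not a proposal for the final names): 
the first property below is the plugin-prefixed one from this diff, and the 
second shows what a shared, prefix-neutral property might look like. The name 
"browser.page.load.delay", the values, and the descriptions are hypothetical 
placeholders, not existing Nutch settings.

```xml
<!-- Plugin-specific form, as added in this diff.
     The value and description here are placeholders for illustration. -->
<property>
  <name>htmlunit.page.load.delay</name>
  <value>3</value>
  <description>Placeholder: delay used only by protocol-htmlunit.</description>
</property>

<!-- Hypothetical shared form: one prefix-neutral property read by both
     plugins. "browser.page.load.delay" is an assumed name, not a real
     Nutch property; adopting something like it would mean renaming the
     selenium-prefixed properties and updating the selenium code that
     reads them. -->
<property>
  <name>browser.page.load.delay</name>
  <value>3</value>
  <description>Placeholder: delay shared by the selenium and htmlunit plugins.</description>
</property>
```

    Sharing avoids duplicate settings, but anyone who has already overridden 
the selenium-prefixed names in nutch-site.xml would need to update their 
configuration.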


> Add protocol-htmlunit
> ---------------------
>
>                 Key: NUTCH-2191
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2191
>             Project: Nutch
>          Issue Type: New Feature
>          Components: protocol
>    Affects Versions: 1.11
>            Reporter: Markus Jelsma
>            Assignee: Chris A. Mattmann
>             Fix For: 1.12
>
>         Attachments: NUTCH-2191.patch, NUTCH-2191.patch, NUTCH-2191.patch, 
> NUTCH-2191.patch
>
>
> HtmlUnit is, as opposed to other JavaScript-enabled headless browsers, a 
> portable library and should therefore be better suited to very large-scale 
> crawls. This issue is an attempt to implement protocol-htmlunit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
