[ 
https://issues.apache.org/jira/browse/NUTCH-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678993#comment-17678993
 ] 

ASF GitHub Bot commented on NUTCH-2980:
---------------------------------------

KamilMroczek opened a new pull request, #753:
URL: https://github.com/apache/nutch/pull/753

   - Disabled the PhantomJS driver, as it was causing problems casting 
TakesScreenshot to HtmlUnitWebDriver, and the PhantomJS project has been archived since 2018
   - Improved README setup instructions for IntelliJ
   
   The following libraries were added as part of the selenium-java and htmlunit 
upgrades. All of them are licensed under Apache 2.0, MIT, or EDL.
   
   async-http-client
   async-http-client-netty-utils
   auto-common
   auto-service
   auto-service-annotations
   checker-qual
   dec
   failsafe
   failureaccess
   htmlunit-xpath
   jakarta.activation
   jcommander
   jtoml
   listenablefuture
   netty-buffer
   netty-codec
   netty-codec-http
   netty-codec-socks
   netty-common
   netty-handler
   netty-handler-proxy
   netty-reactive-streams
   netty-resolver
   netty-transport
   netty-transport-classes-epoll
   netty-transport-classes-kqueue
   netty-transport-native-epoll
   netty-transport-native-kqueue
   netty-transport-native-unix-common
   opentelemetry-api
   opentelemetry-api-logs
   opentelemetry-context
   opentelemetry-exporter-common
   opentelemetry-exporter-logging
   opentelemetry-sdk
   opentelemetry-sdk-common
   opentelemetry-sdk-extension-autoconfigure
   opentelemetry-sdk-extension-autoconfigure-spi
   opentelemetry-sdk-logs
   opentelemetry-sdk-metrics
   opentelemetry-sdk-trace
   opentelemetry-semconv
   reactive-streams
   salvation2
   selenium-chromium-driver
   selenium-devtools-v106
   selenium-devtools-v107
   selenium-devtools-v108
   selenium-devtools-v85
   selenium-http
   selenium-json
   selenium-manager




> Upgrade Selenium Java to 4.7.2
> ------------------------------
>
>                 Key: NUTCH-2980
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2980
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin, protocol
>    Affects Versions: 1.19
>            Reporter: Kamil Mroczek
>            Priority: Major
>             Fix For: 1.20
>
>
> The Selenium version in use was quite old and had some issues processing a 
> website. Once I switched to the latest version, I was able to scrape that 
> website. It is good to keep it up to date, since we were already one major 
> release behind.
> This upgrades Selenium Java from 3.141.59 to 4.7.2 and Selenium HTMLUnit from 
> 2.35.1 to 4.7.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
