Hi Hanbing Luo,
thanks for reporting this issue.
It is difficult to tell whether this is a bug or not.
Unfortunately, I'm not able to reproduce it using Nutch 1.20 or the current
Nutch master with Java 11 on Linux: the page you mentioned was fetched
successfully. But if you can reproduce the issue reliably, then it definitely
is a bug.
Generally speaking, protocol-okhttp is the more advanced plugin:
it supports HTTP/2 and connection pooling.
For HTTP/1.1, however, protocol-http should do its job.
Best,
Sebastian
On 1/17/25 10:29, dolphin wrote:
Hello everyone,
I used Nutch (v1.20) to crawl the website:
https://weworkremotely.com/
With the default protocol-http plugin, I encountered an
SSLHandshakeException. However, this issue does not occur when using
the protocol-okhttp plugin.
The exception appears to be related to the TLS version. The web server does not
support TLSv1 but does support TLSv1.2 and TLSv1.3. I also tried setting
-Dhttps.protocols=TLSv1.3, but it didn't resolve the issue.
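To check the TLS-version theory independently of Nutch, a plain JSSE handshake can be attempted with one protocol version at a time. The following is only a minimal sketch under assumptions: the class TlsProbe and its methods usable and probe are hypothetical names made up for this illustration and are not part of Nutch. Note also that, as far as the JSSE documentation goes, https.protocols is read by HttpsURLConnection, so it may have no effect on code that opens its own SSLSocket, as HttpResponse does according to the stack trace below.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical diagnostic helper, not part of Nutch.
public class TlsProbe {

  /** Keeps the requested protocol names that the JVM supports, in requested order. */
  public static String[] usable(String[] requested, String[] supported) {
    List<String> supportedList = Arrays.asList(supported);
    List<String> result = new ArrayList<>();
    for (String protocol : requested) {
      if (supportedList.contains(protocol)) {
        result.add(protocol);
      }
    }
    return result.toArray(new String[0]);
  }

  /** Attempts one handshake with only the given protocol version enabled. */
  public static String probe(String host, int port, String protocol) {
    SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
      socket.setSoTimeout(10_000); // read timeout in milliseconds
      SSLParameters params = socket.getSSLParameters();
      params.setProtocols(new String[] {protocol});
      socket.setSSLParameters(params);
      socket.startHandshake();
      return protocol + ": OK, negotiated " + socket.getSession().getProtocol()
          + " with " + socket.getSession().getCipherSuite();
    } catch (IOException | RuntimeException e) {
      // Covers handshake failures, unknown hosts and rejected protocol names.
      return protocol + ": FAILED, " + e;
    }
  }

  public static void main(String[] args) throws Exception {
    if (args.length == 0) {
      System.out.println("Usage: java TlsProbe.java <host> [port]");
      return;
    }
    String host = args[0];
    int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;
    String[] supported =
        SSLContext.getDefault().getSupportedSSLParameters().getProtocols();
    for (String protocol : usable(new String[] {"TLSv1.3", "TLSv1.2"}, supported)) {
      System.out.println(probe(host, port, protocol));
    }
  }
}
```

With Java 11 it can be run as a single source file, for example `java TlsProbe.java weworkremotely.com`. If both versions succeed here while protocol-http still fails, the TLS version itself is probably not the cause, and the enabled cipher suites or the plugin configuration would be the next things to compare.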
I'm unsure if this is a bug in the protocol-http plugin. Please let me know if
further details are needed.
Thanks
Hanbing Luo
How to reproduce:
Run the main method of org.apache.nutch.protocol.http.Http with the above URL.
java -version
java version "11.0.23" 2024-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.23+7-LTS-222)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.23+7-LTS-222, mixed mode)
Detailed error stack:
2025-01-10 15:12:32,399 ERROR o.a.n.p.h.Http [main] Failed to get protocol output
org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:156) ~[classes/:?]
    at org.apache.nutch.protocol.http.Http.getResponse(Http.java:65) ~[classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:354) [classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.main(HttpBase.java:697) [classes/:?]
    at org.apache.nutch.protocol.http.Http.main(Http.java:59) [classes/:?]
Caused by: javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake
    at java.base/sun.security.ssl.SSLSocketImpl.handleEOF(SSLSocketImpl.java:1715) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1514) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Caused by: java.io.EOFException: SSL peer shut down incorrectly
    at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:160) ~[?:?]
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Status: exception(16), lastModified=0: org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake