Hi Hanbing Luo,

Thanks for reporting this issue.

It is difficult to tell whether this is a bug or not.

Unfortunately, I'm not able to reproduce it using Nutch 1.20 or the current Nutch master with Java 11 on Linux. The page you mentioned was fetched successfully.

But if you are able to reproduce the issue reliably, then it definitely is a bug.

Generally speaking, protocol-okhttp is the more advanced of the two plugins:
it supports HTTP/2 and connection pooling.

For HTTP/1.1, however, protocol-http should still do its job.

Best,
Sebastian


On 1/17/25 10:29, dolphin wrote:
Hello everyone


I used Nutch (v1.20) to crawl the website:
https://weworkremotely.com/
With the default protocol-http plugin, I encountered an 
SSLHandshakeException. However, this issue does not occur when using 
the protocol-okhttp plugin.

The exception appears to be related to the TLS version. The web server does not 
support TLSv1 but does support TLSv1.2 and TLSv1.3. I also tried setting 
-Dhttps.protocols=TLSv1.3, but it didn't resolve the issue.
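
To narrow down which TLS versions the local JVM offers and which ones the server accepts, a small probe outside of Nutch may help. The following is only a minimal sketch and not Nutch code: the class name TlsProbe and its method names are made up for illustration, and the 10-second timeouts are arbitrary. It uses only the standard JSSE API (SSLContext, SSLSocket, SSLParameters, SNIHostName). Note also that, as far as I know, the https.protocols system property is only read by HttpsURLConnection, so it would probably not affect code that opens an SSLSocket directly, which is what the stack trace further down suggests protocol-http does.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SNIHostName;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper for illustration only; not part of Nutch.
final class TlsProbe {

  // Protocols the default SSLContext enables without further configuration.
  public static String[] defaultEnabledProtocols() {
    try {
      return SSLContext.getDefault().getDefaultSSLParameters().getProtocols();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Protocols the JVM could use if they were explicitly enabled.
  public static String[] supportedProtocols() {
    try {
      return SSLContext.getDefault().getSupportedSSLParameters().getProtocols();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Attempts a handshake with exactly one protocol version enabled.
  // Returns the negotiated protocol, or "failed: ..." with the exception.
  public static String tryHandshake(String host, int port, String protocol) {
    SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
    try (SSLSocket socket = (SSLSocket) factory.createSocket()) {
      SSLParameters params = socket.getSSLParameters();
      params.setProtocols(new String[] {protocol});
      // Set SNI explicitly, because the socket is connected manually below.
      params.setServerNames(List.of(new SNIHostName(host)));
      socket.setSSLParameters(params);
      socket.setSoTimeout(10_000); // arbitrary read timeout
      socket.connect(new InetSocketAddress(host, port), 10_000);
      socket.startHandshake();
      return socket.getSession().getProtocol();
    } catch (IOException | IllegalArgumentException e) {
      return "failed: " + e;
    }
  }

  public static void main(String[] args) {
    System.out.println("Enabled by default: " + Arrays.toString(defaultEnabledProtocols()));
    System.out.println("Supported by JVM:   " + Arrays.toString(supportedProtocols()));
    if (args.length > 0) {
      // args[0] is the host name to probe on port 443.
      for (String protocol : new String[] {"TLSv1.2", "TLSv1.3"}) {
        System.out.println(protocol + " -> " + tryHandshake(args[0], 443, protocol));
      }
    }
  }
}
```

With Java 11 this can be run directly as "java TlsProbe.java weworkremotely.com"; without an argument it only prints the protocol lists of the JVM. If both handshakes succeed here but the fetch in Nutch still fails, the TLS version alone is probably not the cause.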

I'm unsure if this is a bug in the protocol-http plugin. Please let me know if 
further details are needed.


Thanks
Hanbing Luo


How to reproduce:
Run the main method of org.apache.nutch.protocol.http.Http with the URL above.


java -version
java version "11.0.23" 2024-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.23+7-LTS-222)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.23+7-LTS-222, mixed mode)


Detailed error stack:
2025-01-10 15:12:32,399 ERROR o.a.n.p.h.Http [main] Failed to get protocol output
org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:156) ~[classes/:?]
    at org.apache.nutch.protocol.http.Http.getResponse(Http.java:65) ~[classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:354) [classes/:?]
    at org.apache.nutch.protocol.http.api.HttpBase.main(HttpBase.java:697) [classes/:?]
    at org.apache.nutch.protocol.http.Http.main(Http.java:59) [classes/:?]
Caused by: javax.net.ssl.SSLHandshakeException: Remote host terminated the handshake
    at java.base/sun.security.ssl.SSLSocketImpl.handleEOF(SSLSocketImpl.java:1715) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1514) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Caused by: java.io.EOFException: SSL peer shut down incorrectly
    at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:489) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:478) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:160) ~[?:?]
    at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:111) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1506) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:455) ~[?:?]
    at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:426) ~[?:?]
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:136) ~[classes/:?]
    ... 4 more
Status: exception(16), lastModified=0: org.apache.nutch.protocol.http.api.HttpException: SSL connect to https://weworkremotely.com failed with: Remote host terminated the handshake
