[
https://issues.apache.org/jira/browse/SOLR-12990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867061#comment-16867061
]
Hoss Man commented on SOLR-12990:
---------------------------------
After [~caomanhdat]'s commits in SOLR-12988 yesterday, which re-enabled SSL
randomization testing under Java 11, we've started to see these symptoms pop up
again in Jenkins jobs...
[http://fucit.org/solr-jenkins-reports/job-data/thetaphi/Lucene-Solr-8.x-Linux/726/]
[https://jenkins.thetaphi.de/view/Lucene-Solr/job/Lucene-Solr-8.x-Linux/726/]
{noformat}
-print-java-info:
[java-info] java version "11.0.2"
[java-info] OpenJDK Runtime Environment (11.0.2+9, Oracle Corporation)
[java-info] OpenJDK 64-Bit Server VM (11.0.2+9, Oracle Corporation)
[java-info] Test args: [-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC]
...
[junit4] 2> 2027070 INFO
(SUITE-IndexSizeEstimatorTest-seed#[8F30F6AA795F0A16]-worker) [ ]
o.a.s.SolrTestCaseJ4 Randomized ssl (true) and clientAuth (false) via:
@org.apache.solr.util.RandomizeSSL(reason="", ssl=0.0/0.0, value=0.0/0.0,
clientAuth=0.0/0.0)
...
[junit4] 2> 2028080 ERROR
(OverseerThreadFactory-11245-thread-1-processing-n:127.0.0.1:36765_solr)
[n:127.0.0.1:36765_solr ] o.a.s.c.a.c.OverseerCollectionMessageHandler
Error from shard: https://127.0.0.1:36765/solr
[junit4] 2> => org.apache.solr.client.solrj.SolrServerException:
IOException occurred when talking to server at: https://127.0.0.1:36765/solr
[junit4] 2> at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:670)
[junit4] 2> org.apache.solr.client.solrj.SolrServerException: IOException
occurred when talking to server at: https://127.0.0.1:36765/solr
[junit4] 2> at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:670)
~[java/:?]
[junit4] 2> at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
~[java/:?]
[junit4] 2> at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:245)
~[java/:?]
[junit4] 2> at
org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1274) ~[java/:?]
[junit4] 2> at
org.apache.solr.handler.component.HttpShardHandlerFactory$1.request(HttpShardHandlerFactory.java:176)
~[java/:?]
[junit4] 2> at
org.apache.solr.handler.component.HttpShardHandler.lambda$submit$0(HttpShardHandler.java:199)
~[java/:?]
[junit4] 2> at
java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
[junit4] 2> at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
[junit4] 2> at
java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
[junit4] 2> at
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
~[metrics-core-4.0.5.jar:4.0.5]
[junit4] 2> at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
~[java/:?]
[junit4] 2> at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
~[?:?]
[junit4] 2> at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
~[?:?]
[junit4] 2> at java.lang.Thread.run(Thread.java:834) [?:?]
[junit4] 2> Caused by: javax.net.ssl.SSLException: Received fatal alert:
internal_error
[junit4] 2> at
sun.security.ssl.Alert.createSSLException(Alert.java:129) ~[?:?]
[junit4] 2> at
sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]
[junit4] 2> at
sun.security.ssl.TransportContext.fatal(TransportContext.java:308) ~[?:?]
[junit4] 2> at
sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:279) ~[?:?]
[junit4] 2> at
sun.security.ssl.TransportContext.dispatch(TransportContext.java:181) ~[?:?]
[junit4] 2> at
sun.security.ssl.SSLTransport.decode(SSLTransport.java:164) ~[?:?]
[junit4] 2> at
sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1152) ~[?:?]
[junit4] 2> at
sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1063)
~[?:?]
[junit4] 2> at
sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:402) ~[?:?]
[junit4] 2> at
org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:373)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:394)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
~[httpclient-4.5.6.jar:4.5.6]
[junit4] 2> at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:555)
~[java/:?]
[junit4] 2> ... 13 more
{noformat}
----
Some new research has turned up new reports of similar bugs w/similar
symptoms (notably that it's a race condition that isn't guaranteed to happen on
every request) in other systems, which also seem to be traced to a TLSv1.3 bug
in Java 11...
[https://github.com/eclipse/jetty.project/issues/2939]
[https://bugs.openjdk.java.net/browse/JDK-8213202]
...what's perplexing is that if this is in fact JDK-8213202 (fixed in Java
11.0.3) then, according to the anecdotal reports, disabling TLSv1.3 and forcing
TLSv1.2 should work around the issue – but Dat's commits yesterday already force
TLSv1.2 ... so is this yet another TLSv1.3 bug in the JDK (and if so, has it
also been fixed in 11.0.3)?
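
For anyone who wants to sanity check the two things in question on their own JVM (is the runtime at or past 11.0.3, and is a connection really limited to TLSv1.2), here is a minimal sketch using only plain JSSE and {{Runtime.Version}}. The class and method names ({{Tls12Check}}, {{hasFix}}, {{tls12OnlyEngine}}) are made up for illustration; this is NOT how Dat's commits or the Solr test framework force the protocol, and treating everything newer than 11.0.3 as "fixed" is an assumption based only on the fix version quoted above.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class Tls12Check {

    /**
     * True if the given runtime version is at or after 11.0.3, the release
     * this comment cites as containing the JDK-8213202 fix.
     * Assumption: every later version also carries the fix.
     */
    public static boolean hasFix(Runtime.Version version) {
        return version.compareToIgnoreOptional(Runtime.Version.parse("11.0.3")) >= 0;
    }

    /**
     * Builds a client-mode SSLEngine with the JVM's default key/trust
     * managers, then restricts it to TLSv1.2 so TLSv1.3 is never offered.
     */
    public static SSLEngine tls12OnlyEngine() {
        try {
            SSLContext context = SSLContext.getInstance("TLSv1.2");
            context.init(null, null, null); // nulls select the JVM defaults
            SSLEngine engine = context.createSSLEngine();
            engine.setUseClientMode(true);
            engine.setEnabledProtocols(new String[] {"TLSv1.2"});
            return engine;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not set up a TLSv1.2 engine", e);
        }
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        System.out.println("Runtime version: " + current);
        System.out.println("At or past 11.0.3: " + hasFix(current));
        System.out.println("Enabled protocols: "
            + Arrays.toString(tls12OnlyEngine().getEnabledProtocols()));
    }
}
```

If the enabled protocol list shows only TLSv1.2 and the failures still occur on 11.0.2, that would be consistent with the suspicion above that something other than (or in addition to) JDK-8213202 is involved, but it wouldn't prove it.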
> High test failure rate on Java11/12 when (randomized) ssl=true
> clientAuth=false
> -------------------------------------------------------------------------------
>
> Key: SOLR-12990
> URL: https://issues.apache.org/jira/browse/SOLR-12990
> Project: Solr
> Issue Type: Bug
> Reporter: Hoss Man
> Priority: Major
> Labels: Java11, Java12
> Attachments: DistributedDebugComponentTest.ssl.debug.log.txt,
> enable.ssl.debug.patch
>
>
> Ever since the policeman's Jenkins instance started running tests on Java11,
> we've seen an abnormally high number of test failures that seem to be related
> to randomized ssl.
> I've been investigating these logs and trying to reproduce the failures, and
> have made the following observations:
> * In all the policeman Jenkins logs I looked at, these SSL related failures
> only occur when the RandomizeSSL annotation picks {{ssl=true
> clientAuth=false}}
> ** NOTE: this doesn't mean that every test using {{ssl=true
> clientAuth=false}} failed -- since our build system only prints test output
> when tests fail, it's possible/probable (based on how often the value should
> be picked) that many tests randomly use {{ssl=true clientAuth=false}} and pass
> * the failures usually showed an exception that was {{Caused by:
> javax.net.ssl.SSLException: Received fatal alert: internal_error}} in the
> logs.
> * when I attempted to reproduce some of these failing seeds on my own
> machine using Java11, I could not _reliably_ reproduce these failures w/the
> same seeds
> ** beasting could _occasionally_ reproduce the failures, at roughly 1/10 runs
> ** suggesting that system load/timing contributed to these SSL related
> failures
> * picking one particularly trivial test (DistributedDebugComponentTest)
> ** with {{javax.net.debug=all}} enabled, I was able to see more details...
> *** notably: {{Fatal (INTERNAL_ERROR): Session has no PSK}}
> ** when I patched the test to force {{ssl=true clientAuth=true}} I was unable
> to trigger any failures with the same seed.
> * on the jira/http2 branch I was unable to reproduce these failures at all,
> w/o any patching
> ** similar to SOLR-12988, this may be because of bug fixes in the upgraded
> jetty.
> ----
> Filing this issue largely for tracking purposes, although we may also want to
> use it for discussions/considerations of other backports/fixes to 7x