Re: Problems when Upgrading from Solr 7.7.1 to 8.5.0

2020-05-15 Thread Houston Putman
Hello Ludger,

I don't have answers to all of your questions, but #2 (Incorrect Load
Balancing) is a bug that will be fixed in 8.6. You can find more info in
SOLR-14471.

- Houston

On Mon, May 11, 2020 at 8:16 AM Ludger Steens wrote:

> Hi all,
>
> we recently upgraded our SolrCloud cluster from version 7.7.1 to version
> 8.5.0 and ran into multiple problems.
> In the end we had to revert the upgrade and went back to Solr 7.7.1.
>
> Our company has been using Solr since version 4, and so far upgrading
> Solr to a newer version has been possible without any problems.
> We are curious whether others are experiencing the same kind of problems
> and whether these are known issues. Or maybe we did something wrong or
> missed something when upgrading?
>
>
> 1. Network issues when indexing documents
> ===
>
> Our collection contains roughly 150 million documents. When we re-created
> the collection and re-indexed all documents, we regularly experienced
> network problems that caused our loader application to fail.
> The Solr log always contained an IOException:
>
> ERROR
> (updateExecutor-5-thread-1338-processing-x:PSMG_CI_2020_04_15_10_07_04_sha
> rd6_replica_n22 r:core_node25 null n:solr2:8983_solr
> c:PSMG_CI_2020_04_15_10_07_04 s:shard6) [c:PSMG_CI_2020_04_15_10_07_04
> s:shard6 r:core_node25 x:PSMG_CI_2020_04_15_10_07_04_shard6_replica_n22]
> o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode:
> http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ to
> http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ =>
> java.io.IOException: java.io.IOException: cancel_stream_error
>  at
> org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
> tProvider.java:197)
>  java.io.IOException: java.io.IOException: cancel_stream_error
>  at
> org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
> tProvider.java:197) ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
>  at
> org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputSt
> ream.flush(OutputStreamContentProvider.java:151)
> ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
>  at
> org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputSt
> ream.write(OutputStreamContentProvider.java:145)
> ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
>  at
> org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:2
> 16) ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42
> - romseygeek - 2020-03-13 09:38:26]
>  at
> org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.
> java:209) ~[solr-solrj-8.5.0.jar:8.5.0
> 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 - romseygeek - 2020-03-13
> 09:38:26]
>  at
> org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:172)
> ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 -
> romseygeek - 2020-03-13 09:38:26]
>
> After the exception, the collection was usually in a degraded state for
> some time while shards tried to recover and sync with the leader.
>
> In the Solr changelog we saw that one major change from 7.x to 8.x was
> that Solr now uses HTTP/2 instead of HTTP/1.1. So we tried to disable
> HTTP/2 by setting the system property solr.http1=true.
> That made the indexing process a LOT more stable, but we still saw
> IOExceptions from time to time. Disabling HTTP/2 did not completely fix
> the problem.
>
> ERROR
> (updateExecutor-5-thread-9310-processing-x:PSMG_BOM_2020_04_28_05_00_11_sh
> ard7_replica_n24 r:core_node27 null n:solr3:8983_solr
> c:PSMG_BOM_2020_04_28_05_00_11 s:shard7) [c:PSMG_BOM_2020_04_28_05_00_11
> s:shard7 r:core_node27 x:PSMG_BOM_2020_04_28_05_00_11_shard7_replica_n24]
> o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
> SolrCmdDistributor$Req: cmd=add{,id=5141653a-e33a-4b60-856d-7aa2ce73dee7};
> node=ForwardNode:
> http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ to
> http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ =>
> java.io.IOException: java.io.EOFException:
> HttpConnectionOverHTTP@9dc7ad1::SocketChannelEndPoint@2d20213b{solr2/10.0.
> 0.216:8983<->/10.0.0.193:38728,ISHUT,fill=-,flush=-,to=5/60}{io=0/0,ki
> o=0,kro=1}->HttpConnectionOverHTTP@9dc7ad1(l:/10.0.0.193:38728 <->
> r:solr2/10.0.0.216:8983,closed=false)=>HttpChannelOverHTTP@47a242c3(exchan
> ge=HttpExchange@6ffd260f req=PENDING/null@null
> res=PENDING/null@null)[send=HttpSenderOverHTTP@17e056f9(req=CONTENT,snd=ID
> LE,failure=null)[HttpGenerator@3b6594c7{s=COMMITTED}],recv=HttpReceiverOve
> rHTTP@6e847d32(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0 of -1}]]
> at
> org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
> 

Problems when Upgrading from Solr 7.7.1 to 8.5.0

2020-05-11 Thread Ludger Steens
Hi all,

we recently upgraded our SolrCloud cluster from version 7.7.1 to version
8.5.0 and ran into multiple problems.
In the end we had to revert the upgrade and went back to Solr 7.7.1.

Our company has been using Solr since version 4, and so far upgrading
Solr to a newer version has been possible without any problems.
We are curious whether others are experiencing the same kind of problems
and whether these are known issues. Or maybe we did something wrong or
missed something when upgrading?


1. Network issues when indexing documents
===

Our collection contains roughly 150 million documents. When we re-created
the collection and re-indexed all documents, we regularly experienced
network problems that caused our loader application to fail.
The Solr log always contained an IOException:

ERROR
(updateExecutor-5-thread-1338-processing-x:PSMG_CI_2020_04_15_10_07_04_sha
rd6_replica_n22 r:core_node25 null n:solr2:8983_solr
c:PSMG_CI_2020_04_15_10_07_04 s:shard6) [c:PSMG_CI_2020_04_15_10_07_04
s:shard6 r:core_node25 x:PSMG_CI_2020_04_15_10_07_04_shard6_replica_n22]
o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
SolrCmdDistributor$Req: cmd=add{,id=(null)}; node=StdNode:
http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ to
http://solr1:8983/solr/PSMG_CI_2020_04_15_10_07_04_shard6_replica_n20/ =>
java.io.IOException: java.io.IOException: cancel_stream_error
 at
org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
tProvider.java:197)
 java.io.IOException: java.io.IOException: cancel_stream_error
 at
org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
tProvider.java:197) ~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
 at
org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputSt
ream.flush(OutputStreamContentProvider.java:151)
~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
 at
org.eclipse.jetty.client.util.OutputStreamContentProvider$DeferredOutputSt
ream.write(OutputStreamContentProvider.java:145)
~[jetty-client-9.4.24.v20191120.jar:9.4.24.v20191120]
 at
org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:2
16) ~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42
- romseygeek - 2020-03-13 09:38:26]
 at
org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.
java:209) ~[solr-solrj-8.5.0.jar:8.5.0
7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 - romseygeek - 2020-03-13
09:38:26]
 at
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:172)
~[solr-solrj-8.5.0.jar:8.5.0 7ac489bf7b97b61749b19fa2ee0dc46e74b8dc42 -
romseygeek - 2020-03-13 09:38:26]

After the exception, the collection was usually in a degraded state for
some time while shards tried to recover and sync with the leader.
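
A loader that fails outright on such an error could, as a client-side
mitigation, retry the failed batch with a backoff. The following is only a
minimal sketch under stated assumptions: the class RetryingIndexer and the
method withRetry are hypothetical names, and the Callable stands in for
whatever SolrJ call the loader makes. It does not address the server-side
cause of the exception.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Solr or SolrJ.
public class RetryingIndexer {

    /**
     * Runs the action and retries it on IOException, doubling the wait
     * between attempts. The action is a stand-in for an indexing call;
     * other exception types (for example SolrJ's SolrServerException)
     * are not retried here and simply propagate.
     */
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMs); // back off before the next attempt
                    delayMs *= 2;
                }
            }
        }
        throw last; // all attempts failed with IOException
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated indexing call that fails twice before succeeding.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("cancel_stream_error");
            }
            return "indexed";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Retrying is only safe when re-sending a batch is idempotent, which is
normally the case when documents are added with a unique id.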

In the Solr changelog we saw that one major change from 7.x to 8.x was
that Solr now uses HTTP/2 instead of HTTP/1.1. So we tried to disable
HTTP/2 by setting the system property solr.http1=true.
That made the indexing process a LOT more stable, but we still saw
IOExceptions from time to time. Disabling HTTP/2 did not completely fix
the problem.
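
For reference, solr.http1 is an ordinary JVM system property, so it is
passed to the Solr JVM as -Dsolr.http1=true. The sketch below only
illustrates how such a flag becomes visible to Java code; the class name
Http1Flag is hypothetical, and the code does not reproduce how Solr itself
evaluates the property.

```java
// Hypothetical illustration class, not part of Solr.
public class Http1Flag {

    /** Returns true when the JVM was started with -Dsolr.http1=true. */
    public static boolean http1Requested() {
        // Boolean.getBoolean reads the named system property and returns
        // true only if its value equals "true" (ignoring case).
        return Boolean.getBoolean("solr.http1");
    }

    public static void main(String[] args) {
        System.out.println("solr.http1 = " + http1Requested());
    }
}
```

Running it as `java -Dsolr.http1=true Http1Flag` is a quick way to check
that the flag is spelled and passed the way the JVM expects.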

ERROR
(updateExecutor-5-thread-9310-processing-x:PSMG_BOM_2020_04_28_05_00_11_sh
ard7_replica_n24 r:core_node27 null n:solr3:8983_solr
c:PSMG_BOM_2020_04_28_05_00_11 s:shard7) [c:PSMG_BOM_2020_04_28_05_00_11
s:shard7 r:core_node27 x:PSMG_BOM_2020_04_28_05_00_11_shard7_replica_n24]
o.a.s.u.ErrorReportingConcurrentUpdateSolrClient Error when calling
SolrCmdDistributor$Req: cmd=add{,id=5141653a-e33a-4b60-856d-7aa2ce73dee7};
node=ForwardNode:
http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ to
http://solr2:8983/solr/PSMG_BOM_2020_04_28_05_00_11_shard6_replica_n22/ =>
java.io.IOException: java.io.EOFException:
HttpConnectionOverHTTP@9dc7ad1::SocketChannelEndPoint@2d20213b{solr2/10.0.
0.216:8983<->/10.0.0.193:38728,ISHUT,fill=-,flush=-,to=5/60}{io=0/0,ki
o=0,kro=1}->HttpConnectionOverHTTP@9dc7ad1(l:/10.0.0.193:38728 <->
r:solr2/10.0.0.216:8983,closed=false)=>HttpChannelOverHTTP@47a242c3(exchan
ge=HttpExchange@6ffd260f req=PENDING/null@null
res=PENDING/null@null)[send=HttpSenderOverHTTP@17e056f9(req=CONTENT,snd=ID
LE,failure=null)[HttpGenerator@3b6594c7{s=COMMITTED}],recv=HttpReceiverOve
rHTTP@6e847d32(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0 of -1}]]
at
org.eclipse.jetty.client.util.DeferredContentProvider.flush(DeferredConten
tProvider.java:197)
java.io.IOException: java.io.EOFException:
HttpConnectionOverHTTP@9dc7ad1::SocketChannelEndPoint@2d20213b{solr2/10.0.
0.216:8983<->/10.0.0.193:38728,ISHUT,fill=-,flush=-,to=5/60}{io=0/0,ki
o=0,kro=1}->HttpConnectionOverHTTP@9dc7ad1(l:/10.0.0.193:38728 <->
r:solr2/10.0.0.216:8983,closed=false)=>HttpChannelOverHTTP@47a242c3(exchan
ge=HttpExchange@6ffd260f req=PENDING/null@null
res=PENDING/null@null)[send=HttpSenderOverHTTP@17e056f9(req=CONTENT,snd=ID