Re: [dspace-tech] DSpace 7.5 Solr Statistics migration from 5.10 with sharding by year
You might need to double-check the Tomcat connector settings.

# You need to replace the Catalina Connector elements on lines 69, 70 and 71 with the connector elements you see below.
edit /etc/tomcat9/server.xml

Hope it helps.
Mo.

On Sunday, June 11, 2023 at 11:35:12 PM UTC+3 Nicholas Woodward wrote:

> Has this approach of importing all previous years' statistics into the "statistics" Solr core worked for others who have a lot of stats? For the last few days I've been trying to import all of the exported statistics files below after renaming the beginning of each CSV file to "statistics--...", but no matter how high I set the `http.socket.timeout` parameter in Solr I get the SocketTimeoutException error below when importing the last ZIP file (statistics.zip).
>
> I'm working with the most recent code on the main branch of the DSpace repository. I've increased the Java memory given to Solr to 2GB and added the same amount to the `bin/dspace` command, but that didn't seem to help, and in some cases made things worse. At the time that I get the socket timeout error and the import-statistics process stops running, the "statistics" core usually has anywhere from 20-30 million docs in the index.
>
> Error message:
>
> Problem encountered while trying to import index statistics.
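For anyone following along, here is a minimal sketch (not something posted in this thread) of one way to watch the document count in the "statistics" core while an import runs, to see roughly where it stalls. It assumes Solr answers at the http://localhost:8983/solr/statistics URL shown in the error message and uses Solr's standard select handler with `rows=0`. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the error message in this thread; adjust for your setup.
SOLR_CORE_URL = "http://localhost:8983/solr/statistics"


def count_query_url(core_url: str = SOLR_CORE_URL) -> str:
    """Build a select URL that matches every document but returns no rows."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{core_url.rstrip('/')}/select?{params}"


def parse_num_found(body: str) -> int:
    """Pull response.numFound out of a Solr JSON select response."""
    return int(json.loads(body)["response"]["numFound"])


def fetch_doc_count(core_url: str = SOLR_CORE_URL, timeout_s: float = 30.0) -> int:
    """Ask the core how many documents it holds (needs a running Solr)."""
    url = count_query_url(core_url)
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return parse_num_found(resp.read().decode("utf-8"))
```

Calling `fetch_doc_count()` every minute or so during the import and logging the result would show whether the count is still climbing when the timeout hits. Note that the count only reflects committed documents, so it may lag behind what has actually been sent.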
Re: [dspace-tech] DSpace 7.5 Solr Statistics migration from 5.10 with sharding by year
Has this approach of importing all previous years' statistics into the "statistics" Solr core worked for others who have a lot of stats? For the last few days I've been trying to import all of the exported statistics files below after renaming the beginning of each CSV file to "statistics--...", but no matter how high I set the `http.socket.timeout` parameter in Solr I get the SocketTimeoutException error below when importing the last ZIP file (statistics.zip).

I'm working with the most recent code on the main branch of the DSpace repository. I've increased the Java memory given to Solr to 2GB and added the same amount to the `bin/dspace` command, but that didn't seem to help, and in some cases made things worse. At the time that I get the socket timeout error and the import-statistics process stops running, the "statistics" core usually has anywhere from 20-30 million docs in the index.

Error message:

Problem encountered while trying to import index statistics.
org.apache.solr.client.solrj.SolrServerException: Timeout occurred while waiting response from server at: http://localhost:8983/solr/statistics
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:692)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
    at org.dspace.util.SolrImportExport.importIndex(SolrImportExport.java:465)
    at org.dspace.util.SolrImportExport.main(SolrImportExport.java:148)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:277)
    at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:133)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:98)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:571)
    ... 12 more

Statistics files:

 52MB Jun 7 08:54 statistics-2014.zip
130MB Jun 7 08:58 statistics-2015.zip
222MB Jun 7 09:05 statistics-2017.zip
 46MB Jun 7 09:06 statistics-2018.zip
300MB Jun 7 09:15 statistics-2019.zip
273MB Jun 7 09:22 statistics-2020.zip
415MB Jun 7 09:36 statistics-2021.zip
 30MB Jun 7 09:37 statistics-2022.zip
687MB Jun 7 10:02 statistics.zip

Thanks,
Nick

On Thursday, April 13, 2023 at 2:13:00 PM UTC-5 Tim Donohue wrote:

> Thanks James & Tomas for sharing your hints/tips here! It's obvious we didn't document this very well in the DSpace 7 Upgrade process. Just now, I've done my best to summarize your advice & add more hints in Step 10(a) of the Upgrade process to help others along. I eve