Re: [dspace-tech] DSpace 7.5 Solr Statistics migration from 5.10 with sharding by year

2023-06-11 Thread Mohammad S. AlMutairi
You might need to double-check the Tomcat connector settings.

# Edit /etc/tomcat9/server.xml and replace the Catalina Connector elements 
# on lines 69, 70 and 71 with the connector elements you see below.



Hope it helps.

Mo.

On Sunday, June 11, 2023 at 11:35:12 PM UTC+3 Nicholas Woodward wrote:

> Has this approach of importing all previous years' statistics into the 
> "statistics" Solr core worked for others who have a lot of stats? For the 
> last few days I've been trying to import all of the exported statistics 
> files below after renaming the beginning of each CSV file to 
> "statistics--...", but no matter how high I set the 
> `http.socket.timeout` parameter in Solr I get the 
> SocketTimeoutException error below when importing the last ZIP file 
> (statistics.zip). 
>
> I'm working with the most recent code on the main branch of the DSpace 
> repository. I've increased the Java memory given to Solr to 2GB and added 
> the same amount to the `bin/dspace` command, but that didn't seem to help, 
> and in some cases made things worse. When I get the socket timeout error 
> and the import-statistics process stops running, the "statistics" core 
> usually has anywhere from 20 to 30 million docs in the index. 
>
> Error message: 
>
> Problem encountered while trying to import index statistics.
> org.apache.solr.client.solrj.SolrServerException: Timeout occurred while waiting response from server at: http://localhost:8983/solr/statistics
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:692)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>     at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
>     at org.dspace.util.SolrImportExport.importIndex(SolrImportExport.java:465)
>     at org.dspace.util.SolrImportExport.main(SolrImportExport.java:148)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>     at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:277)
>     at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:133)
>     at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:98)
> Caused by: java.net.SocketTimeoutException: Read timed out
>     at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
>     at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
>     at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
>     at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
>     at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
>     at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>     at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>     at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>     at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>     at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>     at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
>     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>     at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>     at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>     at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
>     at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
>     at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>     at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:571)
>     ... 12 more
>
>
> Statistics files:
>
> 52MB Jun  7 08:54 statistics-2014.zip
>
> 130MB Jun  7 08:58 statistics-2015.zip
>
> 222MB Jun  7 09:05 statistics-2017.zip
>
> 46MB Jun  7 09:06 statistics-2018.zip
>
> 300MB Jun  7 09:15 statistics-2019.zip
>
> 2

Re: [dspace-tech] DSpace 7.5 Solr Statistics migration from 5.10 with sharding by year

2023-06-11 Thread Nicholas Woodward
Has this approach of importing all previous years' statistics into the 
"statistics" Solr core worked for others who have a lot of stats? For the 
last few days I've been trying to import all of the exported statistics 
files below after renaming the beginning of each CSV file to 
"statistics--...", but no matter how high I set the 
`http.socket.timeout` parameter in Solr I get the 
SocketTimeoutException error below when importing the last ZIP file 
(statistics.zip). 

I'm working with the most recent code on the main branch of the DSpace 
repository. I've increased the Java memory given to Solr to 2GB and added 
the same amount to the `bin/dspace` command, but that didn't seem to help, 
and in some cases made things worse. When I get the socket timeout error 
and the import-statistics process stops running, the "statistics" core 
usually has anywhere from 20 to 30 million docs in the index. 
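
For anyone wanting to check the size of the core in the same way, here is a minimal Python sketch that asks Solr for the document count. It assumes the core is reachable at the URL shown in the error message (http://localhost:8983/solr/statistics) and uses Solr's standard `select` handler with `rows=0`, which returns only the `numFound` total. The function names are made up for illustration and are not part of DSpace.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_query_url(base_url: str) -> str:
    """Build a match-all query that returns no documents, only the total."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{base_url.rstrip('/')}/select?{params}"


def parse_num_found(body: str) -> int:
    """Extract the total document count from a Solr JSON response body."""
    return int(json.loads(body)["response"]["numFound"])


def count_docs(base_url: str, timeout: float = 30.0) -> int:
    """Query the core over HTTP and return its document count."""
    with urlopen(count_query_url(base_url), timeout=timeout) as resp:
        return parse_num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed core URL, taken from the error message in this thread.
    print(count_docs("http://localhost:8983/solr/statistics"))
```

Running it before and after each ZIP import would show how many documents each file actually added before the timeout.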

Error message: 

Problem encountered while trying to import index statistics.
org.apache.solr.client.solrj.SolrServerException: Timeout occurred while waiting response from server at: http://localhost:8983/solr/statistics
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:692)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
    at org.dspace.util.SolrImportExport.importIndex(SolrImportExport.java:465)
    at org.dspace.util.SolrImportExport.main(SolrImportExport.java:148)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:277)
    at org.dspace.app.launcher.ScriptLauncher.handleScript(ScriptLauncher.java:133)
    at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:98)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:571)
    ... 12 more
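
Reading the trace, the "Read timed out" is raised on the client side, inside `parseHead`, while the client waits for the first line of the response. In other words the request was sent, but the server had not started answering when the client's read timeout expired. The Python sketch below reproduces that mechanism with the standard library only. The slow server is a stand-in written for this illustration, not Solr, and the delay and timeout values are arbitrary assumptions.

```python
import socket
import threading
import time


def slow_server(listener: socket.socket, delay: float) -> None:
    """Accept one connection, wait `delay` seconds, then send a reply."""
    conn, _ = listener.accept()
    with conn:
        time.sleep(delay)  # stands in for a server busy processing the request
        try:
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
        except OSError:
            pass  # the client has already given up and closed the socket


def request_with_timeout(port: int, read_timeout: float) -> str:
    """Send a request and wait at most `read_timeout` seconds for a reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=read_timeout) as client:
        client.sendall(b"POST /update HTTP/1.1\r\nHost: localhost\r\nContent-Length: 0\r\n\r\n")
        try:
            client.recv(1024)  # blocks until data arrives or the timeout expires
            return "response received"
        except socket.timeout:
            return "read timed out"


def demo(delay: float, read_timeout: float) -> str:
    """Run one request against a server that answers after `delay` seconds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=slow_server, args=(listener, delay), daemon=True)
    worker.start()
    try:
        return request_with_timeout(port, read_timeout)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo(delay=0.5, read_timeout=0.1))
    print(demo(delay=0.0, read_timeout=2.0))
```

The sketch only shows where the exception comes from: the timeout that matters is the one in effect in the client making the request. Whether that is the setting being raised here is not something the trace alone can confirm.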


Statistics files:

52MB Jun  7 08:54 statistics-2014.zip
130MB Jun  7 08:58 statistics-2015.zip
222MB Jun  7 09:05 statistics-2017.zip
46MB Jun  7 09:06 statistics-2018.zip
300MB Jun  7 09:15 statistics-2019.zip
273MB Jun  7 09:22 statistics-2020.zip
415MB Jun  7 09:36 statistics-2021.zip
30MB Jun  7 09:37 statistics-2022.zip
687MB Jun  7 10:02 statistics.zip

Thanks,
Nick


On Thursday, April 13, 2023 at 2:13:00 PM UTC-5 Tim Donohue wrote:

> Thanks James & Tomas for sharing your hints/tips here!  It's obvious we 
> didn't document this very well in the DSpace 7 Upgrade process.  Just now, 
> I've done my best to summarize your advice & add more hints in Step 10(a) 
> of the Upgrade process to help others along. I eve