(I realize I never summarized my setup, so here it is...)

*** Our environment ***
* DSpace version 1.8.2
* Tomcat 5.5
* Red Hat release 5.8
* Two separate Linux hosts running dspace, both behind a 
director/ipvs/heartbeat load-balancing bastion host.  All queries are 
load-balanced between the two dspace servers

*** Our goals ***
* to collect usage statistics on a single solr server, rather than having 
each host's solr server collect its own statistics
* to maintain our no-single-point-of-failure environment (if one dspace 
server goes down, the other server continues to serve content)


I am happy to report that I got this to work -- sharing a solr server 
between two dspace servers.  I must have set up something wrong with my 
tunneling before, but now I can indeed configure the solr client on dspace 
host1 to send its usage statistics to the solr server on dspace host2.  It 
takes just two changes: pointing the "server" property in 
modules/solr-statistics.cfg at host2, and adding a RemoteAddrValve to the 
Tomcat server.xml on dspace host2 to permit http access from addresses 
other than 127.0.0.1.  Yay!


Now on to the other issue with having two, redundant dspace servers and one 
active solr server -- how best to keep dspace available when one host goes 
down?  The current dspace behavior is that it stops serving pages if it 
cannot contact the solr server to report usage data.  Thus, if my dspace 
host2 (the one hosting the live solr server) goes down, dspace host1 will 
stop serving pages as well.  Redundancy is lost.

I can think of two reasonably elegant ways to resolve this:

*** SOLUTION #1: change the dspace code to add a timeout for the solr 
client's attempt to send data to the solr server.
I've found one possible place to tweak for this, in 
dspace-stats/src/main/java/org/dspace/statistics/SolrLogger.java:

*** 86,91 ****
--- 86,95 ----
              try
              {
                  server = new CommonsHttpSolrServer(ConfigurationManager.getProperty("solr-statistics", "server"));
+
+                 // add a connection timeout to prevent the solr client from hanging if the solr server is offline
+                 server.setConnectionTimeout(500);
+
                  SolrQuery solrQuery = new SolrQuery()
                          .setQuery("type:2 AND id:1");
                  server.query(solrQuery);

I've tested this change and it seems to work well -- if I make dspace host2 
(hosting the "live" solr server) unavailable, dspace host1 continues 
serving documents just fine (albeit with a 500 millisecond delay for each 
document).  An error and a Java stack trace get logged to the dspace log 
for each failure to contact the solr server.  If I make dspace host2 
available again, dspace host1 resumes sending usage data to the solr server 
on dspace host2.  Nice!

So is this the best place to tweak dspace to accomplish this?  Any reason to 
think this tweak is going to bomb?  I would probably add more code to catch 
the exception thrown here so that it does not clutter the dspace log, but 
are there any other concerns?
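
For illustration, swallowing the exception while still leaving a brief 
trace in the log could look roughly like the sketch below.  It is a minimal 
sketch under stated assumptions, not DSpace code: the class name 
QuietStatsCall and its methods are hypothetical, and it uses 
java.util.logging only to stay self-contained (dspace itself logs through 
log4j).  It runs a call, never lets an exception escape, and writes at most 
one short warning per interval instead of a stack trace per request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Hypothetical helper (not part of DSpace): runs a call and swallows any failure. */
final class QuietStatsCall {
    private static final Logger LOG = Logger.getLogger(QuietStatsCall.class.getName());

    // Minimum time between two logged warnings, in milliseconds.
    private final long minLogIntervalMillis;
    // Timestamp of the last warning that was actually written.
    private final AtomicLong lastLogged = new AtomicLong(0L);
    // Number of failures that were swallowed without being logged.
    private final AtomicLong suppressed = new AtomicLong(0L);

    QuietStatsCall(long minLogIntervalMillis) {
        this.minLogIntervalMillis = minLogIntervalMillis;
    }

    /** Returns true if the call succeeded, false if it threw; never propagates. */
    boolean run(Callable<?> call) {
        try {
            call.call();
            return true;
        } catch (Exception e) {
            logBriefly(e, System.currentTimeMillis());
            return false;
        }
    }

    /** Logs a one-line warning at most once per interval; otherwise just counts. */
    private void logBriefly(Exception e, long now) {
        long last = lastLogged.get();
        if (now - last >= minLogIntervalMillis && lastLogged.compareAndSet(last, now)) {
            LOG.warning("solr statistics call failed (" + suppressed.getAndSet(0)
                    + " earlier failures not logged): " + e);
        } else {
            suppressed.incrementAndGet();
        }
    }

    /** Failures swallowed silently since the last logged warning. */
    long suppressedCount() {
        return suppressed.get();
    }
}
```

The sketch is independent of dspace and SolrJ.  Wrapping 
server.query(solrQuery) in a Callable passed to run() is one possible way 
to use it, but that wiring is an assumption and would need testing against 
the real SolrLogger.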


*** SOLUTION #2: run a standalone solr server.
Rather than have either of the dspace servers host the "live" solr server, 
we would run a separate generic apache solr server.  This host would own the 
solr server ip address and solr statistics index disk resource.  If our 
monitoring app detects that the solr server has stopped responding, it would 
kill that service and transfer the ip address and disk resource to a standby 
solr server.

But can dspace use a generic (or minimally customized) standalone apache 
solr server for its statistics?  Or is the solr server inside dspace too 
customized for this to be feasible?  I suppose we could install full-blown 
dspace servers just to use their solr servers for this, but that starts to 
seem ugly.

Thanks in advance for any advice.

--Mike Reynolds
  University of Washington Libraries ITS

Compulsory reading: DSpace Mailing List Response Etiquette
http://www.lib.washington.edu/its/mailing+list+response+etiquette.html 


_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette