(I realize I never did summarize my setup, so here it is...)

*** Our environment ***

* DSpace version 1.8.2
* Tomcat 5.5
* Red Hat release 5.8
* Two separate Linux hosts running dspace, both behind a director/ipvs/heartbeat/load-balancing bastion host. All queries are load-balanced between the two dspace servers.

*** Our goals ***

* to collect usage statistics on a single solr server, rather than having each host's solr server collect its own statistics
* to maintain our no-single-point-of-failure environment (if one dspace server goes down, the other server continues to serve content)

I am happy to report that I got this to work -- sharing a solr server between two dspace servers. I must have set up something wrong with my tunneling before, but now I can indeed configure the solr client on dspace host1 to send its solr statistics to the solr server on dspace host2. It takes two changes: pointing the "server" url in modules/solr-statistics.cfg on dspace host1 at dspace host2, and adding the RemoteAddrValve to Tomcat's server.xml on dspace host2 to permit http access from addresses other than 127.0.0.1. Yay!

Now on to the other issue with having two redundant dspace servers and one active solr server -- how best to keep dspace available when one host goes down? The current dspace behavior is that it stops serving pages if it cannot contact the solr server to report usage data. Thus, if my dspace host2 (the one hosting the live solr server) goes down, dspace host1 will stop serving pages as well. Redundancy is lost.

I can think of two reasonably elegant ways to resolve this:

*** SOLUTION #1: change the dspace code to add a timeout for the solr client's attempt to send data to the solr server.

I've found one possible place to tweak for this, in dspace-stats/src/main/java/org/dspace/statistics/SolrLogger.java:

    *** 86,91 ****
    --- 86,95 ----
              try
              {
                  server = new CommonsHttpSolrServer(ConfigurationManager.getProperty("solr-statistics", "server"));
    +
    +             // add a connection timeout to prevent solr client from hanging if solr server offline
    +             server.setConnectionTimeout(500);
    +
                  SolrQuery solrQuery = new SolrQuery()
                          .setQuery("type:2 AND id:1");
                  server.query(solrQuery);

I've tested this change and it seems to work well -- if I make dspace host2 (hosting the "live" solr server) unavailable, dspace host1 will continue serving documents just fine (albeit with a 500 millisecond delay for each document). An error and a Java exception trace get logged to the dspace log for each failure to contact the solr server. If I make dspace host2 available again, dspace host1 resumes sending usage data to the solr server on dspace host2. Nice!

So is this the best place to tweak dspace to accomplish this? Any reason to think this tweak is going to bomb? I think I'd try to add more code to swallow the exception thrown from this so as not to junk up the dspace log, but other than that?

*** SOLUTION #2: run a standalone solr server.

Rather than have either of the dspace servers host the "live" solr server, we would run a separate generic apache solr server. This host would own the solr server ip address and the solr statistics index disk resource. If our monitoring app detects that the solr server has stopped responding, it would kill that service and transfer the ip address and disk resource to a standby solr server.

But can dspace use a generic (or minimally customized) standalone apache solr server for its statistics? Or is the solr server inside dspace too customized for this to be feasible? I suppose we could install full-blown dspace servers just to use their solr servers for this, but that starts to seem ugly.

Thanks in advance for any advice.

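For what it's worth, here is a rough sketch of the "swallow the exception" idea from SOLUTION #1. It is not DSpace code: the class name StatsGuard, the tryPost method and the cool-down value are all made up for illustration, and I am assuming the call that sends a usage record to solr can be wrapped in a Callable. The idea is to log one short line when a send fails and then skip further attempts for a while, so that each page view does not pay the 500 millisecond connection timeout or add another stack trace to the log.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper, not part of DSpace: runs a statistics-posting action,
 * swallows any failure with a one-line message, and skips further attempts
 * until a cool-down period has passed.
 */
final class StatsGuard {
    private final long coolDownMillis;
    // Time (in ms) before which no attempt is made; 0 means "try right away".
    private volatile long skipUntil = 0L;

    StatsGuard(long coolDownMillis) {
        this.coolDownMillis = coolDownMillis;
    }

    /**
     * Returns true if the action ran and succeeded, false if it was skipped
     * (still cooling down) or failed. The caller passes the current time,
     * e.g. System.currentTimeMillis().
     */
    boolean tryPost(Callable<?> action, long nowMillis) {
        if (nowMillis < skipUntil) {
            return false; // solr was down recently; do not try again yet
        }
        try {
            action.call(); // assumption: this wraps the real "send to solr" call
            return true;
        } catch (Exception e) {
            // Remember the failure and log a single short line, no stack trace.
            skipUntil = nowMillis + coolDownMillis;
            System.err.println("solr statistics unavailable, pausing for "
                    + coolDownMillis + " ms: " + e);
            return false;
        }
    }
}
```

In SolrLogger this would wrap whatever call actually posts the usage record, with the connection timeout from the patch above still in place so that the first failed attempt returns quickly. The cost is that usage events arriving during the cool-down are dropped rather than queued.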
--Mike Reynolds
University of Washington Libraries ITS

Compulsory reading: DSpace Mailing List Response Etiquette
http://www.lib.washington.edu/its/mailing+list+response+etiquette.html