In the hope that it helps someone get Solr running cleanly, here's a description of how I have mine set up.

For my Solr install on CentOS 6, I use /opt/solr4 as my installation path, and /index/solr4 as my solr home. The /index directory is a dedicated filesystem; /opt is part of the root filesystem.

From the example directory, I copied cloud-scripts, contexts, etc, lib, webapps, and start.jar over to /opt/solr4. My stuff was created before 4.3.0, so the resources directory didn't exist. I was already using log4j with a custom Solr build, and I put my log4j.properties file in etc instead. I created a logs directory and a run directory in /opt/solr4.
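As a rough sketch of the copy step described above (not the exact commands I ran), something like this would build the same layout. The function name install_solr_layout and the source path in the usage comment are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: copy the Jetty/Solr runtime pieces from an unpacked
# Solr 4.x "example" directory into an install directory, then create the
# logs and run directories next to them.
install_solr_layout() {
  local src="$1" dest="$2" item
  mkdir -p "$dest" || return 1
  for item in cloud-scripts contexts etc lib webapps start.jar; do
    # Skip anything this particular Solr version does not ship.
    [ -e "$src/$item" ] && cp -a "$src/$item" "$dest/"
  done
  mkdir -p "$dest/logs" "$dest/run"
}

# Usage (source path is an assumption, adjust to where you unpacked Solr):
#   install_solr_layout /path/to/solr-4.x/example /opt/solr4
```

On 4.3.0 and later you would probably want to add resources to the list of copied items, since that is where the stock log4j.properties lives.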

My data structure in /index/solr4 is complex. All a new user really needs to know is that solr.xml goes here and dictates the rest of the structure. There is a symlink at /index/solr4/lib, pointing to /opt/solr4/solrlib - so that jars placed in ${solr.solr.home}/lib are actually located in the program directory, not the data directory. That makes for a much cleaner version control scenario - both directories are git repositories cloned from our internal git server.
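The symlink arrangement can be sketched like this. The function name link_shared_lib is hypothetical, and the two paths in the usage comment are the ones from my setup:

```shell
#!/bin/bash
# Hypothetical helper: keep shared jars in the program directory and expose
# them to Solr through a "lib" symlink inside the solr home.
link_shared_lib() {
  local program_dir="$1" solr_home="$2"
  mkdir -p "$program_dir/solrlib" "$solr_home" || return 1
  # -n replaces an existing lib symlink instead of descending into it.
  ln -sfn "$program_dir/solrlib" "$solr_home/lib"
}

# Usage:
#   link_shared_lib /opt/solr4 /index/solr4
```

After that, a jar dropped into /opt/solr4/solrlib shows up as /index/solr4/lib/<name>.jar, but only the program directory's repository has to track it.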

Unlike the example configs, my solrconfig.xml files do not have <lib> directives for loading jars. Solr loads them automatically because the jars live in that symlinked lib directory. See SOLR-4852 for caveats regarding central lib directories.

https://issues.apache.org/jira/browse/SOLR-4852

If you want to run SolrCloud, you need to install ZooKeeper separately and put your zkHost parameter in solr.xml. Due to a bug, putting zkHost in solr.xml doesn't work properly until 4.4.0.

Here's the current state of my init script. It's Red Hat-specific. I used /bin/bash (instead of /bin/sh) in the shebang because I am pretty sure that there are bash-isms in it, and bash is always available on the systems that I use:

http://apaste.info/9fVA

Notable features:
* Runs Solr as an unprivileged user.
* Has three methods for stopping Solr, tries graceful methods first.
 1) The jetty STOPPORT/STOPKEY mechanism.
 2) PID saved by the 'start' action.
 3) Any program using the Solr listening port.
* Before killing by PID, tries to make sure that the process actually is Solr.
* Sets up remote JMX, by default without authentication or SSL.
* Highly tuned CMS garbage collection.
* Sets up GC logging.
* Virtually everything is overridable via /etc/sysconfig/solr4.
* Points at an overridable log4j config file, by default in /opt/solr4/etc.
* Removes the existing PID file if the server is just booting up -- which it knows by noting that server uptime is less than three minutes.
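To illustrate two of the safety checks in that list, here is a minimal sketch. It is not lifted from the pasted script. The function names, the PID file name in the usage comment, and the idea of matching start.jar on the command line are all assumptions for the example, and it relies on /proc, so it is Linux-only:

```shell
#!/bin/bash
# Seconds since boot, taken from the first field of /proc/uptime.
uptime_seconds() {
  local up
  read -r up _ < /proc/uptime
  echo "${up%%.*}"
}

# True if the given uptime is below the limit (default 180 s = three minutes).
is_fresh_boot() {
  local up="$1" limit="${2:-180}"
  [ "$up" -lt "$limit" ]
}

# Remove a PID file that survived a reboot; leave it alone otherwise.
clear_stale_pidfile() {
  local pidfile="$1" up="$2"
  if [ -f "$pidfile" ] && is_fresh_boot "$up"; then
    rm -f "$pidfile"
  fi
}

# True only if the process exists and its command line contains the marker
# (assumed here to be start.jar), so an unrelated process that happens to
# have reused the PID is not killed.
pid_looks_like_solr() {
  local pid="$1" marker="${2:-start.jar}"
  [ -r "/proc/$pid/cmdline" ] || return 1
  tr '\0' ' ' < "/proc/$pid/cmdline" | grep -q -- "$marker"
}

# Usage (PID file name is an assumption):
#   clear_stale_pidfile /opt/solr4/run/solr.pid "$(uptime_seconds)"
#   pid_looks_like_solr "$(cat /opt/solr4/run/solr.pid)" && echo "safe to kill"
```

Passing the uptime in as an argument rather than reading it inside clear_stale_pidfile keeps that function easy to exercise by hand with made-up values.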

It shouldn't be too hard to convert this so it works on Debian-derived systems. That would involve rewriting the portions that use Red Hat init routines, probably with start-stop-daemon. What I'd really like is one script that will work on any system, but that will require a fair amount of work.

It's a work in progress. It should load log4j.properties from resources instead of etc. I'd like to include it in the Solr download, but that won't be possible without a fair amount of documentation and possibly an installation script, which still have to be written.

Feel free to ask questions about anything that doesn't seem clear. I welcome ideas for improvement on both my own setup and the Solr example.

Thanks,
Shawn
