Aaron, if public access is needed, most people just need to query Solr, not
update it. We tend to handle this with a reverse proxy. With a proxy you can
whitelist the request handlers and query params that are visible to the
outside world. You can also use invariants on a request handler (in
solrconfig.xml) to lock parameters down even further.
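
To make the whitelist idea concrete, here is a rough sketch in Python. It is
only an illustration: the handler path and the list of parameter names are
assumptions, not anything Solr mandates, and a real proxy would express the
same rule in its own config language.

```python
from urllib.parse import parse_qsl

# Assumed values for illustration; adjust to your own core and handlers.
ALLOWED_PATHS = {"/solr/collection1/select"}
ALLOWED_PARAMS = {"q", "fq", "fl", "start", "rows", "sort", "wt"}


def is_allowed(path: str, query_string: str) -> bool:
    """Return True only for whitelisted handlers and parameter names."""
    if path not in ALLOWED_PATHS:
        return False
    # parse_qsl decodes percent-escapes, so encoded names are checked too.
    params = parse_qsl(query_string, keep_blank_values=True)
    return all(name in ALLOWED_PARAMS for name, _ in params)
```

Anything not on the list is refused, so a request to an update handler, or a
select request carrying an unexpected parameter such as qt, never reaches
Solr.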

There's a lot of information about this on the ajax-solr lists. I recently
did this for a client in the Windows world using IIS's reverse proxy
features, as documented here:
http://www.opensourceconnections.com/2013/06/17/lockdown-solr-with-iis-as-a-reverse-proxy/comment-page-1/#comment-18709
You can get finer-grained control with a small piece of code that sits
between the Internet and Solr and does the checks that a regex can't easily
catch.
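
One example of a check that is awkward to write as a regex is a numeric
range limit. The sketch below (Python, for illustration only) caps the rows
and start parameters so nobody can pull the whole index in one request. The
caps and the function name are my own assumptions.

```python
from urllib.parse import parse_qsl, urlencode

MAX_ROWS = 100      # assumed cap on page size
MAX_START = 10_000  # assumed cap on paging depth


def sanitize_query(query_string: str) -> str:
    """Rebuild the query string with the numeric paging params checked."""
    limits = {"rows": MAX_ROWS, "start": MAX_START}
    cleaned = []
    for name, value in parse_qsl(query_string, keep_blank_values=True):
        if name in limits:
            # Reject empty, negative, or non-numeric values outright.
            if not (value.isascii() and value.isdigit()):
                raise ValueError(f"{name} must be a non-negative integer")
            # Clamp oversized values to the configured cap.
            value = str(min(int(value), limits[name]))
        cleaned.append((name, value))
    return urlencode(cleaned)
```

The code in front of Solr would call this on each incoming query string and
return a 400 to the client when it raises.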

A simple way to set this up is to bind Solr to localhost:8983 rather than
0.0.0.0 in the Jetty config, then have a proxy running on the Solr box
forward only allowed requests to localhost:8983. Anyone who wants to do an
update or hit the admin interface needs an SSH tunnel to the Solr box.
Everyone else has to go through the reverse proxy.
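
If it helps to see the shape of it, here is a minimal forwarding sketch
using only the Python standard library. It is a toy, not a replacement for
IIS, nginx, or Apache: the public port (8080), the handler path, and the
names upstream_url, SolrProxy, and serve are all assumptions for
illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import urlopen

UPSTREAM = "http://127.0.0.1:8983"           # Solr bound to localhost only
PUBLIC_PATHS = {"/solr/collection1/select"}  # assumed handler path


def upstream_url(request_target: str):
    """Map a public request target to a Solr URL, or None if not allowed."""
    parts = urlsplit(request_target)
    if parts.path not in PUBLIC_PATHS:
        return None
    query = f"?{parts.query}" if parts.query else ""
    # Always rebuild the URL from UPSTREAM so the client cannot pick the host.
    return f"{UPSTREAM}{parts.path}{query}"


class SolrProxy(BaseHTTPRequestHandler):
    # Only GET is implemented; http.server answers other methods with 501.
    def do_GET(self):
        target = upstream_url(self.path)
        if target is None:
            self.send_error(403, "Forbidden")
            return
        try:
            with urlopen(target, timeout=10) as resp:
                status, body = resp.status, resp.read()
                ctype = resp.headers.get("Content-Type", "text/plain")
        except HTTPError as err:
            # Pass Solr's own error responses (400, 404, ...) through.
            status, body = err.code, err.read()
            ctype = err.headers.get("Content-Type", "text/plain")
        except OSError:
            # Covers connection failures and timeouts.
            self.send_error(502, "Solr is unreachable")
            return
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080):
    """Listen on all interfaces and forward allowed GETs to Solr."""
    HTTPServer(("0.0.0.0", port), SolrProxy).serve_forever()
```

Calling serve() on the Solr box exposes only the select handler on port
8080, while 8983 itself stays reachable from localhost alone.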

I prefer this to doing a lot of heavy Jetty config tweaking as it lets me
*mostly* leave Solr-Jetty's default configs alone. I like having a mostly
clean separation of concerns between security and search.

Hope that helps,
-Doug

On Mon, Jun 24, 2013 at 1:51 AM, Aaron Greenspan
<aar...@thinkcomputer.com> wrote:

> Hi,
>
> Some more unsolicited feedback since my last experience setting up Solr…
>
> I am concerned that having a duplicate copy of a large part of my database
> up on the internet at a guessable location, available for the world to see,
> is probably not such a good idea. So I went to look up the various methods
> available to secure Solr, and found that all of them are terrible, if
> recent documentation is even available, which it's often not. Most of the
> blog posts I found are from 2010, presumably long before the version I use
> was created.
>
> According to the Solr Security wiki (
> http://wiki.apache.org/solr/SolrSecurity), it looks like you can edit
> some XML files (if you can find them) in complex ways to turn on HTTP
> authentication, or you can restrict the IP that Solr runs on. Less clear is
> some way to change the default port number from 8983.
>
> The wiki itself is full of semi-useless information, which is pretty
> infuriating since it's supposed to be the best source. The XML edits seem
> to change for different versions of Solr. Statements like "standard Java
> web security can be added by tuning the container and the Solr web
> application configuration itself via web.xml" are not helpful to me. I
> don't know what "standard Java web security" is, nor am I inclined to trust
> it since "Java security" is already believed by many to be something of an
> oxymoron. I don't have any idea where the file web.xml is--the default Solr
> install is a nest of needlessly complex folders. (Is it the one at
> ~/example/solr-webapp/webapp/WEB-INF/web.xml?) At the end of the page,
> there is a reference to "server.xml", but according to my install there is
> no such file.
>
> Basically, instead of (or at least on top of) this giant mess, the web
> interface for Solr should prompt the user, before doing anything else, to
> set up an administrative username and password, which one should be able to
> optionally require for queries and/or updates. It's just common sense. If I
> remember correctly, Netscape Enterprise Server prompted you to do that a
> decade and a half ago, and the internet has gotten a lot less friendly
> since then. You should also be able to limit the IP addresses that Solr
> runs on through the web interface, and change the port if desired, (or
> add/remove/edit users and passwords).
>
> The web server should also log when someone signs into the administrative
> interface, and from what IP address. There's probably some way to do this
> through the "Logging/Level" tree, but it's not exactly clear to me.
>
> In the meantime, I found that the approach most likely to work, and least
> likely to take a week to implement, was just to use iptables to set up a
> firewall on port 8983. Contrary to what one post on StackExchange (voted
> -1) says, it works only if you do the ACCEPT rules (iptables -A INPUT -p
> tcp -s xxx.xxx.xxx.xxx --dport 8983 -j ACCEPT) before the DROP all rule
> (iptables -A INPUT -p tcp --dport 8983 -j DROP). But either way, that's a
> pretty ridiculous solution. I don't know of any other server product that
> disregards security so willingly.
>
> Aaron
>
>
> Aaron Greenspan
> President & CEO
> Think Computer Corporation
>
> telephone +1 415 670 9350
> fax +1 415 373 3959
> e-mail aar...@thinkcomputer.com
> web http://www.thinkcomputer.com
>
>
>


-- 
Doug Turnbull
Search & Big Data Architect
OpenSource Connections <http://o19s.com>
