Re: solr security patch

2014-11-05 Thread kuttan palliyalil
Got it. Thank you Shawn.
Regards
Raj

 On Wednesday, November 5, 2014 10:39 PM, Shawn Heisey 
 wrote:
   

 On 11/5/2014 5:04 PM, kuttan palliyalil wrote:
> I am trying to apply the security patch (SOLR-4470.patch) to the Solr 4.10.1 tag.
> SOLR-4470.patch 14/Mar/14 16:15 278 kB
>
> I am getting hunk failures. Could anyone confirm whether this is the right
> patch for 4.10.1?

The latest patch is almost 8 months old.  The pace of change in the
Lucene/Solr codebase is extremely fast, so this is VERY outdated.

That patch will successfully apply to trunk revision 1577540, but you
won't be able to use "svn up" to bring the code up to date.

The patch will not apply successfully to any up-to-date branch or tag.
It's very likely that you will need to examine any patch hunks that
don't apply, and make the changes manually.  There is no automated way
to handle this.

Thanks,
Shawn



   

Re: solr security patch

2014-11-05 Thread Shawn Heisey
On 11/5/2014 5:04 PM, kuttan palliyalil wrote:
> I am trying to apply the security patch (SOLR-4470.patch) to the Solr 4.10.1 tag.
> SOLR-4470.patch 14/Mar/14 16:15 278 kB
>
> I am getting hunk failures. Could anyone confirm whether this is the right
> patch for 4.10.1?

The latest patch is almost 8 months old.  The pace of change in the
Lucene/Solr codebase is extremely fast, so this is VERY outdated.

That patch will successfully apply to trunk revision 1577540, but you
won't be able to use "svn up" to bring the code up to date.

The patch will not apply successfully to any up-to-date branch or tag.
It's very likely that you will need to examine any patch hunks that
don't apply, and make the changes manually.  There is no automated way
to handle this.

Thanks,
Shawn
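
For readers following along: when GNU patch cannot apply a hunk, it normally
saves that hunk in a reject file next to the target (the target name plus
".rej"). The following is a minimal Python sketch, not part of the original
message, that lists those reject files and counts the hunks in each, so you
can see how much manual merging is left. The directory name "solr-4.10.1" is
a hypothetical checkout location.

```python
from pathlib import Path


def count_hunks(reject_text):
    """Count hunks in a reject file.

    Unified-format hunks start with '@@ '; context-format hunks are
    separated by a line of asterisks.
    """
    return sum(
        1
        for line in reject_text.splitlines()
        if line.startswith("@@ ") or line.startswith("***************")
    )


def find_rejects(checkout_dir):
    """Map each *.rej file under checkout_dir to its number of rejected hunks."""
    results = {}
    for rej in sorted(Path(checkout_dir).rglob("*.rej")):
        results[str(rej)] = count_hunks(rej.read_text(errors="replace"))
    return results


if __name__ == "__main__":
    # "solr-4.10.1" is a hypothetical checkout directory name.
    for path, hunks in find_rejects("solr-4.10.1").items():
        print(f"{path}: {hunks} rejected hunk(s)")
```

Each reported hunk still has to be read and applied by hand, as described
above; the script only tells you where to look.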



Re: SOLR Security - Displaying endpoints to public

2014-01-07 Thread Michael Della Bitta
I think generally it might be true that it's too difficult for an admin
without very specific knowledge of Solr internals to utilize simple URL
rewriting to prevent exploits. To show what I mean, here's a story where
someone was able to exploit a Solr server through a custom webapp, which in
theory is many times more obfuscated than a simple rewrite.

http://www.agarri.fr/kom/archives/2013/11/27/compromising_an_unreachable_solr_server_with_cve-2013-6397/index.html

I know that some of the vulnerabilities used in this writeup have been
fixed, but it is likely that other vulnerabilities like these will appear
in the future. That's just how software development works. It would be
hard for a casual user to maintain a rewriting scheme that both secures
the current release of Solr in every configuration and prevents any new
features from being exploited.

Just my two cents.

Michael Della Bitta

Applications Developer

o: +1 646 532 3062  | c: +1 917 477 7906

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions  | g+:
plus.google.com/appinions
w: appinions.com 


On Tue, Jan 7, 2014 at 3:45 AM, Raymond Wiker  wrote:

> Indeed it is - but you'll also need mod_proxy ("just" rewriting will not be
> sufficient).
>
>
> On Tue, Jan 7, 2014 at 3:42 AM, Otis Gospodnetic <
> otis.gospodne...@gmail.com
> > wrote:
>
> > Apache url_rewrite can help with this and it's only a few minutes to set
> > up.
> >
> > Otis
> > --
> > Performance Monitoring * Log Analytics * Search Analytics
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> >
> > On Mon, Jan 6, 2014 at 12:55 PM, Developer  wrote:
> >
> > > Hi,
> > >
> > > We are currently showing the SOLR endpoints to the public when using
> our
> > > application (public users would be able to view the SOLR endpoints
> > > (/select)
> > > and the query in debugging console).
> > >
> > > I am trying to figure out if there is any security threat in terms of
> > > displaying the endpoints directly in internet. We have disabled the
> > update
> > > handler in production so I assume writes / updates are not possible.
> > >
> > > The below URL mentions a point 'Solr does not concern itself with
> > security
> > > either at the document level or the communication level. It is strongly
> > > recommended that the application server containing Solr be firewalled
> > such
> > > the only clients with access to Solr are your own.'
> > >
> > > Is the above statement true even if we just display the read-only
> > endpoints
> > > to the public users? Can someone please advise?
> > >
> > > http://wiki.apache.org/solr/SolrSecurity
> > >
> > >
> > >
> > > --
> > > View this message in context:
> > >
> >
> http://lucene.472066.n3.nabble.com/SOLR-Security-Displaying-endpoints-to-public-tp4109792.html
> > > Sent from the Solr - User mailing list archive at Nabble.com.
> > >
> >
>


Re: SOLR Security - Displaying endpoints to public

2014-01-07 Thread Raymond Wiker
Indeed it is - but you'll also need mod_proxy ("just" rewriting will not be
sufficient).


On Tue, Jan 7, 2014 at 3:42 AM, Otis Gospodnetic  wrote:

> Apache url_rewrite can help with this and it's only a few minutes to set
> up.
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Mon, Jan 6, 2014 at 12:55 PM, Developer  wrote:
>
> > Hi,
> >
> > We are currently showing the SOLR endpoints to the public when using our
> > application (public users would be able to view the SOLR endpoints
> > (/select)
> > and the query in debugging console).
> >
> > I am trying to figure out if there is any security threat in terms of
> > displaying the endpoints directly in internet. We have disabled the
> update
> > handler in production so I assume writes / updates are not possible.
> >
> > The below URL mentions a point 'Solr does not concern itself with
> security
> > either at the document level or the communication level. It is strongly
> > recommended that the application server containing Solr be firewalled
> such
> > the only clients with access to Solr are your own.'
> >
> > Is the above statement true even if we just display the read-only
> endpoints
> > to the public users? Can someone please advise?
> >
> > http://wiki.apache.org/solr/SolrSecurity
> >
> >
> >
> > --
> > View this message in context:
> >
> http://lucene.472066.n3.nabble.com/SOLR-Security-Displaying-endpoints-to-public-tp4109792.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>


Re: SOLR Security - Displaying endpoints to public

2014-01-06 Thread Otis Gospodnetic
Apache url_rewrite can help with this and it's only a few minutes to set up.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Jan 6, 2014 at 12:55 PM, Developer  wrote:

> Hi,
>
> We are currently showing the SOLR endpoints to the public when using our
> application (public users would be able to view the SOLR endpoints
> (/select)
> and the query in debugging console).
>
> I am trying to figure out if there is any security threat in terms of
> displaying the endpoints directly in internet. We have disabled the update
> handler in production so I assume writes / updates are not possible.
>
> The below URL mentions a point 'Solr does not concern itself with security
> either at the document level or the communication level. It is strongly
> recommended that the application server containing Solr be firewalled such
> the only clients with access to Solr are your own.'
>
> Is the above statement true even if we just display the read-only endpoints
> to the public users? Can someone please advise?
>
> http://wiki.apache.org/solr/SolrSecurity
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-Security-Displaying-endpoints-to-public-tp4109792.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: SOLR Security - Displaying endpoints to public

2014-01-06 Thread Raymond Wiker

On 06 Jan 2014, at 19:37 , Shawn Heisey  wrote:

> On 1/6/2014 11:18 AM, Shawn Heisey wrote:
>> Even if you disable admin handlers so that it's impossible to gather full 
>> information about your schema and other settings, generating legitimate 
>> queries is probably enough for an attacker to get the information they need.
> 
> Self-replying on this point: If you *don't* disable admin handlers, an 
> attacker would also be able to simply unload the core and ask Solr to delete 
> it from disk.
> 
> A side effect of disabling admin handlers is that the admin UI won't work 
> either.  In terms of security hardening, that's a good thing ... but it makes 
> it *very* difficult to gather useful information about your installation's 
> health.
> 

If you want to apply some sort of access restrictions on the content, you will 
need a mechanism to identify the user and add parameters to restrict the result 
set. You will also need to stop the user from circumventing this mechanism, 
which basically means that the "raw" Solr endpoints must not be accessible to 
the user.
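
A minimal Python sketch of the mechanism described above, under stated
assumptions: the application has already identified the user and knows which
groups they belong to, and documents carry a field naming the groups allowed
to see them. The field name "acl_group" and the list of user-settable
parameters are hypothetical. The point is that the filter is added on the
server side and any filter the client tries to send is dropped.

```python
from urllib.parse import urlencode

# Parameters an end user may set; anything else (including "fq") is dropped.
# This list is illustrative only.
ALLOWED_USER_PARAMS = {"q", "start", "rows"}


def _quote(value):
    """Quote a value as a phrase, escaping backslashes and double quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_solr_params(user_params, allowed_groups):
    """Return a list of (name, value) pairs with a server-side access filter."""
    params = [(k, v) for k, v in user_params.items() if k in ALLOWED_USER_PARAMS]
    if allowed_groups:
        # "acl_group" is a hypothetical field listing who may see each document.
        clause = " OR ".join(_quote(g) for g in allowed_groups)
        params.append(("fq", "acl_group:(%s)" % clause))
    else:
        # No groups known for this user: add a filter intended to match nothing.
        params.append(("fq", "-*:*"))
    return params


def to_query_string(user_params, allowed_groups):
    """URL-encode the filtered parameters for the request to Solr."""
    return urlencode(build_solr_params(user_params, allowed_groups))
```

This only helps if the raw Solr endpoints are unreachable from outside, so
that every request has to pass through the code that adds the filter.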



Re: SOLR Security - Displaying endpoints to public

2014-01-06 Thread Shawn Heisey

On 1/6/2014 11:18 AM, Shawn Heisey wrote:
> Even if you disable admin handlers so that it's impossible to gather
> full information about your schema and other settings, generating
> legitimate queries is probably enough for an attacker to get the
> information they need.


Self-replying on this point: If you *don't* disable admin handlers, an 
attacker would also be able to simply unload the core and ask Solr to 
delete it from disk.


A side effect of disabling admin handlers is that the admin UI won't 
work either.  In terms of security hardening, that's a good thing ... 
but it makes it *very* difficult to gather useful information about your 
installation's health.


Thanks,
Shawn



Re: SOLR Security - Displaying endpoints to public

2014-01-06 Thread Shawn Heisey

On 1/6/2014 10:55 AM, Developer wrote:

> We are currently showing the SOLR endpoints to the public when using our
> application (public users would be able to view the SOLR endpoints (/select)
> and the query in debugging console).
>
> I am trying to figure out if there is any security threat in terms of
> displaying the endpoints directly in internet. We have disabled the update
> handler in production so I assume writes / updates are not possible.
>
> The below URL mentions a point 'Solr does not concern itself with security
> either at the document level or the communication level. It is strongly
> recommended that the application server containing Solr be firewalled such
> the only clients with access to Solr are your own.'
>
> Is the above statement true even if we just display the read-only endpoints
> to the public users? Can someone please advise?


Without an application between the public and Solr that sanitizes user 
input, an attacker can send denial of service queries to your Solr 
instance that will cause it to spin so hard it can't serve regular 
queries.  We can't block such things in server code, because sometimes 
such queries *are* legitimate; they just take a lot of resources and 
time to complete.


Even if you disable admin handlers so that it's impossible to gather 
full information about your schema and other settings, generating 
legitimate queries is probably enough for an attacker to get the 
information they need.


If your design is such that client-side scripting handles almost 
everything, you probably need to set up a proxy in front of Solr that's 
configured to deny things that look suspicious.  I do not know of any 
publicly available proxy configurations like this, and I have never come 
across any private ones either.
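
As a rough illustration of what "deny things that look suspicious" might
mean, here is a minimal Python sketch. It is not a known or published proxy
configuration: the limits and the checks themselves are assumptions chosen
for the example, and a real deployment would need rules matched to its own
schema and query patterns.

```python
MAX_ROWS = 100          # assumed limits; tune for your own index
MAX_START = 1000
MAX_QUERY_LENGTH = 256


def check_request(params):
    """Return (allowed, reason) for a dict of query parameters."""
    q = params.get("q", "")
    if len(q) > MAX_QUERY_LENGTH:
        return False, "query too long"
    # Leading wildcards can be very expensive; "*:*" (match all) is allowed.
    # Field-qualified terms such as title:*foo are NOT caught by this check.
    if any(t.startswith(("*", "?")) and t != "*:*" for t in q.split()):
        return False, "leading wildcard"
    for name, limit in (("rows", MAX_ROWS), ("start", MAX_START)):
        value = params.get(name, "0")
        if not (value.isascii() and value.isdigit()) or int(value) > limit:
            return False, name + " out of range"
    return True, "ok"
```

Checks like these reduce the easiest denial of service attempts (huge page
sizes, deep paging, oversized queries) but cannot recognize every expensive
query, which is the limitation described above.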


Thanks,
Shawn



Re: Solr Security

2013-06-24 Thread Doug Turnbull
Aaron, if public access is needed, most people just need to query Solr, not
update it. We tend to do this with reverse proxies. With a proxy you can
whitelist the request handlers and query params that are visible to the
outside world. You can use invariants to restrict many things even further.

There's a lot of information on the ajax-solr lists on this. I recently did
this for a client in the Windows world with IIS's reverse proxy features as
documented here:
http://www.opensourceconnections.com/2013/06/17/lockdown-solr-with-iis-as-a-reverse-proxy/comment-page-1/#comment-18709.
You can get more detailed with a small snippet of code that sits between
the Internet and Solr and does things that a regex can't easily catch.

A simple way to set this up is to bind Solr at 8983 on localhost not
0.0.0.0 in Jetty configs, then have the proxy resident on the Solr box
forward only allowed requests to localhost:8983. Anyone who wants to do an
update or hit the admin interface needs to be ssh tunneled to the Solr box.
Everyone else has to go through the reverse proxy.
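
The whitelist decision a proxy makes can be sketched in a few lines of
Python. This is an illustration only, not the IIS configuration from the
linked post: the path "/solr/select" and the parameter list are assumptions,
and the function only decides whether a request should be forwarded to
localhost:8983.

```python
from urllib.parse import parse_qsl, urlsplit

ALLOWED_PATHS = {"/solr/select"}    # hypothetical; use your own handler path
ALLOWED_PARAMS = {"q", "start", "rows", "wt", "sort"}


def is_allowed(url):
    """Return True only for whitelisted handlers with whitelisted parameters."""
    parts = urlsplit(url)
    if parts.path not in ALLOWED_PATHS:
        return False
    names = {name for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    # Every parameter name must be on the whitelist; unknown ones reject the request.
    return names <= ALLOWED_PARAMS
```

Rejecting anything not explicitly listed means parameters such as qt or
stream.url never reach Solr unless someone adds them to the list on purpose.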

I prefer this to doing a lot of heavy Jetty config tweaking as it lets me
*mostly* leave Solr-Jetty's default configs alone. I like having a mostly
clean separation of concerns between security and search.

Hope that helps,
-Doug






On Mon, Jun 24, 2013 at 1:51 AM, Aaron Greenspan
wrote:

> Hi,
>
> Some more unsolicited feedback since my last experience setting up Solr…
>
> I am concerned that having a duplicate copy of a large part of my database
> up on the internet at a guessable location, available for the world to see,
> is probably not such a good idea. So I went to look up the various methods
> available to secure Solr, and found that all of them are terrible, if
> recent documentation is even available, which it's often not. Most of the
> blog posts I found are from 2010, presumably long before the version I use
> was created.
>
> According to the Solr Security wiki (
> http://wiki.apache.org/solr/SolrSecurity), it looks like you can edit
> some XML files (if you can find them) in complex ways to turn on HTTP
> authentication, or you can restrict the IP that Solr runs on. Less clear is
> some way to change the default port number from 8983.
>
> The wiki itself is full of semi-useless information, which is pretty
> infuriating since it's supposed to be the best source. The XML edits seem
> to change for different versions of Solr. Statements like "standard Java
> web security can be added by tuning the container and the Solr web
> application configuration itself via web.xml" are not helpful to me. I
> don't know what "standard Java web security" is, nor am I inclined to trust
> it since "Java security" is already believed by many to be something of an
> oxymoron. I don't have any idea where the file web.xml is--the default Solr
> install is a nest of needlessly complex folders. (Is it the one at
> ~/example/solr-webapp/webapp/WEB-INF/web.xml?) At the end of the page,
> there is a reference to "server.xml", but according to my install there is
> no such file.
>
> Basically, instead of (or at least on top of) this giant mess, the web
> interface for Solr should prompt the user, before doing anything else, to
> set up an administrative username and password, which one should be able to
> optionally require for queries and/or updates. It's just common sense. If I
> remember correctly, Netscape Enterprise Server prompted you to do that a
> decade and a half ago, and the internet has gotten a lot less friendly
> since then. You should also be able to limit the IP addresses that Solr
> runs on through the web interface, and change the port if desired, (or
> add/remove/edit users and passwords).
>
> The web server should also log when someone signs into the administrative
> interface, and from what IP address. There's probably some way to do this
> through the "Logging/Level" tree, but it's not exactly clear to me.
>
> In the meantime, I found that the approach most likely to work, and least
> likely to take a week to implement, was just to use iptables to set up a
> firewall on port 8983. Contrary to what one post on StackExchange (voted
> -1) says, it works only if you do the ACCEPT rules (iptables -A INPUT -p
> tcp -s xxx.xxx.xxx.xxx --dport 8983 -j ACCEPT) before the DROP all rule
> (iptables -A INPUT -p tcp --dport 8983 -j DROP). But either way, that's a
> pretty ridiculous solution. I don't know of any other server product that
> disregards security so willingly.
>
> Aaron
>
>
> Aaron Greenspan
> President & CEO
> Think Computer Corporation
>
> telephone +1 415 670 9350
> fax +1 415 373 3959
> e-mail aar...@thinkcomputer.com
> web http://www.thinkcomputer.com
>
>
>


-- 
Doug Turnbull
Search & Big Data Architect
OpenSource Connections 


RE: Solr Security

2013-06-24 Thread Boogie Shafer
it's a little frustrating to see the smug responses to your query

and it's fair to say the solr security situation could be *improved*

this JIRA ticket is worth reading
https://issues.apache.org/jira/browse/SOLR-4470

in short

-it is possible to restrict access to solr nodes using connection filtering 
(this gets really cumbersome in a public cloud architecture, but apparently the 
idea of a secure perimeter and a trusted lan dies hard)

-it is possible to protect the communications between solr nodes using 
techniques like ipsec between the nodes themselves (ipsec adoption for anything 
but VPN clients has never been high for a reason; again, it's cumbersome)

-it's not yet possible to implement auth for node-node communication in the 
cloud config, so if you want to go finer grained than "this node can talk to 
this node" you are out of luck


there seems to be a long running debate about whether solr should address 
security or leave it up to the "container"... it would seem that debate should 
have been moot once the cloud based topology came about and solr nodes started 
talking to other solr nodes and to zookeeper servers, and yet here we 
are... still acting like it runs entirely on a single system and expecting that 
the webapp "container" can provide all the security needed.

so yeah, there are some obvious shortcomings, deploy appropriately



From: Aaron Greenspan
Sent: Sunday, June 23, 2013 22:51
To: solr-user@lucene.apache.org
Subject: Solr Security

Hi,

Some more unsolicited feedback since my last experience setting up Solr…

I am concerned that having a duplicate copy of a large part of my database up 
on the internet at a guessable location, available for the world to see, is 
probably not such a good idea. So I went to look up the various methods 
available to secure Solr, and found that all of them are terrible, if recent 
documentation is even available, which it's often not. Most of the blog posts I 
found are from 2010, presumably long before the version I use was created.

According to the Solr Security wiki (http://wiki.apache.org/solr/SolrSecurity), 
it looks like you can edit some XML files (if you can find them) in complex 
ways to turn on HTTP authentication, or you can restrict the IP that Solr runs 
on. Less clear is some way to change the default port number from 8983.

The wiki itself is full of semi-useless information, which is pretty 
infuriating since it's supposed to be the best source. The XML edits seem to 
change for different versions of Solr. Statements like "standard Java web 
security can be added by tuning the container and the Solr web application 
configuration itself via web.xml" are not helpful to me. I don't know what 
"standard Java web security" is, nor am I inclined to trust it since "Java 
security" is already believed by many to be something of an oxymoron. I don't 
have any idea where the file web.xml is--the default Solr install is a nest of 
needlessly complex folders. (Is it the one at 
~/example/solr-webapp/webapp/WEB-INF/web.xml?) At the end of the page, there is 
a reference to "server.xml", but according to my install there is no such file.

Basically, instead of (or at least on top of) this giant mess, the web 
interface for Solr should prompt the user, before doing anything else, to set 
up an administrative username and password, which one should be able to 
optionally require for queries and/or updates. It's just common sense. If I 
remember correctly, Netscape Enterprise Server prompted you to do that a decade 
and a half ago, and the internet has gotten a lot less friendly since then. You 
should also be able to limit the IP addresses that Solr runs on through the web 
interface, and change the port if desired, (or add/remove/edit users and 
passwords).

The web server should also log when someone signs into the administrative 
interface, and from what IP address. There's probably some way to do this 
through the "Logging/Level" tree, but it's not exactly clear to me.

In the meantime, I found that the approach most likely to work, and least 
likely to take a week to implement, was just to use iptables to set up a 
firewall on port 8983. Contrary to what one post on StackExchange (voted -1) 
says, it works only if you do the ACCEPT rules (iptables -A INPUT -p tcp -s 
xxx.xxx.xxx.xxx --dport 8983 -j ACCEPT) before the DROP all rule (iptables -A 
INPUT -p tcp --dport 8983 -j DROP). But either way, that's a pretty ridiculous 
solution. I don't know of any other server product that disregards security so 
willingly.
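
The reason the order matters is that iptables walks a chain from top to
bottom and stops at the first rule whose target decides the packet, and -A
appends to the end of the chain. A small Python simulation of that
first-match behaviour, offered as a sketch: the rule tuples are a simplified
model invented for this example, and 192.0.2.10 stands in for the
xxx.xxx.xxx.xxx address above.

```python
from ipaddress import ip_address, ip_network


def evaluate(rules, src, dport, default="ACCEPT"):
    """Return the target of the first matching rule, or the chain default.

    Each rule is (source_network_or_None, destination_port, target);
    a source of None means "any source".
    """
    for source, port, target in rules:
        if port != dport:
            continue
        if source is None or ip_address(src) in ip_network(source):
            return target
    return default


# ACCEPT appended first, then DROP: the trusted host gets through.
accept_then_drop = [("192.0.2.10/32", 8983, "ACCEPT"), (None, 8983, "DROP")]

# DROP appended first: it matches every packet to 8983, so the ACCEPT
# rule after it is never reached.
drop_then_accept = [(None, 8983, "DROP"), ("192.0.2.10/32", 8983, "ACCEPT")]
```

With accept_then_drop, evaluate(..., "192.0.2.10", 8983) returns "ACCEPT"
and any other source returns "DROP"; with drop_then_accept every source is
dropped.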

Aaron


Aaron Greenspan
President & CEO
Think Computer Corporation

telephone +1 415 670 9350
fax +1 415 373 3959
e-mail aar...@thinkcomputer.com
web http://www.thinkcomputer.com




Re: Solr Security

2013-06-24 Thread Andy Lester

On Jun 24, 2013, at 12:51 AM, Aaron Greenspan  wrote:

>  all of them are terrible,

> it looks like you can edit some XML files (if you can find them) 

> The wiki itself is full of semi-useless information, which is pretty 
> infuriating since it's supposed to be the best source.

> Statements like "standard Java web security can be added by tuning the 
> container and the Solr web application configuration itself via web.xml" are 
> not helpful to me.

>  this giant mess,

> It's just common sense.

> Netscape Enterprise Server prompted you to do that a decade and a half ago

>  But either way, that's a pretty ridiculous solution.

> I don't know of any other server product that disregards security so 
> willingly.


Why are you wasting your time with such an inferior project?  Perhaps 
ElasticSearch is more to your liking.

xoxo,
Andy

--
Andy Lester => a...@petdance.com => www.petdance.com => AIM:petdance



Re: Solr Security

2013-06-24 Thread Walter Underwood
http://lmgtfy.com/?q=jetty+access+control

wunder

On Jun 23, 2013, at 10:51 PM, Aaron Greenspan wrote:

> Hi,
> 
> Some more unsolicited feedback since my last experience setting up Solr…
> 
> I am concerned that having a duplicate copy of a large part of my database up 
> on the internet at a guessable location, available for the world to see, is 
> probably not such a good idea. So I went to look up the various methods 
> available to secure Solr, and found that all of them are terrible, if recent 
> documentation is even available, which it's often not. Most of the blog posts 
> I found are from 2010, presumably long before the version I use was created.
> 
> According to the Solr Security wiki 
> (http://wiki.apache.org/solr/SolrSecurity), it looks like you can edit some 
> XML files (if you can find them) in complex ways to turn on HTTP 
> authentication, or you can restrict the IP that Solr runs on. Less clear is 
> some way to change the default port number from 8983.
> 
> The wiki itself is full of semi-useless information, which is pretty 
> infuriating since it's supposed to be the best source. The XML edits seem to 
> change for different versions of Solr. Statements like "standard Java web 
> security can be added by tuning the container and the Solr web application 
> configuration itself via web.xml" are not helpful to me. I don't know what 
> "standard Java web security" is, nor am I inclined to trust it since "Java 
> security" is already believed by many to be something of an oxymoron. I don't 
> have any idea where the file web.xml is--the default Solr install is a nest 
> of needlessly complex folders. (Is it the one at 
> ~/example/solr-webapp/webapp/WEB-INF/web.xml?) At the end of the page, there 
> is a reference to "server.xml", but according to my install there is no such 
> file.
> 
> Basically, instead of (or at least on top of) this giant mess, the web 
> interface for Solr should prompt the user, before doing anything else, to set 
> up an administrative username and password, which one should be able to 
> optionally require for queries and/or updates. It's just common sense. If I 
> remember correctly, Netscape Enterprise Server prompted you to do that a 
> decade and a half ago, and the internet has gotten a lot less friendly since 
> then. You should also be able to limit the IP addresses that Solr runs on 
> through the web interface, and change the port if desired, (or 
> add/remove/edit users and passwords).
> 
> The web server should also log when someone signs into the administrative 
> interface, and from what IP address. There's probably some way to do this 
> through the "Logging/Level" tree, but it's not exactly clear to me.
> 
> In the meantime, I found that the approach most likely to work, and least 
> likely to take a week to implement, was just to use iptables to set up a 
> firewall on port 8983. Contrary to what one post on StackExchange (voted -1) 
> says, it works only if you do the ACCEPT rules (iptables -A INPUT -p tcp -s 
> xxx.xxx.xxx.xxx --dport 8983 -j ACCEPT) before the DROP all rule (iptables -A 
> INPUT -p tcp --dport 8983 -j DROP). But either way, that's a pretty 
> ridiculous solution. I don't know of any other server product that disregards 
> security so willingly.
> 
> Aaron
> 
>   
> Aaron Greenspan
> President & CEO
> Think Computer Corporation
> 
> telephone +1 415 670 9350
> fax +1 415 373 3959
> e-mail aar...@thinkcomputer.com
> web http://www.thinkcomputer.com
> 






Re: Solr Security

2013-06-24 Thread Daniel Collins
To change Solr's default port number just pass -Djetty.port= on the
command line, works a treat.

As Solr is deployed as a web app, it is assumed that the administrator
is familiar with web apps, servlet containers and their security. If
not, that is something you need to investigate generally.  Solr comes
out of the box with Jetty, but lots of installations use Tomcat and other
servlet engines, so having standard procedures isn't really viable when all
the servlet containers use different mechanisms to configure their security.

I believe that historically the out-of-the-box deployment was very much an
example installation (just for testing/getting to know Solr), and not
recommended as a baseline for building a production system. That said, I
know there is work ongoing to make it a lot more user-friendly.  I can't
say I've done it, but
http://www.eclipse.org/jetty/documentation/current/configuring-security.html#configuring-security-authentication
seems a reasonable place to start with the default embedded Jetty installation
that Solr uses.  I would suggest starting with a simpler webapp and
learning the basics of web app deployment and security through Jetty (or
Tomcat, or whatever you want to use).

web.xml is a standard file in servlet engines; in our installation it is in
solr-webapp/webapp/WEB-INF/web.xml.  It is contained within
solr.war and is unpacked the first time you run Solr, so if you
want to change it, you can run Solr once and then hack that file, but to
do it properly, you will need to re-bundle it back into solr.war.
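
Since a WAR file is an ordinary ZIP archive, you can inspect the bundled
web.xml without running Solr first. A minimal Python sketch, assuming the
archive follows the standard layout with the descriptor at WEB-INF/web.xml;
the check for a security-constraint element is only a rough text search, not
a proper XML parse.

```python
import zipfile


def read_web_xml(war_path):
    """Return the text of WEB-INF/web.xml from a WAR file."""
    with zipfile.ZipFile(war_path) as war:
        return war.read("WEB-INF/web.xml").decode("utf-8")


def has_security_constraint(web_xml):
    """Rough check for a <security-constraint> element in the descriptor."""
    return "<security-constraint" in web_xml
```

For example, has_security_constraint(read_web_xml("solr.war")) tells you
whether the descriptor inside the archive already declares any access
restrictions.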



On 24 June 2013 08:04, K Wong  wrote:

> You might want to read up on Jetty webserver security if that is what you
> are using for the web container.
>
> K
>


Re: Solr Security

2013-06-24 Thread K Wong
You might want to read up on Jetty webserver security if that is what you
are using for the web container.

K


Re: Solr Security

2013-06-23 Thread VIGNESH S
Hi Aaron,

Are you talking about securing the Lucene index?

If so, you can try using https://code.google.com/p/lucenetransform/.

Thanks and Regards
Vignesh Srinivasan
9739135640


On Mon, Jun 24, 2013 at 11:21 AM, Aaron Greenspan
wrote:

> Hi,
>
> Some more unsolicited feedback since my last experience setting up Solr…
>
> I am concerned that having a duplicate copy of a large part of my database
> up on the internet at a guessable location, available for the world to see,
> is probably not such a good idea. So I went to look up the various methods
> available to secure Solr, and found that all of them are terrible, if
> recent documentation is even available, which it's often not. Most of the
> blog posts I found are from 2010, presumably long before the version I use
> was created.
>
> According to the Solr Security wiki (
> http://wiki.apache.org/solr/SolrSecurity), it looks like you can edit
> some XML files (if you can find them) in complex ways to turn on HTTP
> authentication, or you can restrict the IP that Solr runs on. Less clear is
> some way to change the default port number from 8983.
>
> The wiki itself is full of semi-useless information, which is pretty
> infuriating since it's supposed to be the best source. The XML edits seem
> to change for different versions of Solr. Statements like "standard Java
> web security can be added by tuning the container and the Solr web
> application configuration itself via web.xml" are not helpful to me. I
> don't know what "standard Java web security" is, nor am I inclined to trust
> it since "Java security" is already believed by many to be something of an
> oxymoron. I don't have any idea where the file web.xml is--the default Solr
> install is a nest of needlessly complex folders. (Is it the one at
> ~/example/solr-webapp/webapp/WEB-INF/web.xml?) At the end of the page,
> there is a reference to "server.xml", but according to my install there is
> no such file.
>
> Basically, instead of (or at least on top of) this giant mess, the web
> interface for Solr should prompt the user, before doing anything else, to
> set up an administrative username and password, which one should be able to
> optionally require for queries and/or updates. It's just common sense. If I
> remember correctly, Netscape Enterprise Server prompted you to do that a
> decade and a half ago, and the internet has gotten a lot less friendly
> since then. You should also be able to limit the IP addresses that Solr
> runs on through the web interface, and change the port if desired, (or
> add/remove/edit users and passwords).
>
> The web server should also log when someone signs into the administrative
> interface, and from what IP address. There's probably some way to do this
> through the "Logging/Level" tree, but it's not exactly clear to me.
>
> In the meantime, I found that the approach most likely to work, and least
> likely to take a week to implement, was just to use iptables to set up a
> firewall on port 8983. Contrary to what one post on StackExchange (voted
> -1) says, it works only if you do the ACCEPT rules (iptables -A INPUT -p
> tcp -s xxx.xxx.xxx.xxx --dport 8983 -j ACCEPT) before the DROP all rule
> (iptables -A INPUT -p tcp --dport 8983 -j DROP). But either way, that's a
> pretty ridiculous solution. I don't know of any other server product that
> disregards security so willingly.
>
> Aaron
>
>
> Aaron Greenspan
> President & CEO
> Think Computer Corporation
>
> telephone +1 415 670 9350
> fax +1 415 373 3959
> e-mail aar...@thinkcomputer.com
> web http://www.thinkcomputer.com
>
>
>


-- 
Thanks and Regards
Vignesh Srinivasan
9739135640


Re: SOLR Security

2012-05-15 Thread Anupam Bhattacharya
Thanks for the suggestions.

I tried to use SolrJ within my servlet, but the SolrJ QueryResponse is
not returning a well-formed JSON object.
I need the JSON string with quotes, as below, but
QueryResponse.toString() doesn't return JSON with quotes at all.

jsonp1337064466204({"responseHeader":{"status":0,"QTime":0,"params":{"json.wrf":"jsonp1337064466204","facet":"true","facet.mincount":"1","q":"*:*","facet.limit":"-1","json.nl":"map","facet.field":["title","abstract"],"wt":"json","rows":"0"}},"response":{"numFound":0,"start":0,"docs":[]},"facet_counts":{"facet_queries":{},"facet_fields":{"title":{},"abstract":{}},"facet_dates":{},"facet_ranges":{}}})
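
The output above is JSON wrapped in the callback named by json.wrf, so
whatever sits between the browser and Solr has to produce that shape itself.
The thread concerns a Java servlet; the following Python sketch only
illustrates the idea of serializing with a real JSON encoder (rather than a
generic toString()) and wrapping the result in a validated callback name. The
function name to_jsonp is made up for this example.

```python
import json
import re

# Accept only plain identifier-style callback names, to avoid script injection.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def to_jsonp(callback, payload):
    """Serialize payload as JSON and wrap it in a JSONP callback."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid callback name")
    return "%s(%s)" % (callback, json.dumps(payload, separators=(",", ":")))
```

The same layer could instead pass Solr's own wt=json response through
unchanged, which avoids re-serializing the result at all.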

Regards

Anupam


On Fri, May 11, 2012 at 7:56 PM, Welty, Richard wrote:

> in fact, there's a sample proxy.php on the ajax-solr web page which can
> easily be modified into a security layer. my solr servers only listen to
> requests issued by a narrow list of systems, and everything gets routed
> through a modified copy of the proxy.php file, which checks whether the
> user is logged in, and adds terms to the query to limit returned results to
> those the user is permitted to see.
>
>
> -Original Message-
> From: Jan Høydahl [mailto:j...@hoydahl.no]
> Sent: Fri 5/11/2012 9:45 AM
> To: solr-user@lucene.apache.org
> Subject: Re: SOLR Security
>
> Hi,
>
> There is nothing stopping you from pointing Ajax-SOLR to a URL on your
> app-server, which acts as a security insulation layer between the Solr
> backend and the world. In this (thin) layer you can analyze the input and
> choose carefully what to let through and not.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.facebook.com/Cominvent
> Solr Training - www.solrtraining.com
>
> On 11. mai 2012, at 06:37, Anupam Bhattacharya wrote:
>
> > Yes, I agree with you.
> >
> > But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
> > solution ?
> >
> > Anupam
> >
> > On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> > mklosterme...@riskexchange.com> wrote:
> >
> >> Instead of hitting the Solr server directly from the client, I think I
> >> would go through your application server, which would have access to all
> >> the users data and can forward that to the Solr server, thereby hiding
> it
> >> from the client.
> >>
> >> Mike
> >>
> >>
> >> -Original Message-
> >> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
> >> Sent: Thursday, May 10, 2012 9:53 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: SOLR Security
> >>
> >> I am using Ajax-Solr Framework for creating a search interface. The
> search
> >> interface works well.
> >> In my case, the results have document level security so by even indexing
> >> records with there authorized users help me to filter results per user
> >> based on the authentication of the user.
> >>
> >> The problem that I have to a pass always a parameter to the SOLR Server
> >> with userid={xyz} which one can figure out from the SOLR URL(ajax call
> url)
> >> using Firebug tool in the Net Console on Firefox and can change this
> >> parameter value to see others records which he/she is not authorized.
> >> Basically it is Cross Site Scripting Issue.
> >>
> >> I have read about some approaches for Solr Security like Nginx with
> Jetty
> >> & .htaccess based security.Overall what i understand from this is that
> we
> >> can restrict users to do update/delete operations on SOLR as well as we
> can
> >> restrict the SOLR admin interface to certain IPs also. But How can I
> >> restrict the {solr-server}/solr/select based results from access by
> >> different user id's ?
> >>
>
>
>
>


RE: SOLR Security

2012-05-11 Thread Welty, Richard
In fact, there's a sample proxy.php on the ajax-solr web page which can easily 
be modified into a security layer. My Solr servers only listen to requests 
issued by a narrow list of systems, and everything gets routed through a 
modified copy of the proxy.php file, which checks whether the user is logged 
in and adds terms to the query to limit returned results to those the user is 
permitted to see.
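
The query-rewriting step described above might look roughly like the following.
This is a Python sketch of the idea, not the actual proxy.php: the function
name, the allowed_users field and the list of permitted parameters are all
assumptions. The important part is that the user id comes from the server-side
session and anything the browser sends for it is thrown away.

```python
from urllib.parse import parse_qsl, urlencode

# Parameters the browser may control; everything else is dropped.
# These are ordinary Solr query parameters, but which ones to allow is an
# assumption made for this sketch.
ALLOWED_PARAMS = {"q", "start", "rows", "facet", "facet.field", "json.wrf", "wt"}

def build_solr_query(client_query, session_user, acl_field="allowed_users"):
    """Rewrite a browser query string before forwarding it to Solr.

    session_user comes from the server-side login session, never from the
    request. acl_field is a hypothetical index field listing the users
    allowed to see each document.
    """
    if session_user is None:
        raise PermissionError("not logged in")
    params = [(k, v) for k, v in parse_qsl(client_query, keep_blank_values=True)
              if k in ALLOWED_PARAMS]
    # Quote the user id so it is treated as a single literal term.
    escaped = session_user.replace("\\", "\\\\").replace('"', '\\"')
    params.append(("fq", f'{acl_field}:"{escaped}"'))
    return urlencode(params)
```

A client-supplied userid=... or fq=... never reaches Solr, so editing the URL in
Firebug changes nothing about which documents come back.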


-Original Message-
From: Jan Høydahl [mailto:j...@hoydahl.no]
Sent: Fri 5/11/2012 9:45 AM
To: solr-user@lucene.apache.org
Subject: Re: SOLR Security
 
Hi,

There is nothing stopping you from pointing Ajax-SOLR to a URL on your 
app-server, which acts as a security insulation layer between the Solr backend 
and the world. In this (thin) layer you can analyze the input and choose 
carefully what to let through and not.

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 11 May 2012, at 06:37, Anupam Bhattacharya wrote:

> Yes, I agree with you.
> 
> But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
> solution ?
> 
> Anupam
> 
> On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> mklosterme...@riskexchange.com> wrote:
> 
>> Instead of hitting the Solr server directly from the client, I think I
>> would go through your application server, which would have access to all
>> the users data and can forward that to the Solr server, thereby hiding it
>> from the client.
>> 
>> Mike
>> 
>> 
>> -Original Message-
>> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
>> Sent: Thursday, May 10, 2012 9:53 PM
>> To: solr-user@lucene.apache.org
>> Subject: SOLR Security
>> 
>> I am using Ajax-Solr Framework for creating a search interface. The search
>> interface works well.
>> In my case, the results have document level security so by even indexing
>> records with there authorized users help me to filter results per user
>> based on the authentication of the user.
>> 
>> The problem that I have to a pass always a parameter to the SOLR Server
>> with userid={xyz} which one can figure out from the SOLR URL(ajax call url)
>> using Firebug tool in the Net Console on Firefox and can change this
>> parameter value to see others records which he/she is not authorized.
>> Basically it is Cross Site Scripting Issue.
>> 
>> I have read about some approaches for Solr Security like Nginx with Jetty
>> & .htaccess based security.Overall what i understand from this is that we
>> can restrict users to do update/delete operations on SOLR as well as we can
>> restrict the SOLR admin interface to certain IPs also. But How can I
>> restrict the {solr-server}/solr/select based results from access by
>> different user id's ?
>> 





Re: SOLR Security

2012-05-11 Thread Jan Høydahl
Hi,

There is nothing stopping you from pointing Ajax-SOLR to a URL on your 
app-server, which acts as a security insulation layer between the Solr backend 
and the world. In this (thin) layer you can analyze the input and choose 
carefully what to let through and not.

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 11 May 2012, at 06:37, Anupam Bhattacharya wrote:

> Yes, I agree with you.
> 
> But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
> solution ?
> 
> Anupam
> 
> On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
> mklosterme...@riskexchange.com> wrote:
> 
>> Instead of hitting the Solr server directly from the client, I think I
>> would go through your application server, which would have access to all
>> the users data and can forward that to the Solr server, thereby hiding it
>> from the client.
>> 
>> Mike
>> 
>> 
>> -Original Message-
>> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
>> Sent: Thursday, May 10, 2012 9:53 PM
>> To: solr-user@lucene.apache.org
>> Subject: SOLR Security
>> 
>> I am using Ajax-Solr Framework for creating a search interface. The search
>> interface works well.
>> In my case, the results have document level security so by even indexing
>> records with there authorized users help me to filter results per user
>> based on the authentication of the user.
>> 
>> The problem that I have to a pass always a parameter to the SOLR Server
>> with userid={xyz} which one can figure out from the SOLR URL(ajax call url)
>> using Firebug tool in the Net Console on Firefox and can change this
>> parameter value to see others records which he/she is not authorized.
>> Basically it is Cross Site Scripting Issue.
>> 
>> I have read about some approaches for Solr Security like Nginx with Jetty
>> & .htaccess based security.Overall what i understand from this is that we
>> can restrict users to do update/delete operations on SOLR as well as we can
>> restrict the SOLR admin interface to certain IPs also. But How can I
>> restrict the {solr-server}/solr/select based results from access by
>> different user id's ?
>> 



Re: SOLR Security

2012-05-10 Thread Anupam Bhattacharya
Yes, I agree with you.

But Ajax-SOLR Framework doesn't fit in that manner. Any alternative
solution ?

Anupam

On Fri, May 11, 2012 at 9:41 AM, Klostermeyer, Michael <
mklosterme...@riskexchange.com> wrote:

> Instead of hitting the Solr server directly from the client, I think I
> would go through your application server, which would have access to all
> the users data and can forward that to the Solr server, thereby hiding it
> from the client.
>
> Mike
>
>
> -Original Message-
> From: Anupam Bhattacharya [mailto:anupam...@gmail.com]
> Sent: Thursday, May 10, 2012 9:53 PM
> To: solr-user@lucene.apache.org
> Subject: SOLR Security
>
> I am using Ajax-Solr Framework for creating a search interface. The search
> interface works well.
> In my case, the results have document level security so by even indexing
> records with there authorized users help me to filter results per user
> based on the authentication of the user.
>
> The problem that I have to a pass always a parameter to the SOLR Server
> with userid={xyz} which one can figure out from the SOLR URL(ajax call url)
> using Firebug tool in the Net Console on Firefox and can change this
> parameter value to see others records which he/she is not authorized.
> Basically it is Cross Site Scripting Issue.
>
> I have read about some approaches for Solr Security like Nginx with Jetty
> & .htaccess based security.Overall what i understand from this is that we
> can restrict users to do update/delete operations on SOLR as well as we can
> restrict the SOLR admin interface to certain IPs also. But How can I
> restrict the {solr-server}/solr/select based results from access by
> different user id's ?
>


RE: SOLR Security

2012-05-10 Thread Klostermeyer, Michael
Instead of hitting the Solr server directly from the client, I think I would go 
through your application server, which would have access to all the user's data 
and can forward that to the Solr server, thereby hiding it from the client.

Mike


-Original Message-
From: Anupam Bhattacharya [mailto:anupam...@gmail.com] 
Sent: Thursday, May 10, 2012 9:53 PM
To: solr-user@lucene.apache.org
Subject: SOLR Security

I am using the Ajax-Solr framework to create a search interface. The search 
interface works well.
In my case, the results have document-level security, so indexing each record 
with its authorized users lets me filter results per user based on the user's 
authentication.

The problem is that I always have to pass a parameter to the Solr server, 
userid={xyz}, which anyone can find in the Solr URL (the Ajax call URL) using 
the Firebug tool's Net console in Firefox, and can then change the parameter 
value to see other users' records that he/she is not authorized to see.
Basically it is a Cross Site Scripting issue.

I have read about some approaches to Solr security, such as Nginx with Jetty 
and .htaccess-based security. What I understand from these is that we can 
restrict users from doing update/delete operations on Solr, and we can also 
restrict the Solr admin interface to certain IPs. But how can I restrict 
{solr-server}/solr/select results so that they cannot be accessed with a 
different user's id?


Re: Solr security

2011-05-10 Thread Brian Lamb
Great posts all. I will give these a look and come up with something based
on these recommendations. I'm sure as I begin implementing something, I will
have more questions arise.

On Tue, May 10, 2011 at 9:00 AM, Anthony Wlodarski <
anth...@tinkertownlabs.com> wrote:

> The WIKI has a loose interpretation of how to set-up Jetty securely.
>  Please take a look at the article I wrote here:
> http://anthonyw.net/2011/04/securing-jetty-and-solr-with-php-authentication/.
>  Even if PHP is not your language that sits on top of Solr you can still use
> the first part of the tutorial.  If you are using Tomcat I would recommend
> looking here:
> http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html
>
> Regards,
>
> -Anthony
>
>
> On 05/09/2011 05:28 PM, Jan Høydahl wrote:
>
>> Hi,
>>
>> You can simply configure a firewall on your Solr server to only allow
>> access from your frontend server. Whether you use the built-in software
>> firewall of Linux/Windows/Whatever or use some other FW utility is a choice
>> you need to make. This is by design - you should never ever expose your
>> backend services, whether it's a search server or a database server, to the
>> public.
>>
>> Read more about Solr security on the WIKI:
>> http://wiki.apache.org/solr/SolrSecurity
>>
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>>
>> On 9. mai 2011, at 20.57, Brian Lamb wrote:
>>
>>  Hi all,
>>>
>>> Is it possible to set up solr so that it will only execute dataimport
>>> commands if they come from localhost?
>>>
>>> Right now, my application and my solr installation are on different
>>> servers
>>> so any requests are formatted http://domain:8983 instead of
>>> http://localhost:8983. I am concerned that when I launch my application,
>>> there will be the potential for abuse. Is the best solution to have
>>> everything reside on the same server?
>>>
>>> What are some other solutions?
>>>
>>> Thanks,
>>>
>>> Brian Lamb
>>>
>>
> --
> Anthony Wlodarski
> Lead Software Engineer
> Get2Know.me (http://www.get2know.me)
> Office: 646-285-0500 x217
> Fax: 646-285-0400
>
>


Re: Solr security

2011-05-10 Thread Anthony Wlodarski
The wiki has a loose interpretation of how to set up Jetty securely.  
Please take a look at the article I wrote here:  
http://anthonyw.net/2011/04/securing-jetty-and-solr-with-php-authentication/.  
Even if PHP is not your language that sits on top of Solr you can still 
use the first part of the tutorial.  If you are using Tomcat I would 
recommend looking here: 
http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html


Regards,

-Anthony

On 05/09/2011 05:28 PM, Jan Høydahl wrote:

Hi,

You can simply configure a firewall on your Solr server to only allow access 
from your frontend server. Whether you use the built-in software firewall of 
Linux/Windows/Whatever or use some other FW utility is a choice you need to 
make. This is by design - you should never ever expose your backend services, 
whether it's a search server or a database server, to the public.

Read more about Solr security on the WIKI: 
http://wiki.apache.org/solr/SolrSecurity

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9. mai 2011, at 20.57, Brian Lamb wrote:


Hi all,

Is it possible to set up solr so that it will only execute dataimport
commands if they come from localhost?

Right now, my application and my solr installation are on different servers
so any requests are formatted http://domain:8983 instead of
http://localhost:8983. I am concerned that when I launch my application,
there will be the potential for abuse. Is the best solution to have
everything reside on the same server?

What are some other solutions?

Thanks,

Brian Lamb


--
Anthony Wlodarski
Lead Software Engineer
Get2Know.me (http://www.get2know.me)
Office: 646-285-0500 x217
Fax: 646-285-0400



Re: Solr security

2011-05-09 Thread Jan Høydahl
Hi,

You can simply configure a firewall on your Solr server to only allow access 
from your frontend server. Whether you use the built-in software firewall of 
Linux/Windows/Whatever or use some other FW utility is a choice you need to 
make. This is by design - you should never ever expose your backend services, 
whether it's a search server or a database server, to the public.

Read more about Solr security on the WIKI: 
http://wiki.apache.org/solr/SolrSecurity

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 9 May 2011, at 20:57, Brian Lamb wrote:

> Hi all,
> 
> Is it possible to set up solr so that it will only execute dataimport
> commands if they come from localhost?
> 
> Right now, my application and my solr installation are on different servers
> so any requests are formatted http://domain:8983 instead of
> http://localhost:8983. I am concerned that when I launch my application,
> there will be the potential for abuse. Is the best solution to have
> everything reside on the same server?
> 
> What are some other solutions?
> 
> Thanks,
> 
> Brian Lamb



Re: Solr security

2011-05-09 Thread Upayavira
Solr does not provide security (I believe Lucid EnterpriseWorks has
something there).

You should keep Solr itself secure behind a firewall, and pass all
requests through some intermediary that only allows sensible stuff
through to Solr itself. That way, the DataImportHandler is accessible
inside your firewall, and your search functionality is available
outside.
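
As a rough illustration of "only allows sensible stuff through", the
intermediary can work from an allowlist rather than trying to block individual
handlers. The sketch below is Python with made-up names (PUBLIC_HANDLERS,
is_forwardable) and assumes the search handler lives at /solr/select; anything
not listed, including the DataImportHandler, is simply never forwarded.

```python
from urllib.parse import urlsplit

# Handlers exposed to the outside world. The path is an assumption for this
# sketch; adjust it to wherever the search handler is actually mounted.
PUBLIC_HANDLERS = {"/solr/select"}

def is_forwardable(method, url):
    """Return True if the intermediary should pass this request on to Solr."""
    path = urlsplit(url).path.rstrip("/")
    return method == "GET" and path in PUBLIC_HANDLERS
```

Requests from inside the firewall would bypass this check and talk to Solr
directly.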

Upayavira

On Mon, 09 May 2011 14:57 -0400, "Brian Lamb"
 wrote:
> Hi all,
> 
> Is it possible to set up solr so that it will only execute dataimport
> commands if they come from localhost?
> 
> Right now, my application and my solr installation are on different
> servers
> so any requests are formatted http://domain:8983 instead of
> http://localhost:8983. I am concerned that when I launch my application,
> there will be the potential for abuse. Is the best solution to have
> everything reside on the same server?
> 
> What are some other solutions?
> 
> Thanks,
> 
> Brian Lamb
> 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: Solr security

2008-11-17 Thread Chris Hostetter

: > Full ack. What do you think about the only solr related thing "left", the
: > paramter filtering/blocking (eg. rows<1000). Is this suitable to do it in a
: > Filter delivered by solr? Of course as an optional alternative.

: As eric mentioned earlier, this could be done in a QueryComponent -- the
: prepare part could just make sure the query parameters are all within
: reasonable ranges.  This seems like something reasonable to add to solr.

I don't even see it requiring a new component -- the existing 
QueryComponent could treat this similarly to the way the DismaxQParser deals 
with q and q.alt ... add two new params, start.max and rows.max, that 
default to some very large values; QueryComponent respects start & rows 
only as long as they don't exceed the corresponding max; people who want 
to lock down their ports can make them invariants for the handlers that 
are exposed.
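
To make the proposal concrete, here is a sketch of the start.max / rows.max
idea in Python (not Solr code). The effective_paging() helper and DEFAULT_MAX
are invented for illustration, and clamping to the maximum is an assumption;
rejecting the request outright would be another reasonable reading. Values set
as invariants override anything the request sends.

```python
# Hypothetical default; the proposal only says the maxima should default to
# "some very large values".
DEFAULT_MAX = 2**31 - 1

def effective_paging(params, invariants=None):
    """Return (start, rows) after applying start.max and rows.max."""
    merged = dict(params)
    merged.update(invariants or {})   # invariants win over request params
    start_max = int(merged.get("start.max", DEFAULT_MAX))
    rows_max = int(merged.get("rows.max", DEFAULT_MAX))
    # Clamp into [0, max]; 0 and 10 are the usual defaults for start and rows.
    start = min(max(int(merged.get("start", 0)), 0), start_max)
    rows = min(max(int(merged.get("rows", 10)), 0), rows_max)
    return start, rows
```

A client that tries to raise rows.max itself gets nowhere, because the
invariant value replaces it before the limits are applied.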


-Hoss



Re: Solr security

2008-11-17 Thread Noble Paul നോബിള്‍ नोब्ळ्
If the user is using the new java Solr replication then he can get rid
of the /update and /update/csv handlers altogether. So the slaves are
completely read-only
--Noble



On Tue, Nov 18, 2008 at 2:14 AM, Sean Timm <[EMAIL PROTECTED]> wrote:
> I believe the Solr replication scripts require POSTing a commit to read in
> the new index--so at least limited POST capability is required in most
> scenarios.
>
> -Sean
>
> Lance Norskog wrote:
>>
>> About that "read-only" switch for Solr: one of the basic HTTP design
>> guidelines is that GET should only return values, and should never change
>> the state of the data. All changes to the data should be made with POST.
>> (In
>> REST style guidelines, PUT, POST, and DELETE.) This prevents you from
>> passing around URLs in email that can destroy the index.  The first role
>> of
>> security is to prevent accidents.
>>
>> I would suggest two layers of "read-only" switch. 1) Open the Lucene index
>> in read-only mode. 2) Allow only search servers to accept GET requests.
>>
>> Lance
>>
>>
>



-- 
--Noble Paul


Re: Solr security

2008-11-17 Thread Ian Holsman

Ryan McKinley wrote:


On Nov 17, 2008, at 4:20 PM, Erik Hatcher wrote:

trouble is, you can also GET /solr/update, even all on the URL, no 
request body...


  
 



Solr is a bad RESTafarian.



but with Ian's options in the apache config, this would not work...  
rather it would only work if stream.body was a POST




order deny,allow
deny from all
allow from 192.168.0.1

?
or perhaps locationmatch.. but you get the picture.






Getting warmer!

Erik


On Nov 17, 2008, at 4:11 PM, Ian Holsman wrote:


if thats the case putting apache in front of it would be handy.

something like

order deny,allow
deny from all
allow from 192.168.0.1


might be helpful.

Sean Timm wrote:
I believe the Solr replication scripts require POSTing a commit to 
read in the new index--so at least limited POST capability is 
required in most scenarios.


-Sean

Lance Norskog wrote:

About that "read-only" switch for Solr: one of the basic HTTP design
guidelines is that GET should only return values, and should never 
change
the state of the data. All changes to the data should be made with 
POST. (In

REST style guidelines, PUT, POST, and DELETE.) This prevents you from
passing around URLs in email that can destroy the index.  The 
first role of

security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the 
Lucene index
in read-only mode. 2) Allow only search servers to accept GET 
requests.


Lance


Re: Solr security

2008-11-17 Thread Ryan McKinley


On Nov 17, 2008, at 4:20 PM, Erik Hatcher wrote:

trouble is, you can also GET /solr/update, even all on the URL, no  
request body...


  


Solr is a bad RESTafarian.



but with Ian's options in the apache config, this would not work...   
rather it would only work if stream.body was a POST







Getting warmer!

Erik


On Nov 17, 2008, at 4:11 PM, Ian Holsman wrote:


if thats the case putting apache in front of it would be handy.

something like

order deny,allow
deny from all
allow from 192.168.0.1


might be helpful.

Sean Timm wrote:
I believe the Solr replication scripts require POSTing a commit to  
read in the new index--so at least limited POST capability is  
required in most scenarios.


-Sean

Lance Norskog wrote:
About that "read-only" switch for Solr: one of the basic HTTP  
design
guidelines is that GET should only return values, and should  
never change
the state of the data. All changes to the data should be made  
with POST. (In
REST style guidelines, PUT, POST, and DELETE.) This prevents you  
from
passing around URLs in email that can destroy the index.  The  
first role of

security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the  
Lucene index
in read-only mode. 2) Allow only search servers to accept GET  
requests.


Lance


Re: Solr security

2008-11-17 Thread Erik Hatcher
trouble is, you can also GET /solr/update, even all on the URL, no  
request body...


   


Solr is a bad RESTafarian.

Getting warmer!

Erik


On Nov 17, 2008, at 4:11 PM, Ian Holsman wrote:


if thats the case putting apache in front of it would be handy.

something like

order deny,allow
deny from all
allow from 192.168.0.1


might be helpful.

Sean Timm wrote:
I believe the Solr replication scripts require POSTing a commit to  
read in the new index--so at least limited POST capability is  
required in most scenarios.


-Sean

Lance Norskog wrote:

About that "read-only" switch for Solr: one of the basic HTTP design
guidelines is that GET should only return values, and should never  
change
the state of the data. All changes to the data should be made with  
POST. (In
REST style guidelines, PUT, POST, and DELETE.) This prevents you  
from
passing around URLs in email that can destroy the index.  The  
first role of

security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the  
Lucene index
in read-only mode. 2) Allow only search servers to accept GET  
requests.


Lance


Re: Solr security

2008-11-17 Thread Ian Holsman

If that's the case, putting Apache in front of it would be handy.

something like

order deny,allow
deny from all
allow from 192.168.0.1


might be helpful.

Sean Timm wrote:
I believe the Solr replication scripts require POSTing a commit to 
read in the new index--so at least limited POST capability is required 
in most scenarios.


-Sean

Lance Norskog wrote:

About that "read-only" switch for Solr: one of the basic HTTP design
guidelines is that GET should only return values, and should never 
change
the state of the data. All changes to the data should be made with 
POST. (In

REST style guidelines, PUT, POST, and DELETE.) This prevents you from
passing around URLs in email that can destroy the index.  The first 
role of

security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the Lucene 
index

in read-only mode. 2) Allow only search servers to accept GET requests.

Lance

  

Re: Solr security

2008-11-17 Thread Sean Timm
I believe the Solr replication scripts require POSTing a commit to read 
in the new index--so at least limited POST capability is required in 
most scenarios.


-Sean

Lance Norskog wrote:

About that "read-only" switch for Solr: one of the basic HTTP design
guidelines is that GET should only return values, and should never change
the state of the data. All changes to the data should be made with POST. (In
REST style guidelines, PUT, POST, and DELETE.) This prevents you from
passing around URLs in email that can destroy the index.  The first role of
security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the Lucene index
in read-only mode. 2) Allow only search servers to accept GET requests.

Lance

  


RE: Solr security

2008-11-17 Thread Lance Norskog
About that "read-only" switch for Solr: one of the basic HTTP design
guidelines is that GET should only return values, and should never change
the state of the data. All changes to the data should be made with POST. (In
REST style guidelines, PUT, POST, and DELETE.) This prevents you from
passing around URLs in email that can destroy the index.  The first role of
security is to prevent accidents.

I would suggest two layers of "read-only" switch. 1) Open the Lucene index
in read-only mode. 2) Allow only search servers to accept GET requests.

Lance



Re: Solr security

2008-11-17 Thread Sean Timm
http://issues.apache.org/jira/browse/SOLR-527 (An XML commit only 
request handler) is pertinent to this discussion as well.


-Sean

Ian Holsman wrote:

There was a patch by Sean Timm you should investigate as well.

It limited a query so it would take a maximum of X seconds to execute, 
and would just return the rows it had found in that time.



Feak, Todd wrote:

I see value in this in the form of protecting the client from itself.

For example, our Solr isn't accessible from the Internet. It's all
behind firewalls. But, the client applications can make programming
mistakes. I would love the ability to lock them down to a certain number
of rows, just in case someone typos and puts in 1000 instead of 100, or
the like.

Admittedly, testing and QA should catch these things, but sometimes it's
nice to put in a few safeguards to stop the obvious mistakes from
occurring.

-Todd Feak

-Original Message-
From: Matthias Epheser [mailto:[EMAIL PROTECTED]
Sent: Monday, November 17, 2008 9:07 AM

To: solr-user@lucene.apache.org
Subject: Re: Solr security

Ryan McKinley wrote:
however I have found that in any site where stability/load and uptime
are a serious concern, this is better handled in a tier in front of
java -- typically the loadbalancer / haproxy / whatever -- and managed
by people more cautious than me.

Full ack. What do you think about the only Solr-related thing "left",
the parameter filtering/blocking (e.g. rows<1000)? Is it suitable to do
this in a Filter delivered by Solr? Of course as an optional alternative.

 

ryan



Re: Solr security

2008-11-17 Thread Ian Holsman

There was a patch by Sean Timm you should investigate as well.

It limited a query so it would take a maximum of X seconds to execute, 
and would just return the rows it had found in that time.
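
The general shape of such a time limit, as opposed to the patch itself, can be
sketched in a few lines of Python. collect_within() and its clock parameter are
invented for this illustration and say nothing about how the patch is actually
implemented; the point is only that the caller gets whatever was found before
the budget ran out, plus a flag saying the result is partial.

```python
import time

def collect_within(rows, budget_seconds, clock=time.monotonic):
    """Collect rows until the time budget is spent.

    Returns (found_rows, partial), where partial is True if the budget ran
    out before all rows were seen. clock is injectable so the behaviour can
    be tested without real waiting.
    """
    deadline = clock() + budget_seconds
    found = []
    for row in rows:
        if clock() >= deadline:
            return found, True
        found.append(row)
    return found, False
```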



Feak, Todd wrote:

I see value in this in the form of protecting the client from itself.

For example, our Solr isn't accessible from the Internet. It's all
behind firewalls. But, the client applications can make programming
mistakes. I would love the ability to lock them down to a certain number
of rows, just in case someone typos and puts in 1000 instead of 100, or
the like.

Admittedly, testing and QA should catch these things, but sometimes it's
nice to put in a few safeguards to stop the obvious mistakes from
occurring.

-Todd Feak

-Original Message-
From: Matthias Epheser [mailto:[EMAIL PROTECTED] 
Sent: Monday, November 17, 2008 9:07 AM

To: solr-user@lucene.apache.org
Subject: Re: Solr security

Ryan McKinley wrote:
however I have found that in any site where stability/load and uptime
are a serious concern, this is better handled in a tier in front of
java -- typically the loadbalancer / haproxy / whatever -- and managed
by people more cautious than me.

Full ack. What do you think about the only Solr-related thing "left",
the parameter filtering/blocking (e.g. rows<1000)? Is it suitable to do
this in a Filter delivered by Solr? Of course as an optional alternative.


  

ryan



Re: Solr security

2008-11-17 Thread Ryan McKinley


On Nov 17, 2008, at 12:06 PM, Matthias Epheser wrote:


Ryan McKinley wrote:
however I have found that in any site where
stability/load and uptime are a serious concern, this is better  
handled in a tier in front of java -- typically the loadbalancer /  
haproxy / whatever -- and managed by people more cautious then me.


Full ack. What do you think about the only solr related thing  
"left", the paramter filtering/blocking (eg. rows<1000). Is this  
suitable to do it in a Filter delivered by solr? Of course as an  
optional alternative.




This could be done in a standard ServletFilter -- but that requires  
mucking with web.xml and may be more difficult if you are worried  
about it for some Handlers and not others.


As eric mentioned earlier, this could be done in a QueryComponent --  
the prepare part could just make sure the query parameters are all  
within reasonable ranges.  This seems like something reasonable to add  
to solr.


ryan


RE: Solr security

2008-11-17 Thread Feak, Todd
I see value in this in the form of protecting the client from itself.

For example, our Solr isn't accessible from the Internet. It's all
behind firewalls. But, the client applications can make programming
mistakes. I would love the ability to lock them down to a certain number
of rows, just in case someone typos and puts in 1000 instead of 100, or
the like.

Admittedly, testing and QA should catch these things, but sometimes it's
nice to put in a few safeguards to stop the obvious mistakes from
occurring.

-Todd Feak

-Original Message-
From: Matthias Epheser [mailto:[EMAIL PROTECTED] 
Sent: Monday, November 17, 2008 9:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr security

Ryan McKinley wrote:
> however I have found that in any site where stability/load and uptime
> are a serious concern, this is better handled in a tier in front of
> java -- typically the loadbalancer / haproxy / whatever -- and managed
> by people more cautious than me.

Full ack. What do you think about the only Solr-related thing "left",
the parameter filtering/blocking (e.g. rows<1000)? Is it suitable to do
this in a Filter delivered by Solr? Of course as an optional alternative.

> 
> ryan
> 
> 




Re: Solr security

2008-11-17 Thread Matthias Epheser

Ryan McKinley wrote:
> however I have found that in any site where stability/load and uptime
> are a serious concern, this is better handled in a tier in front of
> java -- typically the loadbalancer / haproxy / whatever -- and managed
> by people more cautious than me.

Full ack. What do you think about the only Solr-related thing "left", the 
parameter filtering/blocking (e.g. rows<1000)? Is it suitable to do this in a 
Filter delivered by Solr? Of course as an optional alternative.




ryan


Re: Solr security

2008-11-17 Thread Mark Miller

Ryan McKinley wrote:
solr.jar on the other hand lets you package what you want around 
search features to build a setup for your needs.  Java already has so 
many options for how to secure / authenticate that you can just plug 
them into your own app.  (if that is appropriate).  In the past I have 
used a filter based on:

http://www.onjava.com/pub/a/onjava/2004/03/24/loadcontrol.html
to limit load -- however I have found that in any site where 
stability/load and uptime are a serious concern, this is better 
handled in a tier in front of java -- typically the loadbalancer / 
haproxy / whatever -- and managed by people more cautious than me.


ryan

Couldn't agree more. Almost all security and protection belong outside 
of solr. It can and will be done better, and solr can stick to what it's 
good at. Smaller things like limiting complex query attacks or something 
seem more reasonable, but any real security should be provided 
elsewhere. Wouldn't that be odd if a bunch of open source products 
reimplemented network security layers and defenses on every project...




Re: Solr security

2008-11-17 Thread Ryan McKinley



Say you do filtering by user - how would you enforce that the client
(if it's a browser) only send in the proper filter?


Ryan already mentioned his technique... and here's how I'd do it  
similarly...


 Write a custom servlet Filter that grokked roles/authentication  
(this piece you'd need in any Java application tier anyway) [or  
plug in an existing implementation through Spring or something  
like that]  And then massaging of the request to Solr could happen  
in that pipeline, or adding a query parameter to the Solr request  
(ignoring anything sent by the client request for say, &user=...).   
Perhaps plug in a custom SearchComponent that massaged a request  
parameter into a Solr filter query or whatever.




right, but the question is still: is there anything general enough to  
be in solr core?


Everything I can think of requires a good sense of how the auth model  
is encoded in your data and how you want to expose it.  Nothing I have  
done is general enough to share with even my next project.


The only thing I could imagine is perhaps adding "getUserPrincipal()"  
to the SolrRequest interface -- but this quickly explodes into also  
wanting the request method (POST vs GET) or the user-agent...  in the  
end I just add the HttpServletRequest to the context and grab stuff  
from there.  Perhaps the default RequestDispatcher could add the  
HttpServletRequest to the context...




Doesn't seem like
you can unless you put all the user authentication stuff and
application logic right in Solr.


  ;)

Exactly.  Sort of.


Now I guess you *could* stick everything in Solr that you would
normally stick in the middle tier, but it doesn't seem like a great
idea to me.


Let's be clear about where we are drawing the boundaries of the  
definition of "Solr".


One could say that Solr is solr.war and the HTTP conventions.  Or is  
it solr.jar?  Or is it the SolrJ API?




all of the above :)

In my view we need to be clear about who solr.war is packaged for.  I  
think we are pretty clear that solr.war should be thought of similar  
to a MySQL install -- that is a database server that unless you  
*really* know what you are doing should most likely be behind a  
firewall.


solr.jar on the other hand lets you package what you want around  
search features to build a setup for your needs.  Java already has so  
many options for how to secure / authenticate that you can just plug  
them into your own app.  (if that is appropriate).  In the past I have  
used a filter based on:

http://www.onjava.com/pub/a/onjava/2004/03/24/loadcontrol.html
to limit load -- however I have found that in any site where stability/ 
load and uptime are a serious concern, this is better handled in a  
tier in front of java -- typically the loadbalancer / haproxy /  
whatever -- and managed by people more cautious than me.


ryan




Re: Solr security

2008-11-17 Thread Walter Underwood
TCP-level attacks like SYN-flooding.

All kinds of HTTP breakage that Apache has fixed over the years.
You really want a bombproof TCP and HTTP implementation.

Very, very slow clients that keep a socket open for a long time
while the bits drool out to them.

We saw problems with all service threads being busy, and implemented
a deadman timer to reboot if no threads were in listen state for
two minutes.

We put in IP address checks for access to admin pages. You can do a
similar thing with Apache by only making the search pages available
and requiring admins to go directly to Solr on a different port.
That port can be blocked by a firewall.
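
The IP address check described above could look roughly like the following in
a front tier. This is only a sketch under stated assumptions: the network
ranges, the path prefixes and the function name are all hypothetical, and a
real deployment would normalize the request path before checking it.

```python
import ipaddress

# Hypothetical values: networks that may reach admin/update pages.
ADMIN_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]
# Hypothetical path prefixes that should not be publicly reachable.
RESTRICTED_PREFIXES = ("/solr/admin", "/solr/update")


def is_request_allowed(path, remote_addr):
    """Allow search paths for everyone, restricted paths only for admin networks."""
    if not path.startswith(RESTRICTED_PREFIXES):
        return True
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # An unparseable client address is never trusted.
        return False
    return any(address in network for network in ADMIN_NETWORKS)
```

Search requests pass through untouched, while anything under the restricted
prefixes is refused unless it comes from one of the listed networks.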

Finally, you get the years of experience and documentation in configuring
Apache for use exposed on the Internet.

wunder

On 11/17/08 7:28 AM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:

> 
> On Nov 17, 2008, at 10:22 AM, Walter Underwood wrote:
>> It is possible to make it safe, but a lot of work. We did this for
>> Ultraseek. I would always, always front it with Apache, to get some
>> of Apache's protection.
> 
> What protections specifically are you speaking of with Apache in
> front?  Authentication?  Row limiting?
> 
> Erik
> 



Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 17, 2008, at 10:22 AM, Walter Underwood wrote:

It is possible to make it safe, but a lot of work. We did this for
Ultraseek. I would always, always front it with Apache, to get some
of Apache's protection.


What protections specifically are you speaking of with Apache in  
front?  Authentication?  Row limiting?


Erik



Re: Solr security

2008-11-17 Thread Walter Underwood
Limiting the number of rows only handles one attack. The one I mentioned,
fetching one page deep in the result set, caused a big issue on prod at
our site. We needed to limit the max for "start" as well as "rows".
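
A minimal sketch of capping both parameters before a request reaches Solr,
assuming a front tier that sees the query parameters as a dict of strings.
The limits, the fallback values and the function name are hypothetical, not
something from Solr itself.

```python
# Hypothetical limits; choose values that match what the pages really display.
MAX_ROWS = 100
MAX_START = 1000


def clamp_paging(params):
    """Return a copy of params with 'start' and 'rows' forced into safe ranges."""

    def to_int(value, default):
        try:
            return int(value)
        except (TypeError, ValueError):
            return default

    safe = dict(params)
    rows = to_int(params.get("rows"), 10)   # assumed fallback page size
    start = to_int(params.get("start"), 0)  # assumed fallback offset
    safe["rows"] = str(min(max(rows, 0), MAX_ROWS))
    safe["start"] = str(min(max(start, 0), MAX_START))
    return safe
```

With these limits a request for start=2&rows=20100 is forwarded as
start=2&rows=100, and a request for a page deep in the result set is pulled
back to start=1000.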

It is possible to make it safe, but a lot of work. We did this for
Ultraseek. I would always, always front it with Apache, to get some
of Apache's protection.

wunder

On 11/17/08 6:04 AM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote:
> 
> On Nov 16, 2008, at 6:55 PM, Walter Underwood wrote:
>> Limiting the maximum number of rows doesn't work, because
>> they can request rows 2-20100. --wunder
> 
> But you could limit how many rows could be returned in a single
> request... that'd close off one DoS mechanism.
> 
> Erik




Re: Solr security

2008-11-17 Thread Matthias Epheser

Erik Hatcher wrote:


On Nov 16, 2008, at 6:18 PM, Ryan McKinley wrote:

my assumption with solrjs is that you are hitting "read-only" solr 
servers that you don't mind if people query directly.


Exactly the assumption I'm going with too.

 It would not be appropriate for something where you don't want people 
(who really care) to know you are running solr and could execute 
arbitrary queries.


Since it is an example, I don't mind leaving the /admin interface open 
on:

http://example.solrstuff.org/solrjs/admin/
but /update has a password:
http://example.solrstuff.org/solrjs/update

I have said in the past I like the idea of a "read-only" flag in solr 
config that would throw an error if you try to do something with the 
UpdateHandler.  However there are other ways to do that also.




As the thoughts and ideas of this thread are spread in several emails, let me 
just drop my uncoordinated thoughts here:


For solrjs, what exactly is the required information solr has to provide 
"directly":


- We need data for several widgets. This data will be in 99% of the cases some 
facet information and/or result docs. The result docs will be in suitable 
ranges, no webpage will display 10+ result items at the same time.


- So "potentially dangerous" request params like rows>1000 or some other 
handlers apart from StandardRequest may be blocked.


- update handlers and admin interface shouldn't be exposed.


Like others mentioned before, I'm not sure this is a task that *has* to be 
solved inside Solr. As a standalone servlet, it is very likely that it is NOT 
accessible directly in a production environment.


Hiding or password protecting update/admin is an easy task using a proxy like 
apache http. It could also be solved by a configurable ServletFilter delivered 
with solr, that is initialized inside solr's web.xml. To separate the concerns, 
I think it should not be coded "deeper" inside the solr code. The idea of a 
"read-only" server can be implemented like that. Optional update urls that are 
only accessed inside a firewall or something may also be present.


This servlet filter may also check the request params for things that are not 
needed for solrjs and potentially dangerous. It even may check how frequently 
urls are accessed (thinking about DoS).
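
The frequency check could be sketched as a sliding window per client. This is
an illustration only: the class name, the limits and the idea of keying on a
client id (for example the remote address) are assumptions, and a real DoS
defence would more likely live in the proxy or load balancer.

```python
import time
from collections import defaultdict, deque


class RequestRateLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.history = defaultdict(deque)

    def allow(self, client_id):
        now = self.clock()
        recent = self.history[client_id]
        # Forget requests that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A filter would call allow() once per request and answer with an error status
when it returns False.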


I think even if it looks like a direct access, using solrjs doesn't have to be 
different to "common" solr webapps. Usually these apps take user input, a web 
application translates this input into a solr query and translates the result in 
a suitable client format. Other solr stuff is blocked indirectly because only 
this app has access to solr. Now the last 2 steps are done inside the client. 
But if we block stuff that isn't used by the client, we are in control of what 
may happen.


If that isn't secure enough, the more complicated solution would be to create 
such a stateful servlet that holds the query state of a client, and solrjs only 
performs /select/solrjs/?new_query=city:vienna or something. Then the query 
generation and all solr related stuff happens again on the server.


I think it should be easy to deliver this SecuritySolrFilter with the 
standard solr distribution, making it configurable so the user can decide which 
urls are blocked/password protected and which request parameters should be 
checked for illegal values. On the other hand, existing firewalls and proxies of 
the destination system may be used. Therefore some "best practices" may be 
helpful in the solr wiki.


It would be fine by me to help implement a standard security filter for solr.

WDYT?

regards,
matthias


Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 17, 2008, at 9:07 AM, Yonik Seeley wrote:

On Mon, Nov 17, 2008 at 8:54 AM, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
Sounds like the perfect case for a query parser plugin... or use  
dismax as
Ryan mentioned.  Shouldn't Solr be hardened for these cases  
anyway?  Or at

least hardenable.


Say you do filtering by user - how would you enforce that the client
(if it's a browser) only send in the proper filter?


Ryan already mentioned his technique... and here's how I'd do it  
similarly...


  Write a custom servlet Filter that grokked roles/authentication  
(this piece you'd need in any Java application tier anyway) [or plug  
in an existing implementation through Spring or something like that]   
And then massaging of the request to Solr could happen in that  
pipeline, or adding a query parameter to the Solr request (ignoring  
anything sent by the client request for say, &user=...).  Perhaps plug  
in a custom SearchComponent that massaged a request parameter into a  
Solr filter query or whatever.
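
The request massaging could be sketched like this, independent of any servlet
API. The field name "owner", the parameter name "user" and the function name
are hypothetical; the point is only that the user restriction comes from the
authenticated session and never from what the client sent.

```python
def enforce_user_filter(client_params, authenticated_user):
    """Build the parameter list to forward to Solr for one authenticated user.

    client_params is a list of (name, value) pairs as sent by the browser.
    """
    # Ignore any user= the client tried to supply.
    forwarded = [(k, v) for k, v in client_params if k != "user"]
    # Quote the value so it is matched as one phrase, escaping backslashes
    # and embedded quotes.
    escaped = authenticated_user.replace("\\", "\\\\").replace('"', '\\"')
    # "owner" is an assumed field holding the user a document belongs to.
    forwarded.append(("fq", 'owner:"%s"' % escaped))
    return forwarded
```

Because a filter query can only narrow the result set, appending it on the
server side is enough to restrict results regardless of the other parameters
the client sends.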



 Doesn't seem like
you can unless you put all the user authentication stuff and
application logic right in Solr.


   ;)

Exactly.  Sort of.


Now I guess you *could* stick everything in Solr that you would
normally stick in the middle tier, but it doesn't seem like a great
idea to me.


Let's be clear about where we are drawing the boundaries of the  
definition of "Solr".


One could say that Solr is solr.war and the HTTP conventions.  Or is  
it solr.jar?  Or is it the SolrJ API?


Erik



Re: Solr security

2008-11-17 Thread Yonik Seeley
On Mon, Nov 17, 2008 at 8:54 AM, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
> Sounds like the perfect case for a query parser plugin... or use dismax as
> Ryan mentioned.  Shouldn't Solr be hardened for these cases anyway?  Or at
> least hardenable.

Say you do filtering by user - how would you enforce that the client
(if it's a browser) only send in the proper filter?  Doesn't seem like
you can unless you put all the user authentication stuff and
application logic right in Solr.

Now I guess you *could* stick everything in Solr that you would
normally stick in the middle tier, but it doesn't seem like a great
idea to me.

-Yonik


Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 16, 2008, at 6:55 PM, Walter Underwood wrote:

Limiting the maximum number of rows doesn't work, because
they can request rows 2-20100. --wunder


But you could limit how many rows could be returned in a single  
request... that'd close off one DoS mechanism.


Erik



Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 16, 2008, at 6:27 PM, Ryan McKinley wrote:
I'd be parsing out wildcards, boosts, and fuzzy searches (or at  
least thinking about the effects).
I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a  
regular query.




Even if you leave the solr instance public, you can still limit  
grossly inefficient params by forcing things to use  the dismax query  
parser.  You can use invariants to lock what options are available.


I suppose we don't have a way to say the *maximum* number of rows  
you can request is 100 (or something like that)


A LimitingRowsSearchComponent could easily do this as a plugin though.

Erik



Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 16, 2008, at 6:18 PM, Ryan McKinley wrote:

my assumption with solrjs is that you are hitting "read-only" solr  
servers that you don't mind if people query directly.


Exactly the assumption I'm going with too.

 It would not be appropriate for something where you don't want  
people (who really care) to know you are running solr and could  
execute arbitrary queries.


Since it is an example, I don't mind leaving the /admin interface  
open on:

http://example.solrstuff.org/solrjs/admin/
but /update has a password:
http://example.solrstuff.org/solrjs/update

I have said in the past I like the idea of a "read-only" flag in  
solr config that would throw an error if you try to do something  
with the UpdateHandler.  However there are other ways to do that also.


Yes, I was asked about this elusive read-only switch at Solr Boot Camp  
at ApacheCon as well.


How are you password protecting the update handler?  This is the kind  
of goody I'd like to distill out of this thread and wikify 


What's it take to make a read-only Solr server now?  Can replication  
still be made to work?  (I plead ignorance on the guts of the Java- 
based replication feature) - requires password protected handlers?   
Shouldn't we bake some of this into the default example configuration  
instead of update handlers being wide open by default?


Erik




Re: Solr security

2008-11-17 Thread Erik Hatcher


On Nov 16, 2008, at 6:12 PM, Ian Holsman wrote:
famous last words and all, but you shouldn't be just passing what a  
user types directly into an application should you?


LOL

I'd be parsing out wildcards, boosts, and fuzzy searches (or at  
least thinking about the effects).
I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a  
regular query.


Sounds like the perfect case for a query parser plugin... or use  
dismax as Ryan mentioned.  Shouldn't Solr be hardened for these cases  
anyway?  Or at least hardenable.
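
One simple way to neutralize wildcards, boosts and fuzzy/proximity operators
in raw user input is to backslash-escape the query syntax characters before
the text is placed into a query. A minimal sketch, assuming the character list
below covers the syntax of the query parser in use (the function name is made
up for illustration):

```python
# Characters with special meaning in the Lucene query syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_query_chars(text):
    """Backslash-escape query syntax characters so they are treated literally."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)
```

With this, roam~0.1 is sent as roam\~0.1 and is no longer parsed as a fuzzy
query. Whether to escape, strip or selectively allow such operators is a
design choice for the application tier.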



but they don't let me into design meetings any more ;(


Apparently they shouldn't let me into them either ;)

Erik



Re: Solr security

2008-11-16 Thread Walter Underwood
Limiting the maximum number of rows doesn't work, because
they can request rows 2-20100. --wunder

On 11/16/08 3:27 PM, "Ryan McKinley" <[EMAIL PROTECTED]> wrote:

>> 
>> I'd be parsing out wildcards, boosts, and fuzzy searches (or at
>> least thinking about the effects).
>> I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a
>> regular query.
>> 
> 
> Even if you leave the solr instance public, you can still limit
> grossly inefficient params by forcing things to use  the dismax query
> parser.  You can use invariants to lock what options are available.
> 
> I suppose we don't have a way to say the *maximum* number of rows you
> can request is 100 (or something like that)
> 
> ryan



Re: Solr security

2008-11-16 Thread Ryan McKinley


I'd be parsing out wildcards, boosts, and fuzzy searches (or at  
least thinking about the effects).
I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a  
regular query.




Even if you leave the solr instance public, you can still limit  
grossly inefficient params by forcing things to use  the dismax query  
parser.  You can use invariants to lock what options are available.


I suppose we don't have a way to say the *maximum* number of rows you  
can request is 100 (or something like that)


ryan


Re: Solr security

2008-11-16 Thread Ryan McKinley
my assumption with solrjs is that you are hitting "read-only" solr  
servers that you don't mind if people query directly.  It would not be  
appropriate for something where you don't want people (who really  
care) to know you are running solr and could execute arbitrary queries.


Since it is an example, I don't mind leaving the /admin interface open  
on:

http://example.solrstuff.org/solrjs/admin/
but /update has a password:
http://example.solrstuff.org/solrjs/update

I have said in the past I like the idea of a "read-only" flag in solr  
config that would throw an error if you try to do something with the  
UpdateHandler.  However there are other ways to do that also.


ryan


On Nov 16, 2008, at 6:03 PM, Erik Hatcher wrote:

What about SolrJS?   Isn't it designed to hit a Solr directly?   
(Sure, as long as the response looked like Solr response, it could  
have come through some magic 'security' tier).


Erik

On Nov 16, 2008, at 5:54 PM, Ryan McKinley wrote:
I'm not totally sure what you are suggesting.  Is there a general  
way people deal with security and search?


I'm assuming we already have good ways (better ways) to make sure  
people are authorized/logged in etc.  What do you imagine "solr  
security" would add?


FYI, I used to have a custom RequestHandler that got the user  
principal from the HttpServletRequest (I have a custom  
SolrDispatchFilter that adds that to the context) and then augments  
the query with a filter that limits to stuff that user can see.  I  
replaced all that with something that adds the filter to the  
Solrj query.


Assuming it is "safe" and all that, what do you think we could add  
that would be general enough?


ryan


On Nov 16, 2008, at 5:12 PM, Erik Hatcher wrote:

I'm pondering the viability of running Solr as effectively a UI  
server... what I mean by that is having a public facing browser- 
based application hitting a Solr backend directly for JSON, XML,  
etc data.


I know folks are doing this (I won't name names, in case this  
thread comes up with any vulnerabilities that would affect such  
existing environments).


Let's just assume a typical deployment environment... replicated  
Solr's behind a load balancer, maybe even a caching proxy.

What known vulnerabilities are there in Solr 1.3, for example?

What I think we can get out of this is a Solr deployment  
configuration suitable for direct browser access, but we're not  
safely there yet are we?  Is this an absurd goal?  Must we always  
have a moving piece between browser and data/search servers?


Thanks,
Erik







Re: Solr security

2008-11-16 Thread Walter Underwood
Agreed, it is pretty easy to create a large variety of denial
of service attacks with sorts, wildcards, requesting a large
number of results, or a page deep in the results.

We have protected against several different DoS problems
in our front-end code.

wunder

On 11/16/08 3:12 PM, "Ian Holsman" <[EMAIL PROTECTED]> wrote:

> Erik Hatcher wrote:
>> 
>> On Nov 16, 2008, at 5:41 PM, Ian Holsman wrote:
>>> First thing I would look at is disabling write access, or writing a
>>> servlet that sits on top of the write handler to filter your data.
>> 
>> We can turn off all the update handlers, but how does that affect
>> replication?  Can a Solr replicant be entirely read-only in the HTTP
>> request sense?
>> 
>>> Second thing I would be concerned about is people writing DoS queries
>>> that bypass the cache.
>>> 
>>> 
>>> so you may need to write your own custom request handler to filter
>>> out that kind of thing.
>> 
>> Is this a concern that can be punted to what you'd naturally be
>> putting in front of Solr anyway or a proxy tier that can have DoS
>> blocking rules?  I mean, if you're deploying a Struts that hits Solr
>> under the covers, how do you prevent against DoS on that?  A malicious
>> user could keep sending queries indirectly to a Solr through a whole
>> lot of public apps now.  In other words, another tier in front of Solr
>> doesn't add (much) to DoS protection to an underlying Solr, no?
> 
> famous last words and all, but you shouldn't be just passing what a user
> types directly into an application should you?
> 
> I'd be parsing out wildcards, boosts, and fuzzy searches (or at least
> thinking about the effects).
> I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a
> regular query.
> 
> but they don't let me into design meetings any more ;(
>> Erik
>> 
>> 
> 



Re: Solr security

2008-11-16 Thread Ian Holsman

Erik Hatcher wrote:


On Nov 16, 2008, at 5:41 PM, Ian Holsman wrote:
First thing I would look at is disabling write access, or writing a 
servlet that sits on top of the write handler to filter your data.


We can turn off all the update handlers, but how does that affect 
replication?  Can a Solr replicant be entirely read-only in the HTTP 
request sense?


Second thing I would be concerned about is people writing DoS queries 
that bypass the cache.



so you may need to write your own custom request handler to filter 
out that kind of thing.


Is this a concern that can be punted to what you'd naturally be 
putting in front of Solr anyway or a proxy tier that can have DoS 
blocking rules?  I mean, if you're deploying a Struts that hits Solr 
under the covers, how do you prevent against DoS on that?  A malicious 
user could keep sending queries indirectly to a Solr through a whole 
lot of public apps now.  In other words, another tier in front of Solr 
doesn't add (much) to DoS protection to an underlying Solr, no?


famous last words and all, but you shouldn't be just passing what a user 
types directly into an application should you?


I'd be parsing out wildcards, boosts, and fuzzy searches (or at least 
thinking about the effects).
I mean "jakarta apache"~1000 or roam~0.1 aren't as efficient as a 
regular query.


but they don't let me into design meetings any more ;(

Erik






Re: Solr security

2008-11-16 Thread Erik Hatcher
What about SolrJS?   Isn't it designed to hit a Solr directly?  (Sure,  
as long as the response looked like Solr response, it could have come  
through some magic 'security' tier).


Erik

On Nov 16, 2008, at 5:54 PM, Ryan McKinley wrote:
I'm not totally sure what you are suggesting.  Is there a general  
way people deal with security and search?


I'm assuming we already have good ways (better ways) to make sure  
people are authorized/logged in etc.  What do you imagine "solr  
security" would add?


FYI, I used to have a custom RequestHandler that got the user  
principal from the HttpServletRequest (I have a custom  
SolrDispatchFilter that adds that to the context) and then augments  
the query with a filter that limits to stuff that user can see.  I  
replaced all that with something that adds the filter to the Solrj  
query.


Assuming it is "safe" and all that, what do you think we could add  
that would be general enough?


ryan


On Nov 16, 2008, at 5:12 PM, Erik Hatcher wrote:

I'm pondering the viability of running Solr as effectively a UI  
server... what I mean by that is having a public facing browser- 
based application hitting a Solr backend directly for JSON, XML,  
etc data.


I know folks are doing this (I won't name names, in case this  
thread comes up with any vulnerabilities that would affect such  
existing environments).


Let's just assume a typical deployment environment... replicated  
Solr's behind a load balancer, maybe even a caching proxy.

What known vulnerabilities are there in Solr 1.3, for example?

What I think we can get out of this is a Solr deployment configuration  
suitable for direct browser access, but we're not safely there yet  
are we?  Is this an absurd goal?  Must we always have a moving  
piece between browser and data/search servers?


Thanks,
Erik





Re: Solr security

2008-11-16 Thread Mark Miller
Plus, it's just too big a can of worms for solr to handle. You could  
protect up to a small point, but a real ddos attack is not going to be  
defended against by solr. At best we could put in 'kiddie' protection.


- Mark


On Nov 16, 2008, at 5:51 PM, Erik Hatcher <[EMAIL PROTECTED]>  
wrote:




On Nov 16, 2008, at 5:41 PM, Ian Holsman wrote:
First thing I would look at is disabling write access, or writing a  
servlet that sits on top of the write handler to filter your data.


We can turn off all the update handlers, but how does that affect  
replication?  Can a Solr replicant be entirely read-only in the HTTP  
request sense?


Second thing I would be concerned about is people writing DoS  
queries that bypass the cache.



so you may need to write your own custom request handler to filter  
out that kind of thing.


Is this a concern that can be punted to what you'd naturally be  
putting in front of Solr anyway or a proxy tier that can have DoS  
blocking rules?  I mean, if you're deploying a Struts that hits Solr  
under the covers, how do you prevent against DoS on that? A  
malicious user could keep sending queries indirectly to a Solr  
through a whole lot of public apps now.  In other words, another  
tier in front of Solr doesn't add (much) to DoS protection to an  
underlying Solr, no?


   Erik



Re: Solr security

2008-11-16 Thread Ryan McKinley
I'm not totally sure what you are suggesting.  Is there a general way  
people deal with security and search?


I'm assuming we already have good ways (better ways) to make sure  
people are authorized/logged in etc.  What do you imagine "solr  
security" would add?


FYI, I used to have a custom RequestHandler that got the user principal  
from the HttpServletRequest (I have a custom SolrDispatchFilter that  
adds that to the context) and then augments the query with a filter  
that limits to stuff that user can see.  I replaced all that with  
something that adds the filter to the Solrj query.


Assuming it is "safe" and all that, what do you think we could add  
that would be general enough?


ryan


On Nov 16, 2008, at 5:12 PM, Erik Hatcher wrote:

I'm pondering the viability of running Solr as effectively a UI  
server... what I mean by that is having a public facing browser- 
based application hitting a Solr backend directly for JSON, XML, etc  
data.


I know folks are doing this (I won't name names, in case this thread  
comes up with any vulnerabilities that would affect such existing  
environments).


Let's just assume a typical deployment environment... replicated  
Solr's behind a load balancer, maybe even a caching proxy.

What known vulnerabilities are there in Solr 1.3, for example?

What I think we can get out of this is a Solr deployment configuration  
suitable for direct browser access, but we're not safely there yet  
are we?  Is this an absurd goal?  Must we always have a moving piece  
between browser and data/search servers?


Thanks,
Erik





Re: Solr security

2008-11-16 Thread Erik Hatcher


On Nov 16, 2008, at 5:41 PM, Ian Holsman wrote:
First thing I would look at is disabling write access, or writing a  
servlet that sits on top of the write handler to filter your data.


We can turn off all the update handlers, but how does that affect  
replication?  Can a Solr replicant be entirely read-only in the HTTP  
request sense?


Second thing I would be concerned about is people writing DoS  
queries that bypass the cache.



so you may need to write your own custom request handler to filter  
out that kind of thing.


Is this a concern that can be punted to what you'd naturally be  
putting in front of Solr anyway or a proxy tier that can have DoS  
blocking rules?  I mean, if you're deploying a Struts that hits Solr  
under the covers, how do you prevent against DoS on that?  A malicious  
user could keep sending queries indirectly to a Solr through a whole  
lot of public apps now.  In other words, another tier in front of Solr  
doesn't add (much) to DoS protection to an underlying Solr, no?


Erik



Re: Solr security

2008-11-16 Thread Ian Holsman

Erik Hatcher wrote:
I'm pondering the viability of running Solr as effectively a UI 
server... what I mean by that is having a public facing browser-based 
application hitting a Solr backend directly for JSON, XML, etc data.


I know folks are doing this (I won't name names, in case this thread 
comes up with any vulnerabilities that would affect such existing 
environments).


Let's just assume a typical deployment environment... replicated 
Solr's behind a load balancer, maybe even a caching proxy.

What known vulnerabilities are there in Solr 1.3, for example?

What I think we can get out of this is a Solr deployment configuration 
suitable for direct browser access, but we're not safely there yet are 
we?  Is this an absurd goal?  Must we always have a moving piece 
between browser and data/search servers?


Thanks,
Erik




First thing I would look at is disabling write access, or writing a 
servlet that sits on top of the write handler to filter your data.


Second thing I would be concerned about is people writing DoS queries 
that bypass the cache.


so you may need to write your own custom request handler to filter out 
that kind of thing.




Re: Solr Security and XSRF

2008-06-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
SOLR-607 is still open. Until it is committed, this solution may not be possible.
--Noble

On Mon, Jun 30, 2008 at 10:23 AM, Noble Paul നോബിള്‍ नोब्ळ्
<[EMAIL PROTECTED]> wrote:
> If you have a master slave configuration I guess it is a good idea to
> remove the updatehandler altogether from slaves.
> --Noble
>
> On Sat, Jun 28, 2008 at 2:39 AM, Chris Hostetter
> <[EMAIL PROTECTED]> wrote:
>>
>> : > A basic technique that can be used to mitigate the risk of a possible CSRF
>> : > attack like this is to configure your Servlet Container so that access to
>> : > paths which can modify the index (ie: /update, /update/csv, etc...) are
>> : > restricted either to specific client IPs, or using HTTP Authentication.
>> :
>> : My understanding is that HTTP authentication is useless against XSRF,
>> : because browsers cache the authentication tokens. Once you have
>> : authenticated, you are still vulnerable to attacks.
>>
>> while I agree that is generally true, my suggestion about using HTTP
>> Authentication was to reduce the number of "clients" that have access to
>> update URLs.  if you add authentication but then give everyone on your
>> intranet a username/password it would certainly defeat the point ...
>> people, using web browsers, don't typically need to hit URLs
>> like "/update".   Updates are typically sent by other applications, so if
>> you use authentication and hardcode credentials into your "update clients"
>> (which are automated and not going to hit arbitrary URLs maliciously fed to
>> them by bad guys) but do not give credentials to people who surf the
>> public internet you can help mitigate the risk.
>>
>> (of course: if you have something like a webcrawler scraping webpages
>> and posting the docs directly to Solr, then that crawler could fall prey
>> to a CSRF attack as well -- you'd want to make sure that the
>> "crawl" requests didn't have access to the same credentials needed for
>> sending updates)
>>
>> : Restricting access to the servlet container by IP is probably safer.
>> : To access the admin pages, I proxy the servlet container via Apache,
>> : similar to this snippet given below.
>> :
>> : This requires the user to authenticate via SSL for all SOLR-related
>> : pages, and additionally blocks all update queries. If one also would
>>
>> But doesn't that mean that any system with a legitimate need to update
>> the index has to bypass your proxy?  What prevents a CSRF based attack
>> from bypassing your proxy as well?
>>
>> : Comments, anyone? This configuration is container-agnostic, so if no
>> : serious problems are found with my setup, which Wiki page would be
>> : most appropriate for this snippet?
>>
>> It's container agnostic, but does require the use of an Apache HTTPD proxy
>> ... recipes like this seem like they'd be suitable on the SolrSecurity
>> page (but it still seems like there's a missing piece ... restricting the
>> appserver so it only accepts requests from the proxy, and a way for the
>> proxy to pass requests on to /update if and only if they come from
>> specific IPs, or specific "users", etc...)
>>
>>
>> -Hoss
>>
>>
>
>
>
> --
> --Noble Paul
>



-- 
--Noble Paul


Re: Solr Security and XSRF

2008-06-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
If you have a master slave configuration I guess it is a good idea to
remove the updatehandler altogether from slaves.
--Noble

On Sat, Jun 28, 2008 at 2:39 AM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
>
> : > A basic technique that can be used to mitigate the risk of a possible CSRF
> : > attack like this is to configure your Servlet Container so that access to
> : > paths which can modify the index (ie: /update, /update/csv, etc...) are
> : > restricted either to specific client IPs, or using HTTP Authentication.
> :
> : My understanding is that HTTP authentication is useless against XSRF,
> : because browsers cache the authentication tokens. Once you have
> : authenticated, you are still vulnerable to attacks.
>
> While I agree that is generally true, my suggestion about using HTTP
> Authentication was to reduce the number of "clients" that have access to
> update URLs.  If you add authentication but then give everyone on your
> intranet a username/password it would certainly defeat the point ...
> people, using web browsers, don't typically need to hit URLs
> like "/update".   Updates are typically sent by other applications, so if
> you use authentication and hardcode credentials into your "update clients"
> (which are automated and not going to hit arbitrary URLs maliciously fed to
> them by bad guys) but do not give credentials to people who surf the
> public internet you can help mitigate the risk.
>
> (Of course: if you have something like a webcrawler scraping webpages
> and posting the docs directly to Solr, then that crawler could fall prey
> to a CSRF attack as well -- you'd want to make sure that the
> "crawl" requests didn't have access to the same credentials needed for
> sending updates.)
>
> : Restricting access to the servlet container by IP is probably safer.
> : To access the admin pages, I proxy the servlet container via Apache,
> : similar to this snippet given below.
> :
> : This requires the user to authenticate via SSL for all SOLR-related
> : pages, and additionally blocks all update queries. If one also would
>
> But doesn't that mean that any system with a legitimate need to update
> the index has to bypass your proxy?  What prevents a CSRF based attack
> from bypassing your proxy as well?
>
> : Comments, anyone? This configuration is container-agnostic, so if no
> : serious problems are found with my setup, which Wiki page would be
> : most appropriate for this snippet?
>
> It's container agnostic, but does require the use of an Apache HTTPD proxy
> ... recipes like this seem like they'd be suitable on the SolrSecurity
> page (but it still seems like there's a missing piece ... restricting the
> appserver so it only accepts requests from the proxy, and a way for the
> proxy to pass requests on to /update if and only if they come from
> specific IPs, or specific "users", etc...)
>
>
> -Hoss
>
>



-- 
--Noble Paul


Re: Solr Security and XSRF

2008-06-27 Thread Chris Hostetter

: > A basic technique that can be used to mitigate the risk of a possible CSRF
: > attack like this is to configure your Servlet Container so that access to
: > paths which can modify the index (ie: /update, /update/csv, etc...) are
: > restricted either to specific client IPs, or using HTTP Authentication.
: 
: My understanding is that HTTP authentication is useless against XSRF,
: because browsers cache the authentication tokens. Once you have
: authenticated, you are still vulnerable to attacks.

While I agree that is generally true, my suggestion about using HTTP
Authentication was to reduce the number of "clients" that have access to
update URLs.  If you add authentication but then give everyone on your
intranet a username/password it would certainly defeat the point ...
people, using web browsers, don't typically need to hit URLs
like "/update".   Updates are typically sent by other applications, so if
you use authentication and hardcode credentials into your "update clients"
(which are automated and not going to hit arbitrary URLs maliciously fed to
them by bad guys) but do not give credentials to people who surf the
public internet you can help mitigate the risk.

(Of course: if you have something like a webcrawler scraping webpages
and posting the docs directly to Solr, then that crawler could fall prey
to a CSRF attack as well -- you'd want to make sure that the
"crawl" requests didn't have access to the same credentials needed for
sending updates.)

: Restricting access to the servlet container by IP is probably safer.
: To access the admin pages, I proxy the servlet container via Apache,
: similar to this snippet given below.
: 
: This requires the user to authenticate via SSL for all SOLR-related
: pages, and additionally blocks all update queries. If one also would

But doesn't that mean that any system with a legitimate need to update
the index has to bypass your proxy?  What prevents a CSRF based attack
from bypassing your proxy as well?

: Comments, anyone? This configuration is container-agnostic, so if no
: serious problems are found with my setup, which Wiki page would be
: most appropriate for this snippet?

It's container agnostic, but does require the use of an Apache HTTPD proxy
... recipes like this seem like they'd be suitable on the SolrSecurity
page (but it still seems like there's a missing piece ... restricting the
appserver so it only accepts requests from the proxy, and a way for the
proxy to pass requests on to /update if and only if they come from
specific IPs, or specific "users", etc...)
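
A rough sketch of the proxy half of that missing piece, assuming Apache
2.2-style access control (Order/Deny/Allow) and a backend listening on
127.0.0.1:9000 as in the proxied setup under discussion. The address
192.0.2.10 is only a placeholder for an indexing host; it does not come
from this thread.

```apache
# Forward /solr/update to the backend, but only for a known indexing host.
# 192.0.2.10 is a placeholder address (assumption, not from the thread).
ProxyPass /solr/update http://127.0.0.1:9000/solr/update
ProxyPassReverse /solr/update http://127.0.0.1:9000/solr/update

<Location /solr/update>
    # Deny is evaluated first, then Allow; only the listed host gets through.
    Order deny,allow
    Deny from all
    Allow from 192.0.2.10
</Location>
```

The other half, making the appserver accept requests only from the proxy,
would typically be handled outside Apache, for example by binding the
container's HTTP connector to the loopback address or by firewalling its
port.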


-Hoss



Re: Solr Security and XSRF

2008-06-26 Thread Christian Vogler
On Fri, Jun 27, 2008 at 1:54 AM, Chris Hostetter
<[EMAIL PROTECTED]> wrote:
> A basic technique that can be used to mitigate the risk of a possible CSRF
> attack like this is to configure your Servlet Container so that access to
> paths which can modify the index (ie: /update, /update/csv, etc...) are
> restricted either to specific client IPs, or using HTTP Authentication.

My understanding is that HTTP authentication is useless against XSRF,
because browsers cache the authentication tokens. Once you have
authenticated, you are still vulnerable to attacks.

Restricting access to the servlet container by IP is probably safer.
To access the admin pages, I proxy the servlet container via Apache,
similar to this snippet given below.

This requires the user to authenticate via SSL for all SOLR-related
pages, and additionally blocks all update queries. If one also would
like to block specific admin pages, one could conceivably do so by
adding  + Deny directives.
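
For instance, assuming the container meant here is a Location section, a
block along these lines would refuse one page at the proxy. The page path
is purely illustrative; substitute whichever admin page should be hidden.

```apache
# Illustrative only: the path below is an example, not taken from the setup.
<Location /solr/admin/threaddump.jsp>
    # No Allow line, so every client is refused for this path.
    Order allow,deny
    Deny from all
</Location>
```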

Comments, anyone? This configuration is container-agnostic, so if no
serious problems are found with my setup, which Wiki page would be
most appropriate for this snippet?


ServerName your.server.name
ServerAdmin [EMAIL PROTECTED]

SSLEngine on
SSLCertificateFile /etc/ssl/certs/your_cert.pem
SSLCertificateKeyFile /etc/ssl/private/your_key.pem

DocumentRoot /var/webroot/www/webadmin/html

   ErrorLog /var/webroot/www/webadmin/logs/error_ssl.log
   # Possible values include: debug, info, notice, warn, error, crit,
   # alert, emerg.
   LogLevel warn

   CustomLog /var/webroot/www/webadmin/logs/access_ssl.log combined

# SOLR admin pages

Order deny,allow
# change this to restrict to specific IP addresses
Allow from all


ProxyPreserveHost On
ProxyRequests Off
ProxyPass /solr/admin http://127.0.0.1:9000/solr/admin
ProxyPassReverse /solr/admin http://127.0.0.1:9000/solr/admin
ProxyPass /solr/select http://127.0.0.1:9000/solr/select
ProxyPassReverse /solr/select http://127.0.0.1:9000/solr/select


AuthType Basic
AuthName "SOLR Admin Pages"
AuthUserFile /var/webroot/www/webadmin/auth/solr-auth
Require valid-user



Best regards
- Christian