Re: GET or POST for large queries?

2011-02-18 Thread Jan Høydahl
OK.

I would ask on the mailing list of ManifoldCF to see if they have some 
experience with OLS.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 18. feb. 2011, at 17.29, mrw wrote:

> 
> Thanks for the tip.  No, I did not know about that.  Unfortunately, we use
> Oracle OLS which does not appear to be supported.
> 
> 
> Jan Høydahl / Cominvent wrote:
>> 
>> Hi,
>> 
>> There are better ways to combat row level security in search than sending
>> huge lists of users over the wire.
>> 
>> Have you checked out the ManifoldCF project with which you can integrate
>> security to Solr? http://incubator.apache.org/connectors/
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>> 
>> 
> 



Re: GET or POST for large queries?

2011-02-18 Thread mrw

Thanks for the tip.  No, I did not know about that.  Unfortunately, we use
Oracle OLS which does not appear to be supported.


Jan Høydahl / Cominvent wrote:
> 
> Hi,
> 
> There are better ways to combat row level security in search than sending
> huge lists of users over the wire.
> 
> Have you checked out the ManifoldCF project with which you can integrate
> security to Solr? http://incubator.apache.org/connectors/
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2527765.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: GET or POST for large queries?

2011-02-18 Thread Jan Høydahl
Hi,

There are better ways to combat row level security in search than sending huge 
lists of users over the wire.

Have you checked out the ManifoldCF project with which you can integrate 
security to Solr? http://incubator.apache.org/connectors/

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 18. feb. 2011, at 15.30, mrw wrote:

> 
> Thanks for the response.
> 
> Yes, the queries are fairly large.  Basically, the corporate security policy
> dictates that we use row-level security attributes from the DB for access
> control to Solr.   So,  we bake row-level security attributes from the
> database into the index, and then, at query time, ask for those same
> attributes from the DB and pass them as part of the Solr query.  So, imagine
> a bank VP with access to tens of thousands of customer records and
> transactions, and all those access attributes get sent to Solr.  The system
> works well for the low-level account managers and low-entitlement users, but
> cannot scale for the high-level folks.
> 
> POSTing the data appears to avoid the header threshold issue, but it breaks
> because of the "too many boolean clauses" error.
> 
> 
> 
> 
> gearond wrote:
>> 
>> Probably you could do it, and solving a problem in business supersedes
>> 'rightness' concerns, much to the dismay of geeks and 'those who like
>> rightness and say the word "Neemph!" '.
>> 
>> the not rightness about this is that:
>> POST, PUT, DELETE are assumed to make changes to the URL's backend.
>> GET is assumed NOT to make changes.
>> 
>> So if your POST does not make a change . . . it breaks convention. But if
>> it solves the problem . . . :-)
>> 
>> Another way would be to GET with a 'query file' location, and then have
>> the server fetch that query and execute it.
>> 
>> Boy!!! I'd love to see one of your queries!!! You must have a few ANDs/ORs
>> in them :-)
>> 
>> Dennis Gearon
>> 
> 



Re: GET or POST for large queries?

2011-02-18 Thread Markus Jelsma
Increase the maxBooleanClauses setting in solrconfig.xml.

On Friday 18 February 2011 15:30:11 mrw wrote:
> Thanks for the response.
> 
> POSTing the data appears to avoid the header threshold issue, but it breaks
> because of the "too many boolean clauses" error.
> 
> gearond wrote:
> > Probably you could do it, and solving a problem in business supersedes
> > 'rightness' concerns, much to the dismay of geeks and 'those who like
> > rightness and say the word "Neemph!" '.
> > 
> > the not rightness about this is that:
> > POST, PUT, DELETE are assumed to make changes to the URL's backend.
> > GET is assumed NOT to make changes.
> > 
> > So if your POST does not make a change . . . it breaks convention. But if
> > it solves the problem . . . :-)
> > 
> > Another way would be to GET with a 'query file' location, and then have
> > the server fetch that query and execute it.
> > 
> > Boy!!! I'd love to see one of your queries!!! You must have a few
> > ANDs/ORs in them :-)
> > 
> >  Dennis Gearon

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Re: GET or POST for large queries?

2011-02-18 Thread mrw

Thanks for the response and info.

I'll try that.  


Jonathan Rochkind wrote:
> 
> Yes, I think it's 1024 by default.  I think you can raise it in your 
> config. But your performance may suffer.
> 
> Best would be to try and find a better way to do what you want without 
> using thousands of clauses. This might require some custom Java plugins 
> to Solr though.
> 
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2526950.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: GET or POST for large queries?

2011-02-18 Thread mrw

Thanks for the response.

Yes, the queries are fairly large.  Basically, the corporate security policy
dictates that we use row-level security attributes from the DB for access
control to Solr.   So,  we bake row-level security attributes from the
database into the index, and then, at query time, ask for those same
attributes from the DB and pass them as part of the Solr query.  So, imagine
a bank VP with access to tens of thousands of customer records and
transactions, and all those access attributes get sent to Solr.  The system
works well for the low-level account managers and low-entitlement users, but
cannot scale for the high-level folks.
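
A minimal sketch of the query-time half of that scheme: turning the security attributes fetched from the DB into a single Solr filter clause. The field name `acl` and the helper `build_acl_filter` are hypothetical, chosen only for illustration; every attribute becomes one boolean clause, which is why highly entitled users produce such large queries.

```python
def build_acl_filter(field, descriptors):
    """Return a Solr filter string such as field:("a" OR "b"), one clause per descriptor."""
    if not descriptors:
        # A user with no descriptors should not end up with an unfiltered query.
        raise ValueError("at least one descriptor is required")
    quoted = []
    for d in descriptors:
        # Escape backslashes and double quotes so each value stays one quoted term.
        escaped = d.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return field + ":(" + " OR ".join(quoted) + ")"


# "acl" is an assumed field name, not one taken from our schema.
example_filter = build_acl_filter("acl", ["teller-7", "branch 12"])
```

The resulting string would be passed as an `fq` (or as part of `q`) parameter.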

POSTing the data appears to avoid the header threshold issue, but it breaks
because of the "too many boolean clauses" error.




gearond wrote:
> 
> Probably you could do it, and solving a problem in business supersedes
> 'rightness' concerns, much to the dismay of geeks and 'those who like
> rightness and say the word "Neemph!" '.
> 
> the not rightness about this is that:
> POST, PUT, DELETE are assumed to make changes to the URL's backend.
> GET is assumed NOT to make changes.
> 
> So if your POST does not make a change . . . it breaks convention. But if
> it solves the problem . . . :-)
> 
> Another way would be to GET with a 'query file' location, and then have
> the server fetch that query and execute it.
> 
> Boy!!! I'd love to see one of your queries!!! You must have a few ANDs/ORs
> in them :-)
> 
>  Dennis Gearon
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2526934.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: GET or POST for large queries?

2011-02-17 Thread Dennis Gearon
Probably you could do it, and solving a problem in business supersedes 
'rightness' concerns, much to the dismay of geeks and 'those who like rightness 
and say the word "Neemph!" '. 


the not rightness about this is that:
POST, PUT, DELETE are assumed to make changes to the URL's backend.
GET is assumed NOT to make changes.

So if your POST does not make a change . . . it breaks convention. But if it 
solves the problem . . . :-)

Another way would be to GET with a 'query file' location, and then have the 
server fetch that query and execute it.

Boy!!! I'd love to see one of your queries!!! You must have a few ANDs/ORs in 
them :-)

 Dennis Gearon



From: mrw 
To: solr-user@lucene.apache.org
Sent: Thu, February 17, 2011 11:27:06 AM
Subject: GET or POST for large queries?


We are running into some issues with large queries.  Initially, they were
ostensibly header buffer overruns, because increasing Jetty's
headerBufferSize value to 65536 resolved them. This seems like a kludge, but
it does solve the problem for 95% of our users.

However, we do have queries that are physically larger than that and for
which increasing the headerBufferSize to 65536 does not work.  This is due
to security requirements:  Security descriptors are baked into the index,
and then potentially thousands of them (depending on the user context) are
passed in with each query.  These excessive queries are only a problem with
approximately 5% of users who are highly entitled, but the number of
security descriptors is likely to increase and we won't have a
workaround for this security policy any time soon.

After a lot of Googling, it seems to me that it's common to increase the
headerBufferSize, but I don't see any other strategies.  Is it
possible/feasible to switch to use POST for querying?

Thanks!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2521700.html

Sent from the Solr - User mailing list archive at Nabble.com.


Re: GET or POST for large queries?

2011-02-17 Thread Jonathan Rochkind
Yes, I think it's 1024 by default.  I think you can raise it in your 
config. But your performance may suffer.
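
A rough sketch of a client-side pre-flight check against that limit, assuming the 1024 default mentioned above (the setting is `maxBooleanClauses` in solrconfig.xml). The function name is hypothetical, and the check only counts the descriptor clauses; the rest of the query adds clauses of its own.

```python
# Default quoted above; check the value in your own solrconfig.xml.
DEFAULT_MAX_BOOLEAN_CLAUSES = 1024


def exceeds_clause_limit(descriptors, limit=DEFAULT_MAX_BOOLEAN_CLAUSES):
    """True if OR-ing one clause per distinct descriptor would go over the limit."""
    return len(set(descriptors)) > limit


# Hypothetical descriptor values, for illustration only.
at_limit = [f"SD{i}" for i in range(1024)]
over_limit = [f"SD{i}" for i in range(1025)]
```

A check like this lets the application log or reject an oversized query before Solr answers with "too many boolean clauses".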


Best would be to try and find a better way to do what you want without 
using thousands of clauses. This might require some custom Java plugins 
to Solr though.


On 2/17/2011 3:52 PM, mrw wrote:

> Yeah, I tried switching to POST.
> 
> It seems to be handling the size, but apparently Solr has a limit on the
> number of boolean comparisons -- I'm now getting "too many boolean clauses"
> errors emanating from
> 
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:108).
> :)
> 
> Thanks for responding.
> 
> Erik Hatcher-4 wrote:
>> 
>> Yes, you may use POST to make search requests to Solr.
>> 
>>   Erik




Re: GET or POST for large queries?

2011-02-17 Thread mrw

Yeah, I tried switching to POST.

It seems to be handling the size, but apparently Solr has a limit on the
number of boolean comparisons -- I'm now getting "too many boolean clauses"
errors emanating from

org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:108).
 
:)


Thanks for responding.



Erik Hatcher-4 wrote:
> 
> Yes, you may use POST to make search requests to Solr.
> 
>   Erik
> 
> 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2522293.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: GET or POST for large queries?

2011-02-17 Thread Erik Hatcher
Yes, you may use POST to make search requests to Solr.

Erik
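
A minimal sketch of what that could look like from a client, using only the Python standard library: the query parameters go into a form-encoded POST body instead of the URL. The endpoint URL and the `acl` field are assumptions for illustration, not details from this thread, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; adjust to your own Solr host and handler path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_post_query(params):
    """Build a form-encoded POST request carrying the Solr query parameters."""
    body = urlencode(params, doseq=True).encode("utf-8")
    return Request(
        SOLR_SELECT_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


# "acl" is an assumed field name used only for this example.
req = build_post_query({"q": "*:*", "fq": "acl:(a1 OR a2 OR a3)", "wt": "json"})
```

Passing `req` to `urllib.request.urlopen` would send it; the parameters travel in the body, so the URL and header size no longer grow with the query.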

On Feb 17, 2011, at 14:27 , mrw wrote:

> 
> We are running into some issues with large queries.  Initially, they were
> ostensibly header buffer overruns, because increasing Jetty's
> headerBufferSize value to 65536 resolved them. This seems like a kludge, but
> it does solve the problem for 95% of our users.
> 
> However, we do have queries that are physically larger than that and for
> which increasing the headerBufferSize to 65536 does not work.  This is due
> to security requirements:  Security descriptors are baked into the index,
> and then potentially thousands of them (depending on the user context) are
> passed in with each query.  These excessive queries are only a problem with
> approximately 5% of users who are highly entitled, but the number of
> security descriptors is likely to increase and we won't have a
> workaround for this security policy any time soon.
> 
> After a lot of Googling, it seems to me that it's common to increase the
> headerBufferSize, but I don't see any other strategies.  Is it
> possible/feasible to switch to use POST for querying?
> 
> Thanks!



GET or POST for large queries?

2011-02-17 Thread mrw

We are running into some issues with large queries.  Initially, they were
ostensibly header buffer overruns, because increasing Jetty's
headerBufferSize value to 65536 resolved them. This seems like a kludge, but
it does solve the problem for 95% of our users.

However, we do have queries that are physically larger than that and for
which increasing the headerBufferSize to 65536 does not work.  This is due
to security requirements:  Security descriptors are baked into the index,
and then potentially thousands of them (depending on the user context) are
passed in with each query.  These excessive queries are only a problem with
approximately 5% of users who are highly entitled, but the number of
security descriptors is likely to increase and we won't have a
workaround for this security policy any time soon.
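
To give a feel for the sizes involved, here is a rough sketch that estimates the length of a GET request line carrying the descriptors and compares it with the 65536 headerBufferSize. The descriptor format, the `acl` field, and the `/solr/select` path are made up for illustration; the estimate also ignores the other request headers, which as far as I understand share the same buffer.

```python
from urllib.parse import urlencode

# The Jetty headerBufferSize value mentioned above.
HEADER_BUFFER_SIZE = 65536


def get_request_line_size(path, params):
    """Approximate byte length of the request line 'GET path?query HTTP/1.1'."""
    query = urlencode(params, doseq=True)
    return len(f"GET {path}?{query} HTTP/1.1\r\n".encode("ascii"))


# Hypothetical data: 10,000 eight-character descriptors OR-ed into one filter.
descriptors = [f"SD{i:06d}" for i in range(10000)]
params = {"q": "*:*", "fq": "acl:(" + " OR ".join(descriptors) + ")"}
size = get_request_line_size("/solr/select", params)
fits = size < HEADER_BUFFER_SIZE
```

With these made-up numbers the request line alone is well over the buffer size, so `fits` comes out false.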

After a lot of Googling, it seems to me that it's common to increase the
headerBufferSize, but I don't see any other strategies.  Is it
possible/feasible to switch to use POST for querying?

Thanks!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/GET-or-POST-for-large-queries-tp2521700p2521700.html
Sent from the Solr - User mailing list archive at Nabble.com.