[jira] [Comment Edited] (SOLR-12673) Superset Query

Varun Thacker (JIRA) Mon, 20 Aug 2018 12:14:10 -0700


    [ 
https://issues.apache.org/jira/browse/SOLR-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586377#comment-16586377
 ]


Varun Thacker edited comment on SOLR-12673 at 8/20/18 7:13 PM:
---------------------------------------------------------------

Thanks everyone for the tips!

Here's an index:
{code:java}
[
{"id" : "1", "users_ss" : ["user1"], "count_i" : 1},
{"id" : "2", "users_ss" : ["user4"], "count_i" : 1},
{"id" : "3", "users_ss" : ["user1", "user2"], "count_i" : 2},
{"id" : "4", "users_ss" : ["user1", "user2", "user3"], "count_i" : 3},
{"id" : "5", "users_ss" : ["user1", "user2", "user3", "user4"], "count_i" : 4}
]{code}
Query: do not show a document if it has any user other than user1 or user2.
Results: id:1, id:3

This is what I have tried so far.
Mikhail, here is the frange query I fired:
{code:java}
q=*:*&fq={!frange cache=false l=0 incl=true}sub(sum(query($bq1),query($bq2)),count_i)&bq1=users_ss:user1^=1&bq2=users_ss:user2^=1{code}
However, this returns only document 3. Document 1 is missing, apparently because it does not match $bq2 and the function then fails the allExists check:
[http://lucene.apache.org/core/7_4_0/queries/org/apache/lucene/queries/function/valuesource/MultiFunction.html#allExists-int-org.apache.lucene.queries.function.FunctionValues:A-]
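A minimal Python sketch of the behavior described above, run against the sample index. It is a simplified model and not Lucene's code: `query_value`, `frange_matches`, and the `missing_as_zero` flag are hypothetical names, and the rule that every input must exist is an assumption based on the linked allExists method.

```python
from typing import Optional

# Sample index from the comment above.
DOCS = [
    {"id": "1", "users_ss": ["user1"], "count_i": 1},
    {"id": "2", "users_ss": ["user4"], "count_i": 1},
    {"id": "3", "users_ss": ["user1", "user2"], "count_i": 2},
    {"id": "4", "users_ss": ["user1", "user2", "user3"], "count_i": 3},
    {"id": "5", "users_ss": ["user1", "user2", "user3", "user4"], "count_i": 4},
]


def query_value(doc: dict, user: str) -> Optional[float]:
    """Stand-in for query($bqN) with a constant score of 1.

    Returns None (value does not exist) when the document does not
    match users_ss:<user>.
    """
    return 1.0 if user in doc["users_ss"] else None


def frange_matches(doc: dict, users: list, lower: float = 0.0,
                   missing_as_zero: bool = False) -> bool:
    """Model of {!frange l=0 incl=true}sub(sum(query(...)...),count_i)."""
    values = [query_value(doc, u) for u in users]
    if missing_as_zero:
        # Hypothetical alternative: treat a missing input as 0.
        values = [0.0 if v is None else v for v in values]
    elif any(v is None for v in values):
        # Assumed all-must-exist rule: one missing input means the whole
        # function has no value, so the range filter rejects the document.
        return False
    return sum(values) - doc["count_i"] >= lower


users = ["user1", "user2"]
strict = [d["id"] for d in DOCS if frange_matches(d, users)]
lenient = [d["id"] for d in DOCS if frange_matches(d, users, missing_as_zero=True)]
print(strict)   # ['3']
print(lenient)  # ['1', '3']
```

Under this model document 1 is dropped only because its second input is missing. If missing inputs counted as 0, it would pass the range check.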

 

EDIT: Changed func to frange to filter out score=0 documents.

Here's another approach I tried, using a solution Hoss proposed:
{code:java}
q={!frange l=1}eq(count_i,sum(termfreq(users_ss,"user1"),termfreq(users_ss,"user2"))){code}
This seems to do the job:
{code:java}
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":7,
    "params":{
      "q":"*:*",
      "fl":"id,sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\")),count_i,score",
      "fq":"{!frange l=1}eq(count_i,sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\")))",
      "_":"1534788187996"}},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
      {
        "id":"1",
        "count_i":1,
        "sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))":1.0,
        "score":1.0},
      {
        "id":"3",
        "count_i":2,
        "sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))":2.0,
        "score":1.0}]
  }} 
{code}
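The counting idea behind the termfreq query can be sketched in a few lines of Python. This is only an illustration of the logic, not Solr's implementation: `termfreq` and `is_visible` here are hypothetical helpers, and the sketch assumes count_i always equals the number of distinct values in users_ss.

```python
# Sample index from the comment above.
DOCS = [
    {"id": "1", "users_ss": ["user1"], "count_i": 1},
    {"id": "2", "users_ss": ["user4"], "count_i": 1},
    {"id": "3", "users_ss": ["user1", "user2"], "count_i": 2},
    {"id": "4", "users_ss": ["user1", "user2", "user3"], "count_i": 3},
    {"id": "5", "users_ss": ["user1", "user2", "user3", "user4"], "count_i": 4},
]


def termfreq(doc: dict, field: str, term: str) -> int:
    """Number of times `term` occurs in a multi-valued field of one doc."""
    return doc[field].count(term)


def is_visible(doc: dict, allowed: list) -> bool:
    """True when every user on the document is an allowed user.

    The number of allowed users found on the document is compared with the
    stored total (count_i); any other user makes the total larger.
    """
    allowed_hits = sum(termfreq(doc, "users_ss", user) for user in allowed)
    return allowed_hits == doc["count_i"]


visible = [d["id"] for d in DOCS if is_visible(d, ["user1", "user2"])]
print(visible)  # ['1', '3']
```

The check is only as reliable as count_i: if the stored count drifts from the real number of users, documents would be shown or hidden incorrectly.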
 

So it looks like solutions do exist. My question is: is it worth writing a 
specialized query, and would it be more efficient than these solutions?


was (Author: varunthacker):
Thanks everyone for the tips!


Here's an index:
{code:java}
[
{"id" : "1", "users_ss" : ["user1"], "count_i" : 1},
{"id" : "2", "users_ss" : ["user4"], "count_i" : 1},
{"id" : "3", "users_ss" : ["user1", "user2"], "count_i" : 2},
{"id" : "4", "users_ss" : ["user1", "user2", "user3"], "count_i" : 3},
{"id" : "5", "users_ss" : ["user1", "user2", "user3", "user4"], "count_i" : 4}
]{code}
Query : If the document has any user apart from user1, user2 don't show it.
Results : id:1, id:3

This is what I have tried so far.
Mikhail, here is the frange query I fired:
{code:java}
q=*:*&fq={!frange cache=false l=0 incl=true}sub(sum(query($bq1),query($bq2)),count_i)&bq1=users_ss:user1^=1&bq2=users_ss:user2^=1{code}
However, this returns only document 3. We don't get document 1 because of 
[http://lucene.apache.org/core/7_4_0/queries/org/apache/lucene/queries/function/valuesource/MultiFunction.html#allExists-int-org.apache.lucene.queries.function.FunctionValues:A-]

 

Here's another approach I tried, using a solution Hoss proposed:
{code:java}
q={!func}eq(count_i,sum(termfreq(users_ss,"user1"),termfreq(users_ss,"user2"))){code}
This seems to do the job, but I still get documents with score = 0:
{code:java}
{
"responseHeader": {
"zkConnected": true,
"status": 0,
"QTime": 6,
"params": {
"q": "{!func}eq(count_i,sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\")))",
"fl": "id,sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\")),count_i,score",
"_": "1534788187996"
}
},
"response": {
"numFound": 5,
"start": 0,
"maxScore": 1.0,
"docs": [{
"id": "1",
"count_i": 1,
"sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))": 1.0,
"score": 1.0
},
{
"id": "3",
"count_i": 2,
"sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))": 2.0,
"score": 1.0
},
{
"id": "4",
"count_i": 3,
"sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))": 2.0,
"score": 0.0
},
{
"id": "2",
"count_i": 1,
"sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))": 0.0,
"score": 0.0
},
{
"id": "5",
"count_i": 4,
"sum(termfreq(users_ss,\"user1\"),termfreq(users_ss,\"user2\"))": 2.0,
"score": 0.0
}
]
}
}{code}
 

So it looks like solutions do exist. My question is: is it worth writing a 
specialized query, and would it be more efficient than these solutions?

> Superset Query
> --------------
>
>                 Key: SOLR-12673
>                 URL: https://issues.apache.org/jira/browse/SOLR-12673
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>            Priority: Major
>
> Here's the use-case I am trying to solve for document level access control.
> Documents :
> {code:java}
> [
>   {"id" : "1", "users" : ["user1"]},
>   {"id" : "2", "users" : ["user4"]},
>   {"id" : "3", "users" : ["user1", "user2"]},
>   {"id" : "4", "users" : ["user1", "user2", "user3"]},
>   {"id" : "5", "users" : ["user1", "user2", "user3", "user4"]}
> ]{code}
>  
> Query : If the document has any user apart from user1, user2 or user3 don't 
> show it.
> Results : id:1, id:3, id:4
> Query : If the document has any user apart from user1, user2 don't show it.
> Results : id:1, id:3
> I'm thinking this can be solved by writing a post-filter
> Syntax:
> {code:java}
> {!union_has_all field=users}user1,user2,user3{code}
> The post filter would get each document at a time and see if there is a user 
> in that document that is not part of the query.
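
The per-document check that the proposed post-filter would perform can be sketched as follows. This is a plain Python illustration under stated assumptions, not a Solr PostFilter: `parse_allowed` and `post_filter` are hypothetical names, and the comma-separated value mirrors the proposed {!union_has_all} syntax.

```python
from typing import Iterable, Iterator

# Documents from the issue description.
DOCS = [
    {"id": "1", "users": ["user1"]},
    {"id": "2", "users": ["user4"]},
    {"id": "3", "users": ["user1", "user2"]},
    {"id": "4", "users": ["user1", "user2", "user3"]},
    {"id": "5", "users": ["user1", "user2", "user3", "user4"]},
]


def parse_allowed(param_value: str) -> frozenset:
    """Parse the comma-separated user list, e.g. 'user1,user2,user3'."""
    return frozenset(u.strip() for u in param_value.split(",") if u.strip())


def post_filter(candidates: Iterable[dict], field: str,
                allowed: frozenset) -> Iterator[dict]:
    """Yield only documents whose users are all in the allowed set.

    Documents are examined one at a time, as a post-filter would see them,
    and rejected as soon as a user outside the query is found.
    """
    for doc in candidates:
        if all(user in allowed for user in doc[field]):
            yield doc


three = [d["id"] for d in post_filter(DOCS, "users", parse_allowed("user1,user2,user3"))]
two = [d["id"] for d in post_filter(DOCS, "users", parse_allowed("user1,user2"))]
print(three)  # ['1', '3', '4']
print(two)    # ['1', '3']
```

Unlike the termfreq approach, this check does not need a stored count field, but it has to read the field values of every candidate document.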



