[ 
https://issues.apache.org/jira/browse/SOLR-12460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amrit Sarkar updated SOLR-12460:
--------------------------------
    Description: 
FacetProcessor.java :: {{handleJoinField}} calls 
SolrIndexSearcher.getDocSet(..domainquery..), which eventually caches join 
queries like:

{code}
key:{  !join from=[vin_s-value] to=[vin_s-value] }   {   !cache=false }   
ConstantScore(BitSetDocTopFilter) 
value:   org.apache.solr.search.SortedIntDocSet@40886054 lastAccessed:824,
{code}

and when we execute the same query again, some of these entries are never 
reused. Please note: the filterCache entries go unused only when the 
collection has more than 1 shard.

Sample: {{car_stuff}} collection with multiple doc-types: vehicle, claims, 
defects - 10 shards

{code}
http://localhost:8983/solr/car_stuff/query?rows=0&q=doc_type_s:vehicle AND 
v_model_s:model_009&json.facet={  
   models:{  
      type:terms,
      field:"v_model_s",
      limit:-1,
      facet:{  
         year_per_model:{  
            type:terms,
            field:"v_year_i",
            limit:-1,
            facet:{  
               claim_month:{  
                  domain:{  
                     join:{  
                        from:"vin_s",
                        to:"vin_s"
                     },
                     filter:"doc_type_s:claim"
                  },
                  type:terms,
                  field:"claim_opcode_s",
                  limit:-1
               }
            }
         }
      }
   }
}
{code}
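
For anyone reproducing this, the request above can also be assembled programmatically. This is only a sketch: the helper name {{build_request_params}} is hypothetical, and it assumes Solr accepts the facet definition as strict JSON in the {{json.facet}} parameter. The field names, query, and host are taken from the sample request.

```python
import json
from urllib.parse import urlencode


def build_request_params():
    """Build the query parameters for the nested join-facet request (hypothetical helper)."""
    # Innermost facet: switches domain via a join on vin_s, then filters to claim docs.
    claim_month = {
        "type": "terms",
        "field": "claim_opcode_s",
        "limit": -1,
        "domain": {
            "join": {"from": "vin_s", "to": "vin_s"},
            "filter": "doc_type_s:claim",
        },
    }
    year_per_model = {
        "type": "terms",
        "field": "v_year_i",
        "limit": -1,
        "facet": {"claim_month": claim_month},
    }
    models = {
        "type": "terms",
        "field": "v_model_s",
        "limit": -1,
        "facet": {"year_per_model": year_per_model},
    }
    return {
        "rows": 0,
        "q": "doc_type_s:vehicle AND v_model_s:model_009",
        "json.facet": json.dumps({"models": models}),
    }


# Host and collection name are the ones used in the sample request.
url = "http://localhost:8983/solr/car_stuff/query?" + urlencode(build_request_params())
print(url)
```

Issuing the printed URL twice against a multi-shard collection and comparing the filterCache stats after each run should reproduce the behaviour described here.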

After executing this query for the first time, the filterCache for one of the 
cores looks like:
{code}
      "CACHE.searcher.filterCache":{
        "lookups":145,
        "hits":47,
        "cumulative_evictions":0,
        "size":99,
        "hitratio":0.32,
        "evictions":0,
        "cumulative_lookups":145,
        "cumulative_hitratio":0.32,
        "warmupTime":0,
        "inserts":99,
        "cumulative_inserts":99,
        "cumulative_hits":47},
{code}

After executing the same query a second time:
{code}
      "CACHE.searcher.filterCache":{
        "lookups":291,
        "hits":145,
        "cumulative_evictions":0,
        "size":147,
        "hitratio":0.5,
        "evictions":0,
        "cumulative_lookups":291,
        "cumulative_hitratio":0.5,
        "warmupTime":0,
        "inserts":147,
        "cumulative_inserts":147,
        "cumulative_hits":145},
{code}
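
Because these counters are cumulative, the effect of the second execution alone is easier to see as a difference between the two snapshots above. A small sketch of that arithmetic, with the values copied from the stats in this report (the function name {{run_delta}} is just illustrative):

```python
# Counters copied from the two filterCache snapshots in this report.
first_run = {"lookups": 145, "hits": 47, "inserts": 99, "size": 99}
second_run = {"lookups": 291, "hits": 145, "inserts": 147, "size": 147}


def run_delta(before, after):
    """Return per-counter growth between two cumulative snapshots."""
    return {name: after[name] - before[name] for name in before}


delta = run_delta(first_run, second_run)
# Hit ratio of the second execution alone, not the cumulative figure.
second_run_hit_ratio = delta["hits"] / delta["lookups"]

print(delta)
print(round(second_run_hit_ratio, 2))
```

The second execution accounts for 146 lookups, of which 98 were hits and 48 resulted in new inserts, so the cache grew by 48 entries even though the query was identical.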

The entries of the filterCache look like this:

{code}
.....
 key: v_year_i:[1977 TO 1977] value: org.apache.solr.search.BitDocSet@1524fa9c 
lastAccessed:457,
 key: v_model_s:model_003 value: org.apache.solr.search.BitDocSet@61f348dd 
lastAccessed:157,
key:{  !join from=[vin_s-value] to=[vin_s-value] }   {   !cache=false }   
ConstantScore(BitSetDocTopFilter) 
value:   org.apache.solr.search.SortedIntDocSet@40886054 lastAccessed:824,
.....
{code}

A backup of the collection is attached to the JIRA issue.


> Filtercache getting filled up when domain switches are involved in Json facets
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-12460
>                 URL: https://issues.apache.org/jira/browse/SOLR-12460
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Facet Module
>    Affects Versions: master (8.0)
>            Reporter: Amrit Sarkar
>            Priority: Major


