Hi All,

I spent some time today playing around with the subfacet and facet functions 
now available in Heliosearch 0.05. They look very promising, but I have some 
concerns.

I indexed 10 000 documents and built some queries to look at each feature and 
found some weird behaviour that I could not explain.

The first query I made was to find all documents having the word "java" in 
their title and then compute a facet on the field position_id with stats about 
the field job_id. Basically, I want the number of unique job_id values for 
each position_id across all matching documents.

http://localhost:8983/solr/current/select?q=title:java&facet=on&facet.field=position_id&facet.stat=unique(job_id)&rows=1&facet.limit=10&facet.mincount=1&wt=json&indent=on&fl=job_id,position_id,super_alias_id
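
For anyone who wants to reproduce this, here is a minimal sketch of how the same request could be assembled in Python so the parameters are easier to toggle. The helper name build_facet_query is only illustrative; the host, core name and parameters are taken from the URL above, and the special characters end up percent-encoded.

```python
from urllib.parse import urlencode

# Host and core name are copied from the URL above.
BASE_URL = "http://localhost:8983/solr/current/select"


def build_facet_query(term, facet_field, stat=None, limit=10, mincount=1):
    """Build the select URL; pass stat=None to omit facet.stat entirely."""
    params = [
        ("q", f"title:{term}"),
        ("facet", "on"),
        ("facet.field", facet_field),
    ]
    if stat is not None:
        # This is the parameter that seems to make mincount be ignored.
        params.append(("facet.stat", stat))
    params += [
        ("rows", 1),
        ("facet.limit", limit),
        ("facet.mincount", mincount),
        ("wt", "json"),
        ("indent", "on"),
        ("fl", "job_id,position_id,super_alias_id"),
    ]
    return f"{BASE_URL}?{urlencode(params)}"


url = build_facet_query("java", "position_id", stat="unique(job_id)")
url_without_stat = build_facet_query("java", "position_id")
print(url)
print(url_without_stat)
```

Calling it with and without stat gives the two requests compared below.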

The response looks good except for one little thing: the mincount is not 
respected whenever I specify the facet.stat parameter. Removing facet.stat 
causes the mincount to be respected, but I need this parameter.

Without the parameter the facet looks like this:
"facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "position_id":[
        "265151",5,
        "927284",1,
        "1662380",1,
        "2625553",1,
        "2862455",1,
        "4128904",1,
        "4253203",1]},  <=== accounted for all 11 documents

And now when adding the parameter:


"facets":{
    "position_id":{
      "stats":{
        "unique(job_id)":11, <== again, 11 documents, which is good
        "count":11},
      "buckets":[{
          "val":265151,
          "unique(job_id)":5,
          "count":5},
        {
          "val":927284,
          "unique(job_id)":1,
          "count":1},
        {
          "val":1662380,
          "unique(job_id)":1,
          "count":1},
        {
          "val":2625553,
          "unique(job_id)":1,
          "count":1},
        {
          "val":2862455,
          "unique(job_id)":1,
          "count":1},
        {
          "val":4128904,
          "unique(job_id)":1,
          "count":1},
        {
          "val":4253203,
          "unique(job_id)":1,
          "count":1},
        {
          "val":1133,
          "unique(job_id)":0, <== what is this?
          "count":0},
                .... Many zero entries following...

I was wondering where the extra entries come from. The position_id = 1133 
above is not even a match for my query (its title is "Audit Consultant").
I've also noticed similar behaviour when using subfacets. It looks like the 
number of items returned always matches the "facet.limit" parameter: if not 
enough matching values are present, the remaining buckets are filled with 
values from documents that do not match the original query.
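
In the meantime, a client-side workaround could be to drop the empty buckets after parsing the response. The sketch below assumes the response shape shown above (a "buckets" list whose entries carry a "count" key); filter_buckets is a hypothetical helper, not part of any Solr or Heliosearch client, and the sample data is an abbreviated copy of the output above.

```python
def filter_buckets(facet, mincount=1, count_key="count"):
    """Return a copy of one facet entry without buckets below mincount."""
    kept = [
        bucket
        for bucket in facet.get("buckets", [])
        if bucket.get(count_key, 0) >= mincount
    ]
    # Copy the facet so the parsed response itself is left untouched.
    return {**facet, "buckets": kept}


# Abbreviated copy of the "position_id" facet from the response above.
sample = {
    "stats": {"unique(job_id)": 11, "count": 11},
    "buckets": [
        {"val": 265151, "unique(job_id)": 5, "count": 5},
        {"val": 927284, "unique(job_id)": 1, "count": 1},
        {"val": 1133, "unique(job_id)": 0, "count": 0},
    ],
}

filtered = filter_buckets(sample)
kept_vals = [bucket["val"] for bucket in filtered["buckets"]]
print(kept_vals)
```

This only hides the symptom: the server still returns facet.limit buckets, so fewer than facet.limit entries may remain after filtering.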

Am I doing something wrong?
