[ 
https://issues.apache.org/jira/browse/SOLR-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yonik Seeley updated SOLR-153:
------------------------------

    Attachment: facettree.patch

Much more complete code, algorithm-wise.

I added code to build a tree.  It's based on a priority queue, but it only 
takes unionSize into account when selecting nodes to merge (not maxDf at all), 
and is thus sub-optimal.  I expect it to be replaced in the future, but it may 
work well enough for the first working version.
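
For illustration only, here is a minimal Python sketch of one way to read the
tree building described above. It is not the code in facettree.patch. The
names Node and build_tree, and the Huffman-style rule of always merging the two
nodes with the smallest sets, are assumptions. Like the patch, it looks only at
union size when choosing what to merge, and it carries maxDf up the tree without
using it for selection.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Node:
    docs: frozenset              # union of the doc ids of all leaves below
    max_df: int                  # largest single-term doc set below this node
    term: Optional[str] = None   # set only on leaves
    children: tuple = ()


def build_tree(term_docs):
    """Greedy merge driven by a priority queue keyed on union size only.

    Assumption: the two nodes with the smallest sets are merged at each step.
    max_df is recorded on every node but ignored when picking nodes to merge.
    """
    tie = count()  # tie-breaker so heapq never has to compare Node objects
    heap = []
    for term, docs in term_docs.items():
        docs = frozenset(docs)
        heapq.heappush(heap, (len(docs), next(tie), Node(docs, len(docs), term)))
    if not heap:
        return None
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        union = a.docs | b.docs
        parent = Node(union, max(a.max_df, b.max_df), None, (a, b))
        heapq.heappush(heap, (len(union), next(tie), parent))
    return heap[0][2]
```

Calling build_tree({"a": {1, 2, 3}, "b": {3, 4}, "c": {5}}) merges "c" and "b"
first, then merges that node with "a" to form the root.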

I added searching code that traverses the tree and expands nodes, estimating 
child intersection counts as the parent count multiplied by the fraction 
of bits set in the child union.  Right now, the next node to evaluate is 
chosen by estimatedIntersectionCount * maxDf, but something like 
estimatedIntersectionCount * sqrt(maxDf) might work better in the future.
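
As a rough sketch of the search described above (again not the patch itself),
the Python below expands nodes best-first using estimatedIntersectionCount *
maxDf as the priority. Several details are assumptions: the names Node and
top_facets, reading "fraction of bits set in the child union" as child union
size divided by parent union size, and the pruning rule, which skips a node
once min(parent count, maxDf) cannot beat the current k-th best count.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Node:
    docs: frozenset              # union of the doc ids of all leaves below
    max_df: int                  # largest single-term doc set below this node
    term: Optional[str] = None   # set only on leaves
    children: tuple = ()


def top_facets(root, base, k):
    """Return up to k (count, term) pairs with the largest overlap with base."""
    if root is None or k <= 0:
        return []
    base = frozenset(base)
    tie = count()  # tie-breaker so heapq never has to compare Node objects
    # Frontier entries: (-priority, tie, upper bound on any leaf count, node)
    frontier = [(0.0, next(tie), len(base), root)]
    top = []  # min-heap holding the best k (count, term) pairs seen so far
    while frontier:
        _, _, bound, node = heapq.heappop(frontier)
        if len(top) == k and bound <= top[0][0]:
            continue  # assumed pruning: nothing below can beat the k-th best
        exact = len(node.docs & base)
        if exact == 0:
            continue  # no term below this node intersects the base set
        if node.term is not None:
            heapq.heappush(top, (exact, node.term))
            if len(top) > k:
                heapq.heappop(top)
            continue
        for child in node.children:
            # Estimate: parent count times the child's share of the parent union
            est = exact * len(child.docs) / len(node.docs)
            child_bound = min(exact, child.max_df)
            heapq.heappush(
                frontier, (-est * child.max_df, next(tie), child_bound, child)
            )
    return sorted(top, key=lambda r: (-r[0], r[1]))
```

Swapping child.max_df for math.sqrt(child.max_df) in the priority (with an
added import math) would give the alternative ordering mentioned above. The
pruning bound is independent of that choice.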

This is still brainstorming code: all in one file, completely untested, and 
not yet functional, since there is no code to hook it up to Solr (to construct 
a request or get the result).  This update is really just to back up the code 
somewhere, or in case I get hit by a bus :-)


> Facet Index
> -----------
>
>                 Key: SOLR-153
>                 URL: https://issues.apache.org/jira/browse/SOLR-153
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Yonik Seeley
>         Attachments: facettree.patch, facettree.patch, facettree.patch
>
>
> A facet index, initially for non-hierarchical facets.
> Start with all terms, and a set of documents for each term.  Group lower 
> level nodes by taking the union of the sets, but keep track of the largest 
> set going back all the way to the leaves (the max doc-freq for that node).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
