Check out Chris Hostetter's methodology for doing this at CNET.

http://mail-archives.apache.org/mod_mbox/lucene-java-user/200508.mbox/[EMAIL PROTECTED]

This sounds like it matches your requirements.
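
For what it's worth, here is a minimal sketch of one common way to get
per-category counts without walking every hit: keep one bitset of document
ids per category, build it once, and intersect it with the bitset of hits
for each search. This is my own illustration and is not copied from the
linked post. CategoryCounter and countByCategory are hypothetical names,
not Lucene API, and how the bitsets are filled from the index is left out.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts hits per category by
// intersecting bitsets of document ids.
class CategoryCounter {

    // hits: one bit set per document id matched by the current search.
    // categoryDocs: for each category path, the ids of all documents
    // assigned to it. Assumed to be built once and reused across searches.
    static Map<String, Integer> countByCategory(BitSet hits,
                                                Map<String, BitSet> categoryDocs) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, BitSet> e : categoryDocs.entrySet()) {
            // Work on a copy so the cached per-category set is not modified.
            BitSet overlap = (BitSet) e.getValue().clone();
            // Keep only documents that are both hits and in this category.
            overlap.and(hits);
            int n = overlap.cardinality();
            if (n > 0) {
                // Categories with no matching documents are left out.
                counts.put(e.getKey(), n);
            }
        }
        return counts;
    }
}
```

The cost per search is then one bitset AND per category rather than one
field lookup per hit, and a document assigned to several categories is
simply counted once in each of them.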

cheers,
j

On 12/7/05, Ching-Pei Hsing <[EMAIL PROTECTED]> wrote:
>
> Has anyone solved the following problem, or does anyone have good suggestions?
>
>
>
> Each document is assigned to one or more category nodes in a hierarchy.
>
> For example,
>
>
>
> Document1: /Computer/Desktop,
>
> Document2: /Computer/Notebook; /Salesforce/ExtremePortable
>
> Document3: /Computer/Server
>
> ......
>
>
>
> For each search operation, we present not only the list of matching
> documents but also the list of categories containing those documents,
> along with the number of matching documents in each category:
>
>
>
> /Computer/Desktop(30)
>
> /Computer/Notebook(12)
>
> /Computer/Accessories(51)
>
>
>
> This is really useful because it can "guide" the user while
> refining the search criteria and quickly reduce the size of the result.
> I know we can do this by brute force: go through the entire result
> set, retrieve the category field for each document, and aggregate the
> counts. That does not scale, though, if the number of documents to
> go through is high. It can also create performance issues under load if
> each execution thread holds on to the index reader for too long (because
> of the number of documents it has to go through).
>
>
>
> Is there any API or approach we can leverage at search time? Is there
> anything we can do at indexing time? Or is there any technology we
> need to integrate, like those used for data warehousing? Any comments or
> pointers will be greatly appreciated.
>
>
>
> Thanks
>
>
>
> Ching-pei
