Organizing multiple searchers around overlapping subsets of data

Michael Ludwig Fri, 08 May 2009 01:59:42 -0700

I have one type of document but several searchers, each of
which is interested in a different subset of the documents.
The subsets are defined by combinations of TV channels, say
{A,B,C,D}:

* Application S1 is interested in all channels, i.e. {A,B,C,D}.
* Application S2 is interested in {A,B,C}.
* Application S3 is interested in {A,C,D}.
* Application S4 is interested in {B,D}.

As this simplified example shows, the subsets are not
disjoint but overlap considerably.

The total data volume is only about 200 MB. There are four
searchers now, and that number may grow to ten or a dozen.

The real set of channels an application may or may not be
interested in is much larger than the {A,B,C,D} of this
example: there are about 150 channels, each of which has
about 1000 documents.

What is the best way to organize this?

(a) Set up different cores for each application, i.e. going
multi-core, thereby incurring a good deal of redundancy, but
simplifying searches?

(b) Keep a single index and apply filter queries to select
documents from only, say, 60, 80 or 110 of the 150 channels?

(c) Something else I'm not aware of.
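
To make option (b) concrete, here is a minimal sketch of how one
filter query per application might be built. It assumes a
hypothetical indexed field named "channel" that holds the channel
ID, and channel IDs that contain no characters needing escaping in
the query syntax; the function names are made up for illustration.

```python
from urllib.parse import urlencode

def channel_filter(channels, field="channel"):
    """Build a filter query string limiting results to the given channels.

    `field` is an assumed field name, not something from a real schema.
    """
    # Sort so the same subset always produces the same string; identical
    # filter strings should then be able to reuse one cached filter.
    terms = " OR ".join(sorted(channels))
    return f"{field}:({terms})"

def search_params(user_query, channels):
    """URL-encode the user query (q) plus the per-application filter (fq)."""
    return urlencode({"q": user_query, "fq": channel_filter(channels)})

# Application S2 from the example above.
S2 = {"A", "B", "C"}
print(channel_filter(S2))         # channel:(A OR B OR C)
print(search_params("news", S2))
```

With 60 to 110 channels per application the filter string gets long,
but each application would only ever send its own fixed filter.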

Am I right in suspecting that multi-core makes less sense with
increasing overlaps and hence redundancy?
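
To put a rough number on that redundancy, the sketch below compares
how many documents a single shared index would hold with how many
one core per application would hold in total. It uses the four
example applications and the approximate figure of 1000 documents
per channel; the function name is made up for illustration.

```python
DOCS_PER_CHANNEL = 1000  # approximate figure, as stated above

# Channel subsets of the four example applications.
apps = {
    "S1": {"A", "B", "C", "D"},
    "S2": {"A", "B", "C"},
    "S3": {"A", "C", "D"},
    "S4": {"B", "D"},
}

def redundancy(apps, docs_per_channel=DOCS_PER_CHANNEL):
    """Return (docs in one shared index, docs summed over per-app cores, ratio)."""
    distinct_channels = set().union(*apps.values())
    single_index = len(distinct_channels) * docs_per_channel
    # With one core per application, every channel is indexed once for
    # each application that subscribes to it.
    multi_core = sum(len(chans) for chans in apps.values()) * docs_per_channel
    return single_index, multi_core, multi_core / single_index

print(redundancy(apps))  # (4000, 12000, 3.0)
```

In the example, going multi-core triples the number of indexed
documents; the ratio grows with the number of applications and the
size of their overlaps.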

Michael Ludwig
