Well, my first question is why 50K groups is necessary, and whether you can simplify that. How a user can manually choose from among that many groups is "interesting". But assuming they're all necessary, I can think of two things.
If the user can only select ranges, just put in filter queries using ranges, or possibly both ranges and individual entries, as in

  fq=group:[1A TO 10000A] OR group:(2B 45C 98Z)

etc. You need to be a little careful how you index these so range queries work properly. In the above you'd miss 2A because the terms sort lexicographically, so you'd need to store them in some fixed-width form that sorts in numeric order, like 0000001A ... 0010000A and so on. You wouldn't need to show that form to the user, just form your fq's in the app to work with that form.

If that won't work (you wouldn't want the fq to get huge), think about a "post filter" that would only operate on documents that had made it through the select, although how to convey which groups the user selected to the post filter is an open question.

Best,
Erick

On Wed, Oct 9, 2013 at 12:23 PM, David Philip <davidphilipshe...@gmail.com> wrote:

> Hi All,
>
> I have an issue in handling filters for one of our requirements and
> would like to get suggestions for the best approaches.
>
> *Use Case:*
>
> 1. We have a list of groups, and the number of groups can increase to
> more than 1 million. Currently we have almost 90 thousand groups in the
> Solr search system.
>
> 2. Just before the user hits search, he has the option to select the
> groups he wants to retrieve. [The distinct list of these group names for
> display is retrieved from another Solr index that has more information
> about groups.]
>
> *3. User Operation:*
> Say the user selected group 1A - group 10000A and searches for
> key:cancer.
>
> The current approach I was thinking of is: get the search results and
> filter query by the list of group IDs selected by the user. But my
> concern is that when this list grows to >50k unique IDs, it can cause a
> lot of delay in getting search results. So I wanted to know whether there
> are different ways of filtering that I can try.
>
> I was also thinking of one more approach, suggested by my colleague, to
> do an intersection:
> - Get the group IDs selected by the user.
> - Get the list of group IDs from the search results.
> - Perform the intersection of both and then get the entire result set of
>   only those group IDs that intersected.
> Is this a better way? Can I use any cache technique in this case?
>
> - David.
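
As a rough illustration of the zero-padding suggestion in the reply above, here is a minimal Python sketch of how an application might normalize group IDs and assemble the fq string. It assumes group IDs are a run of digits followed by an optional letter suffix (as in 1A, 10000A) and that 7 digits is wide enough; pad_group_id, build_fq and PAD_WIDTH are hypothetical names, not part of Solr. Only the field name "group" comes from the thread.

```python
import re

# Assumption: no group ID has more than 7 digits. Adjust to the real data.
PAD_WIDTH = 7


def pad_group_id(group_id, width=PAD_WIDTH):
    """Zero-pad the numeric prefix so string order matches numeric order."""
    match = re.fullmatch(r"([0-9]+)([A-Za-z]*)", group_id)
    if match is None:
        raise ValueError(f"unexpected group ID format: {group_id!r}")
    digits, suffix = match.groups()
    if len(digits) > width:
        raise ValueError(f"group ID {group_id!r} has more than {width} digits")
    return digits.zfill(width) + suffix


def build_fq(ranges, singles, field="group"):
    """Build one filter query from (low, high) ranges and individual IDs.

    The same padding must have been applied at index time for this to match.
    """
    clauses = [
        f"{field}:[{pad_group_id(low)} TO {pad_group_id(high)}]"
        for low, high in ranges
    ]
    if singles:
        padded = " ".join(pad_group_id(s) for s in singles)
        clauses.append(f"{field}:({padded})")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # Unpadded strings: "2A" does not fall between "1A" and "10000A".
    print("1A" <= "2A" <= "10000A")
    # Padded strings: it does.
    print(pad_group_id("1A") <= pad_group_id("2A") <= pad_group_id("10000A"))
    print(build_fq([("1A", "10000A")], ["2B", "45C", "98Z"]))
```

Note that the comparison is still string-based, so a padded range covers every term that sorts between the two endpoints, whatever its letter suffix. If only the "A" groups are wanted, the suffix would need to be handled separately, for example in its own field.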