Re: filtering/faceting by a big list of IDs

Roman Chyla Thu, 13 Feb 2014 05:23:42 -0800

Hi Tri,
Look at this:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/%3CCAEN8dyX_Am_v4f=5614eu35fnhb5h7dzkmkzdfwvrrm1xpq...@mail.gmail.com%3E
Roman
On 13 Feb 2014 03:39, "Tri Cao" <tm...@me.com> wrote:


> Hi Joel,
>
> Thanks a lot for the suggestion.
>
> After thinking more about this, I think I could skip the facet counts for
> now and just provide a filtering option, without displaying how many items
> would remain after filtering. After all, even Google Shopping product search
> doesn't display the facet counts :) Given that, I think the easiest way is
> to add a new PostFilter to the query.
>
> Thanks again,
> Tri
>
> On Feb 12, 2014, at 12:03 PM, Joel Bernstein <joels...@gmail.com> wrote:
>
> Tri,
>
> You will most likely need to implement a custom QParserPlugin to
> efficiently handle what you described. Inside of this QParserPlugin you
> could create the logic that would bring in your outside list of IDs and
> build a DocSet that could be applied to the fq and the facet.query. I
> haven't attempted to use a QParserPlugin with a facet.query, but in theory
> it would work.
>
> With the filter query you also have the option of implementing your Query
> as a PostFilter. PostFilter logic is applied at collect time so the logic
> needs to be applied only to the documents that match the query. In many
> cases this can be faster, especially when result sets are relatively small
> but the index is large.
>
>
> Joel Bernstein
> Search Engineer at Heliosearch
>
>
> On Wed, Feb 12, 2014 at 2:12 PM, Tri Cao <tm...@me.com> wrote:
>
> Hi all,
>
> I am running a Solr application and I need to implement a feature
> that requires faceting and filtering on a large list of IDs. The IDs are
> stored outside of Solr and are specific to the currently logged-in user. An
> example of this is the articles/tweets the user has read in the last few
> weeks. Note that the IDs here are the real document IDs and not Lucene
> internal docids.
>
> So the question is: what would be the best way to implement this in Solr?
> The list could be as large as tens of thousands of IDs. The obvious way of
> rewriting the Solr query to add the ID list as "facet.query" and "fq" doesn't
> seem to be the best way because: a) the query would be very long, and b) it
> would surely exceed the default limit of 1024 Boolean clauses, and I
> am sure the limit is there for a reason.
>
> I had a similar problem before, but back then I was using Lucene directly,
> and the way I solved it was to use a MultiTermQuery to retrieve the internal
> docids from the ID list and then apply the resulting DocSet to counting and
> filtering. It worked reasonably well for lists of size ~10K, and with proper
> caching it was OK. My current application is so invested in Solr
> that going back to Lucene is not an option anymore.
>
> All advice and suggestions are welcome.
>
> Thanks,
> Tri
>
>
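
For anyone reading this thread later, here is a minimal sketch of the collect-time filtering idea Joel describes. It is plain JDK code and does not use Solr's PostFilter or collector classes; the names PostFilterSketch, IdFilteringCollector and filterMatches are made up for illustration. It assumes the user's ID list has already been loaded into a HashSet, and it only shows why the membership check runs once per document matched by the main query rather than once per ID in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical JDK-only model of a post filter; this is NOT Solr's API.
final class PostFilterSketch {

    /** Wraps a downstream collector and forwards only documents whose external ID is allowed. */
    static final class IdFilteringCollector implements Consumer<String> {
        private final Set<String> allowedIds;
        private final Consumer<String> delegate;

        IdFilteringCollector(Set<String> allowedIds, Consumer<String> delegate) {
            this.allowedIds = allowedIds;
            this.delegate = delegate;
        }

        @Override
        public void accept(String externalId) {
            // One hash lookup per document that already matched the main query.
            if (allowedIds.contains(externalId)) {
                delegate.accept(externalId);
            }
        }
    }

    /** Simulates collection: every main-query match is offered to the filtering collector. */
    static List<String> filterMatches(List<String> mainQueryMatches, Set<String> userIds) {
        List<String> kept = new ArrayList<>();
        IdFilteringCollector collector = new IdFilteringCollector(userIds, kept::add);
        for (String id : mainQueryMatches) {
            collector.accept(id);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample data: IDs the user has read, and IDs matched by the main query.
        Set<String> readByUser = new HashSet<>(Arrays.asList("a1", "a7", "a9"));
        List<String> matches = Arrays.asList("a1", "a2", "a7", "a8");
        System.out.println(filterMatches(matches, readByUser)); // prints [a1, a7]
    }
}
```

The cost of this approach grows with the number of documents the main query matches, not with the size of the ID list, which is why it tends to pay off when result sets are small relative to the index. In a real Solr PostFilter the collector sees Lucene internal docids, so the external ID would still have to be resolved per document; that lookup is left out of this sketch.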
