Hi Mike,

No, my problem is that the field article_outlinks is multivalued, so it also contains URLs unrelated to my search. I would like to facet only on the URLs that match my query.

For example (shown on a single document, but my search targets over 1M docs):

Doc1:
article_url:
url1.com/1
url2.com/2
url1.com/1
url1.com/3

My query is article_url:url1.com* and I facet by article_url. I want it to give me:
url1.com/1 (2)
url1.com/3 (1)

But right now, because url2.com/2 sits in the same multivalued field as the matching URLs, I get this:
url1.com/1 (2)
url1.com/3 (1)
url2.com/2 (1)
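
The behavior described above can be reproduced with a small Python sketch. It assumes that field faceting counts, for each value, the number of matching documents that contain it, and that a document matches when any one of its values matches the wildcard. The two-document data set and the facet_counts helper are hypothetical, chosen only to reproduce the counts shown above; this is not Solr code.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical documents, each with a multivalued list of URLs.
docs = {
    "Doc1": ["url1.com/1", "url2.com/2", "url1.com/3"],
    "Doc2": ["url1.com/1"],
}

def facet_counts(docs, pattern):
    """Count, per value, the matching documents that contain it."""
    counts = Counter()
    for values in docs.values():
        # A document matches if ANY of its values matches the wildcard.
        if any(fnmatchcase(v, pattern) for v in values):
            # Every value of a matching document is counted,
            # whether or not that value itself matches the pattern.
            counts.update(set(values))
    return counts

counts = facet_counts(docs, "url1.com*")
for url, n in sorted(counts.items()):
    print(f"{url} ({n})")
```

Under these assumptions url2.com/2 is counted because Doc1 matched the query through its other values.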

I can use facet.prefix to filter, but it's not very flexible when my URLs contain a subdomain, since facet.prefix doesn't support wildcards.
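
One possible workaround, sketched here in Python, is to request the facets unfiltered and drop the unwanted terms on the client with a regular expression, which can cover subdomains. The sketch assumes the facet values arrive as a flat [term, count, term, count, ...] list, as in Solr's JSON facet_fields output; the filter_facets helper and the sample data are hypothetical.

```python
import re

def filter_facets(flat, pattern):
    """Keep (term, count) pairs whose term matches the regular expression."""
    rx = re.compile(pattern)
    # Pair up the flat list: even positions are terms, odd positions counts.
    pairs = zip(flat[0::2], flat[1::2])
    return [(term, count) for term, count in pairs if rx.search(term)]

# Sample facet list using the URLs from the example above.
flat = ["url1.com/1", 2, "url1.com/3", 1, "url2.com/2", 1]
wanted = filter_facets(flat, r"^url1\.com")

# A pattern that accepts any subdomain, e.g. en. or fr.wikipedia.org.
wiki = filter_facets(
    ["http://fr.wikipedia.org/a", 3, "http://example.org/b", 5],
    r"^https?://([a-z0-9-]+\.)*wikipedia\.org",
)
```

This only works if every candidate term is actually returned, so the request would probably need facet.limit=-1, which may be expensive on a field with many distinct URLs.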

Thank you,

Olivier

Mike Topper wrote:
Hi Olivier,

Are the facet counts on the URLs you don't want 0?

If so, you can use facet.mincount to return only the facets with a count greater than 0.
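
As a rough illustration, the request parameters for this suggestion could be assembled in Python as follows. The host, port and /solr/select path are assumptions about a default local install, and the query is the one from the original message. Note that facet.mincount=1 only hides terms whose count is 0; it does not remove terms that have a positive count.

```python
from urllib.parse import urlencode, parse_qs

params = {
    "q": "article_outlinks:http*wikipedia.org*",
    "rows": 0,                       # only the facets are needed, not the docs
    "facet": "true",
    "facet.field": "article_outlinks",
    "facet.mincount": 1,             # skip terms with a count of 0
}
query_string = urlencode(params)

# Assumed base URL of a local Solr instance.
url = "http://localhost:8983/solr/select?" + query_string
print(url)
```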

-Mike

Olivier H. Beauchesne wrote:
Hi,

Long time lurker, first time poster.

I have a multi-valued field, let's call it article_outlinks, containing
all outgoing URLs from a document. I want to get all matching URLs
sorted by count.

For example, I want to get all outgoing Wikipedia URLs in my documents
sorted by count.

So I execute a query like this:
q=article_outlinks:http*wikipedia.org*  and I facet on article_outlinks

But I get facets containing the other URLs in the documents. I can get
something close by using facet.prefix=http://en.wikipedia.org but I
want to include other Wikipedia subdomains (e.g. fr.wikipedia.org).

Is there a way to run a search and get only the facets that match my query?

I know facet.prefix isn't a query, but is there a way to get that
behavior?

Is it easy to extend Solr to do something like that?

Thank you,

Olivier

Sorry for my English.


