Yeah, but then I would have to retrieve *a lot* of facets. I think for now I'll retrieve all the subdomains with facet.prefix and then merge the results of those queries. Not ideal, but when I have more motivation, I will submit a patch to Solr :-)
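
A rough sketch of the merge step described above, in Python. It assumes each facet.prefix query was sent with wt=json, so that each response holds the facet values as a flat [term, count, term, count, ...] list under facet_counts/facet_fields. The function names and the sample data are made up for illustration; this is not part of Solr.

```python
def pairs(flat):
    """Turn Solr's flat [term, count, term, count, ...] list into (term, count) pairs."""
    return zip(flat[0::2], flat[1::2])


def merge_prefix_facets(flat_lists):
    """Merge the facet lists returned by several facet.prefix queries.

    Every query uses the same q, so a term that shows up under two
    overlapping prefixes carries the same count in both responses.
    Counts are therefore overwritten, not summed.
    """
    merged = {}
    for flat in flat_lists:
        for term, count in pairs(flat):
            merged[term] = count
    # Highest count first, ties broken alphabetically.
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical facet lists, one per subdomain prefix.
en_facets = ["http://en.wikipedia.org/wiki/A", 3, "http://en.wikipedia.org/wiki/B", 1]
fr_facets = ["http://fr.wikipedia.org/wiki/A", 2]
merged = merge_prefix_facets([en_facets, fr_facets])
```

The drawback stays the same: one request per subdomain, and the list of subdomains has to be known in advance.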

Michael wrote:
You could post-process the response and remove urls that don't match your
domain pattern.
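
A minimal sketch of that post-processing, in Python, assuming the facet values arrive as the flat [term, count, ...] list that Solr returns with wt=json. The regular expression and the helper name are illustrative assumptions. For the result to be complete, the request would probably also need a large facet.limit (or facet.limit=-1), which means pulling a lot of facets over the wire.

```python
import re

# Assumed pattern: any subdomain of wikipedia.org, over http or https.
WIKIPEDIA = re.compile(
    r"^https?://(?:[a-z0-9-]+\.)*wikipedia\.org(?:[/:?#]|$)",
    re.IGNORECASE,
)


def filter_facets(flat, pattern):
    """Keep only the (term, count) pairs whose term matches the pattern."""
    terms = flat[0::2]
    counts = flat[1::2]
    return [(t, c) for t, c in zip(terms, counts) if pattern.search(t)]


# Hypothetical facet list as returned for facet.field=article_outlinks.
raw = [
    "http://en.wikipedia.org/wiki/A", 5,
    "http://example.com/x", 4,
    "http://fr.wikipedia.org/wiki/B", 2,
    "http://notwikipedia.org/", 1,
]
kept = filter_facets(raw, WIKIPEDIA)
```

Solr already sorts facets by count by default, so the filtered list keeps that order.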

On Mon, Aug 31, 2009 at 9:45 AM, Olivier H. Beauchesne <oliv...@olihb.com> wrote:

Hi Mike,

No, my problem is that the field article_outlinks is multivalued, so it
contains several URLs not related to my search. I would like to facet only
on the URLs matching my query.

For example (only on one document, but my search targets over 1M docs):

Doc1:
article_url:
url1.com/1
url2.com/2
url1.com/1
url1.com/3

And my query is: article_url:url1.com* and I facet by article_url and I
want it to give me:
url1.com/1 (2)
url1.com/3 (1)

But right now, because url2.com/2 is contained in a multivalued field with
the matching urls, I get this:
url1.com/1 (2)
url1.com/3 (1)
url2.com/2 (1)

I can use facet.prefix to filter, but it's not very flexible if my URL
contains a subdomain, as facet.prefix doesn't support wildcards.

Thank you,

Olivier

Mike Topper wrote:

Hi Olivier,
Are the facet counts on the URLs you don't want 0?

If so, you can use facet.mincount to only return results greater than 0.
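
For reference, a small Python sketch of the request parameters this suggestion implies. The query and field name come from the original question below; the helper name is made up, and rows=0 is an added assumption for the case where only the facet counts are needed.

```python
from urllib.parse import urlencode


def facet_params(query, field, mincount=1):
    """Build a Solr query string that facets on one field and hides zero counts."""
    return urlencode({
        "q": query,
        "rows": 0,                   # assumption: only facet counts are wanted
        "facet": "true",
        "facet.field": field,
        "facet.mincount": mincount,  # drop facet values with a count below this
        "wt": "json",
    })


qs = facet_params("article_outlinks:http*wikipedia.org*", "article_outlinks")
```

This only removes values whose count is 0. It does not help when the unwanted URLs have non-zero counts because they share a document with a matching URL.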

-Mike

Olivier H. Beauchesne wrote:


Hi,

Long time lurker, first time poster.

I have a multi-valued field, let's call it article_outlinks containing
all outgoing urls from a document. I want to get all matching urls
sorted by counts.

For example, I want to get all outgoing Wikipedia URLs in my documents,
sorted by count.

So I execute a query like this:
q=article_outlinks:http*wikipedia.org*  and I facet on article_outlinks

But I get facets containing the other urls in the documents. I can get
something close by using facet.prefix=http://en.wikipedia.org but I
want to include other subdomains on wikipedia (ex: fr.wikipedia.org).

Is there a way to do a search and get only the facets matching my query?

I know facet.prefix isn't a query, but is there a way to get that
behavior?

Is it easy to extend Solr to do something like that?

Thank you,

Olivier

Sorry for my English.





