Re: inconsistent results when faceting on multivalued field

2011-10-23 Thread Erick Erickson
I think the key here is that you are a bit confused about what
multiValued is all about. The fq clause says, essentially,
restrict my search results to the documents where 1213206
occurs in sou_codeMetier.
That's *all* the fq clause does.

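Now, by saying facet.field=sou_codeMetier you're asking Solr
to count, among the documents that passed the fq, how many
contain each unique value in that field. Each bucket is a
unique value in the field, so a single document with several
values is counted once in each of its buckets. That is why
values other than 1213206 show up in the facet list.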

On the other hand, by saying
facet.query=sou_codeMetier:[1213206 TO 1213206]
you're asking Solr to count all the documents
that make it through your query (*:* in this case) with
*any* value in the indicated range.

Facet queries really have nothing to do with filter queries. That is,
facet queries in no way restrict the documents that are returned;
they just indicate ways of counting documents into buckets.
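
To make the distinction concrete, here is a minimal sketch of the three operations on a toy in-memory model. It is not Solr code: the documents and the helper names (apply_fq, facet_field, facet_query) are made up for illustration, and the range check uses plain string comparison as a stand-in for a range on a string field.

```python
from collections import Counter

# Toy documents with a multivalued field (hypothetical data, not from the index).
docs = [
    {"id": 1, "sou_codeMetier": ["1213206", "1212104"]},
    {"id": 2, "sou_codeMetier": ["1213206", "121320603"]},
    {"id": 3, "sou_codeMetier": ["1212104"]},
    {"id": 4, "sou_codeMetier": ["1213206"]},
]


def apply_fq(docs, field, value):
    """Filter query: keep only documents where `value` occurs in `field`."""
    return [d for d in docs if value in d.get(field, [])]


def facet_field(docs, field):
    """facet.field: one bucket per unique value; a document is counted
    once in every bucket whose value it contains."""
    counts = Counter()
    for d in docs:
        counts.update(set(d.get(field, [])))
    return dict(counts)


def facet_query(docs, field, low, high):
    """facet.query on a range: count documents with any value in [low, high]."""
    return sum(1 for d in docs if any(low <= v <= high for v in d.get(field, [])))


# fq restricts the result set; facet.field then counts every value found in it.
filtered = apply_fq(docs, "sou_codeMetier", "1213206")
field_counts = facet_field(filtered, "sou_codeMetier")

# facet.query counts a single bucket over the unfiltered result set (q=*:*).
query_count = facet_query(docs, "sou_codeMetier", "1213206", "1213206")

print(field_counts)
print(query_count)
```

In this toy data the fq keeps three documents, the field facet still reports buckets for 1212104 and 121320603 because two of those documents carry them, and the facet query reports a single count of 3.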

Best
Erick


Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread pravesh
Could you clarify the statement below?
"When I make a search on facet.qua_code=1234567"

Are you saying that you fire a fresh search for a facet item, like
q=qua_code:1234567?

This would fetch documents whose qua_code field contains either the term
1234567 alone or 1234567 together with other terms (9384738 and others).
Because the field is multivalued, the facet then shows counts for all of
those terms.

"If I reword the query as 'facet.query=qua_code:1234567 TO 1234567', I only
get the expected counts"

You will get a facet count only for documents that have the term 1234567
(facet.query determines which facet is picked/shown).

Regds
Pravesh



--
View this message in context: 
http://lucene.472066.n3.nabble.com/inconsistent-results-when-faceting-on-multivalued-field-tp3438991p3440128.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread Alain Rogister
Pravesh,

Not exactly. Here is the search I do, in more detail (different field name,
but same issue).

I want to get a count for a specific value of the sou_codeMetier field,
which is multivalued. I expressed this by including an fq clause:

/select/?q=*:*&facet=true&facet.field=sou_codeMetier&fq=sou_codeMetier:1213206&rows=0

The response (excerpt only):

<lst name="facet_fields">
  <lst name="sou_codeMetier">
    <int name="1213206">1281</int>
    <int name="1212104">476</int>
    <int name="121320603">285</int>
    <int name="1213101">260</int>
    <int name="121320602">208</int>
    <int name="121320605">171</int>
    <int name="1212201">152</int>
    ...

As you see, I get back both the expected results and extra results I would
expect to be filtered out by the fq clause.

I can eliminate the extra results with a
'f.sou_codeMetier.facet.prefix=1213206' clause.

But I wonder if Solr's behavior is correct and how the fq filtering works
exactly.

If I replace the facet.field clause with a facet.query clause, like this:

/select/?q=*:*&facet=true&facet.query=sou_codeMetier:[1213206 TO 1213206]&rows=0

The results contain a single item:

<lst name="facet_queries">
  <int name="sou_codeMetier:[1213206 TO 1213206]">1281</int>
</lst>

The 'fq=sou_codeMetier:1213206' clause isn't necessary here and does not
affect the results.
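
For anyone reproducing these requests, here is a small sketch that assembles the query strings with Python's standard library. The `&` separators are assumed (they are what a Solr request normally uses), the `/select/` path is taken from the requests above, and the helper name select_url is hypothetical. Host and port are left out on purpose.

```python
from urllib.parse import urlencode


def select_url(params):
    """Build a relative /select/ URL from (name, value) pairs.

    ':', '*', '[' and ']' are left unescaped so the result stays readable;
    spaces become '+'.
    """
    return "/select/?" + urlencode(params, safe=":*[]")


# Variant 1: field facet restricted by a filter query.
facet_field_url = select_url([
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "sou_codeMetier"),
    ("fq", "sou_codeMetier:1213206"),
    ("rows", "0"),
])

# Variant 2: same request, trimming the facet list with a per-field prefix.
facet_prefix_url = select_url([
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "sou_codeMetier"),
    ("f.sou_codeMetier.facet.prefix", "1213206"),
    ("fq", "sou_codeMetier:1213206"),
    ("rows", "0"),
])

# Variant 3: a single facet query, no filter query.
facet_query_url = select_url([
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.query", "sou_codeMetier:[1213206 TO 1213206]"),
    ("rows", "0"),
])

for url in (facet_field_url, facet_prefix_url, facet_query_url):
    print(url)
```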

Thanks,

Alain




Re: inconsistent results when faceting on multivalued field

2011-10-21 Thread Darren Govoni

My interpretation of your results is that your FQ found 1281 documents
with the value 1213206 in the sou_codeMetier field. Of those results, 476
also had 1212104 as a value, and so on. Since ALL the results have
the field value in your FQ, I would expect every other value to occur
no more often than that in the result set, which appears to be the case.
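
A quick sanity check of that expectation against the counts in the excerpt. This is only a sketch: the split of each entry into value and count is my reading of the archived response (a 7- or 9-digit value followed by its count), and the variable names are made up.

```python
# Facet counts as read from the response excerpt (value -> document count).
facet_counts = {
    "1213206": 1281,
    "1212104": 476,
    "121320603": 285,
    "1213101": 260,
    "121320602": 208,
    "121320605": 171,
    "1212201": 152,
}

fq_value = "1213206"
num_found = facet_counts[fq_value]  # every document in the result set has this value

# No other bucket can exceed the size of the filtered result set.
violations = {v: c for v, c in facet_counts.items() if c > num_found}
print(violations or "all buckets are within the result set size")
```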







inconsistent results when faceting on multivalued field

2011-10-20 Thread Alain Rogister
I am surprised by the results I am getting from a search in a Solr 3.4
index.

My schema has a multivalued field of type 'string' :

<field name="qua_code" type="string" multiValued="true" indexed="true"
stored="true"/>

The field values are 7-digit or 9-digit integer numbers; this corresponds to
a hierarchy. I could have used a numeric type instead of string but no
numerical operations are performed against the values.

Now, each document contains 0-N values for this field, such as:

8625774
1234567
123456701
123456702
123456703
9384738

When I make a search on facet.qua_code=1234567 , I am getting the counts I
expect (seemingly correct) + a large number of counts for *other* field
values (e.g. 9384738).

If I reword the query as 'facet.query=qua_code:1234567 TO 1234567', I only
get the expected counts.

I can also filter out the extraneous results with a facet.prefix clause.
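
To illustrate what I mean by the prefix filter, here is a toy sketch (plain Python, not Solr) using the sample values listed above. The grouping of values into documents is invented, and facet_counts is a hypothetical helper, not a Solr API.

```python
from collections import Counter

# Hypothetical documents; each list is one document's qua_code values.
docs = [
    ["8625774", "1234567", "123456701"],
    ["1234567", "123456702", "9384738"],
    ["1234567", "123456703"],
    ["9384738"],
]


def facet_counts(docs, prefix=""):
    """Count documents per unique value, keeping only values that start
    with `prefix` (an empty prefix keeps everything)."""
    counts = Counter()
    for values in docs:
        for v in set(values):
            if v.startswith(prefix):
                counts[v] += 1
    return dict(counts)


# Restrict to documents containing 1234567, then facet with and without a prefix.
matching = [values for values in docs if "1234567" in values]
all_buckets = facet_counts(matching)
prefixed = facet_counts(matching, prefix="1234567")

print(all_buckets)
print(prefixed)
```

Note that the prefix keeps the 9-digit children (123456701 and so on) as well as 1234567 itself, since they share the same leading digits.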

Should I file an issue, or am I misunderstanding something about faceting on
multivalued fields?

Thanks.