Re: AW: AW: FacetField-Result on String-Field contains value with count 0?
On 1/13/2017 7:36 AM, Sebastian Riemer wrote:
> Thanks, that's actually where I come from. But I don't want to exclude values
> leading to a count of zero.
>
> Background to this: A user searched for mediaType "book" which gave him 10
> results. Now some other task/routine whatever changes all those 10 books to
> be say 10 ebooks, because the type has been incorrect. The user makes a
> refresh, still looking for "book" gets 0 results (which is expected) and
> because we rule out facet.fields having count 0, I don't get back the
> selected mediaType "book" and thus I cannot select this value in the
> select-dropdown-filter for the mediaType. This leads to confusion for the
> user, since he has no results, but doesn't see that it's because he still
> has that mediaType-filter set to a value "book" which now actually leads to
> 0 results.

Some users are always going to be confused in one way or another when something behaves in a way that's contrary to their expectations. If you plan your interface correctly, you can eliminate the biggest sources of confusion ... but there's an applicable saying here: You can never make things idiot-proof. There's always a better idiot.

The facet.mincount parameter is the way to deal with this problem, as Bill Bell already mentioned. One of the reasons that facet.mincount exists is to remove terms that have no documents, but still exist in the index.

If the q parameter was an actual query instead of "all docs" and the request didn't have facet.mincount, then the facet for that field would still have thirteen entries, many of which might be zero.

Thanks,
Shawn
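A minimal sketch of the request described above: the facet query from the original post with facet.mincount=1 added, so that terms without matching documents are left out. The helper name build_facet_url is made up for this illustration; the host, core and field name are the ones used in this thread. The code only builds the URL and does not contact a Solr server.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, field, mincount=1, query="*:*"):
    """Build a Solr select URL that facets on `field` (hypothetical helper).

    facet.mincount tells Solr to omit facet values whose count is
    below the given threshold.
    """
    params = [
        ("q", query),
        ("rows", 0),
        ("facet", "on"),
        ("facet.field", field),
        ("facet.mincount", mincount),
        ("wt", "json"),
    ]
    # urlencode percent-encodes the query string, e.g. "*:*" becomes "%2A%3A%2A"
    return base_url + "?" + urlencode(params)


url = build_facet_url("http://localhost:8983/solr/wemi/select", "m_mediaType_s")
print(url)
```

With mincount=0 the same helper would reproduce the behaviour discussed in this thread, where the term "1" is listed with a count of 0.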
AW: AW: FacetField-Result on String-Field contains value with count 0?
Thanks @Toke, for pointing out these options. I'll have a read about expungeDeletes.

This makes it sound even more like having Solr filter out 0-counts is a good idea, and that I should handle my use case outside of Solr.

Thanks again,
Sebastian
AW: FacetField-Result on String-Field contains value with count 0?
Nice, thank you very much for your explanation!

> Solr returns as facet results all values that were present at some time, as
> long as the documents are still somewhere in the index, even when they're
> marked as deleted. So there must have been a document with m_mediaType_s=1.
> Even if all these documents are deleted already, its values still appear in
> the facet result.

I did not know about that! That makes perfect sense. I am quite sure there was a time when that field contained the value "1". What's more, now that I have rebuilt my index, the value "1" no longer shows up in the facet.field result.

I'll think about how to deal with my situation then. Maybe it would be better to keep Solr filtering out 0-count facet values and to insert the filter query leading to 0 results into the select dropdown "manually".
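One way the "manual" approach mentioned above might look on the client side, as a rough sketch only: drop the 0-count facet values, then re-add the currently selected filter value if it is missing, so the user can still see which filter is active. The function name dropdown_options and the sample values ("ebook", "dvd", "book") are invented for illustration and do not come from the real index.

```python
def dropdown_options(facet_pairs, selected=None):
    """Return (value, count) options for a filter dropdown (hypothetical helper).

    Zero-count facet values are dropped, but the currently selected
    filter value is always kept, with a count of 0 if it no longer
    matches any document.
    """
    options = [(value, count) for value, count in facet_pairs if count > 0]
    if selected is not None and all(value != selected for value, _ in options):
        options.append((selected, 0))
    return options


# Made-up data: all former "book" documents were changed to "ebook".
facets = [("ebook", 10), ("dvd", 3)]
print(dropdown_options(facets, selected="book"))
# [('ebook', 10), ('dvd', 3), ('book', 0)]
```

The same function works whether Solr filters the zero counts itself (facet.mincount=1) or returns them, since it filters again on the client.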
Re: AW: FacetField-Result on String-Field contains value with count 0?
On Fri, 2017-01-13 at 14:19 +, Sebastian Riemer wrote:
> the second search should have been this:
> http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&rows=0&start=0&wt=json
> (or in other words, give me all documents having value "1" for field
> "m_mediaType_s")
>
> Since this search gives zero results, why is it included in the
> facet.fields result-count list?

Qualified guess (I don't know the JSON faceting code in detail): The list of possible facet values is extracted from the DocValues structure in the segment files, without respect to documents marked as deleted. At some point you had one or more documents with m_mediaType_s:1, which were later deleted.

If your index is not too large, you can verify this by optimizing down to 1 segment, which will remove all traces of deleted documents (unless the index is already 1 segment).

If you cannot live with the false terms, committing with expungeDeletes=true should do the trick, although it is likely to make your indexing process a lot heavier.

The reason for this inaccuracy is that it is quite heavy to verify whether a docvalue is referenced by a document: Each time one or more documents in a segment are deleted, all references from all documents in that segment would have to be checked to create a correct mapping. As this only affects mincount=0 combined with your use case where _all_ documents with a certain docvalue are deleted, my guess is that it is seen as too much of an edge case to handle.

-- 
Toke Eskildsen, Royal Danish Library
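The two options above could be expressed as requests to the update handler roughly as follows. This is a sketch that assumes the core from this thread (wemi) exposes the usual /update handler; the helper name update_url is made up. The code only builds the URLs and does not send them, since both operations can be expensive on a large index.

```python
from urllib.parse import urlencode


def update_url(core_url, **params):
    """Build a URL for the core's /update handler (hypothetical helper)."""
    return core_url + "/update?" + urlencode(params)


core = "http://localhost:8983/solr/wemi"

# Commit and merge away segments that contain deleted documents.
expunge = update_url(core, commit="true", expungeDeletes="true")

# Optimize (force merge) the index down to a single segment.
optimize = update_url(core, optimize="true", maxSegments=1)

print(expunge)
print(optimize)
```

After either request has completed, repeating the facet query should show whether the term "1" was only left over from deleted documents.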
Re: FacetField-Result on String-Field contains value with count 0?
Then I don't understand your problem. Solr already does exactly what you want.

Maybe the problem is different: I assume you believe that there never was a value of "1" in the index, which leads to your confusion.

Solr returns as facet results all values that were present at some time, as long as the documents are still somewhere in the index, even when they're marked as deleted. So there must have been a document with m_mediaType_s=1. Even if all these documents are deleted already, its values still appear in the facet result.

This holds true until segments get merged so that all deleted documents are pruned. So if you send a forceMerge request, chances are good that "1" won't come up any more.

-Michael

On 13.01.2017 at 15:36, Sebastian Riemer wrote:
> Hi Bill,
>
> Thanks, that's actually where I come from. But I don't want to exclude values
> leading to a count of zero.
AW: AW: FacetField-Result on String-Field contains value with count 0?
Hi Bill,

Thanks, that's actually where I come from. But I don't want to exclude values leading to a count of zero.

Background to this: A user searched for mediaType "book", which gave him 10 results. Now some other task or routine changes all those 10 books to, say, 10 ebooks, because the type was incorrect. The user refreshes, still filtering on "book", and gets 0 results (which is expected). Because we rule out facet.fields having count 0, I don't get back the selected mediaType "book" and thus cannot select this value in the select-dropdown filter for the mediaType. This leads to confusion for the user: he has no results, but doesn't see that it's because he still has that mediaType filter set to the value "book", which now actually leads to 0 results.
Re: AW: FacetField-Result on String-Field contains value with count 0?
Set facet.mincount to 1

Bill Bell
Sent from mobile
AW: FacetField-Result on String-Field contains value with count 0?
Pardon me,

the second search should have been this:

http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%221%22&indent=on&q=*:*&rows=0&start=0&wt=json

(or in other words, give me all documents having value "1" for field "m_mediaType_s")

Since this search gives zero results, why is the value "1" included in the facet.fields result-count list?
FacetField-Result on String-Field contains value with count 0?
Hi,

Please help me understand:

http://localhost:8983/solr/wemi/select?facet.field=m_mediaType_s&facet=on&indent=on&q=*:*&wt=json

returns:

"facet_counts":{
  "facet_queries":{},
  "facet_fields":{
    "m_mediaType_s":[
      "2",25561,
      "3",19027,
      "10",1966,
      "11",1705,
      "12",1067,
      "4",1056,
      "5",291,
      "8",68,
      "13",2,
      "6",2,
      "7",1,
      "9",1,
      "1",0]},
  "facet_ranges":{},
  "facet_intervals":{},
  "facet_heatmaps":{}}}

http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%222%22&indent=on&q=*:*&rows=0&start=0&wt=json

=> "response":{"numFound":25561,"start":0,"docs":[]

http://localhost:8983/solr/wemi/select?fq=m_mediaType_s:%220%22&indent=on&q=*:*&rows=0&start=0&wt=json

=> "response":{"numFound":0,"start":0,"docs":[]

So why does the search for facet.field even contain the value "1", if it does not exist?

And why does it e.g. not contain
"SomeReallyCrazyOtherValueWhichLikeValue"1"DoesNotExistButLetsIncludeItInTheFacetFieldsResultListAnywaysWithCountZero" : 0

Best regards,
Sebastian

Additional info, field m_mediaType_s is a string;
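For readers unfamiliar with the response layout: in the output above, facet_fields holds each field as one flat list that alternates value and count. A small sketch of how such a list could be turned into pairs and checked for zero counts follows. The function name facet_pairs is made up, and the JSON string is the facet_fields part of the response quoted above wrapped in an outer object so that it parses on its own.

```python
import json

# facet_fields section of the response from this post, wrapped so it parses.
raw = """
{"facet_counts": {"facet_fields": {"m_mediaType_s": [
    "2", 25561, "3", 19027, "10", 1966, "11", 1705, "12", 1067,
    "4", 1056, "5", 291, "8", 68, "13", 2, "6", 2, "7", 1, "9", 1,
    "1", 0]}}}
"""


def facet_pairs(response, field):
    """Turn the flat [value, count, value, count, ...] list into pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


pairs = facet_pairs(json.loads(raw), "m_mediaType_s")
zero_count_values = [value for value, count in pairs if count == 0]

print(len(pairs), zero_count_values)  # 13 ['1']
```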