Over the weekend I ran more tests with the Lucene/Solr 4.0 version deployed in March and with the latest Lucene/Solr 4.0 beta version.
My findings:

- The version deployed in March doesn't show the error I now come across in 4.0 beta (the number of documents reported in the facet counts differs from the actual number of documents returned by a subsequent drill-down request that uses a filter query). This holds even after a lot of updates have been applied to the index.
  At the moment this can be seen at http://sb-tp1.swissbib.unibas.ch/ (e.g. with the term 'mitbestimmung' and the facet value 'nebis', which I used for all my tests).
  As a note: because we have to migrate the OS of our servers, the host might be down for one or two days in the course of the current week.

- Using the latest Lucene/Solr beta version, the error occurs once updates are committed against the index, as I described in my earlier messages. When the index is freshly built, the error doesn't occur. (I ran these tests on a host that is not publicly accessible.)

From my point of view this is a severe bug in Lucene/Solr 4.0 beta, because filter queries are used very, very often! I would be very happy if someone from the Solr core team could comment on it.

Thanks a lot for your support!

Günter Hipler

2012/8/31 Günter Hipler <vogese...@gmail.com>

> Hi,
>
> thanks for your responses!
>
> I made a simpler query with only one facet and without any boosting,
> so it should be easier to focus on the problem:
>
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true
> ->
> facet=on&
> facet.mincount=1&
> facet.limit=100&
> rows=0&
> start=0&
> q=+(+%2Bmitbestimmung++)+&
> facet.field=navNetwork&
> qt=only_queryfields_edismax&
> debugQuery=true
>
> The facet counts report 2734 documents for nebis.
> parsedQuery
> (+(+DisjunctionMaxQuery((title_series:mitbestimmung |
> title_uniform:mitbestimmung | authorfull:mitbestimmung |
> callnum:mitbestimmung | sfulltext:mitbestimmung | title_short:mitbestimmung
> | sbranchlib:mitbestimmung | bibid:mitbestimmung |
> sfullTextRemoteData:mitbestimmung | title_long:mitbestimmung |
> autnum:mitbestimmung | subfull:mitbestimmung |
> publplace:mitbestimmung))))/no_coord
> parsedQuery_toString
> +(+(title_series:mitbestimmung | title_uniform:mitbestimmung |
> authorfull:mitbestimmung | callnum:mitbestimmung | sfulltext:mitbestimmung
> | title_short:mitbestimmung | sbranchlib:mitbestimmung |
> bibid:mitbestimmung | sfullTextRemoteData:mitbestimmung |
> title_long:mitbestimmung | autnum:mitbestimmung | subfull:mitbestimmung |
> publplace:mitbestimmung))
>
>
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!term+f%3DnavNetwork}nebis
> ->
> facet=on&facet.mincount=1&
> facet.limit=100&
> rows=0&
> start=0&
> q=+(+%2Bmitbestimmung++)+&
> facet.field=navNetwork&
> qt=only_queryfields_edismax&
> debugQuery=true&
> fq={!term+f%3DnavNetwork}nebis
>
> returns 2871 documents (not the number indicated in the base query).
> What is interesting:
> the facet count of the second query itself shows the 'correct' number
> indicated in the base query (2734).
>
> parsedQuery and parsedQuery_toString are the same as in the base query.
> @Jack: and it is
> exactly the same for a filter query with fq=navNetwork:nebis
> We are using the term query parser to avoid problems with escaping
> special characters (as is also described in the
> Solr Enterprise Search Server book on page 189).
>
>
> Using the alternatives suggested by Hoss
>
> http://sb-s7.swissbib.unibas.ch:8080/solr/collection1/select?facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+%28+%2Bmitbestimmung++%29+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!raw%20f=navNetwork}nebis
> and
>
> facet=on&facet.mincount=1&facet.limit=100&rows=0&start=0&q=+(+%2Bmitbestimmung++)+&facet.field=navNetwork&qt=only_queryfields_edismax&debugQuery=true&fq={!lucene}navNetwork:nebis
> doesn't change the result. The number of returned documents is still
> higher than the count shown for this facet value in the facet counts of
> the base query.
>
>
> The type we are using for navNetwork:
> <field name="navNetwork" type="stdID" multiValued="true" stored="false" />
> <!-- text field type for IDs of all sorts and colors, generic usage
> (20.03.2012/osc) -->
> <fieldType name="stdID" class="solr.TextField" sortMissingLast="true"
> omitNorms="true">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.PatternReplaceFilterFactory"
>       pattern="^(\([a-z]+\))vtls0"
>       replacement="$10"
>       replace="all"
>     />
>     <filter class="solr.PatternReplaceFilterFactory"
>       pattern="[^\w]+"
>       replacement=""
>       replace="all"
>     />
>     <filter class="solr.TrimFilterFactory" />
>     <filter class="solr.LengthFilterFactory" min="2" max="100" />
>   </analyzer>
> </fieldType>
>
>
> which in my opinion should be a common treatment for facet
> types
>
> The new requestHandler I'm using is quite simple (without the boosting
> and other settings used in the original one):
> <requestHandler default="true" name="only_queryfields_edismax"
> class="solr.SearchHandler">
>   <lst name="defaults">
>     <!-- use the extended dismax query parser -->
>     <str name="defType">edismax</str>
>     <str name="echoParams">explicit</str>
>     <str name="qf">
>       title_long title_short title_uniform title_series authorfull
>       publplace subfull sfulltext sfullTextRemoteData syear bibid
>       sbranchlib callnum autnum
>     </str>
>   </lst>
> </requestHandler>
>
>
> What I will try next, as soon as possible:
> - set up a new index with the Lucene 4.0 version from March
> (to be more exact: version 4.0-2012-03-09_11-29-20)
> to see what the results are, even with frequent updates
>
> - set up a 'new' index with Lucene 4.0 beta (without any updates) and test
> more thoroughly whether I get the same inconsistent results (as is
> currently the case after updating the index)
>
>
> Thanks a lot for your support!
>
> Günter
>
>
> 2012/8/30 Chris Hostetter <hossman_luc...@fucit.org>
>
>>
>> The "q" and "bq" params have changed slightly between your first query and
>> the query where you add the "fq" param ... because of how "bq" is
>> additively added to the main query, it's possible this difference may
>> account for the behavior you are seeing -- double check the debugQuery
>> output for your main query between the two requests to see if they match
>> up. Heck: you can try the second query w/o the "fq" and sanity check that
>> it still matches the same number of docs as the first query.
>>
>> If that's working fine, can you please give us more info about your
>> "navNetwork" field, how is it configured?
>>
>> if you could show us the debugQuery output and numFound for these simple
>> queries (no special requestHandler settings please) that would also be
>> helpful.
>>
>> /select?q={!raw f=navNetwork}nebis
>> /select?q={!term f=navNetwork}nebis
>> /select?q={!lucene}navNetwork:nebis
>>
>>
>> : My query against an index is (I left out some of the facet fields)
>> : f.navBranchlib.facet.limit=1000&
>> : facet=on&facet.mincount=1&
>> : facet.limit=100&
>> : bq=navBranchlib:A100^1000&
>> : bq=navBranchlib:UFSW^1000&
>> : start=0&q=+(+%2Bmitbestimmung++)+&
>> : facet.field=navNetwork&
>> : qt=sb-bbfull-01
>> : -> qt refers to an edismax query-parser
>> :
>> : I get a result for the navNetwork facets which looks like
>> :
>> : <lst name="navNetwork">
>> : <int name="ids">3810</int>
>> : <int name="nebis">2732</int>
>> : <int name="idsbb">1945</int>
>> : </lst>
>> :
>> : Using an fq parameter to drill down on the navNetwork facets
>> : facet=on&facet.mincount=1&
>> : facet.limit=100&
>> : q=(+(+%2Bmitbestimmung++)+)&
>> : facet.field=navNetwork&
>> : qt=sb-bbfull-01&
>> : fq={!term+f%3DnavNetwork}nebis
>> : delivers 2806 documents instead of the expected 2732
>> :
>> :
>> : A boolean query instead of the fq provides the correct result of 2732
>> : documents
>> : facet=on&facet.mincount=1&
>> : facet.limit=100&
>> : %2Bmitbestimmung+%2BnavNetwork:nebis&
>> : facet.field=navNetwork&
>> : qt=sb-bbfull-01&
>> :
>> :
>> :
>> : The behaviour is not consistent. Some of the facets provide the correct
>> : result, some do not.
>> : What I can't say for sure: the behaviour was correct (if I'm not wrong)
>> : once the whole index was newly created. After running
>> : some updates I got these results.
>> : The application reflecting this behaviour is available under:
>> : http://sb-tp1.swissbib.unibas.ch
>> :
>> : We have been using Lucene/Solr since the end of last year and have
>> : regularly deployed the various nightly builds.
>> : The last version in which this error(?) didn't appear is from March 2012.
>> : The application using it is available under
>> : http://baselbern.swissbib.ch
>> : The target "books and more" is using the Lucene 4.0 March version. The
>> : index is updated several times a day and uses the same
>> : filter queries as Lucene/Solr 4.0 beta and alpha.
>> :
>> : My question:
>> : - has something changed in the latest versions, or is this a bug?
>> :
>> : Günter Hipler
>> :
>>
>> -Hoss
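
For anyone who wants to repeat the comparison described in this thread, here is a minimal Python sketch of the check: take the facet count for one value from the base query and compare it with numFound of the drill-down request that adds the term filter query. It is only an illustration, not the code used for the tests above. The function names are hypothetical, it assumes the JSON response writer (wt=json) with the default flat facet list, and it leaves out the qt parameter used in the thread.

```python
from urllib.parse import urlencode


def base_params(term, facet_field):
    """Parameters for the base query: facet counts only, no rows."""
    return {
        "q": term,
        "rows": 0,
        "facet": "on",
        "facet.mincount": 1,
        "facet.limit": 100,
        "facet.field": facet_field,
        "wt": "json",  # assumption: JSON output instead of the default XML
    }


def drilldown_params(term, facet_field, facet_value):
    """Same query plus a {!term} filter query on the chosen facet value."""
    params = base_params(term, facet_field)
    params["fq"] = "{!term f=%s}%s" % (facet_field, facet_value)
    return params


def select_url(select_endpoint, params):
    """Build the request URL; urlencode takes care of escaping."""
    return select_endpoint + "?" + urlencode(params)


def facet_count(response, facet_field, facet_value):
    """Read one count from the flat facet list [value, count, value, count, ...]."""
    flat = response["facet_counts"]["facet_fields"][facet_field]
    counts = dict(zip(flat[0::2], flat[1::2]))
    return counts.get(facet_value, 0)


def counts_agree(base_response, drilldown_response, facet_field, facet_value):
    """True if the facet count equals numFound of the drill-down request."""
    expected = facet_count(base_response, facet_field, facet_value)
    found = drilldown_response["response"]["numFound"]
    return expected == found
```

To use it, fetch both URLs built by select_url (for example with urllib.request.urlopen and json.load) against your own /select endpoint and pass the two parsed responses to counts_agree. With the numbers reported above (2734 in the facet counts, 2871 documents returned) it would report a mismatch.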