[ https://issues.apache.org/jira/browse/SOLR-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658732#comment-13658732 ]
Lakshmi Venkataswamy commented on SOLR-4824:
--------------------------------------------

Not sure I understand. When I have 30 days of data I get 362,803 results. When I add another 11 days' worth of data, the same search returns 1,338 results. Even if there is a maximum limit, would I not see the results capped rather than dropping drastically?

> Fuzzy / Faceting results are changed after ingestion of documents past a certain number
> ----------------------------------------------------------------------------------------
>
> Key: SOLR-4824
> URL: https://issues.apache.org/jira/browse/SOLR-4824
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.2, 4.3
> Environment: Ubuntu 12.04 LTS 12.04.2
>              jre1.7.0_17
>              jboss-as-7.1.1.Final
> Reporter: Lakshmi Venkataswamy
>
> In upgrading from SOLR 3.6 to 4.2/4.3 and comparing results on fuzzy queries,
> I found that after a certain number of documents were ingested the fuzzy
> query had a drastically lower number of results. We have approximately 18,000
> documents per day, and after ingesting approximately 40 days of documents, the
> next incremental day of documents results in a lower number of results for a
> fuzzy search.
> The query:
> http://10.100.1.xx:8080/solr/corex/select?q=cc:worde~1&facet=on&facet.field=date&fl=date&facet.sort
> produces the following result before the threshold is crossed:
> <response><lst name="responseHeader">
> <int name="status">0</int><int name="QTime">2349</int><lst name="params">
> <str name="facet">on</str><str name="fl">date</str><str name="facet.sort"/>
> <str name="q">cc:worde~1</str><str name="facet.field">date</str></lst></lst>
> <result name="response" numFound="362803" start="0"></result>
> <lst name="facet_counts"><lst name="facet_queries"/>
> <lst name="facet_fields"><lst name="date">
> <int name="2012-12-31">2866</int>
> <int name="2013-01-01">11372</int>
> <int name="2013-01-02">11514</int>
> <int name="2013-01-03">12015</int>
> <int name="2013-01-04">11746</int>
> <int name="2013-01-05">10853</int>
> <int name="2013-01-06">11053</int>
> <int name="2013-01-07">11815</int>
> <int name="2013-01-08">11427</int>
> <int name="2013-01-09">11475</int>
> <int name="2013-01-10">11461</int>
> <int name="2013-01-11">12058</int>
> <int name="2013-01-12">11335</int>
> <int name="2013-01-13">12039</int>
> <int name="2013-01-14">12064</int>
> <int name="2013-01-15">12234</int>
> <int name="2013-01-16">12545</int>
> <int name="2013-01-17">11766</int>
> <int name="2013-01-18">12197</int>
> <int name="2013-01-19">11414</int>
> <int name="2013-01-20">11633</int>
> <int name="2013-01-21">12863</int>
> <int name="2013-01-22">12378</int>
> <int name="2013-01-23">11947</int>
> <int name="2013-01-24">11822</int>
> <int name="2013-01-25">11882</int>
> <int name="2013-01-26">10474</int>
> <int name="2013-01-27">11051</int>
> <int name="2013-01-28">11776</int>
> <int name="2013-01-29">11957</int>
> <int name="2013-01-30">11260</int>
> <int name="2013-01-31">8511</int>
> </lst></lst><lst name="facet_dates"/><lst name="facet_ranges"/></lst></response>
>
> Once the threshold of 40 days of ingested documents is crossed, the results
> drop as shown below for the same query:
> <response><lst name="responseHeader">
> <int name="status">0</int><int name="QTime">2</int><lst name="params">
> <str name="facet">on</str><str name="fl">date</str><str name="facet.sort"/>
> <str name="q">cc:worde~1</str><str name="facet.field">date</str></lst></lst>
> <result name="response" numFound="1338" start="0"></result>
> <lst name="facet_counts"><lst name="facet_queries"/>
> <lst name="facet_fields"><lst name="date">
> <int name="2012-12-31">0</int>
> <int name="2013-01-01">41</int>
> <int name="2013-01-02">21</int>
> <int name="2013-01-03">24</int>
> <int name="2013-01-04">19</int>
> <int name="2013-01-05">9</int>
> <int name="2013-01-06">11</int>
> <int name="2013-01-07">17</int>
> <int name="2013-01-08">14</int>
> <int name="2013-01-09">24</int>
> <int name="2013-01-10">43</int>
> <int name="2013-01-11">14</int>
> <int name="2013-01-12">52</int>
> <int name="2013-01-13">57</int>
> <int name="2013-01-14">25</int>
> <int name="2013-01-15">17</int>
> <int name="2013-01-16">34</int>
> <int name="2013-01-17">11</int>
> <int name="2013-01-18">16</int>
> <int name="2013-01-19">121</int>
> <int name="2013-01-20">33</int>
> <int name="2013-01-21">26</int>
> <int name="2013-01-22">59</int>
> <int name="2013-01-23">27</int>
> <int name="2013-01-24">10</int>
> <int name="2013-01-25">9</int>
> <int name="2013-01-26">6</int>
> <int name="2013-01-27">16</int>
> <int name="2013-01-28">11</int>
> <int name="2013-01-29">15</int>
> <int name="2013-01-30">21</int>
> <int name="2013-01-31">109</int>
> <int name="2013-02-01">11</int>
> <int name="2013-02-02">7</int>
> <int name="2013-02-03">10</int>
> <int name="2013-02-04">8</int>
> <int name="2013-02-05">13</int>
> <int name="2013-02-06">75</int>
> <int name="2013-02-07">77</int>
> <int name="2013-02-08">31</int>
> <int name="2013-02-09">35</int>
> <int name="2013-02-10">22</int>
> <int name="2013-02-11">18</int>
> <int name="2013-02-12">11</int>
> <int name="2013-02-13">68</int>
> <int name="2013-02-14">40</int>
> </lst></lst><lst name="facet_dates"/><lst name="facet_ranges"/></lst></response>
>
> I have also tested this with different months of data and have seen the same
> issue around the number of documents.
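For anyone comparing the two responses, it may help to check that the per-date facet counts add up to numFound in each case, which shows whether the drop is spread across every date or confined to some. Below is a minimal Python sketch of that check. The function name summarize_facets and the SAMPLE response are hypothetical (SAMPLE only mimics the shape of the responses quoted in the issue, with made-up counts). The sum is only expected to equal numFound if each document carries a single date value and the facet list is not truncated by facet.limit.

```python
import xml.etree.ElementTree as ET


def summarize_facets(xml_text, field="date"):
    """Return (numFound, {facet value: count}) from a Solr XML response."""
    root = ET.fromstring(xml_text)
    # <result name="response" numFound="..."> carries the total hit count.
    result = root.find("./result[@name='response']")
    num_found = int(result.get("numFound"))
    # Field facets live under facet_counts/facet_fields/<field>.
    facet_path = (
        "./lst[@name='facet_counts']"
        "/lst[@name='facet_fields']"
        f"/lst[@name='{field}']/int"
    )
    counts = {node.get("name"): int(node.text) for node in root.findall(facet_path)}
    return num_found, counts


# Hypothetical, shortened response; not data from the issue.
SAMPLE = """<response>
<lst name="responseHeader"><int name="status">0</int></lst>
<result name="response" numFound="62" start="0"></result>
<lst name="facet_counts"><lst name="facet_queries"/>
<lst name="facet_fields"><lst name="date">
<int name="2013-01-01">41</int>
<int name="2013-01-02">21</int>
</lst></lst></lst>
</response>"""

num_found, counts = summarize_facets(SAMPLE)
print(num_found, sum(counts.values()))
```

Running the same function over the full XML saved from each query would give a per-date comparison of the before and after counts.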