Great to hear you’ve made some progress on this issue - hopefully the database challenges are easy enough to solve!
-- Pat

> On 2 Dec. 2016, at 12:00 am, Simon <simon.wilkin...@gmail.com> wrote:
>
> The value for Status.active.id is the same as it has been. I also double
> checked this and older entries included in the index have the same status_id
> as the newer entries not being included.
>
> The configuration file is being updated when I run a full ts:index or
> ts:rebuild.
>
> The sql_query value is currently this:
> sql_query = SELECT SQL_NO_CACHE `entries`.`id` * 1 + 0 AS `id`,
> `entries`.`title` AS `title`, `entries`.`body` AS `body`, `entries`.`id` AS
> `sphinx_internal_id`, 3940594292 AS `class_crc`, 0 AS `sphinx_deleted`,
> `entries`.`journal_id` AS `journal_id`,
> UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`,
> UNIX_TIMESTAMP(`entries`.`opened_at`) AS `opened_at`, `entries`.`status_id`
> AS `status_id` FROM `entries` WHERE `entries`.`id` >= $start AND
> `entries`.`id` <= $end AND status_id = 1 GROUP BY `entries`.`id` ORDER BY
> NULL
>
> If I change that to simply do a 'SELECT COUNT(*) FROM `entries` WHERE
> status_id = 1 GROUP BY `entries`.`id` ORDER BY NULL' the count I get back is
> 19,940,635. Which actually matches the number of records that I'm seeing on
> the ts:reindex call.
>
> The DB I run the indexing and searching on is actually a slave DB, and when I
> run the same query on the master I get back a much larger number
> (20,527,906). So it seems the issue may not be with Sphinx at all, but with
> my master-slave setup (which is even more confusing because the slave status
> shows as 0 seconds behind the master). Anyways, looks like I need to turn my
> attention to non-sphinx issues.
>
> Thanks again Pat for taking the time to respond, and for getting me thinking
> about some different possibilities. It's greatly appreciated!
>
> Cheers,
> Simon
>
>
> On Wednesday, 30 November 2016 07:47:18 UTC-5, Pat Allan wrote:
> Hmm, this is certainly an odd one!
> Is it possible the Status.active.id value has changed?
>
> The ts:reindex task *only* reindexes the data. ts:index, however, will both
> regenerate the configuration file and reindex the data. Given you’ve been
> running the former, that would explain why the delta indices are still
> present in the generated configuration file. That said, running ts:rebuild
> should regenerate the configuration file as well, so I’m wondering if that
> file isn’t being updated for some reason? So, that’s where my next focus for
> debugging on the server would be…
>
> … and if it *is* regenerating correctly, and the deltas are now removed, the
> next question is: is the generated sql_query value for the source correct?
> Can you take that query and modify it to use COUNT(*) and confirm how many
> records it matches against?
>
> Also, just for reference: which version of the Thinking Sphinx gem are you
> using?
>
>> On 30 Nov. 2016, at 11:41 pm, Simon <simon.w...@gmail.com> wrote:
>>
>> Hi Pat,
>>
>> Thanks so much for the response! The number of records actually does not
>> appear to fully match, as a count on active entries returns 20,515,798.
>> Also, after having my normal cron job that re-indexes run this morning, I
>> noticed that the number of records collected is the exact same as before
>> (19,940,635). Is there a limit to the number of records Sphinx can handle,
>> or any other common scenarios that could be preventing new entries from
>> getting included in the indexing?
>>
>> As for the delta indexes, we actually removed these a while ago from the
>> index definition as they were causing some headaches. Our configuration
>> file still includes the delta block, but this has never seemed to be an
>> issue in indexing.
>> I could remove the delta info from the config file
>> (something I've actually been meaning to do), but I didn't want to introduce
>> more variables into what might have changed while trying to troubleshoot
>> this issue.
>>
>> Here is the search call, even though the record counts don't match, just in
>> case it is helpful at all in continuing to try and figure this out:
>>
>> filters = {
>>   :journal_id => journal_ids,
>>   :status_id => Status.active.id
>> }
>> # Check to see if we are ordering in a specific way
>> params[:order] ||= '@relevance DESC'
>> case params[:order]
>> when 'cad'
>>   order = 'created_at DESC'
>> when 'ca'
>>   order = 'created_at ASC'
>> else
>>   order = '@relevance DESC'
>> end
>> entries = Entry.search params[:criteria], :with => filters,
>>   :sort_mode => :extended, :order => order
>>
>> Thanks again,
>> Simon
>>
>> On Wednesday, 30 November 2016 07:29:36 UTC-5, Pat Allan wrote:
>> Hi Simon,
>>
>> I guess the first place I’d start is by verifying the number of records
>> you’re expecting Sphinx to index. The log you shared says 19,940,635 - does
>> that match Entry.count(:conditions => {:status_id => Status.active.id})?
>>
>> Also: the indexing output suggests there’s a delta index, but that’s not in
>> the index definition - removed for brevity?
>>
>> And if the counts match, then can you share the search call you’re running
>> to confirm newer records are not appearing?
>>
>> Cheers,
>>
>> --
>> Pat
>>
>>> On 30 Nov. 2016, at 12:40 am, Simon <simon.w...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I'm having an issue that just started recently. Indexing appears to
>>> complete successfully, but new entries are not appearing in search results
>>> (older entries appear).
>>>
>>> This seems to have started after I tried a rake ts:rebuild instead of what
>>> I normally used (rake ts:reindex). I have since switched back to a reindex,
>>> but still nothing new seems to be getting picked up.
>>> Unfortunately I am
>>> running an older version of Ruby (1.8.7), Rails (2.3.18) and Sphinx (Sphinx
>>> 1.10-beta (r2420)).
>>>
>>> My model definition is as follows:
>>>
>>> define_index do
>>>   indexes title
>>>   indexes body
>>>   has journal_id
>>>   has created_at, opened_at, status_id
>>>
>>>   where "status_id = #{Status.active.id}"
>>> end
>>>
>>> The output of calling rake ts:reindex is:
>>>
>>> Sphinx 1.10-beta (r2420)
>>> Copyright (c) 2001-2010, Andrew Aksyonoff
>>> Copyright (c) 2008-2010, Sphinx Technologies Inc (http://sphinxsearch.com)
>>>
>>> using config file
>>> '/home/ubuntu/rails/penzu/config/pandora_readonly.sphinx.conf'...
>>> indexing index 'entry_core'...
>>> collected 19940635 docs, 34535.3 MB
>>> WARNING: sort_hits: merge_block_size=224 kb too low, increasing mem_limit
>>> may improve performance
>>> sorted 6182.9 Mhits, 100.0% done
>>> total 19940635 docs, 34535317325 bytes
>>> total 16953.201 sec, 2037097 bytes/sec, 1176.21 docs/sec
>>> indexing index 'entry_delta'...
>>> collected 0 docs, 0.0 MB
>>> total 0 docs, 0 bytes
>>> total 0.155 sec, 0 bytes/sec, 0.00 docs/sec
>>> skipping non-plain index 'entry'...
>>> total 92992 reads, 1534.381 sec, 228.9 kb/call avg, 16.5 msec/call avg
>>> total 40583 writes, 192.954 sec, 1042.6 kb/call avg, 4.7 msec/call avg
>>> rotating indices: succesfully sent SIGHUP to searchd (pid=3802).
>>>
>>> So it all appears successful, but no new results appear. So for instance,
>>> if I add a new entry, and then call reindex, that entry is not found in
>>> search results. But an entry with the same search term from a month ago
>>> does appear in the results.
>>>
>>> I have tried a complete rake ts:index, I have tried deleting all of the
>>> generated index files (entry_core.spp, entry_core.spi, etc.) but nothing
>>> seems to make a difference.
>>> Does anybody have any ideas what might be
>>> happening here, or any other suggestions for what I can try?
>>>
>>> Thanks,
>>> Simon

--
You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to thinking-sphinx+unsubscr...@googlegroups.com. To post to this group, send email to thinking-sphinx@googlegroups.com. Visit this group at https://groups.google.com/group/thinking-sphinx. For more options, visit https://groups.google.com/d/optout.
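
For anyone landing on this thread with the same symptom: the counts quoted above (20,527,906 on the master against 19,940,635 on the slave, a gap of 587,271 rows) suggest comparing the two servers one id range at a time to see where the missing rows sit. The following is only a minimal sketch of that idea, not something from the thread or from Thinking Sphinx. The helper names (`count_sql`, `id_ranges`, `mismatched_ranges`) are hypothetical, the `entries` table and `status_id = 1` are taken from the sql_query above, and how the SQL is actually run against each server is left to the caller. The GROUP BY from the generated query is dropped so that each query returns a single number.

```ruby
# Hypothetical helpers for narrowing down master/slave row-count drift.
# Nothing here is part of Thinking Sphinx or Sphinx itself.

# Build a COUNT query for one id range, mirroring the WHERE clause of the
# generated sql_query (status_id = 1 in the thread). GROUP BY is omitted on
# purpose so the result is one row with one number.
def count_sql(first_id, last_id, status_id)
  "SELECT COUNT(*) FROM `entries` " \
  "WHERE `entries`.`id` >= #{Integer(first_id)} " \
  "AND `entries`.`id` <= #{Integer(last_id)} " \
  "AND status_id = #{Integer(status_id)}"
end

# Split 1..max_id into consecutive [first, last] pairs of at most `step` ids.
def id_ranges(max_id, step)
  ranges = []
  first = 1
  while first <= max_id
    last = [first + step - 1, max_id].min
    ranges << [first, last]
    first = last + 1
  end
  ranges
end

# The block receives (first, last) and must return [master_count, slave_count].
# Returns [first, last, master_count, slave_count] for every range that differs.
def mismatched_ranges(ranges)
  rows = ranges.map do |first, last|
    master, slave = yield(first, last)
    [first, last, master, slave]
  end
  rows.reject { |row| row[2] == row[3] }
end

# Usage with made-up counts standing in for real query results.
fake_master = { [1, 100] => 90, [101, 200] => 80, [201, 250] => 40 }
fake_slave  = { [1, 100] => 90, [101, 200] => 80, [201, 250] => 12 }

diffs = mismatched_ranges(id_ranges(250, 100)) do |first, last|
  [fake_master[[first, last]], fake_slave[[first, last]]]
end

diffs.each do |first, last, master, slave|
  puts "ids #{first}..#{last}: master=#{master} slave=#{slave}"
end
```

In real use the block would run `count_sql(first, last, 1)` against each server and return the two numbers. Ranges that differ can then be split again with a smaller step until the missing rows are isolated.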