Hi Loganathan, Seven hours is indeed a long time for indexing! Here’s some related thoughts that may help you find a decent solution:
Optimise SQL statements Given the amount of time involved, I’m curious as to how many records you may be indexing, and how complex the underlying SQL queries are. Have you confirmed you have database indices on foreign key columns? Also, (and I’ve just noticed it’s not covered in the documentation) it’s possible to shift multi-value attributes into separate queries, which may make the overall speed faster (especially if they are creating joins that are only necessary for those attributes and not the rest of the index): # This is only applicable for multi-value attributes: has taggings.tag_id, :as => :tag_ids, :source => :query # Also, you can use ranged queries, which may perform # better depending on the data situation: has taggings.tag_id, :as => :tag_ids, :source => :ranged_query But yeah, I generally recommend streamlining the SQL queries as much as possible - it reduces the load on your database and speeds up indexing time. Separate hosts for the database and Sphinx Speaking of reducing load - do you have Sphinx and your database on the same server? If so, is having separate servers an option? This may slow down indexing slightly (as you’ve got the extra time for data transferring from one machine to the other) but should reduce the load on the database server, which is a good thing. Setting the memory limit for the indexer http://sphinxsearch.com/docs/current.html#conf-mem-limit <http://sphinxsearch.com/docs/current.html#conf-mem-limit> Sphinx has the ability to cap the amount of memory used by the indexer. This may help limit the resources it uses and thus allow you to SSH into the machine while it’s running? You can set this per-environment in config/thinking_sphinx.yml. Indexing data separately This is what you’ve directly asked about. You can certainly run the indexer elsewhere and then copy it to the main Sphinx server, but there’s some things to note: Firstly, you will still want to have Sphinx reference the same production database. And then, when copying, I would recommend you append .new. into the index file names (e.g. article_core.spa becomes article_core.new.spa), then copy the files over to your main machine, then send a SIGHUP signal to the daemon. It’ll then load the new index data in and replace the old files itself. If you’re running the indexer manually, then you can get the correct file names by running it with the --rotate flag - but otherwise: all Sphinx index files need to be renamed (not just spa files). And yes, you may end up with slightly out-of-date data, but that’s going to be the case no matter what - especially due to the long time indexing requires. I really hope there are possibilities for SQL optimisation, because this can have such a significant impact on indexing times. Otherwise, I hope the rest of the information is useful too. Cheers, — Pat > On 7 Sep 2017, at 11:10 am, Loganathan Sellappa <loganathan...@gmail.com > <mailto:loganathan...@gmail.com>> wrote: > > Hi All, > > Whenever we do the full re-indexing on our database, it is taking around 7 > hours to complete due to large data sets. Even sometimes it is making our > server to go unresponsive and we couldn't able to do the SSH until the > indexer completes. > > I'm looking out for the way to run the indexer outside of our production > machine and copy the indexed file to our machine afterward. Just wanted to > know whether it'll cause any impact or not and I need clarifications for > below question > > During the process of Indexing & copying to the production server, there will > be some delta difference will be there between my index file & production > server's index file, will it be handled automatically by the indexer or do I > need to stop the indexer on production server during the process? > > > regards, > Loganathan > > -- > You received this message because you are subscribed to the Google Groups > "Thinking Sphinx" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to thinking-sphinx+unsubscr...@googlegroups.com > <mailto:thinking-sphinx+unsubscr...@googlegroups.com>. > To post to this group, send email to thinking-sphinx@googlegroups.com > <mailto:thinking-sphinx@googlegroups.com>. > Visit this group at https://groups.google.com/group/thinking-sphinx > <https://groups.google.com/group/thinking-sphinx>. > For more options, visit https://groups.google.com/d/optout > <https://groups.google.com/d/optout>. -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To unsubscribe from this group and stop receiving emails from it, send an email to thinking-sphinx+unsubscr...@googlegroups.com. To post to this group, send email to thinking-sphinx@googlegroups.com. Visit this group at https://groups.google.com/group/thinking-sphinx. For more options, visit https://groups.google.com/d/optout.