Hi Jonathan

Can you share your index definitions so I can get a better idea of where the problem might be?
Also: which versions of Rails and Thinking Sphinx are you using?

--
Pat

> On 28 Jun 2015, at 11:47 pm, [email protected] wrote:
>
> Hi Pat,
>
> I implemented according to this, and the indexing time went down (5 times
> faster on development). However, the delta indexing time went up (30 times
> slower on development). See below the indexing stats:
>
> Index                    Total docs  Bytes     Time (sec)
> incident_index_1_core    7331        6531122   39.436
> incident_index_6_core    7331        28239593  8.802
> incident_index_1_delta   6           1128      0.184
> incident_index_6_delta   6           24763425  5.234
> incident_index_2_core    7319        6751189   45.477
> incident_index_7_core    7319        28331726  8.819
> incident_index_2_delta   5           843       0.233
> incident_index_7_delta   5           24763289  5.321
> incident_index_3_core    7390        6803814   42.064
> incident_index_8_core    7390        28310121  7.913
> incident_index_3_delta   8           2143      0.203
> incident_index_8_delta   8           24764366  5.282
> incident_index_4_core    7278        6377664   37.665
> incident_index_9_core    7278        28162260  7.891
> incident_index_4_delta   6           1108      0.436
> incident_index_9_delta   6           24763330  5.456
> incident_index_5_core    7396        6601358   39.704
> incident_index_10_core   7396        28152075  9.562
> incident_index_5_delta   6           944       0.216
> incident_index_10_delta  6           24763308  5.303
>
> Any idea why this is happening?
>
> Thanks,
> Jonathan
>
> On Friday, July 26, 2013 at 3:57:38 PM UTC+3, Pat Allan wrote:
> Heya Steve
>
> Was just looking into how difficult this would be to implement properly, and
> noticed I have added the ability to take a string as the source query -
> instead of the column references. So, it's possible without hacking around in
> the index definition itself:
>
> https://gist.github.com/pat/6088629
>
> It's worth noting that the document id (Sphinx's equivalent of a primary key)
> involves the normal primary key with an offset and a multiplier. Make sure
> those two integers match what's in your generated index in sql_query.
> They may change when you add other indices to your app (depends on
> alphabetical order of your index files).
>
> Also: there's probably some metaprogramming you could add to simplify things
> a bit more.
>
> Would love to hear if this approach helps with your real app and not just the
> test one :)
>
> --
> Pat
>
> On 26/07/2013, at 12:14 AM, Pat Allan wrote:
>
> > Hi Steve
> >
> > I've got a way forward to greatly improve the speed of indexing…
> > unfortunately, it's not going to work within Thinking Sphinx easily right
> > now.
> >
> > Sphinx has the ability to gather attribute and field values from separate
> > queries - this existed for TS v1/v2 for attributes, and fields was added in
> > TS v3, but the catch is those separate queries don't work for HABTM joins.
> > I'd love to change that, it's just painful from an ActiveRecord perspective
> > because you're not dealing with a model's table as the base, but the HABTM
> > join table.
> >
> > Here's the configuration for the relevant source that I modified by hand:
> > https://gist.github.com/pat/6080031
> >
> > You'll see that the main query is nice and short - and then there's each of
> > the MVA and joined field definitions. If you put this in the generated
> > source definition in config/development.sphinx.conf, and then run the
> > indexer manually (NOT through the rake task, that'll overwrite this):
> > indexer --config config/development.sphinx.conf --all --rotate
> >
> > (Remove --rotate if Sphinx isn't running.) You'll see it's pretty damn
> > fast.
> >
> > Now, ways forward? Well, I'd love to write something for TS v3 that can
> > handle HABTM - it's just a shame that it might need to be pure ARel rather
> > than ActiveRecord-built (which can otherwise help with joins).
> >
> > But otherwise: switch from HABTM to has_many/has_many :through - make each
> > of the joins an actual model.
> > Then, you can add :source => :query to each
> > of the appropriate field and attribute definitions, and it should generate
> > something pretty much the same.
> >
> > Hope this provides some clarity at the very least! And also: thanks for the
> > test app, really helped with debugging!
> >
> > --
> > Pat
> >
> >
> > On 25/07/2013, at 2:54 PM, Steve Kenworthy wrote:
> >
> >> Hi there,
> >>
> >> Firstly, thinking-sphinx is awesome and I love it. Thanks Pat for an
> >> excellent project. V3 is looking great and represents a lot of hard work
> >> and effort.
> >>
> >> I've been using thinking-sphinx to index a document model and it's really
> >> slowed down when I add lots of associations in the index. In fact, it
> >> never finishes on my machine (8Gig RAM, 8 CPU's) when I add 4 indexes.
> >>
> >> Times:
> >> • 4 seconds - when 1 association (images) is indexed
> >> • 6 seconds - when 2 associations (images and subscribers) are
> >>   indexed
> >> • 23 seconds - when 2 associations (images and countries) are
> >>   indexed
> >> • 115 seconds - when 3 associations (images, subscribers and tags)
> >>   are indexed
> >> • 113 seconds - when 3 associations (images, subscribers and
> >>   videos) are indexed (just to prove it's not tags slowing it down)
> >> • ∞ (not finishing) - when 4 associations or more are selected.
> >>
> >> Here's my index file:
> >>
> >> ThinkingSphinx::Index.define :document, with: :active_record, delta: true,
> >>   sql_range_step: 999999999, group_concat_max_len: 16384 do
> >>
> >>   has countries(:id), as: :country_ids
> >>   has images(:id), as: :image_ids, facet: true
> >>   has subscribers(:id), as: :subscriber_ids, facet: true
> >>   has tags(:id), as: :tag_ids, facet: true
> >>   has videos(:id), as: :video_ids, facet: true
> >>
> >>   indexes countries.name, as: :countries
> >>   indexes images.title, as: :images
> >>   indexes subscribers.title, as: :subscribers
> >>   indexes tags.name, as: :tags
> >>   indexes videos.title, as: :videos
> >>
> >>   has updated_at
> >>
> >> end
> >>
> >> The generated sql is a massive group_by query and is not finishing. See it
> >> here https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
> >>
> >> I'd really appreciate some advice on how to optimise this so indexing
> >> becomes viable again. Do I just have too much going on here? I'm using
> >> facets, indexes and attributes. Perhaps there is a better way to optimise?
> >> A friend suggested pre-computing with some joins... how would this work?
> >>
> >> Vital stats: using mysql v14.14, sphinx 2.0.4, Ubuntu, rails 3.2.13,
> >> thinking-sphinx 3.0.4
> >>
> >> For those who'd like to take a look, I've uploaded a sample project here
> >> https://github.com/crossroads/rails3-ts-example which can be cloned. If
> >> you follow the instructions, it will setup a db with test data and
> >> reproduce the problem quickly.
> >>
> >> There's also the sphinx generated SQL and EXPLAIN:
> >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
> >>
> >> Thanks in advance for anyone taking the time to read.
> >>
> >> Regards,
> >> Steve
> >>
> >> --
> >> You received this message because you are subscribed to the Google Groups
> >> "Thinking Sphinx" group.
> >> To unsubscribe from this group and stop receiving emails from it, send an
> >> email to thinking-sphi...@googlegroups.com.
> >> To post to this group, send email to thinkin...@googlegroups.com.
> >> Visit this group at http://groups.google.com/group/thinking-sphinx.
> >> For more options, visit https://groups.google.com/groups/opt_out.
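
The document id arithmetic mentioned in the quoted 26 July 2013 reply can be sketched in a few lines of Ruby. This is only an illustration: it assumes the id in sql_query is built as primary key * multiplier + offset, and the method names and the values 3 and 1 are made up for the example, not part of Thinking Sphinx. The real integers are whatever appears in the generated config/development.sphinx.conf.

```ruby
# Sketch only. Assumption: the generated sql_query builds the Sphinx
# document id as primary_key * multiplier + offset. Confirm against your
# own generated Sphinx configuration before relying on this.

# Hypothetical helper: Sphinx document id for a record's primary key.
def sphinx_document_id(primary_key, multiplier:, offset:)
  primary_key * multiplier + offset
end

# Hypothetical helper: reverse the arithmetic to get the primary key back.
def primary_key_for(document_id, multiplier:, offset:)
  (document_id - offset) / multiplier
end

# Made-up integers for illustration; copy the real ones from sql_query.
id = sphinx_document_id(42, multiplier: 3, offset: 1)
puts id                                             # 127
puts primary_key_for(id, multiplier: 3, offset: 1)  # 42
```

Since those two integers may change when other indices are added, it seems safest to re-check a hand-written source query against the generated one after every change to the index files.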
