You’d end up with a fair bit of duplication, but it’s technically possible.

# creates both core and delta indices
ThinkingSphinx::Index.define(:article,
  :with  => :active_record,
  :delta => ThinkingSphinx::Deltas::ResqueDelta
) do
  # …
end

Is the equivalent of:

# create core index
ThinkingSphinx::Index.define(:article,
  :with            => :active_record,
  :delta?          => false,
  :delta_processor => ThinkingSphinx::Deltas.processor_for(ThinkingSphinx::Deltas::ResqueDelta)
) do
  # …
end

# create delta index
ThinkingSphinx::Index.define(:article,
  :with            => :active_record,
  :delta?          => true,
  :delta_processor => ThinkingSphinx::Deltas.processor_for(ThinkingSphinx::Deltas::ResqueDelta)
) do
  # …
end

The first being the core index, the second being the delta, with the same definition block normally being applied to both. If you want to have something slightly different in the delta index definition block, I guess you could try something along these lines?

—
Pat

> On 30 Jun 2015, at 4:45 pm, [email protected] wrote:
>
> Thanks, sharding the joined queries works. I'd also like to improve them for
> the deltas. Is there any way to add "WHERE delta = 1" to the joined queries
> in the delta definition?
>
> On Monday, June 29, 2015 at 5:12:27 PM UTC+3, Pat Allan wrote:
> I’m not sure why the sizes are so different, but I think the overall issue is
> related to the three attributes that have :source => :query.
>
> I’d recommend making two changes to each of them:
>
> * Add a condition to each query that filters by the appropriate incident ids
> (like you’re doing for the main query) so the results are sharded in the same
> way.
> * Perhaps add a second SQL statement to each of those attributes (separated
> by a semi-colon), with :source set to :ranged_query, as covered in the Sphinx
> documentation:
> http://sphinxsearch.com/docs/current.html#conf-sql-attr-multi
>
> The first of those isn’t too complex, so I’d start with that. Certainly the
> second is far more fiddly, but may be worthwhile.
>
> Hope this helps!
>
> —
> Pat
>
>> On 29 Jun 2015, at 8:52 pm, [email protected] wrote:
>>
>> I even less understand the number of bytes in delta indexes 6 - 10. Why does
>> 1_delta contain 1128 bytes and 6_delta 24M? They're on the same records.
>>
>> On Monday, June 29, 2015 at 9:03:04 AM UTC+3, [email protected] wrote:
>> Rails version: 4.1.7
>> TS version: 3.0.6
>>
>> On Monday, June 29, 2015 at 5:17:37 AM UTC+3, Pat Allan wrote:
>> Hi Jonathan
>>
>> Can you share your index definitions so I can get a better idea of where the
>> problem might be?
>>
>> Also: which versions of Rails and Thinking Sphinx are you using?
>>
>> —
>> Pat
>>
>>> On 28 Jun 2015, at 11:47 pm, [email protected] wrote:
>>>
>>> Hi Pat,
>>>
>>> I implemented according to this, and the indexing time went down (5 times
>>> faster on development). However, the delta indexing time went up (30 times
>>> slower on development). See below the indexing stats:
>>>
>>>                         Total docs     Bytes Time (sec)                            Total docs     Bytes Time (sec)
>>> incident_index_1_core         7331   6531122     39.436   incident_index_6_core          7331  28239593      8.802
>>> incident_index_1_delta           6      1128      0.184   incident_index_6_delta            6  24763425      5.234
>>> incident_index_2_core         7319   6751189     45.477   incident_index_7_core          7319  28331726      8.819
>>> incident_index_2_delta           5       843      0.233   incident_index_7_delta            5  24763289      5.321
>>> incident_index_3_core         7390   6803814     42.064   incident_index_8_core          7390  28310121      7.913
>>> incident_index_3_delta           8      2143      0.203   incident_index_8_delta            8  24764366      5.282
>>> incident_index_4_core         7278   6377664     37.665   incident_index_9_core          7278  28162260      7.891
>>> incident_index_4_delta           6      1108      0.436   incident_index_9_delta            6  24763330      5.456
>>> incident_index_5_core         7396   6601358     39.704   incident_index_10_core         7396  28152075      9.562
>>> incident_index_5_delta           6       944      0.216   incident_index_10_delta           6  24763308      5.303
>>>
>>> Any idea why this is happening?
>>>
>>> Thanks,
>>> Jonathan
>>>
>>> On Friday, July 26, 2013 at 3:57:38 PM UTC+3, Pat Allan wrote:
>>> Heya Steve
>>>
>>> Was just looking into how difficult this would be to implement properly,
>>> and noticed I have added the ability to take a string as the source query -
>>> instead of the column references. So, it's possible without hacking around
>>> in the index definition itself:
>>>
>>> https://gist.github.com/pat/6088629
>>>
>>> It's worth noting that the document id (Sphinx's equivalent of a primary
>>> key) involves the normal primary key with an offset and a multiplier. Make
>>> sure those two integers match what's in your generated index in sql_query.
>>> They may change when you add other indices to your app (depends on
>>> alphabetical order of your index files).
>>>
>>> Also: there's probably some metaprogramming you could add to simplify
>>> things a bit more.
>>>
>>> Would love to hear if this approach helps with your real app and not just
>>> the test one :)
>>>
>>> --
>>> Pat
>>>
>>> On 26/07/2013, at 12:14 AM, Pat Allan wrote:
>>>
>>> > Hi Steve
>>> >
>>> > I've got a way forward to greatly improve the speed of indexing…
>>> > unfortunately, it's not going to work within Thinking Sphinx easily right
>>> > now.
>>> >
>>> > Sphinx has the ability to gather attribute and field values from separate
>>> > queries - this existed for TS v1/v2 for attributes, and fields was added
>>> > in TS v3, but the catch is those separate queries don't work for HABTM
>>> > joins. I'd love to change that, it's just painful from an ActiveRecord
>>> > perspective because you're not dealing with a model's table as the base,
>>> > but the HABTM join table.
>>> >
>>> > Here's the configuration for the relevant source that I modified by hand:
>>> > https://gist.github.com/pat/6080031
>>> >
>>> > You'll see that the main query is nice and short - and then there's each
>>> > of the MVA and joined field definitions.
>>> > If you put this in the generated
>>> > source definition in config/development.sphinx.conf, and then run the
>>> > indexer manually (NOT through the rake task, that'll overwrite this):
>>> > indexer --config config/development.sphinx.conf --all --rotate
>>> >
>>> > (Remove --rotate if Sphinx isn't running.) You'll see it's pretty damn
>>> > fast.
>>> >
>>> > Now, ways forward? Well, I'd love to write something for TS v3 that can
>>> > handle HABTM - it's just a shame that it might need to be pure ARel
>>> > rather than ActiveRecord-built (which can otherwise help with joins).
>>> >
>>> > But otherwise: switch from HABTM to has_many/has_many :through - make
>>> > each of the joins an actual model. Then, you can add :source => :query to
>>> > each of the appropriate field and attribute definitions, and it should
>>> > generate something pretty much the same.
>>> >
>>> > Hope this provides some clarity at the very least! And also: thanks for
>>> > the test app, really helped with debugging!
>>> >
>>> > --
>>> > Pat
>>> >
>>> >
>>> > On 25/07/2013, at 2:54 PM, Steve Kenworthy wrote:
>>> >
>>> >> Hi there,
>>> >>
>>> >> Firstly, thinking-sphinx is awesome and I love it. Thanks Pat for an
>>> >> excellent project. V3 is looking great and represents a lot of hard work
>>> >> and effort.
>>> >>
>>> >> I've been using thinking-sphinx to index a document model and it's
>>> >> really slowed down when I add lots of associations in the index. In
>>> >> fact, it never finishes on my machine (8Gig RAM, 8 CPU's) when I add 4
>>> >> indexes.
>>> >>
>>> >> Times:
>>> >> • 4 seconds - when 1 association (images) is indexed
>>> >> • 6 seconds - when 2 associations (images and subscribers) are
>>> >> indexed
>>> >> • 23 seconds - when 2 associations (images and countries) are
>>> >> indexed
>>> >> • 115 seconds - when 3 associations (images, subscribers and
>>> >> tags) are indexed
>>> >> • 113 seconds - when 3 associations (images, subscribers and
>>> >> videos) are indexed (just to prove it's not tags slowing it down)
>>> >> • ∞ (not finishing) - when 4 associations or more are selected.
>>> >>
>>> >> Here's my index file:
>>> >>
>>> >> ThinkingSphinx::Index.define :document, with: :active_record, delta:
>>> >> true, sql_range_step: 999999999, group_concat_max_len: 16384 do
>>> >>
>>> >> has countries(:id), as: :country_ids
>>> >> has images(:id), as: :image_ids, facet: true
>>> >> has subscribers(:id), as: :subscriber_ids, facet: true
>>> >> has tags(:id), as: :tag_ids, facet: true
>>> >> has videos(:id), as: :video_ids, facet: true
>>> >>
>>> >> indexes countries.name, as: :countries
>>> >> indexes images.title, as: :images
>>> >> indexes subscribers.title, as: :subscribers
>>> >> indexes tags.name, as: :tags
>>> >> indexes videos.title, as: :videos
>>> >>
>>> >> has updated_at
>>> >>
>>> >> end
>>> >>
>>> >> The generated sql is a massive group_by query and is not finishing. See
>>> >> it here
>>> >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
>>> >>
>>> >> I'd really appreciate some advice on how to optimise this so indexing
>>> >> becomes viable again. Do I just have too much going on here? I'm using
>>> >> facets, indexes and attributes. Perhaps there is a better way to
>>> >> optimise? A friend suggested pre-computing with some joins... how would
>>> >> this work?
>>> >>
>>> >> Vital stats: using mysql v14.14, sphinx 2.0.4, Ubuntu, rails 3.2.13,
>>> >> thinking-sphinx 3.0.4
>>> >>
>>> >> For those who'd like to take a look, I've uploaded a sample project here
>>> >> https://github.com/crossroads/rails3-ts-example which can be cloned. If
>>> >> you follow the instructions, it will setup a db with test data and
>>> >> reproduce the problem quickly.
>>> >>
>>> >> There's also the sphinx generated SQL and EXPLAIN:
>>> >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
>>> >>
>>> >> Thanks in advance for anyone taking the time to read.
>>> >>
>>> >> Regards,
>>> >> Steve
>>> >>
>>> >> --
>>> >> You received this message because you are subscribed to the Google
>>> >> Groups "Thinking Sphinx" group.
>>> >> To unsubscribe from this group and stop receiving emails from it, send
>>> >> an email to [email protected].
>>> >> To post to this group, send email to [email protected].
>>> >> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>> >> For more options, visit https://groups.google.com/groups/opt_out.
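
For what it's worth, here's a rough sketch of one way to keep the duplication between separate core and delta definitions manageable when the joined queries are passed as plain SQL strings (the string form of :source => :query mentioned further up the thread). Everything named here is an assumption for illustration: the joined_query helper, the incident_tags join table, its incident_id and tag_id columns, and the multiplier and offset of 2 and 0, which have to match whatever is in your generated sql_query. It also assumes a MySQL-style "delta = 1" comparison on the base table. It only builds strings, so treat it as a starting point rather than anything Thinking Sphinx provides.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): builds the SQL string
# for a joined attribute query, optionally restricted to delta rows.
def joined_query(join_table:, foreign_key:, value_column:, base_table:,
                 multiplier:, offset:, delta: false)
  # Sphinx document id = primary key * multiplier + offset. These two
  # integers must match the ones in the generated sql_query.
  sql = "SELECT #{join_table}.#{foreign_key} * #{multiplier} + #{offset} AS id, " \
        "#{join_table}.#{value_column} AS #{value_column} " \
        "FROM #{join_table}"
  return sql unless delta

  # For the delta index, join back to the base table so that only rows
  # flagged as changed are pulled in.
  sql + " INNER JOIN #{base_table} ON #{base_table}.id = #{join_table}.#{foreign_key}" \
        " WHERE #{base_table}.delta = 1"
end

# Assumed table and column names, purely for illustration.
options = {
  join_table:   "incident_tags",
  foreign_key:  "incident_id",
  value_column: "tag_id",
  base_table:   "incidents",
  multiplier:   2,
  offset:       0
}

core_sql  = joined_query(**options)
delta_sql = joined_query(**options, delta: true)

puts core_sql
puts delta_sql
```

The core definition block would then use core_sql and the delta definition block delta_sql for the same attribute, so the only difference between the two blocks is which string gets passed in.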
