What I understand even less is the number of bytes in delta indexes 6-10. Why does 1_delta contain 1128 bytes while 6_delta contains 24M? They're built on the same records.
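To put numbers on it, here is a rough Ruby sketch using the figures for indexes 1 and 6 as I read them from the stats quoted below. The `STATS` hash and the `ratio` helper are names made up for this illustration; they are not part of Thinking Sphinx or of my app.

```ruby
# Figures copied by hand from the indexing stats quoted below:
# each entry is [total docs, bytes, time in seconds].
STATS = {
  "incident_index_1_core"  => [7331, 6_531_122, 39.436],
  "incident_index_6_core"  => [7331, 28_239_593, 8.802],
  "incident_index_1_delta" => [6, 1_128, 0.184],
  "incident_index_6_delta" => [6, 24_763_425, 5.234]
}.freeze

DOCS, BYTES, SECONDS = 0, 1, 2

# Hypothetical helper: how many times larger column `col` is for index `a`
# than for index `b`.
def ratio(stats, a, b, col)
  stats.fetch(a)[col].to_f / stats.fetch(b)[col]
end

core_speedup   = ratio(STATS, "incident_index_1_core",  "incident_index_6_core",  SECONDS)
delta_slowdown = ratio(STATS, "incident_index_6_delta", "incident_index_1_delta", SECONDS)
delta_growth   = ratio(STATS, "incident_index_6_delta", "incident_index_1_delta", BYTES)

puts format("core indexing speedup:  %.1fx", core_speedup)
puts format("delta indexing slowdown: %.1fx", delta_slowdown)
puts format("delta byte growth:       %.0fx", delta_growth)
```

The time ratios roughly match the "5 times faster" and "30 times slower" I reported, but the delta byte count grows more than 20,000 times for the same 6 documents, and that is the part I can't explain.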
On Monday, June 29, 2015 at 9:03:04 AM UTC+3, [email protected] wrote:
>
> Rails version: 4.1.7
> TS version: 3.0.6
>
> On Monday, June 29, 2015 at 5:17:37 AM UTC+3, Pat Allan wrote:
>>
>> Hi Jonathan
>>
>> Can you share your index definitions so I can get a better idea of where
>> the problem might be?
>>
>> Also: which versions of Rails and Thinking Sphinx are you using?
>>
>> --
>> Pat
>>
>> On 28 Jun 2015, at 11:47 pm, [email protected] wrote:
>>
>> Hi Pat,
>>
>> I implemented according to this, and the indexing time went down (5 times
>> faster on development). However, the delta indexing time went up (30 times
>> slower on development). See below the indexing stats:
>>
>> Index                    Total docs     Bytes  Time (sec)
>> incident_index_1_core          7331   6531122      39.436
>> incident_index_1_delta            6      1128       0.184
>> incident_index_2_core          7319   6751189      45.477
>> incident_index_2_delta            5       843       0.233
>> incident_index_3_core          7390   6803814      42.064
>> incident_index_3_delta            8      2143       0.203
>> incident_index_4_core          7278   6377664      37.665
>> incident_index_4_delta            6      1108       0.436
>> incident_index_5_core          7396   6601358      39.704
>> incident_index_5_delta            6       944       0.216
>>
>> Index                    Total docs     Bytes  Time (sec)
>> incident_index_6_core          7331  28239593       8.802
>> incident_index_6_delta            6  24763425       5.234
>> incident_index_7_core          7319  28331726       8.819
>> incident_index_7_delta            5  24763289       5.321
>> incident_index_8_core          7390  28310121       7.913
>> incident_index_8_delta            8  24764366       5.282
>> incident_index_9_core          7278  28162260       7.891
>> incident_index_9_delta            6  24763330       5.456
>> incident_index_10_core         7396  28152075       9.562
>> incident_index_10_delta           6  24763308       5.303
>>
>> Any idea why this is happening?
>>
>> Thanks,
>> Jonathan
>>
>> On Friday, July 26, 2013 at 3:57:38 PM UTC+3, Pat Allan wrote:
>>>
>>> Heya Steve
>>>
>>> Was just looking into how difficult this would be to implement properly,
>>> and noticed I have added the ability to take a string as the source query -
>>> instead of the column references.
>>> So, it's possible without hacking around in the index definition itself:
>>>
>>> https://gist.github.com/pat/6088629
>>>
>>> It's worth noting that the document id (Sphinx's equivalent of a primary
>>> key) involves the normal primary key with an offset and a multiplier. Make
>>> sure those two integers match what's in your generated index in sql_query.
>>> They may change when you add other indices to your app (depends on
>>> alphabetical order of your index files).
>>>
>>> Also: there's probably some metaprogramming you could add to simplify
>>> things a bit more.
>>>
>>> Would love to hear if this approach helps with your real app and not
>>> just the test one :)
>>>
>>> --
>>> Pat
>>>
>>> On 26/07/2013, at 12:14 AM, Pat Allan wrote:
>>>
>>> > Hi Steve
>>> >
>>> > I've got a way forward to greatly improve the speed of indexing…
>>> > unfortunately, it's not going to work within Thinking Sphinx easily
>>> > right now.
>>> >
>>> > Sphinx has the ability to gather attribute and field values from
>>> > separate queries - this existed for TS v1/v2 for attributes, and fields
>>> > was added in TS v3, but the catch is those separate queries don't work
>>> > for HABTM joins. I'd love to change that, it's just painful from an
>>> > ActiveRecord perspective because you're not dealing with a model's table
>>> > as the base, but the HABTM join table.
>>> >
>>> > Here's the configuration for the relevant source that I modified by hand:
>>> > https://gist.github.com/pat/6080031
>>> >
>>> > You'll see that the main query is nice and short - and then there's each
>>> > of the MVA and joined field definitions. If you put this in the generated
>>> > source definition in config/development.sphinx.conf, and then run the
>>> > indexer manually (NOT through the rake task, that'll overwrite this):
>>> >
>>> > indexer --config config/development.sphinx.conf --all --rotate
>>> >
>>> > (Remove --rotate if Sphinx isn't running.) You'll see it's pretty damn
>>> > fast.
>>> >
>>> > Now, ways forward?
>>> > Well, I'd love to write something for TS v3 that can handle HABTM - it's
>>> > just a shame that it might need to be pure ARel rather than
>>> > ActiveRecord-built (which can otherwise help with joins).
>>> >
>>> > But otherwise: switch from HABTM to has_many/has_many :through - make
>>> > each of the joins an actual model. Then, you can add :source => :query to
>>> > each of the appropriate field and attribute definitions, and it should
>>> > generate something pretty much the same.
>>> >
>>> > Hope this provides some clarity at the very least! And also: thanks for
>>> > the test app, really helped with debugging!
>>> >
>>> > --
>>> > Pat
>>> >
>>> >
>>> > On 25/07/2013, at 2:54 PM, Steve Kenworthy wrote:
>>> >
>>> >> Hi there,
>>> >>
>>> >> Firstly, thinking-sphinx is awesome and I love it. Thanks Pat for an
>>> >> excellent project. V3 is looking great and represents a lot of hard
>>> >> work and effort.
>>> >>
>>> >> I've been using thinking-sphinx to index a document model and it's
>>> >> really slowed down when I add lots of associations in the index. In
>>> >> fact, it never finishes on my machine (8Gig RAM, 8 CPU's) when I add 4
>>> >> indexes.
>>> >>
>>> >> Times:
>>> >>   • 4 seconds - when 1 association (images) is indexed
>>> >>   • 6 seconds - when 2 associations (images and subscribers) are indexed
>>> >>   • 23 seconds - when 2 associations (images and countries) are indexed
>>> >>   • 115 seconds - when 3 associations (images, subscribers and tags)
>>> >>     are indexed
>>> >>   • 113 seconds - when 3 associations (images, subscribers and videos)
>>> >>     are indexed (just to prove it's not tags slowing it down)
>>> >>   • ∞ (not finishing) - when 4 associations or more are selected.
>>> >>
>>> >> Here's my index file:
>>> >>
>>> >> ThinkingSphinx::Index.define :document, with: :active_record, delta: true, sql_range_step: 999999999, group_concat_max_len: 16384 do
>>> >>
>>> >>   has countries(:id), as: :country_ids
>>> >>   has images(:id), as: :image_ids, facet: true
>>> >>   has subscribers(:id), as: :subscriber_ids, facet: true
>>> >>   has tags(:id), as: :tag_ids, facet: true
>>> >>   has videos(:id), as: :video_ids, facet: true
>>> >>
>>> >>   indexes countries.name, as: :countries
>>> >>   indexes images.title, as: :images
>>> >>   indexes subscribers.title, as: :subscribers
>>> >>   indexes tags.name, as: :tags
>>> >>   indexes videos.title, as: :videos
>>> >>
>>> >>   has updated_at
>>> >>
>>> >> end
>>> >>
>>> >> The generated sql is a massive group_by query and is not finishing.
>>> >> See it here
>>> >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
>>> >>
>>> >> I'd really appreciate some advice on how to optimise this so indexing
>>> >> becomes viable again. Do I just have too much going on here? I'm using
>>> >> facets, indexes and attributes. Perhaps there is a better way to
>>> >> optimise? A friend suggested pre-computing with some joins... how would
>>> >> this work?
>>> >>
>>> >> Vital stats: using mysql v14.14, sphinx 2.0.4, Ubuntu, rails 3.2.13,
>>> >> thinking-sphinx 3.0.4
>>> >>
>>> >> For those who'd like to take a look, I've uploaded a sample project
>>> >> here https://github.com/crossroads/rails3-ts-example which can be
>>> >> cloned. If you follow the instructions, it will setup a db with test
>>> >> data and reproduce the problem quickly.
>>> >>
>>> >> There's also the sphinx generated SQL and EXPLAIN:
>>> >> https://github.com/crossroads/rails3-ts-example#what-sphinx-is-doing
>>> >>
>>> >> Thanks in advance for anyone taking the time to read.
>>> >>
>>> >> Regards,
>>> >> Steve
>>> >>
>>> >> --
>>> >> You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group.
>>> >> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>> >> To post to this group, send email to thinkin...@googlegroups.com.
>>> >> Visit this group at http://groups.google.com/group/thinking-sphinx.
>>> >> For more options, visit https://groups.google.com/groups/opt_out.
