To be clear: the join is appearing twice (incorrectly) instead of once (for the attributes that have been there a while)?
On 22/07/2013, at 11:59 PM, Daniel Vandersluis wrote: > Alright so using the updated version (I just pulled the master version, I saw > there were some other fixes that looked good) does create a proper query that > sphinx can work with! I'm not really sure what different using the query > source should make, but searching still seems fast and indexing was much > faster. > > I noticed, however, that a join is still being created for the column in the > main sql_query despite no longer being necessary. > > On Saturday, July 20, 2013 11:41:09 PM UTC-4, Daniel Vandersluis wrote: > Thanks for your help with this, Pat - I will try out the updated version ASAP > (probably Monday morning) and let you know! > > On Saturday, July 20, 2013 11:11:59 PM UTC-4, Pat Allan wrote: > Hi Daniel > > Just pushed a fix for this - the GROUP_CONCAT certainly shouldn't be in > sql_query. > > If you want to use the latest: > > gem 'thinking-sphinx', '~> 3.0.4', > :git => 'git://github.com/pat/thinking-sphinx.git', > :branch => 'master', > :ref => '578ae397e7' > > Appreciate your patience with this. > > Cheers > > -- > Pat > > On 19/07/2013, at 5:15 AM, Daniel Vandersluis wrote: > > > Seems to be because GROUP_CONCAT(job_applications_users.`job_id` SEPARATOR > > ',') AS `job_ids` was still being included in the sql_query line... is > > there another option I need to remove it? > > > > > > On Thursday, July 18, 2013 3:03:01 PM UTC-4, Daniel Vandersluis wrote: > > Hmm... when I try that I get this error: > > > > indexing index 'user_core'... > > ERROR: index 'user_core': multi-valued attribute 'job_ids' of wrong > > source-type found in query; must be 'field'. > > > > The line in the generated config file looks like this: > > > > sql_attr_multi = uint job_ids from query; SELECT > > `job_applications`.`user_id` * 9 + 1 AS `id`, `job_applications`.`job_id` > > AS `job_ids` FROM `job_applications` > > > > On Thursday, July 18, 2013 2:53:29 PM UTC-4, Daniel Vandersluis wrote: > > I will try out source: :query, thanks. > > > > In terms of last_job_application, it doesn't use an ORDER, there is a > > last_job_application_id foreign key on the model. > > > > On Thursday, July 18, 2013 2:16:03 PM UTC-4, Pat Allan wrote: > > Hi Daniel > > > > As for this issue: the short answer is I'm not entirely sure the index > > definition will always give you the results you're after. Although you do > > seem to be using MySQL, and that can be a little carefree with its approach > > to SQL results. > > > > You're asking for the 'last job application' - but ORDER clauses can't > > apply to joins, and so there's no guarantee that the job application in > > question for the three attributes is the 'last' one. > > > > That said, a potential work-around: change the new attribute definition so > > it's in a separate query: > > > > has job_applications.job_id, as: :job_ids, facet: true, source: :query > > > > That should remove the second join - and while it will involve a separate > > query, it may perform much better that way. > > > > Give it a shot. > > > > -- > > Pat > > > > On 18/07/2013, at 12:59 AM, Daniel Vandersluis wrote: > > > > > Index: > > > > > > define_index do > > > indexes 'TRIM(LOWER(first_name))', as: :first_name, :sortable => true > > > indexes 'TRIM(LOWER(last_name))', as: :last_name, :sortable => true > > > indexes email, :sortable => true > > > indexes resumes.document, :as => "document", :sortable => true > > > > > > has :id, :as => :user_id > > > has client_id > > > has updated_at > > > has is_internal > > > > > > has location.country_id, as: "country_id", facet: true > > > has location.state_id, as: "state_id" > > > has location.city_id, as: "city_id" > > > has 'RADIANS(locations.latitude)', as: :lat, type: :float > > > has 'RADIANS(locations.longitude)', as: :lng, type: :float > > > > > > has last_job_application.source.source_type_id, as: "source_type_id" > > > has last_job_application.source_id, as: "source_id", facet: true > > > has last_job_application.application_status_id, as: > > > "application_status_id" > > > > > > has tags(:id), as: "tag_ids", facet: true > > > has profiles(:profile_type_id), as: "profile_type_ids", facet: true > > > has job_applications(:job_id), as: "job_ids", facet: true # THIS IS > > > THE NEW ATTRIBUTE !!! > > > > > > has candidate_answers(:answer_id), as: "candidate_answer_ids" > > > > > > set_property latitude_attr: "lat" > > > set_property longitude_attr: "lng" > > > > > > # By default due to our has many relationship with resumes documents, > > > MySQL only returns > > > # the first 1024 characters of the document. This enforces the size > > > during indexing. > > > set_property group_concat_max_len: 500000 > > > end > > > > > > Models: > > > > > > class User < ActiveRecord::Base > > > belongs_to :last_job_application, class_name: > > > 'JobApplication' > > > has_many :resumes > > > has_many :job_applications, dependent: :destroy, > > > autosave: true > > > has_many :candidate_answers, through: :job_applications > > > has_many :candidate_profiles, dependent: :delete_all > > > has_many :candidate_tags_candidate_users > > > has_many :candidate_tags, through: > > > :candidate_tags_candidate_users, source: :candidate_tag, dependent: > > > :delete_all > > > end > > > > > > class JobApplication < ActiveRecord::Base > > > belongs_to :user > > > belongs_to :application_status > > > belongs_to :job, counter_cache: true > > > belongs_to :source > > > end > > > > > > Please let me know if there's any other model you'd like to see, thanks! > > > > > > On Tuesday, July 16, 2013 8:49:04 PM UTC-4, Pat Allan wrote: > > > Can you show us the index definition and the related associations? > > > > > > On 17/07/2013, at 5:02 AM, Daniel Vandersluis wrote: > > > > > > > I did an explain query, like I said, all the joins are indexed (either > > > > as primary keys or actual indexes). In both cases (before and after > > > > adding the attribute), it takes milliseconds to execute the query, but > > > > a while before all data has been collected. > > > > > > > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ > > > > > > > > | id | select_type | table | type | > > > > possible_keys | key > > > > | key_len | ref > > > > | rows | Extra | > > > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ > > > > > > > > | 1 | SIMPLE | users | index | NULL > > > > | PRIMARY > > > > | 4 | NULL | 23754 > > > > | | > > > > | 1 | SIMPLE | documents | ref | > > > > index_documents_on_user_id | index_documents_on_user_id > > > > | 5 | prod-2013-05-28.users.id > > > > | 1 | | > > > > | 1 | SIMPLE | locations | eq_ref | PRIMARY > > > > | PRIMARY > > > > | 4 | prod-2013-05-28.users.location_id | 1 > > > > | | > > > > | 1 | SIMPLE | job_applications | eq_ref | PRIMARY > > > > | PRIMARY > > > > | 4 | prod-2013-05-28.users.last_job_application_id | 1 > > > > | | > > > > | 1 | SIMPLE | sources | eq_ref | PRIMARY > > > > | PRIMARY > > > > | 4 | prod-2013-05-28.job_applications.source_id | 1 > > > > | | > > > > | 1 | SIMPLE | tags_users | ref | > > > > index_tags_on_user_id | index_tags_on_user_id > > > > | 5 | prod-2013-05-28.users.id > > > > | 1 | | > > > > | 1 | SIMPLE | tags | eq_ref | PRIMARY > > > > | PRIMARY > > > > | 4 | prod-2013-05-28.tags_users.tag_id | 1 > > > > | Using index | > > > > | 1 | SIMPLE | profiles | ref | > > > > index_profiles_on_user_id_and_profile_type | > > > > index_profiles_on_user_id_and_profile_type | 5 | > > > > prod-2013-05-28.users.id | 1 | Using index | > > > > | 1 | SIMPLE | job_applications_users | ref | > > > > index_job_applications_on_user_id | > > > > index_job_applications_on_user_id | 5 | > > > > prod-2013-05-28.users.id | 1 | | > > > > | 1 | SIMPLE | job_applications_users_join | ref | > > > > index_job_applications_on_user_id | > > > > index_job_applications_on_user_id | 5 | > > > > prod-2013-05-28.users.id | 1 | Using index | > > > > | 1 | SIMPLE | candidate_answers | ref | > > > > uidx_on_candidate_answers | uidx_on_candidate_answers > > > > | 5 | > > > > prod-2013-05-28.job_applications_users_join.id | 8 | Using index | > > > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ > > > > > > > > > > > > The join is exactly the same except for the alias: > > > > > > > > LEFT OUTER JOIN `job_applications` `job_applications_users` > > > > ON `job_applications_users`.`user_id` = `users`.`id` > > > > LEFT OUTER JOIN `job_applications` `job_applications_users_join` > > > > ON `job_applications_users_join`.`user_id` = `users`.`id` > > > > > > > > > > > > > > > > On Tuesday, July 16, 2013 2:12:02 PM UTC-4, Pat Allan wrote: > > > > Hi Daniel > > > > Any slowness in indexing is going to be related to the generated SQL > > > > query - when you say the query itself seems fast, how are you comparing > > > > it? I'd recommend running it through EXPLAIN to get some idea of what > > > > could be causing it to be slow. > > > > > > > > Is there any way in which the join is different beyond the aliased > > > > name? > > > > > > > > As for upgrading - I'm not sure if there's been any changes related to > > > > query generation, but using the latest releases is always recommended > > > > (in this case, 2.1.0). > > > > > > > > Cheers > > > > > > > > -- > > > > Pat > > > > > > > > > > > > On 17/07/2013, at 3:09 AM, Daniel Vandersluis wrote: > > > > > > > > > Is there any reason an index would suddenly take 3x as long to index > > > > > after adding an extra has_many attribute to the index definition? The > > > > > query itself is completely indexed, and takes about 1.5ms to run > > > > > (plus data collection time, there are about 200k records in the main > > > > > table that is being indexed, plus a bunch of attributes - sphinx > > > > > reports it as a 466MB index). Prior to adding the extra attribute, > > > > > indexing took about 9 minutes, and now it takes 29. The new attribute > > > > > averages just over 1 record per row, with a maximum of 78. > > > > > > > > > > Possibly related is that adding the new attribute causes the query TS > > > > > generates to add a duplicate join (with a different alias) to a join > > > > > that's added by a different attribute (however as mentioned the query > > > > > itself seems to be fast). > > > > > > > > > > I'm using ThinkingSphinx 2.0.11 currently - would updating to TS3 > > > > > help? > > > > > > > > > > -- > > > > > You received this message because you are subscribed to the Google > > > > > Groups "Thinking Sphinx" group. > > > > > To unsubscribe from this group and stop receiving emails from it, > > > > > send an email to [email protected]. > > > > > To post to this group, send email to [email protected]. > > > > > Visit this group at http://groups.google.com/group/thinking-sphinx. > > > > > For more options, visit https://groups.google.com/groups/opt_out. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > You received this message because you are subscribed to the Google > > > > Groups "Thinking Sphinx" group. > > > > To unsubscribe from this group and stop receiving emails from it, send > > > > an email to [email protected]. > > > > To post to this group, send email to [email protected]. > > > > Visit this group at http://groups.google.com/group/thinking-sphinx. > > > > For more options, visit https://groups.google.com/groups/opt_out. > > > > > > > > > > > > > > > > > > > > -- > > > You received this message because you are subscribed to the Google Groups > > > "Thinking Sphinx" group. > > > To unsubscribe from this group and stop receiving emails from it, send an > > > email to [email protected]. > > > To post to this group, send email to [email protected]. > > > Visit this group at http://groups.google.com/group/thinking-sphinx. > > > For more options, visit https://groups.google.com/groups/opt_out. > > > > > > > > > > > > > > -- > > You received this message because you are subscribed to the Google Groups > > "Thinking Sphinx" group. > > To unsubscribe from this group and stop receiving emails from it, send an > > email to [email protected]. > > To post to this group, send email to [email protected]. > > Visit this group at http://groups.google.com/group/thinking-sphinx. > > For more options, visit https://groups.google.com/groups/opt_out. > > > > > > > > -- > You received this message because you are subscribed to the Google Groups > "Thinking Sphinx" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To post to this group, send email to [email protected]. > Visit this group at http://groups.google.com/group/thinking-sphinx. > For more options, visit https://groups.google.com/groups/opt_out. > > -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/thinking-sphinx. For more options, visit https://groups.google.com/groups/opt_out.
