Seems to be because GROUP_CONCAT(job_applications_users.`job_id` SEPARATOR ',') AS `job_ids` was still being included in the sql_query line... is there another option I need to remove it?
On Thursday, July 18, 2013 3:03:01 PM UTC-4, Daniel Vandersluis wrote: > > Hmm... when I try that I get this error: > > indexing index 'user_core'... > ERROR: index 'user_core': multi-valued attribute 'job_ids' of wrong > source-type found in query; must be 'field'. > > The line in the generated config file looks like this: > > sql_attr_multi = uint job_ids from query; SELECT > `job_applications`.`user_id` * 9 + 1 AS `id`, `job_applications`.`job_id` > AS `job_ids` FROM `job_applications` > > On Thursday, July 18, 2013 2:53:29 PM UTC-4, Daniel Vandersluis wrote: >> >> I will try out source: :query, thanks. >> >> In terms of last_job_application, it doesn't use an ORDER, there is a >> last_job_application_id foreign key on the model. >> >> On Thursday, July 18, 2013 2:16:03 PM UTC-4, Pat Allan wrote: >>> >>> Hi Daniel >>> >>> As for this issue: the short answer is I'm not entirely sure the index >>> definition will always give you the results you're after. Although you do >>> seem to be using MySQL, and that can be a little carefree with its approach >>> to SQL results. >>> >>> You're asking for the 'last job application' - but ORDER clauses can't >>> apply to joins, and so there's no guarantee that the job application in >>> question for the three attributes is the 'last' one. >>> >>> That said, a potential work-around: change the new attribute definition >>> so it's in a separate query: >>> >>> has job_applications.job_id, as: :job_ids, facet: true, source: :query >>> >>> That should remove the second join - and while it will involve a >>> separate query, it may perform much better that way. >>> >>> Give it a shot. >>> >>> -- >>> Pat >>> >>> On 18/07/2013, at 12:59 AM, Daniel Vandersluis wrote: >>> >>> > Index: >>> > >>> > define_index do >>> > indexes 'TRIM(LOWER(first_name))', as: :first_name, :sortable => >>> true >>> > indexes 'TRIM(LOWER(last_name))', as: :last_name, :sortable => >>> true >>> > indexes email, :sortable => true >>> > indexes resumes.document, :as => "document", :sortable => true >>> > >>> > has :id, :as => :user_id >>> > has client_id >>> > has updated_at >>> > has is_internal >>> > >>> > has location.country_id, as: "country_id", facet: true >>> > has location.state_id, as: "state_id" >>> > has location.city_id, as: "city_id" >>> > has 'RADIANS(locations.latitude)', as: :lat, type: :float >>> > has 'RADIANS(locations.longitude)', as: :lng, type: :float >>> > >>> > has last_job_application.source.source_type_id, as: >>> "source_type_id" >>> > has last_job_application.source_id, as: "source_id", facet: true >>> > has last_job_application.application_status_id, as: >>> "application_status_id" >>> > >>> > has tags(:id), as: "tag_ids", facet: true >>> > has profiles(:profile_type_id), as: "profile_type_ids", facet: >>> true >>> > has job_applications(:job_id), as: "job_ids", facet: true # THIS >>> IS THE NEW ATTRIBUTE !!! >>> > >>> > has candidate_answers(:answer_id), as: "candidate_answer_ids" >>> > >>> > set_property latitude_attr: "lat" >>> > set_property longitude_attr: "lng" >>> > >>> > # By default due to our has many relationship with resumes >>> documents, MySQL only returns >>> > # the first 1024 characters of the document. This enforces the >>> size during indexing. >>> > set_property group_concat_max_len: 500000 >>> > end >>> > >>> > Models: >>> > >>> > class User < ActiveRecord::Base >>> > belongs_to :last_job_application, class_name: >>> 'JobApplication' >>> > has_many :resumes >>> > has_many :job_applications, dependent: :destroy, >>> autosave: true >>> > has_many :candidate_answers, through: >>> :job_applications >>> > has_many :candidate_profiles, dependent: :delete_all >>> > has_many :candidate_tags_candidate_users >>> > has_many :candidate_tags, through: >>> :candidate_tags_candidate_users, source: :candidate_tag, dependent: >>> :delete_all >>> > end >>> > >>> > class JobApplication < ActiveRecord::Base >>> > belongs_to :user >>> > belongs_to :application_status >>> > belongs_to :job, counter_cache: true >>> > belongs_to :source >>> > end >>> > >>> > Please let me know if there's any other model you'd like to see, >>> thanks! >>> > >>> > On Tuesday, July 16, 2013 8:49:04 PM UTC-4, Pat Allan wrote: >>> > Can you show us the index definition and the related associations? >>> > >>> > On 17/07/2013, at 5:02 AM, Daniel Vandersluis wrote: >>> > >>> > > I did an explain query, like I said, all the joins are indexed >>> (either as primary keys or actual indexes). In both cases (before and after >>> adding the attribute), it takes milliseconds to execute the query, but a >>> while before all data has been collected. >>> > > >>> > > >>> +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ >>> >>> >>> > > | id | select_type | table | type | >>> possible_keys | key >>> | key_len | ref | >>> rows | Extra | >>> > > >>> +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ >>> >>> >>> > > | 1 | SIMPLE | users | index | NULL >>> | PRIMARY >>> | 4 | NULL | 23754 | >>> | >>> > > | 1 | SIMPLE | documents | ref | >>> index_documents_on_user_id | index_documents_on_user_id >>> | 5 | prod-2013-05-28.users.id >>> | 1 | | >>> > > | 1 | SIMPLE | locations | eq_ref | PRIMARY >>> | PRIMARY >>> | 4 | prod-2013-05-28.users.location_id | 1 | >>> | >>> > > | 1 | SIMPLE | job_applications | eq_ref | PRIMARY >>> | PRIMARY >>> | 4 | prod-2013-05-28.users.last_job_application_id | 1 | >>> | >>> > > | 1 | SIMPLE | sources | eq_ref | PRIMARY >>> | PRIMARY >>> | 4 | prod-2013-05-28.job_applications.source_id | 1 | >>> | >>> > > | 1 | SIMPLE | tags_users | ref | >>> index_tags_on_user_id | index_tags_on_user_id >>> | 5 | prod-2013-05-28.users.id >>> | 1 | | >>> > > | 1 | SIMPLE | tags | eq_ref | PRIMARY >>> | PRIMARY >>> | 4 | prod-2013-05-28.tags_users.tag_id | 1 | >>> Using index | >>> > > | 1 | SIMPLE | profiles | ref | >>> index_profiles_on_user_id_and_profile_type | >>> index_profiles_on_user_id_and_profile_type | 5 | >>> prod-2013-05-28.users.id | 1 | Using index | >>> > > | 1 | SIMPLE | job_applications_users | ref | >>> index_job_applications_on_user_id | >>> index_job_applications_on_user_id | 5 | >>> prod-2013-05-28.users.id | 1 | | >>> > > | 1 | SIMPLE | job_applications_users_join | ref | >>> index_job_applications_on_user_id | >>> index_job_applications_on_user_id | 5 | >>> prod-2013-05-28.users.id | 1 | Using index | >>> > > | 1 | SIMPLE | candidate_answers | ref | >>> uidx_on_candidate_answers | uidx_on_candidate_answers >>> | 5 | prod-2013-05-28.job_applications_users_join.id| >>> 8 | Using index | >>> > > >>> +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+ >>> >>> >>> > > >>> > > The join is exactly the same except for the alias: >>> > > >>> > > LEFT OUTER JOIN `job_applications` `job_applications_users` >>> > > ON `job_applications_users`.`user_id` = `users`.`id` >>> > > LEFT OUTER JOIN `job_applications` `job_applications_users_join` >>> > > ON `job_applications_users_join`.`user_id` = `users`.`id` >>> > > >>> > > >>> > > >>> > > On Tuesday, July 16, 2013 2:12:02 PM UTC-4, Pat Allan wrote: >>> > > Hi Daniel >>> > > Any slowness in indexing is going to be related to the generated SQL >>> query - when you say the query itself seems fast, how are you comparing it? >>> I'd recommend running it through EXPLAIN to get some idea of what could be >>> causing it to be slow. >>> > > >>> > > Is there any way in which the join is different beyond the aliased >>> name? >>> > > >>> > > As for upgrading - I'm not sure if there's been any changes related >>> to query generation, but using the latest releases is always recommended >>> (in this case, 2.1.0). >>> > > >>> > > Cheers >>> > > >>> > > -- >>> > > Pat >>> > > >>> > > >>> > > On 17/07/2013, at 3:09 AM, Daniel Vandersluis wrote: >>> > > >>> > > > Is there any reason an index would suddenly take 3x as long to >>> index after adding an extra has_many attribute to the index definition? The >>> query itself is completely indexed, and takes about 1.5ms to run (plus data >>> collection time, there are about 200k records in the main table that is >>> being indexed, plus a bunch of attributes - sphinx reports it as a 466MB >>> index). Prior to adding the extra attribute, indexing took about 9 minutes, >>> and now it takes 29. The new attribute averages just over 1 record per row, >>> with a maximum of 78. >>> > > > >>> > > > Possibly related is that adding the new attribute causes the query >>> TS generates to add a duplicate join (with a different alias) to a join >>> that's added by a different attribute (however as mentioned the query >>> itself seems to be fast). >>> > > > >>> > > > I'm using ThinkingSphinx 2.0.11 currently - would updating to TS3 >>> help? >>> > > > >>> > > > -- >>> > > > You received this message because you are subscribed to the Google >>> Groups "Thinking Sphinx" group. >>> > > > To unsubscribe from this group and stop receiving emails from it, >>> send an email to [email protected]. >>> > > > To post to this group, send email to [email protected]. >>> > > > Visit this group at http://groups.google.com/group/thinking-sphinx. >>> >>> > > > For more options, visit https://groups.google.com/groups/opt_out. >>> > > > >>> > > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > -- >>> > > You received this message because you are subscribed to the Google >>> Groups "Thinking Sphinx" group. >>> > > To unsubscribe from this group and stop receiving emails from it, >>> send an email to [email protected]. >>> > > To post to this group, send email to [email protected]. >>> > > Visit this group at http://groups.google.com/group/thinking-sphinx. >>> > > For more options, visit https://groups.google.com/groups/opt_out. >>> > > >>> > > >>> > >>> > >>> > >>> > -- >>> > You received this message because you are subscribed to the Google >>> Groups "Thinking Sphinx" group. >>> > To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> > To post to this group, send email to [email protected]. >>> > Visit this group at http://groups.google.com/group/thinking-sphinx. >>> > For more options, visit https://groups.google.com/groups/opt_out. >>> > >>> > >>> >>> >>> -- You received this message because you are subscribed to the Google Groups "Thinking Sphinx" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/thinking-sphinx. For more options, visit https://groups.google.com/groups/opt_out.
