Hi Daniel
Just pushed a fix for this - the GROUP_CONCAT certainly shouldn't be in
sql_query.
If you want to use the latest:
gem 'thinking-sphinx', '~> 3.0.4',
:git => 'git://github.com/pat/thinking-sphinx.git',
:branch => 'master',
:ref => '578ae397e7'
Appreciate your patience with this.
Cheers
--
Pat
On 19/07/2013, at 5:15 AM, Daniel Vandersluis wrote:
> Seems to be because GROUP_CONCAT(job_applications_users.`job_id` SEPARATOR
> ',') AS `job_ids` was still being included in the sql_query line... is there
> another option I need to remove it?
>
>
> On Thursday, July 18, 2013 3:03:01 PM UTC-4, Daniel Vandersluis wrote:
> Hmm... when I try that I get this error:
>
> indexing index 'user_core'...
> ERROR: index 'user_core': multi-valued attribute 'job_ids' of wrong
> source-type found in query; must be 'field'.
>
> The line in the generated config file looks like this:
>
> sql_attr_multi = uint job_ids from query; SELECT `job_applications`.`user_id`
> * 9 + 1 AS `id`, `job_applications`.`job_id` AS `job_ids` FROM
> `job_applications`
>
> On Thursday, July 18, 2013 2:53:29 PM UTC-4, Daniel Vandersluis wrote:
> I will try out source: :query, thanks.
>
> In terms of last_job_application, it doesn't use an ORDER, there is a
> last_job_application_id foreign key on the model.
>
> On Thursday, July 18, 2013 2:16:03 PM UTC-4, Pat Allan wrote:
> Hi Daniel
>
> As for this issue: the short answer is I'm not entirely sure the index
> definition will always give you the results you're after. Although you do
> seem to be using MySQL, and that can be a little carefree with its approach
> to SQL results.
>
> You're asking for the 'last job application' - but ORDER clauses can't apply
> to joins, and so there's no guarantee that the job application in question
> for the three attributes is the 'last' one.
>
> That said, a potential work-around: change the new attribute definition so
> it's in a separate query:
>
> has job_applications.job_id, as: :job_ids, facet: true, source: :query
>
> That should remove the second join - and while it will involve a separate
> query, it may perform much better that way.
>
> Give it a shot.
>
> --
> Pat
>
> On 18/07/2013, at 12:59 AM, Daniel Vandersluis wrote:
>
> > Index:
> >
> > define_index do
> > indexes 'TRIM(LOWER(first_name))', as: :first_name, :sortable => true
> > indexes 'TRIM(LOWER(last_name))', as: :last_name, :sortable => true
> > indexes email, :sortable => true
> > indexes resumes.document, :as => "document", :sortable => true
> >
> > has :id, :as => :user_id
> > has client_id
> > has updated_at
> > has is_internal
> >
> > has location.country_id, as: "country_id", facet: true
> > has location.state_id, as: "state_id"
> > has location.city_id, as: "city_id"
> > has 'RADIANS(locations.latitude)', as: :lat, type: :float
> > has 'RADIANS(locations.longitude)', as: :lng, type: :float
> >
> > has last_job_application.source.source_type_id, as: "source_type_id"
> > has last_job_application.source_id, as: "source_id", facet: true
> > has last_job_application.application_status_id, as:
> > "application_status_id"
> >
> > has tags(:id), as: "tag_ids", facet: true
> > has profiles(:profile_type_id), as: "profile_type_ids", facet: true
> > has job_applications(:job_id), as: "job_ids", facet: true # THIS IS THE
> > NEW ATTRIBUTE !!!
> >
> > has candidate_answers(:answer_id), as: "candidate_answer_ids"
> >
> > set_property latitude_attr: "lat"
> > set_property longitude_attr: "lng"
> >
> > # By default due to our has many relationship with resumes documents,
> > MySQL only returns
> > # the first 1024 characters of the document. This enforces the size
> > during indexing.
> > set_property group_concat_max_len: 500000
> > end
> >
> > Models:
> >
> > class User < ActiveRecord::Base
> > belongs_to :last_job_application, class_name:
> > 'JobApplication'
> > has_many :resumes
> > has_many :job_applications, dependent: :destroy, autosave:
> > true
> > has_many :candidate_answers, through: :job_applications
> > has_many :candidate_profiles, dependent: :delete_all
> > has_many :candidate_tags_candidate_users
> > has_many :candidate_tags, through:
> > :candidate_tags_candidate_users, source: :candidate_tag, dependent:
> > :delete_all
> > end
> >
> > class JobApplication < ActiveRecord::Base
> > belongs_to :user
> > belongs_to :application_status
> > belongs_to :job, counter_cache: true
> > belongs_to :source
> > end
> >
> > Please let me know if there's any other model you'd like to see, thanks!
> >
> > On Tuesday, July 16, 2013 8:49:04 PM UTC-4, Pat Allan wrote:
> > Can you show us the index definition and the related associations?
> >
> > On 17/07/2013, at 5:02 AM, Daniel Vandersluis wrote:
> >
> > > I did an explain query, like I said, all the joins are indexed (either as
> > > primary keys or actual indexes). In both cases (before and after adding
> > > the attribute), it takes milliseconds to execute the query, but a while
> > > before all data has been collected.
> > >
> > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+
> > >
> > > | id | select_type | table | type | possible_keys
> > > | key
> > > | key_len | ref | rows |
> > > Extra |
> > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+
> > >
> > > | 1 | SIMPLE | users | index | NULL
> > > | PRIMARY
> > > | 4 | NULL | 23754 |
> > > |
> > > | 1 | SIMPLE | documents | ref |
> > > index_documents_on_user_id | index_documents_on_user_id
> > > | 5 | prod-2013-05-28.users.id
> > > | 1 | |
> > > | 1 | SIMPLE | locations | eq_ref | PRIMARY
> > > | PRIMARY
> > > | 4 | prod-2013-05-28.users.location_id | 1 |
> > > |
> > > | 1 | SIMPLE | job_applications | eq_ref | PRIMARY
> > > | PRIMARY
> > > | 4 | prod-2013-05-28.users.last_job_application_id | 1 |
> > > |
> > > | 1 | SIMPLE | sources | eq_ref | PRIMARY
> > > | PRIMARY
> > > | 4 | prod-2013-05-28.job_applications.source_id | 1 |
> > > |
> > > | 1 | SIMPLE | tags_users | ref |
> > > index_tags_on_user_id | index_tags_on_user_id
> > > | 5 | prod-2013-05-28.users.id
> > > | 1 | |
> > > | 1 | SIMPLE | tags | eq_ref | PRIMARY
> > > | PRIMARY
> > > | 4 | prod-2013-05-28.tags_users.tag_id | 1 |
> > > Using index |
> > > | 1 | SIMPLE | profiles | ref |
> > > index_profiles_on_user_id_and_profile_type |
> > > index_profiles_on_user_id_and_profile_type | 5 |
> > > prod-2013-05-28.users.id | 1 | Using index |
> > > | 1 | SIMPLE | job_applications_users | ref |
> > > index_job_applications_on_user_id |
> > > index_job_applications_on_user_id | 5 |
> > > prod-2013-05-28.users.id | 1 | |
> > > | 1 | SIMPLE | job_applications_users_join | ref |
> > > index_job_applications_on_user_id |
> > > index_job_applications_on_user_id | 5 |
> > > prod-2013-05-28.users.id | 1 | Using index |
> > > | 1 | SIMPLE | candidate_answers | ref |
> > > uidx_on_candidate_answers | uidx_on_candidate_answers
> > > | 5 | prod-2013-05-28.job_applications_users_join.id
> > > | 8 | Using index |
> > > +----+-------------+-----------------------------+--------+--------------------------------------------+--------------------------------------------+---------+------------------------------------------------+-------+-------------+
> > >
> > >
> > > The join is exactly the same except for the alias:
> > >
> > > LEFT OUTER JOIN `job_applications` `job_applications_users`
> > > ON `job_applications_users`.`user_id` = `users`.`id`
> > > LEFT OUTER JOIN `job_applications` `job_applications_users_join`
> > > ON `job_applications_users_join`.`user_id` = `users`.`id`
> > >
> > >
> > >
> > > On Tuesday, July 16, 2013 2:12:02 PM UTC-4, Pat Allan wrote:
> > > Hi Daniel
> > > Any slowness in indexing is going to be related to the generated SQL
> > > query - when you say the query itself seems fast, how are you comparing
> > > it? I'd recommend running it through EXPLAIN to get some idea of what
> > > could be causing it to be slow.
> > >
> > > Is there any way in which the join is different beyond the aliased name?
> > >
> > > As for upgrading - I'm not sure if there's been any changes related to
> > > query generation, but using the latest releases is always recommended (in
> > > this case, 2.1.0).
> > >
> > > Cheers
> > >
> > > --
> > > Pat
> > >
> > >
> > > On 17/07/2013, at 3:09 AM, Daniel Vandersluis wrote:
> > >
> > > > Is there any reason an index would suddenly take 3x as long to index
> > > > after adding an extra has_many attribute to the index definition? The
> > > > query itself is completely indexed, and takes about 1.5ms to run (plus
> > > > data collection time, there are about 200k records in the main table
> > > > that is being indexed, plus a bunch of attributes - sphinx reports it
> > > > as a 466MB index). Prior to adding the extra attribute, indexing took
> > > > about 9 minutes, and now it takes 29. The new attribute averages just
> > > > over 1 record per row, with a maximum of 78.
> > > >
> > > > Possibly related is that adding the new attribute causes the query TS
> > > > generates to add a duplicate join (with a different alias) to a join
> > > > that's added by a different attribute (however as mentioned the query
> > > > itself seems to be fast).
> > > >
> > > > I'm using ThinkingSphinx 2.0.11 currently - would updating to TS3 help?
> > > >
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "Thinking Sphinx" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send
> > > > an email to [email protected].
> > > > To post to this group, send email to [email protected].
> > > > Visit this group at http://groups.google.com/group/thinking-sphinx.
> > > > For more options, visit https://groups.google.com/groups/opt_out.
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > --
> > > You received this message because you are subscribed to the Google Groups
> > > "Thinking Sphinx" group.
> > > To unsubscribe from this group and stop receiving emails from it, send an
> > > email to [email protected].
> > > To post to this group, send email to [email protected].
> > > Visit this group at http://groups.google.com/group/thinking-sphinx.
> > > For more options, visit https://groups.google.com/groups/opt_out.
> > >
> > >
> >
> >
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Thinking Sphinx" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to [email protected].
> > To post to this group, send email to [email protected].
> > Visit this group at http://groups.google.com/group/thinking-sphinx.
> > For more options, visit https://groups.google.com/groups/opt_out.
> >
> >
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/thinking-sphinx.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
--
You received this message because you are subscribed to the Google Groups
"Thinking Sphinx" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/thinking-sphinx.
For more options, visit https://groups.google.com/groups/opt_out.