On Sat, Jan 7, 2012 at 12:20 AM, Linus Pettersson <
linus.petters...@gmail.com> wrote:

> Wow! Thank you for all the help Peter! I really appreciate it.


No prob, thank you for getting closer to the root cause.


> I will test that code tomorrow as it's getting quite late here.
>
> I did a small test now though. It seems that you can't mix joins and
> includes.
>

Indeed, strange ...


>
> This doesn't work:
> Category.includes(:subcategories).joins("INNER JOIN resellercategories AS
> r ON subcategories.id = r.s
> ubcategory_id").where("subcategories.id > 0")
>
> "SELECT "categories".* FROM "categories" INNER JOIN resellercategories AS
> r ON subcategories.id =
> r.subcategory_id WHERE (subcategories.id > 0)
> ActiveRecord::StatementInvalid: SQLite3::SQLException: no such column:
> subcategories.id: SELECT "categories".* FROM "categ
> ories" INNER JOIN resellercategories AS r ON subcategories.id =
> r.subcategory_id WHERE (subcategories.id > 0)"
>
>
> But this do work:
> Category.includes(:subcategories).where("subcategories.id > 0")
>
> "SELECT "categories"."id" AS t0_r0, "categories"."name" AS t0_r1,
> "categories"."created_at" AS t0_r2, "categ
> ories"."updated_at" AS t0_r3, "categories"."permalink" AS t0_r4,
> "subcategories"."id" AS t1_r0, "subcategories"."name" AS
> t1_r1, "subcategories"."category_id" AS t1_r2,
> "subcategories"."created_at" AS t1_r3, "subcategories"."updated_at" AS t1_r
> 4, "subcategories"."permalink" AS t1_r5 FROM "categories" LEFT OUTER JOIN
> "subcategories" ON "subcategories"."category_id"
>  = "categories"."id" WHERE (subcategories.id > 0)"
>
>
Indeed. I added a Project class with User has_many :projects to my test
project and can confirm:

010:0> a = Account.includes(:users).where("users.name like 'peter'")
  SQL (1.1ms)  SELECT "accounts"."id" AS t0_r0, "accounts"."number" AS
t0_r1, "accounts"."created_at" AS t0_r2, "accounts"."updated_at" AS t0_r3,
"users"."id" AS t1_r0, "users"."name" AS t1_r1, "users"."account_id" AS
t1_r2, "users"."created_at" AS t1_r3, "users"."updated_at" AS t1_r4 FROM
"accounts" LEFT OUTER JOIN "users" ON "users"."account_id" =
"accounts"."id" WHERE (users.name like 'peter')
=> [#<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">]
011:0> a1 = a.first
=> #<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">
012:0> a1.users.first
=> #<User id: 1, name: "peter", account_id: 1, created_at: "2011-12-01
22:24:16", updated_at: "2011-12-01 22:24:16">

This works as expected. 1 large query for accounts and associated users.

013:0> a1.users.first.projects
  Project Load (1.0ms)  SELECT "projects".* FROM "projects" WHERE
"projects"."user_id" = 1
=> [#<Project id: 1, name: "project_of_peter", user_id: 1, created_at:
"2012-01-07
10:51:51", updated_at: "2012-01-07 10:51:51">]

Also as expected. Since "projects" was not joined or included in the
original query,
we need a second query here (which is then not filtered and thus not the
results we want).

014:0> a = Account.includes(:users).where("users.name like
'peter'").joins(:users => :projects)
  Account Load (1.3ms)  SELECT "accounts".* FROM "accounts" INNER JOIN
"users" ON "users"."account_id" = "accounts"."id" INNER JOIN "projects" ON
"projects"."user_id" = "users"."id" WHERE (users.name like 'peter')
  User Load (0.9ms)  SELECT "users".* FROM "users" WHERE
"users"."account_id" IN (1)
=> [#<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">]

This is the getting closer to the root cause as you discovered now. As soon
as there is a "joins" added,
the :includes fails to Eagerly load only the _filtered_ users in a single
complex query, but goes to a second
query (that is not filtered on users; that would be "not filtered on
subcategories" in your code).

015:0> a = Account.includes(:users).where("users.name like
'peter'").includes(:users => :projects)
  SQL (1.5ms)  SELECT "accounts"."id" AS t0_r0, "accounts"."number" AS
t0_r1, "accounts"."created_at" AS t0_r2, "accounts"."updated_at" AS t0_r3,
"users"."id" AS t1_r0, "users"."name" AS t1_r1, "users"."account_id" AS
t1_r2, "users"."created_at" AS t1_r3, "users"."updated_at" AS t1_r4,
"projects"."id" AS t2_r0, "projects"."name" AS t2_r1, "projects"."user_id"
AS t2_r2, "projects"."created_at" AS t2_r3, "projects"."updated_at" AS
t2_r4 FROM "accounts" LEFT OUTER JOIN "users" ON "users"."account_id" =
"accounts"."id" LEFT OUTER JOIN "projects" ON "projects"."user_id" =
"users"."id" WHERE (users.name like 'peter')
=> [#<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">]
016:0> a.size
=> 1
017:0> a.first.users
=> [#<User id: 1, name: "peter", account_id: 1, created_at: "2011-12-01
22:24:16", updated_at: "2011-12-01 22:24:16">]
018:0> a.first.users.first.projects
=> [#<Project id: 1, name: "project_of_peter", user_id: 1, created_at:
"2012-01-07
10:51:51", updated_at: "2012-01-07 10:51:51">]

So, using :includes everywhere seems to not trigger this problem.

I don't know if that is an acceptable fix for your project (depending on
much
unneeded data that loads for you)?

I don't know if this is a "bug" or just misunderstanding the behavior of
:joins
and :includes in Eager loading from our side?

If I assume the latter (us not understanding it well enough), then
conclusions
could be:
* :includes is just a convenience for eager loading (performance
improvement)
  and not designed to work together with joins in the way we expected
* trying to use has_many relationships with conditions may be a solution ...

First added a "lang" column to projects and trying a condition on the
projects table
(fails in the same way):

012:0> a = Account.includes(:users).joins(:users =>
:projects).where(['projects.lang = ?', 'ruby'])
  Account Load (1.2ms)  SELECT "accounts".* FROM "accounts" INNER JOIN
"users" ON "users"."account_id" = "accounts"."id" INNER JOIN "projects" ON
"projects"."user_id" = "users"."id" WHERE (projects.lang = 'ruby')
  User Load (0.5ms)  SELECT "users".* FROM "users" WHERE
"users"."account_id" IN (1)
=> [#<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">]

This is still correct. Only 1 account with id 1 has a user (peter) with a
project with lang "ruby".

013:0> a1 = a.first
=> #<Account id: 1, number: "123", created_at: "2011-12-01 22:24:16",
updated_at: "2011-12-01 22:24:16">

Getting that first (and only) account from the array.

014:0> a1.users
=> [#<User id: 1, name: "peter", account_id: 1, created_at: "2011-12-01
22:24:16", updated_at: "2012-01-07 11:28:20", gender: "M">, #<User id: 3,
name: "Sarah", account_id: 1, created_at: "2012-01-06 21:23:22",
updated_at: "2012-01-07 11:28:24", gender: "F">]

But this is not what we want. We get all users from account 1, not just the
users with a "ruby" project.

015:0> a1.users.first.projects
  Project Load (0.9ms)  SELECT "projects".* FROM "projects" WHERE
"projects"."user_id" = 1
=> [#<Project id: 1, name: "project_of_peter", user_id: 1, created_at:
"2012-01-07
10:51:51", updated_at: "2012-01-07 11:38:15", lang: "ruby">, #<Project id:
2, name: "coffeescript_project", user_id: 1, created_at: "2012-01-07 11:39:10",
updated_at: "2012-01-07 11:39:10", lang: "coffee">]

And similar problem, for user 'peter' we get all projects, also the ones in
a different language.

016:0> a1.users[1].projects
  Project Load (0.8ms)  SELECT "projects".* FROM "projects" WHERE
"projects"."user_id" = 3
=> []

And user 'sarah' has not projects at all, but still in the list of users.


OK, trying another solution ...

class Account < ActiveRecord::Base
  has_many :users

  attr_accessor :lang

  has_many :users_by_project_lang,
           :class_name => "User",
           :include => :projects,
           :conditions => Proc.new { ["projects.lang = ?", lang] }
end

I add a special has_many relationship on the Account class
(that would be your Category class). Now using that to
find the "users" LIMITED to a certain project_lang
(in your case that would be subcategories, limited to a certain
product.gender).

018:0> a1.lang='ruby'
=> "ruby"

The conditions will be late evaluated against the "self" (that is a1 here).
So
we need to set it on a1.

019:0> a1.users_by_project_lang
  SQL (1.5ms)  SELECT "users"."id" AS t0_r0, "users"."name" AS t0_r1,
"users"."account_id" AS t0_r2, "users"."created_at" AS t0_r3,
"users"."updated_at" AS t0_r4, "users"."gender" AS t0_r5, "projects"."id"
AS t1_r0, "projects"."name" AS t1_r1, "projects"."user_id" AS t1_r2,
"projects"."created_at" AS t1_r3, "projects"."updated_at" AS t1_r4,
"projects"."lang" AS t1_r5 FROM "users" LEFT OUTER JOIN "projects" ON
"projects"."user_id" = "users"."id" WHERE "users"."account_id" = 1 AND
(projects.lang = 'ruby')
=> [#<User id: 1, name: "peter", account_id: 1, created_at: "2011-12-01
22:24:16", updated_at: "2012-01-07 11:28:20", gender: "M">]

Now we get back our "complex" query, filtered on a parameter of the the
joined products table.

021:0> a1.users_by_project_lang.each{|user| puts "user = #{user} with
projects #{user.projects}"}
user = #<User:0x94cd7c8> with projects [#<Project id: 1, name:
"project_of_peter", user_id: 1, created_at: "2012-01-07 10:51:51",
updated_at: "2012-01-07 11:38:15", lang: "ruby">]
=> [#<User id: 1, name: "peter", account_id: 1, created_at: "2011-12-01
22:24:16", updated_at: "2012-01-07 11:28:20", gender: "M">]

And now, only the correct users ('peter') with the filtered projects (with
lang 'ruby') are returned.

So, I presume, if you make a similar has_many in Category

has_many :subcategories_by_product_gender

this would work.

But ... since the has_many only has a :include (and not a :joins) options,
this seems
equivalent to the first option of using all :includes for the entire chain
(and
no :joins). So, in reality, it does not seem to be better than that first
option
(which has the disavantage of loading all the associated info, even if you
so not need it).

An alternative might be to make a dedicated has_many and use :finder_sql
to hand-code the SQL ...

http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html

":finder_sql
Specify a complete SQL statement to fetch the association. This is a good
way
to go for complex associations that depend on multiple tables."

It seems this feature request seems complex enough to warrant that.

I would certainly be interested in a "proper" solution for this issue
(using a
mix of :includes and :joins, where the :includes are eagerly loaded in
the SELECT part in the first query (so these results are filtered) and the
:joins are only used to eveluate conditions or ordering, etc. but not have
to be included in the SELECT part).

I looked into scopes, but I failed to find a proper way to apply a scope to
a has_many relationship ... (maybe with_scope {}, but that seems to be
depricated ?).

HTH,

Peter

-- 
Peter Vandenabeele
http://twitter.com/peter_v
http://rails.vandenabeele.com

-- 
You received this message because you are subscribed to the Google Groups "Ruby 
on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk@googlegroups.com.
To unsubscribe from this group, send email to 
rubyonrails-talk+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/rubyonrails-talk?hl=en.

Reply via email to