Gernot,

        While there may be syntactically better ways to construct your 
particular DM queries, I don't necessarily think you're going about it all 
wrong.  The truth is that the current logic behind generating JOINs for many 
sophisticated (multi-table) join scenarios is plain broken.

The good news is that it just so happens there's a recent/current discussion + 
forked branch or two focused on refactoring the Query/JOIN construction logic.  
The bad news is it probably won't be production ready any time soon.  YMMV, but 
I've personally resigned myself to using raw SQL in these sorts of 
performance-is-important scenarios.

cheers,
--jordan


On Sep 1, 2010, at 6:52 AM, Gernot wrote:

> Hi all
> 
> We recently started a large project on ruby on rails. This project
> contains complex business logic and has a complex data model. We
> decided to go with datamapper instead of activerecord, since I like
> datamapper's features, especially properties and identity map support.
> 
> I am now working on the first query that is a little more complex. The
> models look like this (this is a simplified version of our models with
> generic model names and only those properties that are of interest
> here):
> 
> class Container
> 
>  include DataMapper::Resource
> 
>  property :id,                 Serial
>  property :valid_from,    Date, :required => true, :index => true
>  property :valid_through, Date, :index => true
> 
>  has n, :sub_containers
> 
> end
> 
> class SubContainer
> 
>  include DataMapper::Resource
> 
>  property :id,                 Serial
>  property :valid_from,    Date, :index => true
>  property :valid_through, Date, :index => true
> 
>  belongs_to :container
>  has n, :units
> 
> end
> 
> class SubContainer
> 
>  include DataMapper::Resource
> 
>  property :id,                 Serial
>  property :valid_from,    Date, :index => true
>  property :valid_through, Date, :index => true
> 
>  belongs_to :sub_container
> 
> end
> 
> As you can see, all three classes have properties that describe an
> optional data range where the object ist valid. Now, I want to get all
> units that are valid within a given date range. So I implemented
> scopes on these three classes. valid_though can be nil which means
> that the object is valid infinitely. Each object has to respect the
> valid_through date range of its parent. Here's the code:
> 
> class Container
> 
>  def self.valid_within(daterange)
>    all(:valid_from.lte => daterange.last) & (all(:valid_through.gte
> => daterange.first) | all(:valid_through => nil))
>  end
> 
> end
> 
> class SubContainer
> 
>  def self.valid_within(daterange)
>    (all(:valid_from => nil)  | all(:valid_from.lte =>
> daterange.last)) &
>    (all(:valid_through.gte => daterange.first) | all(:valid_through
> => nil)) &
>    all(:container => Container.valid_through(daterange))
>  end
> 
> end
> 
> class SubContainer
> 
>  def self.valid_within(daterange)
>    (all(:valid_from => nil)  | all(:valid_from.lte =>
> daterange.last)) &
>    (all(:valid_through.gte => daterange.first) | all(:valid_through
> => nil)) &
>    all(:haus => Haus.valid_through(daterange))
>  end
> 
> end
> 
> Now I can query my units like this:
> 
> Unit.valid_through(Date.new(2010,1,1)...Date.new(2010,1,31))
> 
> This query works, but it generates SQL that hits the database hard. In
> real life we will have about 150'000 units, 5'000 subcontainers and
> 1'000 containers. The SQL I was expecting to see would look like this:
> 
> select distinct u.* from units u
>                                 inner join sub_containers sc on 
> u.sub_container_id
> = sc.id
>                                 inner join containers c on sc.container_id = 
> c.id
>                    where
> 
>                      c.valid_from <= '20100131'
>                      and(sc.valid_from is null or sc.valid_from >=
> '20100131')
>                      and(u.valid_from is null or u.valid_from >=
> '20100131')
> 
>                      -- Verwaltungsende darf nicht vor der
> Verarbeitungsperiode liegen
>                      and(c.valid_through is null or c.valid_through
>> = '20100101')
>                      and(sc.valid_through is null or sc.valid_through
>> = '20100101')
>                      and(u.valid_through is null or u.valid_through
>> = '20100101')
> 
> Instead, datamapper generates a query with unnecessary group by
> clauses and sub selects (in) instead of using inner joins. The
> datamapper query is way slower than my handwritten SQL statement. I
> already found out how to get rid of the group by clauses. And I don't
> want to have pure SQL in our code, we want to stay database
> independent. So here are my questions, finally:
> 
> - Is there a way to set the default for group by's (:unique => ?) to
> "false"? I knwo what I am doing and I would like to turn it on only
> when I really need it.
> - Am I doing this all wrong or is there a way to force datamapper to
> use inner joins?
> 
> Thank you very much
> Gernot Kogler
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "DataMapper" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at 
> http://groups.google.com/group/datamapper?hl=en.
> 

-- 
You received this message because you are subscribed to the Google Groups 
"DataMapper" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/datamapper?hl=en.

Reply via email to