Re: Trouble expressing a query in the ORM

2011-09-12 Thread Stuart
On Sep 9, 8:28 am, Pewpewarrows  wrote:
> Tim Shaffer's response would have you doing N+1 queries, and having to loop
> through all of your Target objects in-memory. Technically it would work, but
> as soon as you have a decently sized amount of data in there it'd slow to a
> crawl.
>

I just wanted to chime in and point out that select_related() is a
helpful tool for avoiding n+1 queries in django.

   https://docs.djangoproject.com/en/1.3/ref/models/querysets/#select-related

It may not be particularly suitable for this particular problem (due
to the requirement of bringing back only the most recent property on
each target) but I wanted to mention it for the archives since no one
else had.


--Stuart

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Christophe Pettus

On Sep 9, 2011, at 9:44 AM, Pewpewarrows wrote:

> I don't understand the 10-15 limit either. I've done giant "in" queries in 
> the past that perform fine with large data sets and proper indexing.

You can set the rule of thumb whereever you like; eventually, the curves 
between an IN and a join will cross, however.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Christophe Pettus

On Sep 9, 2011, at 8:41 AM, Simon Riggs wrote:

> Not sure I understand the limitation to 10-15 entries.

That's my rule of thumb as to when I prefer to re-express a query as a join 
rather than an IN.  Larger values can work just fine, but __in in Django tends 
to be abused with gigantic inclusion sets.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Pewpewarrows



The best way I can see to do this straight with the Django ORM is:
>>
>> prop_dates = 
>> Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
>>  
>> flat=True)
>> properties = Property.objects.filter(export_date__in=prop_dates)
>>
>
> Let say I have two targets which I'll call target1 and target2.
>
> Target1 have a property for yesterday and one for today. Target2 only have 
> a property for yesterday. Both yesterday and today will be included in 
> prop_dates since they are the latest for at least one target. And then 
> Target1 will have two entries in properties while it should only have one 
> for today.
>

I don't follow. The first instruction fetches all the Target objects, along 
with the latest Property related to each Target, and only selects those 
Properties' dates. You're saying Target1 has a Property A from yesterday, 
and B from today. Target 2 has a property C from yesterday. Only B and C's 
dates would be returned, *not* A's.

 

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/0oOvXPhajiwJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Pewpewarrows
I don't understand the 10-15 limit either. I've done giant "in" queries in 
the past that perform fine with large data sets and proper indexing.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/wNEMxZhYgJEJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Thorsten Sanders

On 09.09.2011 17:36, Daniel Gagnon wrote:



On Fri, Sep 9, 2011 at 9:28 AM, Pewpewarrows > wrote:


Tim Shaffer's response would have you doing N+1 queries, and
having to loop through all of your Target objects in-memory.
Technically it would work, but as soon as you have a decently
sized amount of data in there it'd slow to a crawl.


Indeed and I have to sort and pick a subset of the sort after that. I 
can't run a loop through thousands and thousands of items to only 
select a hundred.


The best way I can see to do this straight with the Django ORM is:

prop_dates =

Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
flat=True)
properties = Property.objects.filter(export_date__in=prop_dates)


Let say I have two targets which I'll call target1 and target2.

Target1 have a property for yesterday and one for today. Target2 only 
have a property for yesterday. Both yesterday and today will be 
included in prop_dates since they are the latest for at least one 
target. And then Target1 will have two entries in properties while it 
should only have one for today.



Which comes out to two queries. If you absolutely had to, you
could execute some raw SQL to narrow it down to one query. Please
note that there is a (very) small possibility that the second
query might return extra Properties for a given Target. Because
it's operating based on date-timestamps, there could be two
Properties that happen to have the exact same export_date, one of
which happens to be the most recent for a given Target.
Eliminating those duplicates, if you want to account for that
scenario, can be done in-memory or I believe through some
adjustments to the second line above.


I just designed a SQL query, I am wondering if I couldn't translate 
the logic to the ORM. It looks like this:


select * from Target_Mgmt_target as t
inner join (select *,max(export_date) as max_export_date from 
Target_Mgmt_property group by target_id) as p on p.target_id = t.id 
 and p.export_date = p.max_export_date;

--

from django.db.models import Max
Target_Mgmt_target.objects.annotate(export_date=Max('Target_Mgmt_property__export_date')).values(you want>)


This should work

--
You received this message because you are subscribed to the Google Groups "Django 
users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Simon Riggs
On Fri, Sep 9, 2011 at 4:33 PM, Christophe Pettus  wrote:
>
> On Sep 9, 2011, at 6:28 AM, Pewpewarrows wrote:
>
>> prop_dates = 
>> Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
>>  flat=True)
>> properties = Property.objects.filter(export_date__in=prop_dates)
>
> Note that if prop_dates has more than 10-15 entries, it's going to perform 
> badly (at least on PostgreSQL, and almost certainly on MySQL too).  I think 
> this particular situation is definitely a .raw() opportunity.

Not sure I understand the limitation to 10-15 entries. Please explain?
Or EXPLAIN?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Daniel Gagnon
On Fri, Sep 9, 2011 at 9:28 AM, Pewpewarrows  wrote:

> Tim Shaffer's response would have you doing N+1 queries, and having to loop
> through all of your Target objects in-memory. Technically it would work, but
> as soon as you have a decently sized amount of data in there it'd slow to a
> crawl.
>
>
Indeed and I have to sort and pick a subset of the sort after that. I can't
run a loop through thousands and thousands of items to only select a
hundred.


> The best way I can see to do this straight with the Django ORM is:
>
> prop_dates =
> Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
> flat=True)
> properties = Property.objects.filter(export_date__in=prop_dates)
>

Let say I have two targets which I'll call target1 and target2.

Target1 have a property for yesterday and one for today. Target2 only have a
property for yesterday. Both yesterday and today will be included in
prop_dates since they are the latest for at least one target. And then
Target1 will have two entries in properties while it should only have one
for today.


>
> Which comes out to two queries. If you absolutely had to, you could execute
> some raw SQL to narrow it down to one query. Please note that there is a
> (very) small possibility that the second query might return extra Properties
> for a given Target. Because it's operating based on date-timestamps, there
> could be two Properties that happen to have the exact same export_date, one
> of which happens to be the most recent for a given Target. Eliminating those
> duplicates, if you want to account for that scenario, can be done in-memory
> or I believe through some adjustments to the second line above.
>

I just designed a SQL query, I am wondering if I couldn't translate the
logic to the ORM. It looks like this:

select * from Target_Mgmt_target as t
inner join (select *,max(export_date) as max_export_date from
Target_Mgmt_property group by target_id) as p on p.target_id = t.id and
p.export_date = p.max_export_date;

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Christophe Pettus

On Sep 9, 2011, at 6:28 AM, Pewpewarrows wrote:

> prop_dates = 
> Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
>  flat=True)
> properties = Property.objects.filter(export_date__in=prop_dates)

Note that if prop_dates has more than 10-15 entries, it's going to perform 
badly (at least on PostgreSQL, and almost certainly on MySQL too).  I think 
this particular situation is definitely a .raw() opportunity.

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Pewpewarrows
Tim Shaffer's response would have you doing N+1 queries, and having to loop 
through all of your Target objects in-memory. Technically it would work, but 
as soon as you have a decently sized amount of data in there it'd slow to a 
crawl.

The best way I can see to do this straight with the Django ORM is:

prop_dates = 
Target.objects.annotate(latest_property=Max('property__export_date')).values_list('latest_property',
 
flat=True)
properties = Property.objects.filter(export_date__in=prop_dates)

Which comes out to two queries. If you absolutely had to, you could execute 
some raw SQL to narrow it down to one query. Please note that there is a 
(very) small possibility that the second query might return extra Properties 
for a given Target. Because it's operating based on date-timestamps, there 
could be two Properties that happen to have the exact same export_date, one 
of which happens to be the most recent for a given Target. Eliminating those 
duplicates, if you want to account for that scenario, can be done in-memory 
or I believe through some adjustments to the second line above.

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/Hfzi77PqKukJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Re: Trouble expressing a query in the ORM

2011-09-09 Thread Tim Shaffer
Maybe something like this...

for target in Target.objects.all():
property = target.property_set.latest("export_date")
print target.name, target.t1, target.t2, property.export_date, 
property.p1, property.p2

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/django-users/-/WzLgTBlwdMcJ.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.



Trouble expressing a query in the ORM

2011-09-08 Thread Daniel Gagnon
Hi.

I have a query I am trying to perform in the ORM and I'm not sure how to do
it.

I have two models: Target and Property with a one-to-many relationship. One
Target can have many Properties.

I want to list all targets along with their latest properties (as stored
under Property.export_date) sorted by fields on Property.

So if I have this simplified model:

Target
  - name
  - t1
  - t2

Property
  - fk_to_target
  - export_date
  - p1
  - p2

I want to have a list of record each having [name, t1, t2, export_date, p1,
p2] with every target only being once in the list with its latest property.

How do I make this happen in the ORM?

-- 
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com.
To unsubscribe from this group, send email to 
django-users+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en.