On Mon, Jun 2, 2008 at 8:53 AM, Russell Keith-Magee <[EMAIL PROTECTED]> wrote: > > On Sun, Jun 1, 2008 at 9:04 PM, Nicolas Lara <[EMAIL PROTECTED]> wrote: >> >> I dont mind changing it to indexes. To me it is more readable with the >> list copying/slicing. I might do a bit of profiling for that to see >> the efficiency gain or simple change it to indexes. I'll write back >> about this soon. > > If we were talking about a complex calculation or difficult logic to > extract the sublist, I might agree, but in this case, it's mostly a > case of fields[0:last_field] and fields[first_aggregate:]. I'm not > convinced that sublists make that any easier to read. > > I've noted some rough profiling numbers in another post; those numbers > would seem to support my conjecture that creating the temporary list > has a non-trivial cost. >
I've changed most of the list copies. I am quite not sure it improves readability but it does have a small performance boost (in the really small dataset I used for testing). The readability I think I lost in the code I recovered with comments =). >>> * Be careful about generating efficient SQL, not just correct SQL. In >>> particular, with the grouping clause: remember that: >> >> Same comment as Tim. If we group by pk then we can only SELECT the pk. > > As I replied to Tim - my mistake. I've been in MySQL world, and I lost > track of MySQL's eccentricities. > > That said, if there was a way to accommodate this eccentricity, it > would be nice. Using implicit GROUP BY is a significant performance > boost under MySQL. I know this would be hard to do - if we can't, it > isn't a dealbreaker. > I would also like this to be done. I haven't looked much into the backends code but, though it might not be trivial, it shouldn't be too hard either. I will definitely look into this. > >> So far I like the idea of using something like: >> >> values('field','field', groups='group_name') >> >> So the user can pick the name of the grouping. This way we allow to >> have more meaningful names for the list of objects. For example: >> >> Book.objects.all().values('price', >> group_as='books_for_price').annotate(mean_age=Avg('authors__age')) > > This gets around the potential name clash problem, but it leaves one > of the bigger issues - the fact that the output become mixed. As it > currently stands, the values dictionary contains key-value pairs which > are field and field-value. However, the group key returns a queryset > value. This is a different type of data, with different scope - all > the fields in a values dictionary have the scope of a single instance, > except the groups member, which is a collection of instances. > > I'm not a big fan of mixing outputs like this. To my mind, if you have > two different types of output, or two different scopes of output, they > should be separated somehow (for example, as a tuple of outputs). This > isn't an entirely academic preference - consider that using your > proposed output format, it's difficult to do something as simple as: > > for values in Author.objects.values('name','age'): > for name,age in values.items(): > print name, age > > Simple use cases like this need to remain simple. And they will. The user would only need to worry about the mixed output when he is actually requesting the mixed output. Actually I think you made a mistake in your example, because values.items() wouldn't return the tuple (name, age) but the tuple (key, value) from the dict. for which you could simply do: for values in Author.objects.values('name','age'): for key,val in values.items(): print key, val It is true that for doing the same when you request the grouped objects you would need to complicate stuff a little bit. But chances are that if you requested to get the queryset you would actually need/want it. I do think it is better to some extent to have the same type of objects in a dict, but I also think in this case it wouldn't be a problem since the the mixed output would be explicitly requested by the user. We could even add a method to ValuesQuerySet to 'clean' the output avoiding the queryset from being returned (in case you received the queryset from some another module and want to handle it without knowing what's inside). we could even add a special method to return the values with the group queryset and have the default iterator exclude the groups. The problem I see with the tuple approach is that, well, you have a tuple instead of a dict. So values would in some cases return a list of dicts and in others a list of tuples, which I found more inconsistent than mixed types inside a dict. Also you would need to handle the tuple just to get to your data. instead of: for auth in Author.objects.values('name', 'age').annotate(....): print auth['name'] you would need to do: for auth in Author.objects.values('name', 'age').annotate(...): print auth[1]['name'] #or auth[0] I think any of the options I present above might be very simple and straight forward. What do you think? Cheers, -- Nicolas --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to django-developers@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/django-developers?hl=en -~----------~----~----~----~------~----~------~--~---