Well I expected to learn a thing or two, but I didn't expect not to understand the suggestions at all! :) Thanks to everyone who responded, and sorry for the typo (it was meant to be object_id throughout, not content_type).
So far Michael's solution works and is most comprehensible to me – ie I stand a fighting chance of figuring it out. I didn't realize you could make direct use of SortedDicts in django, so that's worth a try, too. Itertools.groupby is totally impenetrable to me, but works as well! Is there any consensus about which might go faster? I will go right now and google until my brain is full. Thanks again, Eric On Mar 4, 2008, at 12:56 AM, Michael H. Goldwasser wrote: > > > Hello Eric, > > Your basic outlook is fine, but you can do it much more efficiently > with a single sort. Here's the way I'd approach the task > (untested): > > # -------------------------------------------------- > # first compute the latest date for each id group; uses O(n) time > newest = {} > for q in queryset: > id,date = q.object_id, q.submit_date > if id not in newest or date > newest[id]: > newest[id] = date > > # now sort based on the following decorator as a key > data.sort(reverse=True, key=lambda x: (newest[x.object_id], > x.object_id, x.submit_date)) > # -------------------------------------------------- > > In essence, you compute the max date within each group, but your > approach (i.e., building explicit sublists and then repeatedly > calling max on those sublists) is far more time-consuming than the > above dictionary based approach. > > Note well that using a tuple as a decorator key is more efficient > than calling sort separately for each subgroup. The > lexicographical order of the following > > (newest[x.object_id], x.object_id, x.submit_date) > > should produce the order that you desire. The reason for the middle > entry is to ensure that items groups by object_id in the case that > two different groups achieve the same maximum date. It wasn't clear > to me whether you wanted elements within groups from oldest to > newest or newest to oldest. I believe that the code I give produces > the ordering that you intend, but you may adjust the sign of the > decorate elements if necessary. > > With regard, > Michael > > +----------------------------------------------- > | Michael Goldwasser > | Associate Professor > | Dept. Mathematics and Computer Science > | Saint Louis University > | 220 North Grand Blvd. > | St. Louis, MO 63103-2007 > > > On Monday March 3, 2008, Eric Abrahamsen wrote: > >> I have a grisly little sorting problem to which I've hacked >> together a >> solution, but I'm hoping someone here might have a better >> suggestion. >> >> I have a list of objects, each of which has two attributes, >> object_id >> and submit_date. What I want is to sort them by content_type, >> then by >> submit_date within content_type, and then sort each content_type >> block >> according to which block has the newest object by submit_date. >> (This >> sequence of sorting might not be optimal, I'm not sure). I'm >> actually >> creating a list of recent comments on blog entries for a python- >> based >> web framework, and want to arrange the comments according to blog >> entry (content_type), by submit_date within that entry, with the >> entries with the newest comments showing up on top. >> >> I don't believe a single cmp function fed to list.sort() can do >> this, >> because you can't know how two objects should be compared until you >> know all the values for all the objects. I'd be happy to be proven >> wrong here. >> >> After some false starts with dictionaries, here's what I've got. >> Queryset is the original list of comments (I'm doing this in >> django), >> and it returns a list of lists, though I might flatten it >> afterwards. >> It works, but it's ghastly unreadable and if there were a more >> graceful solution I'd feel better about life in general: >> >> >> def make_com_list(queryset): >> ids = set([com.object_id for com in queryset]) >> xlist = [[com for com in queryset if com.object_id == i] for >> i in >> ids] >> for ls in xlist: >> ls.sort(key=lambda x: x.submit_date) >> xlist.sort(key=lambda x: max([com.submit_date for com in x]), >> reverse=True) >> return xlist >> >> I'd appreciate any hints! >> >> Thanks, >> Eric >> > _______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor