Hi,

I've got a problem that takes a bit of explaining, but it's relatively simple when you get down to it. This is another django-related thing, but the issue itself is pure python.

I made a custom class, called an EventEngine, which represents a span of time. You initialize it with a queryset/list of django model instances, and the resulting EventEngine object has an 'events' attribute, which is a queryset/list of model instances that fall within that span of time, according to a datetime attribute on the django model.

Next come EventEngine subclasses, representing commonly used spans of time – Month, Week, Day, etc. The EventEngine __iter__ method is overridden so that each of these subclasses, when iterated over, produces child instances of the next smallest EventEngine subclass. Iterating on a Month produces Weeks, iterating on a Week produces Days and so on.

As each parent produces its children, the events in the 'events' attribute get partitioned out to the children as appropriate to their date ranges. Right now I'm doing this as stupidly as possible: for each child 'c' in the parent, I loop over all the parent events, and append them to c.events if their date attribute falls within the child's date range:

for e in self.events:
  if c.start <= getattr(e,self.start_attr) < c.stop:
  c.events.append(e)

This is stupid for (at least) two reasons: 1) it evaluates the entire queryset and sticks it in memory with the first child instance, obviating the whole point of an iterable, and 2) it loops over the entire event list for every child, even though the list is already properly sorted by date, and we could break from the loop with the first event that falls outside the child's date range. Performance is important, as some madman could initialize a Year with a huge queryset, and do nested iterations all the way down to Minute.

So I've got a few considerations:

1. The parent's events queryset should only be evaluated as much as is needed to yield one child at a time. 2. The parent's events queryset should not be duplicated as it is distributed among the children, ie it should make references and not copies (I don't know how to make sure this happens). 3. I don't want to empty the parent's 'events' list as it is distributed among the children, as other methods depend on the list being there. 4. The events attribute of the 'top level' instance is a django queryset, but it will be a list for all descendent instances (they can be nested arbitrarily deeply). This limits the number of functions available – ie you can't pop() a django queryset.

At first I thought the bisect module was the way to go, but it is too tightly tied to integer list indices, and works very awkwardly when bisecting on datetime attributes. List slices produce copies, unless I'm mistaken. Something tells me that a plain old while loop with a break might do the trick, but I can't quite see it clear. Ideally this would be something that starts iterating where the date attribute falls into the right range, and then stops iterating as soon as it passes out of that range. Probably I'll have to settle for just one or the other, but one can dream.

Thanks for reading all this. I'm hoping that all the restrictions and conditions will make the solution more obvious, rather than less obvious, to those who know more than I...

--Eric
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor

Reply via email to