Thanks for this explanations! To add more info, I can say that the elements on left column are naturally ordered, I have thousands of records(~5000record/day) for each station but only a few station_id(~5-10stations totally). Instead of making a complex query, a cleaner approach would be to run the query with only the join and then do a loop in python to remove the unnecessary lines.
Moreover, right now, the worst think is that since I switched to postgres the initial query (the one with only the leftjoin) is taking minutes to be executed (almost 4m) against just few seconds on sqlite. I set an index on the mac field through pgAdmin and the results are really better, unfortunately this hack is not portable, is there a way to define indexes from web2py? Paolo 2013/1/8 Niphlod <niph...@gmail.com> > after a full day at work I may lack the usual fantasy, but what you're > trying to do can't be even achieved by a "relatively simple" windowed > function.... > What you're trying to do is recursing..... way out of DAL reach (if you > want to do it in a single query) > Trying to explain better.... > > Real data helps..... > For every station_id record with the same mac address you want to find the > "min gathered_on" record from the same table (with another station_id) and > "subtract it" for every next possible match. > > One thing is requiring > > "2013-01-21 11:23:35";"a";127167;"2013-01-21 11:23:45";"a";127168 > "2013-01-21 11:23:00";"a";127169;"2013-01-21 11:23:45";"a";127168 > That can be accomplished by something like this > > select * from ( > select start_point.gathered_on,start_point.mac,start_point.id, > end_point.gathered_on,end_point.mac,end_point.id, > row_number() over (partition by start_point.id order by end_point. > gathered_on) as filter_field > from record as start_point > inner join > record as end_point > on start_point.mac = end_point.mac > and start_point.gathered_on <= end_point.gathered_on > where start_point.station_id = 13 > and end_point.station_id = 14 > ) as q > where q.filter_field = 1 > > because for the record 127167 the next record with another station_id is > 127168, but then for the 127169 you don't want the 127168, you want 127170 > because 127168 "has been booked before" by 127169. > > Honestly, (beware of the lack of fantasy :P) I'd do a loop in python > instead of using recursing in the db itself unless you have zillions of > "windows" (i.e. you have 1000 station_id = 13 and 1000 station_id = 14, and > 1000 distinct station_id), just because it's more readable than what it > would be needed in raw sql.... > > -- > > > > -- Paolo --