----------------------------------------------- Israel Brewster Systems Analyst II Ravn Alaska 5245 Airport Industrial Rd Fairbanks, AK 99709 (907) 450-7293 -----------------------------------------------
BEGIN:VCARD VERSION:3.0 N:Brewster;Israel;;; FN:Israel Brewster ORG:Frontier Flying Service;MIS TITLE:PC Support Tech II EMAIL;type=INTERNET;type=WORK;type=pref:isr...@frontierflying.com TEL;type=WORK;type=pref:907-450-7293 item1.ADR;type=WORK;type=pref:;;5245 Airport Industrial Wy;Fairbanks;AK;99701; item1.X-ABADR:us CATEGORIES:General X-ABUID:36305438-95EA-4410-91AB-45D16CABCDDC\:ABPerson END:VCARD
On Oct 10, 2014, at 1:04 PM, Jim Nasby <jim.na...@bluetreble.com> wrote: > On 10/8/14, 3:17 PM, Israel Brewster wrote: >> Except that the last data point received is still valid as the aircraft's >> current location, even if it came in several hours ago - which we may well >> care about. That is, some users may need to see where a given aircraft (or >> the entire fleet) is, even if an aircraft hasn't updated in a while. That >> said, I did discuss this with my higher-ups, and got the ok to take it down >> to four hours. > > Note that in your explain output nothing is filtering by time at all; are you > sure you posted the right explain? > >>> I don't think PostgreSQL is going to be able to reason very effectively >>> about a ROW_NUMBER() in a inner table and then a row<=5 in the outer one >>> being equivalent to a LIMIT query for which it could walk an index and then >>> stopping once it finds 5 of them. >>> >>> Does this need to issued as a single query? Why not issue 55 different >>> queries? It seems like the client is likely going to need to pick the >>> returned list back out by tail number anyway, so both the client and the >>> server might be happier with separate queries. >> >> Good point. Doing that brought the entire execution time down to around >> 60ms. Just ever so slightly better than the ~1200ms I was getting before. >> :-) I just have an unreasonable aversion to making multiple queries. I guess >> this is a prime example of why I need to overcome that :-) > > Do you actually need the last 5 points? If you could get away with just the > most recent point, SELECT DISTINCT ON might do a better job of this in a > single query. At the moment, unfortunately yes - I have to do some calculations based on the past few data points. At some point I should be able to re-work the system such that said calculations are done when the points are saved, rather than when they are retrieved, which would be beneficial for a number of reasons. However, until I can get that done I need multiple points here. > > As for the concern about issuing multiple queries, if you code this into a > database function it should still be quite fast because there won't be any > round-trip between your application and the database. I've had phenomenally bad luck with coding queries into database functions. I had a number of functions written at one point that allowed me to do things like select <table>.function, <other_column> FROM table - until I noticed that said queries ran significantly slower than just doing the query I had encoded in the function as a sub-query instead. I was doing these same sub-queries in a bunch of different places, so I figured it would clarify things if I could just code them into a DB function that I called just like a column. It's been a while since I looked at those, however, so I can't say why they were slow. This usage may not suffer from the same problem. > > Something else to consider is having a second table that only keeps the last > X aircraft positions. I would do this by duplicating every insert into that > table via a trigger, and then have a separate process that ran once a minute > to delete any records other than the newest X. Because that table would > always be tiny queries against it should be blazing fast. Do note that you'll > want to vacuum that table frequently, like right after each delete. Indeed. I think I'm happy with the performance of the multiple queries, but this would doubtless be the "best" option (from a performance standpoint), as the table would be small and my select would be essentially SELECT * FROM TABLE, with a potential WHERE ... IN... clause. Thanks for all the help! > -- > Jim Nasby, Data Architect, Blue Treble Consulting > Data in Trouble? Get it in Treble! http://BlueTreble.com > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general