well, same old, same old.... sorry for the introduction, but I work as a DBA and query tuning/optimization is in front of my eyes for more than 8 hours a day ;-)
The first thing you learn in this job is try. The second is try. The third is try. The fourth is "understand the complexity". The fifth is retry ;-) Actually this is a pretty common request, and, first things first, relational databases sucks at this kind of queries. Just to clear out some doubts, I'll put some key points on this requirement: - you have a table - you want only one record for max and one record for min salaries (e.g. : if jobs and gates have the same salary, extract only one of them, no matter what) In addition, I'll clear some more doubts to you: when you use the DAL, every query is made in the same connection. If you use multiple queries, they use the same connection (no additional roundtrips for connections, that have some impact on timings). Once the connection is established, one can argue that making two distinct queries is anyway slower that making two queries, and this is generally considered true, but watch out the tests!! Let's break some method to achieve you resultset: method1 (you method): select name, salary from (select min(salary) as minsal from emp) as x inner join emp as t on t.salary = x.minsal union select name, salary from (select max(salary) as maxsal from emp) as x inner join emp as t on t.salary = x.maxsal it has a problem... if your table is populated with common salaries (for example, jobs and gates have a salary of 5), you have more than 2 rows, so the final print statements report incorrect data. Let's say, for the sake of the discussion, that you have distinct salaries in your table. method2: SELECT * FROM (SELECT emp.name, emp.salary FROM emp ORDER BY emp.salary LIMIT 1 OFFSET 0) AS a UNION ALL SELECT * FROM (SELECT emp.name, emp.salary FROM emp ORDER BY emp.salary DESC LIMIT 1 OFFSET 0) AS b PS: UNION ALL is used to be sure to return exactly two rows, UNION "filters out" the same rows which can be composed "dinamically" as: minquery = db()._select(db.emp.name, db.emp.salary, orderby=db.emp.salary, limitby=(0,1)) maxquery = db()._select(db.emp.name, db.emp.salary, orderby=~db.emp.salary, limitby=(0,1)) final_statement = 'SELECT * FROM (%s) AS a UNION ALL SELECT * FROM (%s) AS b' % (minquery[:-1], maxquery[:-1]) method3: two distinct queries: SELECT emp.name, emp.salary FROM emp ORDER BY emp.salary LIMIT 1 OFFSET 0 SELECT emp.name, emp.salary FROM emp ORDER BY emp.salary DESC LIMIT 1 OFFSET 0 they can be composed as minquery = db()._select(db.emp.name, db.emp.salary, orderby=db.emp.salary, limitby=(0,1)) maxquery = db()._select(db.emp.name, db.emp.salary, orderby=~db.emp.salary, limitby=(0,1)) So, method2 and method3 are "fullfilling" your request to have some way to write this kind of queries abstracted in DAL, in some ways. method3 pisses you off because of the "multiple queries" involved. Always for the sake of discussion, let me test this three methods. Let's say you want to measure actual db performance, using db.executesql() for all of the three methods with the plain text query used for every method (we want to measure db performance without our results being "compromised" by the DAL actually abstracting your queries) All tests were done initializing the db, and timing only the db.executesql() function, repeated 10000 times. Results, as always with timeit, are in seconds. test1: 5 records in the db, sqlite method1 : 0.867677211761 method2 : 1.0282201767 method3 : 1.17870497704 test2: 5 records in the db, sqlite, column salary indexed method1 : 0.867677211761 method2 : 0.810430049896 method3 : 0.806806087494 test3: 5000 records in the db, sqlite method1 : 59.733631134 method2 : 100.970735073 method3 : 100.889001131 test4: 5000 records in the db, sqlite, column salary indexed method1 : 31.1111619473 method2 : 0.830200910568 method3 : 0.837182998657 test5: 5 records in the db, postgresql method1 : 7.11347198486 method2 : 5.63948893547 method3 : 7.27239108086 test6: 5 records in the db, postgresql, column salary indexed method1 : 7.69179606438 method2 : 5.09459590912 method3 : 6.26593804359 test7: 5000 record in the db, postgresql method1 : 39.8379509449 method2 : 29.3324680328 method3 : 30.5438249111 test8: 5000 records in the db, postgresql, column salary indexed method1 : 22.3900408745 method2 : 4.97471690178 method3 : 6.57953190804 So, what do the tests show in general? A) different engines respond in different ways (i.e. if you are superoptimizing you have to test and consider the best way associated with your db engine). I won't be surprised if Mysql or Oracle show totally different timings.... B) different number of records in a table matter. Again, if you are superoptimizing, you have to consider the "expected" number of rows in a table What do the tests show in this particular case? A) putting an index in the column you are ordering on "always wins" B) scaling the number of records, method2 and method3 seems to deliver best performances than method1 C) making two distinct queries takes more time than one, but the difference is very little And don't forget, we are 'seeing' differences that matters in a repetition of 10000 times. In real world scenarios, the whole "why doesn't the DAL allow to construct a query that fires one time only to save time" isn't that important ;-D Last but not least, I don't know how much salaries varies during time....if you're actually superoptimizing, there are alternative ways to scale out... just a bunch of them: - let's say that salaries varies one time a week, you could query the first time and cache that value for a week - let's say that salaries table is "read" intensive but not "write" intensive, you can prepare a trigger (on insert and on update) that updates another table with two rows only, and you query this table for min and max Sorry for the long post, feel free to ask if something is wrong.