On Fri, 20 May 2016 14:17:25 +1000 "dandl" <david at andl.org> wrote:
> Every aggregation function is at least second order: a function that > applies a function to the set. So for MIN the function is 'less > than', for SUM() the function is 'plus' and so on. In Andl > aggregation functions are provided by fold(), which takes a function > as an argument. I want you to know that you hijacked my Saturday. I was bothered about what "first order" and "second order" mean, suspecting that we meant different things. After an afternoon with the Oracle of All Knowledge, I think we were talking about different things, and you had a better handle on your end than I did on mine. I was concerned that we were treading in territory outside first-order predicate logic. On review, as Wikipedia explains, HOL deals in another beast, namely the quantification of sets of sets. You were talking about something much simpler, second-order *functions*. The input is still a value -- an individual member of a set -- plus another function. As you say, there are many such in SQL. In keeping with the language's purpose, the primitive components are not exposed, so it's not possible to reconstruct min as FOLD(MIN,X). We can do similar things with subqueries, e.g. select sum(N) from (select count(*) as N from T group by a) as A One can imagine that restated as select F(sum, count, t) from T where F is defined as taking two functions and a value. I guess that would make F a third-order function. APL is instructive in this regard. What we usually call operators -- + - x ? -- are termed *functions* in APL, in keeping with their mathematical definition. A function that takes a function is called an operator. One such is "/", the reduction operator; SUM(t) could be expressed as +/t > > 2. Limit, as currently implemented, lies outside the theory > > because it doesn't operate on sets. > > I'll put that one on hold pending a suitable reference or detailed > mathematical treatment. I think I can accept "first(N)" could be a set function, and if SQL dealt in sets, LIMIT would be a deterministic function. But SQL deals in bags, and with a couple of odd exceptions -- random(), now() -- all its functions are determistic. LIMIT is not a deterministic function. I'm not sure what happens to first order predicate logic in the face of nondeterminism, but I'm sure it's not good. > Sorry. Your nth() is a second order function OK. > The (single-pass) implementation would maintain a temporary table of > rows that are 'minimum so far seen', to a maximum of N. It would be an > implementers decision what to do with a row equal to one in that > table once N has been reached: add it or ignore it? nth() acts on a single column; it keeps the set of N smallest values, as you say. The answer to your question is "ignore it" because a value equal to one in the set is already a member. Given the input C {1, 1, 2, 2, 2, 3} min(C) = 1 nth(C, 1) = {1} nth(C, 2) = {1, 2} I'm not claiming any deep insight, only that nth() would be handy and can be defined mathematically (even if I can't do it). --jkl