The query Ranbeer gave - as with any skyline query - can be solved with just pure SQL:

select * from books b where not exists(
  select * from books b2 where
    b2.rating >= b.rating and b2.price <= b.price and
    (b2.rating > b.rating or b2.price < b.price)
);
     book_name     | rating | price
-------------------+--------+-------
 Prodigal Daughter |      3 |   250
 The Notebook      |      4 |   300
 Fountain Head     |      5 |   350
(3 rows)

The idea of the BNL (block nested loop) skyline algorithm is to avoid the nested loop by storing "dominating" records as the query proceeds - in the above example, records which are relatively high in rating and low in price - and comparing each candidate record to those first.

BNL is the most reasonable skyline algorithm in the absence of a multidimensional (usually R-Tree) index on the columns. For answering skyline queries where such an index exists over all query columns, the most broadly used generalized algorithm is BBS [1].

Thanks,

Dave Fuhry

[1] Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2005. Progressive skyline computation in database systems. ACM Trans. Database Syst. 30, 1 (Mar. 2005), 41-82. DOI= http://doi.acm.org/10.1145/1061318.1061320

Gavin Sherry wrote:
On Tue, 6 Mar 2007, Alvaro Herrera wrote:

Also, keep in mind that there were plenty of changes in the executor.
This stuff is not likely to be very easy to implement efficiently using
our extant executor machinery; note that Ranbeer mentioned
implementation of "block nested loop" and other algorithms.  Not sure
how easy would be to fold that stuff into the optimizer for multi-input
aggregates, instead of hardwiring it to the SKYLINE OF syntax.


Yes, there's been a lot of working on calculating skyline efficiently,
with different sorting techniques and so on. This is the most interesting
part of the idea. You could calculate the query Ranbeer gave using pure
SQL and, perhaps, use of some covariance aggregates or something already.
Of course, it gets harder when you want to calculate across many
dimensions.

Personally, I'd love to see some of these newer data analysis
capabilities added to PostgreSQL -- or at least put out there as
interesting patches.

Thanks,

Gavin

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to