Insertion Sort Improvements

2022-08-25 Thread Benjamin Coutu
such an approach before? Please let me know your thoughts. Cheers, Benjamin [1] https://www.postgresql.org/message-id/flat/CAFBsxsHanJTsX9DNJppXJxwg3bU%2BYQ6pnmSfPM0uvYUaFdwZdQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/flat/CAApHDvoTTtoQYfp3d0kTPF6y1pjexgLwquzKmjzvjC9N

Re: Insertion Sort Improvements

2022-08-26 Thread Benjamin Coutu
> convenient if you know the type at compile time. See the attached, > which I had laying around when I was looking at PDQuicksort. I believe > it's similar to what you have in mind. That looks very promising. I also love your recent proposal of partitioning into null and non-null. I suspect that

Re: Insertion Sort Improvements

2022-09-27 Thread Benjamin Coutu
Getting back to improvements for small sort runs, it might make sense to consider using in-register based sorting via sorting networks for very small runs. This talk is specific to database sorting and illustrates how such an approach can be vectorized: https://youtu.be/HeFwVNHsDzc?list=PLSE8O

Re: Insertion Sort Improvements

2022-09-28 Thread Benjamin Coutu
> "Looks like"? I cannot find the reference, but I've read a while back that a well-known company from Redwood uses it for their in-memory columnar storage. That might have just been a rumor or might have been research only - not sure. It does not really matter anyways. > SortTuples are curren

Re: disfavoring unparameterized nested loops

2022-09-29 Thread Benjamin Coutu
I'd like to revamp this important discussion. As is well described in this fairly recent paper here https://www.vldb.org/pvldb/vol9/p204-leis.pdf (which also looks at Postgres) "estimation errors quickly grow as the number of joins increases, and that these errors are usually the reason for bad

Re: disfavoring unparameterized nested loops

2022-09-29 Thread Benjamin Coutu
> Right. But that seems fraught with difficulty. I suspect that the > costs that the planner attributes to each plan often aren't very > reliable in any absolute sense, even when everything is working very > well by every available metric. Even a very noisy cost model with > somewhat inaccurate sel

Re: disfavoring unparameterized nested loops

2022-09-29 Thread Benjamin Coutu
> In general I suspect that we'd be better off focussing on mitigating > the impact at execution time. There are at least a few things that we > could do there, at least in theory. Mostly very ambitious, long term > things. I think these things are orthogonal. No matter how good the cost model e

Re: disfavoring unparameterized nested loops

2022-09-30 Thread Benjamin Coutu
> Agreed, but dealing with uncertainty in those numbers is an enormous > task if you want to do it right. "Doing it right", IMV, would start > out by extending all the selectivity estimation functions to include > error bars; then we could have error bars on rowcount estimates and > then costs; th

Re: disfavoring unparameterized nested loops

2022-09-30 Thread Benjamin Coutu
> Actually, if we wanted to improve things in this area, we should have a > set of queries that don't chose optimal plans we can test with. We used > to see them a lot before we had extended statistics, but I don't > remember seeing many recently, let alone a collection of them. I guess > that is

Re: disfavoring unparameterized nested loops

2022-09-30 Thread Benjamin Coutu
> Sure, but the model isn't the problem here, really -- not to me. The > problem is that the planner can in some cases choose a plan that is > inherently unreasonable, at least in practical terms. You're talking > about uncertainties. But I'm actually talking about the opposite thing > -- certainty

Summary of Sort Improvement Proposals

2024-05-13 Thread Benjamin Coutu
iT1zBJ6A%40mail.gmail.com [5] https://www.postgresql.org/message-id/flat/CAEYLb_W%2B%2BUhrcWprzG9TyBVF7Sn-c1s9oLbABvAvPGdeP2DFSQ%40mail.gmail.com [6] https://www.postgresql.org/message-id/flat/683635b8-381b-5b08-6069-d6a45de19a12%40enterprisedb.com#12683b7a6c566eb5b926369b5fd2d41f -- Benjamin Co

Re: Insertion Sort Improvements

2023-05-23 Thread Benjamin Coutu
Greetings, I would like to revisit the discussion and concur with John's perspective that incremental progress through small, successive modifications is the appropriate approach to move forward. Therefore, I would like to propose two distinct ideas aimed at enhancing the functionality of inser

Re: Insertion Sort Improvements

2023-05-23 Thread Benjamin Coutu
additional check for n > 7 in the quicksort code path (see /src/include/lib/sort_template.h#L322). Currently we check n < 7 and a few lines down we check for n > 7, if we check n < 8 for insertion sort then the second check becomes obsolete. Benjamin Coutu http://www.zeyos.com