Hi Mahmoud,

I finally had time to properly read the paper today - the general approach mostly matches how I imagined the estimation would work for inequalities, but it's definitely nice to see the algorithm properly formalized and analyzed.

What seems a bit strange to me is that the patch only deals with range types, leaving the scalar cases unchanged. I understand why (not having an MCV simplifies it a lot), but I'd bet joins on range types are waaaay less common than inequality joins on scalar types. I don't even remember seeing an inequality join on a range column, TBH.

That doesn't mean the patch is wrong, of course. But I'd expect users to be surprised we handle range types better than "old" scalar types (which range types build on, in some sense).

Did you have any plans to work on improving estimates for the scalar case too? Or did you do the patch needed for the paper, and have no plans to continue working on this?

I'm also wondering about not having MCV for ranges. I was a bit surprised we don't build MCV in compute_range_stats(), and perhaps we should start building those - if there are common ranges, this might significantly improve some of the estimates (just like for scalar columns). Which would mean the estimates for range types are just as complex as for scalars. Of course, we don't do that now.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company