Hi Mahmoud,

I finally had time to properly read the paper today - the general
approach mostly matches how I imagined the estimation would work for
inequalities, but it's definitely nice to see the algorithm properly
formalized and analyzed.

What seems a bit strange to me is that the patch only deals with range
types, leaving the scalar cases unchanged. I understand why (not having
an MCV simplifies it a lot), but I'd bet joins on range types are waaaay
less common than inequality joins on scalar types. I don't even remember
seeing an inequality join on a range column, TBH.

That doesn't mean the patch is wrong, of course. But I'd expect users to
be surprised we handle range types better than "old" scalar types (which
range types build on, in some sense).

Did you have any plans to work on improving estimates for the scalar
case too? Or did you do the patch needed for the paper, and have no
plans to continue working on this?

I'm also wondering about not having an MCV list for ranges. I was a bit
surprised we don't build one in compute_range_stats(), and perhaps we
should start doing that - if there are common ranges, this might
significantly improve some of the estimates (just like for scalar
columns). That would mean the estimates for range types become just as
complex as for scalars. Of course, we don't do that now.
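
To illustrate what "common ranges" would look like in the stats, here is
a toy sketch in Python (not PostgreSQL code). Ranges are modeled as plain
(lower, upper) tuples, and the function name build_range_mcv and the
max_entries / min_count thresholds are made up for the example; it only
shows the idea of keeping the most frequent ranges with their sample
frequencies, not how compute_range_stats() would actually do it.

```python
from collections import Counter


def build_range_mcv(sample, max_entries=3, min_count=2):
    """Return [(range, frequency)] for the most common ranges in a sample.

    Hypothetical helper: ranges are (lower, upper) tuples, and a range is
    kept only if it appears at least min_count times in the sample.
    """
    counts = Counter(sample)
    total = len(sample)
    return [
        (rng, cnt / total)
        for rng, cnt in counts.most_common(max_entries)
        if cnt >= min_count
    ]


# Made-up sample of range values from one column.
sample = [(1, 5), (1, 5), (1, 5), (2, 8), (2, 8), (3, 4), (6, 9), (0, 2)]

mcv = build_range_mcv(sample)

# Fraction of the sample covered by the MCV list; the rest would still
# have to be estimated from the bounds histograms.
mcv_fraction = sum(freq for _, freq in mcv)
```

With this sample, two ranges cover well over half of the rows, which is
the kind of skew the histograms alone can't represent.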


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

