Hi, 

I'm currently trying to understand the expected row counts for a query 
involving a nested loop join and bitmap index scan 
on the functional index and a custom operator. And the numbers that I see don't 
make sense to me currently. Hopefully 
somebody here can shed some light on it, or confirm this is some kind of issue. 
 

Here is the query and explain analyze

explain analyze select * from twomass.psc as t , gaia_dr2.gaia_source as g 
where 
        (
                (q3c_ang2ipix(g.ra,g.dec) between q3c_nearby_it(t.ra, t.decl, 
0.0003, 0)  and  
                                                q3c_nearby_it(t.ra, t.decl, 
0.0003, 1))  
        or  
                (q3c_ang2ipix(g.ra,g.dec) between q3c_nearby_it(t.ra, t.decl, 
0.0003, 1)  and  
                                                q3c_nearby_it(t.ra, t.decl, 
0.0003, 3))
        ) 
        and
        0.0003 ==<<>>== (g.ra,g.dec,t.ra,t.decl)::q3c_type limit 10;

https://explain.depesz.com/s/vcNd

What I can't understand at all is how the estimate of 3E15 rows!!!! is obtained 
by the nested loop 
given that the bitmap heap scan is expected to return *one* single row for each 
row of the 'left' table. 
So in my mind the estimate of the total number of rows
should be ~ 1e9 rows after the nested loop. Because of this crazy overestimate, 
I actually have to force the nested loop 
in this query by disabling seqscan. 
(if I don't disable the seqscan -- this is the plan I get which ignores the 
indices:
https://explain.depesz.com/s/EIiG

Some more details about the query: 
q3c_ang2ipix(ra,dec) is the function mapping (double,double) -> bigint and the 
tables have a functional index on that.
Like this: 
                   Table "gaia_dr2.gaia_source"
              Column              |       Type        | Modifiers 
----------------------------------+-------------------+-----------
 ra                               | double precision  | 
 dec                              | double precision  | 
.......
Indexes:
    "gaia_source2_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec"))

q3c_nearby_() function just returns bigint.

The ==<<>== is the custom operator with custom low selectivity (1e-12 in this 
case)

The tables in the join in question have 450 mill  and 1.5 billion rows. 

I hope somebody can help me understand what's going on. 

Thank you in advance. 

  Sergey


PS the kind of query that I show  comes from the q3c module ( 
https://github.com/segasai/q3c ) 
that is used for spatial queries of large astronomical catalogues. 

Reply via email to