On Sun, Aug 10, 2014 at 11:48 PM, David Rowley <dgrowle...@gmail.com> wrote:

>
>  I've attached an updated version of the patch which fixes up some
> incorrect logic in the foreign key matching code, plus various comment
> improvements.
>

 I've made a few updated to the patch to simplify some logic in the code
which analyses the join condition. The result is slightly faster code for
detecting either successful or unsuccessful join removal.

I've also been doing a little benchmarking of the overhead that this adds
to planning time for a handful of different queries.
With the queries I tested the overhead was between ~20 and ~423 nanoseconds
per SEMI or ANTI join, the 20 was for the earliest fast path out on an
unsuccessful removal and the 423 was for a successful removal. (tests done
on a 4 year old intel i5 laptop). This accounted for between 0.01% and 0.2%
of planning time for the tested queries. I was quite happy with this, but I
did manage to knock it down a little more with the
bms_get_singleton_v1.patch, which I've attached. This reduces the range to
between ~15 and ~409 nanoseconds, but probably this is going into micro
benchmark territory... so perhaps not worth the extra code...

With the benchmarks I just put semiorantijoin_is_removable() in a tight 1
million iteration loop and grabbed the total planning time for that, I then
compared this to an unpatched master's planning time after dividing the
time reported for the 1 million removals version by 1 million.

I didn't really find a good way to measure the extra overhead in actually
loading the foreign key constraints in get_relation_info()

Regards

David Rowley
--Benchmark with the following code:

/*
        else if (sjinfo->jointype == JOIN_SEMI)
                {
                        List       *columnlist;
                        RelOptInfo *rel;
                        bool result;
                        int x;

                        for (x = 0; x < 1000000; x++)
                                result = semiorantijoin_is_removable(root, 
sjinfo, &columnlist, &rel);

                        /* Skip if not removable */
                        if (!result)
                                continue;

                        Assert(columnlist != NIL);
                        convert_semijoin_to_isnotnull_quals(root, rel, 
columnlist);
                }
*/

create table t2 (id int primary key);
create table t1 (value1 int references t2, value2 int, value3 int, value4 int);

-- test 1. Successful removal with single fk
explain select * from t1 where value1 in(select id from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
425.019
413.903
407.200
411.014
411.370

-- as above, but with with bms_get_singleton patch applied
405.971
390.838
401.959
407.593
393.540

-- unpatched master planning time for above query
0.207
0.216
0.203
0.194
0.200

-- test 2. Non var in outer join condition.
explain select * from t1 where value1 in(select 0 from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
40.419
40.374
36.831
42.832
40.248

-- as above, but with with bms_get_singleton patch applied
31.064
32.176
30.028
31.654
34.571

-- unpatched master planning time for above query
0.169
0.171
0.156
0.155
0.157

-- test 3. Fail case. No foreign key defined.
explain select * from t1 where value2 in(select id from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
409.801
414.237
406.165
406.959
401.322

-- as above, but with with bms_get_singleton patch applied
367.536
347.254
349.383
348.291
348.063

-- unpatched master planning time for above query
0.214
0.200
0.199
0.195
0.248

-- test 4. 2 foreign keys defined. (with join removed)
ALTER TABLE t1 ADD CONSTRAINT t1_value2_fkey FOREIGN KEY (value2) REFERENCES t2;

explain select * from t1 where value2 in(select id from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
431.408
421.737
426.109
419.222
418.526

-- as above, but with with bms_get_singleton patch applied
392.525
392.785
393.102
388.653
393.168

-- unpatched master planning time for above query
0.211
0.201
0.235
0.236
0.230

-- test 5. 2 foreign keys defined (without join removed)
explain select * from t1 where value3 in(select id from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
426.327
423.019
431.331
423.405
413.779

-- as above, but with with bms_get_singleton patch applied
411.991
412.971
402.479
405.715
416.528

-- unpatched master planning time for above query
0.237
0.261
0.266
0.214
0.218

-- test 6. wrong operator.
explain select * from t1 where exists(select 1 from t2 where value1 > id);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
122.176
117.536
119.057
125.618
118.249

-- as above, but with with bms_get_singleton patch applied
109.182
108.424
109.526
110.091
107.240

-- unpatched master planning time for above query
0.181
0.220
0.171
0.178
0.170

-- test 7. No foreign keys
alter table t1 drop constraint t1_value1_fkey;
alter table t1 drop constraint t1_value2_fkey;

explain select * from t1 where value1 in(select id from t2);

--times in milliseconds for planning plus 1 million semiorantijoin_is_removable 
calls
21.320
20.776
20.984
20.564
20.845

-- as above, but with with bms_get_singleton patch applied
15.353
15.450
15.295
15.354
16.239

-- unpatched master planning time for above query
0.191
0.200
0.219
0.193
0.204

Attachment: semianti_join_removal_7be0c95_2014-08-17.patch
Description: Binary data

Attachment: bms_get_singleton_v1.patch
Description: Binary data

Attachment: anti_join_removal_benchmark.xlsx
Description: MS-Excel 2007 spreadsheet

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to