Hello all,

I think I have identified a major performance issue between V11.2 and V13.4 with 
respect to exception handling in UDFs. I have the following simplified query 
that pivots data and uses a UDF to convert values to a specific type, in this 
case float:


select "iccqa_iccassmt_fk"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null) as "iccqa_DEPTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null) as "iccqa_LENGTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null) as "iccqa_WIDTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null) as "iccqa_DRAIN_PRESENT"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null) as "iccqa_MEASUREMENTS_TAKEN"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null) as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               ) group by 1, 2
) T
     group by 1
;


The UDF itself is simple:


CREATE OR REPLACE FUNCTION TILDA.toFloat(str varchar, val real)
RETURNS real AS $$
BEGIN
  -- NULL input returns the supplied default; otherwise, attempt the cast.
  RETURN case when str is null then val else str::real end;
EXCEPTION WHEN OTHERS THEN
  -- Any failure of the cast (dirty data) falls back to the default.
  RETURN val;
END;
$$ LANGUAGE plpgsql COST 1 IMMUTABLE;



It works like COALESCE, but with a type conversion. I think I have isolated a 
large performance difference to the exception handling. It so happens that for 
the last 3 columns ('DRAIN PRESENT', 'MEASUREMENTS TAKEN' and 'SIGNS AND 
SYMPTOMS OF INFECTION'), the data is VERY dirty: a mix of 0/1, YES/NO, and other 
mistyped values. This means these 3 columns make the UDF throw (and swallow) 
lots of exceptions. A few standalone calls below show the normal path vs. the 
exception path, and then, to illustrate the impact, I simply break the query 
into 2.
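
For example (the literal values here are mine, just representative of the mix described above):

select Tilda.toFloat('5.2', null);  -- clean value: casts normally and returns 5.2
select Tilda.toFloat(null, -1);     -- null input: returns the default, -1, without casting
select Tilda.toFloat('YES', null);  -- dirty value: the cast fails and the exception handler returns null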



select "iccqa_iccassmt_fk"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null))::real as "iccqa_DEPTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null))::real as "iccqa_LENGTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null))::real as "iccqa_WIDTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null))::real as "iccqa_DRAIN_PRESENT"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null))::real as "iccqa_MEASUREMENTS_TAKEN"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null))::real as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               )
group by 1, 2
) T
     group by 1
;


The performance is as expected.


HashAggregate  (cost=448463.70..448467.20 rows=200 width=16) (actual time=6760.797..9585.397 rows=677899 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 147489kB
  Buffers: shared hit=158815
  ->  HashAggregate  (cost=405997.87..417322.09 rows=1132422 width=56) (actual time=4576.514..5460.770 rows=2374628 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 1  Memory Usage: 368657kB
        Buffers: shared hit=158815
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..388224.53 rows=2369779 width=38) (actual time=0.033..3298.544 rows=2374628 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 10734488
              Buffers: shared hit=158815
Planning:
  Buffers: shared hit=3
Planning Time: 0.198 ms
Execution Time: 9678.120 ms



However, once we switch to the three "bad" columns, things fall apart.



select "iccqa_iccassmt_fk"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null))::real as "iccqa_DEPTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null))::real as "iccqa_LENGTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null))::real as "iccqa_WIDTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null))::real as "iccqa_DRAIN_PRESENT"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null))::real as "iccqa_MEASUREMENTS_TAKEN"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null))::real as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               )
group by 1, 2
) T
     group by 1
;



This is a huge difference, from ~10s to ~11min, and the only cause I can think 
of is that the data for these columns is dirty, which forces the exception path 
to be taken. The explain is:


HashAggregate  (cost=448463.70..448467.20 rows=200 width=16) (actual time=6672.921..696753.080 rows=677899 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 131105kB
  Buffers: shared hit=158815
  ->  HashAggregate  (cost=405997.87..417322.09 rows=1132422 width=56) (actual time=4574.918..5446.022 rows=2374628 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 1  Memory Usage: 368657kB
        Buffers: shared hit=158815
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..388224.53 rows=2369779 width=38) (actual time=0.032..3300.616 rows=2374628 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 10734488
              Buffers: shared hit=158815
Planning:
  Buffers: shared hit=3
Planning Time: 0.201 ms
Execution Time: 696868.845 ms



Now, on V11.2, the explain for the same query (with the three "bad" columns) is:


HashAggregate  (cost=492171.36..492174.86 rows=200 width=16) (actual time=19322.522..50556.738 rows=743723 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Buffers: shared hit=11 read=174155 dirtied=13
  ->  HashAggregate  (cost=445458.43..457915.21 rows=1245678 width=56) (actual time=16260.015..17575.088 rows=2601088 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Buffers: shared read=174155 dirtied=13
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..425803.93 rows=2620600 width=38) (actual time=0.126..14425.239 rows=2601088 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 11778360
              Buffers: shared read=174155 dirtied=13
Planning Time: 36.121 ms
Execution Time: 50730.255 ms



So, we are seeing two issues:

  *   I think exception handling has become significantly slower between V11.2 
and V13.4: for the same query, I see almost a 14x difference, from ~50s to 
~700s.
  *   Even comparing the two queries on V11.2 alone, the difference is ~13s vs 
~50s, so the exception handling adds a significant overhead which I was not 
expecting. A standalone timing that isolates this is sketched just below.
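
This is just a sketch using generate_series (the input values are synthetic); the point is only that the first statement never enters the exception handler, while the second forces every call through it:

explain analyze select count(Tilda.toFloat(i::text, null))        from generate_series(1, 1000000) i; -- every value casts cleanly
explain analyze select count(Tilda.toFloat('x' || i::text, null)) from generate_series(1, 1000000) i; -- every value fails the cast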

I'll be happy to update my test cases and share additional info if needed.
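
For instance, one thing I could try is an exception-free variant of the UDF that pre-checks the string with a regex instead of relying on the exception handler. This is only a sketch (the name is arbitrary, and the regex accepts plain numeric text only, so it is not strictly equivalent to the ::real cast), but as I understand it an EXCEPTION block sets up a subtransaction on every call, so this would be a useful control:

CREATE OR REPLACE FUNCTION TILDA.toFloatNoExc(str varchar, val real)
RETURNS real AS $$
  -- Only attempt the cast when the string looks like a plain number,
  -- so no exception handler (and no per-call subtransaction) is needed.
  SELECT case
           when str ~ '^\s*[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s*$' then str::real
           else val
         end;
$$ LANGUAGE sql COST 1 IMMUTABLE;

If a variant like this runs in roughly the same time as the query over the "clean" columns, that would point squarely at the exception path.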

Thank you,
Laurent Hasson.
