Hello all,

I think I have identified a major performance issue between V11.2 and V13.4 with 
respect to exception handling in UDFs. I have the following simplified query 
that pivots data and uses a UDF to convert values to a specific type, in this 
case float:


select "iccqa_iccassmt_fk"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null) as "iccqa_DEPTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null) as "iccqa_LENGTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null) as "iccqa_WIDTH_CM"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null) as "iccqa_DRAIN_PRESENT"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null) as "iccqa_MEASUREMENTS_TAKEN"
     , Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null) as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               ) group by 1, 2
) T
     group by 1
;


The UDF itself is simple:


CREATE OR REPLACE FUNCTION TILDA.toFloat(str varchar, val real)
RETURNS real AS $$
BEGIN
  -- NULL input returns the supplied default; otherwise, attempt the cast.
  RETURN case when str is null then val else str::real end;
EXCEPTION WHEN OTHERS THEN
  -- Any failure of the cast (dirty data) falls back to the default.
  RETURN val;
END;
$$ LANGUAGE plpgsql COST 1 IMMUTABLE;



It works like COALESCE, but with a type conversion. I think I have isolated a 
large performance difference to the exception handling. It so happens that for 
the last 3 columns ('DRAIN PRESENT', 'MEASUREMENTS TAKEN' and 'SIGNS AND 
SYMPTOMS OF INFECTION'), the data is VERY dirty: a mix of 0/1, YES/NO, and other 
mistyped values. This means these 3 columns make the UDF throw (and swallow) 
lots of exceptions. A few standalone calls below show the normal path vs. the 
exception path, and then, to illustrate the impact, I simply break the query 
into 2.
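
For example (the literal values here are mine, just representative of the mix described above):

select Tilda.toFloat('5.2', null);  -- clean value: casts normally and returns 5.2
select Tilda.toFloat(null, -1);     -- null input: returns the default, -1, without casting
select Tilda.toFloat('YES', null);  -- dirty value: the cast fails and the exception handler returns null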



select "iccqa_iccassmt_fk"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null))::real as "iccqa_DEPTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null))::real as "iccqa_LENGTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null))::real as "iccqa_WIDTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null))::real as "iccqa_DRAIN_PRESENT"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null))::real as "iccqa_MEASUREMENTS_TAKEN"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null))::real as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               )
group by 1, 2
) T
     group by 1
;


The performance is as expected.


HashAggregate  (cost=448463.70..448467.20 rows=200 width=16) (actual time=6760.797..9585.397 rows=677899 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 147489kB
  Buffers: shared hit=158815
  ->  HashAggregate  (cost=405997.87..417322.09 rows=1132422 width=56) (actual time=4576.514..5460.770 rows=2374628 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 1  Memory Usage: 368657kB
        Buffers: shared hit=158815
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..388224.53 rows=2369779 width=38) (actual time=0.033..3298.544 rows=2374628 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 10734488
              Buffers: shared hit=158815
Planning:
  Buffers: shared hit=3
Planning Time: 0.198 ms
Execution Time: 9678.120 ms



However, once we switch to the three "bad" columns, things fall apart.



select "iccqa_iccassmt_fk"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DEPTH (CM)') ,null))::real as "iccqa_DEPTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'LENGTH (CM)') ,null))::real as "iccqa_LENGTH_CM"
--     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'WIDTH (CM)') ,null))::real as "iccqa_WIDTH_CM"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'DRAIN PRESENT') ,null))::real as "iccqa_DRAIN_PRESENT"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'MEASUREMENTS TAKEN') ,null))::real as "iccqa_MEASUREMENTS_TAKEN"
     , (Tilda.toFloat(MAX("iccqar_ans_val") filter (where "iccqar_ques_code"= 
'SIGNS AND SYMPTOMS OF INFECTION') ,null))::real as 
"iccqa_SIGNS_AND_SYMPTOMS_OF_INFECTION"
from  (
-- 'A pivoted view of ICC QA assessments'
select VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_iccassmt_fk" as 
"iccqa_iccassmt_fk" -- The key identifying an ICC assessment.
     , VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" as 
"iccqar_ques_code" -- The question long code from the meta-data.
     , max(VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ans_val") as 
"iccqar_ans_val" -- The official answer, if applicable) from the meta-data.
  from VNAHGEDW_FACTS.AssessmentICCQA_Raw
where VNAHGEDW_FACTS.AssessmentICCQA_Raw."iccqar_ques_code" in ('DEPTH (CM)', 
'LENGTH (CM)', 'WIDTH (CM)'
                                                               , 'DRAIN 
PRESENT', 'MEASUREMENTS TAKEN', 'SIGNS AND SYMPTOMS OF INFECTION'
                                                               )
group by 1, 2
) T
     group by 1
;



This is a huge difference, from ~10s to ~11min, and the only cause I can think 
of is that the data for these columns is dirty, which forces the exception path 
to be taken. The explain is:


HashAggregate  (cost=448463.70..448467.20 rows=200 width=16) (actual time=6672.921..696753.080 rows=677899 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 131105kB
  Buffers: shared hit=158815
  ->  HashAggregate  (cost=405997.87..417322.09 rows=1132422 width=56) (actual time=4574.918..5446.022 rows=2374628 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 1  Memory Usage: 368657kB
        Buffers: shared hit=158815
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..388224.53 rows=2369779 width=38) (actual time=0.032..3300.616 rows=2374628 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 10734488
              Buffers: shared hit=158815
Planning:
  Buffers: shared hit=3
Planning Time: 0.201 ms
Execution Time: 696868.845 ms



Now, on V11.2, the explain for the same query (with the three "bad" columns) is:


HashAggregate  (cost=492171.36..492174.86 rows=200 width=16) (actual time=19322.522..50556.738 rows=743723 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Buffers: shared hit=11 read=174155 dirtied=13
  ->  HashAggregate  (cost=445458.43..457915.21 rows=1245678 width=56) (actual time=16260.015..17575.088 rows=2601088 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Buffers: shared read=174155 dirtied=13
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..425803.93 rows=2620600 width=38) (actual time=0.126..14425.239 rows=2601088 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEPTH (CM)","LENGTH (CM)","WIDTH (CM)","DRAIN PRESENT","MEASUREMENTS TAKEN","SIGNS AND SYMPTOMS OF INFECTION"}'::text[]))
              Rows Removed by Filter: 11778360
              Buffers: shared read=174155 dirtied=13
Planning Time: 36.121 ms
Execution Time: 50730.255 ms



So, we are seeing two issues:

  *   I think exception handling has become significantly slower between V11.2 
and V13.4: for the same query, I see almost a 14x difference, from ~50s to 
~700s.
  *   Even comparing the two queries on V11.2 alone, the difference is ~13s vs 
~50s, so the exception handling adds a significant overhead which I was not 
expecting. A standalone timing that isolates this is sketched just below.
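
This is just a sketch using generate_series (the input values are synthetic); the point is only that the first statement never enters the exception handler, while the second forces every call through it:

explain analyze select count(Tilda.toFloat(i::text, null))        from generate_series(1, 1000000) i; -- every value casts cleanly
explain analyze select count(Tilda.toFloat('x' || i::text, null)) from generate_series(1, 1000000) i; -- every value fails the cast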

I'll be happy to update my test cases and share additional info if needed.
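
For instance, one thing I could try is an exception-free variant of the UDF that pre-checks the string with a regex instead of relying on the exception handler. This is only a sketch (the name is arbitrary, and the regex accepts plain numeric text only, so it is not strictly equivalent to the ::real cast), but as I understand it an EXCEPTION block sets up a subtransaction on every call, so this would be a useful control:

CREATE OR REPLACE FUNCTION TILDA.toFloatNoExc(str varchar, val real)
RETURNS real AS $$
  -- Only attempt the cast when the string looks like a plain number,
  -- so no exception handler (and no per-call subtransaction) is needed.
  SELECT case
           when str ~ '^\s*[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?\s*$' then str::real
           else val
         end;
$$ LANGUAGE sql COST 1 IMMUTABLE;

If a variant like this runs in roughly the same time as the query over the "clean" columns, that would point squarely at the exception path.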

Thank you,
Laurent Hasson.
