Re: ERROR: stack depth limit exceeded
Thank you all for taking the time to help me with my question and offer your advice. Your responses were greatly appreciated!

At 2023-09-08 21:53:33, "Tom Lane" wrote:
>gzh writes:
>> In the Release Notes for PostgreSQL 12.14, we saw the following change:
>> https://www.postgresql.org/docs/release/12.14/
>
>>> Add recursion and looping defenses in subquery pullup (Tom Lane)
>>> A contrived query can result in deep recursion and unreasonable amounts of
>>> time spent trying to flatten subqueries. A proper fix for that seems unduly
>>> invasive for a back-patch, but we can at least add stack depth checks and
>>> an interrupt check to allow the query to be cancelled.
>
>> Our understanding is that this change will cause some complex SQL statements
>> that were previously not reporting errors to report errors in the new
>> version.
>
>The key word there is "contrived". You are not going to hit this limit
>without intentionally trying. The example that led to adding this check
>was a synthetic query with 100000 UNION ALL branches:
>
>https://www.postgresql.org/message-id/flat/703c09a2-08f3-d2ec-b33d-dbecd62428b8%40postgrespro.ru
>
>Also notice that the query misbehaved before this patch, too, by consuming
>excessive RAM.
>
> regards, tom lane
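[Editor's note] For readers who want to see the failure mode Tom describes without digging through the linked thread, here is a minimal reproduction sketch. The temp table, branch count, and wrapper query are made up for illustration; whether you get the clean "stack depth limit exceeded" error or just heavy memory use depends on the server version and max_stack_depth.

CREATE TEMP TABLE t (a int);

DO $$
DECLARE
    q text;
BEGIN
    -- Build one query text with a huge number of UNION ALL branches.
    SELECT string_agg('SELECT a FROM t', ' UNION ALL ')
      INTO q
      FROM generate_series(1, 100000);
    -- On 12.14+ the subquery-pullup stack check should make this fail
    -- cleanly instead of consuming excessive RAM.
    EXECUTE 'SELECT count(*) FROM (' || q || ') s';
END;
$$;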
Re: ERROR: stack depth limit exceeded
I'm sorry I didn't explain the issue clearly.

Our system is currently running on PostgreSQL 12.13 and we are planning to upgrade to PostgreSQL 12.16. We are currently in the evaluation phase. In the Release Notes for PostgreSQL 12.14, we saw the following change:
https://www.postgresql.org/docs/release/12.14/

>Add recursion and looping defenses in subquery pullup (Tom Lane)
>
>A contrived query can result in deep recursion and unreasonable amounts of
>time spent trying to flatten subqueries. A proper fix for that seems unduly
>invasive for a back-patch, but we can at least add stack depth checks and an
>interrupt check to allow the query to be cancelled.

Our understanding is that this change means some complex SQL statements that ran without error in the old version will report errors in the new version. If our understanding is correct, we would like to find out which SQL statements will report errors in the new version. Do you have any good methods for doing this?

At 2023-09-07 21:29:56, "Tom Lane" wrote:
>gzh writes:
>> I upgraded the version of PostgreSQL from 12.13 to 12.16.
>> If multiple subqueries or a large number of UNION ALL were used,
>> the previously unseen error message "ERROR: stack depth limit exceeded" will
>> appear after upgrading the version.
>
>Please provide a self-contained example. The people who might be
>able to fix this are too busy to guess about what you are seeing.
>
> regards, tom lane
ERROR: stack depth limit exceeded
Hi,

I upgraded the version of PostgreSQL from 12.13 to 12.16. If multiple subqueries or a large number of UNION ALL branches are used, queries that ran without error before the upgrade now fail with "ERROR: stack depth limit exceeded". I understand that increasing the value of max_stack_depth can make the errors go away, but is there a good way to find out which SQL statements are affected without having to run every SQL statement, as that would be too expensive?

Regards
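[Editor's note] As background for the max_stack_depth remark above, a short sketch of inspecting and changing the limit per session. The values are illustrative; raising the setting requires superuser, and it must stay safely below the OS stack limit (ulimit -s) minus a margin.

SHOW max_stack_depth;        -- default is typically 2MB

-- Raise it for the current session only, e.g. while testing a suspect query.
SET max_stack_depth = '6MB';

-- Conversely, lowering it in a test environment makes borderline queries
-- fail earlier, which can help flush them out.
SET max_stack_depth = '1MB';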
Re: How to improve the performance of my SQL query?
Thank you very much for taking the time to reply to my question.

> The problem is none of the explains you sent match with the description
> above. The last one when you forced the optimizer to go with index scan (SET
> enable_seqscan TO off;) the chosen index seems to be one composed by ms_cd
> and cd_ate (following your standard of 2 characters column name). There may
> have a couple of explanations to this:
> - One is that the index may not be exactly the same as described above;

The primary key of the tbl_sha table consists of several fields, and ms_cd is just one of them. I only explained the definitions of the fields used in the query SQL.

> - Another one is the order in the index. Because you have a composed index
> the order of the columns in the index matters, and it seems the order is
> (ms_cd, et_cd, etrys). I wonder if you could recreate this index with the
> following order: (ms_cd, etrys, et_cd) and run the same query;

The indexes of the TBL_SHA table are defined as follows.

CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS)
CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)

> You can also try to trick the optimizer, for example, what is the result (and
> explain) of the below query?
>
> WITH tbi (ry_cd) AS (
>     SELECT tbl_inf.ry_cd
>     FROM tbl_inf tbi
>     WHERE tbi.ms_cd = 'MLD009'
>     AND tbl_inf.ry_cd = '0001'
> ) SELECT COUNT(et_cd) FROM tbl_sha tbs
> JOIN tbi ON tbi.ry_cd = tbs.etrys
> WHERE tbs.ms_cd = 'MLD009';

The SQL execution encountered an error, so I made some modifications. Please refer to the execution plan.

WHERE tbi.ms_cd = 'MLD009' → WHERE tbl_inf.ms_cd = 'MLD009'

QUERY PLAN
Limit  (cost=2668811.76..2668811.77 rows=1 width=8) (actual time=133555.074..133557.729 rows=1 loops=1)
  ->  Aggregate  (cost=2668811.76..2668811.77 rows=1 width=8) (actual time=133555.072..133557.726 rows=1 loops=1)
        ->  Nested Loop  (cost=1000.29..2664512.83 rows=1719572 width=9) (actual time=29657.638..133341.053 rows=2113500 loops=1)
              ->  Index Only Scan using tbl_inf_pkc on tbl_inf  (cost=0.29..8.31 rows=1 width=9) (actual time=1.316..1.321 rows=1 loops=1)
                    Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '0001'::bpchar))
                    Heap Fetches: 1
              ->  Gather  (cost=1000.00..2647308.80 rows=1719572 width=18) (actual time=29656.318..132969.910 rows=2113500 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    ->  Parallel Seq Scan on tbl_sha tbs  (cost=0.00..2474351.60 rows=716488 width=18) (actual time=29654.184..132876.292 rows=704500 loops=3)
                          Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '0001'::bpchar))
                          Rows Removed by Filter: 14678996
Planning Time: 0.164 ms
Execution Time: 133557.767 ms

> Well, adding more resources tends to improve performance, but it's usually
> not linear and the improvement may not be as large as you want for the extra
> price you are paying. I would first try to understand the performance problem
> because using the "add more resources" approach may just delay the problem
> and it tends to get worse with time as the dataset increases.

I strongly agree with your viewpoint, but I currently don't have a solution in mind for the problem.
At 2023-07-28 04:38:39, "Charly" wrote:

Hi "gzh",

Based on the info you provided I'm assuming you are trying to use the TBL_SHA primary key to do an index-only scan as in you mentioned above you have:

> TBL_SHA
> ms_cd character(6) NOT NULL   -- PRIMARY KEY
> et_cd character(8)
> etrys character(8)

Assuming a composed index here by the 3 columns.

> TBL_INF
> ms_cd character(6) NOT NULL   -- PRIMARY KEY
> ry_cd character(8) NOT NULL   -- PRIMARY KEY

Here it's more clear that there is a composed index based on those 2 columns.

The problem is none of the explains you sent match with the description above. The last one when you forced the optimizer to go with index scan (SET enable_seqscan TO off;) the chosen index seems to be one composed by ms_cd and cd_ate (following your standard of 2 characters column name). There may have a couple of explanations to this:
- One is that the index may not be exactly the same as described above;
- Another one is the order in the index. Because you have a composed index the order of the columns in the index matters, and it seems the order is (ms_cd, et_cd, etrys). I wonder if you could rec
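[Editor's note] A hedged sketch of the index experiment Charly is suggesting. The index name is made up; CONCURRENTLY is optional but avoids blocking writes while testing on a live table.

-- Recreate the composite index with etrys as the second column, so the
-- predicate (ms_cd, etrys) = ('MLD009', '0001') matches the index prefix.
-- Including et_cd also makes the index covering for COUNT(et_cd), which
-- may allow an index-only scan.
CREATE INDEX CONCURRENTLY index_search_01b
    ON mdb.tbl_sha USING btree (ms_cd, etrys, et_cd);

-- Then re-run the query and compare plans.
EXPLAIN ANALYZE
SELECT count(et_cd)
FROM mdb.tbl_sha
WHERE ms_cd = 'MLD009'
  AND etrys = '0001';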
Re: How to improve the performance of my SQL query?
Thank you for your suggestion.

The database is hosted on AWS RDS, with the instance having vCPU=2 and RAM=4GB. Could the low performance be due to the low configuration of AWS RDS? We are considering trying a higher configuration instance.

At 2023-07-27 11:36:20, "David Rowley" wrote:
>On Wed, 26 Jul 2023 at 19:46, gzh wrote:
>> QUERY PLAN (enable_seqscan=on)
>
>> Execution Time: 167183.133 ms
>
>> QUERY PLAN (enable_seqscan=off)
>
>> Execution Time: 22320.153 ms
>
>effective_cache_size and random_page_cost are the settings you should
>be adjusting to coax the planner into using the index.
>
>A rule of thumb for effective_cache_size would be to set it to about
>75% of RAM. There are certainly cases where lower would make more
>sense, certainly, 75% will make more sense than the default 4GB value
>in the majority of cases.
>
>For random_page_cost, the default of 4.0 has been the default since
>HDDs were common. SSDs are common now and, comparatively to sequential
>I/O, their random I/O is faster than that of an HDD, so you may get
>better results by lowering random_page_cost.
>
>David
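[Editor's note] A sketch of what David's two suggestions look like in practice. The concrete values are assumptions for a 4GB instance, not recommendations from the thread, and on RDS these changes would normally go through a parameter group rather than ALTER SYSTEM.

-- ~75% of 4GB RAM; effective_cache_size is only a planner hint, not an
-- actual memory allocation.
ALTER SYSTEM SET effective_cache_size = '3GB';

-- Lower random_page_cost toward seq_page_cost (1.0) for SSD-backed storage.
ALTER SYSTEM SET random_page_cost = 1.1;

SELECT pg_reload_conf();

-- Or test the effect per session first:
SET random_page_cost = 1.1;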
Re: How to improve the performance of my SQL query?
Thank you very much for taking the time to reply to my question. >You might want to check your description of the table definitions. >Going by the above EXPLAIN ANALYZE output, it very much does not look >like ms_cd is the primary key of TBL_SHA. If it is then it's very >weird that you have 320 rows for MS_CD = 'MLD009'. You have some >data corruption if that's the case. I suspect you've just not >accurately described the table definition, however. The primary key of the SHA table has six fields, and ms_cd is just one of them. I'm sorry, I didn't make that clear. >Try executing the query after having done: > >SET enable_seqscan TO off; > >What plan does it use now? > >Is that plan faster or slower than the seq scan plan? There's improvement, but it's still quite slow. QUERY PLAN (enable_seqscan=on) Limit (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.822..167183.058 rows=1 loops=1) -> Aggregate (cost=2693516.87..2693516.88 rows=1 width=8) (actual time=167089.820..167183.056 rows=1 loops=1) -> Nested Loop (cost=1000.29..2688558.85 rows=1983209 width=9) (actual time=43544.753..166906.304 rows=2413500 loops=1) -> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=1.034..1.038 rows=1 loops=1) Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '0001'::bpchar)) Heap Fetches: 1 -> Gather (cost=1000.00..2668718.45 rows=1983209 width=18) (actual time=43543.714..166447.333 rows=2413500 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on tbl_sha (cost=0.00..2469397.55 rows=826337 width=18) (actual time=43537.056..166225.162 rows=804500 loops=3) Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '0001'::bpchar)) Rows Removed by Filter: 15362328 Planning Time: 2.942 ms Execution Time: 167183.133 ms SET enable_seqscan TO off; QUERY PLAN (enable_seqscan=off) Limit (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.419..22320.102 rows=1 loops=1) -> Aggregate (cost=2880973.06..2880973.07 rows=1 width=8) (actual time=22295.418..22320.100 rows=1 loops=1) -> Nested Loop (cost=93112.74..2876169.16 rows=1921561 width=9) (actual time=265.880..22000.432 rows=2413500 loops=1) -> Index Only Scan using tbl_inf_pkc on tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.013..0.020 rows=1 loops=1) Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '0001'::bpchar)) Heap Fetches: 1 -> Gather (cost=93112.45..2856945.24 rows=1921561 width=18) (actual time=265.864..21535.325 rows=2413500 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Bitmap Heap Scan on tbl_sha (cost=92112.45..2663789.14 rows=800650 width=18) (actual time=260.540..21442.169 rows=804500 loops=3) Recheck Cond: (ms_cd = 'MLD009'::bpchar) Rows Removed by Index Recheck: 49 Filter: (etrys = '0001'::bpchar) Rows Removed by Filter: 295500 Heap Blocks: exact=13788 lossy=10565 -> Bitmap Index Scan on index_search_04_mscd_cdate (cost=0.00..91632.06 rows=3402599 width=0) (actual time=249.718..249.718 rows=330 loops=1) Index Cond: (ms_cd = 'MLD009'::bpchar) Planning Time: 0.670 ms Execution Time: 22320.153 ms At 2023-07-25 21:04:16, "David Rowley" wrote: >On Fri, 21 Jul 2023 at 13:44, gzh wrote: >> >> The definitions of the columns used in SQL are as follows. 
>> >> TBL_SHA >> ms_cd character(6) NOT NULL -- PRIMARY KEY >> et_cd character(8) >> etrys character(8) > >> explain analyze >> select COUNT(ET_CD) >> from TBL_SHA >> WHERE TBL_SHA.MS_CD = 'MLD009' >> and TBL_SHA.ETRYS in >>(select TBL_INF.RY_CD >> from TBL_INF >> WHERE TBL_INF.MS_CD = 'MLD009' >>AND TBL_INF.RY_CD = '0001' >>) >> - Execution Plan - >> Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual >> time=124168.769..124168.771 rows=1 loops=1) >> -> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual >> time=124168.767..124168.769 rows=1 loops=1) >> -> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) >> (actual time=97264.166..123920
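[Editor's note] For anyone repeating the enable_seqscan experiment above, a small sketch of keeping the toggle scoped to a single test so it cannot leak into other traffic; SET LOCAL reverts at transaction end.

BEGIN;
SET LOCAL enable_seqscan TO off;   -- affects only this transaction

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(et_cd)
FROM tbl_sha
WHERE ms_cd = 'MLD009'
  AND etrys = '0001';

ROLLBACK;   -- the setting is restored regardless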
Re: How to improve the performance of my SQL query?
Thank you for your reply. >I think the whole query can just: >select COUNT(ET_CD) >from TBL_SHA >WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '0001'; This is related to the business logic. >if many duplicates rows returned, then there is no point of evaluate >something like {1 in (1,1,1,1,1,1,)} Because the primary key of the 'tbl_inf' table only consists of 'ms_cd' and 'ry_cd' columns, the subquery will not return duplicate rows. At 2023-07-24 22:42:01, "jian he" wrote: >On Mon, Jul 24, 2023 at 5:54 PM gzh wrote: >> >> >Did you change any parameters that have an impact on query planning? >> >> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS). >> >> I added some parameters and re-executed the Execution Plan. >> >> Except for the index not taking effect, I still don't know the reason why >> the index is not working. >> >> Is it because there is too much data that meets the conditions? >> >> >> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON) >> >> select COUNT(ET_CD) >> >> from TBL_SHA >> >> WHERE TBL_SHA.MS_CD = 'MLD009' >> >> and TBL_SHA.ETRYS in >> >>(select TBL_INF.RY_CD >> >> from TBL_INF >> >> WHERE TBL_INF.MS_CD = 'MLD009' >> >>AND TBL_INF.RY_CD = '0001' >> >>) >> >> >> - Execution Plan - >> >> Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual >> time=128691.521..128717.677 rows=1 loops=1) >> >> Output: (count(tbl_sha.et_cd)) >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual >> time=128691.519..128717.674 rows=1 loops=1) >> >> Output: count(tbl_sha.et_cd) >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) >> (actual time=2.364..128350.279 rows=2613500 loops=1) >> >> Output: tbl_sha.et_cd >> >> Buffers: shared hit=58948 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf >> (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1) >> >> Output: tbl_inf.ms_cd, tbl_inf.ry_cd >> >> Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND >> (tbl_inf.ry_cd = '0001'::bpchar)) >> >> Heap Fetches: 1 >> >> Buffers: shared hit=4 >> >> -> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) >> (actual time=2.315..127773.087 rows=2613500 loops=1) >> >> Output: tbl_sha.et_cd, tbl_sha.etrys >> >> Workers Planned: 2 >> >> Workers Launched: 2 >> >> Buffers: shared hit=58944 read=2112758 >> >> I/O Timings: read=357249.120 >> >> -> Parallel Seq Scan on mtpdb.tbl_sha >> (cost=0.00..2454183.88 rows=1012745 width=18) (actual >> time=952.728..127583.089 rows=871167 loops=3) >> >> Output: tbl_sha.et_cd, tbl_sha.etrys >> >> Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND >> (tbl_sha.etrys = '0001'::bpchar)) >> >> Rows Removed by Filter: 14062278 >> >> Buffers: shared hit=58944 read=2112758 >> >> I/O Timings: read=357249.120 >> >> Worker 0: actual time=1432.292..127762.181 >> rows=988036 loops=1 >> >> Buffers: shared hit=17875 read=706862 >> >> I/O Timings: read=119193.744 >> >> Worker 1: actual time=1425.878..127786.777 >> rows=992381 loops=1 >> >> Buffers: shared hit=19813 read=706359 >> >> I/O Timings: read=119386.899 >> >> Planning: >> >> Buffers: shared hit=42 >> >> Planning Time: 1.024 ms >> >> Execution Time: 128717.731 ms >> >> > >I think the whole query can just: >select COUNT(ET_CD) >from TBL_SHA >WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '0001'; > >> and TBL_SHA.ETRYS in 
>>(select TBL_INF.RY_CD >> from TBL_INF >> WHERE TBL_INF.MS_CD = 'MLD009' >>AND TBL_INF.RY_CD = '0001' >>) > >if subquery after IN clause part, no rows returned then the whole >query would return zero row. >if many duplicates rows returned, then there is no point of evaluate >something like {1 in (1,1,1,1,1,1,)}
Re: How to improve the performance of my SQL query?
>I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
>Or do you have a version that is too old for SETTINGS?

Sorry. Please refer to the following execution plan.

EXPLAIN (ANALYZE, BUFFERS, SETTINGS)
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
   (select TBL_INF.RY_CD
    from TBL_INF
    WHERE TBL_INF.MS_CD = 'MLD009'
    AND TBL_INF.RY_CD = '0001'
   )

- Execution Plan -
Buffers: shared hit=13 read=2171693
I/O Timings: read=365863.877
  ->  Aggregate  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=131595.624..131599.529 rows=1 loops=1)
        Buffers: shared hit=13 read=2171693
        I/O Timings: read=365863.877
        ->  Nested Loop  (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.341..131256.445 rows=2513500 loops=1)
              Buffers: shared hit=13 read=2171693
              I/O Timings: read=365863.877
              ->  Index Only Scan using tbl_inf_pkc on tbl_inf  (cost=0.29..8.31 rows=1 width=9) (actual time=1.471..1.476 rows=1 loops=1)
                    Index Cond: ((ms_cd = 'MLD009'::bpchar) AND (ry_cd = '0001'::bpchar))
                    Heap Fetches: 1
                    Buffers: shared hit=2 read=2
                    I/O Timings: read=1.412
              ->  Gather  (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=0.866..130696.440 rows=2513500 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=11 read=2171691
                    I/O Timings: read=365862.464
                    ->  Parallel Seq Scan on tbl_sha  (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=0.215..130476.981 rows=837833 loops=3)
                          Filter: ((ms_cd = 'MLD009'::bpchar) AND (etrys = '0001'::bpchar))
                          Rows Removed by Filter: 13728945
                          Buffers: shared hit=11 read=2171691
                          I/O Timings: read=365862.464
Settings: effective_cache_size = '1886088kB', jit = 'off', search_path = '"$user", mdb'
Planning:
  Buffers: shared hit=167 read=7
  I/O Timings: read=2.735
Planning Time: 3.733 ms
Execution Time: 131599.594 ms

At 2023-07-24 23:58:50, "Laurenz Albe" wrote:
>On Mon, 2023-07-24 at 17:54 +0800, gzh wrote:
>> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>
>I asked for EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...
>Or do you have a version that is too old for SETTINGS?
>
>One other idea: check if the index is INVALID (this will
>be visible if you run "\d tablename" in "psql").
>Invalid indexes won't be used.
>
>Yours,
>Laurenz Albe
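[Editor's note] Regarding Laurenz's hint about INVALID indexes, a hedged catalog query that lists them without psql's \d; pg_index.indisvalid is the flag in question.

-- List any invalid indexes; the planner ignores these.
SELECT n.nspname AS schema_name, c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;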
Re: How to improve the performance of my SQL query?
>Did you change any parameters that have an impact on query planning? >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS). I added some parameters and re-executed the Execution Plan. Except for the index not taking effect, I still don't know the reason why the index is not working. Is it because there is too much data that meets the conditions? EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON) select COUNT(ET_CD) from TBL_SHA WHERE TBL_SHA.MS_CD = 'MLD009' and TBL_SHA.ETRYS in (select TBL_INF.RY_CD from TBL_INF WHERE TBL_INF.MS_CD = 'MLD009' AND TBL_INF.RY_CD = '0001' ) - Execution Plan - Limit (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1) Output: (count(tbl_sha.et_cd)) Buffers: shared hit=58948 read=2112758 I/O Timings: read=357249.120 -> Aggregate (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1) Output: count(tbl_sha.et_cd) Buffers: shared hit=58948 read=2112758 I/O Timings: read=357249.120 -> Nested Loop (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1) Output: tbl_sha.et_cd Buffers: shared hit=58948 read=2112758 I/O Timings: read=357249.120 -> Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1) Output: tbl_inf.ms_cd, tbl_inf.ry_cd Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '0001'::bpchar)) Heap Fetches: 1 Buffers: shared hit=4 -> Gather (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1) Output: tbl_sha.et_cd, tbl_sha.etrys Workers Planned: 2 Workers Launched: 2 Buffers: shared hit=58944 read=2112758 I/O Timings: read=357249.120 -> Parallel Seq Scan on mtpdb.tbl_sha (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3) Output: tbl_sha.et_cd, tbl_sha.etrys Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '0001'::bpchar)) Rows Removed by Filter: 14062278 Buffers: shared hit=58944 read=2112758 I/O Timings: read=357249.120 Worker 0: actual time=1432.292..127762.181 rows=988036 loops=1 Buffers: shared hit=17875 read=706862 I/O Timings: read=119193.744 Worker 1: actual time=1425.878..127786.777 rows=992381 loops=1 Buffers: shared hit=19813 read=706359 I/O Timings: read=119386.899 Planning: Buffers: shared hit=42 Planning Time: 1.024 ms Execution Time: 128717.731 ms At 2023-07-24 13:43:46, "Laurenz Albe" wrote: >On Fri, 2023-07-21 at 09:43 +0800, gzh wrote: >> The definitions of the columns used in SQL are as follows. >> >> TBL_SHA >> >> ms_cd character(6) NOT NULL -- PRIMARY KEY >> et_cd character(8) >> etrys character(8) >> >> TBL_INF >> >> ms_cd character(6) NOT NULL -- PRIMARY KEY >> ry_cd character(8) NOT NULL -- PRIMARY KEY >> >> I made some modifications to the data, and I realized that I should not >> change the length of the data. >> The actual data and its corresponding execution plan are shown below. 
>> >> explain analyze >> select COUNT(ET_CD) >> from TBL_SHA >> WHERE TBL_SHA.MS_CD = 'MLD009' >> and TBL_SHA.ETRYS in >>(select TBL_INF.RY_CD >> from TBL_INF >> WHERE TBL_INF.MS_CD = 'MLD009' >>AND TBL_INF.RY_CD = '0001' >>) >> - Execution Plan - >> Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual >> time=124168.769..124168.771 rows=1 loops=1) >> -> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual >> time=124168.767..124168.769 rows=1 loops=1) >> -> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) >> (actual time=97264.166..123920.769 rows=320 loops=1) >> -> Index Only Scan using TBL_INF_pkc on TBL_INF >> (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1) >>
Re:How to improve the performance of my SQL query?
>select (38700325 - 11833442) / 38700325.0;
>is 0.69 approx.
>So I think it says around 69% of rows satisfy the query condition.

Thank you for your reply. I have learned a lot from it.

At 2023-07-20 23:20:16, "jian he" wrote:
>On Thu, Jul 20, 2023 at 7:36 PM gzh wrote:
>>
>> Thank you very much for taking the time to reply to my question.
>>
>> Sorry, I provided incorrect information.
>> The index also does not work in the following query statement.
>>
>> > select COUNT(ET_CD)
>> > from TBL_SHA
>> > WHERE MS_CD = '009'
>> > AND ETRYS = '01'
>>
>> QUERY PLAN
>> Limit  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
>>   ->  Finalize Aggregate  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
>>         ->  Gather  (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
>>               Workers Planned: 2
>>               Workers Launched: 2
>>               ->  Partial Aggregate  (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
>>                     ->  Parallel Seq Scan on TBL_SHA  (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=107 loops=3)
>>                           Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '01'::bpchar))
>>                           Rows Removed by Filter: 11833442
>> Planning Time: 0.118 ms
>> Execution Time: 128668.290 ms
>>
>> The TBL_SHA table has another index, as shown below.
>>
>> CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
>> CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)
>
>> Rows Removed by Filter: 11833442
>select (38700325 - 11833442) / 38700325.0;
>is 0.69 approx.
>So I think it says around 69% of rows satisfy the query condition.
>
>but I am not sure in the following 2 cases, whether the actual rows
>are noisy or not. I can not find the doc explaining it.
>> Partial Aggregate (actual time=128655.256..128655.258 rows=1 loops=3)
>> Finalize Aggregate (actual time=128667.437..128668.246 rows=1 loops=1)
Re: How to improve the performance of my SQL query?
The definitions of the columns used in SQL are as follows. TBL_SHA ms_cd character(6) NOT NULL -- PRIMARY KEY et_cd character(8) etrys character(8) TBL_INF ms_cd character(6) NOT NULL -- PRIMARY KEY ry_cd character(8) NOT NULL -- PRIMARY KEY I made some modifications to the data, and I realized that I should not change the length of the data. The actual data and its corresponding execution plan are shown below. explain analyze select COUNT(ET_CD) from TBL_SHA WHERE TBL_SHA.MS_CD = 'MLD009' and TBL_SHA.ETRYS in (select TBL_INF.RY_CD from TBL_INF WHERE TBL_INF.MS_CD = 'MLD009' AND TBL_INF.RY_CD = '0001' ) - Execution Plan - Limit (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1) -> Aggregate (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1) -> Nested Loop (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=320 loops=1) -> Index Only Scan using TBL_INF_pkc on TBL_INF (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1) Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '0001'::bpchar)) Heap Fetches: 1 -> Seq Scan on TBL_SHA (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=320 loops=1) Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '0001'::bpchar)) Rows Removed by Filter: 32000325 Planning Time: 0.162 ms Execution Time: 124168.838 ms At 2023-07-20 22:48:19, "Laurenz Albe" wrote: >On Thu, 2023-07-20 at 22:14 +0800, gzh wrote: >> The information I provided is incorrect, please see my previous reply. > >My question remains: I would like to see the table definitions. > >Also, did you ANALYZE the tables? > >Yours, >Laurenz Albe
Re: How to improve the performance of my SQL query?
Thank you for your reply. The information I provided was incorrect; please see my previous reply.

>What I cannot see is if the columns are defined as "character" or whether you bind
>the parameters as "character". Can you show us the table definition of "TBL_SHA"
>and "TBL_INF"?

For information security reasons, I can't provide the table definitions. These columns are defined as "character".

At 2023-07-20 19:58:59, "Laurenz Albe" wrote:
>On Thu, 2023-07-20 at 15:09 +0800, gzh wrote:
>> I'm running into some performance issues with my SQL query.
>> The following SQL query is taking a long time to execute.
>>
>> explain analyze
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE TBL_SHA.MS_CD = '009'
>> and TBL_SHA.ETRYS in
>>    (select TBL_INF.RY_CD
>>     from TBL_INF
>>     WHERE TBL_INF.MS_CD = '009'
>>     AND TBL_INF.RY_CD = '01'
>>    )
>> - Execution Plan -
>> Limit  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
>>   ->  Aggregate  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
>>         ->  Nested Loop  (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=320 loops=1)
>>               ->  Index Only Scan using TBL_INF_pkc on TBL_INF  (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
>>                     Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '01'::bpchar))
>>                     Heap Fetches: 1
>>               ->  Seq Scan on TBL_SHA  (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=320 loops=1)
>>                     Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '01'::bpchar))
>>                     Rows Removed by Filter: 32000325
>> Planning Time: 0.162 ms
>> Execution Time: 124168.838 ms
>>
>> The index is defined as follows.
>>
>> CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
>
>Actual rows = 320, rows removed by filter is ten times as much.
>It should use an index.
>
>> When I take the following sql statement, the index works fine and the query is fast.
>>
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE MS_CD = '009'
>> AND ETRYS = '01'
>>
>> The amount of data in the table is as follows.
>> TBL_SHA    38700325
>> TBL_INF    35546
>
>This looks very much like it is a problem with the data types.
>I see that you are using "character", which you shouldn't do.
>
>What I cannot see is if the columns are defined as "character" or whether you bind
>the parameters as "character". Can you show us the table definition of "TBL_SHA"
>and "TBL_INF"?
>
>Yours,
>Laurenz Albe
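[Editor's note] A small sketch of why Laurenz flags the character type. bpchar is blank-padded, and its comparison semantics differ from text, which is one way a bound parameter of a different type can defeat an index on a character(n) column. The literals are illustrative.

SELECT '01'::character(8) = '01   '::character(8);   -- true: padding ignored
SELECT length('01'::character(8));                   -- 2: trailing blanks not counted
SELECT octet_length('01'::character(8));             -- 8: but they are stored

-- If the application binds a text/varchar parameter, the comparison may be
-- resolved with text operators instead of bpchar ones, which can prevent
-- the btree index on the character(n) column from being used.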
Re:Re: How to improve the performance of my SQL query?
Thank you very much for taking the time to reply to my question.

Sorry, I provided incorrect information. The index also does not work in the following query statement.

> select COUNT(ET_CD)
> from TBL_SHA
> WHERE MS_CD = '009'
> AND ETRYS = '01'

QUERY PLAN
Limit  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.439..128668.250 rows=1 loops=1)
  ->  Finalize Aggregate  (cost=2419643.47..2419643.48 rows=1 width=8) (actual time=128667.437..128668.246 rows=1 loops=1)
        ->  Gather  (cost=2419643.25..2419643.46 rows=2 width=8) (actual time=128664.108..128668.233 rows=3 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              ->  Partial Aggregate  (cost=2418643.25..2418643.26 rows=1 width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
                    ->  Parallel Seq Scan on TBL_SHA  (cost=0.00..2415548.85 rows=1237762 width=9) (actual time=75357.455..128531.615 rows=107 loops=3)
                          Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '01'::bpchar))
                          Rows Removed by Filter: 11833442
Planning Time: 0.118 ms
Execution Time: 128668.290 ms

The TBL_SHA table has other indexes as well, as shown below.

CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, BK_CD, FR_CD, RM_CD)

When I run the following query statement, the result is returned quickly. Why does index_search_01 always not work?

explain analyze
select TO_CHAR(MAX(TBL_SHA.ET_DAY),'YYYYMMDD') AS ET_DAY
from TBL_SHA
WHERE MS_CD = '008'
AND ET_CD = '03'

QUERY PLAN
Limit  (cost=4.11..4.13 rows=1 width=32) (actual time=0.043..0.044 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.56..4.11 rows=1 width=8) (actual time=0.032..0.033 rows=1 loops=1)
          ->  Index Scan Backward using index_search_01 on TBL_SHA  (cost=0.56..10836962.40 rows=3054052 width=8) (actual time=0.032..0.032 rows=1 loops=1)
                Index Cond: ((MS_CD = '008'::bpchar) AND (ent_day IS NOT NULL))
                Filter: (ET_CD = '03'::bpchar)
  ->  Result  (cost=4.11..4.13 rows=1 width=32) (actual time=0.042..0.042 rows=1 loops=1)
Planning Time: 0.228 ms
Execution Time: 0.070 ms

At 2023-07-20 16:07:15, "Erik Wienhold" wrote:
>> On 20/07/2023 09:09 CEST gzh wrote:
>>
>> I'm running into some performance issues with my SQL query.
>> The following SQL query is taking a long time to execute.
>>
>> explain analyze
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE TBL_SHA.MS_CD = '009'
>> and TBL_SHA.ETRYS in
>>    (select TBL_INF.RY_CD
>>     from TBL_INF
>>     WHERE TBL_INF.MS_CD = '009'
>>     AND TBL_INF.RY_CD = '01'
>>    )
>> - Execution Plan -
>> Limit  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
>>   ->  Aggregate  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
>>         ->  Nested Loop  (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=320 loops=1)
>>               ->  Index Only Scan using TBL_INF_pkc on TBL_INF  (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
>>                     Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '01'::bpchar))
>>                     Heap Fetches: 1
>>               ->  Seq Scan on TBL_SHA  (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=320 loops=1)
>>                     Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '01'::bpchar))
>>                     Rows Removed by Filter: 32000325
>> Planning Time: 0.162 ms
>> Execution Time: 124168.838 ms
>>
>> The index is defined as follows.
>>
>> CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);
>>
>> When I take the following sql statement, the index works fine and the query
>> is fast.
>>
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE MS_CD = '009'
>> AND ETRYS = '01'
>>
>> The amount of data in the table is as follows.
>> TBL_SHA 38700325
>> TBL_INF 35546
>>
>> Any suggestions for improving the performance of the query would be greatly
>> appreciated.
>
>You can try EXISTS instead of IN to correlate the subquery and the outer query:
>
>  SELECT count(et_cd)
>  FROM tbl_sha
>  WHERE tbl_sha.ms_cd = '009'
>  AND tbl_sha.etrys = '01'
>  AND EXISTS (
>    SELECT
>    FROM tbl_inf
>    WHERE tbl_inf.ms_cd = tbl_sha.ms_cd
>    AND tbl_inf.ry_cd = tbl_sha.etrys
>  )
>
>--
>Erik
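[Editor's note] As a side note on the recurring question above of whether index_search_01 is ever chosen at all, the cumulative statistics views can answer that without guessing. A hedged sketch, using the table and index names from this thread:

-- Scan counts per index since statistics were last reset.
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'tbl_sha'
ORDER BY idx_scan;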
How to improve the performance of my SQL query?
Hi everyone,

I'm running into some performance issues with my SQL query. The following SQL query is taking a long time to execute.

explain analyze
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = '009'
and TBL_SHA.ETRYS in
   (select TBL_INF.RY_CD
    from TBL_INF
    WHERE TBL_INF.MS_CD = '009'
    AND TBL_INF.RY_CD = '01'
   )

- Execution Plan -
Limit  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
  ->  Aggregate  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
        ->  Nested Loop  (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=320 loops=1)
              ->  Index Only Scan using TBL_INF_pkc on TBL_INF  (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
                    Index Cond: ((MS_CD = '009'::bpchar) AND (RY_CD = '01'::bpchar))
                    Heap Fetches: 1
              ->  Seq Scan on TBL_SHA  (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=320 loops=1)
                    Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = '01'::bpchar))
                    Rows Removed by Filter: 32000325
Planning Time: 0.162 ms
Execution Time: 124168.838 ms

The index is defined as follows.

CREATE INDEX index_search_01 ON mdb.TBL_SHA USING btree (MS_CD, ETRYS);

When I run the following SQL statement, the index works fine and the query is fast.

select COUNT(ET_CD)
from TBL_SHA
WHERE MS_CD = '009'
AND ETRYS = '01'

The amount of data in the tables is as follows.
TBL_SHA    38700325
TBL_INF    35546

Any suggestions for improving the performance of the query would be greatly appreciated. Thanks in advance!
Re: function to_char(unknown) is not unique at character 8
Thank you for the solution you provided to identify these useless usages of this function. I learned a lot from it.

At 2023-07-06 22:42:38, "Pavel Stehule" wrote:

On Thu, 6 Jul 2023 at 16:28, gzh wrote:

Thank you for the solution, it works fine.

> I have a question. Why do you use the to_char(string) function? Instead
> to_char('text') you can write only 'text'.

I'm guessing it's probably a bug made by the original developer, but I'm not sure how many bugs there are, because it works fine in the older version (orafce 3.13).

you can use another version

CREATE OR REPLACE FUNCTION oracle.to_char(text)
RETURNS text AS $$
BEGIN
  RAISE WARNING 'using useless to_char function';
  RETURN $1;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

This can help to identify these useless usages of this function.

Unfortunately, the Oracle type system is very different, so it is hard to emulate in Postgres. And Oracle PL/SQL design allows people to write terribly ugly code. These bad things are hard to repeat in Pg.

At 2023-07-06 19:54:19, "Pavel Stehule" wrote:

On Thu, 6 Jul 2023 at 13:38, gzh wrote:

Thank you very much for taking the time to reply to my question. There are a lot of TO_CHAR in my application, and there is a high cost of modifying the code. Is there another solution to solve the problem without modifying the code?

There is one dirty workaround, and because it is dirty, I don't want to push it to orafce. You can create your own function to_char:

CREATE OR REPLACE FUNCTION oracle.to_char(text)
RETURNS text AS $$
SELECT $1
$$ LANGUAGE sql IMMUTABLE STRICT;

This version will be preferred and fix this issue. On second thought, the behavior can be a little bit different than before.

I have a question. Why do you use the to_char(string) function? Instead of to_char('text') you can write only 'text'.

At 2023-07-06 19:21:24, "Pavel Stehule" wrote:

Hi

On Thu, 6 Jul 2023 at 11:19, gzh wrote:

Hi, I upgraded the version of PostgreSQL from 12.6 to 12.13. When I execute the SQL below, the to_char function causes the following error.

---SQL--
select TO_CHAR('100');

ERROR:  function to_char(unknown) is not unique at character 8
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

There is no problem before the upgrade, and the to_char function comes from the Orafce extension. The configuration of the old and new databases is as follows.

Database server (old): PostgreSQL 12.6 (orafce 3.13)
Database server (new): PostgreSQL 12.13 (orafce 3.24)

The new database has successfully installed the orafce 3.24 extension. The error does not occur in "PostgreSQL 12.6 and orafce 3.13", but occurs in "PostgreSQL 12.13 and orafce 3.24", so either the difference between 12.6 and 12.13 or the difference between orafce 3.13 and 3.24 is suspicious. What is the reason for the problem and how to fix the error?

The new orafce contains to_char for numeric types and to_char for timestamp. The old orafce had a to_char function (with one argument) only for numeric types. This is the reason why the system doesn't know if a string of unknown type (a PostgreSQL string literal) should be processed as numeric or as a timestamp. The best fix is to specify the used type, like:

(2023-07-06 13:19:33) postgres=# select oracle.TO_CHAR('100'::int);
┌─────────┐
│ to_char │
╞═════════╡
│ 100     │
└─────────┘
(1 row)

Regards

Pavel

Regards
Re: function to_char(unknown) is not unique at character 8
Thank you for the solution, it works fine.

> I have a question. Why do you use the to_char(string) function? Instead
> to_char('text') you can write only 'text'.

I'm guessing it's probably a bug made by the original developer, but I'm not sure how many bugs there are, because it works fine in the older version (orafce 3.13).

At 2023-07-06 19:54:19, "Pavel Stehule" wrote:

On Thu, 6 Jul 2023 at 13:38, gzh wrote:

Thank you very much for taking the time to reply to my question. There are a lot of TO_CHAR in my application, and there is a high cost of modifying the code. Is there another solution to solve the problem without modifying the code?

There is one dirty workaround, and because it is dirty, I don't want to push it to orafce. You can create your own function to_char:

CREATE OR REPLACE FUNCTION oracle.to_char(text)
RETURNS text AS $$
SELECT $1
$$ LANGUAGE sql IMMUTABLE STRICT;

This version will be preferred and fix this issue. On second thought, the behavior can be a little bit different than before.

I have a question. Why do you use the to_char(string) function? Instead of to_char('text') you can write only 'text'.

At 2023-07-06 19:21:24, "Pavel Stehule" wrote:

Hi

On Thu, 6 Jul 2023 at 11:19, gzh wrote:

Hi, I upgraded the version of PostgreSQL from 12.6 to 12.13. When I execute the SQL below, the to_char function causes the following error.

---SQL--
select TO_CHAR('100');

ERROR:  function to_char(unknown) is not unique at character 8
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

There is no problem before the upgrade, and the to_char function comes from the Orafce extension. The configuration of the old and new databases is as follows.

Database server (old): PostgreSQL 12.6 (orafce 3.13)
Database server (new): PostgreSQL 12.13 (orafce 3.24)

The new database has successfully installed the orafce 3.24 extension. The error does not occur in "PostgreSQL 12.6 and orafce 3.13", but occurs in "PostgreSQL 12.13 and orafce 3.24", so either the difference between 12.6 and 12.13 or the difference between orafce 3.13 and 3.24 is suspicious. What is the reason for the problem and how to fix the error?

The new orafce contains to_char for numeric types and to_char for timestamp. The old orafce had a to_char function (with one argument) only for numeric types. This is the reason why the system doesn't know if a string of unknown type (a PostgreSQL string literal) should be processed as numeric or as a timestamp. The best fix is to specify the used type, like:

(2023-07-06 13:19:33) postgres=# select oracle.TO_CHAR('100'::int);
┌─────────┐
│ to_char │
╞═════════╡
│ 100     │
└─────────┘
(1 row)

Regards

Pavel

Regards
Re: function to_char(unknown) is not unique at character 8
Thank you very much for taking the time to reply to my question.

I added oracle to the search_path, but it didn't work.

postgres=# show search_path;
             search_path
--------------------------------------
 "$user", public, oracle, pg_catalog

At 2023-07-06 19:36:49, "Erik Wienhold" wrote:
>> On 06/07/2023 11:19 CEST gzh wrote:
>>
>> I upgraded the version of PostgreSQL from 12.6 to 12.13,
>> when I execute the sql below , the to_char function caused the following
>> error.
>>
>> ---SQL--
>> select TO_CHAR('100');
>>
>> ERROR: function to_char(unknown) is not unique at character 8
>> HINT: Could not choose a best candidate function. You might need to add
>> explicit type casts.
>>
>> There is no problem before the upgrade and to_char function comes from the
>> Orafce extension.
>> The configuration of the old and new databases is as follows.
>>
>> Database server (old): PostgreSQL 12.6(orafce3.13)
>> Database server (new): PostgreSQL 12.13(orafce3.24)
>>
>> The new database has successfully installed the orafce 3.24 extension.
>> It does not occur in "PostgreSQL 12.6 and orafce 3.13",
>> but occurs in "PostgreSQL 12.13 and orafce 3.24",
>> so either the difference between 12.6 and 12.13 or the difference between
>> orafce 3.13 and 3.24 is suspicious.
>>
>> What is the reason for the problem and how to fix the error?
>
>This rings a bell:
>
>https://www.postgresql.org/message-id/1597875806.606392.1681915893771%40office.mailbox.org
>
>Either qualify functions with namespace oracle, e.g. oracle.to_char('100'),
>or place oracle on the search path.
>
>--
>Erik
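[Editor's note] Since putting oracle on the search_path did not help here, it may be worth listing the competing candidates: the error means several one-argument to_char functions match an unknown-typed literal equally well, and no search_path ordering can break that tie. A hedged catalog query:

-- List all one-argument to_char candidates and the schema each lives in.
SELECT p.oid::regprocedure AS signature, n.nspname AS schema_name
FROM pg_proc p
JOIN pg_namespace n ON n.oid = p.pronamespace
WHERE p.proname = 'to_char'
  AND p.pronargs = 1;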
Re: function to_char(unknown) is not unique at character 8
Thank you very much for taking the time to reply to my question. There are a lot of TO_CHAR in my application, and there is a high cost of modifying the code. Is there another solution to solve the problem without modifying the code?

At 2023-07-06 19:21:24, "Pavel Stehule" wrote:

Hi

On Thu, 6 Jul 2023 at 11:19, gzh wrote:

Hi, I upgraded the version of PostgreSQL from 12.6 to 12.13. When I execute the SQL below, the to_char function causes the following error.

---SQL--
select TO_CHAR('100');

ERROR:  function to_char(unknown) is not unique at character 8
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

There is no problem before the upgrade, and the to_char function comes from the Orafce extension. The configuration of the old and new databases is as follows.

Database server (old): PostgreSQL 12.6 (orafce 3.13)
Database server (new): PostgreSQL 12.13 (orafce 3.24)

The new database has successfully installed the orafce 3.24 extension. The error does not occur in "PostgreSQL 12.6 and orafce 3.13", but occurs in "PostgreSQL 12.13 and orafce 3.24", so either the difference between 12.6 and 12.13 or the difference between orafce 3.13 and 3.24 is suspicious. What is the reason for the problem and how to fix the error?

The new orafce contains to_char for numeric types and to_char for timestamp. The old orafce had a to_char function (with one argument) only for numeric types. This is the reason why the system doesn't know if a string of unknown type (a PostgreSQL string literal) should be processed as numeric or as a timestamp. The best fix is to specify the used type, like:

(2023-07-06 13:19:33) postgres=# select oracle.TO_CHAR('100'::int);
┌─────────┐
│ to_char │
╞═════════╡
│ 100     │
└─────────┘
(1 row)

Regards

Pavel

Regards
function to_char(unknown) is not unique at character 8
Hi,

I upgraded the version of PostgreSQL from 12.6 to 12.13. When I execute the SQL below, the to_char function causes the following error.

---SQL--
select TO_CHAR('100');

ERROR:  function to_char(unknown) is not unique at character 8
HINT:  Could not choose a best candidate function. You might need to add explicit type casts.

There is no problem before the upgrade, and the to_char function comes from the Orafce extension. The configuration of the old and new databases is as follows.

Database server (old): PostgreSQL 12.6 (orafce 3.13)
Database server (new): PostgreSQL 12.13 (orafce 3.24)

The new database has successfully installed the orafce 3.24 extension. The error does not occur in "PostgreSQL 12.6 and orafce 3.13", but occurs in "PostgreSQL 12.13 and orafce 3.24", so either the difference between 12.6 and 12.13 or the difference between orafce 3.13 and 3.24 is suspicious. What is the reason for the problem and how can the error be fixed?

Regards
Re: Is there any good optimization solution to improve the query efficiency?
cost=9.44..9.44 rows=472 width=108) (actual time=216.361..216.361 rows=15123 loops=1)
  Buckets: 16384 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 882kB
  ->  CTE Scan on t_res  (cost=0.00..9.44 rows=472 width=108) (actual time=55.591..211.698 rows=15123 loops=1)
Planning Time: 1.417 ms
Execution Time: 411.019 ms

At 2023-06-05 22:53:56, "Lorusso Domenico" wrote:

try this (there are some comments):

with t_res as (
    select RSNO, KNO
    from TBL_RES
    where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
      and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
      and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
      and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')
),
t_pov2 as (
    select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
    from t_res  -- this is tbl_res already filtered by date
    inner join TBL_CUST T_CUST on T_RES.RSNO = T_CUST.RSNO
    inner join TBL_POV T_POV on T_POV.CRSNO = T_RES.CRSNO
    -- why you use this table? it doesn't seem to be used to extract data.
    -- Are you trying to extract data from T_RES that have at least a record
    -- in T_POV? in this case could work better move this join in the first
    -- with (using distinct or group by to ensure there will be just a record
    -- for rsno and kno)
    where T_CUST.STSFLG = 'T'
      and T_CUST.DISPSEQ <> AND T_CUST.KFIX = '0'
    group by T_CUST.RSNO, T_CUST.KNO
),
t_pov3 as (
    select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
    from TBL_CUST T_CUST
    inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
                     and T_POV2.KNO = T_CUST.KNO
                     and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
                      and TBL_RES.KNO = T_POV3.KNO

On Mon, 5 Jun 2023 at 12:06, gzh wrote:

Thank you very much for taking the time to reply to my question. I followed your suggestion and rewrote the SQL using a Common Table Expression (CTE). Unfortunately, there was no significant improvement in performance.

At 2023-06-05 17:47:25, "Lorusso Domenico" wrote:

Hello,
In many cases a formal rewrite using a WITH statement can solve the issue. If you need a join, always use JOIN syntax:

where T_POV2.RSNO = T_CUST.RSNO
  and T_POV2.KNO = T_CUST.KNO
  and T_POV2.GSTSEQ = T_CUST.GSTSEQ

this is an inner join. I mean something like this:

with t_pov2 as (
    select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
    from TBL_CUST T_CUST, TBL_POV T_POV, TBL_RES T_RES
    where T_CUST.STSFLG = 'T'
      and T_CUST.DISPSEQ <> AND T_CUST.KFIX = '0'
      and T_POV.CRSNO = T_RES.CRSNO
      and T_RES.RSNO = T_CUST.RSNO
    group by T_CUST.RSNO, T_CUST.KNO
),
t_pov3 as (
    select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
    from TBL_CUST T_CUST
    inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
                     and T_POV2.KNO = T_CUST.KNO
                     and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
                      and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
  and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contains fewer records, a good idea is to start from this table and join the others to it.

On Mon, 5 Jun 2023 at 08:57, gzh wrote:

Hi everyone,
I'm running into some performance issues with my SQL query. The following SQL query is taking a long time to execute.

Execution Plan:

explain analyse
select *
from TBL_RES
left outer join
    (select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
     from TBL_CUST T_CUST,
          (select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
           from TBL_CUST T_CUST, TBL_POV T_POV, TBL_RES T_RES
           where T_CUST.STSFLG = 'T'
             and T_CUST.DISPSEQ <> AND T_CUST.KFIX = '0'
             and T_POV.CRSNO = T_RES.CRSNO
             and T_RES.RSNO = T_CUST.RSNO
           group by T_CUST.RSNO, T_CUST.KNO) T_POV2
     where T_POV2.RSNO = T_CUST.RSNO
       and T_POV2.KNO = T_CUST.KNO
       and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3
on TBL_RES.RSNO = T_POV3.RSNO2 and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
  and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

- Execution Plan -
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
Re: Is there any good optimization solution to improve the query efficiency?
Thank you very much for taking the time to reply to my question. I followed your suggestion and rewrote the SQL using a Common Table Expression (CTE). Unfortunately, there was no significant improvement in performance.

At 2023-06-05 17:47:25, "Lorusso Domenico" wrote:

Hello,
In many cases a formal rewrite using a WITH statement can solve the issue. If you need a join, always use JOIN syntax:

where T_POV2.RSNO = T_CUST.RSNO
  and T_POV2.KNO = T_CUST.KNO
  and T_POV2.GSTSEQ = T_CUST.GSTSEQ

this is an inner join. I mean something like this:

with t_pov2 as (
    select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
    from TBL_CUST T_CUST, TBL_POV T_POV, TBL_RES T_RES
    where T_CUST.STSFLG = 'T'
      and T_CUST.DISPSEQ <> AND T_CUST.KFIX = '0'
      and T_POV.CRSNO = T_RES.CRSNO
      and T_RES.RSNO = T_CUST.RSNO
    group by T_CUST.RSNO, T_CUST.KNO
),
t_pov3 as (
    select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
    from TBL_CUST T_CUST
    inner join t_pov2 on T_POV2.RSNO = T_CUST.RSNO
                     and T_POV2.KNO = T_CUST.KNO
                     and T_POV2.GSTSEQ = T_CUST.GSTSEQ
)
select *
from TBL_RES
left outer join t_pov3 on TBL_RES.RSNO = T_POV3.RSNO2
                      and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
  and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

but if tbl_res contains fewer records, a good idea is to start from this table and join the others to it.

On Mon, 5 Jun 2023 at 08:57, gzh wrote:

Hi everyone,
I'm running into some performance issues with my SQL query. The following SQL query is taking a long time to execute.

Execution Plan:

explain analyse
select *
from TBL_RES
left outer join
    (select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
     from TBL_CUST T_CUST,
          (select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
           from TBL_CUST T_CUST, TBL_POV T_POV, TBL_RES T_RES
           where T_CUST.STSFLG = 'T'
             and T_CUST.DISPSEQ <> AND T_CUST.KFIX = '0'
             and T_POV.CRSNO = T_RES.CRSNO
             and T_RES.RSNO = T_CUST.RSNO
           group by T_CUST.RSNO, T_CUST.KNO) T_POV2
     where T_POV2.RSNO = T_CUST.RSNO
       and T_POV2.KNO = T_CUST.KNO
       and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3
on TBL_RES.RSNO = T_POV3.RSNO2 and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
  and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

- Execution Plan -
Nested Loop Left Join  (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  ->  Gather  (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on TBL_RES  (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
              Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
              Rows Removed by Filter: 161714
  ->  Materialize  (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
        ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
              Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq
              ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
              ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                    Buckets: 65536  Batches: 8  Memory Usage: 2773kB
                    ->  Finalize GroupAggregate  (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                          Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                          ->  Gather Merge  (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                                Workers Planned: 2
                                Workers Launched: 1
Re:Re: Is there any good optimization solution to improve the query efficiency?
Hi, David

>The above join's selectivity estimation seems to be causing an upper
>join to resort to performing a Nested Loop join because the planner
>thinks the join will only produce 1 row.
>
>Unfortunately, extended statistics only help for base relation
>estimations and do nothing for join estimations, so your best bet
>might be to just:
>
>SET enable_nestloop TO off;
>
>for this query.

After making the adjustments as you suggested, the SQL statement that previously took 16 minutes to return results can now be queried in less than 10 seconds.

Thank you very much for taking the time to reply to my question and providing a solution that solved my issue. Your expertise and willingness to help are greatly appreciated, and I learned a lot from your answer. Thanks again!

At 2023-06-05 16:21:19, "David Rowley" wrote:
>On Mon, 5 Jun 2023 at 18:56, gzh wrote:
>> I'm running into some performance issues with my SQL query.
>> The following SQL query is taking a long time to execute.
>
>> ->  Hash Join  (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
>>       Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq
>>       ->  Seq Scan on TBL_CUST T_CUST  (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
>>       ->  Hash  (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
>
>The above join's selectivity estimation seems to be causing an upper
>join to resort to performing a Nested Loop join because the planner
>thinks the join will only produce 1 row.
>
>Unfortunately, extended statistics only help for base relation
>estimations and do nothing for join estimations, so your best bet
>might be to just:
>
>SET enable_nestloop TO off;
>
>for this query.
>
>David
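[Editor's note] A sketch of applying David's workaround so it stays scoped to this one report query rather than leaking into the whole session or database. The SELECT is a placeholder standing in for the real query from this thread.

BEGIN;
SET LOCAL enable_nestloop TO off;   -- only for this transaction

-- run the report query here, e.g.:
EXPLAIN ANALYZE
SELECT count(*) FROM tbl_res;       -- placeholder for the real query

COMMIT;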
Is there any good optimization solution to improve the query efficiency?
Hi everyone, I'm running into some performance issues with my SQL query. The following SQL query is taking a long time to execute.

Query:

explain analyse
select *
from TBL_RES
left outer join (
    select T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
    from TBL_CUST T_CUST,
         (select T_CUST.RSNO, T_CUST.KNO, MIN(T_CUST.GSTSEQ) GSTSEQ
          from TBL_CUST T_CUST, TBL_POV T_POV, TBL_RES T_RES
          where T_CUST.STSFLG = 'T'
            and T_CUST.DISPSEQ <> ''
            and T_CUST.KFIX = '0'
            and T_POV.CRSNO = T_RES.CRSNO
            and T_RES.RSNO = T_CUST.RSNO
          group by T_CUST.RSNO, T_CUST.KNO) T_POV2
    where T_POV2.RSNO = T_CUST.RSNO
      and T_POV2.KNO = T_CUST.KNO
      and T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3
  on TBL_RES.RSNO = T_POV3.RSNO2
 and TBL_RES.KNO = T_POV3.KNO
where TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')
  and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')
  and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')

Execution Plan:

Nested Loop Left Join (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)
  Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))
  Rows Removed by Join Filter: 4992268642
  -> Gather (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Parallel Seq Scan on TBL_RES (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)
            Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))
            Rows Removed by Filter: 161714
  -> Materialize (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)
       -> Hash Join (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)
            Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq
            -> Seq Scan on TBL_CUST T_CUST (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)
            -> Hash (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)
                 Buckets: 65536 Batches: 8 Memory Usage: 2773kB
                 -> Finalize GroupAggregate (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)
                      Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                      -> Gather Merge (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)
                           Workers Planned: 2
                           Workers Launched: 1
                           -> Partial GroupAggregate (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)
                                Group Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                -> Sort (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)
                                     Sort Key: T_CUST_1.RSNO, T_CUST_1.KNO
                                     Sort Method: external merge Disk: 5480kB
                                     Worker 0: Sort Method: external merge Disk: 5520kB
                                     -> Parallel Hash Join (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)
                                          Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)
                                          -> Parallel Seq Scan on TBL_CUST T_CUST_1 (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)
                                               Filter: ((dispseq <> ''::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))
                                               Rows Removed by Filter: 835318
                                          -> Parallel Hash (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)
                                               Buckets: 131072 Batches: 8 Memory Usage: 3008kB
                                               -> Parallel Hash Join (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)
                                                    Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)
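The inner MIN() subquery plus the join back to TBL_CUST is a greatest-n-per-group pattern, and the plan above spends most of its time materializing 330111 rows and rescanning them 15123 times. One possible rewrite (a sketch only; it assumes GSTSEQ is unique within each (RSNO, KNO) group, so that joining back on the minimum GSTSEQ picks the same row) collapses T_POV3 into a single DISTINCT ON pass:

---
select distinct on (T_CUST.RSNO, T_CUST.KNO)
       T_CUST.RSNO RSNO2, T_CUST.KNO, T_CUST.AGE, T_CUST.GST
from TBL_CUST T_CUST
join TBL_RES T_RES on T_RES.RSNO = T_CUST.RSNO
join TBL_POV T_POV on T_POV.CRSNO = T_RES.CRSNO
where T_CUST.STSFLG = 'T'
  and T_CUST.DISPSEQ <> ''
  and T_CUST.KFIX = '0'
order by T_CUST.RSNO, T_CUST.KNO, T_CUST.GSTSEQ;
---

An index such as (RSNO, KNO, GSTSEQ) on TBL_CUST, and one on TBL_RES (CID, COD), might also help the external-merge sort and the date-range filter respectively; both are untested suggestions.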
Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
>If you're porting a bunch of code written for Oracle to Postgres,
>you'd have to run around and change every occurrence of "date" to
>"timestamp" ... unless you install orafce, in which case you can
>rely on this alias type that orafce creates. But you do then have
>two types named "date" in the system, so you have to be careful
>about search_path settings or you'll get more confusion than it's
>worth.

When I put the oracle schema behind the pg_catalog schema in the search_path, the application behaves the same as on the old version of the database (PostgreSQL 12.10 and orafce 3.15), even without making any changes to the to_date(text) function in the public schema. Is my understanding correct that, when the pg_catalog schema comes before the oracle schema, any unqualified reference to the date type resolves to pg_catalog.date first? If so, setting the search_path as below is also a solution.

postgres=# show search_path;
             search_path
--------------------------------------
 "$user", public, pg_catalog, oracle

At 2023-04-27 22:11:22, "Tom Lane" wrote:
>gzh writes:
>> I did the following two tests and found that the return value of
>> pg_catalog.date and oracle.date are inconsistent.
>
>Yeah ... that's pretty much the point. Oracle uses the name "date"
>for a data type that Postgres (and the SQL standard) calls a "timestamp".
>That's very ancient on their side and I doubt they'll ever change it.
>
>If you're porting a bunch of code written for Oracle to Postgres,
>you'd have to run around and change every occurrence of "date" to
>"timestamp" ... unless you install orafce, in which case you can
>rely on this alias type that orafce creates. But you do then have
>two types named "date" in the system, so you have to be careful
>about search_path settings or you'll get more confusion than it's
>worth.
>
> regards, tom lane
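A quick way to check which type an unqualified "date" resolves to under each ordering (assuming orafce is installed, so both types exist) is to cast a timestamp and compare the output:

---
SET search_path = "$user", public, pg_catalog, oracle;
SELECT localtimestamp::date;   -- pg_catalog.date, e.g. 2023-04-27

SET search_path = "$user", public, oracle, pg_catalog;
SELECT localtimestamp::date;   -- oracle.date, e.g. 2023-04-27 00:00:00
---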
Re:Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
Thank you very much for your reply. I did the following two tests and found that the return values of pg_catalog.date and oracle.date are inconsistent.

① the function was created with return type pg_catalog.date
---
CREATE OR REPLACE FUNCTION to_date(str text) RETURNS pg_catalog.date AS $$
SELECT $1::pg_catalog.date;
$$ LANGUAGE sql IMMUTABLE STRICT;
COMMENT ON FUNCTION public.to_date(text) IS 'Convert string to date';
---
The execution result is as follows:

postgres=# select TO_DATE(TO_CHAR(localtimestamp, 'YYYY/MM/DD')) as localdate, TO_DATE(TO_CHAR(current_timestamp, 'YYYY/MM/DD')) as currentdate;
 localdate  | currentdate
------------+-------------
 2023-04-27 | 2023-04-27

② the function was created with return type oracle.date
---
CREATE OR REPLACE FUNCTION to_date(str text) RETURNS oracle.date AS $$
SELECT $1::oracle.date;
$$ LANGUAGE sql IMMUTABLE STRICT;
COMMENT ON FUNCTION public.to_date(text) IS 'Convert string to date';
---
The execution result is as follows:

postgres=# select TO_DATE(TO_CHAR(localtimestamp, 'YYYY/MM/DD')) as localdate, TO_DATE(TO_CHAR(current_timestamp, 'YYYY/MM/DD')) as currentdate;
      localdate      |     currentdate
---------------------+---------------------
 2023-04-27 00:00:00 | 2023-04-27 00:00:00

When the return type is set to oracle.date, the date values in the SQL execution result carry hours, minutes, and seconds. Why is there such a difference, and how can it be solved?

At 2023-04-25 20:53:09, "Erik Wienhold" wrote:
>> On 25/04/2023 13:34 CEST gzh wrote:
>>
>> >The solution is the same whether you upgrade or not: you need
>> >to adjust your search_path to include the "oracle" schema,
>> >or else explicitly qualify references to orafce functions.
>>
>> Thank you very much for your help.
>>
>> To use the to_date functions of Orafce 3.0.1, we created the following
>> to_date function in the public schema of the old database.
>>
>> ---
>> CREATE OR REPLACE FUNCTION to_date(str text) RETURNS date AS $$ SELECT
>> $1::date; $$ LANGUAGE sql IMMUTABLE STRICT; COMMENT ON FUNCTION
>> public.to_date(text) IS 'Convert string to date';
>> ---
>>
>> To avoid using a to_date function with the same name and parameter in the
>> pg_catalog schema first, the search_path of the old database is set as
>> follows:
>>
>> "$user", public, pg_catalog
>>
>> Make sure that public is searched before pg_catalog.
>> After the database is upgraded, in order to deal with the changes in orafce
>> 3.24, we have added the oracle schema to the search_path, as shown below:
>>
>> "$user", public, oracle, pg_catalog
>>
>> The following error occurred when I ran my application.
>>
>> 42P13:ERROR:42P13: return type mismatch in function declared to return
>> pg_catalog.date
>>
>> When I put the oracle schema at the end of the search_path, the problem was
>> solved.
>> The search_path settings without problems are as follows:
>>
>> "$user", public, pg_catalog, oracle
>>
>> Why does it report an error when I put oracle between public and pg_catalog?
>
>When you created function to_date(text) your search_path was probably
>
> "$user", public, pg_catalog
>
>Thereby the function was created with return type pg_catalog.date and without
>a search_path setting.
>
>The cast to date in the function body, however, is unqualified and thus relies
>on the session search_path. When adding oracle to the session search_path
>before pg_catalog, the cast will be to oracle.date (orafce defines its own
>date type) instead of pg_catalog.date. The function return type, however, is
>still declared as pg_catalog.date.
>
>To fix this create the function with an explicit search_path, i.e.
>
> CREATE FUNCTION to_date(text)
> RETURNS oracle.date
> SET search_path = oracle
> ...
>
>Or write the cast as $1::oracle.date to not rely on the search_path at all.
>
>--
>Erik
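Putting Erik's two suggestions together, a complete definition might look like the following (an untested sketch; using both the SET clause and the schema-qualified cast is redundant, but harmless):

---
CREATE OR REPLACE FUNCTION public.to_date(str text)
RETURNS oracle.date
LANGUAGE sql IMMUTABLE STRICT
SET search_path = oracle
AS $$ SELECT $1::oracle.date; $$;
---

Note that attaching a SET clause to a SQL function prevents it from being inlined, which is usually acceptable for a thin cast wrapper like this one.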
Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
>The solution is the same whether you upgrade or not: you need
>to adjust your search_path to include the "oracle" schema,
>or else explicitly qualify references to orafce functions.

Thank you very much for your help.

To use the to_date functions of Orafce 3.0.1, we created the following to_date function in the public schema of the old database.

---
CREATE OR REPLACE FUNCTION to_date(str text) RETURNS date AS $$ SELECT $1::date; $$ LANGUAGE sql IMMUTABLE STRICT; COMMENT ON FUNCTION public.to_date(text) IS 'Convert string to date';
---

To avoid using the to_date function with the same name and parameter in the pg_catalog schema first, the search_path of the old database is set as follows:

"$user", public, pg_catalog

Make sure that public is searched before pg_catalog. After the database was upgraded, in order to deal with the changes in orafce 3.24, we added the oracle schema to the search_path, as shown below:

"$user", public, oracle, pg_catalog

The following error occurred when I ran my application.

42P13:ERROR:42P13: return type mismatch in function declared to return pg_catalog.date

When I put the oracle schema at the end of the search_path, the problem was solved. The search_path settings without problems are as follows:

"$user", public, pg_catalog, oracle

Why does it report an error when I put oracle between public and pg_catalog?

At 2023-04-20 01:18:15, "Tom Lane" wrote:
>gzh writes:
>> Thank you for your prompt reply.
>> Is there another solution if the database is not upgraded to 12.14?
>
>The solution is the same whether you upgrade or not: you need
>to adjust your search_path to include the "oracle" schema,
>or else explicitly qualify references to orafce functions.
>
> regards, tom lane
Re: psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
Thank you for your prompt reply. Is there another solution if the database is not upgraded to 12.14? >Better upgrade to latest release 12.14. At 2023-04-19 22:51:33, "Erik Wienhold" wrote: >> On 19/04/2023 15:24 CEST gzh wrote: >> >> Hi, >> >> I upgraded the version of PostgreSQL from 12.10 to 12.13, > >Better upgrade to latest release 12.14. > >> when I insert data into the t_mstr table, the to_char function in the >> t_mstr's >> trigger caused the following error. >> >> psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist >> >> There is no problem before the upgrade and to_char(numeric) function comes >> from the Orafce extension. >> The configuration of the old and new databases is as follows. >> >> Database server (old): PostgreSQL 12.10(orafce3.15) >> Database server (new): PostgreSQL 12.13(orafce3.24) >> >> The new database has successfully installed the orafce 3.24 extension. >> It does not occur in "PostgreSQL 12.10 and orafce 3.15", >> but occurs in "PostgreSQL 12.13 and orafce 3.24", >> so either the difference between 12.10 and 12.13 or the difference between >> orafce 3.15 and 3.24 is suspicious. >> >> What is the reason for the problem? > >orafce 3.22 moved functions to schema oracle: > >https://github.com/orafce/orafce/blob/b492a0f50d5ee866c9870f886401d7c34ad8ccb3/NEWS#L4,L5 >https://github.com/orafce/orafce/commit/86a1b51440ca33a04ef47fc3bb704dee26d16753 > >-- >Erik
psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist
Hi, I upgraded PostgreSQL from 12.10 to 12.13. When I insert data into the t_mstr table, the to_char function in t_mstr's trigger causes the following error.

psql:t_mstr.sql:994: ERROR: function to_char(numeric) does not exist

There was no problem before the upgrade, and the to_char(numeric) function comes from the Orafce extension. The configuration of the old and new databases is as follows.

Database server (old): PostgreSQL 12.10 (orafce 3.15)
Database server (new): PostgreSQL 12.13 (orafce 3.24)

The new database has the orafce 3.24 extension installed successfully. The error does not occur on "PostgreSQL 12.10 and orafce 3.15" but does occur on "PostgreSQL 12.13 and orafce 3.24", so either the difference between 12.10 and 12.13 or the difference between orafce 3.15 and 3.24 is suspect.

What is the reason for the problem? Regards
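Since orafce 3.22 moved its functions into the oracle schema (as Erik points out in the replies above), two common workarounds are to schema-qualify the call inside the trigger function, or to put the oracle schema on the search_path for the whole database. Sketches only; the column and database names below are placeholders:

---
-- inside the trigger function
SELECT oracle.to_char(NEW.some_numeric_column);

-- or once, for every new session of this database
ALTER DATABASE mydb SET search_path = "$user", public, oracle;
---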
Re: An I/O error occured while sending to the backend
Hi, Rob

>Is this true for large numbers of lines in you csv? Originally you said
>something like 'adding 1000 lines' broke your app. Are you using
>CopyManager?

Thank you for your reply. I didn't use CopyManager. ②'s query statement reads the data inserted by ①, and I assumed that inserting a large amount of data was causing the performance problem. I had misunderstood: it is not ① that affects ②; rather, ②'s query statement itself has a performance problem. The Encrypt function is used in ②'s query statement to encrypt the query results, and improper use of this function leads to poor performance: sometimes the result comes back in 2~4 seconds, sometimes in 7~19 seconds. We optimized the query statement to bring the query time under 1 second, and now the application works well.

At 2022-11-16 14:57:41, "Laurenz Albe" wrote:
>On Wed, 2022-11-16 at 13:04 +0800, gzh wrote:
>> I have developed an application using the jdbc driver and
>> connecting to a postgresql database.
>> The application mainly does the following two things:
>> ① read data from the CSV file and insert it into the database
>> ② perform a database query
>> In my application, the above ① and ② are executed asynchronously.
>> Everything was working fine until I increase the amount of CSV data to more
>> than 1000.
>> Randomly they receive an error, the stack trace of which is below.
>>
>> Caused by: java.net.SocketTimeoutException: Read timed out
>> at java.base/java.net.SocketInputStream.socketRead0(Native Method)
>> at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
>> at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
>> at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
>> at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1454)
>> at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1065)
>> at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
>
>Well, set a longer socket timeout if you need to run long SQL statements, or tune
>those statements to be faster.
>
>https://jdbc.postgresql.org/documentation/use/#connection-parameters has more
>information about "socketTimeout".
>
>Yours,
>Laurenz Albe
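For reference, the socketTimeout parameter Laurenz mentions can be set directly in the JDBC connection URL; the host, database name, and the 300-second value below are placeholders:

jdbc:postgresql://dbhost:5432/mydb?socketTimeout=300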
Re: An I/O error occured while sending to the backend
Thank you very much for your advice. What I don't understand is that there is no problem when executing ① alone or ② alone; the error occurs only when ① and ② are executed together. It works well when I let the application sleep for 10 seconds after ① before executing ②. Is ② affected by ①? Could you tell me why?

At 2022-11-16 14:57:41, "Laurenz Albe" wrote:
>On Wed, 2022-11-16 at 13:04 +0800, gzh wrote:
>> I have developed an application using the jdbc driver and
>> connecting to a postgresql database.
>> The application mainly does the following two things:
>> ① read data from the CSV file and insert it into the database
>> ② perform a database query
>> In my application, the above ① and ② are executed asynchronously.
>> Everything was working fine until I increase the amount of CSV data to more
>> than 1000.
>> Randomly they receive an error, the stack trace of which is below.
>>
>> Caused by: java.net.SocketTimeoutException: Read timed out
>> at java.base/java.net.SocketInputStream.socketRead0(Native Method)
>> at java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115)
>> at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168)
>> at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
>> at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
>> at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1454)
>> at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1065)
>> at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
>
>Well, set a longer socket timeout if you need to run long SQL statements, or tune
>those statements to be faster.
>
>https://jdbc.postgresql.org/documentation/use/#connection-parameters has more
>information about "socketTimeout".
>
>Yours,
>Laurenz Albe
An I/O error occured while sending to the backend
Hi, I have developed an application using the jdbc driver and connecting to a postgresql database. The application mainly does the following two things:
① read data from the CSV file and insert it into the database
② perform a database query
In my application, the above ① and ② are executed asynchronously. Everything was working fine until I increase the amount of CSV data to more than 1000. Randomly they receive an error, the stack trace of which is below.

com.nec.neosarf.exception.NeoSarfException: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend
    at com.nec.neosarf.services.core.controller.DutiesDao.DutiesDaoError(DutiesDao.java:2037)
    at com.nec.neosarf.services.core.controller.DutiesDao.SelectData(DutiesDao.java:459)
    at com.nec.neosarf.services.core.controller.DutiesDao.SelectData(DutiesDao.java:335)
    at com.nec.neosarf.services.core.controller.CoreConductor.ConductorLv3_DataAccess(CoreConductor.java:2062)
    at com.nec.neosarf.services.core.controller.CoreConductor.ConductorCore(CoreConductor.java:794)
    at com.nec.neosarf.services.core.controller.CoreConductor.ConductorRun(CoreConductor.java:623)
    at com.nec.neosarf.services.core.controller.CoreConductor.Conductor(CoreConductor.java:356)
    at com.nec.neosarf.services.core.controller.CoreBody.ConductorCreation(CoreBody.java:4817)
    at com.nec.neosarf.services.core.controller.CoreBody.ActionDirectorInteface(CoreBody.java:2783)
    at com.nec.neosarf.services.core.controller.CoreBody.AjaxCheck(CoreBody.java:2465)
    at com.nec.neosarf.services.core.controller.CoreBody.ExecutionPageRun(CoreBody.java:1693)
    at com.nec.neosarf.services.core.servlet.NeoSarf.coreServlet(NeoSarf.java:386)
    at com.nec.neosarf.services.core.servlet.NeoSarf.service(NeoSarf.java:139)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:764)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.apache.catalina.core.ApplicationDispatcher.invoke(ApplicationDispatcher.java:711)
    at org.apache.catalina.core.ApplicationDispatcher.processRequest(ApplicationDispatcher.java:459)
    at org.apache.catalina.core.ApplicationDispatcher.doForward(ApplicationDispatcher.java:353)
    at org.apache.catalina.core.ApplicationDispatcher.forward(ApplicationDispatcher.java:313)
    at com.nec.neosarf.services.core.filter.dynamic.NeoSarfFilterChain.doFilter(NeoSarfFilterChain.java:36)
    at com.nec.neosarf.services.debugger.filter.DebugSynchronizedFilter.doFilter(DebugSynchronizedFilter.java:72)
    at com.nec.neosarf.services.core.filter.dynamic.NeoSarfFilterChain.doFilter(NeoSarfFilterChain.java:34)
    at com.nec.neosarf.services.core.filter.dynamic.DynamicRequestFilter.executeServlet(DynamicRequestFilter.java:93)
    at com.nec.neosarf.services.core.filter.RequestFilter.doFilter(RequestFilter.java:450)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:197)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:540)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:135)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:687)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:359)
    at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:399)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:889)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1735)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
    at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.postgresql.util.PSQLException: An I/O error occured while sending to the backend
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:382)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags
Re:Re: Re: Does psqlodbc_11_01_0000-x64 support special characters?
Thank you for the information. After testing, I found that I only need to escape the following 7 characters. % → %25 " → %22 ' → %27 + → %2B ; → %3B = → %3D { → %7B At 2022-10-13 13:27:16, "Jeffrey Walton" wrote: >On Thu, Oct 13, 2022 at 12:13 AM gzh wrote: >> >> My PostgreSQL is deployed on Amazon RDS, so the password of PostgreSQL is >> random and has various reserved characters. >> >> I don't know if the reserved characters below are complete, and there are >> some characters (e.g. * , $) I tried without problems. >> >> Could you tell me which characters require percent-encoding for PostgreSQL >> password? >> >> >> space → %20 >> >> ! → %21 >> >> " → %22 >> >> # → %23 >> >> $ → %24 >> >> % → %25 >> >> & → %26 >> >> ' → %27 >> >> ( → %28 >> >> ) → %29 >> >> * → %2A >> >> + → %2B >> >> , → %2C >> >> - → %2D >> >> . → %2E >> >> / → %2F >> >> : → %3A >> >> ; → %3B >> >> < → %3C >> >> = → %3D >> >> > → %3E >> >> ? → %3F >> >> @ → %40 >> >> [ → %5B >> >> \ → %5C >> >> ] → %5D >> >> ^ → %5E >> >> _ → %5F >> >> ` → %60 >> >> { → %7B >> >> | → %7C >> >> } → %7D >> >> ~ → %7E > >https://www.rfc-editor.org/rfc/rfc3986#section-2.2 > >Jeff
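As an illustration, the sample password Gd*oB,$3Ln%pQ from the original post would look like this in a libpq-style connection URI once percent-encoded (note that percent-encoding applies to connection URIs; an ODBC connection string is not a URI and has its own quoting rules):

postgresql://postgres:Gd%2AoB%2C%243Ln%25pQ@localhost:5432/testdb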
Re:Re: Does psqlodbc_11_01_0000-x64 support special characters?
Dear Jeff I appreciate your reply. My PostgreSQL is deployed on Amazon RDS, so the password of PostgreSQL is random and has various reserved characters. I don't know if the reserved characters below are complete, and there are some characters (e.g. * , $) I tried without problems. Could you tell me which characters require percent-encoding for PostgreSQL password? space → %20 ! → %21 " → %22 # → %23 $ → %24 % → %25 & → %26 ' → %27 ( → %28 ) → %29 * → %2A + → %2B , → %2C - → %2D . → %2E / → %2F : → %3A ; → %3B < → %3C = → %3D > → %3E ? → %3F @ → %40 [ → %5B \ → %5C ] → %5D ^ → %5E _ → %5F ` → %60 { → %7B | → %7C } → %7D ~ → %7E Kind regards, gzh At 2022-10-12 22:01:15, "Jeffrey Walton" wrote: >On Wed, Oct 12, 2022 at 7:16 AM gzh wrote: >> >> I found that the password can't contain the % character, and the other >> special characters (* , $) are no problem. > >You need to percent-encode the password if you wish to use the % >symbol in the password. There are other reserved characters that you >should percent-encode. See >https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING >and https://www.rfc-editor.org/rfc/rfc3986#section-2.1 . > >Jeff > >> At 2022-10-12 16:28:51, "gzh" wrote: >> >> >> PostgreSQL version: 13.5 >> >> Operating system: windows 10 >> >> Description: >> >> >> I wrote a VBA application to connect to PostgreSQL database by psqlodbc. >> >> The application works fine when there are no special characters in the >> password. >> >> When the password contains special characters (e.g. * , $ %), >> >> the application responds with an error below: >> >> >> Number: -2147467259 >> >> Description: password authentication failed for user 'testdb' >> >> >> I made an sample as below: >> >> >> VBA >> >> - START - >> >> >> Sub dbconnTest() >> >> Dim rs As ADODB.Recordset >> >> Dim sql As String >> >> Dim i As Integer >> >> Dim rcnt As Integer >> >> >> >> Set cnn = New ADODB.Connection >> >> cnn.Open "Provider=MSDASQL;Driver=PostgreSQL >> Unicode;UID=postgres;port=5432;Server=localhost;Database=testdb;PWD=Gd*oB,$3Ln%pQ" >> >> >> >> Set rs = New ADODB.Recordset >> >> sql = "SELECT * FROM testtbl" >> >> >> >> rs.ActiveConnection = cnn >> >> rs.Source = sql >> >> rs.Open >> >> >> >> cnt = rs.Fields.Count >> >> rcnt = 2 >> >> >> >> Do Until rs.EOF >> >> For i = 0 To cnt - 1 >> >> Cells(rcnt, i + 1).Value = rs.Fields(i) >> >> Next >> >> >> >> rcnt = rcnt + 1 >> >> rs.MoveNext >> >> Loop >> >> >> >> Set rs = Nothing >> >> Set cnn = Nothing >> >> End Sub >> >> >> - END - >> >> >> >> Thanks for any help! >>
Re:Does psqlodbc_11_01_0000-x64 support special characters?
I found that the password can't contain the % character, and the other special characters (* , $) are no problem. At 2022-10-12 16:28:51, "gzh" wrote: PostgreSQL version: 13.5 Operating system: windows 10 Description: I wrote a VBA application to connect to PostgreSQL database by psqlodbc. The application works fine when there are no special characters in the password. When the password contains special characters (e.g. * , $ %), the application responds with an error below: Number: -2147467259 Description: password authentication failed for user 'testdb' I made an sample as below: VBA - START - Sub dbconnTest() Dim rs As ADODB.Recordset Dim sql As String Dim i As Integer Dim rcnt As Integer Set cnn = New ADODB.Connection cnn.Open "Provider=MSDASQL;Driver=PostgreSQL Unicode;UID=postgres;port=5432;Server=localhost;Database=testdb;PWD=Gd*oB,$3Ln%pQ" Set rs = New ADODB.Recordset sql = "SELECT * FROM testtbl" rs.ActiveConnection = cnn rs.Source = sql rs.Open cnt = rs.Fields.Count rcnt = 2 Do Until rs.EOF For i = 0 To cnt - 1 Cells(rcnt, i + 1).Value = rs.Fields(i) Next rcnt = rcnt + 1 rs.MoveNext Loop Set rs = Nothing Set cnn = Nothing End Sub - END - Thanks for any help!
Re: Different execution plan between PostgreSQL 8.4 and 12.11
Hi everyone,

Could anyone tell me which of the solutions below is better?

Solution 1: Change the configuration parameter:

set enable_seqscan = off

Solution 2: Add a DISTINCT clause to the SQL:

explain analyze select DISTINCT 2 from analyze_word_reports where (cseid = 94) limit 1;

If I don't want to change the SQL, is Solution 1 OK?

At 2022-10-12 09:47:17, "David Rowley" wrote:
>On Wed, 12 Oct 2022 at 13:06, Klint Gore wrote:
>> Limit (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.040 rows=1 loops=1)
>>   -> Unique (cost=0.56..28349.28 rows=1 width=4) (actual time=0.039..0.039 rows=1 loops=1)
>>     -> Index Only Scan using idx on tbl (cost=0.56..28349.28 rows=995241 width=4) (actual time=0.038..0.038 rows=1 loops=1)
>>       Index Cond: (fld = 230)
>>       Heap Fetches: 0
>> Planning Time: 0.066 ms
>> Execution Time: 0.047 ms
>>
>> With the distinct and the limit, the planner somehow knows to push the
>> either the distinct or the limit into the index only scan so the unique for
>> distinct only had 1 row and the outer limit only had 1 row. Without the
>> limit, the distinct still does the index only scan but has to do the unique
>> on the million rows and execution time goes to about 100ms.
>
>I think that would be very simple to fix. I believe I've done that
>locally but just detecting if needed_pathkeys == NULL in
>create_final_distinct_paths().
>
>i.e.
>
>- if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
>+ if (needed_pathkeys == NIL)
>+ {
>+     Node *limitCount = makeConst(INT8OID, -1, InvalidOid,
>+                                  sizeof(int64),
>+                                  Int64GetDatum(1), false,
>+                                  FLOAT8PASSBYVAL);
>+     add_path(distinct_rel, (Path *)
>+              create_limit_path(root, distinct_rel, path, NULL,
>+                                limitCount, LIMIT_OPTION_COUNT, 0,
>+                                1));
>+ }
>+ else if (pathkeys_contained_in(needed_pathkeys, path->pathkeys))
>
>That just adds a Limit Path instead of the Unique Path. i.e:
>
>postgres=# explain (analyze, costs off) select distinct a from t1 where a = 0;
>                                  QUERY PLAN
>------------------------------------------------------------------------------
> Limit (actual time=0.074..0.075 rows=1 loops=1)
>   -> Index Only Scan using t1_a_idx on t1 (actual time=0.072..0.073 rows=1 loops=1)
>     Index Cond: (a = 0)
>     Heap Fetches: 1
> Planning Time: 0.146 ms
> Execution Time: 0.100 ms
>(6 rows)
>
>However, I might be wrong about that. I've not given it too much thought.
>
>David
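If Solution 1 is adopted, it is usually safer to scope the override to the statements that need it rather than changing the server-wide setting, for example (a sketch):

---
BEGIN;
SET LOCAL enable_seqscan = off;
select 2 from analyze_word_reports where (cseid = 94) limit 1;
COMMIT;
---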
Does psqlodbc_11_01_0000-x64 support special characters?
PostgreSQL version: 13.5
Operating system: Windows 10

Description:

I wrote a VBA application to connect to a PostgreSQL database by psqlodbc. The application works fine when there are no special characters in the password. When the password contains special characters (e.g. * , $ %), the application responds with the error below:

Number: -2147467259
Description: password authentication failed for user 'testdb'

I made a sample as below:

VBA
- START -
Sub dbconnTest()
    Dim rs As ADODB.Recordset
    Dim sql As String
    Dim i As Integer
    Dim rcnt As Integer

    Set cnn = New ADODB.Connection
    cnn.Open "Provider=MSDASQL;Driver=PostgreSQL Unicode;UID=postgres;port=5432;Server=localhost;Database=testdb;PWD=Gd*oB,$3Ln%pQ"

    Set rs = New ADODB.Recordset
    sql = "SELECT * FROM testtbl"

    rs.ActiveConnection = cnn
    rs.Source = sql
    rs.Open

    cnt = rs.Fields.Count
    rcnt = 2

    Do Until rs.EOF
        For i = 0 To cnt - 1
            Cells(rcnt, i + 1).Value = rs.Fields(i)
        Next

        rcnt = rcnt + 1
        rs.MoveNext
    Loop

    Set rs = Nothing
    Set cnn = Nothing
End Sub
- END -

Thanks for any help!
Re: Different execution plan between PostgreSQL 8.4 and 12.11
Thank you for all your assistance. By communicating with my customer, we have adopted the following solution to fix the problem. set enable_seqscan = off At 2022-10-11 16:21:42, "Pavel Stehule" wrote: út 11. 10. 2022 v 10:01 odesílatel gzh napsal: Thank you for providing the requested information. The WebSite has been used for many years, and this upgrade is only a version upgrade of the PostgreSQL database. My customer does not want to modify the SQL because it will increase the cost of the project(All SQL that contains a LIMIT clause needs to be analyzed and checked). Is there no other way to solve the problem? I don't know about any alternative Regards Pavel At 2022-10-11 13:24:12, "Pavel Stehule" wrote: út 11. 10. 2022 v 7:08 odesílatel gzh napsal: Hi, Pavel > The LIMIT clause changes total cost. This is a very aggressive clause. And > although it is absolutely useless in this case, Postgres does not have any > logic for removing it. Postgres doesn't try to fix developer's mistakes. Sorry,I didn't understand what you mean. Couldn't the LIMIT clause be used like the SQL statement below? >> new=# explain analyze select 2 from analyze_word_reports where (cseid = >> 94) limit 1; there was query SELECT aggregate() FROM xx LIMIT 1 This SQL statement is no problem under PostgreSQL 8.4, the index works well. The optimizer is under nonstop change. And you can expect from any new release 75% queries are without change, 20% queries are faster, and 5% queries are slower The optimization is based on statistics and estimations, and searching for the optimal solution in space of all solutions. In any version there are smaller or bigger changes of estimation methods, and between old 8.4 and 12 there are big changes in possibilities of how the query can be executed. So there is a higher possibility to find some really fast queries, but there is a higher possibility to find some local optimum or slow query too. Usually the optimizer is smarter (what is the benefit), but more sensitive too (what is the cost). You cannot expect the same result, when the data and algorithm is changed in any version. Postgres doesn't garant the immutability of execution plans. The clause LIMIT with low LIMIT value can be problematic in more cases. The model in Postgres expects data are uniformly stored in the table (heap), but the reality can be different. The common trick in these cases is using OFFSET 0 clause like SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10. At 2022-10-11 12:13:47, "Pavel Stehule" wrote: út 11. 10. 2022 v 6:05 odesílatel gzh napsal: Hi, Pavel Thank you for your reply. > the LIMIT clause is in this case totally useless and messy, and maybe can > negative impacts optimizer Yes. After removing the LIMIT clause, the performance is improved. The execution plan shows that the index worked. We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause. The LIMIT clause changes total cost. This is a very aggressive clause. And although it is absolutely useless in this case, Postgres does not have any logic for removing it. Postgres doesn't try to fix developer's mistakes. At 2022-10-11 11:32:48, "Pavel Stehule" wrote: út 11. 10. 2022 v 5:13 odesílatel gzh napsal: Hi, Tom Thank you for your reply. > When you're asking for help, please don't give us vague statements > like "doesn't seem to work". I understand. > Did the plan (including rowcount > estimates) change at all? To what? 
How far off is that rowcount > estimate, anyway --- that is, how many rows actually have cseid = 94? Please refer to the new execution plan (PostgreSQL 12.11) below. new=# show enable_seqscan; enable_seqscan on (1 行) new=# select count(*) from analyze_word_reports; count -- 21331980 (1 行) new=# select count(*) from analyze_word_reports where (cseid = 94); count - 1287156 (1 行) new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer Regards Pavel -- -- Limit (cost=65184.06..65184.07 rows=1 width
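Applied to the query from this thread, Pavel's OFFSET 0 trick would look like the following (a sketch; the subquery acts as an optimization fence, so the outer LIMIT no longer makes the cheap-startup sequential scan look attractive to the planner):

---
select 2
from (select 2 from analyze_word_reports
      where (cseid = 94)
      offset 0) s
limit 1;
---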
Re:Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11
Thank you for providing the requested information. The WebSite has been used for many years, and this upgrade is only a version upgrade of the PostgreSQL database. My customer does not want to modify the SQL because it will increase the cost of the project(All SQL that contains a LIMIT clause needs to be analyzed and checked). Is there no other way to solve the problem? At 2022-10-11 13:24:12, "Pavel Stehule" wrote: út 11. 10. 2022 v 7:08 odesílatel gzh napsal: Hi, Pavel > The LIMIT clause changes total cost. This is a very aggressive clause. And > although it is absolutely useless in this case, Postgres does not have any > logic for removing it. Postgres doesn't try to fix developer's mistakes. Sorry,I didn't understand what you mean. Couldn't the LIMIT clause be used like the SQL statement below? >> new=# explain analyze select 2 from analyze_word_reports where (cseid = >> 94) limit 1; there was query SELECT aggregate() FROM xx LIMIT 1 This SQL statement is no problem under PostgreSQL 8.4, the index works well. The optimizer is under nonstop change. And you can expect from any new release 75% queries are without change, 20% queries are faster, and 5% queries are slower The optimization is based on statistics and estimations, and searching for the optimal solution in space of all solutions. In any version there are smaller or bigger changes of estimation methods, and between old 8.4 and 12 there are big changes in possibilities of how the query can be executed. So there is a higher possibility to find some really fast queries, but there is a higher possibility to find some local optimum or slow query too. Usually the optimizer is smarter (what is the benefit), but more sensitive too (what is the cost). You cannot expect the same result, when the data and algorithm is changed in any version. Postgres doesn't garant the immutability of execution plans. The clause LIMIT with low LIMIT value can be problematic in more cases. The model in Postgres expects data are uniformly stored in the table (heap), but the reality can be different. The common trick in these cases is using OFFSET 0 clause like SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10. At 2022-10-11 12:13:47, "Pavel Stehule" wrote: út 11. 10. 2022 v 6:05 odesílatel gzh napsal: Hi, Pavel Thank you for your reply. > the LIMIT clause is in this case totally useless and messy, and maybe can > negative impacts optimizer Yes. After removing the LIMIT clause, the performance is improved. The execution plan shows that the index worked. We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause. The LIMIT clause changes total cost. This is a very aggressive clause. And although it is absolutely useless in this case, Postgres does not have any logic for removing it. Postgres doesn't try to fix developer's mistakes. At 2022-10-11 11:32:48, "Pavel Stehule" wrote: út 11. 10. 2022 v 5:13 odesílatel gzh napsal: Hi, Tom Thank you for your reply. > When you're asking for help, please don't give us vague statements > like "doesn't seem to work". I understand. > Did the plan (including rowcount > estimates) change at all? To what? How far off is that rowcount > estimate, anyway --- that is, how many rows actually have cseid = 94? Please refer to the new execution plan (PostgreSQL 12.11) below. 
new=# show enable_seqscan; enable_seqscan on (1 行) new=# select count(*) from analyze_word_reports; count -- 21331980 (1 行) new=# select count(*) from analyze_word_reports where (cseid = 94); count - 1287156 (1 行) new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer Regards Pavel -- -- Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1) -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1) -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1) Workers Planned: 2
Re:Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11
Hi, Pavel > The LIMIT clause changes total cost. This is a very aggressive clause. And > although it is absolutely useless in this case, Postgres does not have any > logic for removing it. Postgres doesn't try to fix developer's mistakes. Sorry,I didn't understand what you mean. Couldn't the LIMIT clause be used like the SQL statement below? >> new=# explain analyze select 2 from analyze_word_reports where (cseid = >> 94) limit 1; This SQL statement is no problem under PostgreSQL 8.4, the index works well. At 2022-10-11 12:13:47, "Pavel Stehule" wrote: út 11. 10. 2022 v 6:05 odesílatel gzh napsal: Hi, Pavel Thank you for your reply. > the LIMIT clause is in this case totally useless and messy, and maybe can > negative impacts optimizer Yes. After removing the LIMIT clause, the performance is improved. The execution plan shows that the index worked. We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause. The LIMIT clause changes total cost. This is a very aggressive clause. And although it is absolutely useless in this case, Postgres does not have any logic for removing it. Postgres doesn't try to fix developer's mistakes. At 2022-10-11 11:32:48, "Pavel Stehule" wrote: út 11. 10. 2022 v 5:13 odesílatel gzh napsal: Hi, Tom Thank you for your reply. > When you're asking for help, please don't give us vague statements > like "doesn't seem to work". I understand. > Did the plan (including rowcount > estimates) change at all? To what? How far off is that rowcount > estimate, anyway --- that is, how many rows actually have cseid = 94? Please refer to the new execution plan (PostgreSQL 12.11) below. new=# show enable_seqscan; enable_seqscan on (1 行) new=# select count(*) from analyze_word_reports; count -- 21331980 (1 行) new=# select count(*) from analyze_word_reports where (cseid = 94); count - 1287156 (1 行) new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer Regards Pavel -- -- Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1) -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1) -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3) -> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290 9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3) Index Cond: (cseid = 94) Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms (11 行) new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN -- --- Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1) -> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops= 1) Filter: (cseid = 94) Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms (6 行) > If the estimate is far off, then increasing the table's statistics > target might help. Thank you for your advice. Please tell me how to set the table's statistics up to improve performance. 
new=# select oid from pg_class where relname = 'analyze_word_reports'; oid --- 16429 (1 行) new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid'; attrelid | attname | attstattarget --+-+--- 1642
Re:Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11
Hi, Pavel Thank you for your reply. > the LIMIT clause is in this case totally useless and messy, and maybe can > negative impacts optimizer Yes. After removing the LIMIT clause, the performance is improved. The execution plan shows that the index worked. We've noticed it, but I don't want to fix the problem by modifying the SQL until I find the cause. At 2022-10-11 11:32:48, "Pavel Stehule" wrote: út 11. 10. 2022 v 5:13 odesílatel gzh napsal: Hi, Tom Thank you for your reply. > When you're asking for help, please don't give us vague statements > like "doesn't seem to work". I understand. > Did the plan (including rowcount > estimates) change at all? To what? How far off is that rowcount > estimate, anyway --- that is, how many rows actually have cseid = 94? Please refer to the new execution plan (PostgreSQL 12.11) below. new=# show enable_seqscan; enable_seqscan on (1 行) new=# select count(*) from analyze_word_reports; count -- 21331980 (1 行) new=# select count(*) from analyze_word_reports where (cseid = 94); count - 1287156 (1 行) new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN the LIMIT clause is in this case totally useless and messy, and maybe can negative impacts optimizer Regards Pavel -- -- Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1) -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1) -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3) -> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290 9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3) Index Cond: (cseid = 94) Heap Fetches: 1287156 Planning Time: 0.122 ms Execution Time: 133.069 ms (11 行) new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1; QUERY PLAN -- --- Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1) -> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops= 1) Filter: (cseid = 94) Rows Removed by Filter: 18320180 Planning Time: 0.086 ms Execution Time: 2156.985 ms (6 行) > If the estimate is far off, then increasing the table's statistics > target might help. Thank you for your advice. Please tell me how to set the table's statistics up to improve performance. new=# select oid from pg_class where relname = 'analyze_word_reports'; oid --- 16429 (1 行) new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid'; attrelid | attname | attstattarget --+-+--- 16429 | cseid |-1 (1 行) > Another thing that would be worth checking is whether > "set enable_seqscan = off" prods it to choose the plan you want. > If not, then there's something else going on besides poor estimates. "set enable_seqscan = off" works, and the performance is greatly improved, which is almost the same as PostgreSQL 8.4. The enable_seqscan(PostgreSQL 8.4) is on, will this change have an unknown effect on other queries? At 2022-10-10 10:45:54, "Tom Lane" wrote: >gzh writes: >> I've run analyze(not vacuum analyze), but it doesn't seem to work. > >When you're asking for help, please don't give us vague statements >like "doesn't seem to work". 
Did the plan (including rowcount >estimates) change at all? To what? How far off is that rowcount >estimate, anyway --- that is, how many rows
Re:Re: Different execution plan between PostgreSQL 8.4 and 12.11
Hi, Tom

Thank you for your reply.

> When you're asking for help, please don't give us vague statements
> like "doesn't seem to work".

I understand.

> Did the plan (including rowcount
> estimates) change at all? To what? How far off is that rowcount
> estimate, anyway --- that is, how many rows actually have cseid = 94?

Please refer to the new execution plan (PostgreSQL 12.11) below.

new=# show enable_seqscan;
 enable_seqscan
----------------
 on
(1 行)

new=# select count(*) from analyze_word_reports;
  count
----------
 21331980
(1 行)

new=# select count(*) from analyze_word_reports where (cseid = 94);
  count
---------
 1287156
(1 行)

new=# explain analyze select count(2) from analyze_word_reports where (cseid = 94) limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.713..133.035 rows=1 loops=1)
   -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8) (actual time=123.712..133.033 rows=1 loops=1)
     -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual time=123.548..133.024 rows=3 loops=1)
       Workers Planned: 2
       Workers Launched: 2
       -> Partial Aggregate (cost=64183.85..64183.86 rows=1 width=8) (actual time=119.495..119.496 rows=1 loops=3)
         -> Parallel Index Only Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..62909.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052 loops=3)
           Index Cond: (cseid = 94)
           Heap Fetches: 1287156
 Planning Time: 0.122 ms
 Execution Time: 133.069 ms
(11 行)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit (cost=0.00..0.43 rows=1 width=4) (actual time=2156.964..2156.966 rows=1 loops=1)
   -> Seq Scan on analyze_word_reports (cost=0.00..528550.75 rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=1)
     Filter: (cseid = 94)
     Rows Removed by Filter: 18320180
 Planning Time: 0.086 ms
 Execution Time: 2156.985 ms
(6 行)

> If the estimate is far off, then increasing the table's statistics
> target might help.

Thank you for your advice. Could you tell me how to raise the table's statistics target to improve performance?

new=# select oid from pg_class where relname = 'analyze_word_reports';
  oid
-------
 16429
(1 行)

new=# select attrelid,attname,attstattarget from pg_attribute where attrelid=16429 and attname='cseid';
 attrelid | attname | attstattarget
----------+---------+---------------
    16429 | cseid   |            -1
(1 行)

> Another thing that would be worth checking is whether
> "set enable_seqscan = off" prods it to choose the plan you want.
> If not, then there's something else going on besides poor estimates.

"set enable_seqscan = off" works, and the performance is greatly improved; it is almost the same as on PostgreSQL 8.4. The enable_seqscan setting on PostgreSQL 8.4 is on. Will this change have an unknown effect on other queries?

At 2022-10-10 10:45:54, "Tom Lane" wrote:
>gzh writes:
>> I've run analyze(not vacuum analyze), but it doesn't seem to work.
>
>When you're asking for help, please don't give us vague statements
>like "doesn't seem to work". Did the plan (including rowcount
>estimates) change at all? To what? How far off is that rowcount
>estimate, anyway --- that is, how many rows actually have cseid = 94?
>
>If the estimate is far off, then increasing the table's statistics
>target might help.
>
>Another thing that would be worth checking is whether
>"set enable_seqscan = off" prods it to choose the plan you want.
>If not, then there's something else going on besides poor estimates.
>
> regards, tom lane
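On the statistics question: attstattarget = -1 means the column uses default_statistics_target (100 by default). Raising it for cseid and re-analyzing would look like this; 1000 is only an example value:

---
ALTER TABLE analyze_word_reports ALTER COLUMN cseid SET STATISTICS 1000;
ANALYZE analyze_word_reports;
---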
Re:Re: Different execution plan between PostgreSQL 8.4 and 12.11
Hi Tom,

Thank you for your prompt response. I've run ANALYZE (not VACUUM ANALYZE), but it doesn't seem to work. Is there any other way to optimize the database?

At 2022-10-10 00:02:09, "Tom Lane" wrote:
>gzh writes:
>> I run following sql in PostgreSQL 8.4 and PostgreSQL 12.11, it returns
>> different execution plan.
>
>Have you done an ANALYZE in the new database? The rowcount estimates
>seem a lot different, leading me to guess that the newer installation
>doesn't have good statistics yet.
>
> regards, tom lane
Different execution plan between PostgreSQL 8.4 and 12.11
Hi, I have had a Perl website working for 7 years with no problems, until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit
Database server (new): PostgreSQL 12.11 64bit

I run the following SQL in PostgreSQL 8.4 and PostgreSQL 12.11, and it returns different execution plans.

--PostgreSQL 8.4
---
old=# select count(1) from analyze_word_reports;
  count
---------
 9164136
(1 row)

old=# select indexdef from pg_indexes where tablename='analyze_word_reports';
                                  indexdef
------------------------------------------------------------------------------
 CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports USING btree (cseid)
 CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports USING btree (seq)
(2 rows)

old=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit (cost=0.00..0.63 rows=1 width=0) (actual time=0.893..0.893 rows=0 loops=1)
   -> Index Scan using analyze_word_reports_index_cseid on analyze_word_reports (cost=0.00..18621.98 rows=29707 width=0) (actual time=0.892..0.892 rows=0 loops=1)
     Index Cond: (cseid = 94)
 Total runtime: 0.941 ms
(4 rows)

--PostgreSQL 12.11
---
new=# select count(1) from analyze_word_reports;
  count
----------
 20131947
(1 row)

new=# select indexdef from pg_indexes where tablename='analyze_word_reports';
                                  indexdef
------------------------------------------------------------------------------
 CREATE INDEX analyze_word_reports_index_cseid ON analyze_word_reports USING btree (cseid)
 CREATE UNIQUE INDEX analyze_word_reports_pkey ON analyze_word_reports USING btree (seq)
(2 rows)

new=# explain analyze select 2 from analyze_word_reports where (cseid = 94) limit 1;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Limit (cost=0.00..0.41 rows=1 width=4) (actual time=4908.459..4908.462 rows=1 loops=1)
   -> Seq Scan on analyze_word_reports (cost=0.00..528610.15 rows=1278561 width=4) (actual time=4908.455..4908.456 rows=1 loops=1)
     Filter: (cseid = 94)
     Rows Removed by Filter: 15477750
 Planning Time: 0.411 ms
 Execution Time: 4908.498 ms
(6 行)

Although the PostgreSQL 8.4 table has more than 10 million fewer rows than the PostgreSQL 12.11 one, PostgreSQL 12.11 is far slower, and analyze_word_reports_index_cseid does not seem to be used. My guess is that the LIMIT keyword causes PostgreSQL 12.11 not to use the index, but I don't understand why PostgreSQL 8.4 behaves normally. What is the reason for this, and is there an easy way to maintain compatibility?

Regards,
--
gzh
Re:Re: pg_dump without setting search_path
Dear all, Thank you so much for your response to my request for information. I appreciated you taking the time to provide some answers and for getting back to me so promptly. It made a lot of sense and was exactly what I needed to know. At 2022-08-25 12:19:03, "Adrian Klaver" wrote: >On 8/24/22 20:39, gzh wrote: >> Hi Tom, >> >> Thank you for your prompt response. >> >> When I use pg_dump to export schema from an older version of PostgreSQL >> 8.2.3 , it adds the following line at the beginning: >> >> SET search_path = public, pg_catalog; >> >> Is it possible set an option where pg_dump will add this line in >> PostgreSQL 12.5? >> > >No for this reason: > >https://wiki.postgresql.org/wiki/A_Guide_to_CVE-2018-1058:_Protect_Your_Search_Path > >>> regards, tom lane > > >-- >Adrian Klaver >adrian.kla...@aklaver.com
Re:Re: pg_dump without setting search_path
Hi Tom,

Thank you for your prompt response.

When I use pg_dump to export a schema from an older version of PostgreSQL, 8.2.3, it adds the following line at the beginning:

SET search_path = public, pg_catalog;

Is it possible to set an option so that pg_dump will add this line in PostgreSQL 12.5?

At 2022-08-25 11:07:46, "Tom Lane" wrote:
>gzh writes:
>> When I use pg_dump to export schema from a database, it adds the following
>> line at the beginning:
>> SELECT pg_catalog.set_config('search_path', '', false);
>> Is it possible to set an option where pg_dump will not add this line?
>
>No. It's a security precaution.
>
>> It is causing issues later when I try to execute other SQL commands, without
>> the schema qualifier.
>
>That probably means you have some user-defined functions that are
>not sufficiently careful about search_path. You want to either
>schema-qualify every reference in the function body, or attach a
>"SET search_path" clause to remove the function's context dependency.
>
>(If you're using SQL-language functions in v14 or later, another
>option is to use the SQL-standard syntax instead of writing the
>body as a string literal. Then the object references in the function
>are parsed at definition time.)
>
> regards, tom lane
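A sketch of Tom's second suggestion, applied to an existing function (public.my_func is a placeholder name):

ALTER FUNCTION public.my_func() SET search_path = public, pg_catalog;

After this, every call of the function runs with that fixed search_path, regardless of the empty search_path that pg_dump's output establishes.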
pg_dump without setting search_path
When I use pg_dump to export a schema from a database, it adds the following line at the beginning:

SELECT pg_catalog.set_config('search_path', '', false);

Is it possible to set an option so that pg_dump will not add this line? It causes issues later when I try to execute other SQL commands without the schema qualifier. This is the pg_dump command I am using right now:

pg_dump -O -x -h -p -U -d --schema public --schema-only > public-schema.sql
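One workaround on the restore side, rather than changing pg_dump's output, is to set the search_path for the loading session; mydb below is a placeholder for the target database:

PGOPTIONS='-c search_path=public' psql -d mydb -f public-schema.sql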
Re:Re:Re: Different execution plan between PostgreSQL 8.2 and 12.5
Dear Adrian,

Sorry, there is an error in the email I just sent; the PostgreSQL version is wrong.

PostgreSQL 8.4 → PostgreSQL 8.2

At 2022-08-19 12:42:54, "gzh" wrote:
Dear Adrian, I appreciate your reply. Your reply gave me a new idea: the problem should not be that the lower() function causes the unique index to fail. I checked the postgresql.conf file and found that shared_buffers, work_mem and maintenance_work_mem were at their default values, while in the postgresql.conf file of PostgreSQL 8.4 the values of these parameters are much larger. When I changed these parameters to larger values, the problem was solved.
Re:Re: Different execution plan between PostgreSQL 8.2 and 12.5
Dear Adrian,

I appreciate your reply. Your reply gave me a new idea: the problem should not be that the lower() function causes the unique index to fail. I checked the postgresql.conf file and found that shared_buffers, work_mem and maintenance_work_mem were at their default values, while in the postgresql.conf file of PostgreSQL 8.4 the values of these parameters are much larger. When I changed these parameters to larger values, the problem was solved.
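For reference, these parameters live in postgresql.conf. The values below are only illustrative and should be sized to the server's RAM; the defaults shown are those of PostgreSQL 12:

---
shared_buffers = 2GB              # default 128MB, needs a server restart
work_mem = 64MB                   # default 4MB, a reload is enough
maintenance_work_mem = 512MB      # default 64MB, a reload is enough
---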
Re:Re: Different execution plan between PostgreSQL 8.2 and 12.5
Dear Adrian, Thanks for your reply. >Did you run ANALYZE on the 12.5 server after restoring the data to it? No, I did not run ANALYZE on the 12.5 server after restoring the data. When I change the select clause like this ( crew_base.crewid → count(*) ), I can retrieve the number of data rows. Please refer to my previous reply for more information. At 2022-08-18 12:18:31, "Adrian Klaver" wrote: >On 8/17/22 20:01, gzh wrote: >> Hi, >> >> >> I have had a Perl Website working for 7 years and have had no problems >> >> until a few weeks ago I replaced my database server with a newer one. > >Did you run ANALYZE on the 12.5 server after restoring the data to it? > > >> >> gzh >> >> > > >-- >Adrian Klaver >adrian.kla...@aklaver.com
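Since pg_dump/restore does not carry planner statistics over to the new server, running ANALYZE once after the restore is cheap insurance:

ANALYZE;                -- the whole database
ANALYZE crew_base;      -- or just the table in question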
Re:Re: Different execution plan between PostgreSQL 8.2 and 12.5
Dear David,

Thanks for your reply.

>In addition to that, I couldn't help notice that the quoted SQL does
>not seem to belong to the explain. The EXPLAIN has a Limit node, but
>the query does not. I'm assuming this isn't due to the relations being
>views since we don't pull up subqueries with a LIMIT.

The LIMIT node appears because I connect to PostgreSQL with A5M2; the tool adds a LIMIT when it outputs the execution plan. Please ignore it.

At 2022-08-18 11:38:58, "David Rowley" wrote:
>On Thu, 18 Aug 2022 at 15:32, Tom Lane wrote:
>> The 12.5 plan looks like it thinks that the join condition is not
>> hashable --- and probably not mergeable as well, else it would have
>> done a mergejoin. This is odd if we assume that the lower()
>> outputs are just text. But you haven't said anything about the
>> data types involved, nor what locale setting you're using, nor
>> what nondefault settings or extensions you might be using, so
>> speculation about the cause would just be speculation.
>
>In addition to that, I couldn't help notice that the quoted SQL does
>not seem to belong to the explain. The EXPLAIN has a Limit node, but
>the query does not. I'm assuming this isn't due to the relations being
>views since we don't pull up subqueries with a LIMIT.
>
>The costs for the 12.5 are cheaper than 8.4's, so I imagine the more
>likely cause is the planner favouring an early startup plan.
>
>It's probably more likely that lower() is providing the planner with
>bad estimates and there's likely far less than the expected rows,
>resulting in the LIMIT 1 being a much larger proportion of the
>total rows than the planner expects.
>
>David
Re:Re: Different execution plan between PostgreSQL 8.2 and 12.5
Dear Tom,

Thanks for your reply. Please refer to the information below:

PostgreSQL 8.2
[root@PostgreSQL8 ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@PostgreSQL8 ~]# locale | grep LANG
LANG=ja_JP.UTF-8

PostgreSQL 12.5
[root@PostgreSQL12 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.5 (Ootpa)
[root@PostgreSQL12 ~]# locale | grep LANG
LANG=ja_JP.utf8

I ran the following SQL in PostgreSQL 8.2 and PostgreSQL 12.5, and they return different execution plans.

--SQL
explain analyze select crew_base.crewid
from crew_base
left join crew_base as crew_base_introduced on crew_base.introduced_by=crew_base_introduced.crewid
where crew_base.status = '1';

--PostgreSQL 8.2
---
QUERY PLAN
Limit  (cost=0.00..229939.53 rows=7921 width=10) (actual time=2.137..4598.114 rows=4489 loops=1)
  ->  Nested Loop Left Join  (cost=0.00..229939.53 rows=7921 width=10) (actual time=2.136..4597.484 rows=4489 loops=1)
        ->  Seq Scan on crew_base  (cost=0.00..165072.69 rows=7921 width=20) (actual time=2.131..4348.423 rows=4489 loops=1)
              Filter: (status = 1)
        ->  Index Scan using crew_base_crewid_index on crew_base crew_base_introduced  (cost=0.00..8.18 rows=1 width=10) (actual time=0.055..0.055 rows=0 loops=4489)
              Index Cond: (crew_base.introduced_by = crew_base_introduced.crewid)
Total runtime: 4599.985 ms

--PostgreSQL 12.5
---
QUERY PLAN
Limit  (cost=1000.00..119573.46 rows=4759 width=7) (actual time=0.996..633.557 rows=4489 loops=1)
  ->  Gather  (cost=1000.00..119573.46 rows=4759 width=7) (actual time=0.994..633.043 rows=4489 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Parallel Seq Scan on crew_base  (cost=0.00..118097.56 rows=1983 width=7) (actual time=0.573..611.579 rows=1496 loops=3)
              Filter: (status = 1)
              Rows Removed by Filter: 493589
Planning Time: 15.966 ms
Execution Time: 634.035 ms

The execution plan shows that PostgreSQL 12.5 takes less time, but no data is returned and the SQL remains in a suspended state. When I change the select clause as follows ( crew_base.crewid → count(*) ), I can retrieve the number of data rows.

The crew_base table contains 1485255 rows. The data type of the crew_base.crewid field is text, and the field has a unique index:

CREATE UNIQUE INDEX crew_base_crewid_index ON public.crew_base USING btree (crewid)

select count(*) from crew_base
left join crew_base as crew_base_introduced on crew_base.introduced_by=crew_base_introduced.crewid
where crew_base.status = '1';

At 2022-08-18 11:32:22, "Tom Lane" wrote:
>gzh writes:
>> I run following sql in PostgreSQL 8.2 and PostgreSQL 12.5, it returns
>> different execution plan.
>
>8.2 is ... well, not stone age maybe, but pretty durn ancient.
>You really ought to update a bit more often than that. (And
>maybe pay more attention to staying up to date with minor releases?
>Whatever was your reasoning for choosing 12.5, when the latest 12.x
>release is 12.12?)
>
>The 12.5 plan looks like it thinks that the join condition is not
>hashable --- and probably not mergeable as well, else it would have
>done a mergejoin. This is odd if we assume that the lower()
>outputs are just text. But you haven't said anything about the
>data types involved, nor what locale setting you're using, nor
>what nondefault settings or extensions you might be using, so
>speculation about the cause would just be speculation.
>
>There is some advice here about how to ask this sort of
>question in a way that would obtain useful answers:
>
>https://wiki.postgresql.org/wiki/Slow_Query_Questions
>
> regards, tom lane
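Tom's questions about data types and collations can also be answered from inside the database; a small diagnostic sketch (table and column names taken from the query above):

--SQL
-- Column data types and any per-column collation (collname is NULL
-- when the column uses the database default)
SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type, c.collname
FROM pg_attribute a
LEFT JOIN pg_collation c ON c.oid = a.attcollation
WHERE a.attrelid = 'crew_base'::regclass
  AND a.attname IN ('crewid', 'introduced_by');

-- Database-level locale settings
SELECT datcollate, datctype FROM pg_database WHERE datname = current_database();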
Different execution plan between PostgreSQL 8.2 and 12.5
Hi,

I have had a Perl Website working for 7 years and have had no problems until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.2 32bit
Database server (new): PostgreSQL 12.5 64bit

I ran the following SQL in PostgreSQL 8.2 and PostgreSQL 12.5, and they return different execution plans.

--SQL
explain select crew_base.crewid
from crew_base
left join crew_base as crew_base_introduced on crew_base.introduced_by=crew_base_introduced.crewid
where crew_base.status = '1';

--PostgreSQL 8.2
---
QUERY PLAN
Limit  (cost=188628.24..189521.23 rows=1 width=10)
  ->  Hash Left Join  (cost=188628.24..3800200.71 rows=40443494 width=10)
        Hash Cond: (lower(crew_base.introduced_by) = lower(crew_base_introduced.crewid))
        ->  Seq Scan on crew_base  (cost=0.00..165072.69 rows=5446 width=20)
              Filter: (status = 1)
        ->  Hash  (cost=161359.55..161359.55 rows=1485255 width=10)
              ->  Seq Scan on crew_base crew_base_introduced  (cost=0.00..161359.55 rows=1485255 width=10)

--PostgreSQL 12.5
---
QUERY PLAN
Limit  (cost=0.43..47861.44 rows=1 width=7)
  ->  Nested Loop Left Join  (cost=0.43..169386135.30 rows=35391255 width=7)
        Join Filter: (lower(crew_base.introduced_by) = lower(crew_base_introduced.crewid))
        ->  Seq Scan on crew_base  (cost=0.00..128942.75 rows=4759 width=14)
              Filter: (status = 1)
        ->  Materialize  (cost=0.43..51909.70 rows=1487340 width=7)
              ->  Index Only Scan using crew_base_crewid_index on crew_base crew_base_introduced  (cost=0.43..38663.00 rows=1487340 width=7)

PostgreSQL 8.2 queried the data quickly, but PostgreSQL 12.5 has not responded. I'm guessing that the lower() function prevents PostgreSQL 12.5 from using the index, but I don't understand why PostgreSQL 8.2 behaves normally. What is the reason for this, and is there any easy way to maintain compatibility?

Regards,

--
gzh
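If the join condition really is lower(introduced_by) = lower(crewid), as both plans suggest, one possible workaround is an expression index matching that condition, so the planner can drive the join with an index scan instead of materializing a full table scan; a sketch under that assumption (the index name is illustrative):

--SQL (sketch; index name is a placeholder)
CREATE INDEX crew_base_lower_crewid_idx ON crew_base (lower(crewid));
ANALYZE crew_base;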
Re:Re: Different sort result between PostgreSQL 8.4 and 12.5
Dear Magnus:

The information you sent to me was perfect. After I checked the operating systems, I found that they really are different.

--PostgreSQL 8.4
LANG=ja_JP.UTF-8

--PostgreSQL 12.5
LANG=en_US.UTF-8

After I added the following syntax after the "order by ascid" in the PostgreSQL 12.5 database, I got the same result as PostgreSQL 8.4:

COLLATE "ja-JP-x-icu"

Thank you for being so helpful.

2022-06-27 19:33:01, "Magnus Hagander" :
On Mon, Jun 27, 2022 at 1:31 PM gzh wrote:

Hi,

I have had a Perl Website working for 7 years and have had no problems until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit
Database server (new): PostgreSQL 12.5 64bit

I run the following SQL in PostgreSQL 8.4 and PostgreSQL 12.5, and they return different sort results.

--PostgreSQL 8.4
---
pg_db=# select ascid from test_order where oo_m.ascid in ('"! ascid"','"001"') order by ascid;
ascid
---
"! ascid"
"001"
(2 rows)

--PostgreSQL 12.5
---
pg_db=# select ascid from test_order where oo_m.ascid in ('"! ascid"','"001"') order by ascid;
ascid
---
"001"
"! ascid"
(2 rows)

What is the reason for this and is there any easy way to maintain compatibility?

Are these two really running on the same operating system? This looks a lot like the locale changes included in newer versions of glibc, and is in that case dependent on an upgrade of the operating system, not an upgrade of PostgreSQL. See https://wiki.postgresql.org/wiki/Locale_data_changes for details.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
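For reference, a minimal sketch of the fix described above. It assumes the 12.5 server was built with ICU support; the per-column variant is an alternative I am noting here, not something from the original thread:

--SQL
-- Pin the Japanese ICU collation for a single query
select ascid from test_order order by ascid COLLATE "ja-JP-x-icu";

-- Alternatively, attach the collation to the column itself so every
-- ORDER BY on it sorts the old way
ALTER TABLE test_order ALTER COLUMN ascid TYPE text COLLATE "ja-JP-x-icu";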
Different sort result between PostgreSQL 8.4 and 12.5
Hi,

I have had a Perl Website working for 7 years and have had no problems until a few weeks ago I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.4 32bit
Database server (new): PostgreSQL 12.5 64bit

I ran the following SQL in PostgreSQL 8.4 and PostgreSQL 12.5, and they return different sort results.

--PostgreSQL 8.4
---
pg_db=# select ascid from test_order where oo_m.ascid in ('"! ascid"','"001"') order by ascid;
ascid
---
"! ascid"
"001"
(2 rows)

--PostgreSQL 12.5
---
pg_db=# select ascid from test_order where oo_m.ascid in ('"! ascid"','"001"') order by ascid;
ascid
---
"001"
"! ascid"
(2 rows)

What is the reason for this, and is there any easy way to maintain compatibility?

Regards,

--
gzh
TO_DATE function between PostgreSQL 8.2 and 9.4
Hi,

I have had a Perl Website working for 7 years and have had no problems until at the weekend I replaced my database server with a newer one.

Database server (old): PostgreSQL 8.2 32bit
Database server (new): PostgreSQL 9.4 64bit

I ran the following SQL in PostgreSQL 8.2, and it returns my expected result in 'YYYY-MM-DD' format.

--PostgreSQL 8.2
---
SELECT to_date(now() + '-7day', 'YYYY-MM-DD');

However, after I migrated to PostgreSQL 9.4, I hit the following error:

--PostgreSQL 9.4
---
SELECT to_date(now() + '-7day', 'YYYY-MM-DD');
ERROR:  function to_date(timestamp with time zone, unknown) does not exist
LINE 1: SELECT to_date(now() + '-7day', 'YYYY-MM-DD')
               ^
** Error **

Of course, the most reliable way to deal with it is to rewrite the application or SQL to handle types strictly, but the application is large and a rewrite would be a terrible job. Is there any easy way to maintain compatibility?

Regards,

--
gzh
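The error arises because the implicit cast from timestamp with time zone to text was removed in PostgreSQL 8.3, so to_date() no longer accepts a timestamptz first argument. Two minimal sketches of per-statement fixes (which to use depends on whether a date value or a formatted string is wanted):

--SQL
-- Option 1: skip to_date() entirely; casting to date yields a date value
SELECT (now() + interval '-7 day')::date;

-- Option 2: keep to_date() but make its text argument explicit via to_char()
SELECT to_date(to_char(now() + interval '-7 day', 'YYYY-MM-DD'), 'YYYY-MM-DD');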
after psqlodbc upgrade - Which change solves the error: -2147217887
Hi,

I have had an ASP Website working for 5 years and have had no problems until at the weekend I replaced my database server with a newer one.

Database server (old): Postgres 9.4 64bit
Database server (new): Postgres 12.6 64bit

The Website connected to PostgreSQL 9.4 64bit with psqlodbc_10_03_-x86 (32bit). Since the upgrade I have been getting the following error message.

Number: -2147217887
Description: Multiple-step OLE DB operation generated errors.

I found that the problem occurred in the third step:

Step 1. Starts a new transaction with the BeginTrans method.
Step 2. Creates an OraDynaset object from the specified SQL SELECT statement SQL-A.
Step 3. Creates an OraDynaset object from the specified SQL SELECT statement SQL-B.
Step 4. Ends the current transaction with the CommitTrans method.

I changed to a newer version of the ODBC driver, psqlodbc_11_00_-x86 (32bit), on my site and everything appears to be working fine now.

psqlODBC 11.00 Release notes

Changes:
1. Remove obsolete maps pointed out: POWER -> pow, CONCAT -> textcat, LEFT -> ltrunc, RIGHT -> rtrunc. Patch by Daniel Cory.
2. Remove the connSettings option and/or pqopt option from the OutConnectionString parameter of SQLDriverConnect() when each option doesn't exist in the InConnectionString parameter.
3. The parameters should be cast because parameters of the concat() function are variadic "any".
4. Unbuffered I/O in Windows is incredibly slow. Instead call fflush() after fprintf().
5. Add an alias DX of the *Database* keyword for connection strings to avoid the use of the "database" keyword, which has a special meaning in some apps or middleware.
6. numeric items without precision are unlimited and there's no natural map between SQL data types. Add an option *Numeric(without precision) as*.
7. Fix a bug that SQLSpecialColumns() returns oid/xmin incorrectly when a table does not exist. Patch by Quan Zongliang.

I want to know which change in psqlODBC solves the error?

Regards,

--
gzh