Postgres not using index on views

2020-04-06 Thread Rick Vincent
Hi,

I am seeing a performance problem with postgresql v 11.7 on views, and I am 
wondering if anyone can tell me why or has any suggestion.

A table is created as:

CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY, 
XMLRECORD VARCHAR)

And contains only 180 rows.

Doing an explain plan on the view created over this gives:

EXPLAIN ANALYZE
select RECID from "V_FBNK_CUSTOMER"


Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19014.60 rows=180 width=7) 
(actual time=459.601..78642.189 rows=180 loops=1)
  ->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180 width=14575) 
(actual time=459.600..78641.950 rows=180 loops=1)
Planning Time: 0.679 ms
Execution Time: 78642.616 ms

Yet an Explain plan on the underlying table( on select RECID from 
"FBNK_CUSTOMER") gives:

Seq Scan on "FBNK_CUSTOMER"  (cost=0.00..22.80 rows=180 width=7) (actual 
time=0.004..0.272 rows=180 loops=1)
Planning Time: 0.031 ms
Execution Time: 0.288 ms

So you can see that postgresql is not using the primary key index for RECID.  
THIS IS NOT THE CASE FOR ORACLE where the primary key index is used in the 
explain plan

The view is created similar to the following where extractValueJS is a stored 
procedure that extracts a value from the VARCHAR XMLRECORD column.

CREATE VIEW "V_FBNK_CUSTOMER" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
,extractValueJS(a.XMLRECORD, 2, 0) "SHORT_NAME"
,extractValueJS(a.XMLRECORD, 2, -1) "SHORT_NAME_2"
, etc
, extractValueJS(a.XMLRECORD, 179, 9) "TESTER"
FROM
"FBNK_CUSTOMER" a


As well, the problem gets worse as columns are added to the view, irrespective 
of the SELECTION columns and it seems to perform some activity behind.

Creating an empty view,

CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
FROM
"FBNK_CUSTOMER" a- > 3 ms   select RECID from 
"V_FBNK_CUSTOMER_TEST"


CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
FROM
"FBNK_CUSTOMER" a   --> 54 ms select RECID from 
"V_FBNK_CUSTOMER_TEST"


CREATE VIEW "V_FBNK_CUSTOMER_TEST" as
SELECT a.RECID, a.XMLRECORD "THE_RECORD"
,a.RECID "CUSTOMER_CODE"
,a.RECID "CUSTOMER_NO"
,extractValueJS(a.XMLRECORD, 1, 0) "MNEMONIC"
,extractValueJS(a.XMLRECORD, 2, 0) "SHORT_NAME"
FROM
"FBNK_CUSTOMER" a > 118 ms select RECID 
from "V_FBNK_CUSTOMER_TEST"

The following query takes an extremely long time for only 180 rows, and what 
this means is that we would have to index anything appearing in the where 
clause for every table in order to use views because the views seem not to 
consider the select clause.  Why is that and does anyone know a way around this?

SELECT RECID FROM "V_FBNK_CUSTOMER" WHERE "TESTER" = '5.00' ORDER BY RECID

Sort  (cost=19015.06..19015.06 rows=1 width=7) (actual 
time=102172.500..102172.501 rows=1 loops=1)
  Sort Key: "V_FBNK_CUSTOMER".recid
  Sort Method: quicksort  Memory: 25kB
  ->  Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19015.05 rows=1 width=7) 
(actual time=91242.866..102172.474 rows=1 loops=1)
Filter: (("V_FBNK_CUSTOMER"."TESTER")::text = '5.00'::text)
Rows Removed by Filter: 179
->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180 
width=14575) (actual time=613.455..102172.175 rows=180 loops=1)
Planning Time: 1.674 ms
Execution Time: 102174.015 ms


The information in this e-mail and any attachments is confidential and may be 
legally privileged. It is intended solely for the addressee or addressees. Any 
use or disclosure of the contents of this e-mail/attachments by a not intended 
recipient is unauthorized and may be unlawful. If you have received this e-mail 
in error please notify the sender. Please note that any views or opinions 
presented in this e-mail are solely those of the author and do not necessarily 
represent those of TEMENOS. We recommend that you check this e-mail and any 
attachments against viruses. TEMENOS accepts no liability for any damage caused 
by any malicious code or virus transmitted by this e-mail.


Re: Postgres not using index on views

2020-04-06 Thread Justin Pryzby
On Mon, Apr 06, 2020 at 02:19:59PM +, Rick Vincent wrote:
> I am seeing a performance problem with postgresql v 11.7 on views, and I am 
> wondering if anyone can tell me why or has any suggestion.
> 
> A table is created as:
> 
> CREATE TABLE "FBNK_CUSTOMER" (RECID VARCHAR(255) NOT NULL PRIMARY KEY, 
> XMLRECORD VARCHAR)
> 
> And contains only 180 rows.
> 
> Doing an explain plan on the view created over this gives:
> 
> EXPLAIN ANALYZE
> select RECID from "V_FBNK_CUSTOMER"
> 
> 
> Subquery Scan on "V_FBNK_CUSTOMER"  (cost=0.00..19014.60 rows=180 width=7) 
> (actual time=459.601..78642.189 rows=180 loops=1)
>   ->  Seq Scan on "FBNK_CUSTOMER" a  (cost=0.00..19012.80 rows=180 
> width=14575) (actual time=459.600..78641.950 rows=180 loops=1)
> 
> Yet an Explain plan on the underlying table( on select RECID from 
> "FBNK_CUSTOMER") gives:
> 
> Seq Scan on "FBNK_CUSTOMER"  (cost=0.00..22.80 rows=180 width=7) (actual 
> time=0.004..0.272 rows=180 loops=1)

It still did a seq scan on the table, so I'm not sure what this has to do with
index scans ?

> The following query takes an extremely long time for only 180 rows, and what 
> this means is that we would have to index anything appearing in the where 
> clause for every table in order to use views because the views seem not to 
> consider the select clause.  Why is that and does anyone know a way around 
> this?

Is there a reason why you don't store the extracted value in its own column ?
And maybe keep it up to date using an insert/update trigger on the xmlrecord
column.

-- 
Justin




Re: Postgres not using index on views

2020-04-06 Thread Tom Lane
Justin Pryzby  writes:
> On Mon, Apr 06, 2020 at 02:19:59PM +, Rick Vincent wrote:
>> The following query takes an extremely long time for only 180 rows, and what 
>> this means is that we would have to index anything appearing in the where 
>> clause for every table in order to use views because the views seem not to 
>> consider the select clause.  Why is that and does anyone know a way around 
>> this?

> Is there a reason why you don't store the extracted value in its own column ?

The planner seems to be quite well aware that the slower query is going to
be slower, since the estimated costs are much higher.  Since it's not
choosing to optimize into a faster form, I wonder whether it's constrained
by semantic requirements.  In particular, I'm suspicious that some of
those functions you have in the view are marked "volatile", preventing
them from being optimized away.

Beyond that guess, though, there's really not enough info here to say.
The info we usually ask for to debug slow-query problems is explained
at

https://wiki.postgresql.org/wiki/Slow_Query_Questions

regards, tom lane




Re: Postgres not using index on views

2020-04-06 Thread Laurenz Albe
On Mon, 2020-04-06 at 14:19 +, Rick Vincent wrote:
> I am seeing a performance problem with postgresql v 11.7 on views, and I am 
> wondering if
> anyone can tell me why or has any suggestion.

Your account is somewhat confused - too many questions rolled into one
rant, I would say.

There are two points that may clear up the case:

- If you have no WHERE clause, a sequential scan of the table is usually
  the best way to do it.  The exception is an index only scan if the index
  contains all that is required, but in PostgreSQL you need a recently
  VACUUMed table for that.

- The expensive part in your view is the "extractValueJS" function.
  Try to tune that for better performance.

If any of your problems are not explained by that, please say so.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com