Re: [HACKERS] Casting issues with domains

2014-12-11 Thread Thomas Reiss
On 11/12/2014 00:46, Tom Lane wrote:
 Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 As far as that goes, I think the OP was unhappy about the performance
 of the information_schema views, which in our implementation do exactly
 that so that the exposed types of the view columns conform to the SQL
 standard, even though the underlying catalogs use PG-centric types.

 I don't believe that that's the only reason why the performance of the
 information_schema views tends to be sucky, but it's certainly a reason.
 
 Is that schema too edge case to justify some functional indexes
 on the cast values on the underlying catalogs? (I'm inclined to
 think so, but it seemed like a question worth putting out
 there)
 
 We don't support functional indexes on system catalogs, so whether it'd
 be justified is sorta moot.  On the whole though I'm inclined to agree
 that the information_schema views aren't used enough to justify adding
 overhead to system-catalog updates, even if the pieces for that all
 existed.
 
 Or, since these particular domains are known, is there any sane way
 to special-case these to allow the underlying types to work?
 
 I don't particularly care for a kluge solution here.
 
 I notice that recent versions of the SQL spec contain the notion of a
 distinct type, which is a user-defined type that is representationally
 identical to some base type but has its own name, and comes equipped with
 assignment-grade casts to and from the base type (which in PG terms would
 be binary-compatible casts, though the spec doesn't require that).
 It seems like this might be intended to be the sort of zero cost type
 alias we were talking about, except that the SQL committee seems to have
 got it wrong by not specifying the cast-to-base-type as being implicit.
 Which ISTM you would want so that operators/functions on the base type
 would apply automatically to the distinct type.  But perhaps we could
 extend the spec with some option to CREATE TYPE to allow the cast to come
 out that way.
 
 Or in short, maybe we should try to replace the domains used in the
 current information_schema with distinct types.

That's interesting and could easily solve the problem.

To give some context: for some reason, Drupal queries the
information_schema views before displaying some pages.
Our customer has many tables (approx 6 tables, organised à la
Oracle with one schema per database user), so the seq scan against
pg_class takes ~50ms, while the very same scan without the cast takes
less than 1ms.

Here is an example of the query used:
SELECT column_name, data_type, column_default
  FROM information_schema.columns
 WHERE table_schema = 'one_schema'
   AND table_name = 'one_table'
   AND ( data_type = 'bytea'
 OR ( numeric_precision IS NOT NULL
   AND column_default::text LIKE '%nextval%' )
 );
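
For reference, the cast is baked into the view itself. A paraphrased fragment
of the information_schema.columns definition (not the exact source) looks
roughly like this:

    SELECT CAST(nc.nspname AS information_schema.sql_identifier) AS table_schema,
           CAST(c.relname  AS information_schema.sql_identifier) AS table_name,
           ...
      FROM pg_class c JOIN pg_namespace nc ON nc.oid = c.relnamespace
      ...

so the WHERE clause above compares casted expressions rather than the bare
name columns of the catalogs.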

Regards





Re: [HACKERS] Casting issues with domains

2014-12-10 Thread Kevin Grittner
Thomas Reiss thomas.re...@dalibo.com wrote:

 postgres=# create table test1 (a text);
 CREATE TABLE
 postgres=# insert into test1 select generate_series(1,10);
 INSERT 0 10
 postgres=# create index idx1 on test1(a);
 CREATE INDEX
 postgres=# analyze test1 ;
 ANALYZE;
 postgres=# explain select * from test1 where a = 'toto';
 QUERY PLAN
 ---
  Index Only Scan using idx1 on test1 (cost=0.29..8.31 rows=1 width=5)
Index Cond: (a = 'toto'::text)
 (2 rows)

 Now we create a tstdom domain and cast the a column to tstdom in the
 view definition:
 postgres=# create domain tstdom as text;
 CREATE DOMAIN
 postgres=# create view test2 as select a::tstdom from test1 ;
 CREATE VIEW
 postgres=# explain select * from test2 where a='toto';
 QUERY PLAN
 --
  Seq Scan on test1 (cost=0.00..1693.00 rows=500 width=5)
Filter: (((a)::tstdom)::text = 'toto'::text)
 (2 rows)

 As you can see, a is cast to tstdom and then back to text. These casts
 prevent the optimizer from choosing an index scan to retrieve the data. The
 casts are, however, strictly equivalent and should not prevent the
 optimizer from using indexes.

You can create an index to be used for searching using the domain.
Following the steps in your example, you can run this:

postgres=# create index idx2 on test1 ((a::tstdom));
CREATE INDEX
postgres=# vacuum analyze test1;
VACUUM
postgres=# explain select * from test2 where a='toto';
QUERY PLAN
--
 Index Scan using idx2 on test1 (cost=0.29..8.31 rows=1 width=5)
   Index Cond: (((a)::tstdom)::text = 'toto'::text)
(2 rows)

It's even easier if a is defined to be a member of the domain in
the original table:

postgres=# create domain tstdom as text;
CREATE DOMAIN
postgres=# create table test1 (a tstdom);
CREATE TABLE
postgres=# insert into test1 select generate_series(1,10);
INSERT 0 10
postgres=# create index idx1 on test1(a);
CREATE INDEX
postgres=# analyze test1 ;
ANALYZE
postgres=# explain select * from test1 where a = 'toto';
  QUERY PLAN
---
 Index Only Scan using idx1 on test1 (cost=0.29..8.31 rows=1 width=5)
   Index Cond: (a = 'toto'::text)
(2 rows)

postgres=# create view test2 as select a::tstdom from test1 ;
CREATE VIEW
postgres=# explain select * from test2 where a='toto';
  QUERY PLAN
---
 Index Only Scan using idx1 on test1 (cost=0.29..8.31 rows=1 width=5)
   Index Cond: (a = 'toto'::text)
(2 rows)

It's kinda hard for me to visualize where it makes sense to define
the original table column as the bare type but use a domain when
referencing it in the view.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Casting issues with domains

2014-12-10 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 It's kinda hard for me to visualize where it makes sense to define
 the original table column as the bare type but use a domain when
 referencing it in the view.

As far as that goes, I think the OP was unhappy about the performance
of the information_schema views, which in our implementation do exactly
that so that the exposed types of the view columns conform to the SQL
standard, even though the underlying catalogs use PG-centric types.

I don't believe that that's the only reason why the performance of the
information_schema views tends to be sucky, but it's certainly a reason.

regards, tom lane




Re: [HACKERS] Casting issues with domains

2014-12-10 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kgri...@ymail.com writes:

 It's kinda hard for me to visualize where it makes sense to define
 the original table column as the bare type but use a domain when
 referencing it in the view.

 As far as that goes, I think the OP was unhappy about the performance
 of the information_schema views, which in our implementation do exactly
 that so that the exposed types of the view columns conform to the SQL
 standard, even though the underlying catalogs use PG-centric types.

 I don't believe that that's the only reason why the performance of the
 information_schema views tends to be sucky, but it's certainly a reason.

Is that schema too edge case to justify some functional indexes
on the cast values on the underlying catalogs? (I'm inclined to
think so, but it seemed like a question worth putting out
there)

Or, since these particular domains are known, is there any sane way
to special-case these to allow the underlying types to work?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Casting issues with domains

2014-12-10 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 As far as that goes, I think the OP was unhappy about the performance
 of the information_schema views, which in our implementation do exactly
 that so that the exposed types of the view columns conform to the SQL
 standard, even though the underlying catalogs use PG-centric types.
 
 I don't believe that that's the only reason why the performance of the
 information_schema views tends to be sucky, but it's certainly a reason.

 Is that schema too edge case to justify some functional indexes
 on the cast values on the underlying catalogs? (I'm inclined to
 think so, but it seemed like a question worth putting out
 there)

We don't support functional indexes on system catalogs, so whether it'd
be justified is sorta moot.  On the whole though I'm inclined to agree
that the information_schema views aren't used enough to justify adding
overhead to system-catalog updates, even if the pieces for that all
existed.

 Or, since these particular domains are known, is there any sane way
 to special-case these to allow the underlying types to work?

I don't particularly care for a kluge solution here.

I notice that recent versions of the SQL spec contain the notion of a
distinct type, which is a user-defined type that is representationally
identical to some base type but has its own name, and comes equipped with
assignment-grade casts to and from the base type (which in PG terms would
be binary-compatible casts, though the spec doesn't require that).
It seems like this might be intended to be the sort of zero cost type
alias we were talking about, except that the SQL committee seems to have
got it wrong by not specifying the cast-to-base-type as being implicit.
Which ISTM you would want so that operators/functions on the base type
would apply automatically to the distinct type.  But perhaps we could
extend the spec with some option to CREATE TYPE to allow the cast to come
out that way.

Or in short, maybe we should try to replace the domains used in the
current information_schema with distinct types.
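
For illustration, a distinct type declaration in the spec looks roughly like
this (not something PG parses today; the length is arbitrary, and the
commented line is the hypothetical CREATE TYPE extension mentioned above,
not anything in the standard):

    CREATE TYPE sql_identifier AS CHARACTER VARYING(128) FINAL;
    -- hypothetical extension, neither spec nor PG:
    -- CREATE TYPE sql_identifier AS CHARACTER VARYING(128) FINAL CAST IMPLICIT;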

regards, tom lane




Re: [HACKERS] Casting issues with domains

2014-12-09 Thread Thomas Reiss
On 08/12/2014 16:18, Tom Lane wrote:
 Thomas Reiss thomas.re...@dalibo.com writes:
 postgres=# explain select * from test2 where a='toto';
 QUERY PLAN
 --
  Seq Scan on test1  (cost=0.00..1693.00 rows=500 width=5)
Filter: (((a)::tstdom)::text = 'toto'::text)
 (2 rows)
 
 As you can see, a is cast to tstdom and then back to text. These casts
 prevent the optimizer from choosing an index scan to retrieve the data. The
 casts are, however, strictly equivalent and should not prevent the
 optimizer from using indexes.
 
 No, they are not equivalent.  The optimizer can't simply drop the
 cast-to-domain, because that cast might result in a runtime error due
 to a domain CHECK constraint violation.  (This is true even if no such
 constraint exists at planning time, unfortunately.  If we had a
 mechanism to force replanning at ALTER DOMAIN ADD CONSTRAINT, maybe the
 no-constraints case could be handled better, but we don't; and adding
 one would also imply adding more locks around domain usage, so it's not
 all that attractive to do it.)
 
 The short answer is that SQL domains are not zero-cost type aliases.
 Perhaps there would be value in having a feature that *is* a zero-cost
 alias, but it wouldn't be a domain.

I agree regarding the feature for zero-cost aliases. It would ease
access to the catalogs via the information_schema, for example.

Thanks for your answer. There's some room for improvement for sure, but
it is not as easy as it seems.

Regards,
Thomas





[HACKERS] Casting issues with domains

2014-12-08 Thread Thomas Reiss
Hello all,


We experienced some casting issues with domains. We ran into the
problem while querying the information_schema, btw, but here is a simpler
test case:

postgres=# create table test1 (a text);
CREATE TABLE
postgres=# insert into test1 select generate_series(1,10);
INSERT 0 10
postgres=# create index idx1 on test1(a);
CREATE INDEX
postgres=# analyze test1 ;
ANALYZE;
postgres=# explain select * from test1 where a = 'toto';
  QUERY PLAN
---
 Index Only Scan using idx1 on test1  (cost=0.29..8.31 rows=1 width=5)
   Index Cond: (a = 'toto'::text)
(2 rows)

Now we create a tstdom domain and cast the a column to tstdom in the
view definition:
postgres=# create domain tstdom as text;
CREATE DOMAIN
postgres=# create view test2 as select a::tstdom from test1 ;
CREATE VIEW
postgres=# explain select * from test2 where a='toto';
QUERY PLAN
--
 Seq Scan on test1  (cost=0.00..1693.00 rows=500 width=5)
   Filter: (((a)::tstdom)::text = 'toto'::text)
(2 rows)

As you can see, a is cast to tstdom and then back to text. These casts
prevent the optimizer from choosing an index scan to retrieve the data. The
casts are, however, strictly equivalent and should not prevent the
optimizer from using indexes.

Also, the same problem appears in the information_schema views, as every
object name is cast to information_schema.sql_identifier. Even though this
domain is declared over name, no index will be used because of this cast.
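
One way to check which base type the domain maps to on a given installation
(an untested sketch):

    SELECT format_type(t.typbasetype, t.typtypmod) AS base_type
      FROM pg_type t
      JOIN pg_namespace n ON n.oid = t.typnamespace
     WHERE t.typname = 'sql_identifier'
       AND n.nspname = 'information_schema';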

Shouldn't the planner simplify the casts when possible?


Regards,
Thomas




Re: [HACKERS] Casting issues with domains

2014-12-08 Thread Tom Lane
Thomas Reiss thomas.re...@dalibo.com writes:
 postgres=# explain select * from test2 where a='toto';
 QUERY PLAN
 --
  Seq Scan on test1  (cost=0.00..1693.00 rows=500 width=5)
Filter: (((a)::tstdom)::text = 'toto'::text)
 (2 rows)

 As you can see, a is cast to tstdom and then back to text. These casts
 prevent the optimizer from choosing an index scan to retrieve the data. The
 casts are, however, strictly equivalent and should not prevent the
 optimizer from using indexes.

No, they are not equivalent.  The optimizer can't simply drop the
cast-to-domain, because that cast might result in a runtime error due
to a domain CHECK constraint violation.  (This is true even if no such
constraint exists at planning time, unfortunately.  If we had a
mechanism to force replanning at ALTER DOMAIN ADD CONSTRAINT, maybe the
no-constraints case could be handled better, but we don't; and adding
one would also imply adding more locks around domain usage, so it's not
all that attractive to do it.)
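
As a concrete sketch of the hazard, take the same names as the test case but
add a constraint to the domain:

    CREATE DOMAIN tstdom AS text CHECK (value <> 'toto');
    CREATE TABLE test1 (a text);
    INSERT INTO test1 VALUES ('toto');
    SELECT a::tstdom FROM test1;
    ERROR:  value for domain tstdom violates check constraint "tstdom_check"

If the planner threw away the cast to tstdom, that query would happily return
a row instead of erroring.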

The short answer is that SQL domains are not zero-cost type aliases.
Perhaps there would be value in having a feature that *is* a zero-cost
alias, but it wouldn't be a domain.

regards, tom lane




Re: [HACKERS] Casting issues with domains

2014-12-08 Thread Jim Nasby

On 12/8/14, 9:18 AM, Tom Lane wrote:

The short answer is that SQL domains are not zero-cost type aliases.
Perhaps there would be value in having a feature that *is* a zero-cost
alias, but it wouldn't be a domain.


Note that you can actually re-use the support functions of one data type to
create a new one. So if you wanted a special type called "document" that
actually behaved the same as text, you could do that fairly easily (though
not as easily as creating a domain).
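
A rough, untested sketch of that approach (it needs superuser, since it uses
LANGUAGE internal and binary-coercible casts; the names are purely
illustrative):

    CREATE TYPE document;   -- shell type

    CREATE FUNCTION document_in(cstring) RETURNS document
        AS 'textin' LANGUAGE internal IMMUTABLE STRICT;
    CREATE FUNCTION document_out(document) RETURNS cstring
        AS 'textout' LANGUAGE internal IMMUTABLE STRICT;

    CREATE TYPE document (
        INPUT = document_in,
        OUTPUT = document_out,
        LIKE = text
    );

    -- make it exchangeable with text at no runtime cost
    CREATE CAST (document AS text) WITHOUT FUNCTION AS IMPLICIT;
    CREATE CAST (text AS document) WITHOUT FUNCTION AS ASSIGNMENT;

You'd still need operator classes (or expression indexes) before an index on
a document column behaves like one on text, which is part of why it's not as
easy as a domain.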

If we were going to expend energy here, I suspect it would be more useful to
look at ways of creating new types without requiring C. C isn't an option in
many (even most) environments in today's cloud world, aside from the
intimidation factor. There are comments in the code that hypothesize about
making cstring a full type; that might be all that's needed.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

