Re: Schema variables - new implementation for Postgres 15
> 6. Oracle > > Oracle PL/SQL allows the use of package variables. PL/SQL is +/- ADA > language - and package variables are "global" variables. They are not > directly visible from SQL, but Oracle allows reduced syntax for functions > without arguments, so you need to write a wrapper > > CREATE OR REPLACE PACKAGE my_package > AS > FUNCTION get_a RETURN NUMBER; > END my_package; > / > > CREATE OR REPLACE PACKAGE BODY my_package > AS > a NUMBER(20); > > FUNCTION get_a > RETURN NUMBER > IS > BEGIN > RETURN a; > END get_a; > END my_package; > > SELECT my_package.get_a FROM DUAL; > > Inside SQL the higher priority has SQL, inside non SQL commands like CALL > or some PL/SQL command, the higher priority has packages. > The risk of collision's identifier is in some PL/SQL statements less than in Postgres, because SQL can be used only on dedicated positions (minimally in older Oracle's versions). Against other databases there is not allowed to use SQL everywhere as an expression. PL/SQL is an independent language, environment with its own expression executor (compiler). Other databases allow you to use an SQL subselect (I tested MySQL, PL/pgSQL, and I think (if I remember docs well) it is in standard SQL/PSM (related part of ANSI/SQL)) as expression. The integration of SQL into PL/SQL is not too deep and stored procedures look more like client code executed on the server side. Regards Pavel
Re: Schema variables - new implementation for Postgres 15
Hi > You can see, the RDBMS allows different types of session variables, > different implementations. Usually one system allows more implementation of > session variables. There is a possibility of emulation implementation > between RDBMS, but security setting is possible only in Oracle or DB2. > MySQL concept is very handy for ad hoc work, but it is too primitive for secure or safe use in stored procedures. Oracle concept is safe, but needs packages, needs writing wrappers, needs PL/SQL. I designed a concept that is very similar to DB2 (independently on IBM), and I think it is strong and can be well mapped to PostgreSQL (no packages, more different PL, strongly typed, ...) I think it would be nice to support the MySQL concept as syntactic sugar for GUC. This can be easy and for some use cases really very handy (and less confusing for beginners - using set_confing and current_setting is intuitive for work (emulation) of session variables (although the MSSQL solution is less intuitive). SET @myvar TO 10; --> SELECT set_config('session.myvar', 10) SET @@work_mem TO '10MB'; --> SELECT set_config('work_mem', '10MB'); SELECT @myvar; --> SELECT current_setting('session.myvar'); SELECT @@work_mem; --> SELECT current_setting('work_mem'); The syntax @ and @@ is widely used, and the mapping can be simple. This my proposal is not a replacement of the proposal of "schema" session variables. It is another concept, and I think so both can live together very well, because they are used for different purposes. Oracle, DB2 supports +/- both concepts Regards Pavel > > Regards > > Pavel > > > > > > >
Re: Schema variables - new implementation for Postgres 15
ne 2. 6. 2024 v 23:31 odesílatel Peter Eisentraut napsal: > > On 25.05.24 12:50, Pavel Stehule wrote: > > It looks odd - It is not intuitive, it introduces new inconsistency > > inside Postgres, or with solutions in other databases. No other database > > has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird > > will be confused. Users that use PL/pgSQL will be confused. > > Do you have a description of what those other systems do? Maybe you > posted it already earlier? > I checked today 1. MySQL MySQL knows 3 types of variables global - the access syntax is @@varname - they are used like our GUC and only buildin system variables are supported SET @@autocommit = off; SELECT @@autocommit; user defined variables - the access syntax is @varname - the behaviour is similar to psql variables, but they are server side SET @x = 100; SELECT @x; local variables - only inside PL CREATE PROCEDURE p1() DECLARE x int; BEGIN SET x = 100; SELECT x; END; variables has higher priority than column (like old plpgsql) 2. MSSQL global variables - the access syntax is @@varname, they are used like GUC and little bit more - some state informations are there like @@ERROR, @@ROWCOUNT or @@IDENTITY local variables - the access syntax is @varname, and should be declared before usage by DECLARE command. The scope is limited to batch or procedure or function, where DECLARE command was executed. DECLARE @TestVariable AS VARCHAR(100) SET @TestVariable = 'Think Green' GO PRINT @TestVariable This script fails, because PRINT is executed in another batch. So I think so MSSQL doesn't support session variables There are similar mechanisms like our custom GUC and usage current_setting and set_config functions. Generally, in this area is MSSQL very primitive EXEC sp_set_session_context 'user_id', 4; SELECT SESSION_CONTEXT(N'user_id'); 3. DB2 The "user defined global variables" are similar to my proposal. The differences are different access rights "READ, WRITE" x "SELECT, UPDATE". Because PostgreSQL has SET command for GUC, I introduced LET command (DB2 uses SET) Variables are visible in all sessions, but value is private per session. Variables are not transactional. The usage is wider than my proposal. Then can be changed by commands SET, SELECT INTO or they can be used like OUT parameters of procedures. The search path (or some like that) is used for variables too, but the variables has less priority than tables/columns. CREATE VARIABLE myCounter INT DEFAULT 01; SELECT EMPNO, LASTNAME, CASE WHEN myCounter = 1 THEN SALARY ELSE NULL END FROM EMPLOYEE WHERE WORKDEPT = ’A00’; SET myCounter = 29; There are (I think) different kinds of variables - accessed by the function GETVARIABLE('name', 'default) - it looks very similar ro our GUC and `current_setting` function. These variables can be set by connection string, are of varchar type and 10 values are allowed. Built-in session variables (configuration) can be accessed by the function GETVARIABLE too. SQL stored procedures supports declared local variables like PL/pgSQL 4. Firebird Firebird has something like our custom GUC. But it allow nested routines - so some functionality of session variables can be emulated with local variable and nested routines (but outer variables can be used only in Firebird 5) The variables are accessed by syntax :varname - like psql, but if I understand to diagrams, the char ':' is optional 5. SQL/PSM Standard introduces a concept of modules that can be joined with schemas. The variables are like PLpgSQL, but only local - the only temp tables can be defined on module levels. These tables can be accessed only from routines assigned to modules. Modules are declarative versions of our extensions (if I understand well, I didn't find any implementation). It allows you to overwrite the search patch for routines assigned in the module. Variables are not transactional, the priority - variables/columns is not specified. 6. Oracle Oracle PL/SQL allows the use of package variables. PL/SQL is +/- ADA language - and package variables are "global" variables. They are not directly visible from SQL, but Oracle allows reduced syntax for functions without arguments, so you need to write a wrapper CREATE OR REPLACE PACKAGE my_package AS FUNCTION get_a RETURN NUMBER; END my_package; / CREATE OR REPLACE PACKAGE BODY my_package AS a NUMBER(20); FUNCTION get_a RETURN NUMBER IS BEGIN RETURN a; END get_a; END my_package; SELECT my_package.get_a FROM DUAL; Inside SQL the higher priority has SQL, inside non SQL commands like CALL or some PL/SQL command, the higher priority has packages. The Oracle allows both syntax for calling function with zero arguments so SELECT my_package.get_a FROM DUAL; or SELECT my_package.get_a() FROM DUAL; Then there is less risk reduction of collision. Package variables persist in session Another possibility is using variables in SQL*Plus (looks like our psql variables, w
Re: Schema variables - new implementation for Postgres 15
On 25.05.24 12:50, Pavel Stehule wrote: It looks odd - It is not intuitive, it introduces new inconsistency inside Postgres, or with solutions in other databases. No other database has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be confused. Users that use PL/pgSQL will be confused. Do you have a description of what those other systems do? Maybe you posted it already earlier?
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 15:49 odesílatel Wolfgang Walther napsal: > Pavel Stehule: > > When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to > > RAISE NOTICE '%', SELECT $1 > > > > There is no parser just for expressions. > > That's why my suggestion in [1] already made a difference between: > > SELECT var; > > and > > SELECT col, var FROM table, var; > > So the "only require variable-in-FROM if FROM is used" should extend to > the SQL level. > > That should be possible, right? > 1. you need to implement extra path - the data from FROM clause are processed differently than params - it is much more code (and current code should to stay if you want to support it) 2. current default behave is implicit unpacking of composites when are used in FROM clause. So it is problem when you want to use composite in query without unpacking 3. when I'll support SELECT var and SELECT var FROM var together, then it will raise a collision with self, that should be solved 4. there is not any benefit if variables and tables doen't share catalog, but session variables requires lsn number, and it can be problem to use it is table catalog 5. identification when the variable needs or doesn't need FROM clause isn't easy there can be lot of combinations like SELECT (SELECT var), c FROM tab or SELECT var, (SELECT c) FROM c and if c is variable, then FROM is not necessary. If somebody will write SELECT (SELECT var OFFSET 0) FROM ... then subselect can know nothing about outer query - so it means minimally one check over all nodes It is possible / but it is multiple more complex than current code (and I am not sure if store lns in pg_class is possible ever) 6. I think so plpgsql case statement use multicolumn expression, so you can write CASE WHEN x = 1, (SELECT count(*) FROM tab) THEN ... It is synthetic, but we are talking about what is possible. and although it looks correctly, and will work if x will be plpgsql variable, then it will not work if x will be session variable and then you need to fix it like CASE WHEN (SELECT x=1 FROM x), (SELECT count(*) FROM tab) THEN so it is possible, but it is clean only in trivial cases, and can be pretty messy Personally, I cannot to imagine to explain to any user so following (proposed by you) behaviour is intuitive and friendly CREATE VARIABLE a as int; CREATE TABLE test(id int); SELECT a; --> ok SELECT * FROM test WHERE id = a; -- error message "the column "a" doesn't exists" Best, > > Wolfgang > > [1]: > > https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule: When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to RAISE NOTICE '%', SELECT $1 There is no parser just for expressions. That's why my suggestion in [1] already made a difference between: SELECT var; and SELECT col, var FROM table, var; So the "only require variable-in-FROM if FROM is used" should extend to the SQL level. That should be possible, right? Best, Wolfgang [1]: https://www.postgresql.org/message-id/e7faf42f-62b8-47f4-af5c-cb8efa3e0e20%40technowledgy.de
Re: Schema variables - new implementation for Postgres 15
> > > > >> In this case you could still write: >> >> RAISE NOTICE '% %', x, (SELECT a,b FROM y); >> >> (assuming only x is a variable here) >> > no - y was a composite variable. When you write RAISE NOTICE '%', x, then PLpgSQL parser rewrite it to RAISE NOTICE '%', SELECT $1 There is no parser just for expressions. > >> Best, >> >> Wolfgang >> >
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 15:29 odesílatel Wolfgang Walther napsal: > Pavel Stehule: > > The session variables can be used in queries, but should be used in > > PL/pgSQL expressions, and then the mandatory usage in FROM clause will > > do lot of problems and unreadable code like > > > > DO $$ > > BEGIN > >RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y); > > > > END > > $$ > > > > This requirement does variables unusable in PL > > I already proposed earlier to only require listing them in FROM when > there is actually a related FROM. > but there is technical problem - plpgsql expression are internally SQL queries. Isn't possible to cleanly to parse queries and expressions differently. > > In this case you could still write: > > RAISE NOTICE '% %', x, (SELECT a,b FROM y); > > (assuming only x is a variable here) > > Best, > > Wolfgang >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule: The session variables can be used in queries, but should be used in PL/pgSQL expressions, and then the mandatory usage in FROM clause will do lot of problems and unreadable code like DO $$ BEGIN RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y); END $$ This requirement does variables unusable in PL I already proposed earlier to only require listing them in FROM when there is actually a related FROM. In this case you could still write: RAISE NOTICE '% %', x, (SELECT a,b FROM y); (assuming only x is a variable here) Best, Wolfgang
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 15:02 odesílatel Pavel Stehule napsal: > > > pá 31. 5. 2024 v 13:37 odesílatel Wolfgang Walther < > walt...@technowledgy.de> napsal: > >> Pavel Stehule: >> > But in this case you could make variables and tables share the same >> > namespace, i.e. forbid creating a variable with the same name as an >> > already existing table. >> > >> > >> > It helps, but not on 100% - there is a search path >> > >> I think we can ignore the search_path for this discussion. That's not a >> problem of variables vs tables, but just a search path related problem. >> It is exactly the same thing right now, when you create a new table x(x) >> in a schema which happens to be earlier in your search path. >> > > I don't think it is a valid argument - search_path is there, and we cannot > ignore it, because it allows just one case. > > And the need to use a variable in FROM clause introduces implicit > unpacking or inconsistency with current work with composite's types, so I > am more sure this way is not good. > The session variables can be used in queries, but should be used in PL/pgSQL expressions, and then the mandatory usage in FROM clause will do lot of problems and unreadable code like DO $$ BEGIN RAISE NOTICE '% %', (SELECT x FROM x), (SELECT a,b FROM y); END $$ This requirement does variables unusable in PL > > > >> >> The objection to the proposed approach for variables was that it would >> introduce *new* ambiguities, which Alvaro's suggestion avoids. >> >> Best, >> >> Wolfgang >> >
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 13:37 odesílatel Wolfgang Walther napsal: > Pavel Stehule: > > But in this case you could make variables and tables share the same > > namespace, i.e. forbid creating a variable with the same name as an > > already existing table. > > > > > > It helps, but not on 100% - there is a search path > > I think we can ignore the search_path for this discussion. That's not a > problem of variables vs tables, but just a search path related problem. > It is exactly the same thing right now, when you create a new table x(x) > in a schema which happens to be earlier in your search path. > I don't think it is a valid argument - search_path is there, and we cannot ignore it, because it allows just one case. And the need to use a variable in FROM clause introduces implicit unpacking or inconsistency with current work with composite's types, so I am more sure this way is not good. > > The objection to the proposed approach for variables was that it would > introduce *new* ambiguities, which Alvaro's suggestion avoids. > > Best, > > Wolfgang >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule: But in this case you could make variables and tables share the same namespace, i.e. forbid creating a variable with the same name as an already existing table. It helps, but not on 100% - there is a search path I think we can ignore the search_path for this discussion. That's not a problem of variables vs tables, but just a search path related problem. It is exactly the same thing right now, when you create a new table x(x) in a schema which happens to be earlier in your search path. The objection to the proposed approach for variables was that it would introduce *new* ambiguities, which Alvaro's suggestion avoids. Best, Wolfgang
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 13:10 odesílatel Wolfgang Walther napsal: > Pavel Stehule: > > 2. But my main argument is, it is not really safe - it solves Peter's > > use case, but if I use a reverse example of Peter's case, I still have a > > problem. > > > > I can have a variable x, and then I can write query like `SELECT x FROM > x`; > > > > but if somebody creates table x(x int), then the query `SELECT x FROM x` > > will be correct, but it is surely something else. So the requirement of > > the usage variable inside FROM clause doesn't help. It doesn't work. > > But in this case you could make variables and tables share the same > namespace, i.e. forbid creating a variable with the same name as an > already existing table. > It helps, but not on 100% - there is a search path > > Best, > > Wolfgang >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule: 2. But my main argument is, it is not really safe - it solves Peter's use case, but if I use a reverse example of Peter's case, I still have a problem. I can have a variable x, and then I can write query like `SELECT x FROM x`; but if somebody creates table x(x int), then the query `SELECT x FROM x` will be correct, but it is surely something else. So the requirement of the usage variable inside FROM clause doesn't help. It doesn't work. But in this case you could make variables and tables share the same namespace, i.e. forbid creating a variable with the same name as an already existing table. Best, Wolfgang
Re: Schema variables - new implementation for Postgres 15
pá 31. 5. 2024 v 11:46 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote: > > > > I propose another variants. First we can introduce pseudo function VAR( > ). > > The argument should be session variables. The name of this function can > be > > pgvar, globvar, ... We can talk about good name, it should not be too > long, > > but it is not important now. The VAR() function will be pseudo function > > like COALESCE, so we can easily to set correct result type. > > So, the purpose of the function would be only to verify that the argument > is a > session variable? That seems to be a very light payload, which looks a bit > awkward. > no, it just reduces catalog searching to variables. So with using this function, then there is no possibility of collision between variables and other objects. The argument can be only variable and nothing else. So then the conflict is not possible. When somebody tries to specify a table or column, then it fails, because this object will not be detected. So inside this function, the tables and columns cannot to shading variables, and variables cannot be replaced by columns. So the proposed function is not just assert, it is designed like a catalog filter. > Out of those options you propose I think the first one is the > most straightforward one, but... > > > Alvaro Herrera: > > > Perhaps the solution to all this is to avoid having the variables be > > > implicitly present in the range table of all queries. Instead, if you > > > need a variable's value, then you need to add the variable to the FROM > > > clause; > > The more I think about this, the more I like this solution. Marking > which variables are available to the query this way, and using established > patterns for resolving ambiguity actually looks intuitive to me. Now I > know, > you've got strong objections: > I still don't like this - mainly from two reasons 1. it doesn't look user friendly - you need to maintain two different places in one query for one object. I can imagine usage there in the case of composite variables with unpacking (and then it can be consistent with others). I can imagine to use optional usage of variables there for the possibility of realiasing - like functions - and if we should support it, then with unpacking of composite values. (2024-05-31 12:33:57) postgres=# create type t as (a int, b int); CREATE TYPE (2024-05-31 12:35:26) postgres=# create function fx() returns t as $$ select 1, 2 $$ language sql; CREATE FUNCTION (2024-05-31 12:35:44) postgres=# select fx(); ┌───┐ │ fx │ ╞═══╡ │ (1,2) │ └───┘ (1 row) (2024-05-31 12:35:47) postgres=# select * from fx(); ┌───┬───┐ │ a │ b │ ╞═══╪═══╡ │ 1 │ 2 │ └───┴───┘ (1 row) 2. But my main argument is, it is not really safe - it solves Peter's use case, but if I use a reverse example of Peter's case, I still have a problem. I can have a variable x, and then I can write query like `SELECT x FROM x`; but if somebody creates table x(x int), then the query `SELECT x FROM x` will be correct, but it is surely something else. So the requirement of the usage variable inside FROM clause doesn't help. It doesn't work. > > I don't like this. Sure, this fixes the problem with collisions, but then > > we cannot talk about variables. When some is used like a table, then it > > should be a table. I can imagine memory tables, but it is a different > type > > of object. Table is relation, variable is just value. Variables should > not > > have columns, so using the same patterns for tables and variables has no > > sense. Using the same catalog for variables and tables. Variables just > hold > > a value, and then you can use it inside a query without necessity to > write > > JOIN. Variables are not tables, and then it is not too confusing so they > > are not transactional and don't support more rows, more columns. > > A FROM clause could contain a function returning a single value, nobody > finds it confusing. And at least to me it's not much different from having > a > session variable as well, what do you think? > but there is a difference when function returns composite, and when not - if I use function in FROM clause, I'll get unpacked columns, when I use function in columns, then I get composite. The usage variable in FROM clause can have sense in similar princip like functions - for possibility to use alias in same level of query and possibility to use one common syntax for composite unpacking. But it doesn't help with safety against collisions. > > > c) using variables with necessity to define it in FROM clause. It is > safe, > > but it can be less readable, when you use more variables, and it is not > too > > readable, and user friendly, because you need to write FROM. And can be > > messy, because you usually will use variables in queries, and it is > > introduce not relations into FROM clause. But I can imagine this mode as > > alternative syntax, but it
Re: Schema variables - new implementation for Postgres 15
> On Tue, May 28, 2024 at 05:18:02PM GMT, Pavel Stehule wrote: > > I propose another variants. First we can introduce pseudo function VAR( ). > The argument should be session variables. The name of this function can be > pgvar, globvar, ... We can talk about good name, it should not be too long, > but it is not important now. The VAR() function will be pseudo function > like COALESCE, so we can easily to set correct result type. So, the purpose of the function would be only to verify that the argument is a session variable? That seems to be a very light payload, which looks a bit awkward. Out of those options you propose I think the first one is the most straightforward one, but... > Alvaro Herrera: > > Perhaps the solution to all this is to avoid having the variables be > > implicitly present in the range table of all queries. Instead, if you > > need a variable's value, then you need to add the variable to the FROM > > clause; The more I think about this, the more I like this solution. Marking which variables are available to the query this way, and using established patterns for resolving ambiguity actually looks intuitive to me. Now I know, you've got strong objections: > I don't like this. Sure, this fixes the problem with collisions, but then > we cannot talk about variables. When some is used like a table, then it > should be a table. I can imagine memory tables, but it is a different type > of object. Table is relation, variable is just value. Variables should not > have columns, so using the same patterns for tables and variables has no > sense. Using the same catalog for variables and tables. Variables just hold > a value, and then you can use it inside a query without necessity to write > JOIN. Variables are not tables, and then it is not too confusing so they > are not transactional and don't support more rows, more columns. A FROM clause could contain a function returning a single value, nobody finds it confusing. And at least to me it's not much different from having a session variable as well, what do you think? > c) using variables with necessity to define it in FROM clause. It is safe, > but it can be less readable, when you use more variables, and it is not too > readable, and user friendly, because you need to write FROM. And can be > messy, because you usually will use variables in queries, and it is > introduce not relations into FROM clause. But I can imagine this mode as > alternative syntax, but it is very unfriendly and not intuitive (I think). The proposal from Wolfgang to have a short-cut and not add FROM in case there is no danger of ambiguity seems to resolve that. > More probably it doesn't fast execution in simple expression execution mode. Could you elaborate more, what do you mean by that? If the performance overhead is not prohibitive (which I would expect is the case), having better UX for a new feature usually beats having better performance. > It looks odd - It is not intuitive, it introduces new inconsistency inside > Postgres, or with solutions in other databases. No other database has a > similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be > confused. Users that use PL/pgSQL will be confused. Session variables are not part of the SQL standard, and maintaining consistency with other databases is a questionable goal. Since it's a new feature, I'm not sure what you mean by inconsistency inside Postgres itself. I see that the main driving case behind this patch is to help with migrating from other databases that do have session variables. Going with variables in FROM clause, will not make a migration much harder -- some of the queries would have to modify the FROM part, and that's it, right? I could imagine it would be even easier than adding VAR() everywhere.
Re: Schema variables - new implementation for Postgres 15
Hi so 25. 5. 2024 v 3:29 odesílatel Tom Lane napsal: > Pavel Stehule writes: > > we can introduce special safe mode started by > > set enable_direct_variable_read to off; > > and allowing access to variables only by usage dedicated function > > (supported by parser) named like variable or pg_variable > > Didn't we learn twenty years ago that GUCs that change query > semantics are an awful idea? Pick a single access method > for these things and stick to it. > I propose another variants. First we can introduce pseudo function VAR( ). The argument should be session variables. The name of this function can be pgvar, globvar, ... We can talk about good name, it should not be too long, but it is not important now. The VAR() function will be pseudo function like COALESCE, so we can easily to set correct result type. I see possible variants 1. for any read of session variable, the VAR function should be used (everywhere), the write is not problem, there is not risk of collisions. When VAR() function will be required everywhere, then the name should be shorter. SELECT * FROM tab WHERE id = VAR(stehule.myvar); SELECT VAR(okbob.myvar); 2. the usage of VAR() function should be required, when query has FROM clause, and then there is in risk of collisions. Without it, then the VAR() function can be optional (it is modification of Wolfgang or Alvaro proposals). I prefer this syntax before mentioning in FROM clause, just I think so it is less confusing, and FROM clause should be used for relations, and not for variables. SELECT * FROM tab WHERE id = VAR(okbob.myvar) SELECT okbob.myvar; 3. Outside PL the VAR() function will be required, inside PL the VAR function can be optional (and we can throw an exception) when we found collision like now What do you think about this proposal? And if you can accept it, what version? I think so implementation of any proposed variant should be easy. I can add extra check to plpgsql_check if the argument of VAR() function is in possible collision with other identifiers in query, but for proposed variants it is just in nice to have category Regards Pavel > > regards, tom lane >
Re: Schema variables - new implementation for Postgres 15
so 25. 5. 2024 v 10:24 odesílatel napsal: > Pavel Stehule: > > Sure there is more possibilities, but I don't want to lost the > > possibility to write code like > > > > CREATE TEMP VARIABLE _x; > > > > LET _x = 'hello'; > > > > DO $$ > > BEGIN > >RAISE NOTICE '%', _x; > > END; > > $$; > > > > So I am searching for a way to do it safely, but still intuitive and > > user friendly. > > Maybe a middle-way between this and Alvaro's proposal could be: > > Whenever you have a FROM clause, a variable must be added to it to be > accessible. When you don't have a FROM clause, you can access it directly. > > This would make the following work: > > RAISE NOTICE '%', _x; > > SELECT _x; > > SELECT tbl.*, _x FROM tbl, _x; > > SELECT tbl.*, (SELECT _x) FROM tbl, _x; > > SELECT tbl.*, (SELECT _x FROM _x) FROM tbl; > > > But the following would be an error: > > SELECT tbl.*, _x FROM tbl; > > SELECT tbl.*, (SELECT _x) FROM tbl; > > It looks odd - It is not intuitive, it introduces new inconsistency inside Postgres, or with solutions in other databases. No other database has a similar rule, so users coming from Oracle, Db2, or MSSQL, Firebird will be confused. Users that use PL/pgSQL will be confused. Regards Pavel > > Best, > > Wolfgang >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule: Sure there is more possibilities, but I don't want to lost the possibility to write code like CREATE TEMP VARIABLE _x; LET _x = 'hello'; DO $$ BEGIN RAISE NOTICE '%', _x; END; $$; So I am searching for a way to do it safely, but still intuitive and user friendly. Maybe a middle-way between this and Alvaro's proposal could be: Whenever you have a FROM clause, a variable must be added to it to be accessible. When you don't have a FROM clause, you can access it directly. This would make the following work: RAISE NOTICE '%', _x; SELECT _x; SELECT tbl.*, _x FROM tbl, _x; SELECT tbl.*, (SELECT _x) FROM tbl, _x; SELECT tbl.*, (SELECT _x FROM _x) FROM tbl; But the following would be an error: SELECT tbl.*, _x FROM tbl; SELECT tbl.*, (SELECT _x) FROM tbl; Best, Wolfgang
Re: Schema variables - new implementation for Postgres 15
so 25. 5. 2024 v 3:29 odesílatel Tom Lane napsal: > Pavel Stehule writes: > > we can introduce special safe mode started by > > set enable_direct_variable_read to off; > > and allowing access to variables only by usage dedicated function > > (supported by parser) named like variable or pg_variable > > Didn't we learn twenty years ago that GUCs that change query > semantics are an awful idea? Pick a single access method > for these things and stick to it. > I don't think the proposed GUC exactly changes query semantics - it is equivalent of plpgsql options: plpgsql_extra_ or #variable_conflict. It allows us to identify broken queries. And for tools that generates queries is not problem to wrap reading variable by special pseudo function. The code where pseudo function will be used should to work with active or inactive strict mode (related to possibility to use variables). Sure there is more possibilities, but I don't want to lost the possibility to write code like CREATE TEMP VARIABLE _x; LET _x = 'hello'; DO $$ BEGIN RAISE NOTICE '%', _x; END; $$; So I am searching for a way to do it safely, but still intuitive and user friendly. Regards Pavel > > regards, tom lane >
Re: Schema variables - new implementation for Postgres 15
Pavel Stehule writes: > we can introduce special safe mode started by > set enable_direct_variable_read to off; > and allowing access to variables only by usage dedicated function > (supported by parser) named like variable or pg_variable Didn't we learn twenty years ago that GUCs that change query semantics are an awful idea? Pick a single access method for these things and stick to it. regards, tom lane
Re: Schema variables - new implementation for Postgres 15
Hi st 22. 5. 2024 v 19:25 odesílatel Tom Lane napsal: > Peter Eisentraut writes: > > On 18.05.24 13:29, Alvaro Herrera wrote: > >> I want to note that when we discussed this patch series at the dev > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > >> schema variables at all because of the fact that creating a variable > >> would potentially change the meaning of queries by shadowing table > >> columns. But this turns out to be incorrect: it's_variables_ that are > >> shadowed by table columns, not the other way around. > > > But that's still bad, because seemingly unrelated schema changes can > > make variables appear and disappear. For example, if you have > > SELECT a, b FROM table1 > > and then you drop column b, maybe the above query continues to work > > because there is also a variable b. > > Yeah, that seems pretty dangerous. Could we make it safe enough > by requiring some qualification on variable names? That is, if > you mean b to be a variable, then you must write something like > > SELECT a, pg_variables.b FROM table1 > > This is still ambiguous if you use "pg_variables" as a table alias in > the query, but the alias would win so the query still means what it > meant before. Also, table aliases (as opposed to actual table names) > don't change readily, so I don't think there's much risk of the query > suddenly meaning something different than it did yesterday. > we can introduce special safe mode started by set enable_direct_variable_read to off; and allowing access to variables only by usage dedicated function (supported by parser) named like variable or pg_variable so it can looks like select a, pg_variable(myschema.myvar) from table In this mode, the variables never are readable directly, so there is no risk of collision and issue mentioned by Peter. And the argument of the pg_variable pseudo function can be only variable, so risk of possible collision can be reduced too. The pseudo function pg_variable can be used in less restrictive mode too, when the user can explicitly show usage of the variable. Tom's proposal is already almost supported now. The user can use a dedicated schema without assigning this schema to search_path. Then a qualified name should be required. Can this design be the correct answer for mentioned objections? Regards Pavel > regards, tom lane >
Re: Schema variables - new implementation for Postgres 15
> > > > >> >> As far as I can see now, it's a major design flaw that could keep the >> patch from being accepted. Fortunately there are few good proposals how >> to address this, folks are genuinely trying to help. What do you think >> about trying some of them out, as an alternative approach, to compare >> functionality and user experience? >> > > It is a design flaw of SQL. The issue we talk about is the generic > property of SQL, and then you cannot fix it. > > I thought about possibility to introduce dedicated function > > svalue(regvariable) returns any - with planner support > > and possibility to force usage of this function. Another possibility is > using some simple dedicated operator (syntax) for force using of variables > so theoretically this can looks like: > > set strict_usage_of_session_variables to on; > SELECT * FROM tab WHERE a = svalue('myvar.var'); > or > > SELECT * FROM tab WHERE a = @ myvar.var; > > This can be really safe. Personally It is not my cup of tea, but I can > live it (and this mode can be default). > > Theoretically we can limit usage of variables just for PL/pgSQL. It can > reduce risks too, but it breaks usage variables for parametrization of DO > blocks (what is my primary motivation), but it can be good enough to > support migration from PL/SQL. > another possibility can be disable / enable usage of session variables on session level like set enable_session_variable to on/off so when the application doesn't use session variables, and then session variables can be disabled, but the user can enable it just for self for self session. Then the risk of unwanted usage of session variables can be zero. This is similar to discussion about login triggers. This mechanism can be used for using session variables only in PL too. > > >> >> In the meantime I'm afraid I have to withdraw "Ready for committer" >> status, sorry. I've clearly underestimated the importance of variables >> shadowing, thanks Alvaro and Peter for pointing out some dangerous >> cases. I still believe though that the majority of the patch is in a >> good shape and the question about variables shadowing is the only thing >> that keeps it from moving forward. >> > > I understand. > > I'll try to recapitulate my objections against proposed designs > > a) using syntax like MS - DECLARE command and '@@' prefix - it is dynamic, > so there is not possibility of static check. It is not joined with schema, > so there are possible collisions between variables and and the end the > variables are named like @@mypackage_myvar - so some custom naming > convention is necessary too. There is not possibility to set access rights. > > b) using variables like MySQL - first usage define it, and access by '@' > prefix. It is simple, but without possibility of static check. There is not > possibility to set access rights. > > c) using variables with necessity to define it in FROM clause. It is safe, > but it can be less readable, when you use more variables, and it is not too > readable, and user friendly, because you need to write FROM. And can be > messy, because you usually will use variables in queries, and it is > introduce not relations into FROM clause. But I can imagine this mode as > alternative syntax, but it is very unfriendly and not intuitive (I think). > More probably it doesn't fast execution in simple expression execution mode. > > d) my proposal - there is possibility of collisions, but consistent with > naming of database objects, allows set of access rights, allows static > analyze, consistent with PL/pgSQL and similar to PL/pgSQL. > > There is not any other possibility. Any time this is war between be user > friendly, be readable, be correctly - but there is not perfect solution, > because just SQL is not perfect. Almost all mentioned objections against > proposed variables are valid just for tables and columns. > > Regards > > Pavel > >
Re: Schema variables - new implementation for Postgres 15
Hi pá 24. 5. 2024 v 13:32 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Wed, May 22, 2024 at 08:44:28PM +0200, Pavel Stehule wrote: > > st 22. 5. 2024 v 19:25 odesílatel Tom Lane napsal: > > > > > Peter Eisentraut writes: > > > > On 18.05.24 13:29, Alvaro Herrera wrote: > > > >> I want to note that when we discussed this patch series at the dev > > > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't > want > > > >> schema variables at all because of the fact that creating a variable > > > >> would potentially change the meaning of queries by shadowing table > > > >> columns. But this turns out to be incorrect: it's_variables_ that > are > > > >> shadowed by table columns, not the other way around. > > > > > > > But that's still bad, because seemingly unrelated schema changes can > > > > make variables appear and disappear. For example, if you have > > > > SELECT a, b FROM table1 > > > > and then you drop column b, maybe the above query continues to work > > > > because there is also a variable b. > > > > > > Yeah, that seems pretty dangerous. Could we make it safe enough > > > by requiring some qualification on variable names? That is, if > > > you mean b to be a variable, then you must write something like > > > > > > SELECT a, pg_variables.b FROM table1 > > > > > > This is still ambiguous if you use "pg_variables" as a table alias in > > > the query, but the alias would win so the query still means what it > > > meant before. Also, table aliases (as opposed to actual table names) > > > don't change readily, so I don't think there's much risk of the query > > > suddenly meaning something different than it did yesterday. > > > > > > > With active shadowing variable warning for described example you will > get a > > warning before dropping. > > I assume you're talking about a warning, which one will get querying the > table with shadowed columns. If no such query has happened yet and the > column was dropped, there will be no warning. > sure - the possible identifier collision cannot be solved in SQL perfectly. It is the same with tables. When I add badly named column to table, I'll get an error "ambiguous columns" just when I'll execute query. The system catalog just cannot protect against collisions - it is true for columns, variables, tables. Little bit protected are views, that are stored in parsed format, but any other object can be broken when somebody choose bad names in catalog or queries. There is not any protection. > > Aside that, I'm afraid dropping a warning in log does not have > sufficient visibility to warn about the issue, since one needs to read > those logs first. I guess what folks are looking for is more constraints > out of the box, preventing any ambiguity. > We can increase (optionality) the level of this message to error. It is not perfect, but it can work well. I think so there is not higher risk with variables than current risk with just tables. a) the possibility to create variables is limited by rights on schema. So nobody can create variables without necessary rights (invisibly) b) if user has own schema with CREATE right, then it can create variables just for self, and with default setting, just visible for self, and just accessible for self. When other users try to use these variables, then the query fails due to missing access rights (usually). Common user cannot to create variables in application schema and cannot to set search_path for applications. c) the changes of schema are usually tested on some testing stages before are applied on production. So when there will be possible collision or some other defect, probably it will be detected there. Untested changes of catalog on production is not too common today. d) any risk that can be related for variables, is related just to renaming column or table. > > > Session variables are joined with schema (in my proposal). Do anybody can > > do just > > > > CREATE SCHEMA svars; -- or what (s)he likes > > CREATE VARIABLE svars.b AS int; > > > > SELECT a, b FROM table1 > > > > and if somebody can be really safe, the can write > > > > SELECT t.a, t.b FROM table1 t > > > > or > > > > SELECT t.a, svars.b FROM table1 t > > > > It can be customized in the way anybody prefers - just creating dedicated > > schemas and setting search_path. Using its own schema for session > variables > > without enhancing search_path for this schema forces the necessity to set > > only qualified names for session variables. > > > > Sure the naming of schemas, aliases can be unhappy wrong, and there can > be > > the problem. But this can be a problem today too. > > If I understand you correctly, you're saying that there are "best > practices" how to deal with session variables to avoid any potential > issues. But I think it's more user-friendly to have something that will > not allow shooting yourself in the foot right out of the box. You're > right, similar things could probably happe
Re: Schema variables - new implementation for Postgres 15
> On Wed, May 22, 2024 at 08:44:28PM +0200, Pavel Stehule wrote: > st 22. 5. 2024 v 19:25 odesílatel Tom Lane napsal: > > > Peter Eisentraut writes: > > > On 18.05.24 13:29, Alvaro Herrera wrote: > > >> I want to note that when we discussed this patch series at the dev > > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > > >> schema variables at all because of the fact that creating a variable > > >> would potentially change the meaning of queries by shadowing table > > >> columns. But this turns out to be incorrect: it's_variables_ that are > > >> shadowed by table columns, not the other way around. > > > > > But that's still bad, because seemingly unrelated schema changes can > > > make variables appear and disappear. For example, if you have > > > SELECT a, b FROM table1 > > > and then you drop column b, maybe the above query continues to work > > > because there is also a variable b. > > > > Yeah, that seems pretty dangerous. Could we make it safe enough > > by requiring some qualification on variable names? That is, if > > you mean b to be a variable, then you must write something like > > > > SELECT a, pg_variables.b FROM table1 > > > > This is still ambiguous if you use "pg_variables" as a table alias in > > the query, but the alias would win so the query still means what it > > meant before. Also, table aliases (as opposed to actual table names) > > don't change readily, so I don't think there's much risk of the query > > suddenly meaning something different than it did yesterday. > > > > With active shadowing variable warning for described example you will get a > warning before dropping. I assume you're talking about a warning, which one will get querying the table with shadowed columns. If no such query has happened yet and the column was dropped, there will be no warning. Aside that, I'm afraid dropping a warning in log does not have sufficient visibility to warn about the issue, since one needs to read those logs first. I guess what folks are looking for is more constraints out of the box, preventing any ambiguity. > Session variables are joined with schema (in my proposal). Do anybody can > do just > > CREATE SCHEMA svars; -- or what (s)he likes > CREATE VARIABLE svars.b AS int; > > SELECT a, b FROM table1 > > and if somebody can be really safe, the can write > > SELECT t.a, t.b FROM table1 t > > or > > SELECT t.a, svars.b FROM table1 t > > It can be customized in the way anybody prefers - just creating dedicated > schemas and setting search_path. Using its own schema for session variables > without enhancing search_path for this schema forces the necessity to set > only qualified names for session variables. > > Sure the naming of schemas, aliases can be unhappy wrong, and there can be > the problem. But this can be a problem today too. If I understand you correctly, you're saying that there are "best practices" how to deal with session variables to avoid any potential issues. But I think it's more user-friendly to have something that will not allow shooting yourself in the foot right out of the box. You're right, similar things could probably happen with the already existing functionality, but it doesn't give us rights to add more to it. Especially if it's going to be about a brand-new feature. As far as I can see now, it's a major design flaw that could keep the patch from being accepted. Fortunately there are few good proposals how to address this, folks are genuinely trying to help. What do you think about trying some of them out, as an alternative approach, to compare functionality and user experience? In the meantime I'm afraid I have to withdraw "Ready for committer" status, sorry. I've clearly underestimated the importance of variables shadowing, thanks Alvaro and Peter for pointing out some dangerous cases. I still believe though that the majority of the patch is in a good shape and the question about variables shadowing is the only thing that keeps it from moving forward.
Re: Schema variables - new implementation for Postgres 15
Hi st 22. 5. 2024 v 16:14 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Wed, May 22, 2024 at 02:37:49PM +0200, Peter Eisentraut wrote: > > On 18.05.24 13:29, Alvaro Herrera wrote: > > > I want to note that when we discussed this patch series at the dev > > > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > > > schema variables at all because of the fact that creating a variable > > > would potentially change the meaning of queries by shadowing table > > > columns. But this turns out to be incorrect: it's_variables_ that are > > > shadowed by table columns, not the other way around. > > > > But that's still bad, because seemingly unrelated schema changes can make > > variables appear and disappear. For example, if you have > > > > SELECT a, b FROM table1 > > > > and then you drop column b, maybe the above query continues to work > because > > there is also a variable b. Or maybe it now does different things > because b > > is of a different type. This all has the potential to be very confusing. > > Yeah, that's a bummer. Interestingly enough, the db2 implementation of > global session variables mechanism is mentioned as similar to what we > have in the patch. But weirdly, the db2 documentation just states > possibility of a resolution conflict for unqualified names, nothing > else. > I found document https://www.ibm.com/docs/it/i/7.3?topic=variables-global If I understand well, then the same rules are applied for qualified or not qualified identifiers (when there is a conflict), and the variables have low priority. The db2 has the possibility to compile objects, and it can block the usage variables created after compilation - (if I understand well the described behaviour). Regards Pavel
Re: Schema variables - new implementation for Postgres 15
st 22. 5. 2024 v 14:37 odesílatel Peter Eisentraut napsal: > On 18.05.24 13:29, Alvaro Herrera wrote: > > I want to note that when we discussed this patch series at the dev > > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > > schema variables at all because of the fact that creating a variable > > would potentially change the meaning of queries by shadowing table > > columns. But this turns out to be incorrect: it's_variables_ that are > > shadowed by table columns, not the other way around. > > But that's still bad, because seemingly unrelated schema changes can > make variables appear and disappear. For example, if you have > > SELECT a, b FROM table1 > > and then you drop column b, maybe the above query continues to work > because there is also a variable b. Or maybe it now does different > things because b is of a different type. This all has the potential to > be very confusing. > The detection of possible conflicts works well (in or outside PL too) create variable x as int; create table foo(x int); insert into foo values(110); set session_variables_ambiguity_warning to on; (2024-05-23 08:22:34) postgres=# do $$ begin raise notice '%', (select x from foo); end; $$; WARNING: session variable "x" is shadowed LINE 1: (select x from foo) ^ DETAIL: Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name. QUERY: (select x from foo) NOTICE: 110 DO (2024-05-23 08:22:35) postgres=# do $$ declare x int default 100; begin raise notice '%', x; end; $$; WARNING: session variable "x" is shadowed LINE 1: x ^ DETAIL: Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name. QUERY: x NOTICE: 100 DO
Re: Schema variables - new implementation for Postgres 15
st 22. 5. 2024 v 20:21 odesílatel napsal: > Alvaro Herrera: > > Perhaps the solution to all this is to avoid having the variables be > > implicitly present in the range table of all queries. Instead, if you > > need a variable's value, then you need to add the variable to the FROM > > clause; > > +1 > > This should make it easier to work with composite type schema variables > in some cases. It could also enable schema qualifying of schema > variables, or at least make it easier to do, I think. > > In this case variables would share the same namespace as tables and > views, right? So I could not create a variable with the same name as > another table. Which is a good thing, I guess. Not sure how it's > currently implemented in the patch. > I don't like this. Sure, this fixes the problem with collisions, but then we cannot talk about variables. When some is used like a table, then it should be a table. I can imagine memory tables, but it is a different type of object. Table is relation, variable is just value. Variables should not have columns, so using the same patterns for tables and variables has no sense. Using the same catalog for variables and tables. Variables just hold a value, and then you can use it inside a query without necessity to write JOIN. Variables are not tables, and then it is not too confusing so they are not transactional and don't support more rows, more columns. The problem with collision can be solved very easily - just use a dedicated schema (only for variables) and don't use it in the search path. In this case, the unwanted collision is not too probable - although it is possible, if you use a schema name for a variable same like table name or alias name. I can use CREATE SCHEMA __; CREATE VARIABLE __.a AS int; SELECT __.a; although it is maybe wild, probably nobody will use alias or table name __ and then there should not be any problem > > Best, > > Wolfgang >
Re: Schema variables - new implementation for Postgres 15
st 22. 5. 2024 v 19:25 odesílatel Tom Lane napsal: > Peter Eisentraut writes: > > On 18.05.24 13:29, Alvaro Herrera wrote: > >> I want to note that when we discussed this patch series at the dev > >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > >> schema variables at all because of the fact that creating a variable > >> would potentially change the meaning of queries by shadowing table > >> columns. But this turns out to be incorrect: it's_variables_ that are > >> shadowed by table columns, not the other way around. > > > But that's still bad, because seemingly unrelated schema changes can > > make variables appear and disappear. For example, if you have > > SELECT a, b FROM table1 > > and then you drop column b, maybe the above query continues to work > > because there is also a variable b. > > Yeah, that seems pretty dangerous. Could we make it safe enough > by requiring some qualification on variable names? That is, if > you mean b to be a variable, then you must write something like > > SELECT a, pg_variables.b FROM table1 > > This is still ambiguous if you use "pg_variables" as a table alias in > the query, but the alias would win so the query still means what it > meant before. Also, table aliases (as opposed to actual table names) > don't change readily, so I don't think there's much risk of the query > suddenly meaning something different than it did yesterday. > With active shadowing variable warning for described example you will get a warning before dropping. Session variables are joined with schema (in my proposal). Do anybody can do just CREATE SCHEMA svars; -- or what (s)he likes CREATE VARIABLE svars.b AS int; SELECT a, b FROM table1 and if somebody can be really safe, the can write SELECT t.a, t.b FROM table1 t or SELECT t.a, svars.b FROM table1 t It can be customized in the way anybody prefers - just creating dedicated schemas and setting search_path. Using its own schema for session variables without enhancing search_path for this schema forces the necessity to set only qualified names for session variables. Sure the naming of schemas, aliases can be unhappy wrong, and there can be the problem. But this can be a problem today too. Regards Pavel > > regards, tom lane >
Re: Schema variables - new implementation for Postgres 15
st 22. 5. 2024 v 14:37 odesílatel Peter Eisentraut napsal: > On 18.05.24 13:29, Alvaro Herrera wrote: > > I want to note that when we discussed this patch series at the dev > > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > > schema variables at all because of the fact that creating a variable > > would potentially change the meaning of queries by shadowing table > > columns. But this turns out to be incorrect: it's_variables_ that are > > shadowed by table columns, not the other way around. > > But that's still bad, because seemingly unrelated schema changes can > make variables appear and disappear. For example, if you have > > SELECT a, b FROM table1 > > and then you drop column b, maybe the above query continues to work > because there is also a variable b. Or maybe it now does different > things because b is of a different type. This all has the potential to > be very confusing. > In the described case, the variable's shadowing warning will be raised. There are more cases where not well designed changes (just with tables) can break queries or change results. Adding columns can be a potential risk, creating tables or dropping tables (when the search path contains more schemas) too. Good practice is using well designed names and almost all use aliases or labels, and it is one way to minimize real risks. Personally I prefer a very strict mode that disallows shadowing, conflicts, ... but on second hand, for some usual work this strict mode can be boring, so we should find some good compromise. Regards Pavel
Re: Schema variables - new implementation for Postgres 15
Alvaro Herrera: Perhaps the solution to all this is to avoid having the variables be implicitly present in the range table of all queries. Instead, if you need a variable's value, then you need to add the variable to the FROM clause; +1 This should make it easier to work with composite type schema variables in some cases. It could also enable schema qualifying of schema variables, or at least make it easier to do, I think. In this case variables would share the same namespace as tables and views, right? So I could not create a variable with the same name as another table. Which is a good thing, I guess. Not sure how it's currently implemented in the patch. Best, Wolfgang
Re: Schema variables - new implementation for Postgres 15
On 2024-May-22, Dmitry Dolgov wrote: > Yeah, that's a bummer. Interestingly enough, the db2 implementation of > global session variables mechanism is mentioned as similar to what we > have in the patch. But weirdly, the db2 documentation just states > possibility of a resolution conflict for unqualified names, nothing > else. Perhaps the solution to all this is to avoid having the variables be implicitly present in the range table of all queries. Instead, if you need a variable's value, then you need to add the variable to the FROM clause; and if you try to read from the variable and the name conflicts with that of a column in one of the tables in the FROM clause, then you get an error that the name is ambiguous and invites to qualify it. Like, for instance, create table lefttab (a int, b int); create table righttab (c int, d int, b int); =# select b from lefttab, righttab; ERROR: column reference "b" is ambiguous LÍNEA 1: select b from lefttab, righttab; ^ but this works fine because there's no longer an ambiguity: select lefttab.b from lefttab, righttab; b ─── (0 filas) Nothing breaks if you create new variables, because your queries won't see them until you explicitly request them. And if you add add columns to either tables or variables, it's possible that some queries would start having ambiguous references, in which case they'll just stop working until you disambiguate by editing the query. Now, Pavel has been saying that variables are simple and cannot break queries (because they're always shadowed), which is why they're always implicitly visible to all queries[1]; but maybe that's a mistake. [1] https://postgr.es/m/cafj8pra2p7uafgpfjxvhrhftizbcn41j00breotspdd+urg...@mail.gmail.com -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La experiencia nos dice que el hombre peló millones de veces las patatas, pero era forzoso admitir la posibilidad de que en un caso entre millones, las patatas pelarían al hombre" (Ijon Tichy)
Re: Schema variables - new implementation for Postgres 15
Peter Eisentraut writes: > On 18.05.24 13:29, Alvaro Herrera wrote: >> I want to note that when we discussed this patch series at the dev >> meeting in FOSDEM, a sort-of conclusion was reached that we didn't want >> schema variables at all because of the fact that creating a variable >> would potentially change the meaning of queries by shadowing table >> columns. But this turns out to be incorrect: it's_variables_ that are >> shadowed by table columns, not the other way around. > But that's still bad, because seemingly unrelated schema changes can > make variables appear and disappear. For example, if you have > SELECT a, b FROM table1 > and then you drop column b, maybe the above query continues to work > because there is also a variable b. Yeah, that seems pretty dangerous. Could we make it safe enough by requiring some qualification on variable names? That is, if you mean b to be a variable, then you must write something like SELECT a, pg_variables.b FROM table1 This is still ambiguous if you use "pg_variables" as a table alias in the query, but the alias would win so the query still means what it meant before. Also, table aliases (as opposed to actual table names) don't change readily, so I don't think there's much risk of the query suddenly meaning something different than it did yesterday. regards, tom lane
Re: Schema variables - new implementation for Postgres 15
> On Wed, May 22, 2024 at 02:37:49PM +0200, Peter Eisentraut wrote: > On 18.05.24 13:29, Alvaro Herrera wrote: > > I want to note that when we discussed this patch series at the dev > > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > > schema variables at all because of the fact that creating a variable > > would potentially change the meaning of queries by shadowing table > > columns. But this turns out to be incorrect: it's_variables_ that are > > shadowed by table columns, not the other way around. > > But that's still bad, because seemingly unrelated schema changes can make > variables appear and disappear. For example, if you have > > SELECT a, b FROM table1 > > and then you drop column b, maybe the above query continues to work because > there is also a variable b. Or maybe it now does different things because b > is of a different type. This all has the potential to be very confusing. Yeah, that's a bummer. Interestingly enough, the db2 implementation of global session variables mechanism is mentioned as similar to what we have in the patch. But weirdly, the db2 documentation just states possibility of a resolution conflict for unqualified names, nothing else. There was extensive discussion about this problem early in the thread, and one alternative is to use some sort of special syntax every time when working with a variable to clear any ambiguity [1]. It's more verbose, has to be careful to not block some useful syntax for other stuff, etc. But as Pavel said: > The different syntax disallows any collision well, it is far to what is > more usual standard in this area. And if we introduce special syntax, then > there is no way back. We cannot use :varname - this syntax is used already, > but we can use, theoretically, @var or $var. But, personally, I don't want > to use it, if there is possibility to do without it. It seems to me there is no other possibility to resolve those ambiguity issues. [1]: https://www.postgresql.org/message-id/CAFj8pRD03hwZK%2B541KDt4Eo5YuC81CBBX_P0Sa5A7g5TQFsTww%40mail.gmail.com
Re: Schema variables - new implementation for Postgres 15
On 18.05.24 13:29, Alvaro Herrera wrote: I want to note that when we discussed this patch series at the dev meeting in FOSDEM, a sort-of conclusion was reached that we didn't want schema variables at all because of the fact that creating a variable would potentially change the meaning of queries by shadowing table columns. But this turns out to be incorrect: it's_variables_ that are shadowed by table columns, not the other way around. But that's still bad, because seemingly unrelated schema changes can make variables appear and disappear. For example, if you have SELECT a, b FROM table1 and then you drop column b, maybe the above query continues to work because there is also a variable b. Or maybe it now does different things because b is of a different type. This all has the potential to be very confusing.
Re: Schema variables - new implementation for Postgres 15
so 18. 5. 2024 v 18:31 odesílatel Alvaro Herrera napsal: > On 2024-Jan-30, Dmitry Dolgov wrote: > > > Yep, in this constellation the implementation holds much better (in > > terms of memory) in my create/let/drop testing. > > > > I've marked the CF item as ready for committer, but a note for anyone > > who would like to pick up it from here -- we're talking about first 5 > > patches here, up to the memory cleaning after DROP VARIABLE. It doesn't > > mean the rest is somehow not worth it, but I believe it's a good first > > step. > > Hmm, I think patch 16 is essential, because the point of variable shadowing > is a critical aspect of how the whole thing works. So I would say that > a first step would be those first five patches plus 16. > I'll move patch 16 to 6 position Regards Pavel > > I want to note that when we discussed this patch series at the dev > meeting in FOSDEM, a sort-of conclusion was reached that we didn't want > schema variables at all because of the fact that creating a variable > would potentially change the meaning of queries by shadowing table > columns. But this turns out to be incorrect: it's _variables_ that are > shadowed by table columns, not the other way around. > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ > "No hay ausente sin culpa ni presente sin disculpa" (Prov. francés) >
Re: Schema variables - new implementation for Postgres 15
On 2024-Jan-30, Dmitry Dolgov wrote: > Yep, in this constellation the implementation holds much better (in > terms of memory) in my create/let/drop testing. > > I've marked the CF item as ready for committer, but a note for anyone > who would like to pick up it from here -- we're talking about first 5 > patches here, up to the memory cleaning after DROP VARIABLE. It doesn't > mean the rest is somehow not worth it, but I believe it's a good first > step. Hmm, I think patch 16 is essential, because the point of variable shadowing is a critical aspect of how the whole thing works. So I would say that a first step would be those first five patches plus 16. I want to note that when we discussed this patch series at the dev meeting in FOSDEM, a sort-of conclusion was reached that we didn't want schema variables at all because of the fact that creating a variable would potentially change the meaning of queries by shadowing table columns. But this turns out to be incorrect: it's _variables_ that are shadowed by table columns, not the other way around. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "No hay ausente sin culpa ni presente sin disculpa" (Prov. francés)
Re: Schema variables - new implementation for Postgres 15
út 30. 1. 2024 v 20:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > Yep, in this constellation the implementation holds much better (in > terms of memory) in my create/let/drop testing. > > I've marked the CF item as ready for committer, but a note for anyone > who would like to pick up it from here -- we're talking about first 5 > patches here, up to the memory cleaning after DROP VARIABLE. It doesn't > mean the rest is somehow not worth it, but I believe it's a good first > step. > Thank you very much Pavel
Re: Schema variables - new implementation for Postgres 15
Yep, in this constellation the implementation holds much better (in terms of memory) in my create/let/drop testing. I've marked the CF item as ready for committer, but a note for anyone who would like to pick up it from here -- we're talking about first 5 patches here, up to the memory cleaning after DROP VARIABLE. It doesn't mean the rest is somehow not worth it, but I believe it's a good first step.
Re: Schema variables - new implementation for Postgres 15
po 29. 1. 2024 v 19:36 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Mon, Jan 29, 2024 at 08:57:42AM +0100, Pavel Stehule wrote: > > Hi > > > > ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > > napsal: > > > > > Thanks for the update, smaller patches looks promising. > > > > > > Off the list Pavel has mentioned that the first two patches contain a > > > bare minimum for session variables, so I've reviewed them once more and > > > suggest to concentrate on them first. I'm afraid the memory cleanup > > > patch has to be added to the "bare minimum" set as well -- otherwise in > > > my tests it was too easy to run out of memory via creating, assigning > > > and dropping variables. Unfortunately one can't extract those three > > > patches from the series and apply only them, the memory patch would > have > > > some conflicts. Can you maybe reshuffle the series to have those > patches > > > (1, 2 + 8) as first three? > > > > > > > probably you need too > > > > 0006-function-pg_session_variables-for-cleaning-tests.patch and > > 0007-DISCARD-VARIABLES.patch > > > > 6 is necessary for testing of cleaning > > Ok, let me take a look at those. Unless there are any objections, my > plan would be to give it a final check and mark the CF item as ready for > committer -- meaning the first 5 patches. > sure. Thank you very much. Pavel
Re: Schema variables - new implementation for Postgres 15
> On Mon, Jan 29, 2024 at 08:57:42AM +0100, Pavel Stehule wrote: > Hi > > ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > > > Thanks for the update, smaller patches looks promising. > > > > Off the list Pavel has mentioned that the first two patches contain a > > bare minimum for session variables, so I've reviewed them once more and > > suggest to concentrate on them first. I'm afraid the memory cleanup > > patch has to be added to the "bare minimum" set as well -- otherwise in > > my tests it was too easy to run out of memory via creating, assigning > > and dropping variables. Unfortunately one can't extract those three > > patches from the series and apply only them, the memory patch would have > > some conflicts. Can you maybe reshuffle the series to have those patches > > (1, 2 + 8) as first three? > > > > probably you need too > > 0006-function-pg_session_variables-for-cleaning-tests.patch and > 0007-DISCARD-VARIABLES.patch > > 6 is necessary for testing of cleaning Ok, let me take a look at those. Unless there are any objections, my plan would be to give it a final check and mark the CF item as ready for committer -- meaning the first 5 patches.
Re: Schema variables - new implementation for Postgres 15
> On Sun, Jan 28, 2024 at 08:34:40PM +0100, Pavel Stehule wrote: > > * I've noticed an interesting result when a LET statement is used to > > assign a > > value without a subquery: > > > > create variable test as text; > > -- returns NULL > > select test; > > > > -- use repeat directly without a subquery > > let test = repeat("test", 10); > > > > -- returns NULL > > select test; > > > > I was expecting to see an error here, is this a correct behaviour? > > > > what is strange on this result? Never mind, I've got confused about the quotes here -- it was referring to the variable content, not a string.
Re: Schema variables - new implementation for Postgres 15
ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > Thanks for the update, smaller patches looks promising. > > Off the list Pavel has mentioned that the first two patches contain a > bare minimum for session variables, so I've reviewed them once more and > suggest to concentrate on them first. I'm afraid the memory cleanup > patch has to be added to the "bare minimum" set as well -- otherwise in > my tests it was too easy to run out of memory via creating, assigning > and dropping variables. Unfortunately one can't extract those three > patches from the series and apply only them, the memory patch would have > some conflicts. Can you maybe reshuffle the series to have those patches > (1, 2 + 8) as first three? > > If that's possible, my proposal would be to proceed with them first. To the > best of my knowledge they look good to me, except few minor details: > > * The documentation says in a couple of places (ddl.sgml, > create_variable.sgml) that "Retrieving a session variable's value > returns either a NULL or a default value", but as far as I see the > default value feature is not implemented within first two patches. > > * Similar with mentioning immutable session variables in plpgsql.sgml . > > * Commentary to LookupVariable mentions a rowtype_only argument: > > +/* > + * Returns oid of session variable specified by possibly > qualified identifier. > + * > + * If not found, returns InvalidOid if missing_ok, else throws > error. > + * When rowtype_only argument is true the session variables of not > + * composite types are ignored. This should to reduce possible > collisions. > + */ > +Oid > +LookupVariable(const char *nspname, > + const char *varname, > + bool missing_ok) > > but the function doesn't have it. > > * I've noticed an interesting result when a LET statement is used to > assign a > value without a subquery: > > create variable test as text; > -- returns NULL > select test; > > -- use repeat directly without a subquery > let test = repeat("test", 10); > > -- returns NULL > select test; > > I was expecting to see an error here, is this a correct behaviour? > what is strange on this result? (2024-01-28 20:32:05) postgres=# let test = 'ab'; LET (2024-01-28 20:32:12) postgres=# let test = repeat("test", 10); LET (2024-01-28 20:32:19) postgres=# select test; ┌──┐ │ test │ ╞══╡ │ abababababababababab │ └──┘ (1 row) (2024-01-28 20:32:21) postgres=# let test = null; LET (2024-01-28 20:32:48) postgres=# let test = repeat("test", 10); LET (2024-01-28 20:32:51) postgres=# select test; ┌──┐ │ test │ ╞══╡ │ ∅│ └──┘ (1 row) (2024-01-28 20:32:53) postgres=# select repeat(test, 10); ┌┐ │ repeat │ ╞╡ │ ∅ │ └┘ (1 row) "repeat" is the usual scalar function. Maybe you thought different function
Re: Schema variables - new implementation for Postgres 15
Thanks for the update, smaller patches looks promising. Off the list Pavel has mentioned that the first two patches contain a bare minimum for session variables, so I've reviewed them once more and suggest to concentrate on them first. I'm afraid the memory cleanup patch has to be added to the "bare minimum" set as well -- otherwise in my tests it was too easy to run out of memory via creating, assigning and dropping variables. Unfortunately one can't extract those three patches from the series and apply only them, the memory patch would have some conflicts. Can you maybe reshuffle the series to have those patches (1, 2 + 8) as first three? If that's possible, my proposal would be to proceed with them first. To the best of my knowledge they look good to me, except few minor details: * The documentation says in a couple of places (ddl.sgml, create_variable.sgml) that "Retrieving a session variable's value returns either a NULL or a default value", but as far as I see the default value feature is not implemented within first two patches. * Similar with mentioning immutable session variables in plpgsql.sgml . * Commentary to LookupVariable mentions a rowtype_only argument: +/* + * Returns oid of session variable specified by possibly qualified identifier. + * + * If not found, returns InvalidOid if missing_ok, else throws error. + * When rowtype_only argument is true the session variables of not + * composite types are ignored. This should to reduce possible collisions. + */ +Oid +LookupVariable(const char *nspname, + const char *varname, + bool missing_ok) but the function doesn't have it. * I've noticed an interesting result when a LET statement is used to assign a value without a subquery: create variable test as text; -- returns NULL select test; -- use repeat directly without a subquery let test = repeat("test", 10); -- returns NULL select test; I was expecting to see an error here, is this a correct behaviour?
Re: Schema variables - new implementation for Postgres 15
> On Sun, Dec 03, 2023 at 06:04:12AM +0100, Pavel Stehule wrote: > > Thank you for your review. Next two weeks I'll not too much time to work > on this patch - I have to work on some commercial work, and the week is > Prague PgConf, so my reply will be slow. But after these events I'll > concentrate on this patch. No worries, it's fine. Have fun at PGConf!
Re: Schema variables - new implementation for Postgres 15
Hi ne 26. 11. 2023 v 18:56 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote: > > so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > > napsal: > > > As a side note, I'm intended to go one more time through the first few > > > patches introducing the basic functionality, and then mark it as ready > > > in CF. I can't break the patch in testing since quite long time, and > for > > > most parts the changes make sense to me. > > > > I marked pg_session_variables function as PARALLEL RESTRICTED, and did > > rebase > > So, after one week of uninterrupted evening reviews I've made it through > the first four patches :) > > It's a decent job -- more than once, looking at the code, I thought I > could construct a case when it's going to blow up, but everything was > working just fine. Yet, I think the patch still has to be reshaped a bit > before moving forward. I've got a couple proposals of different nature: > high level changes (you probably won't like some of them, but I'm sure > they're going to be useful), technical code-level improvements/comments, > and few language changes. With those changes in mind I would be > satisfied with the patch, and hopefully they would also make it easier > for a potential committer to pick it up. > > # High level proposals > > * I would suggest reducing the scope of the patch as much as possible, > and not just by trimming on the edges, but rather following Phileas > Fogg's example with the steamboat Henrietta -- get rid of all > non-essential parts. This will make this rather large patch more > approachable for others. > > For that one can concentrate only on the first two patches plus the > fourth one (memory cleanup after dropping variables), leaving DISCARD, > ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the > future. > > Another thing in this context would be to evaluate plpgsql support for > this feature. You know the use case better than me, how important it > is? Is it an intrinsic part of the feature, or session variables could > be still valuable enough even without plpgsql? From what I see > postponing plgpsql will make everything about ~800 lines lighter (most > likely more), and also allow to ignore couple of concerns about the > implementation (about this later). > > * The new GUC session_variables_ambiguity_warning is definitely going to > cause many objections, it's another knob to manage very subtle > behaviour detail very few people will ever notice. I see the point > behind warning about ambiguity, so probably it makes sense to bite the > bullet and decide one way or another. The proposal is to warn always > in potentially ambiguous situations, and if concerns are high about > logging too much, maybe do the warning on lower logging levels. > > # Code-level observations > > * It feels a bit awkward to have varid assignment logic in a separate > function, what about adding an argument with varid to > CreateVariableDestReceiver? SetVariableDestReceiverVarid still could > be used for CreateDestReceiver. > > /* > * Initially create a DestReceiver object. > */ > DestReceiver * > CreateVariableDestReceiver(void) > > /* > * Set parameters for a VariableDestReceiver. > * Should be called right after creating the DestReceiver. > */ > void > SetVariableDestReceiverVarid(DestReceiver *self, Oid varid) > > * It's worth it to add a commentary here explaining why it's fine to use > InvalidOid here: > > if (pstmt->commandType != CMD_UTILITY) > - ExplainOnePlan(pstmt, into, es, query_string, paramLI, > queryEnv, > + ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, > paramLI, queryEnv, >&planduration, (es->buffers ? &bufusage : > NULL)); > > My understanding is that since LetStmt is CMD_UTILITY, this branch > will never be visited for a session variable. > > * IIUC this one is introduced to exclude session variables from the normal > path with EXPR_KIND_UPDATE_TARGET: > > + EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target - > disallow > +* session > variables */ > > But the name doesn't sound right, maybe longer > EXPR_KIND_UPDATE_TARGET_NO_VARS is better? > > * I'm curious about this one, which exactly part does this change cover? > > @@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node, > - if (param->paramkind != PARAM_EXTERN) > + if (param->paramkind != PARAM_EXTERN && > + param->paramkind != PARAM_VARIABLE) > elog(ERROR, "unexpected paramkind: %d", (int) > param->paramkind); > > I've commented it out, but no tests were affected. > > * Does it mean there could be theoretically two LET statements at the > same time with diffe
Re: Schema variables - new implementation for Postgres 15
Hi st 22. 11. 2023 v 7:20 odesílatel Julien Rouhaud napsal: > Hi, > > On Tue, Oct 17, 2023 at 08:52:13AM +0200, Pavel Stehule wrote: > > > > When I thought about global temporary tables, I got one maybe interesting > > idea. The one significant problem of global temporary tables is place for > > storing info about size or column statistics. > > > > I think so these data can be stored simply in session variables. Any > global > > temporary table can get assigned one session variable, that can hold > these > > data. > > I don't know how realistic this would be. For instance it will require to > properly link the global temporary table life cycle with the session > variable > and I'm afraid it would require to add some hacks to make it work as > needed. > > But this still raises the question of whether this feature could be used > internally for the need of another feature. If we think it's likely, > should we > try to act right now and reserve the "pg_" prefix for internal use rather > than > do that a few years down the line and probably break some user code as it > was > done recently for the role names? > I don't think it is necessary. Session variables (in this design) are joined with schemas. If we use some session variables for system purposes, we can use some dedicated schema. But when I think about it in detail, probably my own dedicated storage (hash table in session memory) can be much better than session variables. What can be shared (maybe) is probably sinval message processing.
Re: Schema variables - new implementation for Postgres 15
> On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote: > so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > > As a side note, I'm intended to go one more time through the first few > > patches introducing the basic functionality, and then mark it as ready > > in CF. I can't break the patch in testing since quite long time, and for > > most parts the changes make sense to me. > > I marked pg_session_variables function as PARALLEL RESTRICTED, and did > rebase So, after one week of uninterrupted evening reviews I've made it through the first four patches :) It's a decent job -- more than once, looking at the code, I thought I could construct a case when it's going to blow up, but everything was working just fine. Yet, I think the patch still has to be reshaped a bit before moving forward. I've got a couple proposals of different nature: high level changes (you probably won't like some of them, but I'm sure they're going to be useful), technical code-level improvements/comments, and few language changes. With those changes in mind I would be satisfied with the patch, and hopefully they would also make it easier for a potential committer to pick it up. # High level proposals * I would suggest reducing the scope of the patch as much as possible, and not just by trimming on the edges, but rather following Phileas Fogg's example with the steamboat Henrietta -- get rid of all non-essential parts. This will make this rather large patch more approachable for others. For that one can concentrate only on the first two patches plus the fourth one (memory cleanup after dropping variables), leaving DISCARD, ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the future. Another thing in this context would be to evaluate plpgsql support for this feature. You know the use case better than me, how important it is? Is it an intrinsic part of the feature, or session variables could be still valuable enough even without plpgsql? From what I see postponing plgpsql will make everything about ~800 lines lighter (most likely more), and also allow to ignore couple of concerns about the implementation (about this later). * The new GUC session_variables_ambiguity_warning is definitely going to cause many objections, it's another knob to manage very subtle behaviour detail very few people will ever notice. I see the point behind warning about ambiguity, so probably it makes sense to bite the bullet and decide one way or another. The proposal is to warn always in potentially ambiguous situations, and if concerns are high about logging too much, maybe do the warning on lower logging levels. # Code-level observations * It feels a bit awkward to have varid assignment logic in a separate function, what about adding an argument with varid to CreateVariableDestReceiver? SetVariableDestReceiverVarid still could be used for CreateDestReceiver. /* * Initially create a DestReceiver object. */ DestReceiver * CreateVariableDestReceiver(void) /* * Set parameters for a VariableDestReceiver. * Should be called right after creating the DestReceiver. */ void SetVariableDestReceiverVarid(DestReceiver *self, Oid varid) * It's worth it to add a commentary here explaining why it's fine to use InvalidOid here: if (pstmt->commandType != CMD_UTILITY) - ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv, + ExplainOnePlan(pstmt, into, InvalidOid, es, query_string, paramLI, queryEnv, &planduration, (es->buffers ? &bufusage : NULL)); My understanding is that since LetStmt is CMD_UTILITY, this branch will never be visited for a session variable. * IIUC this one is introduced to exclude session variables from the normal path with EXPR_KIND_UPDATE_TARGET: + EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target - disallow +* session variables */ But the name doesn't sound right, maybe longer EXPR_KIND_UPDATE_TARGET_NO_VARS is better? * I'm curious about this one, which exactly part does this change cover? @@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node, - if (param->paramkind != PARAM_EXTERN) + if (param->paramkind != PARAM_EXTERN && + param->paramkind != PARAM_VARIABLE) elog(ERROR, "unexpected paramkind: %d", (int) param->paramkind); I've commented it out, but no tests were affected. * Does it mean there could be theoretically two LET statements at the same time with different command type, one CMD_UTILITY, one CMD_SELECT? Can it cause any issues? + /* +* Inside PL/pgSQL we don't want to execute LET statement as utility +* command, because it disallow to execute expression as simple +* expression. So for PL/pgSQL we have
Re: Schema variables - new implementation for Postgres 15
Hi, On Tue, Oct 17, 2023 at 08:52:13AM +0200, Pavel Stehule wrote: > > When I thought about global temporary tables, I got one maybe interesting > idea. The one significant problem of global temporary tables is place for > storing info about size or column statistics. > > I think so these data can be stored simply in session variables. Any global > temporary table can get assigned one session variable, that can hold these > data. I don't know how realistic this would be. For instance it will require to properly link the global temporary table life cycle with the session variable and I'm afraid it would require to add some hacks to make it work as needed. But this still raises the question of whether this feature could be used internally for the need of another feature. If we think it's likely, should we try to act right now and reserve the "pg_" prefix for internal use rather than do that a few years down the line and probably break some user code as it was done recently for the role names?
Re: Schema variables - new implementation for Postgres 15
> On Sat, Nov 18, 2023 at 02:19:09PM +0100, Pavel Stehule wrote: > > Would it be a problem to make pg_session_variables inspect the catalog > > or something similar if needed? > > > > It can be very easy to build pg_session_variables based on iteration over > the system catalog. But I am not sure if we want it. pg_session_variables() > is designed to show the variables from session memory, and it is used for > testing. Originally it was named pg_debug_session_variables. If we iterate > over catalog, it means using locks, and it can have an impact on isolation > tests. I see, thanks for clarification. In the end one can check the catalog directly of course, is there any other value in this function except for debugging purposes? As a side note, I'm intended to go one more time through the first few patches introducing the basic functionality, and then mark it as ready in CF. I can't break the patch in testing since quite long time, and for most parts the changes make sense to me.
Re: Schema variables - new implementation for Postgres 15
>> >> The difference between debug_parallel_query = 1 and debug_parallel_query >> = 0 is strange - and I'll check it. >> > > looks so pg_session_variables() doesn't work in debug_paralel_query mode. > It is marked as parallel safe, what is probably nonsense.
Re: Schema variables - new implementation for Postgres 15
so 18. 11. 2023 v 14:19 odesílatel Pavel Stehule napsal: > Hi > > pá 17. 11. 2023 v 20:17 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > >> > On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote: >> > NameListToString is already buildin function. Do you think >> NamesFromList? >> > >> > This is my oversight - there is just `+extern List *NamesFromList(List >> > *names); ` line, but sure - it should be in 0002 patch >> > >> > fixed now >> >> Right, thanks for fixing. >> >> I think there is a wrinkle with pg_session_variables function. It >> returns nothing if sessionvars hash table is empty, which has two >> consequences: >> >> * One might get confused about whether a variable is created, >> based on the information from the function. An expected behaviour, but >> could be considered a bad UX. >> >> =# CREATE VARIABLE var1 AS varchar; >> >> -- empty, is expected >> =# SELECT name, typname, can_select, can_update FROM >> pg_session_variables(); >> name | typname | can_select | can_update >> --+-++ >> (0 rows) >> >> -- but one can't create a variable >> =# CREATE VARIABLE var1 AS varchar; >> ERROR: 42710: session variable "var1" already exists >> LOCATION: create_variable, pg_variable.c:102 >> >> -- yet, suddenly after a select... >> =# SELECT var2; >> var2 >> -- >> NULL >> (1 row) >> >> -- ... it's not empty >> =# SELECT name, typname, can_select, can_update FROM pg_sessio >> n_variables(); >> name | typname | can_select | can_update >> --+---++ >> var2 | character varying | t | t >> (1 row) >> >> * Running a parallel query will end up returning an empty result even >> after accessing the variable. >> >> -- debug_parallel_query = 1 all the time >> =# CREATE VARIABLE var2 AS varchar; >> >> -- empty, is expected >> =# SELECT name, typname, can_select, can_update FROM >> pg_session_variables(); >> name | typname | can_select | can_update >> --+-++ >> (0 rows) >> >> -- but this time an access... >> SELECT var2; >> var2 >> -- >> NULL >> (1 row) >> >> -- or set... >> =# LET var2 = 'test'; >> >> -- doesn't change the result, it's still empty >> =# SELECT name, typname, can_select, can_update FROM >> pg_session_variables(); >> name | typname | can_select | can_update >> --+-++ >> (0 rows) >> >> Would it be a problem to make pg_session_variables inspect the catalog >> or something similar if needed? >> > > It can be very easy to build pg_session_variables based on iteration over > the system catalog. But I am not sure if we want it. pg_session_variables() > is designed to show the variables from session memory, and it is used for > testing. Originally it was named pg_debug_session_variables. If we iterate > over catalog, it means using locks, and it can have an impact on isolation > tests. > > So maybe we can introduce a parameter for this function to show all > session variables (based on catalog) or only used based on iteration over > memory. Default can be "all". What do you think about it? > > The difference between debug_parallel_query = 1 and debug_parallel_query = > 0 is strange - and I'll check it. > looks so pg_session_variables() doesn't work in debug_paralel_query mode.
Re: Schema variables - new implementation for Postgres 15
Hi pá 17. 11. 2023 v 20:17 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote: > > NameListToString is already buildin function. Do you think NamesFromList? > > > > This is my oversight - there is just `+extern List *NamesFromList(List > > *names); ` line, but sure - it should be in 0002 patch > > > > fixed now > > Right, thanks for fixing. > > I think there is a wrinkle with pg_session_variables function. It > returns nothing if sessionvars hash table is empty, which has two > consequences: > > * One might get confused about whether a variable is created, > based on the information from the function. An expected behaviour, but > could be considered a bad UX. > > =# CREATE VARIABLE var1 AS varchar; > > -- empty, is expected > =# SELECT name, typname, can_select, can_update FROM > pg_session_variables(); > name | typname | can_select | can_update > --+-++ > (0 rows) > > -- but one can't create a variable > =# CREATE VARIABLE var1 AS varchar; > ERROR: 42710: session variable "var1" already exists > LOCATION: create_variable, pg_variable.c:102 > > -- yet, suddenly after a select... > =# SELECT var2; > var2 > -- > NULL > (1 row) > > -- ... it's not empty > =# SELECT name, typname, can_select, can_update FROM pg_sessio > n_variables(); > name | typname | can_select | can_update > --+---++ > var2 | character varying | t | t > (1 row) > > * Running a parallel query will end up returning an empty result even > after accessing the variable. > > -- debug_parallel_query = 1 all the time > =# CREATE VARIABLE var2 AS varchar; > > -- empty, is expected > =# SELECT name, typname, can_select, can_update FROM > pg_session_variables(); > name | typname | can_select | can_update > --+-++ > (0 rows) > > -- but this time an access... > SELECT var2; > var2 > -- > NULL > (1 row) > > -- or set... > =# LET var2 = 'test'; > > -- doesn't change the result, it's still empty > =# SELECT name, typname, can_select, can_update FROM > pg_session_variables(); > name | typname | can_select | can_update > --+-++ > (0 rows) > > Would it be a problem to make pg_session_variables inspect the catalog > or something similar if needed? > It can be very easy to build pg_session_variables based on iteration over the system catalog. But I am not sure if we want it. pg_session_variables() is designed to show the variables from session memory, and it is used for testing. Originally it was named pg_debug_session_variables. If we iterate over catalog, it means using locks, and it can have an impact on isolation tests. So maybe we can introduce a parameter for this function to show all session variables (based on catalog) or only used based on iteration over memory. Default can be "all". What do you think about it? The difference between debug_parallel_query = 1 and debug_parallel_query = 0 is strange - and I'll check it.
Re: Schema variables - new implementation for Postgres 15
> On Wed, Aug 23, 2023 at 04:02:44PM +0200, Pavel Stehule wrote: > NameListToString is already buildin function. Do you think NamesFromList? > > This is my oversight - there is just `+extern List *NamesFromList(List > *names); ` line, but sure - it should be in 0002 patch > > fixed now Right, thanks for fixing. I think there is a wrinkle with pg_session_variables function. It returns nothing if sessionvars hash table is empty, which has two consequences: * One might get confused about whether a variable is created, based on the information from the function. An expected behaviour, but could be considered a bad UX. =# CREATE VARIABLE var1 AS varchar; -- empty, is expected =# SELECT name, typname, can_select, can_update FROM pg_session_variables(); name | typname | can_select | can_update --+-++ (0 rows) -- but one can't create a variable =# CREATE VARIABLE var1 AS varchar; ERROR: 42710: session variable "var1" already exists LOCATION: create_variable, pg_variable.c:102 -- yet, suddenly after a select... =# SELECT var2; var2 -- NULL (1 row) -- ... it's not empty =# SELECT name, typname, can_select, can_update FROM pg_sessio n_variables(); name | typname | can_select | can_update --+---++ var2 | character varying | t | t (1 row) * Running a parallel query will end up returning an empty result even after accessing the variable. -- debug_parallel_query = 1 all the time =# CREATE VARIABLE var2 AS varchar; -- empty, is expected =# SELECT name, typname, can_select, can_update FROM pg_session_variables(); name | typname | can_select | can_update --+-++ (0 rows) -- but this time an access... SELECT var2; var2 -- NULL (1 row) -- or set... =# LET var2 = 'test'; -- doesn't change the result, it's still empty =# SELECT name, typname, can_select, can_update FROM pg_session_variables(); name | typname | can_select | can_update --+-++ (0 rows) Would it be a problem to make pg_session_variables inspect the catalog or something similar if needed?
Re: Schema variables - new implementation for Postgres 15
Hi When I thought about global temporary tables, I got one maybe interesting idea. The one significant problem of global temporary tables is place for storing info about size or column statistics. I think so these data can be stored simply in session variables. Any global temporary table can get assigned one session variable, that can hold these data. Regards Pavel
Re: Schema variables - new implementation for Postgres 15
On Sat, Aug 12, 2023 at 01:20:03PM +0200, Dmitry Dolgov wrote: > > On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote: > > On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote: > > > > > > Another confusing example was this one at the end of set_session_variable: > > > > > > + /* > > > + * XXX While unlikely, an error here is possible. It wouldn't > > > leak memory > > > + * as the allocated chunk has already been correctly assigned > > > to the > > > + * session variable, but would contradict this function > > > contract, which is > > > + * that this function should either succeed or leave the > > > current value > > > + * untouched. > > > + */ > > > + elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new > > > value", > > > + > > > get_namespace_name(get_session_variable_namespace(svar->varid)), > > > + get_session_variable_name(svar->varid), > > > + svar->varid); > > > > > > It's not clear, which exactly error you're talking about, it's the last > > > instruction in the function. > > > > FTR I think I'm the one that changed that. The error I was talking about is > > elog() itself (in case of OOM for instance), or even one of the get_* call, > > if > > running with log_level <= DEBUG1. It's clearly really unlikely but still > > possible, thus this comment which also tries to explain why this elog() is > > not > > done earlier. > > I see, thanks for clarification. Absolutely nitpicking, but the crucial > "that's why this elog is not done earlier" is only assumed in the > comment between the lines, not stated out loud :) Well, yes although to be fair the original version of this had a prior comment that was making it much more obvious: + /* +* No error should happen after this poiht, otherwise we could leak the +* newly allocated value if any. +*/ (which would maybe have been better said "Nothing that can error out should be called after that point"). After quite a lot of patch revisions it now simply says: + /* We can overwrite old variable now. No error expected */ I agree that a bit more explanation is needed, and maybe also reminding that this is because all of that is done in a persistent memory context.
Re: Schema variables - new implementation for Postgres 15
> On Sat, Aug 12, 2023 at 09:28:19AM +0800, Julien Rouhaud wrote: > On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote: > > > > Another confusing example was this one at the end of set_session_variable: > > > > + /* > > +* XXX While unlikely, an error here is possible. It wouldn't > > leak memory > > +* as the allocated chunk has already been correctly assigned > > to the > > +* session variable, but would contradict this function > > contract, which is > > +* that this function should either succeed or leave the > > current value > > +* untouched. > > +*/ > > + elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new > > value", > > + > > get_namespace_name(get_session_variable_namespace(svar->varid)), > > +get_session_variable_name(svar->varid), > > +svar->varid); > > > > It's not clear, which exactly error you're talking about, it's the last > > instruction in the function. > > FTR I think I'm the one that changed that. The error I was talking about is > elog() itself (in case of OOM for instance), or even one of the get_* call, if > running with log_level <= DEBUG1. It's clearly really unlikely but still > possible, thus this comment which also tries to explain why this elog() is not > done earlier. I see, thanks for clarification. Absolutely nitpicking, but the crucial "that's why this elog is not done earlier" is only assumed in the comment between the lines, not stated out loud :)
Re: Schema variables - new implementation for Postgres 15
On Fri, Aug 11, 2023 at 05:55:26PM +0200, Dmitry Dolgov wrote: > > Another confusing example was this one at the end of set_session_variable: > > + /* > + * XXX While unlikely, an error here is possible. It wouldn't leak > memory > + * as the allocated chunk has already been correctly assigned to the > + * session variable, but would contradict this function contract, which > is > + * that this function should either succeed or leave the current value > + * untouched. > + */ > + elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value", > + > get_namespace_name(get_session_variable_namespace(svar->varid)), > + get_session_variable_name(svar->varid), > + svar->varid); > > It's not clear, which exactly error you're talking about, it's the last > instruction in the function. FTR I think I'm the one that changed that. The error I was talking about is elog() itself (in case of OOM for instance), or even one of the get_* call, if running with log_level <= DEBUG1. It's clearly really unlikely but still possible, thus this comment which also tries to explain why this elog() is not done earlier.
Re: Schema variables - new implementation for Postgres 15
> On Thu, Aug 03, 2023 at 08:15:13AM +0200, Pavel Stehule wrote: > Hi > > fresh rebase Thanks for continuing efforts. The new patch structure looks better to me (although the boundary between patches 0001 and 0002 is somewhat fuzzy, e.g. the function NameListToString is used already in the first one, but defined in the second). Couple of commentaries along the way: * Looks like it's common to use BKI_DEFAULT when defining catalog entities, something like BKI_DEFAULT(-1) for typmod, BKI_DEFAULT(0) for collation, etc. Does it make sense to put few default values into pg_variable as well? * The first patch contains: diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c @@ -2800,6 +2800,8 @@ AbortTransaction(void) AtAbort_Portals(); smgrDoPendingSyncs(false, is_parallel_worker); AtEOXact_LargeObject(false); + + /* 'false' means it's abort */ AtAbort_Notify(); AtEOXact_RelationMap(false, is_parallel_worker); AtAbort_Twophase(); What does the commentary refer to, is it needed? * I see ExplainOneQuery got a new argument: static void ExplainOneQuery(Query *query, int cursorOptions, - IntoClause *into, ExplainState *es, + IntoClause *into, Oid targetvar, ExplainState *es, const char *queryString, ParamListInfo params, QueryEnvironment *queryEnv); >From what I understand it represents a potential session variable to be explained. Isn't it too specific for this interface, could it be put somewhere else? To be honest, I don't have any suggestions myself, but it feels a bit out of place here. * Session variable validity logic is not always clear, at least to me, producing following awkward pieces of code: + if (!svar->is_valid) + { + if (is_session_variable_valid(svar)) + svar->is_valid = true; I get it as there are two ways how a variable could be invalid? * It's not always easy to follow which failure modes are taken care of. E.g. +* Don't try to use possibly invalid data from svar. And we don't want to +* overwrite invalid svar immediately. The datumCopy can fail, and in this +* case, the stored value will be invalid still. I couldn't find any similar precautions, how exactly datumCopy can fail, are you referring to palloc/memcpy failures? Another confusing example was this one at the end of set_session_variable: + /* +* XXX While unlikely, an error here is possible. It wouldn't leak memory +* as the allocated chunk has already been correctly assigned to the +* session variable, but would contradict this function contract, which is +* that this function should either succeed or leave the current value +* untouched. +*/ + elog(DEBUG1, "session variable \"%s.%s\" (oid:%u) has new value", + get_namespace_name(get_session_variable_namespace(svar->varid)), +get_session_variable_name(svar->varid), +svar->varid); It's not clear, which exactly error you're talking about, it's the last instruction in the function. Maybe it would be beneficial to have some overarching description, all in one place, about how session variables implementation handles various failures?
Re: Schema variables - new implementation for Postgres 15
On Thu, Mar 30, 2023 at 4:06 AM Pavel Stehule wrote: > Hi > > ne 26. 3. 2023 v 19:44 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > >> > On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote: >> > čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule < >> pavel.steh...@gmail.com> >> > napsal: >> > >> > > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut < >> > > peter.eisentr...@enterprisedb.com> napsal: >> > > >> > >> The other issue is that by its nature this patch adds a lot of code >> in a >> > >> lot of places. Large patches are more likely to be successful if >> they >> ... >> I agree, the patch scale is a bit overwhelming. It's worth noting that >> due to the nature of this change certain heavy lifting has to be done in >> any case, plus I've got an impression that some part of the patch are >> quite solid (although I haven't reviewed everything, did anyone achieve >> that milestone?). But still, it would be of great help to simplify the >> current implementation, and I'm afraid the only way of doing this is to >> make trades-off about functionality vs change size & complexity. >> > > There is not too much space for reduction - more - sometimes there is code > reuse between features. > > I can reduce temporary session variables, but the same AtSubXact routines > are used by memory purging routines, and if only if you drop all dependent > features, then you can get some interesting number of reduced lines. I can > imagine very reduced feature set like > > 1) no temporary variables, no reset at transaction end > 2) without default expressions - default is null > 3) direct memory cleaning on drop (without possibility of saved value > after reverted drop) or cleaning at session end always > > Note - @1 and @3 shares code > > Please don't remove #2. With Default Values, I was eyeballing these as pseudo constants. I find I have a DRY (Don't Repeat Yourself) issue in our current code base (PLPGSQL) because of the lack of shared constants throughout the application layer. We literally created a CONST schema with SQL functions that return a set value. It's kludgy, but clear enough. (We have approximately 50 of these). Regards, Kirk
Re: Schema variables - new implementation for Postgres 15
> > >> example >> >> create variable a as int; >> create table foo(a int); >> >> select a from foo; -- the "a" is ambiguous, variable "a" is shadowed >> >> This is a basic case, and the unique names don't help. The variables are >> more aggressive in namespace than tables, because they don't require be in >> FROM clause. This is the reason why we specify so variables are always >> shadowed. Only this behaviour is safe and robust. I cannot break any query >> (that doesn't use variables) by creating any variable. On second hand, an >> experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted >> shadowing can be hard to investigate (without some tools). >> >> PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL >> (now), and I think so it is best. But the scope of PLpgSQL variables is >> relatively small, so very strict behaviour is acceptable. >> >> The session variables are some between tables and attributes. The catalog >> pg_class can be enhanced about columns for variables, but it does a lot >> now, so I think it is not practical. >> >>> >>> I agree about shadowing schema variables. But is there no way to fix > that so that you can dereference the variable? > [Does an Alias work inside a procedure against a schema var?] > Does adding a schema prefix resolve it properly, so your example, I could > do: > SELECT schema_var.a AS var_a, a as COL_A from t; > Yes, using schema can fix collisions in almost all cases. There are some possible cases, when the schema name is the same as some variable name, and in these cases there can still be collisions (and still there is a possibility to use catalog.schema.object and it can fix a collision). You can use a qualified identifier and again in most cases it fixes collisions. These cases are tested in regression tests. Regards Pavel > Again, I like the default that it is hidden, but I can envision needing > both? > > Regards, Kirk >
Re: Schema variables - new implementation for Postgres 15
On Wed, Apr 5, 2023 at 1:58 PM Pavel Stehule wrote: > > > st 5. 4. 2023 v 19:20 odesílatel Greg Stark napsal: > >> On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud wrote: >> > >> > This feature can significantly increase log size, so it's disabled by >> default. >> > For testing or development environments it's recommended to enable it >> if you >> > use session variables. >> >> I think it's generally not practical to have warnings for valid DML. >> Effectively warnings in DML are errors since they make the syntax just >> unusable. I suppose it's feasible to have it as a debugging option >> that defaults to off but I'm not sure it's really useful. >> > > It is a tool that should help with collision detection. Without it, it > can be pretty hard to detect it. It is similar to plpgsql's extra warnings. > > >> I suppose it raises the question of whether session variables should >> be in pg_class and be in the same namespace as tables so that >> collisions are impossible. I haven't looked at the code to see if >> that's feasible or reasonable. But this feels a bit like what happened >> with sequences where they used to be a wholly special thing and later >> we realized everything was simpler if they were just a kind of >> relation. >> > > The first patch did it. But at the end, it doesn't reduce conflicts, > because usually the conflicts are between variables and table's attributes > (columns). > > example > > create variable a as int; > create table foo(a int); > > select a from foo; -- the "a" is ambiguous, variable "a" is shadowed > > This is a basic case, and the unique names don't help. The variables are > more aggressive in namespace than tables, because they don't require be in > FROM clause. This is the reason why we specify so variables are always > shadowed. Only this behaviour is safe and robust. I cannot break any query > (that doesn't use variables) by creating any variable. On second hand, an > experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted > shadowing can be hard to investigate (without some tools). > > PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL (now), > and I think so it is best. But the scope of PLpgSQL variables is relatively > small, so very strict behaviour is acceptable. > > The session variables are some between tables and attributes. The catalog > pg_class can be enhanced about columns for variables, but it does a lot > now, so I think it is not practical. > >> >> I agree about shadowing schema variables. But is there no way to fix that so that you can dereference the variable? [Does an Alias work inside a procedure against a schema var?] Does adding a schema prefix resolve it properly, so your example, I could do: SELECT schema_var.a AS var_a, a as COL_A from t; Again, I like the default that it is hidden, but I can envision needing both? Regards, Kirk
Re: Schema variables - new implementation for Postgres 15
On Thu, Apr 6, 2023 at 1:58 AM Pavel Stehule wrote: > > st 5. 4. 2023 v 19:20 odesílatel Greg Stark napsal: >> >> On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud wrote: >> > >> > This feature can significantly increase log size, so it's disabled by >> > default. >> > For testing or development environments it's recommended to enable it if >> > you >> > use session variables. >> >> I think it's generally not practical to have warnings for valid DML. >> Effectively warnings in DML are errors since they make the syntax just >> unusable. I suppose it's feasible to have it as a debugging option >> that defaults to off but I'm not sure it's really useful. > > > It is a tool that should help with collision detection. Without it, it can > be pretty hard to detect it. It is similar to plpgsql's extra warnings. Another example is escape_string_warning, which can also emit warning for valid DML. I once had to fix some random framework that a previous employer was using, in order to move to a more recent pg version and have standard_conforming_strings on, and having escape_string_warning was quite helpful.
Re: Schema variables - new implementation for Postgres 15
st 5. 4. 2023 v 19:20 odesílatel Greg Stark napsal: > On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud wrote: > > > > This feature can significantly increase log size, so it's disabled by > default. > > For testing or development environments it's recommended to enable it if > you > > use session variables. > > I think it's generally not practical to have warnings for valid DML. > Effectively warnings in DML are errors since they make the syntax just > unusable. I suppose it's feasible to have it as a debugging option > that defaults to off but I'm not sure it's really useful. > It is a tool that should help with collision detection. Without it, it can be pretty hard to detect it. It is similar to plpgsql's extra warnings. > I suppose it raises the question of whether session variables should > be in pg_class and be in the same namespace as tables so that > collisions are impossible. I haven't looked at the code to see if > that's feasible or reasonable. But this feels a bit like what happened > with sequences where they used to be a wholly special thing and later > we realized everything was simpler if they were just a kind of > relation. > The first patch did it. But at the end, it doesn't reduce conflicts, because usually the conflicts are between variables and table's attributes (columns). example create variable a as int; create table foo(a int); select a from foo; -- the "a" is ambiguous, variable "a" is shadowed This is a basic case, and the unique names don't help. The variables are more aggressive in namespace than tables, because they don't require be in FROM clause. This is the reason why we specify so variables are always shadowed. Only this behaviour is safe and robust. I cannot break any query (that doesn't use variables) by creating any variable. On second hand, an experience from Oracle's PL/SQL or from old PLpgSQL is, so unwanted shadowing can be hard to investigate (without some tools). PL/pgSQL doesn't allow conflict between PL/pgSQL variables, and SQL (now), and I think so it is best. But the scope of PLpgSQL variables is relatively small, so very strict behaviour is acceptable. The session variables are some between tables and attributes. The catalog pg_class can be enhanced about columns for variables, but it does a lot now, so I think it is not practical. Regards Pavel > > -- > greg >
Re: Schema variables - new implementation for Postgres 15
On Sun, 26 Mar 2023 at 07:34, Julien Rouhaud wrote: > > This feature can significantly increase log size, so it's disabled by default. > For testing or development environments it's recommended to enable it if you > use session variables. I think it's generally not practical to have warnings for valid DML. Effectively warnings in DML are errors since they make the syntax just unusable. I suppose it's feasible to have it as a debugging option that defaults to off but I'm not sure it's really useful. I suppose it raises the question of whether session variables should be in pg_class and be in the same namespace as tables so that collisions are impossible. I haven't looked at the code to see if that's feasible or reasonable. But this feels a bit like what happened with sequences where they used to be a wholly special thing and later we realized everything was simpler if they were just a kind of relation. -- greg
Re: Schema variables - new implementation for Postgres 15
> On Tue, Mar 28, 2023 at 09:34:20PM +0200, Pavel Stehule wrote: > Hi > > > Talking about documentation I've noticed that the implementation > > contains few limitations, that are not mentioned in the docs. Examples > > are WITH queries: > > > > WITH x AS (LET public.svar = 100) SELECT * FROM x; > > ERROR: LET not supported in WITH query > > > > The LET statement doesn't support the RETURNING clause, so using inside > CTE does not make any sense. > > Do you have some tips, where this behaviour should be mentioned? Yeah, you're right, it's probably not worth adding. I usually find it a good idea to explicitly mention any limitations, but WITH docs are actually have one line about statements without the RETURNING clause, plus indeed for LET it makes even less sense. > > and using with set-returning functions (haven't found any related tests). > > > > There it is: > > +CREATE VARIABLE public.svar AS int; > +-- should be ok > +LET public.svar = generate_series(1, 1); > +-- should fail > +LET public.svar = generate_series(1, 2); > +ERROR: expression returned more than one row > +LET public.svar = generate_series(1, 0); > +ERROR: expression returned no rows > +DROP VARIABLE public.svar; Oh, interesting. I was looking for another error message from parse_func.c: set-returning functions are not allowed in LET assignment expression Is this one you've posted somehow different? > > Another small note is about this change in the rowsecurity: > > > > /* > > -* For SELECT, UPDATE and DELETE, add security quals to enforce > > the USING > > -* policies. These security quals control access to existing > > table rows. > > -* Restrictive policies are combined together using AND, and > > permissive > > -* policies are combined together using OR. > > +* For SELECT, LET, UPDATE and DELETE, add security quals to > > enforce the > > +* USING policies. These security quals control access to > > existing table > > +* rows. Restrictive policies are combined together using AND, and > > +* permissive policies are combined together using OR. > > */ > > > > From this commentary one may think that LET command supports row level > > security, but I don't see it being implemented. A wrong commentary? > > > > I don't think so. The row level security should be supported. I tested it > on example from doc: > > [...] > > (2023-03-28 21:32:33) postgres=# set role to t1role; > SET > (2023-03-28 21:32:40) postgres=# select * from accounts ; > ┌─┬─┬┐ > │ manager │ company │ contact_email │ > ╞═╪═╪╡ > │ t1role │ xxx │ t1r...@xxx.org │ > └─┴─┴┘ > (1 row) > > (2023-03-28 21:32:45) postgres=# let v = (select company from accounts); > LET > (2023-03-28 21:32:58) postgres=# select v; > ┌─┐ > │ v │ > ╞═╡ > │ xxx │ > └─┘ > (1 row) > > (2023-03-28 21:33:03) postgres=# set role to default; > SET > (2023-03-28 21:33:12) postgres=# set role to t2role; > SET > (2023-03-28 21:33:19) postgres=# select * from accounts ; > ┌─┬─┬┐ > │ manager │ company │ contact_email │ > ╞═╪═╪╡ > │ t2role │ yyy │ t2r...@yyy.org │ > └─┴─┴┘ > (1 row) > > (2023-03-28 21:33:22) postgres=# let v = (select company from accounts); > LET > (2023-03-28 21:33:26) postgres=# select v; > ┌─┐ > │ v │ > ╞═╡ > │ yyy │ > └─┘ > (1 row) Hm, but isn't the row level security enforced here on the select level, not when assigning some value via LET? Plus, it seems the comment originally refer to the command types (CMD_SELECT, etc), and there is no CMD_LET and no need for it, right? I'm just trying to understand if there was anything special done for session variables in this regard, and if not, the commentary change seems to be not needed (I know, I know, it's totally nitpicking).
Re: Schema variables - new implementation for Postgres 15
On 30.03.23 10:49, Pavel Stehule wrote: If I reorganize the patch to the following structure, can be it useful for you? 1. really basic functionality (no temporary variables, no def expressions, no memory cleaning) SELECT variable LET should be supported + doc, + related tests. 2. support for temporary variables (session, transaction scope), memory cleaning at the end of transaction 3. PL/pgSQL support 4. pg_dump 5. shadowing warning 6. ... others ... That seems like an ok approach. The pg_dump support should probably go into the first patch, so it's self-contained.
Re: Schema variables - new implementation for Postgres 15
Hi st 29. 3. 2023 v 12:17 odesílatel Peter Eisentraut < peter.eisentr...@enterprisedb.com> napsal: > On 24.03.23 08:04, Pavel Stehule wrote: > > Maybe I can divide the patch 0002-session-variables to three sections - > > related to memory management, planning and execution? > > Personally, I find the existing split not helpful. There is no value > (to me) in putting code, documentation, and tests in three separate > patches. This is in fact counter-helpful (to me). Things like the > DISCARD command (0005) and the error messages changes (0009) can be > separate patches, but most of the rest should probably be a single patch. > > I know you have been asked earlier in the thread to provide smaller > patches, so don't change it just for me, but this is my opinion. > If I reorganize the patch to the following structure, can be it useful for you? 1. really basic functionality (no temporary variables, no def expressions, no memory cleaning) SELECT variable LET should be supported + doc, + related tests. 2. support for temporary variables (session, transaction scope), memory cleaning at the end of transaction 3. PL/pgSQL support 4. pg_dump 5. shadowing warning 6. ... others ... Can it be better for you? Regards Pavel
Re: Schema variables - new implementation for Postgres 15
Hi ne 26. 3. 2023 v 19:44 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote: > > čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule > > > napsal: > > > > > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut < > > > peter.eisentr...@enterprisedb.com> napsal: > > > > > >> The other issue is that by its nature this patch adds a lot of code > in a > > >> lot of places. Large patches are more likely to be successful if they > > >> add a lot of code in one place or smaller amounts of code in a lot of > > >> places. But this patch does both and it's just overwhelming. There > is > > >> so much new internal functionality and terminology. Variables can be > > >> created, registered, initialized, stored, copied, prepared, set, > freed, > > >> removed, released, synced, dropped, and more. I don't know if anyone > > >> has actually reviewed all that in detail. > > >> > > >> Has any effort been made to make this simpler, smaller, reduce scope, > > >> refactoring, find commonalities with other features, try to manage the > > >> complexity somehow? > > >> > > > I agree that this patch is large, but almost all code is simple. > Complex > > > code is "only" in 0002-session-variables.patch (113KB/438KB). > > > > > > Now, I have no idea how the functionality can be sensibly reduced or > > > divided (no without significant performance loss). I see two difficult > > > points in this code: > > > > > > 1. when to clean memory. The code implements cleaning very accurately - > > > and this is unique in Postgres. Partially I implement some > functionality of > > > storage manager. Probably no code from Postgres can be reused, because > > > there is not any support for global temporary objects. Cleaning based > on > > > sinval messages processing is difficult, but there is nothing else. > The > > > code is a little bit more complex, because there are three types of > session > > > variables: a) session variables, b) temp session variables, c) session > > > variables with transaction scope. Maybe @c can be removed, and maybe we > > > don't need to support not null default (this can simplify > initialization). > > > What do you think about it? > > > > > > 2. how to pass a variable's value to the executor. The implementation > is > > > based on extending the Param node, but it cannot reuse query params > buffers > > > and implements own. > > > But it is hard to simplify code, because we want to support usage > > > variables in queries, and usage in PL/pgSQL expressions too. And both > are > > > processed differently. > > > > > > > Maybe I can divide the patch 0002-session-variables to three sections - > > related to memory management, planning and execution? > > I agree, the patch scale is a bit overwhelming. It's worth noting that > due to the nature of this change certain heavy lifting has to be done in > any case, plus I've got an impression that some part of the patch are > quite solid (although I haven't reviewed everything, did anyone achieve > that milestone?). But still, it would be of great help to simplify the > current implementation, and I'm afraid the only way of doing this is to > make trades-off about functionality vs change size & complexity. > There is not too much space for reduction - more - sometimes there is code reuse between features. I can reduce temporary session variables, but the same AtSubXact routines are used by memory purging routines, and if only if you drop all dependent features, then you can get some interesting number of reduced lines. I can imagine very reduced feature set like 1) no temporary variables, no reset at transaction end 2) without default expressions - default is null 3) direct memory cleaning on drop (without possibility of saved value after reverted drop) or cleaning at session end always Note - @1 and @3 shares code This reduced implementation can still be useful. Probably it doesn't reduce too much code, but it can reduce non trivial code. I believe so almost all not reduced code will be almost trivial > > Maybe instead splitting the patch into implementation components, it's > possible to split it feature-by-feature, where every single patch would > represent an independent (to a certain degree) functionality? I have in > mind something like: catalog changes; base implementation; ACL support; > xact actions implementation (on commit drop, etc); variables with > default value; shadowing; etc. If such approach is possible, it will > give us: flexibility to apply only a subset of the whole patch series; > some understanding how much complexity is coming from each feature. What > do you think about this idea? > I think cleaning, dropping can be moved to a separate patch. ACL support uses generic support (it is only a few lines). The patch 02 can be splitted - I am not sure how these parts can be independent. I'll try to split this patch, and we will see if it will be better. > I also recall some
Re: Schema variables - new implementation for Postgres 15
On 24.03.23 08:04, Pavel Stehule wrote: Maybe I can divide the patch 0002-session-variables to three sections - related to memory management, planning and execution? Personally, I find the existing split not helpful. There is no value (to me) in putting code, documentation, and tests in three separate patches. This is in fact counter-helpful (to me). Things like the DISCARD command (0005) and the error messages changes (0009) can be separate patches, but most of the rest should probably be a single patch. I know you have been asked earlier in the thread to provide smaller patches, so don't change it just for me, but this is my opinion.
Re: Schema variables - new implementation for Postgres 15
Hi ne 26. 3. 2023 v 19:53 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Sun, Mar 26, 2023 at 07:32:05PM +0800, Julien Rouhaud wrote: > > Hi, > > > > I just have a few minor wording improvements for the various comments / > > documentation you quoted. > > Talking about documentation I've noticed that the implementation > contains few limitations, that are not mentioned in the docs. Examples > are WITH queries: > > WITH x AS (LET public.svar = 100) SELECT * FROM x; > ERROR: LET not supported in WITH query > The LET statement doesn't support the RETURNING clause, so using inside CTE does not make any sense. Do you have some tips, where this behaviour should be mentioned? > and using with set-returning functions (haven't found any related tests). > There it is: +CREATE VARIABLE public.svar AS int; +-- should be ok +LET public.svar = generate_series(1, 1); +-- should fail +LET public.svar = generate_series(1, 2); +ERROR: expression returned more than one row +LET public.svar = generate_series(1, 0); +ERROR: expression returned no rows +DROP VARIABLE public.svar; > > Another small note is about this change in the rowsecurity: > > /* > -* For SELECT, UPDATE and DELETE, add security quals to enforce > the USING > -* policies. These security quals control access to existing > table rows. > -* Restrictive policies are combined together using AND, and > permissive > -* policies are combined together using OR. > +* For SELECT, LET, UPDATE and DELETE, add security quals to > enforce the > +* USING policies. These security quals control access to > existing table > +* rows. Restrictive policies are combined together using AND, and > +* permissive policies are combined together using OR. > */ > > From this commentary one may think that LET command supports row level > security, but I don't see it being implemented. A wrong commentary? > I don't think so. The row level security should be supported. I tested it on example from doc: CREATE TABLE public.accounts ( manager text, company text, contact_email text ); CREATE VARIABLE public.v AS text; COPY public.accounts (manager, company, contact_email) FROM stdin; t1role xxx t1r...@xxx.org t2role yyy t2r...@yyy.org \. CREATE POLICY account_managers ON public.accounts USING ((manager = CURRENT_USER)); ALTER TABLE public.accounts ENABLE ROW LEVEL SECURITY; GRANT SELECT,INSERT ON TABLE public.accounts TO t1role; GRANT SELECT,INSERT ON TABLE public.accounts TO t2role; GRANT ALL ON VARIABLE public.v TO t1role; GRANT ALL ON VARIABLE public.v TO t2role; [pavel@localhost postgresql.master]$ psql Assertions: on psql (16devel) Type "help" for help. (2023-03-28 21:32:33) postgres=# set role to t1role; SET (2023-03-28 21:32:40) postgres=# select * from accounts ; ┌─┬─┬┐ │ manager │ company │ contact_email │ ╞═╪═╪╡ │ t1role │ xxx │ t1r...@xxx.org │ └─┴─┴┘ (1 row) (2023-03-28 21:32:45) postgres=# let v = (select company from accounts); LET (2023-03-28 21:32:58) postgres=# select v; ┌─┐ │ v │ ╞═╡ │ xxx │ └─┘ (1 row) (2023-03-28 21:33:03) postgres=# set role to default; SET (2023-03-28 21:33:12) postgres=# set role to t2role; SET (2023-03-28 21:33:19) postgres=# select * from accounts ; ┌─┬─┬┐ │ manager │ company │ contact_email │ ╞═╪═╪╡ │ t2role │ yyy │ t2r...@yyy.org │ └─┴─┴┘ (1 row) (2023-03-28 21:33:22) postgres=# let v = (select company from accounts); LET (2023-03-28 21:33:26) postgres=# select v; ┌─┐ │ v │ ╞═╡ │ yyy │ └─┘ (1 row) Regards Pavel
Re: Schema variables - new implementation for Postgres 15
ne 26. 3. 2023 v 13:32 odesílatel Julien Rouhaud napsal: > Hi, > > I just have a few minor wording improvements for the various comments / > documentation you quoted. > > On Sun, Mar 26, 2023 at 08:53:49AM +0200, Pavel Stehule wrote: > > út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut < > > peter.eisentr...@enterprisedb.com> napsal: > > > > > - What is the purpose of struct Variable? It seems very similar to > > >FormData_pg_variable. At least a comment would be useful. > > > > > > > I wrote comment there: > > > > > > /* > > * The Variable struct is based on FormData_pg_variable struct. Against > > * FormData_pg_variable it can hold node of deserialized expression used > > * for calculation of default value. > > */ > > Did you mean "Unlike" rather than "Against"? > fixed > > > > 0002 > > > > > > expr_kind_allows_session_variables() should have some explanation > > > about criteria for determining which expression kinds should allow > > > variables. > > > > > > > I wrote comment there: > > > > /* > > * Returns true, when expression of kind allows using of > > * session variables. > > + * The session's variables can be used everywhere where > > + * can be used external parameters. Session variables > > + * are not allowed in DDL. Session's variables cannot be > > + * used in constraints. > > + * > > + * The identifier can be parsed as an session variable > > + * only in expression's kinds where session's variables > > + * are allowed. This is the primary usage of this function. > > + * > > + * Second usage of this function is for decision if > > + * an error message "column does not exist" or "column > > + * or variable does not exist" should be printed. When > > + * we are in expression, where session variables cannot > > + * be used, we raise the first form or error message. > > */ > > Maybe > > /* > * Returns true if the given expression kind is valid for session variables > * Session variables can be used everywhere where external parameters can > be > * used. Session variables are not allowed in DDL commands or in > constraints. > * > * An identifier can be parsed as a session variable only for expression > kinds > * where session variables are allowed. This is the primary usage of this > * function. > * > * Second usage of this function is to decide whether "column does not > exist" or > * "column or variable does not exist" error message should be printed. > * When we are in an expression where session variables cannot be used, we > raise > * the first form or error message. > */ > changed > > > > session_variables_ambiguity_warning: There needs to be more > > > information about this. The current explanation is basically just, > > > "warn if your query is confusing". Why do I want that? Why would I > > > not want that? What is the alternative? What are some examples? > > > Shouldn't there be a standard behavior without a need to configure > > > anything? > > > > > > > I enhanced this entry: > > > > + > > +The session variables can be shadowed by column references in a > > query. This > > +is an expected feature. The existing queries should not be > broken > > by creating > > +any session variable, because session variables are shadowed > > always if the > > +identifier is ambiguous. The variables should be named without > > possibility > > +to collision with identifiers of other database objects (column > > names or > > +record field names). The warnings enabled by setting > > session_variables_ambiguity_warning > > +should help with finding identifier's collisions. > > Maybe > > Session variables can be shadowed by column references in a query, this is > an > expected behavior. Previously working queries shouldn't error out by > creating > any session variable, so session variables are always shadowed if an > identifier > is ambiguous. Variables should be referenced using an unambiguous > identifier > without any possibility for a collision with identifier of other database > objects (column names or record fields names). The warning messages > emitted > when enabling session_variables_ambiguity_warning can > help > finding such identifier collision. > > > + > > + > > +This feature can significantly increase size of logs, and then > it > > is > > +disabled by default, but for testing or development > environments it > > +should be enabled. > > Maybe > > This feature can significantly increase log size, so it's disabled by > default. > For testing or development environments it's recommended to enable it if > you > use session variables. > replaced Thank you very much for these language correctures Regards Pavel p.s. I'll send updated patch after today or tomorrow - I have to fix broken dependency check after rebase
Re: Schema variables - new implementation for Postgres 15
> On Sun, Mar 26, 2023 at 07:32:05PM +0800, Julien Rouhaud wrote: > Hi, > > I just have a few minor wording improvements for the various comments / > documentation you quoted. Talking about documentation I've noticed that the implementation contains few limitations, that are not mentioned in the docs. Examples are WITH queries: WITH x AS (LET public.svar = 100) SELECT * FROM x; ERROR: LET not supported in WITH query and using with set-returning functions (haven't found any related tests). Another small note is about this change in the rowsecurity: /* -* For SELECT, UPDATE and DELETE, add security quals to enforce the USING -* policies. These security quals control access to existing table rows. -* Restrictive policies are combined together using AND, and permissive -* policies are combined together using OR. +* For SELECT, LET, UPDATE and DELETE, add security quals to enforce the +* USING policies. These security quals control access to existing table +* rows. Restrictive policies are combined together using AND, and +* permissive policies are combined together using OR. */ >From this commentary one may think that LET command supports row level security, but I don't see it being implemented. A wrong commentary?
Re: Schema variables - new implementation for Postgres 15
> On Fri, Mar 24, 2023 at 08:04:08AM +0100, Pavel Stehule wrote: > čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule > napsal: > > > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut < > > peter.eisentr...@enterprisedb.com> napsal: > > > >> The other issue is that by its nature this patch adds a lot of code in a > >> lot of places. Large patches are more likely to be successful if they > >> add a lot of code in one place or smaller amounts of code in a lot of > >> places. But this patch does both and it's just overwhelming. There is > >> so much new internal functionality and terminology. Variables can be > >> created, registered, initialized, stored, copied, prepared, set, freed, > >> removed, released, synced, dropped, and more. I don't know if anyone > >> has actually reviewed all that in detail. > >> > >> Has any effort been made to make this simpler, smaller, reduce scope, > >> refactoring, find commonalities with other features, try to manage the > >> complexity somehow? > >> > > I agree that this patch is large, but almost all code is simple. Complex > > code is "only" in 0002-session-variables.patch (113KB/438KB). > > > > Now, I have no idea how the functionality can be sensibly reduced or > > divided (no without significant performance loss). I see two difficult > > points in this code: > > > > 1. when to clean memory. The code implements cleaning very accurately - > > and this is unique in Postgres. Partially I implement some functionality of > > storage manager. Probably no code from Postgres can be reused, because > > there is not any support for global temporary objects. Cleaning based on > > sinval messages processing is difficult, but there is nothing else. The > > code is a little bit more complex, because there are three types of session > > variables: a) session variables, b) temp session variables, c) session > > variables with transaction scope. Maybe @c can be removed, and maybe we > > don't need to support not null default (this can simplify initialization). > > What do you think about it? > > > > 2. how to pass a variable's value to the executor. The implementation is > > based on extending the Param node, but it cannot reuse query params buffers > > and implements own. > > But it is hard to simplify code, because we want to support usage > > variables in queries, and usage in PL/pgSQL expressions too. And both are > > processed differently. > > > > Maybe I can divide the patch 0002-session-variables to three sections - > related to memory management, planning and execution? I agree, the patch scale is a bit overwhelming. It's worth noting that due to the nature of this change certain heavy lifting has to be done in any case, plus I've got an impression that some part of the patch are quite solid (although I haven't reviewed everything, did anyone achieve that milestone?). But still, it would be of great help to simplify the current implementation, and I'm afraid the only way of doing this is to make trades-off about functionality vs change size & complexity. Maybe instead splitting the patch into implementation components, it's possible to split it feature-by-feature, where every single patch would represent an independent (to a certain degree) functionality? I have in mind something like: catalog changes; base implementation; ACL support; xact actions implementation (on commit drop, etc); variables with default value; shadowing; etc. If such approach is possible, it will give us: flexibility to apply only a subset of the whole patch series; some understanding how much complexity is coming from each feature. What do you think about this idea? I also recall somewhere earlier in the thread Pavel has mentioned that a transactional version of session variables patch would be actually simpler, and he has plans to implement it later on. Is there another trade-off on the table we could think of, transactional vs non-transactional session variables?
Re: Schema variables - new implementation for Postgres 15
Hi, I just have a few minor wording improvements for the various comments / documentation you quoted. On Sun, Mar 26, 2023 at 08:53:49AM +0200, Pavel Stehule wrote: > út 21. 3. 2023 v 17:18 odesílatel Peter Eisentraut < > peter.eisentr...@enterprisedb.com> napsal: > > > - What is the purpose of struct Variable? It seems very similar to > >FormData_pg_variable. At least a comment would be useful. > > > > I wrote comment there: > > > /* > * The Variable struct is based on FormData_pg_variable struct. Against > * FormData_pg_variable it can hold node of deserialized expression used > * for calculation of default value. > */ Did you mean "Unlike" rather than "Against"? > > 0002 > > > > expr_kind_allows_session_variables() should have some explanation > > about criteria for determining which expression kinds should allow > > variables. > > > > I wrote comment there: > > /* > * Returns true, when expression of kind allows using of > * session variables. > + * The session's variables can be used everywhere where > + * can be used external parameters. Session variables > + * are not allowed in DDL. Session's variables cannot be > + * used in constraints. > + * > + * The identifier can be parsed as an session variable > + * only in expression's kinds where session's variables > + * are allowed. This is the primary usage of this function. > + * > + * Second usage of this function is for decision if > + * an error message "column does not exist" or "column > + * or variable does not exist" should be printed. When > + * we are in expression, where session variables cannot > + * be used, we raise the first form or error message. > */ Maybe /* * Returns true if the given expression kind is valid for session variables * Session variables can be used everywhere where external parameters can be * used. Session variables are not allowed in DDL commands or in constraints. * * An identifier can be parsed as a session variable only for expression kinds * where session variables are allowed. This is the primary usage of this * function. * * Second usage of this function is to decide whether "column does not exist" or * "column or variable does not exist" error message should be printed. * When we are in an expression where session variables cannot be used, we raise * the first form or error message. */ > > session_variables_ambiguity_warning: There needs to be more > > information about this. The current explanation is basically just, > > "warn if your query is confusing". Why do I want that? Why would I > > not want that? What is the alternative? What are some examples? > > Shouldn't there be a standard behavior without a need to configure > > anything? > > > > I enhanced this entry: > > + > +The session variables can be shadowed by column references in a > query. This > +is an expected feature. The existing queries should not be broken > by creating > +any session variable, because session variables are shadowed > always if the > +identifier is ambiguous. The variables should be named without > possibility > +to collision with identifiers of other database objects (column > names or > +record field names). The warnings enabled by setting > session_variables_ambiguity_warning > +should help with finding identifier's collisions. Maybe Session variables can be shadowed by column references in a query, this is an expected behavior. Previously working queries shouldn't error out by creating any session variable, so session variables are always shadowed if an identifier is ambiguous. Variables should be referenced using an unambiguous identifier without any possibility for a collision with identifier of other database objects (column names or record fields names). The warning messages emitted when enabling session_variables_ambiguity_warning can help finding such identifier collision. > + > + > +This feature can significantly increase size of logs, and then it > is > +disabled by default, but for testing or development environments it > +should be enabled. Maybe This feature can significantly increase log size, so it's disabled by default. For testing or development environments it's recommended to enable it if you use session variables.
Re: Schema variables - new implementation for Postgres 15
čt 23. 3. 2023 v 19:54 odesílatel Pavel Stehule napsal: > Hi > > > čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut < > peter.eisentr...@enterprisedb.com> napsal: > >> On 17.03.23 21:50, Pavel Stehule wrote: >> > Hi >> > >> > rebase + fix-update pg_dump tests >> > >> > Regards >> > >> > Pavel >> > >> >> I have spent several hours studying the code and the past discussions. >> >> The problem I see in general is that everyone who reviews and tests the >> patches finds more problems, behavioral, weird internal errors, crashes. >> These are then immediately fixed, and the cycle starts again. I don't >> have the sense that this process has arrived at a steady state yet. >> >> The other issue is that by its nature this patch adds a lot of code in a >> lot of places. Large patches are more likely to be successful if they >> add a lot of code in one place or smaller amounts of code in a lot of >> places. But this patch does both and it's just overwhelming. There is >> so much new internal functionality and terminology. Variables can be >> created, registered, initialized, stored, copied, prepared, set, freed, >> removed, released, synced, dropped, and more. I don't know if anyone >> has actually reviewed all that in detail. >> >> Has any effort been made to make this simpler, smaller, reduce scope, >> refactoring, find commonalities with other features, try to manage the >> complexity somehow? >> >> I'm not making a comment on the details of the functionality itself. I >> just think on the coding level it's not gotten to a satisfying situation >> yet. >> >> > I agree that this patch is large, but almost all code is simple. Complex > code is "only" in 0002-session-variables.patch (113KB/438KB). > > Now, I have no idea how the functionality can be sensibly reduced or > divided (no without significant performance loss). I see two difficult > points in this code: > > 1. when to clean memory. The code implements cleaning very accurately - > and this is unique in Postgres. Partially I implement some functionality of > storage manager. Probably no code from Postgres can be reused, because > there is not any support for global temporary objects. Cleaning based on > sinval messages processing is difficult, but there is nothing else. The > code is a little bit more complex, because there are three types of session > variables: a) session variables, b) temp session variables, c) session > variables with transaction scope. Maybe @c can be removed, and maybe we > don't need to support not null default (this can simplify initialization). > What do you think about it? > > 2. how to pass a variable's value to the executor. The implementation is > based on extending the Param node, but it cannot reuse query params buffers > and implements own. > But it is hard to simplify code, because we want to support usage > variables in queries, and usage in PL/pgSQL expressions too. And both are > processed differently. > Maybe I can divide the patch 0002-session-variables to three sections - related to memory management, planning and execution? Regards Pavel > Regards > > Pavel > >
Re: Schema variables - new implementation for Postgres 15
Hi čt 23. 3. 2023 v 16:33 odesílatel Peter Eisentraut < peter.eisentr...@enterprisedb.com> napsal: > On 17.03.23 21:50, Pavel Stehule wrote: > > Hi > > > > rebase + fix-update pg_dump tests > > > > Regards > > > > Pavel > > > > I have spent several hours studying the code and the past discussions. > > The problem I see in general is that everyone who reviews and tests the > patches finds more problems, behavioral, weird internal errors, crashes. > These are then immediately fixed, and the cycle starts again. I don't > have the sense that this process has arrived at a steady state yet. > > The other issue is that by its nature this patch adds a lot of code in a > lot of places. Large patches are more likely to be successful if they > add a lot of code in one place or smaller amounts of code in a lot of > places. But this patch does both and it's just overwhelming. There is > so much new internal functionality and terminology. Variables can be > created, registered, initialized, stored, copied, prepared, set, freed, > removed, released, synced, dropped, and more. I don't know if anyone > has actually reviewed all that in detail. > > Has any effort been made to make this simpler, smaller, reduce scope, > refactoring, find commonalities with other features, try to manage the > complexity somehow? > > I'm not making a comment on the details of the functionality itself. I > just think on the coding level it's not gotten to a satisfying situation > yet. > > I agree that this patch is large, but almost all code is simple. Complex code is "only" in 0002-session-variables.patch (113KB/438KB). Now, I have no idea how the functionality can be sensibly reduced or divided (no without significant performance loss). I see two difficult points in this code: 1. when to clean memory. The code implements cleaning very accurately - and this is unique in Postgres. Partially I implement some functionality of storage manager. Probably no code from Postgres can be reused, because there is not any support for global temporary objects. Cleaning based on sinval messages processing is difficult, but there is nothing else. The code is a little bit more complex, because there are three types of session variables: a) session variables, b) temp session variables, c) session variables with transaction scope. Maybe @c can be removed, and maybe we don't need to support not null default (this can simplify initialization). What do you think about it? 2. how to pass a variable's value to the executor. The implementation is based on extending the Param node, but it cannot reuse query params buffers and implements own. But it is hard to simplify code, because we want to support usage variables in queries, and usage in PL/pgSQL expressions too. And both are processed differently. Regards Pavel
Re: Schema variables - new implementation for Postgres 15
On 17.03.23 21:50, Pavel Stehule wrote: Hi rebase + fix-update pg_dump tests Regards Pavel I have spent several hours studying the code and the past discussions. The problem I see in general is that everyone who reviews and tests the patches finds more problems, behavioral, weird internal errors, crashes. These are then immediately fixed, and the cycle starts again. I don't have the sense that this process has arrived at a steady state yet. The other issue is that by its nature this patch adds a lot of code in a lot of places. Large patches are more likely to be successful if they add a lot of code in one place or smaller amounts of code in a lot of places. But this patch does both and it's just overwhelming. There is so much new internal functionality and terminology. Variables can be created, registered, initialized, stored, copied, prepared, set, freed, removed, released, synced, dropped, and more. I don't know if anyone has actually reviewed all that in detail. Has any effort been made to make this simpler, smaller, reduce scope, refactoring, find commonalities with other features, try to manage the complexity somehow? I'm not making a comment on the details of the functionality itself. I just think on the coding level it's not gotten to a satisfying situation yet.
Re: Schema variables - new implementation for Postgres 15
On 17.03.23 21:50, Pavel Stehule wrote: rebase + fix-update pg_dump tests I have a few comments on the code: 0001 ExecGrant_Variable() could probably use ExecGrant_common(). The additions to syscache.c should be formatted to the new style. in pg_variable.h: - create_lsn ought to have a "var" prefix. - typo: "typmode for variable's type" - What is the purpose of struct Variable? It seems very similar to FormData_pg_variable. At least a comment would be useful. Preserve the trailing comma in ParseExprKind. 0002 expr_kind_allows_session_variables() should have some explanation about criteria for determining which expression kinds should allow variables. Usually, we handle EXPR_KIND_* switches without default case, so we get notified what needs to be changed if a new enum symbol is added. 0010 The material from the tutorial (advanced.sgml) might be better in ddl.sgml. In catalogs.sgml, the columns don't match the ones actually defined in pg_variable.h in patch 0001 (e.g., create_lsn is missing and the order doesn't match). (The order of columns in pg_variable.h didn't immediately make sense to me either, so maybe there is a middle ground to be found.) session_variables_ambiguity_warning: There needs to be more information about this. The current explanation is basically just, "warn if your query is confusing". Why do I want that? Why would I not want that? What is the alternative? What are some examples? Shouldn't there be a standard behavior without a need to configure anything? In allfiles.sgml, dropVariable should be before dropView.
Re: Schema variables - new implementation for Postgres 15
st 8. 3. 2023 v 16:35 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Wed, Mar 08, 2023 at 08:31:07AM +0100, Pavel Stehule wrote: > > pá 3. 3. 2023 v 21:19 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > > napsal: > > > > > > On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote: > > > > > > > > fresh rebase > > > > > > I'm continuing to review, this time going through shadowing stuff in > > > transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg > work > > > for rather little outcome :) I guess all attempts to simplify this part > > > weren't successful? > > > > > > > Originally I wrote it in the strategy "reduce false alarms". But when I > > think about it, it may not be good in this case. Usually the changes are > > done first on some developer environment, and good practice is to > disallow > > same (possibly confusing) identifiers. So I am not against making this > > warning more aggressive with some possibility of false alarms. With > > blocking reduction of alarms the differences in regress was zero. So I > > reduced this part. > > Great, thank you. > > > > Couple of questions to it. In IdentifyVariable in the branch handling > > > two values the commentary says: > > > > > > /* > > > * a.b can mean "schema"."variable" or "variable"."field", > > > * Check both variants, and returns InvalidOid with > > > * not_unique flag, when both interpretations are > > > * possible. Second node can be star. In this case, the > > > * only allowed possibility is "variable"."*". > > > */ > > > > > > I read this as "variable"."*" is a valid combination, but the very next > > > part of this condition says differently: > > > > > > > > > > > > > > > /* > > > * Session variables doesn't support unboxing by star > > > * syntax. But this syntax have to be calculated here, > > > * because can come from non session variables related > > > * expressions. > > > */ > > > Assert(IsA(field2, A_Star)); > > > > > > Is the first commentary not quite correct? > > > > > > > I think it is correct, but maybe I was not good at describing this issue. > > The sentence "Second node can be a star. In this case, the > > the only allowed possibility is "variable"."*"." should be in the next > > comment. > > > > In this part we process a list of identifiers, and we try to map these > > identifiers to some semantics. The parser should ensure that > > all fields of lists are strings or the last field is a star. In this case > > the semantic "schema".* is nonsense, and the only possible semantic > > is "variable".*. It is valid semantics, but unsupported now. Unboxing is > > available by syntax (var).* > > > > I changed the comment > > Thanks. Just to clarify, by "unsupported" you mean unsupported in the > current patch implementation right? From what I understand value > unboxing could be done without parentheses in a non-top level select > query. > Yes, it can be implemented in the next steps. I don't think there can be some issues, but it means more lines and a little bit more complex interface. In this step, I try to implement minimalistic required functionality that can be enhanced in next steps. For this area is an important fact, so session variables will be shadowed always by relations. It means new functionality in session variables cannot break existing applications ever, and then there is space for future enhancement. > > As a side note, I'm not sure if this branch is exercised in any tests. > I've replaced returning InvalidOid with actually doing > LookupVariable(NULL, a, true) > in this case, and all the tests are still passing. > Usually we don't test not yet implemented functionality. > > > > Another question about how shadowing warning should work between > > > namespaces. > > > Let's say I've got two namespaces, public and test, both have a session > > > variable with the same name, but only one has a table with the same > name: > > > > > > -- in public > > > create table test_agg(a int); > > > create type for_test_agg as (a int); > > > create variable test_agg for_test_agg; > > > > > > -- in test > > > create type for_test_agg as (a int); > > > create variable test_agg for_test_agg; > > > > > > Now if we will try to trigger the shadowing warning from public > > > namespace, it would work differently: > > > > > > -- in public > > > =# let test.test_agg.a = 10; > > > =# let test_agg.a = 20; > > > =# set session_variables_ambiguity_warning to on; > > > > > > -- note the value returned from the table > > > =# select jsonb_agg(test_agg.a) from test_agg; > > > WARNING: 42702: session variable "test_agg.a" is shadowed > > > LINE 1: select jsonb_agg(test_agg.a) from test_agg; > > > ^ > > > DETAIL: Session variables can be shadowed by columns, > routine's > > > variables and routine's
Re: Schema variables - new implementation for Postgres 15
> On Wed, Mar 08, 2023 at 08:31:07AM +0100, Pavel Stehule wrote: > pá 3. 3. 2023 v 21:19 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > > > > On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote: > > > > > > fresh rebase > > > > I'm continuing to review, this time going through shadowing stuff in > > transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work > > for rather little outcome :) I guess all attempts to simplify this part > > weren't successful? > > > > Originally I wrote it in the strategy "reduce false alarms". But when I > think about it, it may not be good in this case. Usually the changes are > done first on some developer environment, and good practice is to disallow > same (possibly confusing) identifiers. So I am not against making this > warning more aggressive with some possibility of false alarms. With > blocking reduction of alarms the differences in regress was zero. So I > reduced this part. Great, thank you. > > Couple of questions to it. In IdentifyVariable in the branch handling > > two values the commentary says: > > > > /* > > * a.b can mean "schema"."variable" or "variable"."field", > > * Check both variants, and returns InvalidOid with > > * not_unique flag, when both interpretations are > > * possible. Second node can be star. In this case, the > > * only allowed possibility is "variable"."*". > > */ > > > > I read this as "variable"."*" is a valid combination, but the very next > > part of this condition says differently: > > > > > > > > > /* > > * Session variables doesn't support unboxing by star > > * syntax. But this syntax have to be calculated here, > > * because can come from non session variables related > > * expressions. > > */ > > Assert(IsA(field2, A_Star)); > > > > Is the first commentary not quite correct? > > > > I think it is correct, but maybe I was not good at describing this issue. > The sentence "Second node can be a star. In this case, the > the only allowed possibility is "variable"."*"." should be in the next > comment. > > In this part we process a list of identifiers, and we try to map these > identifiers to some semantics. The parser should ensure that > all fields of lists are strings or the last field is a star. In this case > the semantic "schema".* is nonsense, and the only possible semantic > is "variable".*. It is valid semantics, but unsupported now. Unboxing is > available by syntax (var).* > > I changed the comment Thanks. Just to clarify, by "unsupported" you mean unsupported in the current patch implementation right? From what I understand value unboxing could be done without parentheses in a non-top level select query. As a side note, I'm not sure if this branch is exercised in any tests. I've replaced returning InvalidOid with actually doing LookupVariable(NULL, a, true) in this case, and all the tests are still passing. > > Another question about how shadowing warning should work between > > namespaces. > > Let's say I've got two namespaces, public and test, both have a session > > variable with the same name, but only one has a table with the same name: > > > > -- in public > > create table test_agg(a int); > > create type for_test_agg as (a int); > > create variable test_agg for_test_agg; > > > > -- in test > > create type for_test_agg as (a int); > > create variable test_agg for_test_agg; > > > > Now if we will try to trigger the shadowing warning from public > > namespace, it would work differently: > > > > -- in public > > =# let test.test_agg.a = 10; > > =# let test_agg.a = 20; > > =# set session_variables_ambiguity_warning to on; > > > > -- note the value returned from the table > > =# select jsonb_agg(test_agg.a) from test_agg; > > WARNING: 42702: session variable "test_agg.a" is shadowed > > LINE 1: select jsonb_agg(test_agg.a) from test_agg; > > ^ > > DETAIL: Session variables can be shadowed by columns, routine's > > variables and routine's arguments with the same name. > > LOCATION: transformColumnRef, parse_expr.c:940 > > jsonb_agg > > --- > > [1] > > > > -- no warning, note the session variable value > > =# select jsonb_agg(test.test_agg.a) from test_agg; > > jsonb_agg > > --- > > [10] > > > > It happens because in the second scenario the logic inside > > transformColumnRef > > will not set up the node variable (there is no corresponding table in the > > "test" schema), and the following conditions covering session variables > > shadowing are depending on it. Is it supposed to be like this? > > > > I am sorry, I don't understand what you want to describe. Session variables > are shadowed by relations, ever. It is design. In the first case, the > variable is shadowed and a warning
Re: Schema variables - new implementation for Postgres 15
> On Tue, Feb 28, 2023 at 06:12:50AM +0100, Pavel Stehule wrote: > > fresh rebase I'm continuing to review, this time going through shadowing stuff in transformColumnRef, IdentifyVariable etc. Well, that's a lot of leg work for rather little outcome :) I guess all attempts to simplify this part weren't successful? Couple of questions to it. In IdentifyVariable in the branch handling two values the commentary says: /* * a.b can mean "schema"."variable" or "variable"."field", * Check both variants, and returns InvalidOid with * not_unique flag, when both interpretations are * possible. Second node can be star. In this case, the * only allowed possibility is "variable"."*". */ I read this as "variable"."*" is a valid combination, but the very next part of this condition says differently: /* * Session variables doesn't support unboxing by star * syntax. But this syntax have to be calculated here, * because can come from non session variables related * expressions. */ Assert(IsA(field2, A_Star)); Is the first commentary not quite correct? Another question about how shadowing warning should work between namespaces. Let's say I've got two namespaces, public and test, both have a session variable with the same name, but only one has a table with the same name: -- in public create table test_agg(a int); create type for_test_agg as (a int); create variable test_agg for_test_agg; -- in test create type for_test_agg as (a int); create variable test_agg for_test_agg; Now if we will try to trigger the shadowing warning from public namespace, it would work differently: -- in public =# let test.test_agg.a = 10; =# let test_agg.a = 20; =# set session_variables_ambiguity_warning to on; -- note the value returned from the table =# select jsonb_agg(test_agg.a) from test_agg; WARNING: 42702: session variable "test_agg.a" is shadowed LINE 1: select jsonb_agg(test_agg.a) from test_agg; ^ DETAIL: Session variables can be shadowed by columns, routine's variables and routine's arguments with the same name. LOCATION: transformColumnRef, parse_expr.c:940 jsonb_agg --- [1] -- no warning, note the session variable value =# select jsonb_agg(test.test_agg.a) from test_agg; jsonb_agg --- [10] It happens because in the second scenario the logic inside transformColumnRef will not set up the node variable (there is no corresponding table in the "test" schema), and the following conditions covering session variables shadowing are depending on it. Is it supposed to be like this?
Re: Schema variables - new implementation for Postgres 15 (typo)
Hi I read notes from the FOSDEM developer meeting, and I would like to repeat notice about motivation for introduction of session variables, and one reason why session_variables are not transactional, and why they should not be replaced by temp tables is performance. There are more use cases where session variables can be used. One scenario for session variables is to use them like static variables. They can be used from some rows triggers, .. where local variable is not enough (like https://www.cybertec-postgresql.com/en/why-are-my-postgresql-updates-getting-slower/ ) create variable xx as int; do $$ begin let xx = 1; for i in 1..1 loop let xx = xx + 1; end loop; raise notice '%', xx; end; $$; NOTICE: 10001 DO Time: 4,079 ms create temp table xx01(a int); delete from xx01; vacuum full xx01; vacuum; do $$ begin insert into xx01 values(1); for i in 1..1 loop update xx01 set a = a + 1; end loop; raise notice '%', (select a from xx01); end; $$; NOTICE: 10001 DO Time: 1678,949 ms (00:01,679) postgres=# \dt+ xx01 List of relations ┌───┬──┬───┬───┬─┬───┬┬─┐ │ Schema │ Name │ Type │ Owner │ Persistence │ Access method │ Size │ Description │ ╞═══╪══╪═══╪═══╪═╪═══╪╪═╡ │ pg_temp_3 │ xx01 │ table │ pavel │ temporary │ heap │ 384 kB │ │ └───┴──┴───┴───┴─┴───┴┴─┘ (1 row) Originally, I tested 100K iterations, but it was too slow, and I cancelled it after 5 minutes. Vacuum can be done after the end of transaction. And there can be another negative impact related to bloating of pg_attribute, pg_class, pg_depend tables. Workaround based on custom GUC is not too bad, but there is not any possibility of security protection (and there is not any possibility of static check in plpgsql_check) - and still it is 20x slower than session variables do $$ begin perform set_config('cust.xx', '1', false); for i in 1..1 loop perform set_config('cust.xx', (current_setting('cust.xx')::int + 1)::text, true); end loop; raise notice '%', current_setting('cust.xx'); end; $$; NOTICE: 10001 DO Time: 80,201 ms Session variables don't try to replace temp tables, and temp tables can be a very bad replacement of session's variables. Regards Pavel
Re: Schema variables - new implementation for Postgres 15 (typo)
> > > I can be wrong, but from these numbers I don't think so these sync cycles > > should to contain CHECK_FOR_INTERRUPTS > > > > What do you think? > > Well, there is always possibility someone will create more variables > than any arbitrary limit we have tested for. But I see your point and > don't have a strong opinion about this, so let's keep it as it is :) > > In this case, I afraid more about possible impacts of canceling than long operation. It should be possible to cancel query - but you cannot to cancel followup operation like memory cleaning or other resource releasing. The possibility to be cancelled in this cycle means rewriting processing to be much more defensive (and slower). And although you can hypothetically cancel sync cycles, then you should to some time finish these cycles because you need to clean memory from garbage. Regards Pavel ok :) If it is an issue, then it can be easily fixed at future, but I don't think I
Re: Schema variables - new implementation for Postgres 15 (typo)
> On Mon, Jan 23, 2023 at 07:09:27PM +0100, Pavel Stehule wrote: > po 23. 1. 2023 v 15:25 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > > > > On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote: > > > pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > > > napsal: > > > > > > > * I think it was already mentioned in the thread, there seems to be > > not a > > > > single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some > > > > number > > > > of loops over the sessionvars are implemented, are they always going > > to > > > > be > > > > small enough to not make any troubles? > > > > > > > > > > The longest cycle is a cycle that rechecks actively used variables > > against > > > system catalog. No cycle depends on the content of variables. > > > > Right, but what about the cases with huge number of variables? Not > > saying it could be in any sense common, but possible to do. > > > > Now I tested 10K variables (with enabled assertions, without it is should > be faster) > > [...] > > I can be wrong, but from these numbers I don't think so these sync cycles > should to contain CHECK_FOR_INTERRUPTS > > What do you think? Well, there is always possibility someone will create more variables than any arbitrary limit we have tested for. But I see your point and don't have a strong opinion about this, so let's keep it as it is :) > > > > * sync_sessionvars_all explains why is it necessary to copy > > > > xact_recheck_varids: > > > > > > > > When we check the variables, the system cache can be > > > > invalidated, > > > > and xact_recheck_varids can be enhanced. > > > > > > > > I'm not quite following what the "enhancement" part is about? Is > > > > xact_recheck_varids could be somehow updated concurrently with the > > loop? > > > > > > > > > > Yes. pg_variable_cache_callback can be called when > > > is_session_variable_valid is executed. > > > > > > Maybe "extended" can be a better word instead of "enhanced"? I > > reformulated > > > this comment > > > > Yeah, "extended" sounds better. But I was mostly confused about the > > mechanism, if the cache callback can interrupt the execution at any > > moment when called, that would explain it. > > > > patch from yesterday has extended comment in this area :-) Thanks!
Re: Schema variables - new implementation for Postgres 15 (typo)
po 23. 1. 2023 v 15:25 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote: > > pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > > napsal: > > > > > * I think it was already mentioned in the thread, there seems to be > not a > > > single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some > > > number > > > of loops over the sessionvars are implemented, are they always going > to > > > be > > > small enough to not make any troubles? > > > > > > > The longest cycle is a cycle that rechecks actively used variables > against > > system catalog. No cycle depends on the content of variables. > > Right, but what about the cases with huge number of variables? Not > saying it could be in any sense common, but possible to do. > Now I tested 10K variables (with enabled assertions, without it is should be faster) creating 763ms do $$ begin for i in 1..1 loop execute format('create variable %I as int', 'xx' || i); end loop; end $$; assigning 491ms do $$ begin for i in 1..1 loop execute format('let %I = 10', 'xx' || i); end loop; end $$; dropping without necessity of memory cleaning 1155ms do $$ begin for i in 1..1 loop execute format('drop variable %I', 'xx' || i); end loop; end $$; dropping with memory cleaning 2708 just memory cleaning 72ms (time of commit - at commit cleaning) do $$ begin for i in 1..1 loop execute format('let %I = 10', 'xx' || i); end loop; end $$; begin; do $$ begin for i in 1..1 loop execute format('drop variable %I', 'xx' || i); end loop; end $$; commit; So just syncing (cleaning 10K) variables needs less than 72 ms I can be wrong, but from these numbers I don't think so these sync cycles should to contain CHECK_FOR_INTERRUPTS What do you think? > > > * sync_sessionvars_all explains why is it necessary to copy > > > xact_recheck_varids: > > > > > > When we check the variables, the system cache can be > > > invalidated, > > > and xact_recheck_varids can be enhanced. > > > > > > I'm not quite following what the "enhancement" part is about? Is > > > xact_recheck_varids could be somehow updated concurrently with the > loop? > > > > > > > Yes. pg_variable_cache_callback can be called when > > is_session_variable_valid is executed. > > > > Maybe "extended" can be a better word instead of "enhanced"? I > reformulated > > this comment > > Yeah, "extended" sounds better. But I was mostly confused about the > mechanism, if the cache callback can interrupt the execution at any > moment when called, that would explain it. > patch from yesterday has extended comment in this area :-) Regards Pavel
Re: Schema variables - new implementation for Postgres 15 (typo)
> On Sun, Jan 22, 2023 at 07:47:07PM +0100, Pavel Stehule wrote: > pá 20. 1. 2023 v 21:35 odesílatel Dmitry Dolgov <9erthali...@gmail.com> > napsal: > > > * I think it was already mentioned in the thread, there seems to be not a > > single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some > > number > > of loops over the sessionvars are implemented, are they always going to > > be > > small enough to not make any troubles? > > > > The longest cycle is a cycle that rechecks actively used variables against > system catalog. No cycle depends on the content of variables. Right, but what about the cases with huge number of variables? Not saying it could be in any sense common, but possible to do. > > * sync_sessionvars_all explains why is it necessary to copy > > xact_recheck_varids: > > > > When we check the variables, the system cache can be > > invalidated, > > and xact_recheck_varids can be enhanced. > > > > I'm not quite following what the "enhancement" part is about? Is > > xact_recheck_varids could be somehow updated concurrently with the loop? > > > > Yes. pg_variable_cache_callback can be called when > is_session_variable_valid is executed. > > Maybe "extended" can be a better word instead of "enhanced"? I reformulated > this comment Yeah, "extended" sounds better. But I was mostly confused about the mechanism, if the cache callback can interrupt the execution at any moment when called, that would explain it.
Re: Schema variables - new implementation for Postgres 15 (typo)
I've accumulated another collection of various questions and comments. As a side note I'm getting a good feeling about this patch, those part I've read so far looks good to me. * I've suddenly realized one could use pseudo types for variables, and it not always works. E.g.: =# create variable pseudo_array anyarray; =# select pseudo_array; pseudo_array -- NULL =# let pseudo_array = ARRAY[1, 2, 3]; ERROR: 42804: target session variable is of type anyarray but expression is of type integer[] LOCATION: svariableStartupReceiver, svariableReceiver.c:79 =# create variable pseudo_unknown unknown; =# select pseudo_unknown; ERROR: XX000: failed to find conversion function from unknown to text LOCATION: coerce_type, parse_coerce.c:542 Is it supposed to be like this, or something is missing? * I think it was already mentioned in the thread, there seems to be not a single usage of CHECK_FOR_INTERRUPTS in session_variable.c . But some number of loops over the sessionvars are implemented, are they always going to be small enough to not make any troubles? * sync_sessionvars_all explains why is it necessary to copy xact_recheck_varids: When we check the variables, the system cache can be invalidated, and xact_recheck_varids can be enhanced. I'm not quite following what the "enhancement" part is about? Is xact_recheck_varids could be somehow updated concurrently with the loop? * A small typo diff --git a/src/backend/commands/session_variable.c b/src/backend/commands/session_variable.c --- a/src/backend/commands/session_variable.c +++ b/src/backend/commands/session_variable.c @@ -485,7 +485,7 @@ sync_sessionvars_all(bool filter_lxid) /* * When we check the variables, the system cache can be invalidated, -* and xac_recheck_varids can be enhanced. We want to iterate +* and xact_recheck_varids can be enhanced. We want to iterate NOTE: The commentaries above were made based on the previous patch version, but it looks like those aspects were not changed.
Re: Schema variables - new implementation for Postgres 15 (typo)
On Tue, Jan 10, 2023 at 08:35:16PM +0100, Pavel Stehule wrote: > út 10. 1. 2023 v 3:20 odesílatel Julien Rouhaud napsal: > > > > Another new behavior I see is the new rowtype_only parameter for > > LookupVariable. Has this been discussed? > > > > I think so it was discussed about table shadowing > > without this filter, I lost the message "missing FROM-clause entry for ..." > > -- should fail > SELECT varx.xxx; > -ERROR: missing FROM-clause entry for table "varx" > -LINE 1: SELECT varx.xxx; > - ^ > +ERROR: type text is not composite > -- don't allow multi column query > CREATE TYPE vartesttp AS (a1 int, b1 int, c1 int); > CREATE VARIABLE v1 AS vartesttp; > @@ -1421,9 +1419,7 @@ > DROP TYPE ab; > CREATE VARIABLE myvar AS int; > SELECT myvar.blabla; > -ERROR: missing FROM-clause entry for table "myvar" > -LINE 1: SELECT myvar.blabla; > - ^ > +ERROR: type integer is not composite > DROP VARIABLE myvar; > -- the result of view should be same in parallel mode too > CREATE VARIABLE v1 AS int; > > My original idea was to try to reduce possible conflicts (in old versions > of this path, a conflict was disallowed). But it is true, so these "new" > error messages are sensible too, and with eliminating rowtype_only I can > reduce code. Ok! Another problem is that the error message as-is is highly unhelpful as it's not clear at all that the problem is coming from an unsuitable variable. Maybe change makeParamSessionVariable to use lookup_rowtype_tupdesc_noerror() and emit a friendlier error message? Something like variable "X.Y" is of type Z, which is not composite > I modified the IdentifyVariable function a little bit. With new argument > noerror I am able to ensure so no error will be raised when this function > is called just for shadowing detection. I locally modified IdentifyVariable to emit WARNING reports when noerror is set to quickly see how it was used and didn't get any regression test error. This definitely needs to be covered by regression tests. Looking as session_variables.sql, the session_variables_ambiguity_warning GUC doesn't have a lot of tests in general.
Re: Schema variables - new implementation for Postgres 15 (typo)
Hi, On Sat, Jan 07, 2023 at 08:09:27AM +0100, Pavel Stehule wrote: > > > > Hmm, how safe is it for third-party code to access the stored data directly > > rather than a copy? If it makes extension fragile if they're not careful > > enough with cache invalidation, or even give them a way to mess up with the > > data directly, it's probably not a good idea to provide such an API. > > > > ok, I removed it Another new behavior I see is the new rowtype_only parameter for LookupVariable. Has this been discussed? I can see how it can be annoying to get a "variable isn't composite" type of error when you already know that only a composite object can be used (and other might work), but it looks really scary to entirely ignore some objects that should be found in your search_path just because of their datatype. And if we ignore something like "a.b" if "a" isn't a variable of composite type, why wouldn't we apply the same "just ignore it" rule if it's indeed a composite type but doesn't have any "b" field? Your application could also start to use different object if your drop a say json variable and create a composite variable instead. It seems to be in contradiction with how the rest of the system works and looks wrong to me. Note also that LookupVariable can be quite expensive since you may have to do a lookup for every schema found in the search_path, so the sooner it stops the better. > > > updated patches attached I forgot to mention it last time but you should bump the copyright year for all new files added when you'll send a new version of the patchset.
Re: Schema variables - new implementation for Postgres 15 (typo)
Hi, On Fri, Jan 06, 2023 at 08:02:41PM +0100, Pavel Stehule wrote: > pá 6. 1. 2023 v 5:11 odesílatel Julien Rouhaud napsal: > > > > > +Datum > > +GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid > > expected_typid) > > +{ > > + SVariable svar; > > + > > + svar = prepare_variable_for_reading(varid); > > + Assert(svar != NULL && svar->is_valid); > > + > > + return CopySessionVariableWithTypeCheck(varid, isNull, > > expected_typid); > > + > > + if (expected_typid != svar->typid) > > + elog(ERROR, "type of variable \"%s.%s\" is different than > > expected", > > + > > get_namespace_name(get_session_variable_namespace(varid)), > > +get_session_variable_name(varid)); > > + > > + *isNull = svar->isnull; > > + > > + return svar->value; > > +} > > > > there's a unconditional return in the middle of the function, which also > > looks > > like a thinko during a rebase since CopySessionVariableWithTypeCheck mostly > > contains the same following code. > > > > This looks like my mistake - originally I fixed an issue related to the > usage session variable from plpgsql, and I forced a returned copy of the > session variable's value. Now, the function > GetSessionVariableWithTypeCheck is not used anyywhere. Oh I didn't check if it was used in the patchset. > It can be used only > from extensions, where is ensured so a) the value is not changed, b) and in > a lifetime of returned value is not called any query or any expression that > can change the value of this variable. I fixed this code and I enhanced > comments. I am not sure if this function should not be removed. It is not > used by backend, but it can be handy for extensions - it reduces possible > useless copy. Hmm, how safe is it for third-party code to access the stored data directly rather than a copy? If it makes extension fragile if they're not careful enough with cache invalidation, or even give them a way to mess up with the data directly, it's probably not a good idea to provide such an API. > > I'm also wondering if there should be additional tests based on the last > > scenario reported by Dmitry? (I don't see any new or similar test, but I > > may > > have missed it). > > > > The scenario reported by Dmitry is in tests already. Oh, so I missed it sorry about that. I did some testing using debug_discard_cache in the past and didn't run into this issue, so it's probably due to a more recent changes or before such a test was added. > I am not sure if I > have to repeat it with active debug_discard_cache. I expect this mode will > be activated in some testing environments. Yes, some buildfarm animal are configured to run with various debug_discard_caches setting so it's not needed to override it in some tests (it makes testing time really slow, which will be annoying for everyone including old/slow buildfarm animals). > I have no idea how to simply emulate this issue without > debug_discard_caches on 1. It is necessary to raise the sinval message > exactly when the variable is checked against system catalogue. Manually testing while setting locally debug_discard_cache should be enough. > updated patches attached Thanks!
Re: Schema variables - new implementation for Postgres 15 (typo)
Hi, On Fri, Dec 23, 2022 at 08:38:43AM +0100, Pavel Stehule wrote: > > I am sending an updated patch, fixing the mentioned issue. Big thanks for > testing, and checking. There were multiple reviews in the last weeks which reported some issues, but unless I'm missing something I don't see any follow up from the reviewers on the changes? I will still wait a bit to see if they chime in while I keep looking at the diff since the last version of the code I reviewed. But in the meantime I already saw a couple of things that don't look right: --- a/src/backend/commands/dropcmds.c +++ b/src/backend/commands/dropcmds.c @@ -481,6 +481,11 @@ does_not_exist_skipping(ObjectType objtype, Node *object) msg = gettext_noop("publication \"%s\" does not exist, skipping"); name = strVal(object); break; + case OBJECT_VARIABLE: + msg = gettext_noop("session variable \"%s\" does not exist, skipping"); + name = NameListToString(castNode(List, object)); + break; + default: case OBJECT_COLUMN: the "default:" seems like a thinko during a rebase? +Datum +GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid expected_typid) +{ + SVariable svar; + + svar = prepare_variable_for_reading(varid); + Assert(svar != NULL && svar->is_valid); + + return CopySessionVariableWithTypeCheck(varid, isNull, expected_typid); + + if (expected_typid != svar->typid) + elog(ERROR, "type of variable \"%s.%s\" is different than expected", + get_namespace_name(get_session_variable_namespace(varid)), +get_session_variable_name(varid)); + + *isNull = svar->isnull; + + return svar->value; +} there's a unconditional return in the middle of the function, which also looks like a thinko during a rebase since CopySessionVariableWithTypeCheck mostly contains the same following code. I'm also wondering if there should be additional tests based on the last scenario reported by Dmitry? (I don't see any new or similar test, but I may have missed it). > > > Why do you think so? The variable has no mvcc support - it is just stored > > > value with local visibility without mvcc support. There can be little bit > > > similar issues like with global temporary tables. > > > > Yeah, sorry for not being precise, I mean global temporary tables. This > > is not my analysis, I've simply picked up it was mentioned a couple of > > times here. The points above are not meant to serve as an objection > > against the patch, but rather to figure out if there are any gaps left > > to address and come up with some sort of plan with "committed" as a > > final destination. > > > > There are some similarities, but there are a lot of differences too. > Handling of metadata is partially similar, but session variable is almost > the value cached in session memory. It has no statistics, it is not stored > in a file. Because there is different storage, I don't think there is some > intersection on implementation level. +1
Re: Schema variables - new implementation for Postgres 15 (typo)
> On Thu, Dec 22, 2022 at 08:45:57PM +0100, Pavel Stehule wrote: > > From the first look it seems some major topics the discussion is evolving > > are about: > > > > * Validity of the use case. Seems to be quite convincingly addressed in > > [1] and > > [2]. > > > > * Complicated logic around invalidation, concurrent create/drop etc. (I > > guess > > the issue above is falling into the same category). > > > > * Concerns that session variables could repeat some problems of temporary > > tables. > > > > Why do you think so? The variable has no mvcc support - it is just stored > value with local visibility without mvcc support. There can be little bit > similar issues like with global temporary tables. Yeah, sorry for not being precise, I mean global temporary tables. This is not my analysis, I've simply picked up it was mentioned a couple of times here. The points above are not meant to serve as an objection against the patch, but rather to figure out if there are any gaps left to address and come up with some sort of plan with "committed" as a final destination.
Re: Schema variables - new implementation for Postgres 15 (typo)
čt 22. 12. 2022 v 17:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > Hi, > > I'm continuing review the patch slowly, and have one more issue plus one > philosophical question. > > The issue have something to do with variables invalidation. Enabling > debug_discard_caches = 1 (about which I've learned from this thread) and > running this subset of the test suite: > > CREATE SCHEMA svartest; > SET search_path = svartest; > CREATE VARIABLE var3 AS int; > > CREATE OR REPLACE FUNCTION inc(int) > RETURNS int AS $$ > BEGIN > LET svartest.var3 = COALESCE(svartest.var3 + $1, $1); > RETURN var3; > END; > $$ LANGUAGE plpgsql; > > SELECT inc(1); > SELECT inc(1); > SELECT inc(1); > > crashes in my setup like this: > > #2 0x00b432d4 in ExceptionalCondition > (conditionName=0xce9b99 "n >= 0 && n < list->length", fileName=0xce9c18 > "list.c", lineNumber=770) at assert.c:66 > #3 0x007d3acd in list_delete_nth_cell (list=0x18ab248, > n=-3388) at list.c:770 > #4 0x007d3b88 in list_delete_cell (list=0x18ab248, > cell=0x18dc028) at list.c:842 > #5 0x006bcb52 in sync_sessionvars_all (filter_lxid=true) > at session_variable.c:524 > #6 0x006bd4cb in SetSessionVariable (varid=16386, > value=2, isNull=false) at session_variable.c:844 > #7 0x006bd617 in SetSessionVariableWithSecurityCheck > (varid=16386, value=2, isNull=false) at session_variable.c:885 > #8 0x7f763b890698 in exec_stmt_let (estate=0x7ffcc6fd5190, > stmt=0x18aa920) at pl_exec.c:5030 > #9 0x7f763b88a746 in exec_stmts (estate=0x7ffcc6fd5190, > stmts=0x180) at pl_exec.c:2116 > #10 0x7f763b88a247 in exec_stmt_block (estate=0x7ffcc6fd5190, > block=0x18aabf8) at pl_exec.c:1935 > #11 0x7f763b889a49 in exec_toplevel_block > (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1626 > #12 0x7f763b8879df in plpgsql_exec_function (func=0x18781b0, > fcinfo=0x18be110, simple_eval_estate=0x0, simple_eval_resowner=0x0, > procedure_resowner=0x0, atomic=true) at pl_exec.c:615 > #13 0x7f763b8a2320 in plpgsql_call_handler (fcinfo=0x18be110) > at pl_handler.c:277 > #14 0x00721716 in ExecInterpExpr (state=0x18bde28, > econtext=0x18bd3d0, isnull=0x7ffcc6fd56d7) at execExprInterp.c:730 > #15 0x00723642 in ExecInterpExprStillValid > (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at > execExprInterp.c:1855 > #16 0x0077a78b in ExecEvalExprSwitchContext > (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at > ../../../src/include/executor/executor.h:344 > #17 0x0077a7f4 in ExecProject (projInfo=0x18bde20) at > ../../../src/include/executor/executor.h:378 > #18 0x0077a9dc in ExecResult (pstate=0x18bd2c0) at > nodeResult.c:136 > #19 0x00738ea0 in ExecProcNodeFirst (node=0x18bd2c0) at > execProcnode.c:464 > #20 0x0072c6e3 in ExecProcNode (node=0x18bd2c0) at > ../../../src/include/executor/executor.h:262 > #21 0x0072f426 in ExecutePlan (estate=0x18bd098, > planstate=0x18bd2c0, use_parallel_mode=false, operation=CMD_SELECT, > sendTuples=true, numberTuples=0, direction=ForwardScanDirection, > dest=0x18b3eb8, execute_once=true) at execMain.c:1691 > #22 0x0072cf76 in standard_ExecutorRun > (queryDesc=0x189c688, direction=ForwardScanDirection, count=0, > execute_once=true) at execMain.c:423 > #23 0x0072cdb3 in ExecutorRun (queryDesc=0x189c688, > direction=ForwardScanDirection, count=0, execute_once=true) at > execMain.c:367 > #24 0x0099afdc in PortalRunSelect (portal=0x1866648, > forward=true, count=0, dest=0x18b3eb8) at pquery.c:927 > #25 0x0099ac99 in PortalRun (portal=0x1866648, > count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x18b3eb8, > altdest=0x18b3eb8, qc=0x7ffcc6fd5a70) at pquery.c:771 > #26 0x0099487d in exec_simple_query > (query_string=0x17edcc8 "SELECT inc(1);") at postgres.c:1238 > > It seems that sync_sessionvars_all tries to remove a cell that either > doesn't > belong to the xact_recheck_varids or weird in some other way: > > +>>> p l - xact_recheck_varids->elements > $81 = -3388 > I am able to repeat this issue. I'll look at it. > > The second thing I wanted to ask about is a more strategical question. Does > anyone have clear understanding where this patch is going? The thread is > quite > large, and it's hard to catch up with all the details -- it would be great > if > someone could summarize the current state of things, are there any > outstanding > design issues or not addressed concerns? > I hope I fixed the issues founded by Julian and Tomas. Now there is not implemented transaction support related to values, and I pla
Re: Schema variables - new implementation for Postgres 15 (typo)
Hi, I'm continuing review the patch slowly, and have one more issue plus one philosophical question. The issue have something to do with variables invalidation. Enabling debug_discard_caches = 1 (about which I've learned from this thread) and running this subset of the test suite: CREATE SCHEMA svartest; SET search_path = svartest; CREATE VARIABLE var3 AS int; CREATE OR REPLACE FUNCTION inc(int) RETURNS int AS $$ BEGIN LET svartest.var3 = COALESCE(svartest.var3 + $1, $1); RETURN var3; END; $$ LANGUAGE plpgsql; SELECT inc(1); SELECT inc(1); SELECT inc(1); crashes in my setup like this: #2 0x00b432d4 in ExceptionalCondition (conditionName=0xce9b99 "n >= 0 && n < list->length", fileName=0xce9c18 "list.c", lineNumber=770) at assert.c:66 #3 0x007d3acd in list_delete_nth_cell (list=0x18ab248, n=-3388) at list.c:770 #4 0x007d3b88 in list_delete_cell (list=0x18ab248, cell=0x18dc028) at list.c:842 #5 0x006bcb52 in sync_sessionvars_all (filter_lxid=true) at session_variable.c:524 #6 0x006bd4cb in SetSessionVariable (varid=16386, value=2, isNull=false) at session_variable.c:844 #7 0x006bd617 in SetSessionVariableWithSecurityCheck (varid=16386, value=2, isNull=false) at session_variable.c:885 #8 0x7f763b890698 in exec_stmt_let (estate=0x7ffcc6fd5190, stmt=0x18aa920) at pl_exec.c:5030 #9 0x7f763b88a746 in exec_stmts (estate=0x7ffcc6fd5190, stmts=0x180) at pl_exec.c:2116 #10 0x7f763b88a247 in exec_stmt_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1935 #11 0x7f763b889a49 in exec_toplevel_block (estate=0x7ffcc6fd5190, block=0x18aabf8) at pl_exec.c:1626 #12 0x7f763b8879df in plpgsql_exec_function (func=0x18781b0, fcinfo=0x18be110, simple_eval_estate=0x0, simple_eval_resowner=0x0, procedure_resowner=0x0, atomic=true) at pl_exec.c:615 #13 0x7f763b8a2320 in plpgsql_call_handler (fcinfo=0x18be110) at pl_handler.c:277 #14 0x00721716 in ExecInterpExpr (state=0x18bde28, econtext=0x18bd3d0, isnull=0x7ffcc6fd56d7) at execExprInterp.c:730 #15 0x00723642 in ExecInterpExprStillValid (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at execExprInterp.c:1855 #16 0x0077a78b in ExecEvalExprSwitchContext (state=0x18bde28, econtext=0x18bd3d0, isNull=0x7ffcc6fd56d7) at ../../../src/include/executor/executor.h:344 #17 0x0077a7f4 in ExecProject (projInfo=0x18bde20) at ../../../src/include/executor/executor.h:378 #18 0x0077a9dc in ExecResult (pstate=0x18bd2c0) at nodeResult.c:136 #19 0x00738ea0 in ExecProcNodeFirst (node=0x18bd2c0) at execProcnode.c:464 #20 0x0072c6e3 in ExecProcNode (node=0x18bd2c0) at ../../../src/include/executor/executor.h:262 #21 0x0072f426 in ExecutePlan (estate=0x18bd098, planstate=0x18bd2c0, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0, direction=ForwardScanDirection, dest=0x18b3eb8, execute_once=true) at execMain.c:1691 #22 0x0072cf76 in standard_ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:423 #23 0x0072cdb3 in ExecutorRun (queryDesc=0x189c688, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:367 #24 0x0099afdc in PortalRunSelect (portal=0x1866648, forward=true, count=0, dest=0x18b3eb8) at pquery.c:927 #25 0x0099ac99 in PortalRun (portal=0x1866648, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x18b3eb8, altdest=0x18b3eb8, qc=0x7ffcc6fd5a70) at pquery.c:771 #26 0x0099487d in exec_simple_query (query_string=0x17edcc8 "SELECT inc(1);") at postgres.c:1238 It seems that sync_sessionvars_all tries to remove a cell that either doesn't belong to the xact_recheck_varids or weird in some other way: +>>> p l - xact_recheck_varids->elements $81 = -3388 The second thing I wanted to ask about is a more strategical question. Does anyone have clear understanding where this patch is going? The thread is quite large, and it's hard to catch up with all the details -- it would be great if someone could summarize the current state of things, are there any outstanding design issues or not addressed concerns? >From the first look it seems some major topics the discussion is evolving are >about: * Validity of the use case. Seems to be quite convincingly addressed in [1] and [2]. * Complicated logic around invalidation, concurrent create/drop etc. (I guess the issue above is falling into the same category). * Concerns that session variables could repeat some problems of temporary tables. Is there anything else? [1]: https://www
Re: Schema variables - new implementation for Postgres 15 (typo)
Op 14-12-2022 om 05:54 schreef Pavel Stehule: Hi fresh rebase typo alert: v20221214-0003-LET-command.patch contains errmsg("target session varible is of type %s" ('varible' should be 'variable') Erik
Re: Schema variables - new implementation for Postgres 15
po 14. 11. 2022 v 8:00 odesílatel Sergey Shinderuk < s.shinde...@postgrespro.ru> napsal: > On 13.11.2022 20:59, Pavel Stehule wrote: > > fresh rebase > > Hello, > > Sorry, I haven't been following this thread, but I'd like to report a > memory management bug. I couldn't apply the latest patches, so I tested > with v20221104-1-* patches applied atop of commit b0284bfb1db. > > > postgres=# create variable s text default 'abc'; > > create function f() returns text as $$ > begin > return g(s); > end; > $$ language plpgsql; > > create function g(t text) returns text as $$ > begin > let s = 'BOOM!'; > return t; > end; > $$ language plpgsql; > > select f(); > CREATE VARIABLE > CREATE FUNCTION > CREATE FUNCTION > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > > LOG: server process (PID 55307) was terminated by signal 11: > Segmentation fault > DETAIL: Failed process was running: select f(); > > > I believe it's a use-after-free error, triggered by assigning a new > value to s in g(), thus making t a dangling pointer. > > After reconnecting I get a scary error: > > postgres=# select f(); > ERROR: compressed pglz data is corrupt > I am able to reproduce it, and I have a quick fix, but I need to investigate i this fix will be correct It's a good example so I have to always return a copy of value. Regards Pavel > > Best regards, > > -- > Sergey Shinderukhttps://postgrespro.com/ > >
Re: Schema variables - new implementation for Postgres 15
On 13.11.2022 20:59, Pavel Stehule wrote: fresh rebase Hello, Sorry, I haven't been following this thread, but I'd like to report a memory management bug. I couldn't apply the latest patches, so I tested with v20221104-1-* patches applied atop of commit b0284bfb1db. postgres=# create variable s text default 'abc'; create function f() returns text as $$ begin return g(s); end; $$ language plpgsql; create function g(t text) returns text as $$ begin let s = 'BOOM!'; return t; end; $$ language plpgsql; select f(); CREATE VARIABLE CREATE FUNCTION CREATE FUNCTION server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. LOG: server process (PID 55307) was terminated by signal 11: Segmentation fault DETAIL: Failed process was running: select f(); I believe it's a use-after-free error, triggered by assigning a new value to s in g(), thus making t a dangling pointer. After reconnecting I get a scary error: postgres=# select f(); ERROR: compressed pglz data is corrupt Best regards, -- Sergey Shinderukhttps://postgrespro.com/
Re: Schema variables - new implementation for Postgres 15
pá 4. 11. 2022 v 15:28 odesílatel Dmitry Dolgov <9erthali...@gmail.com> napsal: > > On Fri, Nov 04, 2022 at 03:17:18PM +0100, Pavel Stehule wrote: > > > I've stumbled upon something that looks weird to me (inspired by the > > > example from tests): > > > > > > =# create variable v2 as int; > > > =# let v2 = 3; > > > =# create view vv2 as select coalesce(v2, 0) + 1000 as result > > > > > > =# select * from vv2; > > > result > > > > > > 1003 > > > > > > =# set force_parallel_mode to on; > > > =# select * from vv2; > > > result > > > > > > 1000 > > > > > > In the second select the actual work is done from a worker backend. > > > Since values of session variables are stored in the backend local > > > memory, it's not being shared with the worker and the value is not > found > > > in the hash map. Does this suppose to be like that? > > > > > > > It looks like a bug, but parallel queries should be supported. > > > > The value of the variable is passed as parameter to the worker backend. > But > > probably somewhere the original reference was not replaced by parameter > > > > On Fri, Nov 04, 2022 at 10:17:13PM +0800, Julien Rouhaud wrote: > > Hi, > > > > There's code to serialize and restore all used variables for parallel > workers > > (see code about PARAM_VARIABLE and queryDesc->num_session_variables / > > queryDesc->plannedstmt->sessionVariables). I haven't reviewed that part > yet, > > but it's supposed to be working. Blind guess would be that it's missing > > something in expression walker. > > I see, thanks. I'll take a deeper look, my initial assumption was due to > the fact that in the worker case create_sessionvars_hashtables is > getting called for every query. > It should be fixed in today's patch The problem was in missing pushing the hasSessionVariables flag to the upper subquery in pull_up_simple_subquery. --- a/src/backend/optimizer/prep/prepjointree.c +++ b/src/backend/optimizer/prep/prepjointree.c @@ -1275,6 +1275,9 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte, /* If subquery had any RLS conditions, now main query does too */ parse->hasRowSecurity |= subquery->hasRowSecurity; + /* If subquery had session variables, now main query does too */ + parse->hasSessionVariables |= subquery->hasSessionVariables; + Thank you for the check and bug report. Your example was added to regress tests Regards Pavel
Re: Schema variables - new implementation for Postgres 15
Hi, On Sat, Nov 05, 2022 at 05:04:31PM +0100, Tomas Vondra wrote: > > I did a quick initial review of this patch series - attached is a > version with "review" commits for some of the parts. The current patch > seems in pretty good shape, most of what I noticed are minor issues. I > plan to do a more thorough review later. Thanks! I agree with all of your comments, just a few answers below > - NamesFromList and IdentifyVariable seem introduced unnecessarily > early, as they are only used in 0002 and 0003 parts (in the original > patch series). Not sure if the plan is to squash everything into a > single patch, or commit individual patches. The split was mostly done to make the patch easier to review, as it adds quite a bit of infrastructure. There have been some previous comments to have a more logical separation and fix similar issues, but there are still probably other oddities like that laying around. I personally didn't focus much on it as I don't know if the future committer will choose to squash everything or not. > - AFAIK patches don't need to modify typedefs.list. I think this was discussed a year or so ago, and my understanding is that the general rule is that it's now welcome, if not recommended, to maintain typedefs.list in each patchset. > Which I think means this: > > if (filter_lxid && svar->drop_lxid == MyProc->lxid) > continue; > > accesses drop_lxid, which was not initialized in init_session_variable. Agreed.