Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-09-21 Thread Amit Kapila
On Sat, Sep 22, 2018 at 2:28 AM Andres Freund  wrote:
>
> On 2018-09-22 08:54:57 +1200, Thomas Munro wrote:
> > On Fri, Sep 21, 2018 at 4:43 PM Tom Lane  wrote:
> > > Thomas Munro  writes:
> > > > On Fri, Sep 21, 2018 at 4:06 PM Tom Lane  wrote:
> > > >> Why would we fix it rather than just removing it?
> > >
> > > > I assumed we wouldn't remove an extern C function extension code
> > > > somewhere might use.  Though admittedly I'd be surprised if anyone
> > > > used this one.
> > >
> > > Unless it looks practical to support this behavior in the Windows
> > > and SysV cases, I think we should get rid of it rather than expend
> > > effort on supporting it for just some platforms.
> >
> > We can remove it in back-branches without breaking API compatibility:
> >
> > 1.  Change dsm_impl_can_resize() to return false unconditionally (I
> > suppose client code is supposed to check this before using
> > dsm_resize(), though I'm not sure why it has an "impl" in its name if
> > it's part of the public interface of this module).
> > 2.  Change dsm_resize() and dsm_remap() to raise an error unconditionally.
> > 3.  Rip out the DSM_OP_RESIZE cases from various places.
> >
> > Then in master, remove all of those functions completely.  However,
> > I'd feel like a bit of a vandal.  Robert and Amit probably had plans
> > for that code...?
>
> Robert, Amit: ^
>

I went through and checked the original proposal [1] to see if any use
case is mentioned there, but nothing related has been discussed.  I
couldn't think of much use for this facility except maybe for something
like parallelizing correlated sub-queries, where the size of the outer
var can change across executions and we might need to resize the
initially allocated memory.  This is just a wild thought; I don't have
any concrete idea about this.  Having said that, I don't object to
removing this, especially because the implementation doesn't seem to be
complete.  In the future, if someone needs such a facility, they can
first develop a complete version of this API.

[1] - 
https://www.postgresql.org/message-id/CA%2BTgmoaDqDUgt%3D4Zs_QPOnBt%3DEstEaVNP%2B5t%2Bm%3DFPNWshiPR3A%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] proposal: schema variables

2018-09-21 Thread Pavel Stehule
On Fri, Sep 21, 2018 at 21:46, Arthur Zakirov wrote:

> On Wed, Sep 19, 2018 at 04:36:40PM +0200, Pavel Stehule wrote:
> > ON COMMIT DROP is used only for temp variables (transactional or
> > not). The purpose is the same as for tables. Sometimes you want an
> > object with a shorter life than the session.
> >
> > ON TRANSACTION END RESET makes sense mainly for non-transactional
> > variables. I see two use cases.
> >
> > 1. Protecting some sensitive data - at transaction end, a guaranteed
> > reset and cleanup. So you can be sure that the variable is not
> > initialized (has its default value) unless you are inside a
> > transaction.
> >
> > 2. Automatic initialization - ON TRANSACTION END RESET ensures the
> > variable is in its init state for any transaction.
> >
> > Both cases make sense for transactional and non-transactional
> > variables.
> >
> > I am thinking that a transactional lifetime for the content makes
> > sense. It is cheaper to reset a variable than to drop it (which is
> > what ON COMMIT DROP does).
> >
> > What do you think?
>
> Thanks, I understood the cases.
>
> But I think it makes more sense to use these options only with
> transactional variables. It is more consistent and simpler for me.
>

I agree that it can be hard to imagine - and if I go back to the start
of the discussion about schema variables - it can be hard because it
joins some concepts: variables have persistent transactional metadata,
but the content is not transactional.

I don't think the variability is an issue in this case. There are a lot
of examples, so lots of combinations are possible - global temp tables
and package variables (Oracle), local temp tables and local variables
(Postgres), session variables and memory tables (MSSQL). Any
combination of features has cases where it can be very practical and
useful.

ON TRANSACTION END RESET can be useful, because we do not have session
event triggers (and at this moment I am not sure they are necessary and
practical - their usage can be very fragile). But some of that work can
be done by ON xxx clauses, which should not have a negative impact on
performance or fragility.

ON TRANSACTION END RESET ensures the variable is cleaned and
initialized to its default value for any transaction. Another
possibility is ON COMMAND END RESET (but I would not implement it
now), ...


> As a summary, it is 1 voice vs 1 voice :) So it is better to leave the
> syntax as is without changes for now.
>

:) Now there is enough time to think about the syntax. Some features
can be removed and brought back later, once this concept is better
absorbed.

Regards

Pavel


>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
>


Re: Proposal for disk quota feature

2018-09-21 Thread Pavel Stehule
On Fri, Sep 21, 2018 at 16:21, Hubert Zhang wrote:

> just fast reaction - why QUOTA object?
>> Isn't ALTER SET enough?
>> Some like
>> ALTER TABLE a1 SET quota = 1MB;
>> ALTER USER ...
>> ALTER SCHEMA ..
>> New DDL commands look like too hard a hammer.
>
>
> It's an option. I prefer to consider where the quota settings are
> stored: the CREATE DISK QUOTA way is more natural for storing quota
> settings in a separate pg_diskquota catalog, while the ALTER SET way
> is closer to storing quota settings in pg_class, pg_role,
> pg_namespace, etc. in an integrated way.
> (Note that by natural/close I don't mean a must; ALTER SET could also
> store in pg_diskquota, and vice versa.)
>

I don't have a problem with a new special table for storing this
information. But it looks redundant with the current GUC configuration
and limits. It can be messy to do some work with ALTER ROLE, and some
work via CREATE QUOTA.

Regards

Pavel


> Here are some differences I can think of:
> 1 pg_role is a global catalog, not per-database. It's harder to track
> the user's disk usage across the whole cluster (considering 1000+
> databases).  So the semantics of CREATE DISK QUOTA ON USER are
> limited: it only tracks the user's disk usage inside the current
> database.
> 2 Using a separate pg_diskquota could add more fields beyond the quota
> limit without adding too many fields to pg_class, e.g. a red zone to
> give the user a warning, or the current disk usage of the db objects.
>
> On Fri, Sep 21, 2018 at 8:01 PM Pavel Stehule 
> wrote:
>
>>
>>
>> On Fri, Sep 21, 2018 at 13:32, Hubert Zhang wrote:
>>
>>>
>>>
>>>
>>>
>>> Hi all,
>>>
>>> We redesigned the disk quota feature based on the comments from
>>> Pavel Stehule and Chapman Flack. Here is the new design.
>>>
>>> Overview
>>> Basically, the disk quota feature is used to support multi-tenancy
>>> environments: different levels of database objects can be given a
>>> quota limit to avoid overuse of disk space. A common case could be
>>> as follows: the DBA enables disk quota on a specified database list,
>>> and sets disk quota limits for tables/schemas/roles in these
>>> databases. A separate disk quota worker process will monitor the
>>> disk usage for these objects and detect the objects which exceed
>>> their quota limit. Queries loading data into these “out of disk
>>> quota” tables/schemas/roles will be cancelled.
>>>
>>> We are currently working at the init implementation stage. We would
>>> like to propose our idea first and get feedback from the community
>>> to do quick iteration.
>>>
>>> SQL Syntax (How to use disk quota)
>>> 1 Specify the databases with disk quota enabled in the GUC
>>> “diskquota_databases” in postgresql.conf and restart the database.
>>> 2 The DBA can set a disk quota limit for a table/schema/role:
>>> CREATE DISK QUOTA tablea1 ON TABLE a1 with (quota = ‘1MB’);
>>> CREATE DISK QUOTA roleu1 ON USER u1 with (quota = ‘1GB’);
>>> CREATE DISK QUOTA schemas1 ON SCHEMA s1 with (quota = ‘3MB’);
>>>
>>
>> just fast reaction - why QUOTA object?
>>
>> Isn't ALTER SET enough?
>>
>> Some like
>>
>> ALTER TABLE a1 SET quota = 1MB;
>> ALTER USER ...
>> ALTER SCHEMA ..
>>
>> New DDL commands look like too hard a hammer.
>>
>>
>>
>>>
>>> 3 Simulate a schema out-of-quota case: suppose table a1 and table
>>> a2 are both under schema s1.
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> INSERT INTO a2 SELECT generate_series(1,300);
>>> SELECT pg_sleep(5);
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> ERROR:  schema's disk space quota exceeded
>>> DROP TABLE a2;
>>> SELECT pg_sleep(5);
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> INSERT 0 1000
>>>
>>> Architecture
>>> Disk quota has the following components.
>>> 1. Quota Setting Store is where the disk quota settings are stored
>>> and accessed. We plan to use a catalog table pg_diskquota to store
>>> this information. pg_diskquota looks like:
>>> CATALOG(pg_diskquota,6122,DiskQuotaRelationId)
>>> {
>>>     NameData quotaname;     /* diskquota name */
>>>     int16 quotatype;        /* diskquota type name */
>>>     Oid quotatargetoid;     /* diskquota target db object oid */
>>>     int32 quotalimit;       /* diskquota size limit in MB */
>>>     int32 quotaredzone;     /* diskquota redzone in MB */
>>> } FormData_pg_diskquota;
>>> 2. Quota Change Detector is the monitor of size changes of database
>>> objects. We plan to use the stats collector to detect the ‘active’
>>> table list at the initial stage. But the stats collector has some
>>> limitations in finding an active table which is in a running
>>> transaction. For details see the TODO section.
>>> 3. Quota Size Checker is where the sizes are calculated and compared
>>> with the quota limits for database objects. According to Pavel’s
>>> comment, the autovacuum launcher and worker processes could be a
>>> good reference for disk quota. So we plan to use a disk quota
>>> launcher daemon process and several disk quota worker processes to
>>> finish this work. The launcher process is responsible for starting
>>> worker processes based on a user-defined database list from GUC. The
>>> worker process will connect to its

Re: vary read_only in SPI calls? or poke at the on-entry snapshot?

2018-09-21 Thread Chapman Flack
On 09/20/18 00:44, Tom Lane wrote:
>> 1. fiddle the loader to always pass read_only => false to SPI calls,
>>regardless of the volatility of the function it is loading for.
>> 2. leave the loader alone, and adjust install_jar (an infrequent
>>operation) to do something heretical with its on-entry snapshot.
> 
> I suspect #1 is less likely to have bad side-effects.  But I've not
> done any careful analysis.

Just noticed too: a thing prepared with the SPI_prepare... functions
(which take no read_only or snapshot parameters) later gets executed by
one of the SPI_execute... functions, which do take such parameters.

I assume the analysis and planning take place with reference to some
snapshot---necessarily? right?---(though on a quick skim of the code
I'm not yet sure what snapshot that is or how chosen), but the
parameters accepted by SPI_execute... can lead to a different snapshot
being in effect then.
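
For concreteness, the pattern in question looks like this (a minimal
sketch assuming an SPI-connected backend context; demo() is a
hypothetical function, not code from any patch):

#include "postgres.h"
#include "executor/spi.h"

static void
demo(void)
{
	SPIPlanPtr	plan;

	SPI_connect();

	/* no read_only or snapshot parameters are accepted at prepare time */
	plan = SPI_prepare("SELECT count(*) FROM pg_class", 0, NULL);

	/* ...but read_only, and hence snapshot handling, is chosen only here */
	SPI_execute_plan(plan, NULL, NULL, false /* read_only */ , 0);

	SPI_finish();
}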

Is that by design? Does it imply certain caveats I should be careful
about? Does it trigger any amount of replanning?

-Chap



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-22 09:56:00 +1200, Thomas Munro wrote:
> I vote for doing it this way then.  It may turn out to be useful for
> efficient SearchSysCache(...), DirectFunctionCall(...) and other
> things like that.

Yea, especially the *FunctionCall* stuff is awfully verbose.

I also wonder if it could make
http://archives.postgresql.org/message-id/20180605172952.x34m5uz6ju6enaem%40alap3.anarazel.de
a bit less painful.

Greetings,

Andres Freund



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-21 19:27:48 -0400, Tom Lane wrote:
> > You earlier were talking about tackling this - do you still want to? I
> > can otherwise, but it'll not be today, but likely tomorrow.
> 
> On it now.

Thanks, looks good.  msvc and icc are, as expected, ok too.

Greetings,

Andres Freund



Re: [patch] Support LLVM 7

2018-09-21 Thread Andres Freund
On 2018-09-20 23:08:04 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-09-20 <20180920210315.gb21...@msg.df7cb.de>
> > Server beendete die Verbindung unerwartet
> 
> Something ate the attachments. Sorry.
> 
> FATAL:  fatal llvm error: Cannot select: 0x57e61d40: ch,glue = X86ISD::CALL 
> 0x57e61cb0, 0x57e61e18, Register:i32 $edi, RegisterMask:Untyped, 0x57e61cb0:1
>   0x57e61e18: i32 = X86ISD::Wrapper TargetGlobalAddress:i32 <void (%struct.TupleTableSlot*)* @deform_0_1> 0
> 0x57e61dd0: i32 = TargetGlobalAddress <void (%struct.TupleTableSlot*)* @deform_0_1> 0
>   0x57e61c68: i32 = Register $edi
>   0x57e61cf8: Untyped = RegisterMask
>   0x57e61cb0: ch,glue = CopyToReg 0x57e61c20, Register:i32 $edi, 0x57e61b90
> 0x57e61c68: i32 = Register $edi
> 0x57e61b90: i32,ch = CopyFromReg 0x57e1fd3c, Register:i32 %27
>   0x57e61b48: i32 = Register %27
> In function: evalexpr_0_0
> Server beendete die Verbindung unerwartet

This looks like a reported LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=34268

I tried to ping a few people involved in x32 on the LLVM list - not sure
if that has much of a chance. FWIW, it doesn't just appear to be
relevant for JIT, but also outside of it:
https://bugs.llvm.org/show_bug.cgi?id=36743

Greetings,

Andres Freund



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Tom Lane
Andres Freund  writes:
> On 2018-09-21 18:00:35 -0400, Tom Lane wrote:
>> If you want to rename it, then to what?  VA_ARGS_NARGS, perhaps?

> I like your suggestion. I mainly didn't like the PP_ prefix.

Sold.  The original author overcomplicated it anyway; I now have

/*
 * VA_ARGS_NARGS
 *		Returns the number of macro arguments it is passed.
 *
 * An empty argument still counts as an argument, so effectively, this is
 * "one more than the number of commas in the argument list".
 *
 * This works for up to 63 arguments.  Internally, VA_ARGS_NARGS_() is passed
 * 64+N arguments, and the C99 standard only requires macros to allow up to
 * 127 arguments, so we can't portably go higher.  The implementation is
 * pretty trivial: VA_ARGS_NARGS_() returns its 64th argument, and we set up
 * the call so that that is the appropriate one of the list of constants.
 */
#define VA_ARGS_NARGS(...) \
VA_ARGS_NARGS_(__VA_ARGS__, \
   63,62,61,60,   \
   59,58,57,56,55,54,53,52,51,50, \
   49,48,47,46,45,44,43,42,41,40, \
   39,38,37,36,35,34,33,32,31,30, \
   29,28,27,26,25,24,23,22,21,20, \
   19,18,17,16,15,14,13,12,11,10, \
   9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define VA_ARGS_NARGS_( \
_01,_02,_03,_04,_05,_06,_07,_08,_09,_10, \
_11,_12,_13,_14,_15,_16,_17,_18,_19,_20, \
_21,_22,_23,_24,_25,_26,_27,_28,_29,_30, \
_31,_32,_33,_34,_35,_36,_37,_38,_39,_40, \
_41,_42,_43,_44,_45,_46,_47,_48,_49,_50, \
_51,_52,_53,_54,_55,_56,_57,_58,_59,_60, \
_61,_62,_63,  N, ...) \
(N)

from which it's pretty obvious that this really is a simple macro
call.  I'd first thought it involved macro recursion, but it doesn't.
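
For example, assuming the definition above is in scope (hypothetical
usage, not committed code):

/* Each one expands to an integer constant expression at compile time. */
int			one = VA_ARGS_NARGS(a);			/* 1 */
int			three = VA_ARGS_NARGS(a, b, c); /* 3 */
int			still_one = VA_ARGS_NARGS();	/* 1, not 0: an empty argument
											 * still counts as an argument */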

> You earlier were talking about tackling this - do you still want to? I
> can otherwise, but it'll not be today, but likely tomorrow.

On it now.

regards, tom lane



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Andres Freund
On 2018-09-21 18:00:35 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2018-09-22 09:15:27 +1200, Thomas Munro wrote:
> >> On Sat, Sep 22, 2018 at 8:51 AM Andres Freund  wrote:
> >>> I think there's some argument to be made about the "mental" complexity
> >>> of the macros - if we went for them, we'd certainly need to add some
> >>> docs about how they work.  One argument for having PP_NARGS (renamed) is
> >>> that it doesn't seem useful just here, but in a few other cases as well.
> 
> If you want to rename it, then to what?  VA_ARGS_NARGS, perhaps?

I like your suggestion. I mainly didn't like the PP_ prefix.


> >> It's a nice general facility to have in the tree.
> 
> Yeah, that's a fair point.
> 
> >> It seems to compile
> >> OK on clang, gcc, MSVC (I added this thread as CF entry 20/1798 as a
> >> lazy way to see if AppVeyor would build it OK, and it worked fine
> >> until conflicting commits landed).  I wonder if xlc, icc, aCC and Sun
> >> Studio can grok it.
> 
> > I think unless $compiler doesn't correctly implement vararg macros, it
> > really should just work.
> 
> Well, we'd find out pretty quickly if we try to use it here.

You earlier were talking about tackling this - do you still want to? I
can otherwise, but it'll not be today, but likely tomorrow.

Greetings,

Andres Freund



Re: transction_timestamp() inside of procedures

2018-09-21 Thread Bruce Momjian
On Fri, Sep 21, 2018 at 06:28:22AM -0400, Bruce Momjian wrote:
> On Fri, Sep 21, 2018 at 02:34:25PM +0900, Michael Paquier wrote:
> > On Thu, Sep 20, 2018 at 10:12:06PM -0700, Andres Freund wrote:
> > > Isn't the point that transaction_timestamp() does *not* currently change
> > > its value, even though the transaction (although not the outermost
> > > statement) has finished?
> > 
> > Ouch, yes.  I see the point now.  Indeed that's strange to not have a
> > new transaction timestamp after commit within the DO block..
> 
> So, this puts us in an odd position.  Right now everyone knows that
> statement_timestamp() is only changed by the outer statement, i.e., a
> SELECT in a function doesn't change statement_timestamp().   So, there
> is an argument that transaction_timestamp() should do the same and not
> change in a function --- in fact, if it does change, it would mean that
> transaction_timestamp() changes in a function, but statement_timestamp()
> doesn't --- that seems very odd.  It would mean that new statements in a
> function don't change statement_timestamp(), but new transactions in a
> function do --- again, very odd.

Sorry I was unclear about this.  It is only the third loop iteration
that proves transaction_timestamp() is not advancing:

NOTICE:  clock   2018-09-21 18:01:00.63704-04
NOTICE:  statement   2018-09-21 18:01:00.636509-04
NOTICE:  transaction 2018-09-21 18:01:00.636509-04

NOTICE:  clock   2018-09-21 18:01:02.640033-04
NOTICE:  statement   2018-09-21 18:01:00.636509-04
NOTICE:  transaction 2018-09-21 18:01:00.636509-04

NOTICE:  clock   2018-09-21 18:01:04.642266-04
NOTICE:  statement   2018-09-21 18:01:00.636509-04
--> NOTICE:  transaction 2018-09-21 18:01:00.636509-04

Keep in mind that transaction_timestamp() is CURRENT_TIMESTAMP.  

I have always thought of clock/statement/transaction as decreasing levels
of time precision, and it might be odd to change that.  I don't think we
want to change the behavior of statement_timestamp() in procedures, so
that kind of requires us not to change transaction_timestamp() inside of
procedures.

However, keeping the behavior unchanged causes a problem: if you have a
transaction block using transaction_timestamp() or CURRENT_TIMESTAMP
and you move it into a procedure, the behavior of those functions will
change.  But this was always true of moving statement_timestamp() into
a function, and I have never heard a complaint about that.

Does the SQL standard have anything to say about CURRENT_TIMESTAMP in
procedures?  Do we need another function that does advance on procedure
commit?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: [PATCH] Include application_name in "connection authorized" log message

2018-09-21 Thread Don Seiler
On Tue, Aug 7, 2018 at 12:32 PM Tom Lane  wrote:

> Don Seiler  writes:
>
> > 1. We want to make a generic, central ascii-lobotomizing function similar
> > to check_application_name that we can re-use there and for other checks
> (eg
> > user name).
> > 2. Change check_application_name to call this function (or just call this
> > function instead of check_application_name()?)
>
> check_application_name's API is dictated by the GUC check-hook interface,
> and doesn't really make sense for this other use.  So the first part of
> that, not the second.
>
> > 3. Call this function when storing the value in the port struct.
>
> I'm not sure where exactly is the most sensible place to call it,
> but trying to minimize the number of places that know about this
> kluge seems like a good principle.
>

OK I created a new function called clean_ascii() in common/string.c. I call
this from my new logic in postmaster.c as well as replacing the logic in
guc.c's check_application_name() and check_cluster_name().
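
The behavior of such a helper is roughly the following (an illustrative
sketch only; the actual clean_ascii() in the patch may differ):

/* Replace any byte outside printable ASCII with '?', in place. */
static void
clean_ascii_sketch(char *str)
{
	for (char *p = str; *p != '\0'; p++)
	{
		if (*p < 32 || *p > 126)
			*p = '?';
	}
}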

I've been fighting my own confusion with git and rebasing and fighting the
same conflicts over and over and over, but this patch should be what I
want. If anyone has time to review my git process, I would appreciate it. I
must be doing something wrong to have these same conflicts every time I
rebase (or I completely misunderstand what it actually does).

Thanks,
Don.

-- 
Don Seiler
www.seiler.us


0001-Changes-to-add-application_name-to-Port-struct-so-we.patch
Description: Binary data


Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Tom Lane
Andres Freund  writes:
> On 2018-09-22 09:15:27 +1200, Thomas Munro wrote:
>> On Sat, Sep 22, 2018 at 8:51 AM Andres Freund  wrote:
>>> I think there's some argument to be made about the "mental" complexity
>>> of the macros - if we went for them, we'd certainly need to add some
>>> docs about how they work.  One argument for having PP_NARGS (renamed) is
>>> that it doesn't seem useful just here, but in a few other cases as well.

If you want to rename it, then to what?  VA_ARGS_NARGS, perhaps?

>> It's a nice general facility to have in the tree.

Yeah, that's a fair point.

>> It seems to compile
>> OK on clang, gcc, MSVC (I added this thread as CF entry 20/1798 as a
>> lazy way to see if AppVeyor would build it OK, and it worked fine
>> until conflicting commits landed).  I wonder if xlc, icc, aCC and Sun
>> Studio can grok it.

> I think unless $compiler doesn't correctly implement vararg macros, it
> really should just work.

Well, we'd find out pretty quickly if we try to use it here.

regards, tom lane



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Thomas Munro
On Sat, Sep 22, 2018 at 9:46 AM Andres Freund  wrote:
> On 2018-09-22 09:15:27 +1200, Thomas Munro wrote:
> > On Sat, Sep 22, 2018 at 8:51 AM Andres Freund  wrote:
> > > I think there's some argument to be made about the "mental" complexity
> > > of the macros - if we went for them, we'd certainly need to add some
> > > docs about how they work.  One argument for having PP_NARGS (renamed) is
> > > that it doesn't seem useful just here, but in a few other cases as well.
> >
> > It's a nice general facility to have in the tree.  It seems to compile
> > OK on clang, gcc, MSVC (I added this thread as CF entry 20/1798 as a
> > lazy way to see if AppVeyor would build it OK, and it worked fine
> > until conflicting commits landed).  I wonder if xlc, icc, aCC and Sun
> > Studio can grok it.
>
> I think unless $compiler doesn't correctly implement vararg macros, it
> really should just work.  There's nothing but pretty smart use of
> actually pretty plain vararg macros.  If any of the other compilers have
> troubles with that, they'd really not implement vararg macros...

I vote for doing it this way then.  It may turn out to be useful for
efficient SearchSysCache(...), DirectFunctionCall(...) and other
things like that.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-22 09:15:27 +1200, Thomas Munro wrote:
> On Sat, Sep 22, 2018 at 8:51 AM Andres Freund  wrote:
> > I think there's some argument to be made about the "mental" complexity
> > of the macros - if we went for them, we'd certainly need to add some
> > docs about how they work.  One argument for having PP_NARGS (renamed) is
> > that it doesn't seem useful just here, but in a few other cases as well.
> 
> It's a nice general facility to have in the tree.  It seems to compile
> OK on clang, gcc, MSVC (I added this thread as CF entry 20/1798 as a
> lazy way to see if AppVeyor would build it OK, and it worked fine
> until conflicting commits landed).  I wonder if xlc, icc, aCC and Sun
> Studio can grok it.

I think unless $compiler doesn't correctly implement vararg macros, it
really should just work.  There's nothing but pretty smart use of
actually pretty plain vararg macros.  If any of the other compilers have
troubles with that, they'd really not implement vararg macros...

Greetings,

Andres Freund



Re: Proposal for disk quota feature

2018-09-21 Thread Jeremy Finzel
On Fri, Sep 21, 2018 at 9:21 AM Hubert Zhang  wrote:

> just fast reaction - why QUOTA object?
>> Isn't ALTER SET enough?
>> Some like
>> ALTER TABLE a1 SET quota = 1MB;
>> ALTER USER ...
>> ALTER SCHEMA ..
>> New DDL commands look like too hard a hammer.
>
>
> It's an option. I prefer to consider where the quota settings are
> stored: the CREATE DISK QUOTA way is more natural for storing quota
> settings in a separate pg_diskquota catalog, while the ALTER SET way
> is closer to storing quota settings in pg_class, pg_role,
> pg_namespace, etc. in an integrated way.
> (Note that by natural/close I don't mean a must; ALTER SET could also
> store in pg_diskquota, and vice versa.)
>
> Here are some differences I can think of:
> 1 pg_role is a global catalog, not per-database. It's harder to track
> the user's disk usage across the whole cluster (considering 1000+
> databases).  So the semantics of CREATE DISK QUOTA ON USER are
> limited: it only tracks the user's disk usage inside the current
> database.
> 2 Using a separate pg_diskquota could add more fields beyond the quota
> limit without adding too many fields to pg_class, e.g. a red zone to
> give the user a warning, or the current disk usage of the db objects.
>
> On Fri, Sep 21, 2018 at 8:01 PM Pavel Stehule 
> wrote:
>
>>
>>
>> On Fri, Sep 21, 2018 at 13:32, Hubert Zhang wrote:
>>
>>>
>>>
>>>
>>>
>>> Hi all,
>>>
>>> We redesigned the disk quota feature based on the comments from
>>> Pavel Stehule and Chapman Flack. Here is the new design.
>>>
>>> Overview
>>> Basically, the disk quota feature is used to support multi-tenancy
>>> environments: different levels of database objects can be given a
>>> quota limit to avoid overuse of disk space. A common case could be
>>> as follows: the DBA enables disk quota on a specified database list,
>>> and sets disk quota limits for tables/schemas/roles in these
>>> databases. A separate disk quota worker process will monitor the
>>> disk usage for these objects and detect the objects which exceed
>>> their quota limit. Queries loading data into these “out of disk
>>> quota” tables/schemas/roles will be cancelled.
>>>
>>> We are currently working at the init implementation stage. We would
>>> like to propose our idea first and get feedback from the community
>>> to do quick iteration.
>>>
>>> SQL Syntax (How to use disk quota)
>>> 1 Specify the databases with disk quota enabled in the GUC
>>> “diskquota_databases” in postgresql.conf and restart the database.
>>> 2 The DBA can set a disk quota limit for a table/schema/role:
>>> CREATE DISK QUOTA tablea1 ON TABLE a1 with (quota = ‘1MB’);
>>> CREATE DISK QUOTA roleu1 ON USER u1 with (quota = ‘1GB’);
>>> CREATE DISK QUOTA schemas1 ON SCHEMA s1 with (quota = ‘3MB’);
>>>
>>
>> just fast reaction - why QUOTA object?
>>
>> Isn't ALTER SET enough?
>>
>> Some like
>>
>> ALTER TABLE a1 SET quota = 1MB;
>> ALTER USER ...
>> ALTER SCHEMA ..
>>
>> New DDL commands look like too hard a hammer.
>>
>>
>>
>>>
>>> 3 Simulate a schema out-of-quota case: suppose table a1 and table
>>> a2 are both under schema s1.
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> INSERT INTO a2 SELECT generate_series(1,300);
>>> SELECT pg_sleep(5);
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> ERROR:  schema's disk space quota exceeded
>>> DROP TABLE a2;
>>> SELECT pg_sleep(5);
>>> INSERT INTO a1 SELECT generate_series(1,1000);
>>> INSERT 0 1000
>>>
>>> Architecture
>>> Disk quota has the following components.
>>> 1. Quota Setting Store is where the disk quota settings are stored
>>> and accessed. We plan to use a catalog table pg_diskquota to store
>>> this information. pg_diskquota looks like:
>>> CATALOG(pg_diskquota,6122,DiskQuotaRelationId)
>>> {
>>>     NameData quotaname;     /* diskquota name */
>>>     int16 quotatype;        /* diskquota type name */
>>>     Oid quotatargetoid;     /* diskquota target db object oid */
>>>     int32 quotalimit;       /* diskquota size limit in MB */
>>>     int32 quotaredzone;     /* diskquota redzone in MB */
>>> } FormData_pg_diskquota;
>>> 2. Quota Change Detector is the monitor of size changes of database
>>> objects. We plan to use the stats collector to detect the ‘active’
>>> table list at the initial stage. But the stats collector has some
>>> limitations in finding an active table which is in a running
>>> transaction. For details see the TODO section.
>>> 3. Quota Size Checker is where the sizes are calculated and compared
>>> with the quota limits for database objects. According to Pavel’s
>>> comment, the autovacuum launcher and worker processes could be a
>>> good reference for disk quota. So we plan to use a disk quota
>>> launcher daemon process and several disk quota worker processes to
>>> finish this work. The launcher process is responsible for starting
>>> worker processes based on a user-defined database list from GUC. The
>>> worker process will connect to its target database and monitor the
>>> disk usage for objects in this database. In the init stage of a
>>> worker process, it will call calculate_total_relation_size() to
>>> calculate the size for each user table. After the init stage, the
>>> worker process will

Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Thomas Munro
On Sat, Sep 22, 2018 at 8:51 AM Andres Freund  wrote:
> I think there's some argument to be made about the "mental" complexity
> of the macros - if we went for them, we'd certainly need to add some
> docs about how they work.  One argument for having PP_NARGS (renamed) is
> that it doesn't seem useful just here, but in a few other cases as well.

It's a nice general facility to have in the tree.  It seems to compile
OK on clang, gcc, MSVC (I added this thread as CF entry 20/1798 as a
lazy way to see if AppVeyor would build it OK, and it worked fine
until conflicting commits landed).  I wonder if xlc, icc, aCC and Sun
Studio can grok it.

--
Thomas Munro
http://www.enterprisedb.com



Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-09-21 Thread Andres Freund
On 2018-09-22 08:54:57 +1200, Thomas Munro wrote:
> On Fri, Sep 21, 2018 at 4:43 PM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > On Fri, Sep 21, 2018 at 4:06 PM Tom Lane  wrote:
> > >> Why would we fix it rather than just removing it?
> >
> > > I assumed we wouldn't remove an extern C function extension code
> > > somewhere might use.  Though admittedly I'd be surprised if anyone
> > > used this one.
> >
> > Unless it looks practical to support this behavior in the Windows
> > and SysV cases, I think we should get rid of it rather than expend
> > effort on supporting it for just some platforms.
> 
> We can remove it in back-branches without breaking API compatibility:
> 
> 1.  Change dsm_impl_can_resize() to return false unconditionally (I
> suppose client code is supposed to check this before using
> dsm_resize(), though I'm not sure why it has an "impl" in its name if
> it's part of the public interface of this module).
> 2.  Change dsm_resize() and dsm_remap() to raise an error unconditionally.
> 3.  Rip out the DSM_OP_RESIZE cases from various places.
> 
> Then in master, remove all of those functions completely.  However,
> I'd feel like a bit of a vandal.  Robert and Amit probably had plans
> for that code...?

Robert, Amit: ^

Greetings,

Andres Freund



Re: Strange failure in LWLock on skink in REL9_5_STABLE

2018-09-21 Thread Thomas Munro
On Fri, Sep 21, 2018 at 4:43 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > On Fri, Sep 21, 2018 at 4:06 PM Tom Lane  wrote:
> >> Why would we fix it rather than just removing it?
>
> > I assumed we wouldn't remove an extern C function extension code
> > somewhere might use.  Though admittedly I'd be surprised if anyone
> > used this one.
>
> Unless it looks practical to support this behavior in the Windows
> and SysV cases, I think we should get rid of it rather than expend
> effort on supporting it for just some platforms.

We can remove it in back-branches without breaking API compatibility:

1.  Change dsm_impl_can_resize() to return false unconditionally (I
suppose client code is supposed to check this before using
dsm_resize(), though I'm not sure why it has an "impl" in its name if
it's part of the public interface of this module).
2.  Change dsm_resize() and dsm_remap() to raise an error unconditionally.
3.  Rip out the DSM_OP_RESIZE cases from various places.

Then in master, remove all of those functions completely.  However,
I'd feel like a bit of a vandal.  Robert and Amit probably had plans
for that code...?
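
For reference, steps 1 and 2 would presumably amount to stubs along
these lines, assuming the existing declarations in dsm.h and dsm_impl.h
(a hypothetical sketch, not a tested patch):

bool
dsm_impl_can_resize(void)
{
	return false;				/* step 1: report resize as unsupported */
}

void *
dsm_resize(dsm_segment *seg, Size size)
{
	/* step 2: dsm_remap() would get the same treatment */
	ereport(ERROR,
			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
			 errmsg("dsm_resize is not supported")));
	return NULL;				/* keep compiler quiet */
}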

--
Thomas Munro
http://www.enterprisedb.com



Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-21 16:20:42 -0400, Tom Lane wrote:
> Thomas Munro  writes:
> > On Fri, Sep 21, 2018 at 5:52 PM Andres Freund  wrote:
> >> Here's a very quick-and-dirty implementation of this approach. Some very
> >> very brief testing seems to indicate it works, although I'm sure not
> >> perfectly.
> 
> > And here is a quick-and-dirty variadic COMPLETE_WITH(...).  Together:
> 
> I'm a bit inclined to get rid of the need for PP_NARG() by instead making
> the macros add a trailing NULL argument, viz
> 
> #define TailMatches(...) \
>   CheckTailMatchesFor(previous_words_count, previous_words, __VA_ARGS__, NULL)

I don't think that'd *quite* work right now - MatchAny is also NULL. We
probably could make it work by redefining either MatchAny or the last
argument to CheckTailMatchesFor() to some other value however.


> This'd require (some of) the implementation functions to iterate over
> their variadic arguments twice, the first time merely counting how many
> there are.

Yea, leaving the above problem aside, I've a hard time to get excited
about that overhead.


> But we aren't exactly concerned about max runtime performance
> here, and the PP_NARG() thing seems like a crock that could easily blow
> out compilation time on some compilers.

It's not actually that complicated an expansion, and we've not
encountered many problems with expansions that are similarly complex,
e.g. ereport et al.  It's pretty likely at least as cheap as the current
approach, where we sometimes have 9 deep recursion. So I'm not too
concerned about the compile-time performance. FWIW, I tested it with gcc
-E yesterday, and I couldn't measure any difference.

I think there's some argument to be made about the "mental" complexity
of the macros - if we went for them, we'd certainly need to add some
docs about how they work.  One argument for having PP_NARGS (renamed) is
that it doesn't seem useful just here, but in a few other cases as well.

Greetings,

Andres Freund



Re: Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE

2018-09-21 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Sep 21, 2018 at 5:52 PM Andres Freund  wrote:
>> Here's a very quick-and-dirty implementation of this approach. Some very
>> very brief testing seems to indicate it works, although I'm sure not
>> perfectly.

> And here is a quick-and-dirty variadic COMPLETE_WITH(...).  Together:

I'm a bit inclined to get rid of the need for PP_NARG() by instead making
the macros add a trailing NULL argument, viz

#define TailMatches(...) \
  CheckTailMatchesFor(previous_words_count, previous_words, __VA_ARGS__, NULL)

This'd require (some of) the implementation functions to iterate over
their variadic arguments twice, the first time merely counting how many
there are.  But we aren't exactly concerned about max runtime performance
here, and the PP_NARG() thing seems like a crock that could easily blow
out compilation time on some compilers.
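
In outline, the counting pass could look like this; a stand-alone
sketch of the idea, not psql code (count_args and TailMatchesSketch are
hypothetical, and as Andres notes elsewhere in the thread, MatchAny
also being NULL would need handling):

#include <stdarg.h>
#include <stddef.h>

/* Count the string arguments up to the trailing NULL sentinel. */
static int
count_args(const char *first, ...)
{
	va_list		args;
	int			n = 0;

	va_start(args, first);
	for (const char *arg = first; arg != NULL;
		 arg = va_arg(args, const char *))
		n++;
	va_end(args);
	return n;
}

#define TailMatchesSketch(...) count_args(__VA_ARGS__, NULL)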

If people are OK with that design decision, I'm happy to assemble these
pieces and push it through.  I just had to add two more versions of
HeadMatchesN :-( so I'm feeling motivated to make something happen.

regards, tom lane



Re: Redundant psql tab-completion for CREATE PUBLICATION

2018-09-21 Thread Tom Lane
[ This seems to have slipped through the cracks, sorry about that ]

Edmund Horner  writes:
> While looking at Justin's patch for VACUUM completions, I found an
> existing bit of code that tries to match on a word with a space:
>/* Complete "CREATE PUBLICATION <name> FOR TABLE <table>" */
> else if (Matches4("CREATE", "PUBLICATION", MatchAny, "FOR TABLE"))
> COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);

> I think the clause will never match the "FOR TABLE" word;

Yeah, that's certainly right.

> and can, in
> any case, be removed, as the the final check for completable "things"
> (at the end of psql_completion) will see "TABLE" and act
> appropriately.

But that only works for the table name immediately after TABLE, not
for a list of table names.  I think we need to fix it like this,
instead:

-   /* Complete "CREATE PUBLICATION  FOR TABLE " */
-   else if (Matches4("CREATE", "PUBLICATION", MatchAny, "FOR TABLE"))
+   /* Complete "CREATE PUBLICATION  FOR TABLE , ..." */
+   else if (HeadMatches5("CREATE", "PUBLICATION", MatchAny, "FOR", 
"TABLE"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);

Pushed like that.

regards, tom lane



Re: [HACKERS] proposal: schema variables

2018-09-21 Thread Arthur Zakirov
On Wed, Sep 19, 2018 at 04:36:40PM +0200, Pavel Stehule wrote:
> ON COMMIT DROP is used only for temp variables (transactional or
> not). The purpose is the same as for tables. Sometimes you want an
> object with a shorter life than the session.
> 
> ON TRANSACTION END RESET makes sense mainly for non-transactional
> variables. I see two use cases.
> 
> 1. Protecting some sensitive data - at transaction end, a guaranteed
> reset and cleanup. So you can be sure that the variable is not
> initialized (has its default value) unless you are inside a
> transaction.
> 
> 2. Automatic initialization - ON TRANSACTION END RESET ensures the
> variable is in its init state for any transaction.
> 
> Both cases make sense for transactional and non-transactional
> variables.
> 
> I am thinking that a transactional lifetime for the content makes
> sense. It is cheaper to reset a variable than to drop it (which is
> what ON COMMIT DROP does).
> 
> What do you think?

Thanks, I understood the cases.

But I think it makes more sense to use these options only with
transactional variables. It is more consistent and simpler for me.

As a summary, it is 1 voice vs 1 voice :) So it is better to leave the
syntax as is without changes for now.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: adding tab completions

2018-09-21 Thread Tom Lane
I wrote:
> The main thing that is bothering me about the remainder is its desire
> to offer single-punctuation-character completions such as "(".  I do
> not see the point of that.  You can't select a completion without
> typing at least one character, so what does it accomplish to offer
> those options, except clutter?

Actually, after poking at it for awhile, I noticed there was a much bigger
problem: tests like

else if (HeadMatches2("VACUUM", "("))

would continue to fire even if the option list was complete, so that
after typing say

vacuum ( verbose ) 

you would not get offered any table names, only option names.

I experimented with various fixes for that, but the only one that
worked required extending word_matches() to allow a wildcard in the
middle of a pattern.  (Before it only allowed one at the end; but
it takes just a couple more lines of code to improve that.)  With
that, we can do

else if (HeadMatches2("VACUUM", "(*") &&
 !HeadMatches2("VACUUM", "(*)"))

and this test will not trigger if the option list is complete.
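
In isolation, the extended matching rule amounts to something like this
(a simplified sketch allowing a single '*' anywhere in the pattern and
ignoring psql's case-folding; not the actual word_matches code):

#include <stdbool.h>
#include <string.h>

static bool
wild_matches(const char *pattern, const char *word)
{
	const char *star = strchr(pattern, '*');
	size_t		pre,
				suf,
				wlen;

	if (star == NULL)
		return strcmp(pattern, word) == 0;

	/* match the literal prefix before '*' and the literal suffix after it */
	pre = (size_t) (star - pattern);
	suf = strlen(star + 1);
	wlen = strlen(word);

	return wlen >= pre + suf &&
		strncmp(pattern, word, pre) == 0 &&
		strcmp(star + 1, word + wlen - suf) == 0;
}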

I've gone ahead and pushed it with those changes.

regards, tom lane



Re: doc - add missing documentation for "acldefault"

2018-09-21 Thread Joe Conway
On 09/19/2018 11:18 AM, Joe Conway wrote:
> On 09/19/2018 10:54 AM, Tom Lane wrote:
>> Joe Conway  writes:
>>> * I do believe aclitemeq() has utility outside internal purposes.
>> 
>> Our normal policy is that we do not document functions that are meant to
>> be invoked through operators.  The \df output saying that is sufficient:
> 
> 
> 
>> I would strongly object to ignoring that policy in just one place.
> 
> Ok, fair enough.
> 
>> Actually, it appears that most of these functions have associated
>> operators:
>> 
>> # select oid::regoperator, oprcode from pg_operator where oprright = 
>> 'aclitem'::regtype;
>>   oid  |   oprcode   
>> ---+-
>>  +(aclitem[],aclitem)  | aclinsert
>>  -(aclitem[],aclitem)  | aclremove
>>  @>(aclitem[],aclitem) | aclcontains
>>  =(aclitem,aclitem)| aclitemeq
>>  ~(aclitem[],aclitem)  | aclcontains
>> (5 rows)
>> 
>> So maybe what we really need is a table of operators not functions.
> 
> Good idea -- I will take a look at that.
> 
>> However, I don't object to documenting any function that has its
>> own pg_description string.

Ok, so the attached version refactors/splits the group into two tables
-- operators and functions.

It drops aclinsert and aclremove entirely due to the fact that they no
longer do anything useful, to wit:
-
Datum
aclinsert(PG_FUNCTION_ARGS)
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("aclinsert is no longer supported")));

PG_RETURN_NULL();   /* keep compiler quiet */
}

Datum
aclremove(PG_FUNCTION_ARGS)
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("aclremove is no longer supported")));

PG_RETURN_NULL();   /* keep compiler quiet */
}
-

I also included John Naylor's patch with some minor editorialization.

Any further comments or complaints?

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4331beb..8ebb1bf 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** SELECT * FROM pg_ls_dir('.') WITH ORDINA
*** 15962,15968 
   
  
   
!   System Information Functions
  

  shows several
--- 15962,15968 
   
  
   
!   System Information Functions and Operators
  

  shows several
*** SELECT has_function_privilege('joeuser',
*** 16894,16899 
--- 16894,17034 
 
  

+ shows the operators
+available for the aclitem type, which is the internal
+representation of access privileges. An aclitem entry
+describes the permissions of a grantee, whether they are grantable
+or not, and which grantor granted them. For instance,
+calvin=r*w/hobbes specifies that the role
+calvin has the grantable privilege
+SELECT (r*) and the non-grantable
+privilege UPDATE (w), granted by
+the role hobbes. An empty grantee stands for
+PUBLIC.
+   
+ 
+
+ aclitem
+
+
+ acldefault
+
+
+ aclitemeq
+
+
+ aclcontains
+
+
+ aclexplode
+
+
+ makeaclitem
+
+ 
+ 
+  aclitem Operators
+  
+   
+
+ Operator
+ Description
+ Example
+ Result
+
+   
+   
+ 
+
+  = 
+ equal
+ 'calvin=r*w/hobbes'::aclitem = 'calvin=r*w*/hobbes'::aclitem
+ f
+
+ 
+
+  @> 
+ contains element
+ '{calvin=r*w/hobbes,hobbes=r*w*/postgres}'::aclitem[] @> 'calvin=r*w/hobbes'::aclitem
+ t
+
+ 
+
+  ~ 
+ contains element
+ '{calvin=r*w/hobbes,hobbes=r*w*/postgres}'::aclitem[] ~ 'calvin=r*w/hobbes'::aclitem
+ t
+
+ 
+   
+  
+ 
+ 
+
+  shows some additional
+ functions to manage the aclitem type.
+
+ 
+
+ aclitem Functions
+ 
+  
+   Name Return Type Description
+  
+  
+   
+acldefault(type,
+   ownerId)
+aclitem[]
+get the hardcoded default access privileges for an object belonging to ownerId
+   
+   
+aclexplode(aclitem[])
+setof record
+get aclitem array as tuples
+   
+   
+makeaclitem(grantee, grantor, privilege, grantable)
+aclitem
+build an aclitem from input
+   
+  
+ 
+
+ 
+
+ acldefault returns the hardcoded default access privileges
+ for an object of type belonging to role ownerId.
+ Notice that these are used in the absence of any pg_default_acl
+ () entry. Default access privileges are described in
+  and can be overwritten with
+ . In other words, this function will return
+ results which may be misleading when the defaults have been 

Re: pgsql: Improve autovacuum logging for aggressive and anti-wraparound ru

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-21 20:38:16 +0300, Sergei Kornilov wrote:
> > My first question was whether TWO of them were dead code ... isn't an
> > aggressive vacuum to prevent wraparound, and a vacuum to prevent
> > wraparound aggressive?
> Maybe I am wrong; aggressive autovacuum was your commit.
> The message split was in b55509332f50f998b6e8b3830a51c5b9d8f666aa
> Aggressive autovacuum was in fd31cd265138019d9b5fe53043670898bc9f
> 
> If aggressive really is the same as wraparound - I think we need to
> refactor this code; it is difficult to have two different flags for the
> same purpose.
> 
> But as far as I can see, it is possible to have an aggressive
> non-wraparound vacuum. One important difference - regular and
> aggressive regular vacuums can be canceled by a backend; wraparound
> autovacuum can not (by checking PROC_VACUUM_FOR_WRAPAROUND in
> src/backend/storage/lmgr/proc.c).

Yes, without checking the code, they should be different. Aggressive is
controlled by vacuum_freeze_table_age whereas anti-wrap is controlled by
autovacuum_freeze_max_age (but also implies aggressive).

Greetings,

Andres Freund



Re: pgsql: Improve autovacuum logging for aggressive and anti-wraparound ru

2018-09-21 Thread Sergei Kornilov
Hello, Robert

> My first question was whether TWO of them were dead code ... isn't an
> aggressive vacuum to prevent wraparound, and a vacuum to prevent
> wraparound aggressive?
Maybe I am wrong; aggressive autovacuum was your commit.
The message split was in b55509332f50f998b6e8b3830a51c5b9d8f666aa
Aggressive autovacuum was in fd31cd265138019d9b5fe53043670898bc9f

If aggressive really is the same as wraparound - I think we need to
refactor this code; it is difficult to have two different flags for the
same purpose.

But as far as I can see, it is possible to have an aggressive
non-wraparound vacuum. One important difference - regular and
aggressive regular vacuums can be canceled by a backend; wraparound
autovacuum can not (by checking PROC_VACUUM_FOR_WRAPAROUND in
src/backend/storage/lmgr/proc.c).
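
A condensed sketch of the check being referenced (the real logic in
proc.c is more involved; treat this as an illustration only):

if ((vacuumFlags & PROC_IS_AUTOVACUUM) &&
	!(vacuumFlags & PROC_VACUUM_FOR_WRAPAROUND))
{
	/* plain or aggressive autovacuum: the blocked backend may cancel it */
}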

regards, Sergei



pg_atomic_exchange_u32() in ProcArrayGroupClearXid()

2018-09-21 Thread Alexander Korotkov
Hi!

While investigating the ProcArrayGroupClearXid() code, I wondered why
we have this loop instead of a plain pg_atomic_exchange_u32() call.  Is
it intentional?

/*
* Now that we've got the lock, clear the list of processes waiting for
* group XID clearing, saving a pointer to the head of the list.  Trying
* to pop elements one at a time could lead to an ABA problem.
*/
while (true)
{
	nextidx = pg_atomic_read_u32(&procglobal->procArrayGroupFirst);
	if (pg_atomic_compare_exchange_u32(&procglobal->procArrayGroupFirst,
									   &nextidx,
									   INVALID_PGPROCNO))
		break;
}
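
For comparison, the single-call form would be
pg_atomic_exchange_u32(&procglobal->procArrayGroupFirst,
INVALID_PGPROCNO).  A stand-alone C11 illustration of why both forms
atomically install the new value and return the old one (plain C
atomics here, not PostgreSQL's wrappers):

#include <stdatomic.h>
#include <stdio.h>

int
main(void)
{
	atomic_uint head;
	unsigned int next;

	/* CAS loop, shaped like the one above */
	atomic_init(&head, 42);
	next = atomic_load(&head);
	while (!atomic_compare_exchange_strong(&head, &next, 0u))
		;						/* on failure, 'next' is reloaded for us */
	printf("old head via CAS loop: %u\n", next);	/* 42 */

	/* plain exchange */
	atomic_init(&head, 42);
	next = atomic_exchange(&head, 0u);
	printf("old head via exchange: %u\n", next);	/* 42 */

	return 0;
}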

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: proposal: prefix function

2018-09-21 Thread Pavel Stehule
Hi

On Fri, Sep 21, 2018 at 12:37, Chris Travers wrote:

>
>
> On Fri, Sep 21, 2018 at 10:09 AM Pavel Stehule 
> wrote:
>
>> Hi
>>
>> can we implement a prefix function to quickly test whether substr is
>> a prefix of some string?
>>
>> create or replace function prefix(str text, substr text)
>> returns boolean as $$
>>   select substr(str, 1, length(substr)) = substr
>> $$ language sql;
>>
>> This function can be very efficient in C. Currently it has to be
>> implemented with LIKE or a regexp, which is significantly more expensive.
>>
>>
> These would just be wrappers around already existing internal functions,
> right?
>

It already exists - I forgot:

starts_with

It is a PostgreSQL 11 feature, commit 710d90da1fd8c1d028215ecaf7402062079e99e9.

I like it when I can do nothing :)

Pavel



>
>> Regards
>>
>> Pavel
>>
>
>
> --
> Best Regards,
> Chris Travers
> Head of Database
>
> Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
> Saarbrücker Straße 37a, 10405 Berlin
>
>


Re: pgsql: Improve autovacuum logging for aggressive and anti-wraparound ru

2018-09-21 Thread Robert Haas
On Fri, Sep 14, 2018 at 11:35 AM, Alvaro Herrera
 wrote:
> On 2018-Sep-13, Michael Paquier wrote:
>> Improve autovacuum logging for aggressive and anti-wraparound runs
>>
>> A log message was being generated when log_min_duration is reached for
>> autovacuum on a given relation to indicate if it was an aggressive run,
>> and missed the point of mentioning if it is doing an anti-wrapround
>> run.  The log message generated is improved so as one, both or no extra
>> details are added depending on the option set.
>
> Hmm, can a for-wraparound vacuum really not be aggressive?  I think one
> of those four cases is really dead code.

My first question was whether TWO of them were dead code ... isn't an
aggressive vacuum to prevent wraparound, and a vacuum to prevent
wraparound aggressive?

I can't figure out what this is giving us that we didn't have before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PATCH: pgbench - option to build using ppoll() for larger connection counts

2018-09-21 Thread Tom Lane
I wrote:
> So ... why exactly is this patch insisting on ppoll() rather than just
> plain poll()?  AFAICS, all you're doing with that is being able to
> specify the timeout in microsec not millisec, which does not really
> justify taking much of a hit in portability, to my mind.

To check into my assumptions here, I did a bit of testing to find out
what ppoll can really do in terms of timeout accuracy.  As best I can
tell, its resolution is on the order of 100 usec on both fairly new
Linux kernels (4.17.19 on Fedora 28) and fairly old ones (2.6.32 on
RHEL6).  So there's not really that much daylight between what you can
do with ppoll() and with poll()'s millisecond resolution.  select(2)
seems to have about the same effective timeout resolution.
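
To make the units concrete (a hypothetical fragment, not pgbench code):

#define _GNU_SOURCE				/* ppoll() is a Linux extension, not POSIX */
#include <poll.h>
#include <time.h>

static void
wait_briefly(struct pollfd *fds, nfds_t nfds)
{
	struct timespec ts = {0, 100 * 1000};	/* 100 usec is expressible here */

	(void) ppoll(fds, nfds, &ts, NULL);		/* NULL sigmask: acts like poll */

	(void) poll(fds, nfds, 1);	/* with poll(), 1 ms is the finest
								 * nonzero timeout */
}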

Interestingly, select(2)'s timeout resolution is noticeably better than
that, around 10 usec, on recent macOS.  Didn't try other platforms.

Also, I'd misunderstood the portability angle.  Unlike pselect(),
ppoll() is *not* in POSIX.  It started as a Linux-ism, although
I see it's appeared recently in FreeBSD.  That somewhat assuages
my fear of broken implementations on specific platforms, but it
also means that it's going to be far from universally available.

So the question here really boils down to whether you think that a
large set of pgbench connections is interesting on non-Linux platforms.
There's a case to be made that it isn't, perhaps, but I'm not exactly
sold.

On the other hand, while we could certainly make a poll-based code path
within this patch, I'm not quite sure what we'd do with it.  My results
for macOS indicate that using poll rather than select would create a
tradeoff: in return for possibly allowing more clients, there would be
a definite loss in timeout precision.  I don't think that limiting
\sleep commands to ms precision would be so awful, but it's easier
to believe that loss of precision in the transaction dispatch rate for
"--rate" tests could be a problem for some people.  So we might have
to expose the choice of which call to use to users, which seems like
a mess.

So maybe what we've got here --- make it better on Linux, with no
change elsewhere --- is about as good as we can hope for.

Also, I notice that the kevent syscall available on macOS and some
BSDen uses a struct-timespec timeout parameter, ie, theoretical nsec
resolution same as ppoll.  So there's a case to be made that where
we should go for those platforms is to build a kevent-based code
path not a poll-based one.  It'd be unreasonable to insist on this
patch providing that, though.

Anyway, bottom line: my objection on Wednesday was mainly based on
the assumption that we might have to contend with broken ppoll on
some platforms, which now appears to be fallacious.  So, objection
withdrawn.

regards, tom lane



[patch] Bug in pg_dump/pg_restore using --no-publication

2018-09-21 Thread Gilles Darold
Hi,


Attached is a patch that fixes a bug in pg_dump present since 10.0 and
reproducible in master. When using the option --no-publication, ALTER
PUBLICATION commands are still present in the dump.


Steps to reproduce:

postgres=# CREATE DATABASE test;
CREATE DATABASE
postgres=# \c test
You are now connected to database "test" as user "postgres".
test=# create table t1 (id integer);
CREATE TABLE
test=# CREATE PUBLICATION p1 FOR TABLE t1;
CREATE PUBLICATION
test=# CREATE PUBLICATION p_all FOR ALL TABLES;
CREATE PUBLICATION
test=# CREATE PUBLICATION p_insert_only FOR TABLE t1 WITH (publish =
'insert');
CREATE PUBLICATION
test=# \q

$ pg_dump -p 5400 --no-publication test | grep PUBLICATION
-- Name: p1 t1; Type: PUBLICATION TABLE; Schema: public; Owner:
ALTER PUBLICATION p1 ADD TABLE ONLY public.t1;
-- Name: p_insert_only t1; Type: PUBLICATION TABLE; Schema: public; Owner:
ALTER PUBLICATION p_insert_only ADD TABLE ONLY public.t1;


pg_restore suffers the same problem:

pg_restore --no-publication test.dump | grep PUBLICATION
-- Name: p1 t1; Type: PUBLICATION TABLE; Schema: public; Owner:
ALTER PUBLICATION p1 ADD TABLE ONLY public.t1;
-- Name: p_insert_only t1; Type: PUBLICATION TABLE; Schema: public; Owner:
ALTER PUBLICATION p_insert_only ADD TABLE ONLY public.t1;

pg_restore --no-publication test.dump -l test.dump| grep PUBLICATION
2230; 6106 16389 PUBLICATION TABLE public p1 t1
2231; 6106 16392 PUBLICATION TABLE public p_insert_only t1


Should I add it to the current commitfest?


-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 7d1d439ba2..2e86664c17 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -2920,7 +2920,8 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		return 0;
 
 	/* If it's a publication, maybe ignore it */
-	if (ropt->no_publications && strcmp(te->desc, "PUBLICATION") == 0)
+	if (ropt->no_publications && (strcmp(te->desc, "PUBLICATION") == 0
+		|| strcmp(te->desc, "PUBLICATION TABLE") == 0))
 		return 0;
 
 	/* If it's a security label, maybe ignore it */
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 0687a81914..c8d01ed4a4 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -3907,6 +3907,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 	PQExpBuffer query;
 	PGresult   *res;
 	PublicationRelInfo *pubrinfo;
+	DumpOptions *dopt = fout->dopt;
 	int			i_tableoid;
 	int			i_oid;
 	int			i_pubname;
@@ -3914,7 +3915,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables)
 j,
 ntups;
 
-	if (fout->remoteVersion < 10)
+	if (dopt->no_publications || fout->remoteVersion < 10)
 		return;
 
 	query = createPQExpBuffer();


Re: FETCH FIRST clause PERCENT option

2018-09-21 Thread Mark Dilger



> On Sep 20, 2018, at 5:29 PM, Andres Freund  wrote:
> 
> Hi,
> 
> On 2018-09-20 17:06:36 -0700, Mark Dilger wrote:
>> I should think that spilling anything to a tuplestore would only be needed
>> if the query contains an ORDER BY expression.  If you query
>> 
>>  FETCH FIRST 50 PERCENT * FROM foo;
>> 
>> you should just return every other row, discarding the rest, right?  It's
>> only when an explicit ordering is given that the need to store the results
>> arises.  Even with
>> 
>>  FETCH FIRST 50 PERCENT name FROM foo ORDER BY name;
>> 
>> you can return one row for every two rows that you get back from the
>> sort node, reducing the maximum number you need to store at any time to
>> no more than 25% of all rows.
> 
> I'm doubtful about the validity of these optimizations, particularly
> around being surprising. But I think more importantly, we should focus
> on the basic implementation that's needed anyway.

You may be right that getting the basic implementation finished first
is better than optimizing at this stage.  So the rest of what I'm going
to say is just in defense of the optimization, and not an argument for
needing to optimize right away.

As for reducing the surprise factor, I think that it would be surprising
if I ask for a smallish percentage of rows and it takes significantly longer
and significantly more memory or disk than asking for all the rows takes.
If I'm including an explicit ORDER BY, then that explains it, but otherwise,
I'd be surprised.  Note that I'm not saying I'd be surprised by it taking
roughly the same length of time / memory / disk.  I'd only be surprised if
it took a lot more.

There are plenty of SQL generation engines that people put in their software.
I'd expect something like

sprintf("FETCH FIRST %d PERCENT %s FROM %s", percentage, columns, 
tablename)

to show up in such engines, and percentage to sometimes be 100.  At least
in that case you should just return all rows rather than dumping them into
a tuplestore.  Likewise, if the percentage is 0, you'll want to finish quickly.
Actually, I don't know if the SQL spec would require side effects to still
happen, in which case you'd still have to generate all rows for their side
effects to happen, and then just not return them.  But still, no tuplestore.
So the implementation of FETCH FIRST would at least need to think about what
percentage is being requested, rather than just mindlessly adding a node to
the tree for storing everything, then computing the LIMIT based on the number
of rows stored, and then returning that number of rows.
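
To make the no-ORDER-BY case concrete, here is a minimal sketch of the
incremental filtering described above: after consuming n input rows, exactly
floor(n * percent / 100) rows are owed, so a row can be emitted as soon as
the owed count exceeds the emitted count, and discarded otherwise.  The
names (PercentFilter, percent_filter_accept) are illustrative only, not
actual executor API.

#include <stdbool.h>

/* State for streaming "FETCH FIRST p PERCENT" with no ORDER BY. */
typedef struct PercentFilter
{
	int		percent;	/* requested percentage, 0..100 */
	long	n_in;		/* input rows consumed so far */
	long	n_out;		/* rows emitted so far */
} PercentFilter;

/*
 * Decide whether the next input row is returned to the client.  Nothing
 * is ever buffered: with no ORDER BY, any p% subset of the input is an
 * acceptable answer, so rows that are not emitted are simply discarded.
 */
static bool
percent_filter_accept(PercentFilter *f)
{
	long	owed;

	f->n_in++;
	owed = (f->n_in * f->percent) / 100;	/* floor division */

	if (f->n_out < owed)
	{
		f->n_out++;
		return true;	/* emit this row */
	}
	return false;		/* discard this row */
}

With percent = 100 this emits every row as it arrives, and with percent = 0
it emits nothing, matching the SQL-generator corner cases above (modulo the
side-effects question).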

mark


Re: Proposal for disk quota feature

2018-09-21 Thread Hubert Zhang
>
> just a fast reaction - why a QUOTA object?
> Isn't ALTER SET enough?
> Something like
> ALTER TABLE a1 SET quota = '1MB';
> ALTER USER ...
> ALTER SCHEMA ...
> New DDL commands look like too hard a hammer.


It's an option. I prefer to consider where the quota settings are stored:
the CREATE DISK QUOTA way is more natural when quota settings live in a
separate pg_diskquota catalog, while the ALTER SET way is closer to storing
quota settings in pg_class, pg_authid, pg_namespace, etc. in an integrated
way. (Note that natural/close here is not a must; ALTER SET could also
store into pg_diskquota, and vice versa.)

Here are some differences I can think of:
1. pg_authid is a global catalog, not per-database. It's harder to track a
user's disk usage across the whole cluster (considering 1000+ databases),
so the semantics of CREATE DISK QUOTA ON USER are limited: it only tracks
the user's disk usage inside the current database.
2. A separate pg_diskquota catalog can hold more fields besides the quota
limit without adding too many fields to pg_class, e.g. a red zone to give
the user a warning, or the current disk usage of the db objects.


Re: [HACKERS] SERIALIZABLE on standby servers

2018-09-21 Thread Thomas Munro
On Sat, Sep 22, 2018 at 12:28 AM Thomas Munro
 wrote:
> I'll add it to the next
> Commitfest so I know when to rebase it.

And cfbot immediately showed that this assertion in
OldSerXidSetActiveSerXmin() could fail in the isolation tests:

Assert(!TransactionIdIsValid(oldSerXidControl->tailXid)
   || TransactionIdFollows(xid, oldSerXidControl->tailXid));

Not sure how that ever worked or if I screwed something up while
rebasing, but the quick (and possibly wrong?) solution I found was to
exclude hypothetical SERIALIZABLEXACTs when scanning for the new oldest
xid:

@@ -3181,6 +3322,7 @@ SetNewSxactGlobalXmin(void)
 	for (sxact = FirstPredXact(); sxact != NULL; sxact = NextPredXact(sxact))
 	{
 		if (!SxactIsRolledBack(sxact)
+			&& !SxactIsHypothetical(sxact)
 			&& !SxactIsCommitted(sxact)
 			&& sxact != OldCommittedSxact)

Here's a version like that in the meantime so that the CI passes.  The
real solution might be to give them their own xid (right now they
"borrow" one, see PreCommit_CheckForSerializationFailure()... now that
I think about it, that must be wrong), when I have more time for this
project.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-SERIALIZABLE-READ-ONLY-DEFERRABLE-for-hot-standby-v4.patch
Description: Binary data


Re: Size and size_t in dsa API

2018-09-21 Thread Thomas Munro
On Thu, Sep 20, 2018 at 10:13 PM Ideriha, Takeshi
 wrote:
> >> As a non-expert developer's opinion, I think mixing Size and size_t
> >> makes the source code difficult to understand.
> >
> >Agreed.  Let's change them all to size_t and back-patch that to keep future
> >back-patching easy.  Patch attached.
>
> Thank you for the quick action. I'm happy now.
> I confirmed that the English word "Size" in comments is kept as it is.

Pushed.  Thanks for the reminder about that.
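
For context, Size has long been a plain alias in src/include/c.h, which is
why switching spellings changes readability only, never behavior:

/* src/include/c.h */
typedef size_t Size;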

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE on standby servers

2018-09-21 Thread Thomas Munro
Hi Kevin, all,

/me pokes ancient thread

I haven't done any more work on the problems mentioned above, but I
ran into Kevin at PostgresOpen in San Francisco and he said he might
have some time to look at this problem.  So, here is a long overdue
rebase of the WIP patch.  It shows a first order approximation of
DEFERRABLE working on a standby (for example, see the sequence from
the first message in the thread[1]).  I'll add it to the next
Commitfest so I know when to rebase it.

[1] 
https://www.postgresql.org/message-id/flat/CAEepm%3D2b9TV%2BvJ4UeSBixDrW7VUiTjxPwWq8K3QwFSWx0pTXHQ%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com


0001-SERIALIZABLE-READ-ONLY-DEFERRABLE-for-hot-standby-v3.patch
Description: Binary data


Re: Proposal for disk quota feature

2018-09-21 Thread Pavel Stehule
pá 21. 9. 2018 v 13:32 odesílatel Hubert Zhang  napsal:

>
> [...]
>
> CREATE DISK QUOTA tablea1 ON TABLE a1 with (quota = '1MB');
> CREATE DISK QUOTA roleu1 ON USER u1 with (quota = '1GB');
> CREATE DISK QUOTA schemas1 ON SCHEMA s1 with (quota = '3MB');
>

Just a fast reaction - why a QUOTA object?

Isn't ALTER SET enough?

Something like

ALTER TABLE a1 SET quota = '1MB';
ALTER USER ...
ALTER SCHEMA ...

New DDL commands look like too hard a hammer.




Re: Problem while setting the fpw with SIGHUP

2018-09-21 Thread Amit Kapila
On Tue, Sep 18, 2018 at 12:46 PM Kyotaro HORIGUCHI
 wrote:
>
> At Fri, 14 Sep 2018 16:30:37 +0530, Amit Kapila  
> wrote in 
> > On Fri, Sep 14, 2018 at 12:57 PM Michael Paquier  
> > wrote:
> > >
> > > On Thu, Sep 06, 2018 at 04:37:28PM -0700, Michael Paquier wrote:
> > > > /*
> > > >  * Properly accept or ignore signals the postmaster might send us.
> > > >  */
> > > > -   pqsignal(SIGHUP, StartupProcSigHupHandler); /* reload config 
> > > > file */
> > > > +   pqsignal(SIGHUP, SIG_IGN);  /* ignore reload config */
> > > >
> > > > I am finally coming back to this patch set, and that's one of the first
> > > > things I am going to help moving on for this CF.  And this bit from the
> > > > last patch series is not acceptable as there are some parameters which
> > > > are used by the startup process which can be reloaded.  One of them is
> > > > wal_retrieve_retry_interval for tuning when fetching WAL at recovery.
> > >
> > > So, I have been working on this problem again and I have reviewed the
> > > thread, and there have been many things discussed in the last couple of
> > > months:
> > > 1) We do not want to initialize XLogInsert stuff unconditionally for all
> > > processes at the moment recovery begins, but we just want to initialize
> > > it once WAL write is open for business.
> > > 2) Both the checkpointer and the startup process can call
> > > UpdateFullPageWrites() which can cause Insert->fullPageWrites to get
> > > incorrect values.
> >
> > Can you share the steps to reproduce this problem?
>
> The window for the issue is small.
>
> It happens if the checkpointer first sees that SharedRecoveryInProgress
> is turned off at the beginning of the CheckpointerMain loop.
> The window needs to be widened to make sure the issue happens.
>
> Applying the first patch attached, the following steps will cause
> the issue with high reliability.
>
> 1. initdb  (full_page_writes = on by default)
>
> 2. put recovery.conf so that server can accept promote signal
>
> 3. touch /tmp/hoge
>    change full_page_writes to off in postgresql.conf
>
> 4. pg_ctl promote
>
> The attached second file is a script that does the above steps.
> Server writes the following log messages and dies.
>
> | 2018-09-18 13:55:49.928 JST [16405] LOG:  database system is ready to 
> accept connections
> | TRAP: FailedAssertion("!(CritSectionCount == 0)", File: "mcxt.c", Line: 731)
> | 2018-09-18 13:55:50.546 JST [16405] LOG:  checkpointer process (PID 16407) 
> was terminated by signal 6: Aborted
>
> We can preallocate the XLogInsert buffer just to prevent the
> crash. This is done by calling RecoveryInProgress() before
> UpdateFullPageWrites() finds it turned to false. This is an
> isolated problem.
>

Yes, till this point the problem is quite clear and can be reproduced,
however, my question was for the next part for which there doesn't
seem to be a concrete test case.

> But it has another issue: the FPW_CHANGE record can be missing, or the
> FPW state can be wrong after recovery ends.
>
> It comes from the fact that the responsibility to update the flag is
> not atomically passed from the startup process to the checkpointer.
> (The window is after the UpdateFullPageWrites() call and until setting
> SharedRecoveryInProgress to false.)
>

I understand your concern about a missing FPW_CHANGE WAL record during
the above window, but is that really a problem?  Until the promotion is
complete, we are not expected to write any WAL records.  Now, the other
part of the problem you mentioned is the "wrong FPW state", which can
happen because we don't read Insert->fullPageWrites under a lock.  If
you are concerned about that, then I think we can solve it with a much
less invasive change.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Connection slots reserved for replication

2018-09-21 Thread Alexander Kukushkin
Hi,

On 20 September 2018 at 08:18, Kyotaro HORIGUCHI
 wrote:

>
> Instead, we can literally "reserve" connection slots for this
> specific use by providing ProcGlobal->walsenderFreeProcs. The
> slots are never stolen for other usage. Once all the reserved
> slots are filled, additional walsenders grab an available
> "normal" slot, which works the same as the current behavior when
> walsender_reserved_connections = 0.
>
> What do you think about this?

Sounds reasonable, please see the updated patch.

Regards,
--
Alexander Kukushkin
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e1073ac6d3..ccdb217735 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3059,6 +3059,27 @@ include_dir 'conf.d'

   
 
+ 
+  replication_reserved_connections
+  (integer)
+  
+   replication_reserved_connections configuration parameter
+  
+  
+  
+   
+Determines the number of connection slots that
+are reserved for replication connections.
+   
+
+   
+The default value is zero. The value should not exceed max_wal_senders.
+This parameter can only be set at server start.
+   
+  
+ 
+
   
max_replication_slots (integer)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 370429d746..40b9deaa0c 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -122,6 +122,8 @@ int			max_wal_senders = 0;	/* the maximum number of concurrent
 int			wal_sender_timeout = 60 * 1000; /* maximum time to send one WAL
 			 * data message */
 bool		log_replication_commands = false;
+int			replication_reserved_connections = 0; /* the number of connection slots
+   * reserved for replication connections */
 
 /*
  * State for WalSndWakeupRequest
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 6f9aaa52fa..2d04a8204a 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -43,6 +43,7 @@
 #include "postmaster/autovacuum.h"
 #include "replication/slot.h"
 #include "replication/syncrep.h"
+#include "replication/walsender.h"
 #include "storage/condition_variable.h"
 #include "storage/standby.h"
 #include "storage/ipc.h"
@@ -180,6 +181,7 @@ InitProcGlobal(void)
 	ProcGlobal->freeProcs = NULL;
 	ProcGlobal->autovacFreeProcs = NULL;
 	ProcGlobal->bgworkerFreeProcs = NULL;
+	ProcGlobal->walsenderFreeProcs = NULL;
 	ProcGlobal->startupProc = NULL;
 	ProcGlobal->startupProcPid = 0;
 	ProcGlobal->startupBufferPinWaitBufId = -1;
@@ -253,13 +255,20 @@ InitProcGlobal(void)
 			ProcGlobal->autovacFreeProcs = &procs[i];
 			procs[i].procgloballist = &ProcGlobal->autovacFreeProcs;
 		}
-		else if (i < MaxBackends)
+		else if (i < MaxBackends - replication_reserved_connections)
 		{
 			/* PGPROC for bgworker, add to bgworkerFreeProcs list */
 			procs[i].links.next = (SHM_QUEUE *) ProcGlobal->bgworkerFreeProcs;
 			ProcGlobal->bgworkerFreeProcs = &procs[i];
 			procs[i].procgloballist = &ProcGlobal->bgworkerFreeProcs;
 		}
+		else if (i < MaxBackends)
+		{
+			/* PGPROC for walsender, add to walsenderFreeProcs list */
+			procs[i].links.next = (SHM_QUEUE *) ProcGlobal->walsenderFreeProcs;
+			ProcGlobal->walsenderFreeProcs = &procs[i];
+			procs[i].procgloballist = &ProcGlobal->walsenderFreeProcs;
+		}
 
 		/* Initialize myProcLocks[] shared memory queues. */
 		for (j = 0; j < NUM_LOCK_PARTITIONS; j++)
@@ -304,6 +313,8 @@ InitProcess(void)
 		procgloballist = &ProcGlobal->autovacFreeProcs;
 	else if (IsBackgroundWorker)
 		procgloballist = &ProcGlobal->bgworkerFreeProcs;
+	else if (am_walsender)
+		procgloballist = &ProcGlobal->walsenderFreeProcs;
 	else
 		procgloballist = &ProcGlobal->freeProcs;
 
@@ -318,6 +329,14 @@ InitProcess(void)
 
 	set_spins_per_delay(ProcGlobal->spins_per_delay);
 
+	/*
+	 * Try to use ProcGlobal->freeProcs as a fallback when all reserved
+	 * walsender slots are already busy.
+	 */
+	if (am_walsender && replication_reserved_connections < max_wal_senders
+			&& *procgloballist == NULL)
+		procgloballist = >freeProcs;
+
 	MyProc = *procgloballist;
 
 	if (MyProc != NULL)
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 5ef6315d20..7de2609ef2 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -505,7 +505,7 @@ InitializeMaxBackends(void)
 
 	/* the extra unit accounts for the autovacuum launcher */
 	MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
-		max_worker_processes;
+		max_worker_processes + replication_reserved_connections;
 
 	/* internal error because the values were all checked previously */
 	if (MaxBackends > MAX_BACKENDS)
@@ -790,8 +790,8 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 
 	/*
 	 * The last few connection slots are reserved for superusers.  Although
-	 * replication connections currently require superuser privileges, we
-	 * don't allow them to consume the reserved slots, 

Re: proposal: prefix function

2018-09-21 Thread Pavel Stehule
pá 21. 9. 2018 v 12:37 odesílatel Chris Travers 
napsal:

>
>
> On Fri, Sep 21, 2018 at 10:09 AM Pavel Stehule 
> wrote:
>
>> [...]
>
> These would just be wrappers around already existing internal functions,
> right?
>

Maybe, I didn't do any research. If not, the implementation is only a few lines.

Regards

Pavel



Re: Proposal for disk quota feature

2018-09-21 Thread Hubert Zhang
Hi all,

We redesigned the disk quota feature based on the comments from Pavel
Stehule and Chapman Flack. Here is the new design.

Overview

Basically, the disk quota feature is used to support multi-tenancy
environments: different levels of database objects can be given a quota
limit to avoid overuse of disk space. A common case could be as follows: a
DBA enables disk quota on a specified list of databases and sets disk quota
limits for tables/schemas/roles in these databases. A separate disk quota
worker process monitors the disk usage of these objects and detects the
objects which exceed their quota limit. Queries loading data into these
"out of disk quota" tables/schemas/roles will be cancelled. We are
currently working at the initial implementation stage. We would like to
propose our idea first and get feedback from the community to iterate
quickly.

SQL Syntax (How to use disk quota)

1. Specify the databases with disk quota enabled in the GUC
"diskquota_databases" in postgresql.conf and restart the database.

2. The DBA can set a disk quota limit for a table/schema/role:

CREATE DISK QUOTA tablea1 ON TABLE a1 with (quota = '1MB');
CREATE DISK QUOTA roleu1 ON USER u1 with (quota = '1GB');
CREATE DISK QUOTA schemas1 ON SCHEMA s1 with (quota = '3MB');

3. Simulate a schema running out of quota; suppose tables a1 and a2 are
both under schema s1:

INSERT INTO a1 SELECT generate_series(1,1000);
INSERT INTO a2 SELECT generate_series(1,300);
SELECT pg_sleep(5);
INSERT INTO a1 SELECT generate_series(1,1000);
ERROR:  schema's disk space quota exceeded
DROP TABLE a2;
SELECT pg_sleep(5);
INSERT INTO a1 SELECT generate_series(1,1000);
INSERT 0 1000

Architecture

Disk quota has the following components.

1. Quota Setting Store is where the disk quota settings are stored and
accessed. We plan to use a catalog table pg_diskquota to store this
information. pg_diskquota looks like:

CATALOG(pg_diskquota,6122,DiskQuotaRelationId)
{
	NameData	quotaname;		/* diskquota name */
	int16		quotatype;		/* diskquota type name */
	Oid			quotatargetoid;	/* diskquota target db object oid */
	int32		quotalimit;		/* diskquota size limit in MB */
	int32		quotaredzone;	/* diskquota redzone in MB */
} FormData_pg_diskquota;

2. Quota Change Detector is the monitor of size changes of database
objects. We plan to use the stats collector to detect the 'active' table
list at the initial stage. But the stats collector has some limitations in
finding an active table which is in a running transaction. For details see
the TODO section.

3. Quota Size Checker is where the sizes of database objects are calculated
and compared with their quota limits. According to Pavel's comment, the
autovacuum launcher and worker processes could be a good reference for disk
quota. So we plan to use a disk quota launcher daemon process and several
disk quota worker processes to do this work. The launcher process is
responsible for starting worker processes based on a user-defined database
list from the GUC. A worker process connects to its target database and
monitors the disk usage of the objects in this database. In the init stage
of a worker process, it calls calculate_total_relation_size() to calculate
the size of each user table. After the init stage, the worker process
refreshes the disk model every N seconds. Refreshing only recalculates the
sizes of tables in the 'active' table list, which is generated by the Quota
Change Detector, to minimize the cost.

4. Quota Enforcement Operator is where the quota limit is checked on the
postgres backend side. We will first implement it in ExecCheckRTPerms() as
pre-running enforcement. It checks the disk quota of the tables being
inserted into or updated, and reports an error if the quota limit of the
table, the table's schema, or the table's owner is exceeded. As a native
feature, we plan to add more checkpoints to do running-query enforcement.
For example, if a disk quota has 10MB left, a query could still insert 1GB
of data: such a query would be allowed by the pre-running enforcement check
but should be cancelled by the running-query enforcement check. Therefore,
it can improve the accuracy of disk quota usage. To achieve this, we plan
to add a checkpoint in a lower API such as smgr_extend; the Quota
Enforcement Operator then checks the disk quota usage whenever smgr_extend
is called. If the quota is over the limit, the current query is cancelled.

Highlights

1. Native feature. Support native Create/Drop Disk Quota SQL statements,
and a new catalog table pg_diskquota to store disk quota settings.

2. Auto DML/DDL detection. Table
create/update/insert/delete/vacuum/truncate/drop/schema_change/owner_change,
schema create/drop and role create/drop will be detected by disk quota
automatically.

3. Low cost disk quota checker. The disk quota worker process needs to
refresh the disk usage model every N seconds. Since recalculating file
sizes using the stat() system call is expensive for a large number of
files, we use an 'active' table list to reduce the real work at each
iteration. A basic experiment with our init-stage implementation on a
database with 20K tables shows that the refresh cost is 1%
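
To illustrate the pre-running enforcement described in component 4 above, a
check wired into ExecCheckRTPerms() could look roughly like the sketch
below. This is only one reading of the design, not actual patch code:
diskquota_exceeded() and the DISKQUOTA_* tags are hypothetical stand-ins
for lookups against the worker-maintained quota model.

#include "postgres.h"
#include "utils/lsyscache.h"

/* Hypothetical quota-type tags matching pg_diskquota.quotatype. */
#define DISKQUOTA_TABLE		0
#define DISKQUOTA_SCHEMA	1
#define DISKQUOTA_ROLE		2

/* Hypothetical: true if the tracked usage of the object is over its limit. */
extern bool diskquota_exceeded(Oid targetoid, int quotatype);

/*
 * Pre-running enforcement: called for each INSERT/UPDATE target relation
 * before execution starts, mirroring the check order described above
 * (the table, then its schema, then its owner).
 */
static void
diskquota_check_relation(Oid relid, Oid ownerid)
{
	if (diskquota_exceeded(relid, DISKQUOTA_TABLE))
		ereport(ERROR,
				(errcode(ERRCODE_DISK_FULL),
				 errmsg("table's disk space quota exceeded")));

	if (diskquota_exceeded(get_rel_namespace(relid), DISKQUOTA_SCHEMA))
		ereport(ERROR,
				(errcode(ERRCODE_DISK_FULL),
				 errmsg("schema's disk space quota exceeded")));

	if (diskquota_exceeded(ownerid, DISKQUOTA_ROLE))
		ereport(ERROR,
				(errcode(ERRCODE_DISK_FULL),
				 errmsg("role's disk space quota exceeded")));
}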

Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2018-09-21 Thread Pavel Stehule
Hi

út 18. 9. 2018 v 17:33 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > [ xml-xpath-default-ns-7.patch ]
>
> At Andrew's prompting, I took a look over this patch.  I don't know much
> of anything about XML, so I have no idea as to standards compliance here,
> but I do have some comments:
>
> * I'm fairly uncomfortable with the idea that we're going to maintain
> our own XPath parser.  That seems like a recipe for lots of future
> work ... and the code is far too underdocumented for anyone to actually
> maintain it.  (Case in point: _transformXPath's arguments are not
> documented at all, and in fact there's hardly a word about what it's
> even supposed to *do*.)
>

I understand, and I would be much happier if libxml2 supported this
feature. But development of new features for that library has been frozen,
so there is no possibility of that.

On the other hand, the parser is very simple: the tokenizer detects
identifiers, strings and numbers, and the parser tries to find unqualified
names and prints the default namespace identifier before them. It doesn't
do more.

I renamed the function _transformXPath to transform_xpath_recurse and
described its parameters.


> * I think the business with "pgdefnamespace.pgsqlxml.internal" is just
> plain awful.  It's a wart, and I don't think it's even saving you any
> code once you account for all the places where you have to throw errors
> for somebody trying to use that as a regular name.  This should be done
> with out-of-band signaling if possible.  The existing convention above
> this code is that a NULL pointer means a default namespace; can't that
> be continued throughout?  If that's not practical, can you pick a string
> that simply can't syntactically be a namespace name?  (In the SQL world,
> for example, an empty string is an illegal identifier so that that could
> be used for the purpose.  But I don't know if that applies to XML.)
> Or perhaps you can build a random name that is chosen just to make it
> different from any of the listed namespaces?  If none of those work,
> I think passing around an explicit "isdefault" boolean alongside the name
> would be better than having a reserved word.
>

The string used as the default namespace name has to be a valid XML
namespace name, because it is injected into the XPath expression and passed
to libxml2 as a namespace name. libxml2 requires a non-null, valid value.

I changed it, and now the namespace name is generated.
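
One simple way to generate such a collision-free alias, sketched under the
assumption that the user-declared namespace aliases are available as an
array (an illustration of the approach, not the patch's code):

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Pick an alias for the default namespace that cannot collide with any of
 * the nscount user-declared aliases in ns_names, by appending a counter
 * until the candidate is unused.
 */
static void
choose_default_ns_alias(char *dest, size_t destlen,
						const char *const *ns_names, int nscount)
{
	unsigned int suffix = 0;

	for (;;)
	{
		bool	clash = false;
		int		i;

		snprintf(dest, destlen, "pgdefns%u", suffix);
		for (i = 0; i < nscount; i++)
		{
			if (strcmp(dest, ns_names[i]) == 0)
			{
				clash = true;
				break;
			}
		}
		if (!clash)
			return;
		suffix++;
	}
}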


>
> * _transformXPath recurses, so doesn't it need check_stack_depth()?
>

fixed
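
For reference, the usual shape of that guard is a depth check at the top of
the recursive function; a minimal sketch using the renamed function (the
state type is illustrative, while check_stack_depth() is the real guard
from miscadmin.h):

#include "postgres.h"
#include "miscadmin.h"

/* Illustrative state type; the real one lives in the patch. */
typedef struct XPathTransformState XPathTransformState;

static void
transform_xpath_recurse(XPathTransformState *state)
{
	/*
	 * Error out cleanly, rather than crashing, if deeply nested XPath
	 * input would otherwise exhaust the stack.
	 */
	check_stack_depth();

	/* ... prefix unqualified names, recurse into sub-expressions ... */
}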

>
> * I'm not especially in love with using function names that start
> with an underscore.  I do not think that adds anything, and it's
> unlike the style in most places in PG.
>

renamed


> * This is a completely unhelpful error message:
> +   if (!parser->buffer_is_empty)
> +   elog(ERROR, "internal error");
> If you can't be bothered to write a message that says something
> useful, either drop the test or turn it into an Assert.  I see some
> other internal errors that aren't up to project norms either.
>

use assert


>
> * Either get rid of the "elog(DEBUG1)"s, or greatly lower their
> message priority.  They might've been appropriate for developing
> this patch, but they are not okay to commit that way.
>

removed


> * Try to reduce the amount of random whitespace changes in the patch.
>

The formatting was really strange, fixed


>
> * .h files should never #include "postgres.h", per project policy.
>

I moved the code to xml.c like you propose


>
> * I'm not sure I'd bother with putting the new code into a separate
> file rather than cramming it into xml.c.  The main reason why not
> is that you're going to get "empty translation unit" warnings from
> some compilers in builds without USE_LIBXML.
>
> * Documentation, comments, and error messages could all use some
> copy-editing by a native English speaker (you knew that of course).
>

I hope some native speakers will look at it.

Thank you for comments

Attached updated patch

Regards

Pavel


> regards, tom lane
>


default_namespace-20180921.patch.gz
Description: application/gzip


Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

2018-09-21 Thread Etsuro Fujita

(2018/09/18 21:14), Kyotaro HORIGUCHI wrote:

At Fri, 14 Sep 2018 22:01:39 +0900, Etsuro Fujita  wrote 
in<5b9bb133.1060...@lab.ntt.co.jp>

@@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("cannot access temporary or unlogged relations during recovery")));
 
+	max_attrnum = RelationGetNumberOfAttributes(relation);
+
+	/* Foreign table may have expanded this relation with junk columns */
+	if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		AttrNumber	maxattno = max_varattno(root->parse->targetList, varno);
+
+		if (max_attrnum < maxattno)
+			max_attrnum = maxattno;
+	}
+
 	rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
-	rel->max_attr = RelationGetNumberOfAttributes(relation);
+	rel->max_attr = max_attrnum;
 	rel->reltablespace = RelationGetForm(relation)->reltablespace;

This breaks the fundamental assumption that rel->max_attr is equal to
RelationGetNumberOfAttributes of that table.  My concern is: this
change would probably be a wart, so it would be bug-prone in future
versions.


Hmm. I believe that once a RelOptInfo is created, all attributes
defined in it can be safely accessed. Is that a wrong assumption?


The patch you proposed seems to fix the issue well for the current
version of PG, but I'm a bit scared to have such an assumption (ie, to
include columns in a rel's tlist that are not defined anywhere in the
system catalogs).  In future we might add, e.g., an lsyscache.c routine
for some planning use that is given the attribute number of a column as
an argument, like get_attavgwidth, and if so, it would be easily
conceivable that that routine would error out for such an undefined
column.  (get_attavgwidth would return 0, not error out, though.)



Actually, RelationGetNumberOfAttributes is used in only a few distinct
places during planning.



build_physical_tlist is not used for
foreign relations.


For UPDATE/DELETE, that function would not be called for a foreign
target in the postgres_fdw case, as CTID is requested (see
use_physical_tlist), but otherwise that function may be called if
possible.  No?



If we don't accept the expanded tupdesc for base relations, another
way I can find is transforming the foreign relation into something
else, like a subquery, or allowing expansion of the attribute list of
a base relation...


Sorry, I don't understand this fully, but there seems to be the same 
concern as mentioned above.



Another thing on the new version:

@@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
 	relation = heap_open(rte->relid, NoLock);
 
 	numattrs = RelationGetNumberOfAttributes(relation);
+
+	/*
+	 * Foreign tables may have been expanded with some junk columns. Punt
+	 * in that case.

...

I think this would disable the optimization on projection in foreign
scans, causing performance regression.


Well, in update/delete cases, create_plan_recurse on foreign scan
is called with CP_EXACT_TLIST in create_modifytable_plan


That's not necessarily true; consider UPDATE/DELETE on a local join; in 
that case the topmost plan node for a subplan of a ModifyTable would be 
a join, and if that's a NestLoop, create_plan_recurse would call 
create_nestloop_plan, which would recursively call create_plan_recurse 
for its inner/outer subplans with flag=0, not CP_EXACT_TLIST.



so the
code path is not actually used.


I think this is true for the postgres_fdw case; because 
use_physical_tlist would decide not to do build_physical_tlist for the 
reason mentioned above.  BUT my question here is: why do we need the 
change to build_physical_tlist?



Since this uses fdw_scan_tlist so it is theoretically
back-patchable back to 9.6.


IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join
pushdown infrastructure, so I think your patch can be back-patched to
PG9.5, but I don't think that's enough; IIRC, this issue was
introduced in PG9.3, so a solution for this should be back-patch-able
to PG9.3, I think.


In the previous version, fdw_scan_tlist is used to hold only the
additional (junk) columns. I think that we can get rid of the
variable by scanning the full tlist for junk columns. Apparently
that would be a different patch for such versions. I'm not sure how
invasive it is for now but will consider.


Sorry, I don't fully understand this.  Could you elaborate a bit more on 
this?



0001-Add-test-for-postgres_fdw-foreign-parition-update.patch

   This should fail for unpatched postgres_fdw. (Just for demonstration)


+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid::int - (SELECT min(tableoid) FROM 

Re: proposal: prefix function

2018-09-21 Thread Chris Travers
On Fri, Sep 21, 2018 at 10:09 AM Pavel Stehule 
wrote:

> Hi
>
> Can we implement a prefix function for a fast test of whether substr is a
> prefix of some string?
>
> create or replace function prefix(str text, substr text)
> returns boolean as $$
>   select substr(str, 1, length(substr)) = substr
> $$ language sql;
>
> This function could be very efficient in C. Currently it has to be
> implemented with LIKE or a regexp, which is significantly more expensive.
>
>
These would just be wrappers around already existing internal functions,
right?


> Regards
>
> Pavel
>


-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Proposal for Signal Detection Refactoring

2018-09-21 Thread Chris Travers
On Fri, Sep 21, 2018 at 6:46 AM Michael Paquier  wrote:

> On Thu, Sep 20, 2018 at 03:08:34PM +0200, Chris Travers wrote:
> > So here's a small patch.  I will add it for the next commit fest unless
> > anyone has any reason I shouldn't.
>
> -   return InterruptPending && (QueryCancelPending || ProcDiePending);
> +   return PENDING_INTERRUPT_LEVEL() >= QUERY_CANCEL;
>
> This is pretty similar to lock levels, where it is pretty hard to impose a
> strict monotone hierarchy when it comes to such interruptions.


I understand how lock levels don't fit a simple hierarchy, but at least
when it comes to what is going to be aborted on a signal, I am having
trouble understanding the problem here.

Does anyone have any search terms I should use to look into the archives
regarding this issue?

I will assume this patch will be rejected then.



> The new
> code does not seem like an improvement either, as for example in the code
> mentioned above, you know directly what actions are involved, which
> is not the case with the new code style.
>

The initial motivation for this patch was that it felt a little bit odd to
be using global
boolean flags for this sort of approach and I was concerned that if this
approach
proliferated it might cause problems later.  As far as I know my previous
patch was
the second use of the current pattern.

So one thing I wonder is whether there are ways in which abstracting this
would make more sense.
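
One concrete shape such an abstraction could take, going by the
PENDING_INTERRUPT_LEVEL() >= QUERY_CANCEL test in the hunk quoted above, is
an ordered level computed from the existing flags. This is a sketch only:
the enum and function are assumptions, while InterruptPending,
QueryCancelPending and ProcDiePending are the real globals from
miscadmin.h.

#include "postgres.h"
#include "miscadmin.h"

/* Ordered severity of pending interrupts; names are illustrative. */
typedef enum PendingInterruptLevel
{
	INTERRUPT_NONE = 0,
	QUERY_CANCEL_LEVEL,
	PROCESS_DIE_LEVEL
} PendingInterruptLevel;

/*
 * Map the existing boolean flags onto a single monotone level, so callers
 * can write level comparisons instead of flag combinations.
 */
static inline PendingInterruptLevel
pending_interrupt_level(void)
{
	if (!InterruptPending)
		return INTERRUPT_NONE;
	if (ProcDiePending)
		return PROCESS_DIE_LEVEL;
	if (QueryCancelPending)
		return QUERY_CANCEL_LEVEL;
	return INTERRUPT_NONE;
}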


> --
> Michael
>


-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: transction_timestamp() inside of procedures

2018-09-21 Thread Bruce Momjian
On Fri, Sep 21, 2018 at 02:34:25PM +0900, Michael Paquier wrote:
> On Thu, Sep 20, 2018 at 10:12:06PM -0700, Andres Freund wrote:
> > Isn't the point that transaction_timestamp() does *not* currently change
> > its value, even though the transaction (although not the outermost
> > statement) has finished?
> 
> Ouch, yes.  I see the point now.  Indeed that's strange to not have a
> new transaction timestamp after commit within the DO block..

So, this puts us in an odd position.  Right now everyone knows that
statement_timestamp() is only changed by the outer statement, i.e., a
SELECT in a function doesn't change statement_timestamp().   So, there
is an argument that transaction_timestamp() should do the same and not
change in a function --- in fact, if it does change, it would mean that
transaction_timestamp() changes in a function, but statement_timestamp()
doesn't --- that seems very odd.  It would mean that new statements in a
function don't change statement_timestamp(), but new transactions in a
function do --- again, very odd.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



RE: impact of SPECTRE and MELTDOWN patches

2018-09-21 Thread REIX, Tony
No. I work in the Power world only. I know nothing about Intel.

I was asking because these days we are running a performance benchmark with
PostgreSQL 10.4 on AIX/Power9 and Linux/Power9. I'll suggest that my colleague
run a pass with the SPECTRE firmware check turned on (it is currently off), so
that it will provide information complementary to the other tests they have
already done.


Cordialement,

Tony Reix

tony.r...@atos.net

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net


RE: impact of SPECTRE and MELTDOWN patches

2018-09-21 Thread ROS Didier
Hi

No, 80% of our IT infrastructure is Intel HW.
Do you have any recommendations for correcting the performance impact?

Best Regards


Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD
Nanterre Picasso - E2 565D (aile nord-est)
32 Avenue Pablo Picasso
92000 Nanterre
didier@edf.fr
support-postgres-nive...@edf.fr
support-oracle-nive...@edf.fr
Tél. : 01 78 66 61 14
Tél. mobile : 06 49 51 11 88
Lync : ros.did...@edf.fr




RE: impact of SPECTRE and MELTDOWN patches

2018-09-21 Thread REIX, Tony
Thx.


So, it is Intel HW.

Have you also experimented with Power HW?


Regards


Cordialement,

Tony Reix


RE: impact of SPECTRE and MELTDOWN patches

2018-09-21 Thread ROS Didier
Hi
Here is the HW information:

[root@pcyyymm9 ~]# cat /proc/cpuinfo
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 62
model name  : Intel(R) Xeon(R) CPU E5-4610 v2 @ 2.30GHz
stepping: 4
microcode   : 0x427
cpu MHz : 2300.000
cache size  : 16384 KB
physical id : 0
siblings: 2
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx rdtscp lm constant_tsc 
arch_perfmon nopl xtopology tsc_reliable nonstop_tsc aperfmperf eagerfpu pni 
pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c rdrand 
hypervisor lahf_lm epb fsgsbase smep dtherm ida arat pln pts
bogomips: 4600.00
clflush size: 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual

[root@pcyyymm9 ~]# free -m
  totalusedfree  shared  buff/cache   available
Mem:  158774879 2263029   107727545
Swap:  2047 1071940

[root@pcyyymm9 ~]# uname -a
Linux pcyyymm9.pcy.edfgdf.fr 3.10.0-862.6.3.el7.x86_64 #1 SMP Fri Jun 15 
17:57:37 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Best regards
Didier ROS






RE: impact of SPECTRE and MELTDOWN patches

2018-09-21 Thread REIX, Tony
Hi,


Which HW have you experimented with?


Thx/Regards


Regards,

Tony Reix

tony.r...@atos.net

ATOS / Bull SAS
ATOS Expert
IBM Coop Architect & Technical Leader
Office : +33 (0) 4 76 29 72 67
1 rue de Provence - 38432 Échirolles - France
www.atos.net



From: ROS Didier 
Sent: Thursday, September 20, 2018 09:23
To: pgsql-hack...@postgresql.org
Subject: impact of SPECTRE and MELTDOWN patches


Hi

   I would like to know what the recommendations are for resolving the
performance impact after applying the SPECTRE and MELTDOWN patches?

   Do we have to add more CPUs?



   NB: I have tested this on one of our production databases and I see
an impact of ~25%. Have you seen similar results?



   Thanks in advance.



Best Regards




Didier ROS
Expertise SGBD
DS IT/IT DMA/Solutions Groupe EDF/Expertise Applicative - SGBD
Nanterre Picasso - E2 565D (aile nord-est)
32 Avenue Pablo Picasso
92000 Nanterre

didier@edf.fr








Re: Changing the setting of wal_sender_timeout per standby

2018-09-21 Thread Michael Paquier
On Fri, Sep 21, 2018 at 06:26:19AM +, Tsunakawa, Takayuki wrote:
> Agreed.  Sorry to have made you spend so much time on such a tiny
> patch...

Well, that was about shaping things and agreeing on them, which is not
wasted time, far from it.
--
Michael




proposal: prefix function

2018-09-21 Thread Pavel Stehule
Hi

Can we implement a prefix function for a fast test of whether substr is a
prefix of some string?

create or replace function prefix(str text, substr text)
returns boolean as $$
  select substr(str, 1, length(substr)) = substr
$$ language sql;
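
For example, assuming the function above has been created:

  select prefix('PostgreSQL', 'Post');   -- true
  select prefix('PostgreSQL', 'SQL');    -- false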

This function could be very efficient implemented in C. Today the same test
has to be implemented with LIKE or a regexp, which is significantly more
expensive.
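
A rough sketch of what a C implementation might look like (the function name
and packaging here are hypothetical, not the actual proposal; the point is
that a single memcmp over the prefix bytes avoids any pattern machinery):

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(text_prefix);

Datum
text_prefix(PG_FUNCTION_ARGS)
{
    text   *str = PG_GETARG_TEXT_PP(0);
    text   *pre = PG_GETARG_TEXT_PP(1);
    int     prelen = VARSIZE_ANY_EXHDR(pre);

    /* a longer string can never be a prefix */
    if (prelen > VARSIZE_ANY_EXHDR(str))
        PG_RETURN_BOOL(false);

    /* byte-wise compare of just the prefix bytes; safe for UTF-8 as well */
    PG_RETURN_BOOL(memcmp(VARDATA_ANY(str), VARDATA_ANY(pre), prelen) == 0);
}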

Regards

Pavel


Re: Pluggable Storage - Andres's take

2018-09-21 Thread Haribabu Kommi
On Fri, Sep 21, 2018 at 5:05 PM Andres Freund  wrote:

> Hi,
>
> On 2018-09-21 16:57:43 +1000, Haribabu Kommi wrote:
>
> > For example, in a sequential scan, the heap returns the slot with
> > the tuple or with a value array of all the columns; the data then gets
> > filtered, and the unnecessary columns are later removed by projection.
> > This works fine for row-based storage. For columnar storage, if
> > the storage knows that the upper layers need only particular columns,
> > it can directly return the specified columns and there is no
> > need for a projection step. This would also help columnar storage
> > return the proper columns in a faster way.
>
> I think this is an important feature, but I feel fairly strongly that we
> should only tackle it in a second version. This patchset is already
> pretty darn large.  It's imo not just helpful for columnar, but even for
> heap - we e.g. spend a lot of time deforming columns that are never
> accessed. That's particularly harmful when the leading columns are all
> NOT NULL and fixed width, but even if not, it's painful.
>

OK. Thanks for your opinion.
Then I will first try to clean up the open items of the existing patch.


> > Would it be good to pass the plan to the storage layer, so that it can
> > find out which columns need to be returned?
>
> I don't think that's the right approach - this should be a level *below*
> plan nodes, not reference them. I suspect we're going to have to have a
> new table_scan_set_columnlist() option or such.
>

The table_scan_set_columnlist() API could be a good way to share which
columns are expected.
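
To make the idea concrete, a hypothetical sketch of such an API (nothing
here exists yet; the scan-descriptor field and the signature are assumptions
based on the suggestion above):

/*
 * Tell the storage layer which columns the executor will actually read,
 * so a columnar AM can fetch and deform only those.
 */
void
table_scan_set_columnlist(TableScanDesc scan, Bitmapset *needed_attrs)
{
    /* rs_needed_attrs is an assumed new field, not an existing one */
    scan->rs_needed_attrs = bms_copy(needed_attrs);
}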



> > Also, if the projection can be handled in the storage layer itself for
> > some scenarios, the callers need to be informed that there is no need
> > to perform an extra projection step.
>
> I don't think that should be done in the storage layer - that's probably
> better done introducing custom scan nodes and such.  This has costing
> implications etc, so this needs to happen *before* planning is finished.
>

Sorry, my explanation was wrong. Assume a scenario where the target list
contains only plain columns of a table, those columns are already passed to
the storage layer using the proposed new API, and there is a one-to-one
mapping between them. Based on that information, it would be good to decide
whether the projection step is required at all.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: View to get all the extension control file details

2018-09-21 Thread Haribabu Kommi
On Thu, Sep 20, 2018 at 3:18 PM Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> Hello.
>
> At Mon, 17 Sep 2018 16:49:41 +1000, Haribabu Kommi <
> kommi.harib...@gmail.com> wrote in
> 
> > Hi Hackers,
> >
> > Currently PostgreSQL provides the following views to get the
> > extension-specific details:
> >
> > pg_available_extensions - Name, default_version, installed_version,
> comment
> >
> > pg_available_extension_versions - Name, version, installed, superuser,
> > relocatable, schema, requires, comment
> >
> > But these miss the "directory", "module_pathname" and "encoding"
> > extension-specific information, which is not available even through the
> > extension-specific functions. Some extensions also differ between the
> > extension name and the library name: the pgpool_recovery extension's
> > library is pgpool-recovery.so ('_' becomes '-'). While developing a tool
> > on top of PostgreSQL, we ran into this problem; it could be solved
> > easily if the server exposed the details that come from the extension
> > control file.
>
> Nowadays we tend to provide views for such files. However,
> I'm not a fan of checking extension packaging through such views.


Thanks for your opinion.
As we are in the process of developing a tool that finds out the details
of the extensions automatically, it would be helpful if such a view were
available.


> I agree that it's good to have at least a function/view that shows
> all available attributes of extensions. Are there any other items
> that are not in the control files?
>

I listed all the members of the ExtensionControlFile structure; I don't
think anything else is required.
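
For illustration, a rough sketch of what the extended view output could look
like (the extra columns are assumptions taken from the control-file fields
above; no such columns exist in the current view):

  SELECT name, default_version, directory, module_pathname, encoding
  FROM pg_available_extensions
  WHERE name = 'pgpool_recovery';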


> > Any opinion on adding a new view like "pg_available_extension_details"
> > to display all extension control file columns? Or is adding them to the
> > existing view good enough?
>
> I felt it was a bit too noisy at first, but pg_settings does something
> similar. So +1 from me for extending the existing
> pg_available_extensions view, for consistency with other views that have
> a similar objective.
>

OK, thanks for your input. I will do it accordingly.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Pluggable Storage - Andres's take

2018-09-21 Thread Andres Freund
Hi,

On 2018-09-21 16:57:43 +1000, Haribabu Kommi wrote:
> During the porting of the Fujitsu in-memory columnar store on top of
> pluggable storage, I found that the callers of "heap_beginscan" expect
> that the returned data always contains all the records.

Right.


> For example, in a sequential scan, the heap returns the slot with
> the tuple or with a value array of all the columns; the data then gets
> filtered, and the unnecessary columns are later removed by projection.
> This works fine for row-based storage. For columnar storage, if
> the storage knows that the upper layers need only particular columns,
> it can directly return the specified columns and there is no
> need for a projection step. This would also help columnar storage
> return the proper columns in a faster way.

I think this is an important feature, but I feel fairly strongly that we
should only tackle it in a second version. This patchset is already
pretty darn large.  It's imo not just helpful for columnar, but even for
heap - we e.g. spend a lot of time deforming columns that are never
accessed. That's particularly harmful when the leading columns are all
NOT NULL and fixed width, but even if not, it's painful.


> Would it be good to pass the plan to the storage layer, so that it can
> find out which columns need to be returned?

I don't think that's the right approach - this should be a level *below*
plan nodes, not reference them. I suspect we're going to have to have a
new table_scan_set_columnlist() option or such.


> Also, if the projection can be handled in the storage layer itself for
> some scenarios, the callers need to be informed that there is no need
> to perform an extra projection step.

I don't think that should be done in the storage layer - that's probably
better done introducing custom scan nodes and such.  This has costing
implications etc, so this needs to happen *before* planning is finished.

Greetings,

Andres Freund



Re: Pluggable Storage - Andres's take

2018-09-21 Thread Haribabu Kommi
On Mon, Sep 10, 2018 at 5:42 PM Haribabu Kommi 
wrote:

> On Wed, Sep 5, 2018 at 2:04 PM Haribabu Kommi 
> wrote:
>
>>
>> On Tue, Sep 4, 2018 at 10:33 AM Andres Freund  wrote:
>>
>>> Hi,
>>>
>>> Thanks for the patches!
>>>
>>> On 2018-09-03 19:06:27 +1000, Haribabu Kommi wrote:
>>> > I found a couple of places where zheap uses some extra logic to
>>> > verify whether the AM is zheap or not, and takes some extra decisions
>>> > based on that. I am analyzing all of that extra code to see whether
>>> > any callbacks can handle it, and how. I can come back with more
>>> > details later.
>>>
>>> Yea, I think some of them will need to stay (particularly around
>>> integrating undo) and some other ones we'll need to abstract.
>>>
>>
>> OK. I will list all the areas that I found, with my observations on
>> whether to abstract or leave each one, and then implement around that.
>>
>
> The following are the changes where the code specifically checks whether
> a relation is a zheap relation or not.
>
> Overall I found that it needs new APIs at the following locations:
> 1. RelationSetNewRelfilenode
> 2. heap_create_init_fork
> 3. estimate_rel_size
> 4. A facility to provide handler options (like skipping WAL, etc.)
>

During the porting of the Fujitsu in-memory columnar store on top of
pluggable storage, I found that the callers of "heap_beginscan" expect
that the returned data always contains all the records.

For example, in a sequential scan, the heap returns the slot with
the tuple or with a value array of all the columns; the data then gets
filtered, and the unnecessary columns are later removed by projection.
This works fine for row-based storage. For columnar storage, if
the storage knows that the upper layers need only particular columns,
it can directly return the specified columns and there is no
need for a projection step. This would also help columnar storage
return the proper columns in a faster way.

Would it be good to pass the plan to the storage layer, so that it can
find out which columns need to be returned? Also, if the projection can
be handled in the storage layer itself for some scenarios, the callers
need to be informed that there is no need to perform an extra projection
step.

comments?

Regards,
Haribabu Kommi
Fujitsu Australia


RE: Changing the setting of wal_sender_timeout per standby

2018-09-21 Thread Tsunakawa, Takayuki
From: Michael Paquier [mailto:mich...@paquier.xyz]
> I think that the description of wal_sender_timeout and its properties should
> remain where the parameter is defined, so (3) is not a good option in my
> opinion.  (2) has a point with the use of quotes actually, so why not just
> mention options=''-c wal_sender_timeout=5000'' in the example of
> recovery.conf as you suggest and call it a day, but keep the paragraph I
> suggested in (1)?

Agreed.  Sorry to have made you spend so much time on such a tiny patch...


Regards
Takayuki Tsunakawa





Re: Changing the setting of wal_sender_timeout per standby

2018-09-21 Thread Michael Paquier
On Fri, Sep 21, 2018 at 05:37:42AM +, Tsunakawa, Takayuki wrote:
> I think all of these are almost equally good.  I chose (1) at first,
> and you chose (3).  But (2) may be the best, because it's the natural
> place the user will see when configuring the standby, and it already
> contains an example of recovery.conf.  We can add "options=''-c
> wal_sender_timeout=5000''" or something in that example.  I'm OK with
> anyplace, but I recommend adding how to specify wal_sender_timeout in
> primary_conninfo, because libpq's options parameter may not be
> well-known, and it's a bit difficult to figure out the need to
> enclose the value with double single-quotes. 

I think that the description of wal_sender_timeout and its properties
should remain where the parameter is defined, so (3) is not a good
option in my opinion.  (2) has a point with the use of quotes actually,
so why not just mention options=''-c wal_sender_timeout=5000'' in the
example of recovery.conf as you suggest and call it a day, but keep the
paragraph I suggested in (1)?
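
For reference, the recovery.conf line being discussed would look roughly
like this (the host and user values are placeholders; only the options part
follows the double single-quote syntax mentioned above):

primary_conninfo = 'host=primary.example.com user=replicator options=''-c wal_sender_timeout=5000'''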
--
Michael

