Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-02-11 Thread Artur Zakirov

On 11.02.2016 01:19, Tom Lane wrote:

I wrote:

Artur Zakirov  writes:

I think this is not a bug. It is normal behavior. On Mac OS, sscanf()
with the %s format reads the string one byte at a time. The letter
'х' is two bytes long, and sscanf() splits it into two invalid characters.



That argument might be convincing if OSX behaved that way for all
multibyte characters, but it doesn't seem to be doing that.  Why is
only 'х' affected?


I looked into the OS X sources, and found that indeed you are right:
*scanf processes the input a byte at a time, and applies isspace() to
each byte separately, even when the locale is such that that's a clearly
insane thing to do.  Since this code was derived from FreeBSD, FreeBSD
has or once had the same issue.  (A look at the freebsd project on github
says it still does, assuming that's the authoritative repo.)  Not sure
about other BSDen.

I also verified that in UTF8-based locales, isspace() thinks that 0x85 and
0xA0, and no other high-bit-set values, are spaces.  Not sure exactly why
it thinks that, but that explains why 'х' fails when adjacent code points
don't.

So apparently the coding rule we have to adopt is "don't use *scanf()
on data that might contain multibyte characters".  (There might be corner
cases where it'd work all right for conversion specifiers other than %s,
but probably you might as well just use strtol and friends in such cases.)
Ugh.

regards, tom lane



Yes, I meant this. The second byte divides the word into two wrong pieces.

Sorry for my unclear explanation. I should have explained it more clearly.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-02-11 Thread Michael Paquier
On Thu, Feb 11, 2016 at 5:20 PM, Andres Freund  wrote:
> On 2016-02-11 09:25:30 +0530, Amit Kapila wrote:
>> On Wed, Feb 10, 2016 at 1:42 PM, Michael Paquier 
>> wrote:
>> >
>> > On Wed, Feb 10, 2016 at 5:00 PM, Amit Kapila 
>> wrote:
>> > > Okay, but isn't it better that we remove the snapshot taken
>> > > at checkpoint time in the main branch, or back to wherever this code
>> > > is getting back-patched?  Do you see any need for it after
>> > > having the logging of the snapshot in the bgwriter?
>> >
>> > But this one is necessary as well to allow hot standby faster to
>> > initialize, no? Particularly in the case where a bgwriter snapshot
>> > would have been taken just before the checkpoint, there may be up to
>> > 15s until the next one.
>> >
>>
>> It could be helpful if the checkpoint is done at shorter intervals;
>> otherwise we log it at a 15s interval anyway, and if faster initialisation
>> of hot standby is needed, then it is better to reduce the log snapshot
>> interval in the bgwriter.
>
> No. By emitting the first snapshot directly after the determination of
> the redo pointer, we can get into STANDBY_SNAPSHOT_READY more
> quickly. Especially if some of the snapshots are overflowed. During
> startup 15s can be a *long* time; but on the other hand there's not much
> benefit in logging snapshots more frequently when the system is up.
> I don't think we should tinker with the frequency/logging points while
> fixing the issue here.

Agreement from here on this point. Andres, what's still remaining on
my side is to work out how we detect in XLogInsert() when to update the
LSN progress. I was willing to add XLogInsertExtended() with a
complementary int/bool flag and leave XLogInsert() alone, since you don't
much like making this decision using just the record type, as I
did in one of the patches upthread; however, you did not like this idea
either. What exactly do you have in mind?
-- 
Michael




Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

assigned https://commitfest.postgresql.org/9/514/

Regards

Pavel


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Andres Freund
On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
> Looking at this patch, I have mixed feelings about it. On the one hand
> I really like the look of the output, and I can see that the non-fixed
> nature of the output columns makes this hard to achieve server-side.

> But on the other hand, this seems to be going way beyond the normal
> level of result formatting that something like \x does, and I find the
> syntax for sorting particularly ugly.

I have pretty similar doubts. Adding features to psql which are complex
enough that people will likely be forced to parse psql
output...  On the other hand, a proper server-side solution won't be
easy; so maybe this is an acceptable stopgap.

Greetings,

Andres Freund




Re: [HACKERS] Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

2016-02-11 Thread Andres Freund
On 2016-02-11 09:25:30 +0530, Amit Kapila wrote:
> On Wed, Feb 10, 2016 at 1:42 PM, Michael Paquier 
> wrote:
> >
> > On Wed, Feb 10, 2016 at 5:00 PM, Amit Kapila 
> wrote:
> > > Okay, but isn't it better that we remove the snapshot taken
> > > at checkpoint time in the main branch, or back to wherever this code
> > > is getting back-patched?  Do you see any need for it after
> > > having the logging of the snapshot in the bgwriter?
> >
> > But this one is necessary as well to allow hot standby faster to
> > initialize, no? Particularly in the case where a bgwriter snapshot
> > would have been taken just before the checkpoint, there may be up to
> > 15s until the next one.
> >
> 
> It could be helpful if the checkpoint is done at shorter intervals;
> otherwise we log it at a 15s interval anyway, and if faster initialisation
> of hot standby is needed, then it is better to reduce the log snapshot
> interval in the bgwriter.

No. By emitting the first snapshot directly after the determination of
the redo pointer, we can get into STANDBY_SNAPSHOT_READY more
quickly. Especially if some of the snapshots are overflowed. During
startup 15s can be a *long* time; but on the other hand there's not much
benefit in logging snapshots more frequently when the system is up.

I don't think we should tinker with the frequency/logging points while
fixing the issue here.




Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
2016-02-11 7:44 GMT+01:00 Vitaly Burovoy :

> On 2/10/16, Pavel Stehule  wrote:
> > Hi Vitaly,
> >
> > please, can you send your version of this patch, how we talked about it
> in
> > Moscow?
> >
> > Thank you
> >
> > Pavel
>
> Hello, Pavel!
>
> Please find attached my version of the patch (it applies cleanly on
> top of 64d89a9 which is current master).
>
> It is time to change the OID. 3331 is used by "bytea_sortsupport", so I set
> 3334 for "pg_size_bytes".
>
> I took the code design of the number checking from json_lex_number in
> src/backend/utils/adt/json.c
> It seems more structured to me. I adapted it a little, and it makes it
> easy to add parsing of an exponent (like '10e3 MB') for full support of
> numeric, if that ever becomes necessary.
>

yes, it is better structured


>
> When I added "trimming" for size units (playing with avoiding an extra
> buffer), I found it is easy to support a "bytes" unit (though "byte" is
> still unsupported).
>

I am a little unsure about supporting a unit that the GUC parser does not
support. But for usage in custom code, and for this current usage, it is acceptable.


>
>
> Also this version includes all changes I mentioned in my last review[1]:
> 1. parse_memory_unit returns the value instead of using a pointer for it
> (returning zero if nothing is found);
> 2. all messages are in a single style (nuances are in errdetails);
> 3. "select"s are in uppercase, rephrased and moved a comment block in test;
> 4. several tests are added (also with supporting of "bytes" unit);
> 5. a sentence in a documentation is rephrased (numeric->fixed-point
> number); "bytes" unit is added to both functions;
> 6. fixed indentation a little;
> 7. pfree is removed (it is easier than removing all other allocated
> resources).
>

ok, thank you


>
>
> I still think my changes are small and they are based on your work
> (and research).
>

thank you very much - but your refactoring is significant and helpful. I'll
move your version to the open commitfest.

Regards

Pavel


>
> [1]
> http://www.postgresql.org/message-id/cakoswnk13wvdem06lro-hucr0pr6et29+dvqy6j5skxzaru...@mail.gmail.com
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
This is the last incarnation of my patch (cleaned and refactored by Vitaly Burovoy).


Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-02-11 Thread Amit Kapila
On Thu, Feb 11, 2016 at 8:02 PM, Robert Haas  wrote:
>
> On the substantive part of the patch, this doesn't look safe:
>
> +/*
> + * Add ourselves to the list of processes needing a group XID status
> + * update.
> + */
> +proc->clogGroupMember = true;
> +proc->clogGroupMemberXid = xid;
> +proc->clogGroupMemberXidStatus = status;
> +proc->clogGroupMemberPage = pageno;
> +proc->clogGroupMemberLsn = lsn;
> +while (true)
> +{
> +nextidx = pg_atomic_read_u32(&ProcGlobal->clogGroupFirst);
> +
> +/*
> + * Add the proc to list if the clog page where we need to update the
> + * current transaction status is same as group leader's clog page.
> + */
> +if (nextidx != INVALID_PGPROCNO &&
> +ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
> +proc->clogGroupMemberPage)
> +return false;
>
> DANGER HERE!
>
> +pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
>
> +if (pg_atomic_compare_exchange_u32(&ProcGlobal->clogGroupFirst,
> +   &nextidx,
> +   (uint32) proc->pgprocno))
> +break;
> +}
>
> There is a potential ABA problem here.  Suppose that this code
> executes in one process as far as the line that says DANGER HERE.
> Then, the group leader wakes up, performs all of the CLOG
> modifications, performs another write transaction, and again becomes
> the group leader, but for a different member page.  Then, the original
> process that went to sleep at DANGER HERE wakes up.  At this point,
> the pg_atomic_compare_exchange_u32 will succeed and we'll have
> processes with different pages in the list, contrary to the intention
> of the code.
>

Very good catch.  I think if we want to address this, we can detect
the non-group-leader transactions that try to update a different
CLOG page (different from the group leader's) after acquiring
CLogControlLock, and then mark these transactions such that
after waking they need to perform the CLOG update via the normal path.
Now this can increase the latency of such transactions, but I
think there will be only very few transactions, if any at all, which
can face this condition, because most of the concurrent transactions
should be on the same page; otherwise the idea of multiple slots we
tried upthread would have shown benefits.
Another idea could be to update the comments indicating the
possibility of multiple clog-page updates in the same group, on the basis
that such cases will be rare and, even if they happen, they won't affect the
correctness of the transaction status update.

Do you have anything else in mind?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

2016-02-12 2:28 GMT+01:00 Vitaly Burovoy :

> Hello!
>
> On 2/11/16, Pavel Stehule  wrote:
> > Hi
> >
> > assigned https://commitfest.postgresql.org/9/514/
> >
> > Regards
> > Pavel
>
>
> This patch was almost done by the end of the previous CF (2016-01):
> there were a few little flaws, which are solved by now.
>
> I have reviewed this patch.
> It applies cleanly at the top of current master
> (f144f73242acef574bc27a4c70e809a64806e4a4), compiles silently and
> implements the behavior reached by consensus.
>
> All size units (the same as used in the GUC) are supported.
> In addition, "bytes" is also supported, which makes it an inverse
> function of pg_size_pretty[1].
>
> The documentation describes behavior of the function and possible size
> units. I don't see what else needs to be said.
>
> The code is clean and commented.
>
> Regression tests cover possible use cases and corner cases.
>
>
> Notes for a committer:
> 1. pg_proc.h has changed, so the CATALOG_VERSION_NO must also be changed.
> 2. Code comments, error message strings, DESCR in pg_proc.h and
> changes in the documentation need proof-reading by a native English
> speaker, which the Author and I are not.
>

Thank you for review and other work

Nice day

Pavel


>
>
> [CF] https://commitfest.postgresql.org/9/514/
> [1]
> http://www.postgresql.org/message-id/ca+tgmozfomg4eyorzzgf7pzotg9pxpuhtqvxlfskim4izh8...@mail.gmail.com
>
> --
> Best regards,
> Vitaly Burovoy
>


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-11 Thread Amit Kapila
On Wed, Feb 10, 2016 at 8:51 PM, Robert Haas  wrote:
>
> On Wed, Feb 10, 2016 at 1:32 AM, Amit Kapila 
wrote:
> > The reason for using centralized way is that we need to request
> > named tranches before initialization of shared memory and as far as
> > I can see, currently there is no way in the subsystems where they can
> > issue such a request, so one possibility is that we introduce new API's
> > like InitBufferLWLocks(), InitLmgrLWLocks(), InitPredicateLWLocks()
> > in respective subsystem and call them in
> > CreateSharedMemoryAndSemaphores() before shared memory
> > initialization. Does by doing that way addresses your concern?
>
> Well, if we're going to have new functions like that, I think the
> place to call them from is PostmasterMain() just before
> process_shared_preload_libraries(). After all, if extensions were
> requesting tranches, they'd do it from
> process_shared_preload_libraries(), so it seems like the right place.
>

Initially I also thought like that, but on further analysis I found that
we also need to request the tranches from InitCommunication() as
that gets called from initdb path.

> However, since the number of locks we need for each of these
> subsystems is fixed at compile time, it seems a bit of a shame to have
> to do something about them at runtime.  I wonder if we should just
> hard-code this in CreateLWLocks() instead of trying to use the
> named-tranche facility.  That is, where that function does this:
>
> MainLWLockTranche.name = "main";
> MainLWLockTranche.array_base = MainLWLockArray;
> MainLWLockTranche.array_stride = sizeof(LWLockPadded);
> LWLockRegisterTranche(LWTRANCHE_MAIN, &MainLWLockTranche);
>
> ...register four tranches instead.  And where it does this:
>
> /* Initialize all fixed LWLocks in main array */
> for (id = 0, lock = MainLWLockArray; id < numLocks; id++, lock++)
> LWLockInitialize(&lock->lock, LWTRANCHE_MAIN);
>
> ...have four loops instead, each initializing with a different tranche
> ID.  Then the current method of computing the location of those locks
> would still work just fine; the code changes would be a lot more
> isolated, and we wouldn't have to do runtime save-and-restore of more
> variables on Windows.
>

Sounds sensible; however, after the changes CreateLWLocks() started
looking unreadable, so I moved the initialization and registration of tranches
into separate functions.

I also increased the number of tranches allocated in LWLockTrancheArray, as
LWTRANCHE_FIRST_USER_DEFINED is now already greater
than 16.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fixed_locks_tranche_v2.patch
Description: Binary data



Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-02-11 Thread Artur Zakirov

On 11.02.2016 03:33, Tom Lane wrote:

Artur Zakirov  writes:

  [ tsearch_aff_parse_v1.patch ]


I've pushed this with some corrections --- notably, I did not like the
lack of buffer-overrun prevention, and it did the wrong thing if a line
had more than one trailing space character.

We still need to look at other uses of *scanf(), but I think that any
other changes that might be needed should be separate patches anyhow.

regards, tom lane



Thank you!

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Invalid user-level setting = confusing error message

2016-02-11 Thread Tom Lane
Thom Brown  writes:
> At the moment, if we try to set up a configuration parameter for a
> user which doesn't make sense in that context, we get an error message
> that doesn't really tell us that we're not allowed to set it for
> users:
> # ALTER ROLE moo SET log_line_prefix = '%s';
> ERROR:  parameter "log_line_prefix" cannot be changed now
> Might I propose an alternative error message for user-level
> configuration changes that target parameters of level sighup and
> above?
> Perhaps something like:
> ERROR: parameter "log_line_prefix" cannot be set at the user level.

I don't object in principle to having multiple phrasings of that
error message, but I do not see that "at the user level" is any
clearer than what we've got now.  What's a "user level"?

Something along the lines of "cannot be changed within individual
sessions" might work better, or maybe not.  Let the bikeshedding
begin ...

regards, tom lane




[HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:36 AM, Robert Haas  wrote:
> On Thu, Feb 11, 2016 at 9:29 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Add some isolation tests for deadlock detection and resolution.
>>
>> Buildfarm says this needs work ...
>>
>> dromedary is one of mine, do you need me to look into what is
>> happening?
>
> That would be great.  Taking a look at what happened, I have a feeling
> this may be a race condition of some kind in the isolation tester.  It
> seems to have failed to recognize that a1 started waiting, and that
> caused the "deadlock detected" message to be reported differently.  I'm
> not immediately sure what to do about that.

Yeah, so: try_complete_step() waits 10ms, and if it still hasn't
gotten any data back from the server, then it uses a separate query to
see whether the step in question is waiting on a lock.  So what
must've happened here is that it took more than 10ms for the process
to show up as waiting in pg_stat_activity.

It might be possible to fix this by not passing STEP_NONBLOCK if
there's only one connection that isn't waiting.  I think I had it like
that at one point, and then took it out because it caused some other
problem.  Another option is to lengthen the timeout.  It doesn't seem
great to be dependent on a fixed timeout, but the server doesn't send
any protocol traffic to indicate a lock wait.  If we declared which
steps are supposed to wait, then there'd be no ambiguity, but that
seems like a drag.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Improved error reporting in format()

2016-02-11 Thread Teodor Sigaev

Thank you, committed

Jim Nasby wrote:

On 1/2/16 5:57 PM, Jim Nasby wrote:

Attached patch clarifies %-related error messages with hints, as
well as (IMHO) improving the clarity of the message:


Sorry, forgot to update regression tests. New patch attached.





--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev

1 - sml_limit to similarity_limit. sml_threshold is difficult to write, I think;
similarity_limit is simpler.

It seems to me that "threshold" is the right word by meaning; sml_threshold is my
choice.


2 - subword_similarity() to word_similarity().

Agreed, per Mike Rylander's opinion in this thread.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




[HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:29 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Add some isolation tests for deadlock detection and resolution.
>
> Buildfarm says this needs work ...
>
> dromedary is one of mine, do you need me to look into what is
> happening?

That would be great.  Taking a look at what happened, I have a feeling
this may be a race condition of some kind in the isolation tester.  It
seems to have failed to recognize that a1 started waiting, and that
caused the "deadlock detected" message to be reported differently.  I'm
not immediately sure what to do about that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2016-02-11 Thread Masahiko Sawada
Thank you for reviewing this patch!

On Wed, Feb 10, 2016 at 4:39 PM, Kyotaro HORIGUCHI
 wrote:
> Hello,
>
> At Thu, 4 Feb 2016 02:32:29 +0900, Masahiko Sawada  
> wrote in 
>> On Tue, Feb 2, 2016 at 7:22 PM, Alvaro Herrera  
>> wrote:
>> > Masahiko Sawada wrote:
>> >> I think we have two options.
>> >>
>> >> 1. Change page layout(PG_PAGE_LAYOUT_VERSION) to 5. pg_upgrade detects
>> >> it and then converts only VM files.
>> >> 2. Change pg_upgrade plugin mechanism so that it can handle other name
>> >> conversion plugins (e.g., convertLayout_vm_to_vfm)
>> >>
>> >> I think #2 is better. Thought?
>> >
>> > My vote is for #2 as well.  Maybe we just didn't have forks when this
>> > functionality was invented; maybe the author just didn't think hard
>> > enough about what would be the right interface to do it.
>>
>> I've almost written up a very rough patch (it passes the regression test).
>> Windows support is not there yet, and the Makefile is not correct.
>>
>> I've divided the main patch into two patches; add frozen bit patch and
>> pg_upgrade support patch.
>> 000 patch is almost the same as the previous code (includes a small fix).
>> 001 patch provides rewriting visibility map as a pageConverter routine.
>> 002 patch is for enhancement debug message in visibilitymap.c
>
> Thanks, it becomes easy to read.
>
>> In order to support pageConvert plugin, I made the following changes.
>> * Main changes
>> - Remove PAGE_CONVERSION
>> - pg_upgrade plugins are located in the 'src/bin/pg_upgrade/plugins' directory.
>> - Move the directory containing plugins from '$(bin)/plugins' to '$(lib)/plugins'.
>
> These seem fair.
>
>> - Add a new page-converter plugin function for the visibility map.
>> - The current code doesn't allow us to use link mode (-k) in the case
>> where a page converter is required.
>>
>>   But I changed it so that if a page converter for a fork file is
>> specified, we actually convert it even in link mode.
>>
>> * Interface design
>> convertFile() and convertPage() are plugin functions for the main relation
>> file, and these functions are dynamically loaded by
>> loadConvertPlugin().
>
> Though I haven't looked at this very closely, loadConverterPlugin looks
> to continue deciding what plugin to load using the old and new page
> layout versions. Currently the actually possible versions are 4
> and, if we increment it now, 5.
>
> On the other hand, _vm came at the *catalog version* 201107031
> (9.1 release) and _fsm came in the 8.4 release. Both of them are of
> page layout version 4. Are we allowed to increment the page layout
> version for this reason? And is this framework under
> reconstruction flexible enough for this kind of change in
> the future? I don't think so.

Yeah, I also think that the page layout version should not be increased by
this layout change of the VM.
This patch checks the catalog version first, and then decides what
plugin to load.
In this case, only the format of the VM has been changed, so pg_upgrade
loads a plugin for the VM and converts it.
pg_upgrade doesn't load any other plugin file; other files are just copied.

>
>
> We have added _vm and _fsm so far so we must use a version number
> that can determine when _vm, _fsm and _vfm are introduced. I'm
> afraid that it is out of purpose, catalog version seems to be
> most usable, it is already used to know when the crash safe VM
> has been introduced.
>
> Using the catalog version, the plugin we provide first would be
> converteLayout_201105231_201602071.so which has only a converter
> from _vm to _vfm. This plugin is loaded for the combination of
> the source cluster with the catalog version of 201105231(when VM
> has been introduced) or later and the destination cluster with
> that *before* 201602071 (this version).
>
> If we change the format of fsm (vm no longer exists), we would
> have a new plugin maybe named
> converteLayout_200904091_2017x.so which has an, maybe,
> inplace file converter for fsm. It will be loaded when a source
> database is of the catalog version of 200904091(when FSM has been
> introduced) or later and a destination before 2017x(the
> version). Catalog version seems to work fine.

I think that it's not a good idea to use the catalog version for the plugin name.
Because, even if the catalog version is used for the plugin file name as you
suggested, pg_upgrade still needs to decide by itself what plugin name to
load.
Also, a plugin file named after a catalog version would not make it easy to
understand what the plugin actually does. It's not developer friendly.
The advantage of using the page layout version as the plugin name is that
pg_upgrade can decide automatically which plugin should be loaded.

>
> So far, I assumed that we can safely assume that the name of
> files to be converted is [fork_name] so the possible types
> of conversions would be the following.
>
>  - per-page conversion
>  - per-file conversion between the files with the same fork name.

Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:15 AM, Andres Freund  wrote:
> On 2016-02-11 08:50:41 -0500, Robert Haas wrote:
>> Are we thinking to back-patch this?  I would be disinclined to
>> back-patch widespread changes like this.  If there's a specific
>> instance related to Gin where this is causing a tangible problem, we
>> could back-patch just that part, with a clear description of that
>> problem.  Otherwise, I think this should be master-only.
>
> I'm not sure. It's pretty darn nasty that right now we fail in some
> places in the code if stdbool.h is included previously. That's probably
> going to become more and more common. On the other hand it's invasive,
> as you say.  Partially patching things doesn't seem like a really viable
> approach to me, bugs caused by this are hard to find/trigger.

I have never been quite clear on what you think is going to cause
stdbool.h inclusions to become more common.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:11 AM, Aleksander Alekseev
 wrote:
>> Thanks, this looks way better.  Some more comments:
>>
>> - I don't think there's any good reason for this patch to change the
>> calling convention for ShmemInitHash().  Maybe that's a good idea and
>> maybe it isn't, but it's a separate issue from what this patch is
>> doing, and if we're going to do it at all, it should be discussed
>> separately.  Let's leave it out of this patch.
>>
>> - I would not provide an option to change the number of freelist
>> mutexes.  Let's dump DEFAULT_MUTEXES_NUM and MAX_MUTEXES_NUM and have
>> FREELIST_MUTEXES_NUM.  The value of 32 which you selected is fine with
>> me.
>>
>> - The change to LOG2_NUM_LOCK_PARTITIONS in lwlock.h looks like an
>> independent change.  Again, leave it out of this patch and submit it
>> separately with its own benchmarking data if you want to argue for it.
>>
>> I think if you make these changes this patch will be quite a bit
>> smaller and in pretty good shape.
>>
>> Thanks for working on this.
>
> Here is a corrected patch. I decided to keep a small change in the
> InitLocks procedure (lock.c), since the ShmemInitHash description clearly
> states "For a table whose maximum size is certain, [init_size] should
> be equal to max_size". Also, I added two items to my TODO list to send
> more patches as soon as (and if) this patch is applied.
>
> Thanks for your contribution to this patch.

The fact that InitLocks() doesn't do this has been discussed before
and there's no consensus on changing it.  It is, at any rate, a
separate issue.  I'll go through the rest of this patch again now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Andres Freund
On 2016-02-11 09:51:26 -0500, Robert Haas wrote:
> I have never been quite clear on what you think is going to cause
> stdbool.h inclusions to become more common.

Primarily because it's finally starting to be supported across the
board, thanks to MS finally catching up.

Then there's uglyness like:
http://archives.postgresql.org/message-id/508bd3a0-b0a1-4f10-ad9c-2f86656f6e79%40postgrespro.ru




Re: [HACKERS] max_parallel_degree context level

2016-02-11 Thread Dean Rasheed
On 11 February 2016 at 13:18, Robert Haas  wrote:
> On Thu, Feb 11, 2016 at 7:40 AM, Thom Brown  wrote:
>> As it currently stands, max_parallel_degree is set to a superuser
>> context
>
> I don't have a clue why it's like that.  It seems like it should be
> PGC_USERSET just like, say, work_mem.  I think that's just brain fade
> on my part, and I think the current setting will be really
> inconvenient for unprivileged users: as it is, they have no way to
> turn parallel query off.  Unless somebody objects, I'll go change
> that.
>

+1. I would want it to be user settable.

Regards,
Dean




Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 1:44 PM, Andres Freund  wrote:
> On 2016-02-04 16:54:58 +0100, Andres Freund wrote:
>> Fabien asked me to post a new version of the checkpoint flushing patch
>> series. While this isn't entirely ready for commit, I think we're
>> getting closer.
>>
>> I don't want to post a full series right now, but my working state is
>> available on
>> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/checkpoint-flush
>> git://git.postgresql.org/git/users/andresfreund/postgres.git checkpoint-flush
>
> The first two commits of the series are pretty close to being ready. I'd
> welcome review of those, and I plan to commit them independently of the
> rest as they're beneficial independently.  The most important bits are
> the comments and docs of 0002 - they weren't particularly good
> beforehand, so I had to rewrite a fair bit.
>
> 0001: Make SetHintBit() a bit more aggressive, afaics that fixes all the
>   potential regressions of 0002
> 0002: Fix the overaggressive flushing by the wal writer, by only
>   flushing every wal_writer_delay ms or wal_writer_flush_after
>   bytes.

I previously reviewed 0001 and I think it's fine.  I haven't reviewed
0002 in detail, but I like the concept.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:53 AM, Robert Haas  wrote:
> The fact that InitLocks() doesn't do this has been discussed before
> and there's no consensus on changing it.  It is, at any rate, a
> separate issue.  I'll go through the rest of this patch again now.

I did a little bit of cleanup on this patch and the results are
attached.  I also did some benchmarking of this version.  I tested
this on two systems, in each case using five-minute, read-only pgbench
runs at scale factor 3000, varying the client count and taking the
median of three runs.  First, I tested it on hydra, a 2-socket,
16-processor, 64-thread POWER box.  This is a community resource
hosted at OSUOSL.  Second, I tested it on cthulhu, an 8-socket,
64-processor, 128-thread x86_64 box.  This is an EnterpriseDB
resource.  Here are the results:

hydra, master vs. patched
1 client: 8042.872109 vs. 7839.587491 (-2.5%)
64 clients: 213311.852043 vs. 214002.314071 (+0.3%)
96 clients: 219551.356907 vs. 221908.397489 (+1.1%)
128 clients: 210279.022760 vs. 217974.079171 (+3.7%)

cthulhu, master vs. patched
1 client: 3407.705820 vs. 3645.129360 (+7.0%)
64 clients: 88667.681890 vs. 82636.914343 (-6.8%)
96 clients: 79303.750250 vs. 105442.993869 (+33.0%)
128 clients: 74684.510668 vs. 120984.938371 (+62.0%)

Obviously, the high-client count results on cthulhu are stellar.  I'm
*assuming* that the regressions are just random variation.  I am
however wondering if it would be better to set the freelist affinity based on
something other than the hash value, like say the PID, so that the
same process doesn't keep switching to a different freelist for every
buffer eviction.  It also strikes me that it's probably quite likely
that slock_t mutex[NUM_FREELISTS] is a poor way to lay out this data
in memory.  For example, on a system where slock_t is just one byte,
most likely all of those mutexes are going to be in the same cache
line, which means you're going to get a LOT of false sharing.  It
seems like it would be sensible to define a new struct that contains
an slock_t, a long, and a HASHELEMENT *, and then make an array of
those structs.  That wouldn't completely eliminate false sharing, but
it would reduce it quite a bit.  My guess is that if you did that, you
could reduce the number of freelists to 8 or less and get pretty much
the same performance benefit that you're getting right now with 32.
And that, too, seems likely to be good for single-client performance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 24a53da..06c413c 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -15,7 +15,7 @@
  * to hash_create.  This prevents any attempt to split buckets on-the-fly.
  * Therefore, each hash bucket chain operates independently, and no fields
  * of the hash header change after init except nentries and freeList.
- * A partitioned table uses a spinlock to guard changes of those two fields.
+ * A partitioned table uses spinlocks to guard changes of those fields.
  * This lets any subset of the hash buckets be treated as a separately
  * lockable partition.  We expect callers to use the low-order bits of a
  * lookup key's hash value as a partition number --- this will work because
@@ -111,6 +111,8 @@
 #define DEF_DIRSIZE			   256
 #define DEF_FFACTOR			   1	/* default fill factor */
 
+/* Number of freelists to be used for a partitioned hash table. */
+#define NUM_FREELISTS			32
 
 /* A hash bucket is a linked list of HASHELEMENTs */
 typedef HASHELEMENT *HASHBUCKET;
@@ -128,12 +130,17 @@ typedef HASHBUCKET *HASHSEGMENT;
  */
 struct HASHHDR
 {
-	/* In a partitioned table, take this lock to touch nentries or freeList */
-	slock_t		mutex;			/* unused if not partitioned table */
-
-	/* These fields change during entry addition/deletion */
-	long		nentries;		/* number of entries in hash table */
-	HASHELEMENT *freeList;		/* linked list of free elements */
+	/*
+	 * The freelist can become a point of contention on high-concurrency hash
+	 * tables, so we use an array of freelists, each with its own mutex and
+	 * nentries count, instead of just a single one.
+	 *
+	 * If the hash table is not partitioned, only nentries[0] and freeList[0]
+	 * are used, and the spinlocks are not used at all.
+	 */
+	slock_t		mutex[NUM_FREELISTS];		/* array of spinlocks */
+	long		nentries[NUM_FREELISTS];	/* number of entries */
+	HASHELEMENT *freeList[NUM_FREELISTS]; /* lists of free elements */
 
 	/* These fields can change, but not in a partitioned table */
 	/* Also, dsize can't change in a shared table, even if unpartitioned */
@@ -166,6 +173,9 @@ struct HASHHDR
 
 #define IS_PARTITIONED(hctl)  ((hctl)->num_partitions != 0)
 
+#define FREELIST_IDX(hctl, hashcode) \
+	(IS_PARTITIONED(hctl) ? hashcode % NUM_FREELISTS : 0)
+
 /*
  * Top control structure for a hashtable --- in a shared table, each backend
  * has its 

Re: [HACKERS] [PATCH v4] GSSAPI encryption support

2016-02-11 Thread Robbie Harwood
Michael Paquier  writes:

> On Thu, Feb 11, 2016 at 6:06 AM, Robbie Harwood  wrote:
>>
>> - The GSSAPI authentication code has been moved without modification.
>>   In doing so, the temptation to modify it (flags, error checking, that
>>   big comment at the top about things from Athena, etc.)  is very large.
>>   I do not know whether these changes are best suited to another patch
>>   in this series or should be reviewed separately.  I am also hesitant
>>   to add things beyond the core before I am told this is the right
>>   approach.
>
> I would recommend a different patch if code needs to be moved around.
> The move may make sense taken as an independent piece of the
> integration.

Sorry, are you suggesting a separate patch for moving the GSS auth code,
or a separate patch for changes to said code?  I am happy to move it if
so; I just want to be sure.

> + * Portions Copyright (c) 2015-2016, Red Hat, Inc.
> + * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
> I think that this part may be a problem... Not sure the feeling of the
> others regarding additional copyright notices.

Good catch.  That's an accident (force of habit).  Since I'm pretty sure
this version won't be merged anyway, I'll drop it from the next one.

> It would be good to add that to the next CF, I will be happy to get a
> look at it.

Sounds good.  Thanks for looking at it!




Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 12:10 PM, Amit Kapila  wrote:
> Okay, changed as per suggestion.

Looks good to me; committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Freeze avoidance of very large table.

2016-02-11 Thread Robert Haas
On Wed, Feb 3, 2016 at 12:32 PM, Masahiko Sawada  wrote:
> I've divided the main patch into two patches; add frozen bit patch and
> pg_upgrade support patch.
> 000 patch is almost  same as previous code. (includes small fix)
> 001 patch provides rewriting visibility map as a pageConverter routine.
> 002 patch is for enhancement debug message in visibilitymap.c

I'd like to suggest splitting 000 into two patches.  The first one
would change the format of the visibility map, and the second one
would change VACUUM to optimize scans based on the new format.  I
think that would make it easier to get this reviewed and committed.

I think this patch churns a bunch of things that don't really need to
be churned.  For example, consider this hunk:

 /*
  * If we didn't pin the visibility map page and the page has become all
- * visible while we were busy locking the buffer, we'll have to unlock and
- * re-lock, to avoid holding the buffer lock across an I/O.  That's a bit
- * unfortunate, but hopefully shouldn't happen often.
+ * visible or all frozen while we were busy locking the buffer, we'll
+ * have to unlock and re-lock, to avoid holding the buffer lock across an
+ * I/O.  That's a bit unfortunate, but hopefully shouldn't happen often.
  */

Since the page can't become all-frozen without also becoming
all-visible, the original text is still 100% accurate, and the change
doesn't seem to add any useful clarity.  Let's think about which
things really need to be changed and not just mechanically change
everything.

-Assert(PageIsAllVisible(heapPage));
+/*
+ * Caller is expected to set PD_ALL_VISIBLE or
+ * PD_ALL_FROZEN first.
+ */
+Assert(((flags | VISIBILITYMAP_ALL_VISIBLE) && PageIsAllVisible(heapPage)) ||
+   ((flags | VISIBILITYMAP_ALL_FROZEN) && PageIsAllFrozen(heapPage)));

I think this would be more clear as two separate assertions.

Your 000 patch has a little bit of whitespace damage:

[rhaas pgsql]$ git diff --check
src/backend/commands/vacuumlazy.c:1951: indent with spaces.
+bool *all_visible, bool
*all_frozen)
src/include/access/heapam_xlog.h:393: indent with spaces.
+Buffer vm_buffer, TransactionId
cutoff_xid, uint8 flags);

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Pavel Stehule
Hi

2016-02-11 7:44 GMT+01:00 Vitaly Burovoy :

> On 2/10/16, Pavel Stehule  wrote:
> > Hi Vitaly,
> >
> > please, can you send your version of this patch, how we talked about it
> in
> > Moscow?
> >
> > Thank you
> >
> > Pavel
>
> Hello, Pavel!
>
> Please find attached my version of the patch (it applies cleanly on
> top of 64d89a9 which is current master).
>
>
I tested this patch and all tests passed

Regards

Pavel


Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Dean Rasheed
On 11 February 2016 at 08:43, Andres Freund  wrote:
> On 2016-02-09 09:27:04 +, Dean Rasheed wrote:
>> Looking at this patch, I have mixed feelings about it. On the one hand
>> I really like the look of the output, and I can see that the non-fixed
>> nature of the output columns makes this hard to achieve server-side.
>
>> But on the other hand, this seems to be going way beyond the normal
>> level of result formatting that something like \x does, and I find the
>> syntax for sorting particularly ugly.
>
> I've got pretty similar doubts. Adding features to psql which are complex
> enough that it's likely that people will be forced to parse psql
> output...  On the other hand, a proper server-side solution won't be
> easy; so maybe this is an okay enough stopgap.
>

Well to be clear, I like the idea of this feature, and I'm not trying
to stand in the way of progressing it. However, I can't see myself
committing it in its current form.

My biggest problem is with the sorting, for all the reasons discussed
above. There is absolutely no reason for \crosstabview to be
re-sorting rows -- they should just be left in the original query
result order. Sorting columns is a little more understandable, since
there is no way for the original query to control the order in which
the colV values come out, but Tom raises a good point -- there are far
too many bells and whistles when it comes to sorting, and we don't
want to be adding all of them to the psql syntax.

Thinking about this some more though, perhaps *sorting* the columns is
the wrong way to be thinking about it. Perhaps a better approach would
be to allow the columns to be *listed* (optionally, using a separate
query). Something like the following (don't get too hung up on the
syntax):

SELECT name,
   to_char(date, 'Mon') AS month,
   sum(amount) AS amount
 FROM invoices
 GROUP BY 1,2
 ORDER BY name
\crosstabview cols = (select to_char(d, 'Mon') from
generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)

Regards,
Dean




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev

The behavior of this function is surprising to me.

select substring_similarity('dog' ,  'hotdogpound') ;

 substring_similarity
----------------------
                 0.25


Substring search was designed to search for a similar word in a string:
contrib_regression=# select substring_similarity('dog' ,  'hot dogpound') ;
 substring_similarity
----------------------
                 0.75

contrib_regression=# select substring_similarity('dog' ,  'hot dog pound') ;
 substring_similarity
----------------------
                    1


Hmm, this behavior looks too much like magic to me.  I mean, a substring
is a substring -- why are we treating the space as a special character
here?


Because it isn't a regex-style substring search. Since its implementation, pg_trgm
has worked over words in a string.

contrib_regression=# select similarity('block hole', 'hole black');
 similarity
------------
   0.571429
contrib_regression=# select similarity('block hole', 'black hole');
 similarity
------------
   0.571429

It ignores spaces between words and word order.

I agree that substring_similarity is a confusing name; actually it searches for the
word in the second argument most similar to the first argument and returns their similarity.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] [PROPOSAL] Improvements of Hunspell dictionaries support

2016-02-11 Thread Artur Zakirov

Thank you for the review.

On 10.02.2016 19:46, Teodor Sigaev wrote:



I duplicate the patch here.


It's a very good thing to update dictionaries to support modern versions.
And thank you for improving the documentation. Also, I'm impressed by the long
description in the spell.c header.

Some notices about the code:

1
  struct SPELL. Why did you remove union p? You left a comment
  about using the d struct instead of the flag field, and as far as I can
  see it's a correct comment. Removing the union increases the size of the
  SPELL structure.


I will fix it. I had misunderstood Alvaro's comment about it.



2 struct AFFIX. I agree with Alvaro that the sum of the sizes of the bit fields
should be less than or equal to the size of an integer. Otherwise, I suppose,
we can get undefined behavior. Please split the bitfields into two integers.


I will fix it. Here I had misunderstood too.



3 unsigned char flagval[65000];
   Is it forbidden to use a larger number? In any case, decodeFlag() doesn't
   restrict its return value. I suggest enlarging the array to 1<<16 and adding
   a limit on the return value of decodeFlag().


I think it can be done.



4
  I'd like to see a short comment describing at least new functions


Now there are more comments in spell.c. I wanted to send the fixed patch
after adding all the comments that I want to add, but I can send the patch now.
Also I will merge this commit:
http://www.postgresql.org/message-id/e1atf9o-0001ga...@gemulon.postgresql.org




5
  Pls, add tests for new code.




I will add.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] [patch] Proposal for \crosstabview in psql

2016-02-11 Thread Pavel Stehule
Thinking about this some more though, perhaps *sorting* the columns is
> the wrong way to be thinking about it. Perhaps a better approach would
> be to allow the columns to be *listed* (optionally, using a separate
> query). Something like the following (don't get too hung up on the
> syntax):
>
> SELECT name,
>to_char(date, 'Mon') AS month,
>sum(amount) AS amount
>  FROM invoices
>  GROUP BY 1,2
>  ORDER BY name
> \crosstabview cols = (select to_char(d, 'Mon') from
> generate_series('2000-01-01'::date, '2000-12-01', '1 month') d)
>

The idea is okay, but this design cannot be described as user-friendly. Working
with a time dimension is pretty common, and should be supported by some
short, user-friendly syntax.

Regards

Pavel


>
> Regards,
> Dean
>


[HACKERS] max_parallel_degree context level

2016-02-11 Thread Thom Brown
Hi all,

As it currently stands, max_parallel_degree is set to a superuser
context, but we probably want to discuss whether we want to keep it
this way prior to releasing 9.6.  Might we want to reduce its level so
that users can adjust it accordingly?  They'd still be limited by
max_worker_processes, so they'd at least be constrained by that
setting.

Opinions?

Thom




[HACKERS] Request for Code Review: BPGSQL

2016-02-11 Thread Rich Jones
Hello, team!

I am writing on behalf of the BPGSQL Project [1] to request a code audit
from a core PGSQL team member.

The current maintainer is worried about the security of the code, and is
considering closing the project unless it can be properly reviewed [2]. As
a project living downstream[3] of that client library, I'd obviously much
rather see that project get reviewed rather than see it die.

Would anybody here be so kind as to volunteer to give BPGSQL a code review
from an upstream developer's perspective? It would have a lot of value for
downstream users who want to use Postgres on Amazon RDS for serverless
applications, and I'm sure in plenty of other places.

Thanks very much!,
Rich Jones

[1] https://github.com/d33tah/bpgsql
[2] https://github.com/d33tah/bpgsql/issues/7
[3] https://github.com/Miserlou/django-zappa/issues/3


Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Teodor Sigaev

I have attached a new version of the patch. It fixes errors of the operators <->> and
%>:
- operator <->> did not pass the regression test in CentOS 32 bit (gcc 4.4.7
20120313).
- operator %> did not pass the regression test in FreeBSD 32 bit (gcc 4.2.1
20070831).

It was because of variable optimization by gcc.


Fixed with volatile modifier, right?

I'm close to pushing these patches, but I still have doubts about the names, and
I'd like to see comments from English speakers:

1 sml_limit GUC variable (options: similarity_limit, sml_threshold)
2 subword_similarity(). Actually, it finds the most similar word (not substring!)
in the whole string. word_similarity? word_in_string_similarity?



substring_similarity_pos() could be a separate patch.

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/




Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 3:15 AM, Amit Kapila  wrote:
> Sounds sensible; however, after the changes CreateLWLocks() started
> looking unreadable, so I moved initialization and registration of tranches
> to separate functions.

Seems good.

> Increased number of tranches allocated in LWLockTrancheArray, as
> now the LWTRANCHE_FIRST_USER_DEFINED is already greater
> than 16.

OK, cool.

+   int numIndividualLocks = NUM_INDIVIDUAL_LWLOCKS;
+   int numBufMappingLocks = NUM_BUFFER_PARTITIONS;
+   int numLmgrLocks = NUM_LOCK_PARTITIONS;
+   int i;

This doesn't seem like it's buying us anything.  Can't we just use the
constants directly?

+   BufMappingLWLockTranche.name = "Buffer Mapping Locks";

We've not been very consistent about how we name our tranches.  If
we're going to try to expose this information to users, we're going to
need to do better. We've got these names:

main
clog
commit_timestamp
subtrans
multixact_offset
multixact_member
async
oldserxid
WALInsertLocks
Buffer Content Locks
Buffer IO Locks
ReplicationOrigins
Replication Slot IO Locks
proc

Personally, I prefer the style of all lower case, words separated with
underscores where necessary, the word "locks" left out.  That way we
can display wait events as something like lwlock:<tranche name> and
not have the word "lock" in there twice.  I don't think it does much
good to try to make these names "user friendly" because,
fundamentally, these are internal pieces of the system and you won't
know what they are unless you do.  So I'd prefer to make them look
like identifiers.  If we adopt that style, then I think the new
tranches created by this patch should be named buffer_mapping,
lock_manager, and predicate_lock_manager and then, probably as a
separate patch, we should rename the rest to match that style
(buffer_content, buffer_io, replication_origin, replication_slot_io).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Invalid user-level setting = confusing error message

2016-02-11 Thread Thom Brown
Hi,

At the moment, if we try to set up a configuration parameter for a
user which doesn't make sense in that context, we get an error message
that doesn't really tell us that we're not allowed to set it for
users:

# ALTER ROLE moo SET log_line_prefix = '%s';
ERROR:  parameter "log_line_prefix" cannot be changed now

Might I propose an alternative error message for user-level
configuration changes that target parameters of level sighup and
above?

Perhaps something like:

ERROR: parameter "log_line_prefix" cannot be set at the user level.

Thom




Re: [HACKERS] [PATCH v4] GSSAPI encryption support

2016-02-11 Thread Michael Paquier
On Thu, Feb 11, 2016 at 6:06 AM, Robbie Harwood  wrote:
> For your consideration, here is a new version of GSSAPI encryption
> support.  For those who prefer, it's also available on my github:
> https://github.com/frozencemetery/postgres/commit/c92275b6605d7929cda5551de47a4c60aab7179e

Yeah! Glad to see you back.

> Some thoughts:
>
> - The overall design is different this time - GSS encryption sits in
>   parallel construction to SSL encryption rather than at the protocol
>   level - so a strict diff probably isn't useful.
>
> - The GSSAPI authentication code has been moved without modification.
>   In doing so, the temptation to modify it (flags, error checking, that
>   big comment at the top about things from Athena, etc.)  is very large.
>   I do not know whether these changes are best suited to another patch
>   in this series or should be reviewed separately.  I am also hesitant
>   to add things beyond the core before I am told this is the right
>   approach.

I would recommend a different patch if code needs to be moved around.
The move may make sense taken as an independent piece of the
integration.

> - There's no fallback here.  I wrote fallback support for versions 1-3,
>   and the same design could apply here without too much trouble, but I
>   am hesitant to port it over before the encryption design is approved.
>   I strongly suspect you will not want to merge this without fallback
>   support, and that makes sense to me.
>
> - The client and server code look a lot like each other.  This
>   resemblance is not exact, and my understanding is that server and
>   client need to compile independently, so I do not know of a way to
>   rectify this.  Suggestions are welcome.

At quick glance, I like the direction this is taking. You moved all
the communication protocol at a lower level where SSL and secure reads
are located, so this results in a neat integration.

+ * Portions Copyright (c) 2015-2016, Red Hat, Inc.
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
I think that this part may be a problem... Not sure the feeling of the
others regarding additional copyright notices.

It would be good to add that to the next CF, I will be happy to get a
look at it.
-- 
Michael




Re: [HACKERS] max_parallel_degree context level

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 7:40 AM, Thom Brown  wrote:
> As it currently stands, max_parallel_degree is set to a superuser
> context, but we probably want to discuss whether we want to keep it
> this way prior to releasing 9.6.  Might we want to reduce its level so
> that users can adjust it accordingly?  They'd still be limited by
> max_worker_processes, so they'd at least be constrained by that
> setting.

I don't have a clue why it's like that.  It seems like it should be
PGC_USERSET just like, say, work_mem.  I think that's just brain fade
on my part, and I think the current setting will be really
inconvenient for unprivileged users: as it is, they have no way to
turn parallel query off.  Unless somebody objects, I'll go change
that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Mike Rylander
On Thu, Feb 11, 2016 at 8:11 AM, Teodor Sigaev  wrote:
>> I have attached a new version of the patch. It fixes error of operators
>> <->> and
>> %>:
>> - operator <->> did not pass the regression test in CentOS 32 bit (gcc
>> 4.4.7
>> 20120313).
>> - operator %> did not pass the regression test in FreeBSD 32 bit (gcc
>> 4.2.1
>> 20070831).
>>
>> It was because of variable optimization by gcc.
>
>
> Fixed with volatile modifier, right?
>
> I'm close to pushing these patches, but I still have doubts about the names,
> and I'd like to see comments from English speakers:
> 1 sml_limit GUC variable (options: similarity_limit, sml_threshold)
> 2 subword_similarity(). Actually, it finds the most similar word (not
> substring!) in the whole string. word_similarity? word_in_string_similarity?
>

At least for this English speaker, substring_similarity is not
confusing even if it's not internally accurate, but English is a
strange language.

Because I want the bike shed to be blue, how does
query_string_similarity sound instead?  If that's overly precise, then
word_similarity would be fine with me.

Thanks,

--
Mike Rylander




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-02-11 Thread Robert Haas
On Tue, Feb 9, 2016 at 11:14 PM, Amit Kapila  wrote:
>> Patches still apply 1 month later.
>
> Thanks for verification!
>
>>
>> I don't really have an opinion on the variable naming.  I guess they
>> only need making longer if there's going to be some confusion about
>> what they're for,
>
> makes sense; that is the reason why I have added a few comments
> as well, but I'm not sure if you are suggesting something else.
>
>> but I'm guessing it's not a blocker here.
>>
>
> I also think so, but am not sure what else is required here.  The basic
> idea of this rename_pgproc_variables_v2.patch is to rename
> a few variables in existing similar code, so that the main patch
> group_update_clog can adopt that naming convention if required.
> Other than that, I have handled all review comments raised in this
> thread (mainly by Simon and Robert).
>
> Is there anything, I can do to move this forward?

Well, looking at this again, I think I'm OK to go with your names.
That doesn't seem like the thing to hold up the patch for.  So I'll go
ahead and push the renaming patch now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Request for Code Review: BPGSQL

2016-02-11 Thread Jacek Wielemborek
W dniu 11.02.2016 o 14:06, Rich Jones pisze:
> Hello, team!
> 
> I am writing on behalf of the BPGSQL Project [1] to request a code audit
> from a core PGSQL team member.
> 
> The current maintainer is worried about the security of the code, and is
> considering closing the project unless it can be properly reviewed [2]. As
> a project living downstream[3] of that client library, I'd obviously much
> rather see that project get reviewed rather than see it die.
> 
> Would anybody here be so kind as to volunteer to give BPGSQL a code review
> from an upstream developer's perspective? It would have a lot of value for
> downstream users who want to use Postgres on Amazon RDS for serverless
> applications, and I'm sure in plenty of other places.
> 
> Thanks very much!,
> Rich Jones
> 
> [1] https://github.com/d33tah/bpgsql
> [2] https://github.com/d33tah/bpgsql/issues/7
> [3] https://github.com/Miserlou/django-zappa/issues/3
> 

Hello,

Thanks Rich, I second the request for a code review.

I felt I'd add that this is a 1500-line pure-Python PostgreSQL client
module that I inherited from Barry Pederson. After I realized how
execute() is implemented, I have my worries and I'd rather not risk
making my users vulnerable.

I'd be really grateful if somebody who knows a bit of Python and the
guts of PostgreSQL could speak up on this one.

Cheers,
d33tah





Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Michael Paquier
On Thu, Feb 11, 2016 at 12:37 AM, Fabien COELHO  wrote:
> v26 attached implements these changes.

+/* the argument list has been built in reverse order, it is fixed here */
+expr->u.function.args = reverse_elist(args);
Hm. I may be missing something, but why is that necessary? This is
basically doing a double reversal to put all the arguments in the
correct order when parsing the function arguments.
-- 
Michael




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Artur Zakirov

On 11.02.2016 16:35, Mike Rylander wrote:

On Thu, Feb 11, 2016 at 8:11 AM, Teodor Sigaev  wrote:

I have attached a new version of the patch. It fixes errors of the
operators <->> and %>:
- operator <->> did not pass the regression test in CentOS 32 bit (gcc
4.4.7
20120313).
- operator %> did not pass the regression test in FreeBSD 32 bit (gcc
4.2.1
20070831).

It was because of variable optimization by gcc.



Fixed with volatile modifier, right?

I'm close to pushing these patches, but I still have doubts about the names,
and I'd like to see comments from English speakers:
1 sml_limit GUC variable (options: similarity_limit, sml_threshold)
2 subword_similarity(). Actually, it finds the most similar word (not
substring!) in the whole string. word_similarity? word_in_string_similarity?



At least for this English speaker, substring_similarity is not
confusing even if it's not internally accurate, but English is a
strange language.

Because I want the bike shed to be blue, how does
query_string_similarity sound instead?  If that's overly precise, then
word_similarity would be fine with me.

Thanks,

--
Mike Rylander



Great. I can change names:
1 - sml_limit to similarity_limit. sml_threshold is difficult to write, I 
think; similarity_limit is simpler.

2 - subword_similarity() to word_similarity().

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Robert Haas
On Tue, Feb 9, 2016 at 11:53 PM, Michael Paquier wrote:
> On Wed, Feb 10, 2016 at 1:17 AM, Yury Zhuravlev wrote:
>> I've just run into a problem with these macro. Function ginStepRight breaks
>> completely when compiled using the MSVC2013 and MSVC2015 (since these
>> releases use C99's bools but without stdbool.h like C++).
>> I don't understand why the patch has not been committed yet. Whether the
>> fix spells it !! or != 0 seems unimportant to me; the solution is all
>> that matters.
>
> +1 for applying it. There were some conflicts in gist and lwlock, that
> I fixed as per the attached. I have added as well an entry in next CF:
> https://commitfest.postgresql.org/9/507/
> If a committer wants to pick up that before, feel free. For now it
> won't fall into the void; that's better than nothing.

Not actually attached.

Are we thinking to back-patch this?  I would be disinclined to
back-patch widespread changes like this.  If there's a specific
instance related to Gin where this is causing a tangible problem, we
could back-patch just that part, with a clear description of that
problem.  Otherwise, I think this should be master-only.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-02-11 Thread Aleksander Alekseev
Hello, Robert

> Thanks, this looks way better.  Some more comments:
> 
> - I don't think there's any good reason for this patch to change the
> calling convention for ShmemInitHash().  Maybe that's a good idea and
> maybe it isn't, but it's a separate issue from what this patch is
> doing, and if we're going to do it at all, it should be discussed
> separately.  Let's leave it out of this patch.
> 
> - I would not provide an option to change the number of freelist
> mutexes.  Let's dump DEFAULT_MUTEXES_NUM and MAX_MUTEXES_NUM and have
> FREELIST_MUTEXES_NUM.  The value of 32 which you selected is fine with
> me.
> 
> - The change to LOG2_NUM_LOCK_PARTITIONS in lwlock.h looks like an
> independent change.  Again, leave it out of this patch and submit it
> separately with its own benchmarking data if you want to argue for it.
> 
> I think if you make these changes this patch will be quite a bit
> smaller and in pretty good shape.
> 
> Thanks for working on this.
> 

Here is a corrected patch. I decided to keep a small change in the
InitLocks procedure (lock.c), since the ShmemInitHash description clearly
states "For a table whose maximum size is certain, [init_size] should
be equal to max_size". I have also added two items to my TODO list to send
more patches as soon as (and if) this patch is applied.

Thanks for your contribution to this patch.

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index e3e9599..5296e77 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -374,18 +374,10 @@ void
 InitLocks(void)
 {
 	HASHCTL		info;
-	long		init_table_size,
-				max_table_size;
+	long		max_table_size;
 	bool		found;
 
 	/*
-	 * Compute init/max size to request for lock hashtables.  Note these
-	 * calculations must agree with LockShmemSize!
-	 */
-	max_table_size = NLOCKENTS();
-	init_table_size = max_table_size / 2;
-
-	/*
 	 * Allocate hash table for LOCK structs.  This stores per-locked-object
 	 * information.
 	 */
@@ -393,16 +385,16 @@ InitLocks(void)
 	info.keysize = sizeof(LOCKTAG);
 	info.entrysize = sizeof(LOCK);
 	info.num_partitions = NUM_LOCK_PARTITIONS;
+	max_table_size = NLOCKENTS();
 
 	LockMethodLockHash = ShmemInitHash("LOCK hash",
-	   init_table_size,
+	   max_table_size,
 	   max_table_size,
	   &info,
 	HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
 
 	/* Assume an average of 2 holders per lock */
 	max_table_size *= 2;
-	init_table_size *= 2;
 
 	/*
 	 * Allocate hash table for PROCLOCK structs.  This stores
@@ -414,7 +406,7 @@ InitLocks(void)
 	info.num_partitions = NUM_LOCK_PARTITIONS;
 
 	LockMethodProcLockHash = ShmemInitHash("PROCLOCK hash",
-		   init_table_size,
+		   max_table_size,
 		   max_table_size,
		   &info,
  HASH_ELEM | HASH_FUNCTION | HASH_PARTITION);
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 24a53da..1a3244a 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -15,7 +15,7 @@
  * to hash_create.  This prevents any attempt to split buckets on-the-fly.
  * Therefore, each hash bucket chain operates independently, and no fields
  * of the hash header change after init except nentries and freeList.
- * A partitioned table uses a spinlock to guard changes of those two fields.
+ * A partitioned table uses spinlocks to guard changes of those fields.
  * This lets any subset of the hash buckets be treated as a separately
  * lockable partition.  We expect callers to use the low-order bits of a
  * lookup key's hash value as a partition number --- this will work because
@@ -111,6 +111,11 @@
 #define DEF_DIRSIZE			   256
 #define DEF_FFACTOR			   1	/* default fill factor */
 
+/*
+ * Number of mutexes and respectively number of nentries and freeLists
+ * (see below). Should be power of 2.
+ */
+#define MUTEXES_NUM			32
 
 /* A hash bucket is a linked list of HASHELEMENTs */
 typedef HASHELEMENT *HASHBUCKET;
@@ -128,12 +133,23 @@ typedef HASHBUCKET *HASHSEGMENT;
  */
 struct HASHHDR
 {
-	/* In a partitioned table, take this lock to touch nentries or freeList */
-	slock_t		mutex;			/* unused if not partitioned table */
-
-	/* These fields change during entry addition/deletion */
-	long		nentries;		/* number of entries in hash table */
-	HASHELEMENT *freeList;		/* linked list of free elements */
+	/*
+	 * There are two fields declared below: nentries and freeList. nentries
+	 * stores current number of entries in a hash table. freeList is a linked
+	 * list of free elements.
+	 *
+	 * To keep these fields consistent in a partitioned table we need to
+	 * synchronize access to them using a spinlock. But it turned out that a
+	 * single spinlock can create a bottleneck. To prevent lock contention an
+	 * array of MUTEXES_NUM spinlocks is used. Each spinlock protects one
+	 * element of nentries and freeList arrays.
+	 *
+	 * If hash table is not partitioned only nentries[0] and 

Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Andres Freund
On 2016-02-11 08:50:41 -0500, Robert Haas wrote:
> Are we thinking to back-patch this?  I would be disinclined to
> back-patch widespread changes like this.  If there's a specific
> instance related to Gin where this is causing a tangible problem, we
> could back-patch just that part, with a clear description of that
> problem.  Otherwise, I think this should be master-only.

I'm not sure. It's pretty darn nasty that right now we fail in some
places in the code if stdbool.h is included previously. That's probably
going to become more and more common. On the other hand it's invasive,
as you say.  Partially patching things doesn't seem like a really viable
approach to me; bugs caused by this are hard to find/trigger.

- Andres




Re: [HACKERS] Fuzzy substring searching with the pg_trgm extension

2016-02-11 Thread Artur Zakirov

On 11.02.2016 16:11, Teodor Sigaev wrote:

I have attached a new version of the patch. It fixes errors of the
operators <->> and %>:
- operator <->> did not pass the regression test in CentOS 32 bit (gcc
4.4.7
20120313).
- operator %> did not pass the regression test in FreeBSD 32 bit (gcc
4.2.1
20070831).

It was because of variable optimization by gcc.


Fixed with volatile modifier, right?


Yes, it was fixed with the volatile modifier.



I'm close to pushing these patches, but I still have doubts about the names,
and I'd like to see comments from English speakers:
1 sml_limit GUC variable (options: similarity_limit, sml_threshold)
2 subword_similarity(). Actually, it finds the most similar word (not
substring!) in the whole string. word_similarity? word_in_string_similarity?


substring_similarity_pos() could be a separate patch.




--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Speed up Clog Access by increasing CLOG buffers

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:04 AM, Robert Haas  wrote:
>> Is there anything, I can do to move this forward?
>
> Well, looking at this again, I think I'm OK to go with your names.
> That doesn't seem like the thing to hold up the patch for.  So I'll go
> ahead and push the renaming patch now.

On the substantive part of the patch, this doesn't look safe:

+/*
+ * Add ourselves to the list of processes needing a group XID status
+ * update.
+ */
+proc->clogGroupMember = true;
+proc->clogGroupMemberXid = xid;
+proc->clogGroupMemberXidStatus = status;
+proc->clogGroupMemberPage = pageno;
+proc->clogGroupMemberLsn = lsn;
+while (true)
+{
+nextidx = pg_atomic_read_u32(&ProcGlobal->clogGroupFirst);
+
+/*
+ * Add the proc to list if the clog page where we need to update the
+ * current transaction status is same as group leader's clog page.
+ */
+if (nextidx != INVALID_PGPROCNO &&
+ProcGlobal->allProcs[nextidx].clogGroupMemberPage !=
proc->clogGroupMemberPage)
+return false;

DANGER HERE!

+pg_atomic_write_u32(&proc->clogGroupNext, nextidx);
+
+if (pg_atomic_compare_exchange_u32(&ProcGlobal->clogGroupFirst,
+   &nextidx,
+   (uint32) proc->pgprocno))
+break;
+}

There is a potential ABA problem here.  Suppose that this code
executes in one process as far as the line that says DANGER HERE.
Then, the group leader wakes up, performs all of the CLOG
modifications, performs another write transaction, and again becomes
the group leader, but for a different member page.  Then, the original
process that went to sleep at DANGER HERE wakes up.  At this point,
the pg_atomic_compare_exchange_u32 will succeed and we'll have
processes with different pages in the list, contrary to the intention
of the code.

This kind of thing can be really subtle and difficult to fix.  The
problem might not happen even with a very large amount of testing, and
then might happen in the real world on some other hardware or on
really rare occasions.  In general, compare-and-swap loops need to be
really really simple with minimal dependencies on other data, ideally
none.  It's very hard to make anything else work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] max_parallel_degree context level

2016-02-11 Thread Simon Riggs
On 11 February 2016 at 12:40, Thom Brown  wrote:

> Hi all,
>
> As it currently stands, max_parallel_degree is set to a superuser
> context, but we probably want to discuss whether we want to keep it
> this way prior to releasing 9.6.  Might we want to reduce its level so
> that users can adjust it accordingly?  They'd still be limited by
> max_worker_processes, so they'd at least be constrained by that
> setting.
>

A few questions and thoughts to help decide...

Does it take into account the parallel degree during planning?
Does it take into account the actual parallel degree during execution?

If you make max_worker_processes USERSET won't everybody just set it to
max_worker_processes?

How do you prevent or control that? Is that limited by the user's
connection limit?

How does the server behave when fewer workers are available than
max_parallel_degree?

Is it slower if you request N workers, yet only 1 is available?

Does pg_stat_activity show the number of parallel workers active for a
controlling process?
Do parallel workers also show in pg_stat_activity at all?
If so, does it show who currently has them?
Does pg_stat_statements record how many workers were available during
execution?

Is there a way to prevent execution if too few parallel workers are
available?

-- 
Simon Riggs    http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Tom Lane
Robert Haas  writes:
>> That would be great.  Taking a look at what happened, I have a feeling
>> this may be a race condition of some kind in the isolation tester.  It
>> seems to have failed to recognize that a1 started waiting, and that
>> caused the "deadlock detected" message to reported differently.  I'm
>> not immediately sure what to do about that.

> Yeah, so: try_complete_step() waits 10ms, and if it still hasn't
> gotten any data back from the server, then it uses a separate query to
> see whether the step in question is waiting on a lock.  So what
> must've happened here is that it took more than 10ms for the process
> to show up as waiting in pg_stat_activity.

No, because the machines that are failing are showing a "<waiting>"
annotation that your reference output *doesn't* have.  I think what is
actually happening is that these machines are seeing the process as
waiting and reporting it, whereas on your machine the backend detects
the deadlock and completes the query (with an error) before
isolationtester realizes that the process is waiting.

It would probably help if you didn't do this:

setup   { BEGIN; SET deadlock_timeout = '10ms'; }

which pretty much guarantees that there is a race condition: you've set it
so that the deadlock detector will run at approximately the same time when
isolationtester will be probing the state.  I'm surprised that it seemed
to act consistently for you.  I would suggest setting all the other
sessions' deadlock_timeout to 100s and the one you want to fail to a
timeout of ~5s.  That will mean that the "<waiting>" output should
show up pretty reliably even on overloaded buildfarm critters.

regards, tom lane




Re: [HACKERS] max_parallel_degree context level

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 10:32 AM, Simon Riggs  wrote:
> A few questions and thoughts to help decide...
>
> Does it take into account the parallel degree during planning?
> Does it take into account the actual parallel degree during planning?

max_parallel_degree is a query planner GUC, just like work_mem.  Just
as we can't know how much memory will be available at planning time,
we can't know how many worker processes will be available at execution
time.  In each case, we have a GUC that tells the system what to
assume.  In each case also, some better model might be possible, but
today we don't have it.

> If you make max_worker_processes USERSET won't everybody just set it to
> max_worker_processes?

I think that you meant for the first instance of max_worker_processes
in that sentence to be max_parallel_degree.  I'll respond as if that's
what you meant.  Basically, I think this is like asking whether everybody
won't just set work_mem to the entire amount of free memory on the
machine and try to use it all themselves.  We really have never tried
very hard to prevent that sort of thing in PostgreSQL.  Maybe we
should, but we'd have to fix an awful lot of stuff.  There are many
ways for malicious users to do things that interfere with the ability
of other users to use the system.  I admit that the same problem
exists here, but I don't think it's any more severe than any of the
cases that already exist.  In some ways I think it's a whole lot LESS
serious than what a bad work_mem setting can do to your system.

> How does the server behave when fewer workers are available than
> max_parallel_degree?

The same query plan is executed with fewer workers, even with 0
workers.  If we chose a parallel plan that is a mirror of the
non-parallel plan we would have chosen, this doesn't cost much.  If
there's some other non-parallel plan that would be much faster and we
only picked this parallel plan because we thought we would have
several workers available, and then we get fewer or none, that might
be expensive.  One can imagine a system that always computes both a
parallel plan and a non-parallel plan and chooses between them at
runtime, or even multiple plans for varying number of workers, but we
don't have that today.  I am not actually sure it would be worth it.

Basically, I think this comes back to the analogy between
max_parallel_degree and work_mem.  If you set work_mem too high and
the system starts swapping and becomes very slow, that's your fault
(we say) for setting an unreasonable value of work_mem.  Similarly, if
you set max_parallel_degree to an unreasonable value such that the
system is unlikely to be able to obtain that number of workers at
execution time, you have configured your query planner settings
poorly.  This is no different than setting random_page_cost lower than
seq_page_cost or any number of other dumb things you could do.

> Is it slower if you request N workers, yet only 1 is available?

I sure hope so.  There may be some cases where more workers are slower
than fewer workers, but those cases are defects that we should try to
fix.

> Does pg_stat_activity show the number of parallel workers active for a
> controlling process?
> Do parallel workers also show in pg_stat_activity at all?
> If so, does it show who currently has them?
> Does pg_stat_statements record how many workers were available during
> execution?

Background workers show up in pg_stat_activity, but the number of
workers used by a parallel query isn't reported anywhere.  It's
usually pretty easy to figure out from the EXPLAIN (ANALYZE, VERBOSE)
output, but clearly there might be some benefit in reporting it to
other monitoring facilities.  I hadn't really thought about that idea
before, but it's a good thought.

> Is there a way to prevent execution if too few parallel workers are
> available?

No. That might be a useful feature, but I don't have any plans to
implement it myself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] max_parallel_degree context level

2016-02-11 Thread Joe Conway
On 02/11/2016 07:55 AM, Robert Haas wrote:
> On Thu, Feb 11, 2016 at 10:32 AM, Simon Riggs  wrote:
>> If you make max_worker_processes USERSET won't everybody just set it to
>> max_worker_processes?
> 
> I think that you meant for the first instance of max_worker_processes
> in that sentence to be max_parallel_degree.  I'll respond as if that's
> what you meant.  Basically, I think this is like asking whether everybody
> won't just set work_mem to the entire amount of free memory on the
> machine and try to use it all themselves.  We really have never tried
> very hard to prevent that sort of thing in PostgreSQL.  Maybe we
> should, but we'd have to fix an awful lot of stuff.  There are many
> ways for malicious users to do things that interfere with the ability
> of other users to use the system.  I admit that the same problem
> exists here, but I don't think it's any more severe than any of the
> cases that already exist.  In some ways I think it's a whole lot LESS
> serious than what a bad work_mem setting can do to your system.

This is pretty much exactly what I was thinking -- work_mem is already a
bigger potential problem than this. In general I think we need to
eventually provide more admin control over USERSET GUCs, but that is a
whole other conversation.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan support on readfuncs.c)

2016-02-11 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Robert Haas
> Sent: Thursday, February 11, 2016 1:26 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Andres Freund; Amit Kapila; pgsql-hackers
> Subject: Re: CustomScan in a larger structure (RE: [HACKERS] CustomScan 
> support
> on readfuncs.c)
> 
> On Wed, Feb 10, 2016 at 1:25 AM, Kouhei Kaigai  wrote:
> > It is pretty good!
> >
> > The attached patch (primary one) implements the above idea.
> >
> > Now ExtensibleNode works as a basis structure of data container,
> > regardless of CustomScan and ForeignScan.
> > Also, fdw_private and custom_private are re-defined as Node * type
> > instead of List * type. This affected a few FDW APIs.
> 
> I extracted the subset of this that just creates the extensible node
> framework and did some cleanup - the result is attached.  I will
> commit this if nobody objects.
>
I have no objection of course.

> I think the part about whacking around the FDW API is a little more
> potentially objectionable to others, so I want to hold off doing that
> unless a few more people chime in with +1.  Perhaps we could start a
> new thread to talk about that specific idea.  This is useful even
> without that, though.
>

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 




Re: NextXID format change (was Re: [HACKERS] exposing pg_controldata and pg_config as functions)

2016-02-11 Thread Bruce Momjian
On Wed, Feb 10, 2016 at 10:23:41AM +0900, Michael Paquier wrote:
> On Wed, Feb 10, 2016 at 9:57 AM, Joe Conway  wrote:
> > I'll commit the attached tomorrow if there are no other concerns voiced.
> 
> Just a nitpick regarding this block:
> +   if (strchr(p, '/') != NULL)
> +   p = strchr(p, '/');
> +   /* delimiter changed from '/' to ':' in 9.6 */
> +   else if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
> +   p = strchr(p, ':');
> +   else
> +   p = NULL;
> Changing it as follows would save some instructions because there is
> no need to call strchr an extra time:
> if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
> p = strchr(p, ':');
> else
> p = strchr(p, '/');

No, that is not an improvement --- see my previous comment:

> We could get more sophisticated by checking the catalog version number
> where the format was changed, but that doesn't seem worth it, and is
> overly complex because we get the catalog version number from
> pg_controldata, so you would be adding a dependency in ordering of the
> pg_controldata entries.

By testing for '906', you prevent users from using pg_upgrade to go from
one catalog version of 9.6 to a later one.  Few people may want to do
it, but it should work.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




Re: NextXID format change (was Re: [HACKERS] exposing pg_controldata and pg_config as functions)

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 9:18 AM, Bruce Momjian  wrote:
> On Wed, Feb 10, 2016 at 10:23:41AM +0900, Michael Paquier wrote:
>> On Wed, Feb 10, 2016 at 9:57 AM, Joe Conway  wrote:
>> > I'll commit the attached tomorrow if there are no other concerns voiced.
>>
>> Just a nitpick regarding this block:
>> +   if (strchr(p, '/') != NULL)
>> +   p = strchr(p, '/');
>> +   /* delimiter changed from '/' to ':' in 9.6 */
>> +   else if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
>> +   p = strchr(p, ':');
>> +   else
>> +   p = NULL;
>> Changing it as follows would save some instructions because there is
>> no need to call strchr an extra time:
>> if (GET_MAJOR_VERSION(cluster->major_version) >= 906)
>> p = strchr(p, ':');
>> else
>> p = strchr(p, '/');
>
> No, that is not an improvement --- see my previous comment:
>
>> We could get more sophisticated by checking the catalog version number
>> where the format was changed, but that doesn't seem worth it, and is
>> overly complex because we get the catalog version number from
>> pg_controldata, so you would be adding a dependency in ordering of the
>> pg_controldata entries.
>
> By testing for '906', you prevent users from using pg_upgrade to go from
> one catalog version of 9.6 to a later one.  Few people may want to do
> it, but it should work.

OK, I see now. I did not consider the case where people would like to
upgrade from a dev version of 9.6 to the latest 9.6 version, just
the upgrade from a previous major version <= 9.5. Thanks for reminding
me that pg_upgrade needs to support that.
-- 
Michael




Re: [HACKERS] custom function for converting human readable sizes to bytes

2016-02-11 Thread Vitaly Burovoy
Hello!

On 2/11/16, Pavel Stehule  wrote:
> Hi
>
> assigned https://commitfest.postgresql.org/9/514/
>
> Regards
> Pavel


This patch was almost done by the end of the previous CF (2016-01):
there were a few little flaws, which have been solved by now.

I have reviewed this patch.
It applies cleanly at the top of current master
(f144f73242acef574bc27a4c70e809a64806e4a4), compiles silently and
implements the behavior reached by consensus.

All size units (the same as used in the GUC) are supported.
In addition, "bytes" is also supported, which makes it an inverse
function of pg_size_pretty[1].

The documentation describes behavior of the function and possible size
units. I don't see what else needs to be said.

The code is clean and commented.

Regression tests cover possible use cases and corner cases.


Notes for a committer:
1. pg_proc.h has changed, so the CATALOG_VERSION_NO must also be changed.
2. Code comments, error message strings, DESCR in pg_proc.h and
changes in the documentation need proof-reading by a native English
speaker, which the Author and I are not.


[CF] https://commitfest.postgresql.org/9/514/
[1] 
http://www.postgresql.org/message-id/ca+tgmozfomg4eyorzzgf7pzotg9pxpuhtqvxlfskim4izh8...@mail.gmail.com

-- 
Best regards,
Vitaly Burovoy




Re: [HACKERS] pgcrypto: add s2k-count

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 2:46 AM, Robert Haas  wrote:
> On Wed, Feb 10, 2016 at 12:44 AM, Jeff Janes  wrote:
>> I did not bump the extension version.  I realized the migration file
>> would be empty, as there is no change to SQL-level functionality (the new
>> s2k-count is parsed out of a string down in the C code).  Since only
>> one version of a contrib extension's binary object files is installed in
>> any given postgres installation, people using the newer binary get
>> the feature even if they have not updated the extension version.  So I
>> don't know if it makes sense to bump the version if people inherently
>> get the feature anyway.
>
> There's zero reason to bump the extension version if the SQL interface
> hasn't changed.
>
> (I have no opinion on the underlying patch.)

+1.
-- 
Michael




Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 2:41 AM, Fabien COELHO  wrote:
>
> Hello Michaël,
>
>> +/* the argument list has been built in reverse order, it is fixed
>> here */
>> +expr->u.function.args = reverse_elist(args);
>> Hm. I may be missing something, but why is that necessary? This is
>> basically doing a double-reversion to put all the arguments in the
>> correct order when parsing the function arguments.
>
> This is because the expression list is parsed left to right and the list is
> built as a stack to avoid looking for the last argument to append the next
> expression, but then the list is in reverse order at the end of parsing, so
> it is reversed once to make it right. This way the complexity is kept as
> O(n).
>
> If this is too much I can switch to O(n**2) by appending each expression at
> the end of the list.

(this one has been mentioned by Alvaro offlist)
Using a pointer to the tail of the list would make the code simple,
and save a couple of lines.

Another thing that could be considered is also to move list.c and
pg_list.h into src/common and reuse that. There are other frontend
utilities that emulate the same kind of facilities, have a look for
example at the other copycats in pg_dump and pg_basebackup.
-- 
Michael




Re: [HACKERS] [PATCH v4] GSSAPI encryption support

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 3:56 AM, Robbie Harwood  wrote:
> Michael Paquier  writes:
>> On Thu, Feb 11, 2016 at 6:06 AM, Robbie Harwood  wrote:
>>> - The GSSAPI authentication code has been moved without modification.
>>>   In doing so, the temptation to modify it (flags, error checking, that
>>>   big comment at the top about things from Athena, etc.)  is very large.
>>>   I do not know whether these changes are best suited to another patch
>>>   in this series or should be reviewed separately.  I am also hesitant
>>>   to add things beyond the core before I am told this is the right
>>>   approach.
>>
>> I would recommend a different patch if code needs to be moved around.
>> The move may make sense taken as an independent piece of the
>> integration.
>
> Sorry, are you suggesting separate patch for moving the GSS auth code,
> or separate patch for changes to said code?  I am happy to move it if
> so, just want to be sure.

This is based on my first impressions of the patch. Let's discuss those
points more once I've had a more in-depth look at the patch and what it
actually does. In short, there is no need to put more effort into the
coding now :) Sorry for confusing you.

>> + * Portions Copyright (c) 2015-2016, Red Hat, Inc.
>> + * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
>> I think that this part may be a problem... Not sure the feeling of the
>> others regarding additional copyright notices.
>
> Good catch.  That's an accident (force of habit).  Since I'm pretty sure
> this version won't be merged anyway, I'll drop it from the next one.
>
>> It would be good to add that to the next CF, I will be happy to get a
>> look at it.
>
> Sounds good.  Thanks for looking at it!

Okay, let's do this.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing Functionally Dependent GROUP BY Columns

2016-02-11 Thread Tom Lane
David Rowley  writes:
> [ prune_group_by_clause_ab4f491_2016-01-23.patch ]
> [ check_functional_grouping_refactor.patch ]

I've committed this with mostly-cosmetic revisions (principally, rewriting
a lot of the comments, which seemed on the sloppy side).

>> * Both of the loops iterating over the groupClause neglect to check
>> varlevelsup, thus leading to assert failures or worse if an outer-level
>> Var is present in the GROUP BY list.  (I'm pretty sure outer Vars can
>> still be present at this point, though I might be wrong.)

> Fixed in the first loop, and the way I've rewritten the code to use
> bms_difference, there's no need to check again in the 2nd loop.

Um, AFAICS, you *do* need to check again in the second loop, else you'll
be accessing a surplusvars[] entry that might not exist at all, and in
any case might falsely tell you that you can exclude the outer var from
the new GROUP BY list.

That was the only actual bug I found, though I changed some other stuff.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Clock with Adaptive Replacement

2016-02-11 Thread Konstantin Knizhnik

Hi hackers,

What do you think about improving the cache-replacement clock-sweep algorithm in 
PostgreSQL with the adaptive version proposed in this article:

http://www-cs.stanford.edu/~sbansal/pubs/fast04.pdf

Are there any well-known drawbacks to this approach, or would it be interesting 
to adapt this algorithm to PostgreSQL and measure its impact on performance 
under different workloads?
I found this ten-year-old thread:

http://www.postgresql.org/message-id/flat/d2jkde$6bg$1...@sea.gmane.org#d2jkde$6bg$1...@sea.gmane.org

but it mostly discusses possible patent issues with another algorithm, ARC (CAR is 
inspired by ARC, but it is a different algorithm).
As far as I know there are several problems with the current clock-sweep algorithm 
in PostgreSQL, especially for very large caches.
Maybe CAR can address some of them?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-11 Thread David G. Johnston
On Thu, Feb 11, 2016 at 10:53 AM, Pavel Stehule 
wrote:

>> most recent error in verbose mode, without making a permanent session
>> state change.  Something like
>>
>> regression=# insert into bar values(1);
>> ERROR:  insert or update on table "bar" violates foreign key constraint
>> "bar_f1_fkey"
>> DETAIL:  Key (f1)=(1) is not present in table "foo".
>>
>> regression=# \saywhat

Maybe it's too cute, but I'll give a +1 for this alone.

David J.


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 2:56 AM, Robert Haas  wrote:
> On Fri, Feb 5, 2016 at 3:36 AM, Michael Paquier
>  wrote:
>> So, here are some thoughts to make that more user-friendly. I think
>> that the critical issue here is to properly flatten the meta data in
>> the custom language and represent it properly in a new catalog,
>> without messing up too much with the existing pg_stat_replication that
>> people are now used to for 5 releases since 9.0.
>
> Putting the metadata in a catalog doesn't seem great because that only
> can ever work on the master.  Maybe there's no need to configure this
> on the slaves and therefore it's OK, but I feel nervous about putting
> cluster configuration in catalogs.  Another reason for that is that if
> synchronous replication is broken, then you need a way to change the
> catalog, which involves committing a write transaction; there's a
> danger that your efforts to do this will be tripped up by the broken
> synchronous replication configuration.

I was referring to a catalog view that parses the information related
to groups of s_s_names in a flattened way to show each group sync
status. Perhaps my words should have been clearer.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Way to check whether a particular block is on the shared_buffer?

2016-02-11 Thread Kouhei Kaigai
> On Tue, Feb 9, 2016 at 6:35 PM, Kouhei Kaigai  wrote:
> > Unfortunately, it was not sufficient.
> >
> > Due to the assumption, the buffer page to be suspended does not exist
> > when a backend process issues a series of P2P DMA commands. (If the
> > block were already loaded into the shared buffer, it wouldn't need to
> > issue P2P DMA, but would just use the usual memory<->device DMA,
> > because RAM is much faster than SSD.)
> > It knows the pair of (rel,fork,block), but no BufferDesc for this block
> > exists. Thus, it cannot acquire locks in the BufferDesc structure.
> >
> > Even if the block does not exist at this point, a concurrent process
> > may load the same page. The BufferDesc of this page will be assigned at
> > that point; however, there is no chance to lock anything in the
> > BufferDesc for the process which issues the P2P DMA command.
> >
> > It is the reason why I assume the suspend/resume mechanism shall take
> > a pair of (rel,fork,block) as identifier of the target block.
> 
> I see the problem, but I'm not terribly keen on putting in the hooks
> that it would take to let you solve it without hacking core.  It
> sounds like an awfully invasive thing for a pretty niche requirement.
>
Hmm. In my experience, debating whether a feature is niche or commodity
is often not productive. So, let me change the viewpoint.

We may utilize an OS-level locking mechanism here.

Even though it depends on the filesystem implementation under the VFS,
we may use the inode->i_mutex lock, which is acquired during the buffer
copy from user space to the kernel on at least a few major filesystems:
ext4, xfs, and btrfs, in my research. Likewise, the modified NVMe SSD
driver can acquire the inode->i_mutex lock during the P2P DMA transfer.

Once we can assume that the OS buffer is updated atomically under the lock,
we don't need to worry about corrupted pages, but we still need to pay
attention to the scenario where an updated buffer page is moved to the GPU.

In this case, PD_ALL_VISIBLE may give us a hint. The GPU side has no MVCC
infrastructure, so I intend to move all-visible pages only.
If someone updates the buffer concurrently and then writes out the page
including invisible tuples, the PD_ALL_VISIBLE flag will be cleared, because
the updated tuples should not be visible to the transaction which issued
the P2P DMA.

Once the GPU meets a page with !PD_ALL_VISIBLE, it can return an error
status telling the CPU to retry this page. In that case, the page has
likely been loaded into the shared buffer already, so the retry penalty is
not large.

I'll try to investigate an implementation along these lines.
Please correct me if I've misunderstood something (especially the treatment
of PD_ALL_VISIBLE).

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Michael Paquier
On Fri, Feb 12, 2016 at 3:45 AM, Tom Lane  wrote:
> Andres Freund  writes:
>> On 2016-02-11 13:37:17 -0500, Tom Lane wrote:
>>> Absolutely; they don't work safely for testing bits that aren't in the
>>> rightmost byte of a flag word, for instance.  I'm on board with making
>>> these fixes, I'm just unconvinced that stdbool is a good reason for it.
>
>> Oh, ok. Interactions with stdbool was what made me looking into this,
>> that's primarily why I mentioned it.   What's your thinking about
>> back-patching, independent of that then?
>
> Well, Yury was saying upthread that some MSVC versions have a problem
> with the existing coding, which would be a reason to back-patch ...
> but I'd like to see a failing buildfarm member first.  Don't particularly
> want to promise to support compilers not represented in the farm.

Grmbl. Forgot to attach the rebased patch upthread. Here it is now.

As of now the only complaint has been related to MS2015 and MS2013. If
we follow the pattern of cec8394b and [1], support for compiling on newer
versions of MSVC would go into master and REL9_5_STABLE, but MS2013 is
supported down to 9.3. For this reason, we would want to
back-patch this thread's patch down to 9.3.

[1]: http://www.postgresql.org/message-id/529d05cc.7070...@gmx.de
-- 
Michael
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index b0d5440..3bbb218 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -5329,7 +5329,8 @@ xact_redo_commit(xl_xact_parsed_commit *parsed,
 		LWLockRelease(XidGenLock);
 	}
 
-	Assert(!!(parsed->xinfo & XACT_XINFO_HAS_ORIGIN) == (origin_id != InvalidRepOriginId));
+	Assert(((parsed->xinfo & XACT_XINFO_HAS_ORIGIN) != 0) ==
+		   (origin_id != InvalidRepOriginId));
 
 	if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
 		commit_time = parsed->origin_timestamp;
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 78acced..024929a 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -575,7 +575,7 @@ ReorderBufferTXNByXid(ReorderBuffer *rb, TransactionId xid, bool create,
 	if (is_new)
 		*is_new = !found;
 
-	Assert(!create || !!txn);
+	Assert(!create || txn != NULL);
 	return txn;
 }
 
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index d245857..ef42fad 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -205,11 +205,11 @@ PRINT_LWDEBUG(const char *where, LWLock *lock, LWLockMode mode)
 	 errmsg_internal("%d: %s(%s): excl %u shared %u haswaiters %u waiters %u rOK %d",
 			MyProcPid,
 			where, MainLWLockNames[id],
-			!!(state & LW_VAL_EXCLUSIVE),
+			(state & LW_VAL_EXCLUSIVE) != 0,
 			state & LW_SHARED_MASK,
-			!!(state & LW_FLAG_HAS_WAITERS),
+			(state & LW_FLAG_HAS_WAITERS) != 0,
 			pg_atomic_read_u32(&lock->nwaiters),
-			!!(state & LW_FLAG_RELEASE_OK))));
+			(state & LW_FLAG_RELEASE_OK) != 0)));
 		else
 			ereport(LOG,
 	(errhidestmt(true),
@@ -217,11 +217,11 @@ PRINT_LWDEBUG(const char *where, LWLock *lock, LWLockMode mode)
 	 errmsg_internal("%d: %s(%s %d): excl %u shared %u haswaiters %u waiters %u rOK %d",
 			MyProcPid,
 			where, T_NAME(lock), id,
-			!!(state & LW_VAL_EXCLUSIVE),
+			(state & LW_VAL_EXCLUSIVE) != 0,
 			state & LW_SHARED_MASK,
-			!!(state & LW_FLAG_HAS_WAITERS),
+			(state & LW_FLAG_HAS_WAITERS) != 0,
 			pg_atomic_read_u32(&lock->nwaiters),
-			!!(state & LW_FLAG_RELEASE_OK))));
+			(state & LW_FLAG_RELEASE_OK) != 0)));
 	}
 }
 
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index d2ea588..e212c9f 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -112,22 +112,22 @@ typedef struct GinMetaPageData
  */
 #define GinPageGetOpaque(page) ( (GinPageOpaque) PageGetSpecialPointer(page) )
 
-#define GinPageIsLeaf(page)( GinPageGetOpaque(page)->flags & GIN_LEAF )
+#define GinPageIsLeaf(page)( (GinPageGetOpaque(page)->flags & GIN_LEAF) != 0 )
 #define GinPageSetLeaf(page)   ( GinPageGetOpaque(page)->flags |= GIN_LEAF )
 #define GinPageSetNonLeaf(page)( GinPageGetOpaque(page)->flags &= ~GIN_LEAF )
-#define GinPageIsData(page)( GinPageGetOpaque(page)->flags & GIN_DATA )
+#define GinPageIsData(page)( (GinPageGetOpaque(page)->flags & GIN_DATA) != 0 )
 #define GinPageSetData(page)   ( GinPageGetOpaque(page)->flags |= GIN_DATA )
-#define GinPageIsList(page)( GinPageGetOpaque(page)->flags & GIN_LIST )
+#define GinPageIsList(page)( (GinPageGetOpaque(page)->flags & GIN_LIST) != 0 )
 #define GinPageSetList(page)   ( GinPageGetOpaque(page)->flags |= GIN_LIST )
-#define GinPageHasFullRow(page)( GinPageGetOpaque(page)->flags & GIN_LIST_FULLROW )
+#define GinPageHasFullRow(page)( 

Re: [HACKERS] Tracing down buildfarm "postmaster does not shut down" failures

2016-02-11 Thread Noah Misch
On Wed, Feb 10, 2016 at 10:55:10AM -0500, Tom Lane wrote:
> Interestingly, we seem to have managed to greatly reduce the "other"
> time (which I presume is basically mdpostchkpt unlinking) since 9.2.
> The worst case observed in HEAD is about 100s:
> 
> regression=# select ts,write,sync,total-write-sync as other, total from 
> chkpts order by total-write-sync desc limit 10;
>  ts |  write  |  sync   |  other  |  total  
> +-+-+-+-
>  2016-02-09 08:27:04.365-08 |  14.479 | 298.043 | 100.586 | 413.108
>  2016-02-09 06:15:22.477-08 |  75.099 |  26.590 |  90.452 | 192.141
>  2016-02-09 16:31:34.661-08 | 254.462 | 186.246 |  51.541 | 492.249
>  2016-02-09 15:25:01.586-08 |   0.015 |   2.985 |  47.822 |  50.822
>  2016-02-09 11:40:13.565-08 | 287.327 | 314.218 |  44.059 | 645.604
>  2016-02-09 08:56:51.339-08 | 135.464 |  31.242 |  39.131 | 205.837
>  2016-02-09 20:23:52.797-08 |   1.309 |  12.155 |  36.350 |  49.814
>  2016-02-09 00:51:41.157-08 | 134.502 | 103.482 |  34.110 | 272.094
>  2016-02-09 19:28:33.154-08 |   0.575 |  20.434 |  33.803 |  54.812
>  2016-02-09 03:47:02.57-08  |   0.066 |  37.475 |  33.119 |  70.660
> (10 rows)
> 
> but 9.2's worst cases greatly exceed that:
> 
> regression=# select ts,write,sync,total-write-sync as other, total from 
> chkpts92 order by total-write-sync desc limit 10;
>  ts | write  |  sync   |  other  |  total  
> ++-+-+-
>  2016-02-09 03:27:12.904-08 | 48.761 | 124.453 | 299.877 | 473.091
>  2016-02-09 16:31:25.657-08 |  3.860 | 184.042 | 248.488 | 436.390
>  2016-02-08 23:33:36.16-08  | 66.904 | 397.653 | 229.020 | 693.577
>  2016-02-09 11:39:57.061-08 |  2.471 | 303.924 | 201.923 | 508.318
>  2016-02-09 07:37:44.48-08  |  3.317 |   1.494 | 158.159 | 162.970
>  2016-02-09 04:32:49.668-08 | 11.328 | 292.310 | 148.088 | 451.726
>  2016-02-09 12:37:43.52-08  | 17.620 | 333.983 | 141.172 | 492.775
>  2016-02-09 08:27:04.407-08 | 12.019 | 300.510 | 102.855 | 415.384
>  2016-02-09 18:17:32.193-08 | 90.370 |  25.369 |  63.626 | 179.365
>  2016-02-09 11:02:14.977-08 |  2.713 |   2.306 |  38.581 |  43.600
> (10 rows)
> 
> Not real sure where that improvement came from.  We could hardly
> be unlinking fewer files overall, and the bottleneck seems to be
> in the kernel, so what changed?  The *average*, as opposed to worst
> case, checkpoint time has definitely grown since 9.2:

Good question; I don't know.  I would have expected 9.2's xlog-driven
checkpoints to create more stability than HEAD's time-driven checkpoints.


On the AIX animals, I have now set PGCTLTIMEOUT=900 and removed the "-t 120"
arguments.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-11 Thread Amit Kapila
On Fri, Feb 12, 2016 at 12:39 AM, Robert Haas  wrote:
>
> On Thu, Feb 11, 2016 at 12:10 PM, Amit Kapila 
wrote:
> > Okay, changed as per suggestion.
>
> Looks good to me; committed.
>

Thanks!

The attached patch changes the names of some of the existing tranches
as suggested by you upthread.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


rename_tranches_v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Tom Lane
We're not out of the woods on this :-( ... jaguarundi, which is the first
of the CLOBBER_CACHE_ALWAYS animals to run these tests, didn't like them
at all.  I think I fixed the deadlock-soft-2 failure, but its take on
deadlock-hard is:

*** 17,25 
  step s6a7: LOCK TABLE a7; <waiting ...>
  step s7a8: LOCK TABLE a8; <waiting ...>
  step s8a1: LOCK TABLE a1; <waiting ...>
- step s8a1: <... completed>
  step s7a8: <... completed>
! error in steps s8a1 s7a8: ERROR:  deadlock detected
  step s8c: COMMIT;
  step s7c: COMMIT;
  step s6a7: <... completed>
--- 17,25 
  step s6a7: LOCK TABLE a7; <waiting ...>
  step s7a8: LOCK TABLE a8; <waiting ...>
  step s8a1: LOCK TABLE a1; <waiting ...>
  step s7a8: <... completed>
! step s8a1: <... completed>
! ERROR:  deadlock detected
  step s8c: COMMIT;
  step s7c: COMMIT;
  step s6a7: <... completed>

The problem here is that when the deadlock detector kills s8's
transaction, s7a8 is also left free to proceed, so there is a race
condition as to which query completion will get back to
isolationtester first.

One grotty way to handle that would be something like

-step "s7a8"{ LOCK TABLE a8; }
+step "s7a8"{ LOCK TABLE a8; SELECT pg_sleep(5); }

Or we could simplify the locking structure enough so that no other
transactions are released by the deadlock failure.  I do not know
exactly what you had in mind to be testing here?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Covering + unique indexes.

2016-02-11 Thread Anastasia Lubennikova

02.02.2016 15:50, Anastasia Lubennikova:

31.01.2016 11:04, David Rowley:

On 27 January 2016 at 03:35, Anastasia Lubennikova
 wrote:

including_columns_3.0 is the latest version of patch.
And changes regarding the previous version are attached in a 
separate patch.

Just to ease the review and debug.

Hi,

I've made another pass over the patch. There's still a couple of
things that I think need to be looked at.

Thank you again.
I'm just writing to say that I haven't disappeared and I do remember 
about the issue.
But I'm very busy this week. I'll send an updated patch next week 
as soon as possible.




As promised, here's the new version of the patch "including_columns_4.0".
I fixed all issues except some points mentioned below.
Besides, I did some refactoring:
- use the macros IndexRelationGetNumberOfAttributes and 
IndexRelationGetNumberOfKeyAttributes where possible, as well as the macro 
RelationGetNumberOfAttributes. Maybe these are somewhat unrelated changes, 
but they'll make development much easier in the future.

- rename related variables to indnatts and indnkeyatts.

I'm also not that keen on index_reform_tuple() in general. I wonder if
there's a way we can just keep the Datum/isnull arrays a bit longer,
and only form the tuple when needed. I've not looked into this in
detail, but it does look like reforming the tuple is not going to be
cheap.
It is used in splits, for example. There is no datum array; we just 
move the tuple key from a child page to a parent page, or something like that.
And according to the INCLUDING algorithm, we need to truncate the nonkey 
attributes.

If we do need to keep this function, I think a better name might be
index_trim_tuple() and I don't think you need to pass the original
length. It might make sense to Assert() that the trim length is
smaller than the tuple size


As regards performance, I don't think it's a big problem here.
Do you suggest doing it the following way: memcpy(oldtup, newtup, 
newtuplength)?


I've tested it some more, and still didn't find any performance issues.


in gistrescan() there is some code:

for (attno = 1; attno <= natts; attno++)
{
TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL,
   scan->indexRelation->rd_opcintype[attno - 1],
   -1, 0);
}

Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to
be sized by the number of key columns, but this loop goes over the
number of attribute columns.
Perhaps this is not a big problem since GIST does not support
INCLUDING columns, but it does seem wrong still.


GiST doesn't support the INCLUDING clause, so natts and nkeyatts are 
always equal. I don't see any problem here.
And I think that it's extra work beyond this patch. Maybe I or someone 
else will add this feature to other access methods later.


Still the same.

Which brings me to the fact that I've spent a bit of time trying to
look for places where you've forgotten to change natts to nkeyatts. I
did find this one, but I don't have much confidence that there's not
lots more places that have been forgotten. Apart from this one, how
confident are you that you've found all the places? I'm getting
towards being happy with the code that I see that's been changed, but
I'm hesitant to mark as "Ready for committer" due to not being all
that comfortable that all the code that needs to be updated has been
updated. I'm not quite sure of a good way to find all these places.
I found all mentions of natts and other related variables with grep, 
and replaced (or expanded) them with nkeyatts where necessary.

As mentioned before, I didn't change other AMs.
I strongly agree that any changes related to btree require thorough 
inspection, so I'll recheck it again. But I'm almost sure that it's okay.


I rechecked everything again and fixed a couple of omissions. Thank you 
for being such an exacting reviewer :)
I don't know how else to ensure that everything is okay; I have no idea 
what more I can do.



I'm wondering about hacking the code so that each btree index which is
created with > 1 column puts all but the first column into the
INCLUDING columns, then running the regression tests to see if there are
any crashes. I'm really not sure how else to increase the
confidence level in this. Do you have ideas?


Do I understand correctly that you suggest replacing all multicolumn 
indexes with (1 key column) + included columns?
I don't think it's a good idea. The INCLUDING clause brings some 
disadvantages. For example, included columns must be filtered after 
the search, while key columns can be used in the scan key directly. I 
already mentioned this in a test example:


explain analyze select c1, c2 from tbl where c1<1 and c3<20;
If the columns' opclasses are used, the new query plan uses them in Index 
Cond: ((c1 < 1) AND (c3 < 20))
Otherwise, the new query cannot use the included column in the Index Cond 
and uses a filter instead:

Index Cond: (c1 < 1)
Filter: (c3 < 20)
Rows Removed by Filter: 9993
It slows down the query significantly.

And besides that, we 

Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Tom Lane
I wrote:
> No, because the machines that are failing are showing a "<waiting ...>"
> annotation that your reference output *doesn't* have.  I think what is
> actually happening is that these machines are seeing the process as
> waiting and reporting it, whereas on your machine the backend detects
> the deadlock and completes the query (with an error) before
> isolationtester realizes that the process is waiting.

I confirmed this theory by the expedient of changing the '10ms' setting
in the test script to 1ms (which worked) and 100ms (which did not, on
the same machine).

I've committed an update that adjusts the timeouts to hopefully ensure
that isolationtester always sees the query as blocked before the deadlock
detector unblocks it; which seems like the behavior we want to test for,
anyway.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Refactoring of LWLock tranches

2016-02-11 Thread Amit Kapila
On Thu, Feb 11, 2016 at 6:45 PM, Robert Haas  wrote:
>
> On Thu, Feb 11, 2016 at 3:15 AM, Amit Kapila 
wrote:
> > Sounds sensible, however after changes, CreateLWLocks() started
> > looking unreadable, so moved initialization and registration of tranches
> > to separate functions.
>
> Seems good.
>
> > Increased number of tranches allocated in LWLockTrancheArray, as
> > now the LWTRANCHE_FIRST_USER_DEFINED is already greater
> > than 16.
>
> OK, cool.
>
> +   int numIndividualLocks =
NUM_INDIVIDUAL_LWLOCKS;
> +   int numBufMappingLocks =
NUM_BUFFER_PARTITIONS;
> +   int numLmgrLocks = NUM_LOCK_PARTITIONS;
> +   int i;
>
> This doesn't seem like it's buying us anything.  Can't we just use the
> constants directly?
>

Yes, we can use the constants.  The reason I used variables was
that they were needed in a couple of places, but using the constants
directly doesn't hurt either, so I changed it accordingly.

> +   BufMappingLWLockTranche.name = "Buffer Mapping Locks";
>
> We've not been very consistent about how we name our tranches.  If
> we're going to try to expose this information to users, we're going to
> need to do better. We've got these names:
>
> main
> clog
> commit_timestamp
> subtrans
> multixact_offset
> multixact_member
> async
> oldserxid
> WALInsertLocks
> Buffer Content Locks
> Buffer IO Locks
> ReplicationOrigins
> Replication Slot IO Locks
> proc
>
> Personally, I prefer the style of all lower case, words separated with
> underscores where necessary, the word "locks" left out.  That way we
> can display wait events as something like lwlock:tranche_name and
> not have the word "lock" in there twice.  I don't think it does much
> good to try to make these names "user friendly" because,
> fundamentally, these are internal pieces of the system and you won't
> know what they are unless you do.  So I'd prefer to make them look
> like identifiers.  If we adopt that style, then I think the new
> tranches created by this patch should be named buffer_mapping,
> lock_manager, and predicate_lock_manager

Okay, changed as per suggestion.

> and then, probably as a
> separate patch, we should rename the rest to match that style
> (buffer_content, buffer_io, replication_origin, replication_slot_io).
>

Sure, will send these changes as a separate patch.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fixed_locks_tranche_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 12:04 PM, Tom Lane  wrote:
> I wrote:
>> No, because the machines that are failing are showing a "<waiting ...>"
>> annotation that your reference output *doesn't* have.  I think what is
>> actually happening is that these machines are seeing the process as
>> waiting and reporting it, whereas on your machine the backend detects
>> the deadlock and completes the query (with an error) before
>> isolationtester realizes that the process is waiting.
>
> I confirmed this theory by the expedient of changing the '10ms' setting
> in the test script to 1ms (which worked) and 100ms (which did not, on
> the same machine).
>
> I've committed an update that adjusts the timeouts to hopefully ensure
> that isolationtester always sees the query as blocked before the deadlock
> detector unblocks it; which seems like the behavior we want to test for,
> anyway.

Thanks.  I really appreciate it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Robert Haas
On Thu, Jan 14, 2016 at 12:28 PM, Tom Lane  wrote:
> Andres Freund  writes:
>> This brings me to something related: I'm wondering if we shouldn't merge
>> unix/win32_latch.c.
>
> Well, it's duplicated code on the one hand versus maze-of-ifdefs on the
> other.  Feel free to try it and see, but I'm unsure it'd be an improvement.

I think we should either get this fixed RSN or revert the problematic
commit until we get it fixed.  I'd be rather disappointed about the
latter because I think this was a very good thing on the merits, but
probably not good enough to justify taking the performance hit over
the long term.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Tom Lane
Robert Haas  writes:
> I think we should either get this fixed RSN or revert the problematic
> commit until we get it fixed.  I'd be rather disappointed about the
> latter because I think this was a very good thing on the merits, but
> probably not good enough to justify taking the performance hit over
> the long term.

Since it's only in HEAD, I'm not seeing the urgency of reverting it.
However, it'd be a good idea to put this on the 9.6 open items list
(have we got such a page yet?) to make sure it gets addressed before
beta.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Yury Zhuravlev

Robert Haas wrote:

So, is it being pulled in indirectly by some other #include?


I can double-check it tomorrow.

--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-11 Thread Robert Haas
On Tue, Jan 5, 2016 at 10:16 PM, Tom Lane  wrote:
> Jim Nasby  writes:
>> FWIW, I suspect very few people know about the verbosity setting (I
>> didn't until a few months ago...) Maybe psql should hint about it the
>> first time an error is reported in a session.
>
> Actually, what'd be really handy IMO is something to regurgitate the
> most recent error in verbose mode, without making a permanent session
> state change.  Something like
>
> regression=# insert into bar values(1);
> ERROR:  insert or update on table "bar" violates foreign key constraint 
> "bar_f1_fkey"
> DETAIL:  Key (f1)=(1) is not present in table "foo".
> regression=# \saywhat
> ERROR:  23503: insert or update on table "bar" violates foreign key 
> constraint "bar_f1_fkey"
> DETAIL:  Key (f1)=(1) is not present in table "foo".
> SCHEMA NAME:  public
> TABLE NAME:  bar
> CONSTRAINT NAME:  bar_f1_fkey
> LOCATION:  ri_ReportViolation, ri_triggers.c:3326
> regression=#

Wow, that's a fabulous idea.  I see Oleksandr has tried to implement
it, although I haven't looked at the patch.  But I think this would be
REALLY helpful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 12:48 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think we should either get this fixed RSN or revert the problematic
>> commit until we get it fixed.  I'd be rather disappointed about the
>> latter because I think this was a very good thing on the merits, but
>> probably not good enough to justify taking the performance hit over
>> the long term.
>
> Since it's only in HEAD, I'm not seeing the urgency of reverting it.
> However, it'd be a good idea to put this on the 9.6 open items list
> (have we got such a page yet?) to make sure it gets addressed before
> beta.

One problem is that it makes for misleading results if you try to
benchmark 9.5 against 9.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Andres Freund
On 2016-02-11 12:50:58 -0500, Robert Haas wrote:
> On Thu, Feb 11, 2016 at 12:48 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> I think we should either get this fixed RSN or revert the problematic
> >> commit until we get it fixed.  I'd be rather disappointed about the
> >> latter because I think this was a very good thing on the merits, but
> >> probably not good enough to justify taking the performance hit over
> >> the long term.
> >
> > Since it's only in HEAD, I'm not seeing the urgency of reverting it.
> > However, it'd be a good idea to put this on the 9.6 open items list
> > (have we got such a page yet?) to make sure it gets addressed before
> > beta.
> 
> One problem is that it makes for misleading results if you try to
> benchmark 9.5 against 9.6.

You need a really beefy box to show the problem. On a large/new 2 socket
machine the performance regression is in the 1-3% range for a pgbench of
SELECT 1. So it's not like it's immediately showing up for everyone.

Putting it on the open items list sounds good to me.

Regards,

Andres




Re: [HACKERS] Add schema-qualified relnames in constraint error messages.

2016-02-11 Thread Pavel Stehule
> most recent error in verbose mode, without making a permanent session

> > state change.  Something like
> >
> > regression=# insert into bar values(1);
> > ERROR:  insert or update on table "bar" violates foreign key constraint
> "bar_f1_fkey"
> > DETAIL:  Key (f1)=(1) is not present in table "foo".
> > regression=# \saywhat
> > ERROR:  23503: insert or update on table "bar" violates foreign key
> constraint "bar_f1_fkey"
> > DETAIL:  Key (f1)=(1) is not present in table "foo".
> > SCHEMA NAME:  public
> > TABLE NAME:  bar
> > CONSTRAINT NAME:  bar_f1_fkey
> > LOCATION:  ri_ReportViolation, ri_triggers.c:3326
> > regression=#
>
> Wow, that's a fabulous idea.  I see Oleksandr has tried to implement
> it, although I haven't looked at the patch.  But I think this would be
> REALLY helpful.
>

yes

+1

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>


Re: [HACKERS] Support for N synchronous standby servers - take 2

2016-02-11 Thread Robert Haas
On Fri, Feb 5, 2016 at 3:36 AM, Michael Paquier
 wrote:
> So, here are some thoughts to make that more user-friendly. I think
> that the critical issue here is to properly flatten the metadata in
> the custom language and represent it in a new catalog,
> without messing too much with the existing pg_stat_replication that
> people have been used to for the 5 releases since 9.0.

Putting the metadata in a catalog doesn't seem great because that can
only ever work on the master.  Maybe there's no need to configure this
on the slaves and therefore it's OK, but I feel nervous about putting
cluster configuration in catalogs.  Another reason for that is that if
synchronous replication is broken, then you need a way to change the
catalog, which involves committing a write transaction; there's a
danger that your efforts to do this will be tripped up by the broken
synchronous replication configuration.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Tom Lane
Andres Freund  writes:
> On 2016-02-11 09:51:26 -0500, Robert Haas wrote:
>> I have never been quite clear on what you think is going to cause
>> stdbool.h inclusions to become more common.

> Primarily because it's finally starting to be supported across the
> board, thanks to MS finally catching up.

There are exactly 0 references to stdbool in our source tree.  The
only way it'd get pulled in is if some other system header does so.
(Which seems possible, admittedly.  I'm not sure whether the C
standard allows or forbids such things.)

It's worth worrying also about extensions that might want to include
stdbool.h --- anything in our own headers that would conflict would
be a problem.  But I'm unconvinced that we need to make our .c files
prepared for stdbool.h to be #included in them.  By that argument,
any random symbol in any system header on any platform on the planet
is a hazard to us.

regards, tom lane




Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 12:53 PM, Andres Freund  wrote:
>> One problem is that it makes for misleading results if you try to
>> benchmark 9.5 against 9.6.
>
> You need a really beefy box to show the problem. On a large/new 2 socket
> machine the performance regression is in the 1-3% range for a pgbench of
> SELECT 1. So it's not like it's immediately showing up for everyone.
>
> Putting it on the open items list sounds good to me.

Well, OK, I've done that then.  I don't really agree that it's not a
problem; the OP said he saw a 3x regression, and some of my colleagues
doing benchmarking are complaining about this commit, too.  It doesn't
seem like much of a stretch to think that it might be affecting other
people as well.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH v4] GSSAPI encryption support

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 8:42 AM, Michael Paquier
 wrote:
> + * Portions Copyright (c) 2015-2016, Red Hat, Inc.
> + * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
> I think that this part may be a problem... Not sure the feeling of the
> others regarding additional copyright notices.

Yep.  "PostgreSQL Global Development Group" means "whoever it was that
contributed", not a specific legal organization.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Andres Freund
On 2016-02-11 13:06:14 -0500, Tom Lane wrote:
> But I'm unconvinced that we need to make our .c files prepared for
> stdbool.h to be #included in them.

The fixes, besides two stylistic edits around !! use, are all in
headers. The issue is that we return things meant to be used in a
boolean context that are not just 0/1 but some arbitrary set bit.  Looking
over the (outdated) patch attached to
http://archives.postgresql.org/message-id/20150812161140.GD25424%40awork2.anarazel.de
it's not completely outlandish that one wants to use one of these
functions, but also something that uses stdbool.h.

> By that argument, any random
> symbol in any system header on any platform on the planet is a hazard
> to us.

Don't think that's really the same. C99's booleans are part of the
standard, and have surprising behaviour that you can't get on the C
level (they always contain 1/0 after assignment). That makes for more
likely and more subtle bugs.

And anyway, these macros are a potential issue even without stdbool.h
style booleans. If you compare them using equality, you can get into
trouble...

Andres




Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Andres Freund
On 2016-02-11 13:09:27 -0500, Robert Haas wrote:
> On Thu, Feb 11, 2016 at 12:53 PM, Andres Freund  wrote:
> >> One problem is that it makes for misleading results if you try to
> >> benchmark 9.5 against 9.6.
> >
> > You need a really beefy box to show the problem. On a large/new 2 socket
> > machine the performance regression is in the 1-3% range for a pgbench of
> > SELECT 1. So it's not like it's immediately showing up for everyone.
> >
> > Putting it on the open items list sounds good to me.
> 
> Well, OK, I've done that then.  I don't really agree that it's not a
> problem; the OP said he saw a 3x regression, and some of my colleagues
> doing benchmarking are complaining about this commit, too.  It doesn't
> seem like much of a stretch to think that it might be affecting other
> people as well.

Well, I can't do anything about that right now. I won't have the time to
whip up the new/more complex API we discussed upthread in the next few
days.  So either we go with a simpler API (e.g. pretty much a cleaned up
version of my earlier patch), revert the postmaster death check, or
somebody else has to take the lead in renovating, or we wait...

Andres




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 9:54 AM, Andres Freund  wrote:
> On 2016-02-11 09:51:26 -0500, Robert Haas wrote:
>> I have never been quite clear on what you think is going to cause
>> stdbool.h inclusions to become more common.
>
> Primarily because it's finally starting to be supported across the
> board, thanks to MS finally catching up.
>
> Then there's uglyness like:
> http://archives.postgresql.org/message-id/508bd3a0-b0a1-4f10-ad9c-2f86656f6e79%40postgrespro.ru

So, is it being pulled in indirectly by some other #include?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pgcrypto: add s2k-count

2016-02-11 Thread Robert Haas
On Wed, Feb 10, 2016 at 12:44 AM, Jeff Janes  wrote:
> pgcrypto supports s2k-mode for key-stretching during symmetric
> encryption, and even defaults to s2k-mode=3, which means configurable
> iterations.  But it doesn't support s2k-count to actually set those
> iterations to be anything other than the default.  If you are
> interested in key-stretching, the default is not going to cut it.
> (You could argue that pgp's s2k doesn't cut it either even at the max,
> but at least we should offer the maximum that the pgp spec makes
> available.)
>
> This patch implements s2k-count as an option to pgp_sym_encrypt.
>
> Demo (note the password is intentionally wrong in the last character):
>
> select pgp_sym_decrypt(
> pgp_sym_encrypt('foobar','acf86729b6b0289f4d1909db8c1aaf0c','s2k-mode=3'),
> 'acf86729b6b0289f4d1909db8c1aaf0d');
> ERROR:  Wrong key or corrupt data
> Time: 1.606 ms
>
> select pgp_sym_decrypt(
>
> pgp_sym_encrypt('foobar','acf86729b6b0289f4d1909db8c1aaf0c','s2k-mode=3,s2k-count=6500'),
>'acf86729b6b0289f4d1909db8c1aaf0d');
> ERROR:  Wrong key or corrupt data
> Time: 615.720 ms
>
> I did not bump the extension version.  I realized the migration file
> would be empty, as there is no change to SQL-level functionality (the new
> s2k-count is parsed out of a string down in the C code).  Since only
> one version of a contrib extension's binary object files is installed in
> any given postgres installation, people using the newer binary get
> the feature even if they have not updated the extension version.  So I
> don't know if it makes sense to bump the version if people inherently
> get the feature anyway.

There's zero reason to bump the extension version if the SQL interface
hasn't changed.

(I have no opinion on the underlying patch.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] pl/pgsql exported functions

2016-02-11 Thread Magnus Hagander
Most of the pl/pgsql functions and variables are prefixed plpgsql_, so they
don't risk conflicting with other loaded shared libraries.

There are a couple that are not prefixed. Attached patch fixes that. It's
mostly a cleanup, but I think it's something we should do since it's only 2
variables and 2 functions.

AFAICT these are clearly meant to be internal. (The plugin variable is
accessed through find_rendezvous_variable.)

Comments?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
***
*** 51,57  bool		plpgsql_check_syntax = false;
  PLpgSQL_function *plpgsql_curr_compile;
  
  /* A context appropriate for short-term allocs during compilation */
! MemoryContext compile_tmp_cxt;
  
  /* --
   * Hash table for compiled functions
--- 51,57 
  PLpgSQL_function *plpgsql_curr_compile;
  
  /* A context appropriate for short-term allocs during compilation */
! MemoryContext plpgsql_compile_tmp_cxt;
  
  /* --
   * Hash table for compiled functions
***
*** 253,259  recheck:
   * careful about pfree'ing their allocations, it is also wise to
   * switch into a short-term context before calling into the
   * backend. An appropriate context for performing short-term
!  * allocations is the compile_tmp_cxt.
   *
   * NB: this code is not re-entrant.  We assume that nothing we do here could
   * result in the invocation of another plpgsql function.
--- 253,259 
   * careful about pfree'ing their allocations, it is also wise to
   * switch into a short-term context before calling into the
   * backend. An appropriate context for performing short-term
!  * allocations is the plpgsql_compile_tmp_cxt.
   *
   * NB: this code is not re-entrant.  We assume that nothing we do here could
   * result in the invocation of another plpgsql function.
***
*** 343,349  do_compile(FunctionCallInfo fcinfo,
  	 ALLOCSET_DEFAULT_MINSIZE,
  	 ALLOCSET_DEFAULT_INITSIZE,
  	 ALLOCSET_DEFAULT_MAXSIZE);
! 	compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
  
  	function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
  	function->fn_oid = fcinfo->flinfo->fn_oid;
--- 343,349 
  	 ALLOCSET_DEFAULT_MINSIZE,
  	 ALLOCSET_DEFAULT_INITSIZE,
  	 ALLOCSET_DEFAULT_MAXSIZE);
! 	plpgsql_compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
  
  	function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
  	function->fn_oid = fcinfo->flinfo->fn_oid;
***
*** 387,393  do_compile(FunctionCallInfo fcinfo,
  			 * argument types.  In validation mode we won't be able to, so we
  			 * arbitrarily assume we are dealing with integers.
  			 */
! 			MemoryContextSwitchTo(compile_tmp_cxt);
  
  			numargs = get_func_arg_info(procTup,
  										&argtypes, &argnames, &argmodes);
--- 387,393 
  			 * argument types.  In validation mode we won't be able to, so we
  			 * arbitrarily assume we are dealing with integers.
  			 */
! 			MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
  
  			numargs = get_func_arg_info(procTup,
  										&argtypes, &argnames, &argmodes);
***
*** 774,781  do_compile(FunctionCallInfo fcinfo,
  
  	plpgsql_check_syntax = false;
  
! 	MemoryContextSwitchTo(compile_tmp_cxt);
! 	compile_tmp_cxt = NULL;
  	return function;
  }
  
--- 774,781 
  
  	plpgsql_check_syntax = false;
  
! 	MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
! 	plpgsql_compile_tmp_cxt = NULL;
  	return function;
  }
  
***
*** 833,839  plpgsql_compile_inline(char *proc_source)
  	 ALLOCSET_DEFAULT_MINSIZE,
  	 ALLOCSET_DEFAULT_INITSIZE,
  	 ALLOCSET_DEFAULT_MAXSIZE);
! 	compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
  
  	function->fn_signature = pstrdup(func_name);
  	function->fn_is_trigger = PLPGSQL_NOT_TRIGGER;
--- 833,839 
  	 ALLOCSET_DEFAULT_MINSIZE,
  	 ALLOCSET_DEFAULT_INITSIZE,
  	 ALLOCSET_DEFAULT_MAXSIZE);
! 	plpgsql_compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
  
  	function->fn_signature = pstrdup(func_name);
  	function->fn_is_trigger = PLPGSQL_NOT_TRIGGER;
***
*** 911,918  plpgsql_compile_inline(char *proc_source)
  
  	plpgsql_check_syntax = false;
  
! 	MemoryContextSwitchTo(compile_tmp_cxt);
! 	compile_tmp_cxt = NULL;
  	return function;
  }
  
--- 911,918 
  
  	plpgsql_check_syntax = false;
  
! 	MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
! 	plpgsql_compile_tmp_cxt = NULL;
  	return function;
  }
  
***
*** 1325,1331  make_datum_param(PLpgSQL_expr *expr, int dno, int location)
  	param = makeNode(Param);
  	param->paramkind = PARAM_EXTERN;
  	param->paramid = dno + 1;
! 	exec_get_datum_type_info(estate,
  			 datum,
  			 &param->paramtype,
  			 &param->paramtypmod,
--- 1325,1331 
  	param = makeNode(Param);
  	param->paramkind = PARAM_EXTERN;
  	param->paramid = dno + 1;
! 	

Re: [HACKERS] extend pgbench expressions with functions

2016-02-11 Thread Fabien COELHO


Hello Michaël,


> +/* the argument list has been built in reverse order, it is fixed here */
> +expr->u.function.args = reverse_elist(args);
> Hm. I may be missing something, but why is that necessary? This is
> basically doing a double-reversion to put all the arguments in the
> correct order when parsing the function arguments.


This is because the expression list is parsed left to right, and the list 
is built as a stack so that appending the next expression doesn't require 
walking to the last element. The list is therefore in reverse order at the 
end of parsing, and is reversed once to set it right. This keeps the 
overall complexity at O(n).


If this is too much, I can switch to O(n**2) by appending each expression 
at the end of the list.


--
Fabien.


Re: [HACKERS] pl/pgsql exported functions

2016-02-11 Thread Pavel Stehule
2016-02-11 18:29 GMT+01:00 Magnus Hagander :

> Most of the pl/pgsql functions and variables are prefixed plpgsql_, so
> they don't risk conflicting with other shared libraries loaded.
>
> There are a couple that are not prefixed. Attached patch fixes that. It's
> mostly a cleanup, but I think it's something we should do since it's only 2
> variables and 2 functions.
>
> AFAICT these are clearly meant to be internal. (the plugin variable is
> accessed through find_rendezvous_variable)
>

+1

Pavel


>
> Comments?
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
>
>


Re: [HACKERS] Updated backup APIs for non-exclusive backups

2016-02-11 Thread Marco Nenciarini
Hi Magnus,

thanks for working on this topic.
What it does is very similar to what Barman's pgespresso extension does, and I'd 
like to see it implemented in core soon.

I've added myself as reviewer for the patch on commitfest site.

Regards,
Marco

-- 
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 1:19 PM, Andres Freund  wrote:
>> > Putting it on the open items list sounds good to me.
>>
>> Well, OK, I've done that then.  I don't really agree that it's not a
>> problem; the OP said he saw a 3x regression, and some of my colleagues
>> doing benchmarking are complaining about this commit, too.  It doesn't
>> seem like much of a stretch to think that it might be affecting other
>> people as well.
>
> Well, I can't do anything about that right now. I won't have the time to
> whip up the new/more complex API we discussed upthread in the next few
> days.  So either we go with a simpler API (e.g. pretty much a cleaned up
> version of my earlier patch), revert the postmaster death check, or
> somebody else has to take the lead in renovating, or we wait...

Well, I thought we could just revert the patch until you had time to
deal with it, and then put it back in.  That seemed like a simple and
practical option from here, and I don't think I quite understand why
you and Tom don't like it.  I don't have a problem with deferring to
the majority will here, but I would sort of like to understand the
reason for the majority will.

BTW, if need be, I can look for an EnterpriseDB resource to work on
this.  It won't likely be me, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Tom Lane
Andres Freund  writes:
> And anyway, these macros are a potential issue even without stdbool.h
> style booleans.

Absolutely; they don't work safely for testing bits that aren't in the
rightmost byte of a flag word, for instance.  I'm on board with making
these fixes, I'm just unconvinced that stdbool is a good reason for it.

regards, tom lane




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Andres Freund
On 2016-02-11 13:37:17 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > And anyway, these macros are a potential issue even without stdbool.h
> > style booleans.
> 
> Absolutely; they don't work safely for testing bits that aren't in the
> rightmost byte of a flag word, for instance.  I'm on board with making
> these fixes, I'm just unconvinced that stdbool is a good reason for it.

Oh, OK. Interactions with stdbool were what made me look into this;
that's primarily why I mentioned it.  What's your thinking about
back-patching, independent of that then?




Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 11, 2016 at 1:19 PM, Andres Freund  wrote:
>> Well, I can't do anything about that right now. I won't have the time to
>> whip up the new/more complex API we discussed upthread in the next few
>> days.  So either we go with a simpler API (e.g. pretty much a cleaned up
>> version of my earlier patch), revert the postmaster death check, or
>> somebody else has to take the lead in renovating, or we wait...

> Well, I thought we could just revert the patch until you had time to
> deal with it, and then put it back in.  That seemed like a simple and
> practical option from here, and I don't think I quite understand why
> you and Tom don't like it.

Don't particularly want the git history churn, if we expect that the
patch will ship as-committed in 9.6.  If it becomes clear that the
performance fix is unlikely to happen, we can revert then.

If the performance change were an issue for a lot of testing, I'd agree
with a temporary revert, but I concur with Andres that it's not blocking
much.  Anybody who does have an issue there can revert locally, no?

regards, tom lane




Re: [HACKERS] checkpointer continuous flushing - V16

2016-02-11 Thread Andres Freund
On 2016-02-04 16:54:58 +0100, Andres Freund wrote:
> Fabien asked me to post a new version of the checkpoint flushing patch
> series. While this isn't entirely ready for commit, I think we're
> getting closer.
> 
> I don't want to post a full series right now, but my working state is
> available on
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/checkpoint-flush
> git://git.postgresql.org/git/users/andresfreund/postgres.git checkpoint-flush

The first two commits of the series are pretty close to being ready. I'd
welcome review of those, and I plan to commit them independently of the
rest, as they're beneficial on their own.  The most important bits are
the comments and docs of 0002 - they weren't particularly good
beforehand, so I had to rewrite a fair bit.

0001: Make SetHintBits() a bit more aggressive; AFAICS that fixes all the
  potential regressions of 0002
0002: Fix the overaggressive flushing by the wal writer, by only
  flushing every wal_writer_delay ms or wal_writer_flush_after
  bytes.

Greetings,

Andres Freund
>From f3bc3a7c40c21277331689595814b359c55682dc Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 11 Feb 2016 19:34:29 +0100
Subject: [PATCH 1/6] Allow SetHintBits() to succeed if the buffer's LSN is new
 enough.

Previously we only allowed SetHintBits() to succeed if the commit LSN of
the last transaction touching the page had already been flushed to
disk. We can't generally change the LSN of the page, because we don't
necessarily have the required locks on the page. But the required LSN
interlock does not require the commit record to be flushed; it just
requires that the commit record will be flushed before the page is
written out. Therefore, if the buffer LSN is newer than the commit LSN,
the hint bit can be safely set.

In a number of scenarios (e.g. pgbench) this noticeably increases the
number of hint bits that are set. But more importantly it also keeps the
success rate up when flushing WAL less frequently. That was the original
reason for commit 4de82f7d7, which has negative performance consequences
in a number of scenarios. This will allow a follow-up commit to reduce the
flush rate.

Discussion: 20160118163908.gw10...@awork2.anarazel.de
---
 src/backend/utils/time/tqual.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 465933d..503bd1d 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -89,12 +89,13 @@ static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
  * Set commit/abort hint bits on a tuple, if appropriate at this time.
  *
  * It is only safe to set a transaction-committed hint bit if we know the
- * transaction's commit record has been flushed to disk, or if the table is
- * temporary or unlogged and will be obliterated by a crash anyway.  We
- * cannot change the LSN of the page here because we may hold only a share
- * lock on the buffer, so we can't use the LSN to interlock this; we have to
- * just refrain from setting the hint bit until some future re-examination
- * of the tuple.
+ * transaction's commit record is guaranteed to be flushed to disk before the
+ * buffer, or if the table is temporary or unlogged and will be obliterated by
+ * a crash anyway.  We cannot change the LSN of the page here because we may
+ * hold only a share lock on the buffer, so we can only use the LSN to
+ * interlock this if the buffer's LSN already is newer than the commit LSN;
+ * otherwise we have to just refrain from setting the hint bit until some
+ * future re-examination of the tuple.
  *
  * We can always set hint bits when marking a transaction aborted.  (Some
  * code in heapam.c relies on that!)
@@ -122,8 +123,12 @@ SetHintBits(HeapTupleHeader tuple, Buffer buffer,
 		/* NB: xid must be known committed here! */
 		XLogRecPtr	commitLSN = TransactionIdGetCommitLSN(xid);
 
-		if (XLogNeedsFlush(commitLSN) && BufferIsPermanent(buffer))
-			return;/* not flushed yet, so don't set hint */
+		if (BufferIsPermanent(buffer) && XLogNeedsFlush(commitLSN) &&
+			BufferGetLSNAtomic(buffer) < commitLSN)
+		{
+			/* not flushed and no LSN interlock, so don't set hint */
+			return;
+		}
 	}
 
 	tuple->t_infomask |= infomask;
-- 
2.7.0.229.g701fa7f

>From e4facce2cf8b982408ff1de174cffc202852adfd Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 11 Feb 2016 19:34:29 +0100
Subject: [PATCH 2/6] Allow the WAL writer to flush WAL at a reduced rate.

Commit 4de82f7d7 increased the WAL flush rate, mainly to increase the
likelihood that hint bits can be set quickly. More quickly set hint bits
can reduce contention around the clog et al.  But unfortunately the
increased flush rate can have a significant negative performance impact;
I have measured up to a factor of ~4.  The reason for this slowdown is
that if there are independent 

Re: [HACKERS] Performance degradation in commit ac1d794

2016-02-11 Thread Robert Haas
On Thu, Feb 11, 2016 at 1:41 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Feb 11, 2016 at 1:19 PM, Andres Freund  wrote:
>>> Well, I can't do anything about that right now. I won't have the time to
>>> whip up the new/more complex API we discussed upthread in the next few
>>> days.  So either we go with a simpler API (e.g. pretty much a cleaned up
>>> version of my earlier patch), revert the postmaster death check, or
>>> somebody else has to take the lead in renovating, or we wait...
>
>> Well, I thought we could just revert the patch until you had time to
>> deal with it, and then put it back in.  That seemed like a simple and
>> practical option from here, and I don't think I quite understand why
>> you and Tom don't like it.
>
> Don't particularly want the git history churn, if we expect that the
> patch will ship as-committed in 9.6.  If it becomes clear that the
> performance fix is unlikely to happen, we can revert then.
>
> If the performance change were an issue for a lot of testing, I'd agree
> with a temporary revert, but I concur with Andres that it's not blocking
> much.  Anybody who does have an issue there can revert locally, no?

True.  Maybe we'll just have to start doing that for EnterpriseDB
benchmarking as standard practice.  Not sure everybody who is
benchmarking will realize the issue, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GinPageIs* don't actually return a boolean

2016-02-11 Thread Tom Lane
Andres Freund  writes:
> On 2016-02-11 13:37:17 -0500, Tom Lane wrote:
>> Absolutely; they don't work safely for testing bits that aren't in the
>> rightmost byte of a flag word, for instance.  I'm on board with making
>> these fixes, I'm just unconvinced that stdbool is a good reason for it.

> Oh, OK. Interactions with stdbool were what made me look into this;
> that's primarily why I mentioned it.  What's your thinking about
> back-patching, independent of that then?

Well, Yury was saying upthread that some MSVC versions have a problem
with the existing coding, which would be a reason to back-patch ...
but I'd like to see a failing buildfarm member first.  Don't particularly
want to promise to support compilers not represented in the farm.

regards, tom lane

