Re: [HACKERS] pg_control_recovery() return value when not in recovery
On Mon, Sep 18, 2017 at 3:29 PM, Andres Freund wrote:
> On 2017-09-18 07:24:43 +0100, Simon Riggs wrote:
>> On 18 September 2017 at 05:50, Andres Freund wrote:
>>> Hi,
>>>
>>> Just noticed that we're returning the underlying values for
>>> pg_control_recovery() without any checks:
>>>
>>> postgres[14388][1]=# SELECT * FROM pg_control_recovery();
>>> ┌──────────────────────┬───────────────────────────┬──────────────────┬────────────────┬───────────────────────────────┐
>>> │ min_recovery_end_lsn │ min_recovery_end_timeline │ backup_start_lsn │ backup_end_lsn │ end_of_backup_record_required │
>>> ├──────────────────────┼───────────────────────────┼──────────────────┼────────────────┼───────────────────────────────┤
>>> │ 0/0                  │                         0 │ 0/0              │ 0/0            │ f                             │
>>> └──────────────────────┴───────────────────────────┴──────────────────┴────────────────┴───────────────────────────────┘
>>> (1 row)
>>
>> Yes, that would have made sense for these to be NULL
>
> Yea, that's what I think as well. Joe, IIRC that's your code, do you
> agree as well?

+1 for NULLness here. That was a point not covered during the review of the feature.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_control_recovery() return value when not in recovery
On 2017-09-18 07:24:43 +0100, Simon Riggs wrote:
> On 18 September 2017 at 05:50, Andres Freund wrote:
>> Hi,
>>
>> Just noticed that we're returning the underlying values for
>> pg_control_recovery() without any checks:
>>
>> postgres[14388][1]=# SELECT * FROM pg_control_recovery();
>> ┌──────────────────────┬───────────────────────────┬──────────────────┬────────────────┬───────────────────────────────┐
>> │ min_recovery_end_lsn │ min_recovery_end_timeline │ backup_start_lsn │ backup_end_lsn │ end_of_backup_record_required │
>> ├──────────────────────┼───────────────────────────┼──────────────────┼────────────────┼───────────────────────────────┤
>> │ 0/0                  │                         0 │ 0/0              │ 0/0            │ f                             │
>> └──────────────────────┴───────────────────────────┴──────────────────┴────────────────┴───────────────────────────────┘
>> (1 row)
>
> Yes, that would have made sense for these to be NULL

Yea, that's what I think as well. Joe, IIRC that's your code, do you agree as well?

>> postgres[14388][1]=# SELECT pg_is_in_recovery();
>> ┌───────────────────┐
>> │ pg_is_in_recovery │
>> ├───────────────────┤
>> │ f                 │
>> └───────────────────┘
>> (1 row)
>
> But not this, since it is a boolean and the answer is known.

Oh, that was just for reference, to show that the cluster isn't in recovery...

- Andres
Re: [HACKERS] pg_control_recovery() return value when not in recovery
On 18 September 2017 at 05:50, Andres Freund wrote:
> Hi,
>
> Just noticed that we're returning the underlying values for
> pg_control_recovery() without any checks:
>
> postgres[14388][1]=# SELECT * FROM pg_control_recovery();
> ┌──────────────────────┬───────────────────────────┬──────────────────┬────────────────┬───────────────────────────────┐
> │ min_recovery_end_lsn │ min_recovery_end_timeline │ backup_start_lsn │ backup_end_lsn │ end_of_backup_record_required │
> ├──────────────────────┼───────────────────────────┼──────────────────┼────────────────┼───────────────────────────────┤
> │ 0/0                  │                         0 │ 0/0              │ 0/0            │ f                             │
> └──────────────────────┴───────────────────────────┴──────────────────┴────────────────┴───────────────────────────────┘
> (1 row)

Yes, that would have made sense for these to be NULL

> postgres[14388][1]=# SELECT pg_is_in_recovery();
> ┌───────────────────┐
> │ pg_is_in_recovery │
> ├───────────────────┤
> │ f                 │
> └───────────────────┘
> (1 row)

But not this, since it is a boolean and the answer is known.

--
Simon Riggs
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] sync process names between ps and pg_stat_activity
From: Peter Eisentraut
> The process names shown in pg_stat_activity.backend_type as of PG10 and
> the process names used in the ps display are in some cases gratuitously
> different, so here is a patch to make them more alike.  Of course it
> could be debated in some cases which spelling was better.

(1) In the following comment, it's better to change "wal sender process" to "walsender" to follow the modified name.

- * postgres: wal sender process
+ * postgres: walsender
 *
 * To achieve that, we pass "wal sender process" as username and username
 * as dbname to init_ps_display().  XXX: should add a new variant of
 * init_ps_display() to avoid abusing the parameters like this.
 */

(2) "WAL writer process", not "walwriter", is used in postmaster.c as follows. I guess this is for natural language. Is this intended? I'm OK with either, though.

HandleChildCrash(pid, exitstatus, _("WAL writer process"));

case WalWriterProcess:
    ereport(LOG,
            (errmsg("could not fork WAL writer process: %m")));

Personally, I prefer "wal writer", "wal sender" and "wal receiver", which separate the words like other process names do. But I don't mind leaving them as they are now.

I'd like to mark this as ready for committer when I get some reply.

Regards
MauMau
Re: [HACKERS] valgrind vs. shared typmod registry
On Mon, Sep 18, 2017 at 5:39 PM, Thomas Munro wrote: > Here is a patch to fix that. Here's a better one (same code, corrected commit message). -- Thomas Munro http://www.enterprisedb.com 0001-Fix-uninitialized-variable-in-dshash.c.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UPDATE of partition key
On Fri, Sep 15, 2017 at 4:55 PM, Amit Khandekar wrote:
> On 12 September 2017 at 12:39, Amit Khandekar wrote:
>> On 12 September 2017 at 11:57, Dilip Kumar wrote:
>>> On Tue, Sep 12, 2017 at 11:15 AM, Amit Khandekar wrote:
>
> I found out that, in the case where there is a DELETE statement trigger
> using transition tables, it's not only an issue of redundancy; it's a
> correctness issue.  Since for transition tables both DELETE and UPDATE
> use the same old-row tuplestore for capturing the OLD table, that table
> gets duplicate rows: one from ExecARDeleteTriggers() and another from
> ExecARUpdateTriggers().  In the presence of an INSERT statement trigger
> using transition tables, INSERT and UPDATE events have separate
> tuplestores, so duplicate rows don't show up in the UPDATE NEW table.
> But, nevertheless, we need to prevent NEW rows from being collected in
> the INSERT event tuplestore, and capture the NEW rows only in the
> UPDATE event tuplestore.
>
> In the attached patch, we first call ExecARUpdateTriggers(), and while
> doing that, we save the info that a NEW row is already captured
> (mtstate->mt_transition_capture->tcs_update_old_table == true).  If it
> is captured, we pass a NULL transition_capture pointer to
> ExecARDeleteTriggers() (and ExecARInsertTriggers) so that it does not
> capture an extra row again.
>
> Modified a testcase in update.sql by including a DELETE statement
> trigger that uses transition tables.

Ok, this fix looks correct to me. I will review the latest patch.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
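[Editor's sketch] The ordering trick described above — fire the UPDATE triggers first, then suppress the transition capture for the DELETE path so the shared OLD-table tuplestore gets each row once — can be modeled in a few lines. This is Python pseudocode of the idea, not PostgreSQL's C code; all names here are illustrative.

```python
# Model of the duplicate-capture fix: DELETE and UPDATE share one OLD-table
# tuplestore, so after the UPDATE triggers capture the old row, the DELETE
# trigger call gets capture=None and does not add the row a second time.

class TransitionCapture:
    def __init__(self):
        self.old_table = []  # shared tuplestore for DELETE/UPDATE OLD rows

def ar_update_triggers(capture, old_row):
    if capture is not None:
        capture.old_table.append(old_row)

def ar_delete_triggers(capture, old_row):
    if capture is not None:
        capture.old_table.append(old_row)

def cross_partition_update(capture, old_row):
    # Fire UPDATE triggers first; the old row is captured once here.
    ar_update_triggers(capture, old_row)
    # Row movement is implemented as DELETE + INSERT, but capture=None
    # keeps the same old row from being collected again.
    ar_delete_triggers(None, old_row)

cap = TransitionCapture()
cross_partition_update(cap, ("row1",))
print(cap.old_table)  # [('row1',)] -- captured exactly once
```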
Re: [HACKERS] valgrind vs. shared typmod registry
On Sun, Sep 17, 2017 at 8:49 AM, Thomas Munro wrote: > On Sun, Sep 17, 2017 at 7:42 AM, Thomas Munro > wrote: >> On Sun, Sep 17, 2017 at 12:30 AM, Tomas Vondra >> wrote: >>> I've been running some regression tests under valgrind, and it seems >>> select_parallel triggers some uses of uninitialized values in dshash. If >>> I'm reading the reports right, it complains about hashtable->size_log2 >>> being not being initialized in ensure_valid_bucket_pointers. >> >> Thanks. Will investigate. > > Yeah, it's a bug, I simply failed to initialise it. > ensure_valid_bucket_pointers() immediately fixes the problem (unless > the uninitialised memory had an unlikely value), explaining why it > works anyway. I'm a bit tied up today but will test and post a patch > tomorrow. Here is a patch to fix that. -- Thomas Munro http://www.enterprisedb.com 0001-Fix-uninitialized-variable-in-dshash.c.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Small patch for pg_basebackup argument parsing
The following review has been posted through the commitfest application:

make installcheck-world: tested, passed
Implements feature:      tested, passed
Spec compliant:          tested, passed
Documentation:           not tested

I applied this patch via patch -p1. (I had an issue using git apply, but apparently that's common?) All tests passed normally: I ran make check, make installcheck, and make installcheck-world. I looked through the diffs and they look fine to me.

The patch changes a number of integer/long arguments that were being read directly via atoi/atol so that they are read as strings first, giving better error handling.

This looks good to go to me. Marking this as "Ready for Committer". Thanks for the patch, Pierre!

Ryan

The new status of this patch is: Ready for Committer
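[Editor's sketch] The motivation for the string-first change above is that C's atoi() silently accepts trailing junk and gives no error reporting. The contrast can be shown in a short Python model (the function and option names below are illustrative, not pg_basebackup's actual code):

```python
# Why parsing numeric options as strings beats atoi(): atoi() silently
# ignores trailing garbage, while a strict parse can report a clear error.

def atoi_like(s):
    """Mimic C atoi(): consume an optional sign and leading digits,
    silently ignore everything after them."""
    i, sign = 0, 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        i = 1
    j = i
    while j < len(s) and s[j].isdigit():
        j += 1
    return sign * int(s[i:j] or "0")

def parse_int_option(name, s, lo, hi):
    """Strict parse with a descriptive error, as the patch's approach does."""
    try:
        v = int(s, 10)
    except ValueError:
        raise SystemExit(f'invalid value "{s}" for option {name}')
    if not (lo <= v <= hi):
        raise SystemExit(f"{name} must be between {lo} and {hi}")
    return v

print(atoi_like("5x"))   # 5 -- the trailing "x" is silently dropped
print(parse_int_option("--status-interval", "5", 0, 2147483))  # 5
```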
Re: [HACKERS] Parallel Append implementation
On 16 September 2017 at 11:45, Amit Kapila wrote:
> On Thu, Sep 14, 2017 at 8:30 PM, Amit Khandekar wrote:
>> On 11 September 2017 at 18:55, Amit Kapila wrote:
>>> How? See, if you have four partial subpaths and two non-partial
>>> subpaths, then for parallel-aware append it considers all six paths in
>>> parallel path whereas for non-parallel-aware append it will consider
>>> just four paths and that too with sub-optimal strategy.  Can you
>>> please try to give me some example so that it will be clear.
>>
>> Suppose 4 appendrel children have costs for their cheapest partial (p)
>> and non-partial (np) paths as shown below:
>>
>> p1=5000  np1=100
>> p2=200   np2=1000
>> p3=80    np3=2000
>> p4=3000  np4=50
>>
>> Here, the following two Append paths will be generated:
>>
>> 1. A parallel-aware Append path with subpaths:
>>    np1, p2, p3, np4
>>
>> 2. A partial (i.e. non-parallel-aware) Append path with all partial subpaths:
>>    p1, p2, p3, p4
>>
>> Now, one thing we can do above is: make path #2 parallel-aware as
>> well, so both Append paths would be parallel-aware.
>
> Yes, we can do that, and that is what I think is probably better.  So
> the question remains: in which case would the non-parallel-aware partial
> append be required?  Basically, it is not clear to me why, after having
> a parallel-aware partial append, we need the non-parallel-aware version.
> Are you keeping it for the sake of backward compatibility, or for cases
> where someone has disabled parallel append with the help of the new GUC
> in this patch?

Yes, one case is the enable_parallelappend GUC case. If a user disables it, we do want to add the usual non-parallel-aware partial Append path. About backward compatibility, the concern we discussed in [1] was that we'd better continue to have the usual non-parallel-aware partial Append path, plus an additional parallel-aware Append path containing a mix of partial and non-partial subpaths.
But thinking again on the example above, Amit, I tend to agree that we don't have to worry about the existing behaviour, and so we can make path #2 parallel-aware as well. Robert, can you please give your opinion on the paths chosen in the above example?

[1] https://www.postgresql.org/message-id/CA%2BTgmoaLRtaWdJVHfhHej2s7w1spbr6gZiZXJrM5bsz1KQ54Rw%40mail.gmail.com

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
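[Editor's sketch] The subpath choice in the example above can be stated compactly: a parallel-aware Append may pick, per child, the cheaper of its partial and non-partial paths, while a non-parallel-aware partial Append may use only partial paths. A Python model (with the hypothetical costs from the example; not planner code):

```python
# Per-child cheapest-path selection for the two Append flavours discussed
# above. Each child is a (partial_cost, nonpartial_cost) pair.

def parallel_aware_pick(children):
    """Parallel-aware Append: cheaper of partial/non-partial per child."""
    return [min(p, np) for p, np in children]

def partial_append_pick(children):
    """Non-parallel-aware partial Append: partial paths only."""
    return [p for p, np in children]

children = [(5000, 100), (200, 1000), (80, 2000), (3000, 50)]
print(parallel_aware_pick(children))  # [100, 200, 80, 50] -> np1, p2, p3, np4
print(partial_append_pick(children))  # [5000, 200, 80, 3000] -> p1, p2, p3, p4
```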
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
On Sun, Sep 17, 2017 at 4:34 PM, Dilip Kumar wrote:
>> I have repeated one of the tests after fixing the problems pointed out
>> by you, but this time the results are not that impressive.  It seems
>> the below check was the problem in the previous patch:
>>
>>    if (tbm->nentries > tbm->maxentries / 2)
>>        tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
>>
>> Because we were lossifying only till tbm->nentries becomes 90% of
>> tbm->maxentries, but later we had this check, which will always be true,
>> and tbm->maxentries will be doubled.  That was the main reason for the
>> huge reduction in lossy pages: basically, we started using more
>> work_mem in all the cases.
>>
>> I have taken one reading just to see the impact after fixing the
>> problem with the patch.
>>
>> work_mem: 40 MB
>> (Lossy pages count)
>>
>> Query   head      patch
>> 6       995223    733087
>> 14      337894    206824
>> 15      995417    798817
>> 20      1654016   1588498
>>
>> Still, we see a good reduction in the lossy pages count.  I will perform
>> the test at different work_mem settings and for different values of
>> TBM_FILFACTOR and share the numbers soon.
>
> I haven't yet completely measured the performance with the executor
> lossification change; meanwhile, I have worked on some of the comments
> on the optimiser change and measured the performance again.  I still see
> a good improvement in performance (almost 2x for some of the queries),
> and with the new method of lossy pages calculation I don't see a
> regression in Q14 (now Q14 is not changing its plan).
>
> I used lossy_pages = max(0, total_pages - maxentries / 2), as
> suggested by Alexander.
>
> Performance results:
>
> Machine: Intel 56-core machine (2 NUMA nodes)
> work_mem: varies
> TPC-H S.F.: 20
> Median of 3 runs.
> work_mem = 4MB
>
> Query   Patch(ms)   Head(ms)    Change in plan
> 4       4686.186    5039.295    PBHS -> PSS
> 5       26772.192   27500.800   BHS -> SS
> 6       6615.916    7760.005    PBHS -> PSS
> 8       6370.611    12407.731   PBHS -> PSS
> 15      17493.564   24242.256   BHS -> SS
>
> work_mem = 20MB
>
> Query   Patch(ms)   Head(ms)    Change in plan
> 6       6656.467    7469.961    PBHS -> PSS
> 8       6116.526    12300.784   PBHS -> PSS
> 15      17873.726   22913.421   BHS -> PSS
>
> work_mem = 64MB
>
> Query   Patch(ms)   Head(ms)    Change in plan
> 15      14900.881   27460.093   BHS -> PBHS

There was a problem with the previous patch: even if the bitmap was enough to hold all the heap pages, I was calculating lossy pages. I have fixed that in the attached patch. I have also verified the performance; it's the same as reported in the previous email.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

improve_bitmap_cost_v3.patch
Description: Binary data
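[Editor's sketch] The costing formula quoted above, lossy_pages = max(0, total_pages - maxentries / 2), plus the guard described in the last paragraph (no lossy pages predicted when the bitmap can hold every heap page exactly), can be written down directly. A Python model of the estimate, not the patch's C code:

```python
# Estimated lossy pages for a TID bitmap: if the bitmap can hold an exact
# entry for every heap page, nothing goes lossy; otherwise each lossified
# entry absorbs roughly two exact entries, hence total - maxentries/2.

def estimate_lossy_pages(total_pages, maxentries):
    if total_pages <= maxentries:
        return 0  # bitmap stays fully exact
    return max(0, total_pages - maxentries // 2)

print(estimate_lossy_pages(1000, 4000))  # 0    -- enough room, all exact
print(estimate_lossy_pages(4000, 1000))  # 3500 -- most pages go lossy
```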
[HACKERS] pg_control_recovery() return value when not in recovery
Hi,

Just noticed that we're returning the underlying values for pg_control_recovery() without any checks:

postgres[14388][1]=# SELECT * FROM pg_control_recovery();
┌──────────────────────┬───────────────────────────┬──────────────────┬────────────────┬───────────────────────────────┐
│ min_recovery_end_lsn │ min_recovery_end_timeline │ backup_start_lsn │ backup_end_lsn │ end_of_backup_record_required │
├──────────────────────┼───────────────────────────┼──────────────────┼────────────────┼───────────────────────────────┤
│ 0/0                  │                         0 │ 0/0              │ 0/0            │ f                             │
└──────────────────────┴───────────────────────────┴──────────────────┴────────────────┴───────────────────────────────┘
(1 row)

postgres[14388][1]=# SELECT pg_is_in_recovery();
┌───────────────────┐
│ pg_is_in_recovery │
├───────────────────┤
│ f                 │
└───────────────────┘
(1 row)

Wouldn't it be more accurate to return NULLs here?

Greetings,

Andres Freund
[HACKERS] parallel.c oblivion of worker-startup failures
Sometime back Tom Lane reported [1] about $Subject. I have looked into the issue and found that the problem is not only with parallel workers but with the general background worker machinery as well, in situations where fork or some such failure occurs.

The first problem is that after we register the dynamic worker, the ways to know whether the worker has started (WaitForBackgroundWorkerStartup/GetBackgroundWorkerPid) won't give the right answer if a fork failure happens. Also, in cases where the worker is marked not to restart after a crash, the postmaster doesn't notify the backend if it is not able to start the worker, which can make the backend wait forever, as it is oblivious of the fact that the worker has not started.

Now, apart from these general problems of the background worker machinery, parallel.c assumes that after it has registered the dynamic workers, they will start and perform their work. We need to ensure that in case the postmaster is not able to start parallel workers due to fork failure or similar situations, the backend doesn't keep waiting forever. To fix it, before waiting for workers to finish, we can check whether the worker exists at all. The attached patch fixes these problems.

Another way to fix the parallel-query-related problem is that after registering the workers, the master backend could wait for the workers to start before setting up the various queues for them to communicate. I think that can be quite expensive.

Thoughts?

[1] - https://www.postgresql.org/message-id/4905.1492813...@sss.pgh.pa.us

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

fix_worker_startup_failures_v1.patch
Description: Binary data
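[Editor's sketch] The essence of the fix described above — before blocking on workers' results, check that each registered worker actually started, and error out instead of waiting forever — can be modeled in a few lines. This is a toy Python model, not the bgworker API; all names are illustrative.

```python
# Toy model: a backend that checks worker existence before waiting,
# so a fork failure turns into an error rather than an infinite wait.

NOT_YET_STARTED, RUNNING, STOPPED = range(3)

class WorkerHandle:
    def __init__(self, status, ever_started):
        self.status = status            # current postmaster-side state
        self.ever_started = ever_started  # did the fork ever succeed?

def wait_for_workers(handles):
    for h in handles:
        # A worker that is stopped but never ran indicates fork failure:
        # there is nothing to wait for, so report it instead of blocking.
        if h.status == STOPPED and not h.ever_started:
            raise RuntimeError("parallel worker failed to start")
    return "all workers accounted for"

ok = [WorkerHandle(RUNNING, True), WorkerHandle(STOPPED, True)]
print(wait_for_workers(ok))  # all workers accounted for
```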
Re: [HACKERS] Automatic testing of patches in commit fest
On Mon, Sep 18, 2017 at 2:39 PM, Andres Freund wrote: > E.g. very little of the new stuff in > https://codecov.io/gh/postgresql-cfbot/postgresql/commit/ceaa3dbece3c9b98abcaa28009320fde45a83f88 > is exercised. Hoist by my own petard. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Automatic testing of patches in commit fest
Hi,

On 2017-09-18 14:26:53 +1200, Thomas Munro wrote:
> A couple of new experimental features on commitfest.cputube.org:

Yay.

> 2. It'll now dump a gdb backtrace for any core files found after a
> check-world failure (if you can find your way to the build log...).
> Thanks to Andres for the GDB scripting for this!

Scripting that should not be needed, except that tools are generally crappy :(

> The code coverage reports at codecov.io are supposed to allow
> comparison between a branch and master so you can see exactly what
> changed in terms of coverage, but I ran into a technical problem and
> ran out of time for now to make that happen.  But you can still click
> your way through to the commit and see a coverage report for the
> commit diff.  In other words you can see which modified lines are run
> by make check-world, which seems quite useful.

Yes, it's definitely already useful. If you check https://codecov.io/gh/postgresql-cfbot/postgresql/commits you can see the code coverage for the various CF entries. Both the absolute coverage and, more interestingly, the coverage of the changed lines. There's some noticeable difference in coverage between the various entries... E.g. very little of the new stuff in https://codecov.io/gh/postgresql-cfbot/postgresql/commit/ceaa3dbece3c9b98abcaa28009320fde45a83f88 is exercised.

> There are plenty more things we could stick into this pipeline, like
> LLVM sanitizer stuff, valgrind, Coverity, more compilers, ... but I'm
> not sure what things really make sense.

A lot of these probably are too slow to be practical... I know it's not fun, but a good bit of the work probably is going to be making the UI of the overview a bit better. E.g. direct links to the build / test logs from travis/codecov/..., allowing filtering by failing to apply / failing tests / ok, etc...

Some of this would be considerably easier if the project were ok with having a .travis.yml in our source tree.

Regards,

Andres
Re: [HACKERS] DROP SUBSCRIPTION hangs if sub is disabled in the same transaction
On 9/16/17 08:52, Peter Eisentraut wrote:
> On 9/15/17 13:35, Arseny Sher wrote:
>> Peter Eisentraut writes:
>>> Here is a simple patch that fixes this, based on my original proposal
>>> point #4.
>>
>> I checked, it passes the tests and solves the problem.  However, isn't
>> this
>>
>> +    if (slotname || !subenabled)
>>
>> a truism?  Is it possible that a subscription has no slot but is still
>> enabled?
>
> Yeah, we could just remove the _at_commit() branch entirely.  That would
> effectively undo the change in 7e174fa793a2df89fe03d002a5087ef67abcdde8,
> but I don't see any other choice for now.  And the practical impact
> would be quite limited.

Fix committed based on this discussion.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Automatic testing of patches in commit fest
Hi hackers,

A couple of new experimental features on commitfest.cputube.org:

1. I didn't have --enable-cassert enabled before.  Oops.

2. It'll now dump a gdb backtrace for any core files found after a check-world failure (if you can find your way to the build log...).  Thanks to Andres for the GDB scripting for this!

3. It'll now push gcov results to codecov.io -- see link at top of page.  Thanks again to Andres for this idea.

4. It now builds a little bit faster due to -j4 (Travis CI VMs have 2 virtual cores) and .proverc -j3.  (So far one entry now fails in TAP tests with that setting; I will wait longer before drawing any conclusions about that.)

The code coverage reports at codecov.io are supposed to allow comparison between a branch and master so you can see exactly what changed in terms of coverage, but I ran into a technical problem and ran out of time for now to make that happen.  But you can still click your way through to the commit and see a coverage report for the commit diff.  In other words you can see which modified lines are run by make check-world, which seems quite useful.

There are plenty more things we could stick into this pipeline, like LLVM sanitizer stuff, valgrind, Coverity, more compilers, ... but I'm not sure what things really make sense.  I may be placing undue importance on things that I personally screwed up recently :-D

--
Thomas Munro
http://www.enterprisedb.com
Re: [HACKERS] Setting pd_lower in GIN metapage
On Sat, Sep 16, 2017 at 7:15 PM, Michael Paquier wrote: > On Fri, Sep 15, 2017 at 4:22 PM, Amit Langote > wrote: >> On 2017/09/14 16:00, Michael Paquier wrote: >>> On Wed, Sep 13, 2017 at 4:43 PM, Amit Langote >>> wrote: Sure, no problem. >>> >>> OK, I took a closer look at all patches, but did not run any manual >>> tests to test the compression except some stuff with >>> wal_consistency_checking. >> >> Thanks for the review. >> >>> For SpGist, I think that there are two missing: spgbuild() and >>> spgGetCache(). >> >> spgbuild() calls SpGistInitMetapage() before marking the metapage buffer >> dirty. The latter already sets pd_lower correctly, so we don't need to do >> it explicitly in spgbuild() itself. > > Check. > >> spgGetCache() doesn't write the metapage, only reads it: >> >> /* Last, get the lastUsedPages data from the metapage */ >> metabuffer = ReadBuffer(index, SPGIST_METAPAGE_BLKNO); >> LockBuffer(metabuffer, BUFFER_LOCK_SHARE); >> >> metadata = SpGistPageGetMeta(BufferGetPage(metabuffer)); >> >> if (metadata->magicNumber != SPGIST_MAGIC_NUMBER) >> elog(ERROR, "index \"%s\" is not an SP-GiST index", >> RelationGetRelationName(index)); >> >> cache->lastUsedPages = metadata->lastUsedPages; >> >> UnlockReleaseBuffer(metabuffer); >> >> So, I think it won't be correct to set pd_lower here, no? > > Yeah, I am just reading the code again and there is no alarm here. > >> Updated patch attached, which implements your suggested changes to the >> masking functions. >> >> By the way, as I noted on another unrelated thread, I will not be able to >> respond to emails from tomorrow until 9/23. > > No problem. Enjoy your vacations. > > I have spent some time looking at the XLOG insert code, and tested the > compressibility of the meta pages. 
> And I have noticed that all pages registered with REGBUF_WILL_INIT will
> force XLogRecordAssemble to not take a FPW of the block registered,
> because the page will be reinitialized at replay, and in such cases the
> zero'ed page is filled with the data from the record.  log_newpage is
> used to initialize init forks for unlogged relations, which is where
> this patch allows page compression to take effect correctly.  Still,
> setting REGBUF_STANDARD with REGBUF_WILL_INIT is actually a no-op,
> except if wal_consistency_checking is used when registering a buffer
> for WAL insertion.  There is one code path though where things are
> useful all the time: revmap_physical_extend for BRIN.
>
> The patch is using the correct logic, still such comments are in my
> opinion incorrect because of the reason written above:
>
> +   * This won't be of any help unless we use option like REGBUF_STANDARD
> +   * while registering the buffer for metapage as otherwise, it won't try to
> +   * remove the hole while logging the full page image.
>
> Here REGBUF_STANDARD is actually a no-op for btree.

You have already noticed above that it will help when wal_consistency_checking is used, and that was the main motivation to pass REGBUF_STANDARD apart from maintaining consistency. It is not clear to me what is bothering you. If your only worry about these patches is that you want this sentence removed from the comment because you think it is obvious or doesn't make much sense, then I think we can leave that decision to the committer. I added it based on Tom's suggestion upthread about explaining why it is essential or inessential to set pd_lower. I think Amit Langote just tried to mimic what I have done in the hash and btree patches to maintain consistency. I am also not very sure whether we should write a more detailed comment or leave the existing comment as it is. I think it is just a matter of different perspective.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
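[Editor's sketch] Why pd_lower matters here: for a page registered as "standard", the full-page-write code can omit the "hole" between pd_lower and pd_upper, so a metapage whose pd_lower actually covers its data shrinks the WAL image dramatically. A toy Python byte-layout model (not PostgreSQL's real page format or FPW code):

```python
# Hole elimination in a full-page image: only the bytes before pd_lower
# and after pd_upper need to be written; the gap between them is a hole
# that replay can refill with zeroes.

PAGE_SIZE = 8192

def fpw_image(page, pd_lower, pd_upper):
    """Bytes actually written to WAL for a standard page."""
    return page[:pd_lower] + page[pd_upper:]

# A metapage with 64 bytes of real data followed by an all-zero hole.
page = bytes(PAGE_SIZE)

full  = fpw_image(page, pd_lower=PAGE_SIZE, pd_upper=PAGE_SIZE)  # no hole known
holed = fpw_image(page, pd_lower=64, pd_upper=PAGE_SIZE)         # hole skipped
print(len(full), len(holed))  # 8192 64
```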
Re: [HACKERS] GSoC 2017: Foreign Key Arrays
I have not looked at the issue with the btree_gin tests yet, but here is the first part of my review.

= Review

This is my first quick review, where I just read the documentation and quickly tested the feature. I will review it more in depth later.

This is a very useful feature, one which I have wished for for a long time.

The patch applies, compiles and passes the test suite with just one warning.

parse_coerce.c: In function ‘select_common_type_2args’:
parse_coerce.c:1379:7: warning: statement with no effect [-Wunused-value]
   rightOid;
   ^~~~

= Functional

The documentation does not agree with the code on the syntax. The documentation claims it is "FOREIGN KEY (ELEMENT xs) REFERENCES t1 (x)" when it actually is "FOREIGN KEY (EACH ELEMENT OF xs) REFERENCES t1 (x)". Likewise I can't get the "final_positions integer[] ELEMENT REFERENCES drivers" syntax to work, but here I cannot see any change in the syntax to support it.

Related to the above: I am not sure it is a good idea to make ELEMENT a reserved word in column definitions. What if the SQL standard wants to use it for something?

The documentation claims ON CASCADE DELETE is not supported by array element foreign keys, but I do not think that is actually the case.

I think I prefer (EACH ELEMENT OF xs) over (ELEMENT xs), given that the former is more in what I feel is the spirit of SQL. And if so, we should match it as "xs integer[] EACH ELEMENT REFERENCES t1 (x)", assuming we want that syntax.

Once I have created an array element foreign key, the basic features seem to work as expected. The error message below fails to mention that it is an array element foreign key, but I do not think that is a blocker for getting this feature merged. Right now I cannot think of how to improve it either.

$ INSERT INTO t3 VALUES ('{1,3}');
ERROR:  insert or update on table "t3" violates foreign key constraint "t3_xs_fkey"
DETAIL:  Key (xs)=({1,3}) is not present in table "t1".
= Nitpicking/style comments

In doc/src/sgml/catalogs.sgml the "conpfeqop" line is incorrectly indented.

I am not a fan of calling it "array-vs-scalar". What about "array to scalar"?

In ddl.sgml "date" should be lower case like the other types in "race_day DATE,".

In ddl.sgml I suggest removing the "..." from the examples to make it possible to copy-paste them easily.

Your text wrapping in ddl.sgml and create_table.sgml is quite arbitrary. I suggest wrapping all paragraphs at 80 characters (except for code, which should not be wrapped). Your text editor probably has tools for wrapping paragraphs.

Please be consistent about how you write table names and SQL in general. Almost all places use lower case for table names, while your examples in create_table.sgml use FKTABLEFORARRAY.

Andreas
Re: [HACKERS] [PATCH] Generic type subscripting
Arthur Zakirov writes:
> CREATE SUBSCRIPTING FOR type_name
>     INITFUNC = subscripting_init_func
>     FETCHFUNC = subscripting_fetch_func
>     ASSIGNFUNC = subscripting_assign_func
>
> DROP SUBSCRIPTING FOR type_name

Reasonable, but let's make the syntax more like other similar utility commands such as CREATE AGGREGATE --- basically just adding some parens, IIRC.

			regards, tom lane
Re: [HACKERS] Patch: add --if-exists to pg_recvlogical
On Fri, Sep 1, 2017 at 10:22 AM, Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote:
> I understand the --drop-slot part.  But I don't understand what it means
> to ignore a missing replication slot when running --start.

I'm not sure I do either, honestly. I followed the Principle of Least Surprise in making it a no-op when those switches are used and the slot doesn't exist, over "no one will ever do that". Because someone will. I'm happy to hear suggestions on what to do in that case beyond exiting cleanly.

> The same option should be added to pg_receivewal as well.

Done.

On Fri, Sep 8, 2017 at 21:06 Thomas Munro wrote:
> Also "--start" breaks the documentation build
> (missing slash on the closing tag).

Fixed, thanks.

Please see the revised patch, attached, addressing both comments.

rls

--
:wq

pg_recvlogical--if-exists-v2.patch
Description: Binary data
Re: [HACKERS] [PATCH] Generic type subscripting
On Sun, Sep 17, 2017 at 12:27:58AM +0200, Dmitry Dolgov wrote:
> in spite of what form this step will be. Maybe it's possible to make
> something like `CREATE FUNCTION ... FOR SUBSCRIPTING`, then verify that
> assign/extract functions are present and notify the user if he missed
> them (but I would rather not do this unless it's really necessary, since
> it looks like overkill).
>
> But I'm open to any suggestions, do you have something in mind?

I have put some thought into it. What about the following syntax?

CREATE SUBSCRIPTING FOR type_name
    INITFUNC = subscripting_init_func
    FETCHFUNC = subscripting_fetch_func
    ASSIGNFUNC = subscripting_assign_func

DROP SUBSCRIPTING FOR type_name

But I am not sure if the community will like such syntax.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] Improving DISTINCT with LooseScan node
On Mon, Sep 18, 2017 at 5:43 AM, Dmitriy Sarafannikov wrote:
> Hi hackers,
>
> Everybody knows that we have inefficient execution of a query like "SELECT
> DISTINCT id FROM mytable" if the table has many, many rows and only several
> unique id values. The query plan looks like Unique + IndexScan.
>
> I have tried to implement this feature in a new type of node called Loose
> Scan. This node must appear in the plan together with IndexScan or
> IndexOnlyScan, just like the Unique node does in this case. But instead of
> filtering rows with equal values, LooseScan must retrieve the first row from
> the IndexScan, then remember and return it. On all subsequent calls LooseScan
> must initiate index_rescan via ExecReScan with a search value that is > or <
> (depending on scan direction) the previous value. The total cost of this path
> must be something like total_cost = n_distinct_values * subpath->startup_cost.
> What do you think about this idea?
>
> I was able to create the new LooseScan node, but I ran into difficulties with
> changing scan keys. I looked (for example) at the ExecReScanIndexOnlyScan
> function and I find it difficult to understand how I can reach the
> ioss_ScanKeys through the state of the executor. Can you help me with this?
>
> I also looked at the Nested Loop node, which as I understand must change scan
> keys, but it didn't become clear to me. The only thought that came to my head
> is that maybe I incorrectly create this subplan. I create it just like
> create_upper_unique_plan, and create the subplan for IndexScan via
> create_plan_recurse. Can you tell me where to look, or maybe somewhere there
> are examples?

Not an answer to your question, but generally +1 for working on this area. 
I did some early proto-hacking a while ago, which I haven't had time to get back to yet: https://www.postgresql.org/message-id/flat/CADLWmXWALK8NPZqdnRQiPnrzAnic7NxYKynrkzO_vxYr8enWww%40mail.gmail.com That was based on the idea that a DISTINCT scan using a btree index to skip ahead is always going to be using the leading N columns and always going to be covered by the index, so I might as well teach Index Only Scan how to do it directly rather than making a new node to sit on top. As you can see in that thread I did start thinking about using a new node to sit on top and behave a bit like a nested loop for the more advanced feature of an Index Skip Scan (trying every value of (a) where you had an index on (a, b, c) but had a WHERE clause qual on (b, c)), but the only feedback I had so far was from Robert Haas who thought that too should probably be pushed into the index scan. FWIW I'd vote for 'skip' rather than 'loose' as a term to use in this family of related features (DISTINCT being the easiest to build). It seems more descriptive than the MySQL term. (DB2 added this a little while ago and went with 'index jump scan', suggesting that we should consider 'hop'... (weak humour, 'a hop, skip and a jump' being an English idiom meaning a short distance)). -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
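The loose/skip scan being discussed (fetch one row, then rescan with a key strictly past it) can be illustrated outside the executor. Below is a minimal Python sketch over a sorted list standing in for a btree index; all names are hypothetical, and in PostgreSQL the actual work would happen via index_rescan inside the scan node, not in application code:

```python
from bisect import bisect_right

def loose_scan_distinct(sorted_index):
    """Yield each distinct value by skipping ahead in a sorted 'index',
    instead of reading every entry and de-duplicating (Unique + IndexScan)."""
    pos = 0
    while pos < len(sorted_index):
        value = sorted_index[pos]
        yield value
        # "rescan" with a key strictly greater than the previous value:
        # O(log n) per distinct value rather than reading all n entries.
        pos = bisect_right(sorted_index, value, pos)

index = [1] * 500000 + [2] * 500000 + [7] * 3
print(list(loose_scan_distinct(index)))  # -> [1, 2, 7]
```

With few distinct values the skip approach touches only a handful of positions, which is exactly why the plan change pays off for "SELECT DISTINCT id" on a large table.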
Re: [HACKERS] GnuTLS support
On 09/15/2017 06:55 PM, Jeff Janes wrote: I can't build against gnutls-2.12.23-21.el6.x86_64 from CentOS 6.9 Thanks for testing my patch. I have fixed both these issues plus some of the other feedback. A new version of my patch is attached which should, at least on theory, support all GnuTLS versions >= 2.11. I just very quickly fixed the broken SSL tests, as I am no fan of how the SSL tests currently are written and think they should be cleaned up. Andreas diff --git a/configure b/configure index 0d76e5ea42..33b1f00bff 100755 --- a/configure +++ b/configure @@ -709,6 +709,7 @@ UUID_EXTRA_OBJS with_uuid with_systemd with_selinux +with_gnutls with_openssl krb_srvtab with_python @@ -838,6 +839,7 @@ with_bsd_auth with_ldap with_bonjour with_openssl +with_gnutls with_selinux with_systemd with_readline @@ -1534,6 +1536,7 @@ Optional Packages: --with-ldap build with LDAP support --with-bonjour build with Bonjour support --with-openssl build with OpenSSL support + --with-gnutls build with GnuTS support --with-selinux build with SELinux support --with-systemd build with systemd support --without-readline do not use GNU Readline nor BSD Libedit for editing @@ -6051,6 +6054,41 @@ fi $as_echo "$with_openssl" >&6; } +# +# GnuTLS +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with GnuTLS support" >&5 +$as_echo_n "checking whether to build with GnuTLS support... " >&6; } + + + +# Check whether --with-gnutls was given. +if test "${with_gnutls+set}" = set; then : + withval=$with_gnutls; + case $withval in +yes) + +$as_echo "#define USE_GNUTLS 1" >>confdefs.h + + ;; +no) + : + ;; +*) + as_fn_error $? 
"no argument expected for --with-gnutls option" "$LINENO" 5 + ;; + esac + +else + with_gnutls=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_gnutls" >&5 +$as_echo "$with_gnutls" >&6; } + + # # SELinux # @@ -10218,6 +10256,83 @@ done fi +if test "$with_gnutls" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing gnutls_init" >&5 +$as_echo_n "checking for library containing gnutls_init... " >&6; } +if ${ac_cv_search_gnutls_init+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_func_search_save_LIBS=$LIBS +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char gnutls_init (); +int +main () +{ +return gnutls_init (); + ; + return 0; +} +_ACEOF +for ac_lib in '' gnutls; do + if test -z "$ac_lib"; then +ac_res="none required" + else +ac_res=-l$ac_lib +LIBS="-l$ac_lib $ac_func_search_save_LIBS" + fi + if ac_fn_c_try_link "$LINENO"; then : + ac_cv_search_gnutls_init=$ac_res +fi +rm -f core conftest.err conftest.$ac_objext \ +conftest$ac_exeext + if ${ac_cv_search_gnutls_init+:} false; then : + break +fi +done +if ${ac_cv_search_gnutls_init+:} false; then : + +else + ac_cv_search_gnutls_init=no +fi +rm conftest.$ac_ext +LIBS=$ac_func_search_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_gnutls_init" >&5 +$as_echo "$ac_cv_search_gnutls_init" >&6; } +ac_res=$ac_cv_search_gnutls_init +if test "$ac_res" != no; then : + test "$ac_res" = "none required" || LIBS="$ac_res $LIBS" + +else + as_fn_error $? "library 'gnutls' is required for GnuTLS" "$LINENO" 5 +fi + + # GnuTLS versions before 3.4.0 do not support sorting incorrectly sorted + # certificate chains. 
+ ac_fn_c_check_decl "$LINENO" "GNUTLS_X509_CRT_LIST_SORT" "ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" "#include +#include + +" +if test "x$ac_cv_have_decl_GNUTLS_X509_CRT_LIST_SORT" = xyes; then : + ac_have_decl=1 +else + ac_have_decl=0 +fi + +cat >>confdefs.h <<_ACEOF +#define HAVE_DECL_GNUTLS_X509_CRT_LIST_SORT $ac_have_decl +_ACEOF + +fi + if test "$with_pam" = yes ; then { $as_echo "$as_me:${as_lineno-$LINENO}: checking for pam_start in -lpam" >&5 $as_echo_n "checking for pam_start in -lpam... " >&6; } @@ -11015,6 +11130,17 @@ else fi +fi + +if test "$with_gnutls" = yes ; then + ac_fn_c_check_header_mongrel "$LINENO" "gnutls/gnutls.h" "ac_cv_header_gnutls_gnutls_h" "$ac_includes_default" +if test "x$ac_cv_header_gnutls_gnutls_h" = xyes; then : + +else + as_fn_error $? "header file is required for GnuTLS" "$LINENO" 5 +fi + + fi if test "$with_pam" = yes ; then @@ -15522,9 +15648,11 @@ fi # in the template or configure command line. # If not selected manually, try to select a source automatically. -if test "$enable_strong_random" = "yes" && test x"$USE_OPENSSL_RANDOM" = x"" && test x"$USE_WIN32_RANDOM" = x"" && test x"$USE_DEV_URANDOM" = x"" ; then +if test "$enable_strong_random" = "yes" && test x"$USE_OPENSS
Re: [HACKERS] [GENERAL] Remove useless joins (VARCHAR vs TEXT)
David Rowley writes: > On 17 September 2017 at 08:07, Kim Rose Carlsen wrote: >> It seems there are some difference in VARCHAR vs TEXT when postgres tries to >> decide if a LEFT JOIN is useful or not. > Yeah, it looks like the code to check for distinctness in the subquery > fails to consider that the join condition may contain RelabelTypes > instead of plain Vars. > > The attached fixes. Looks like a good fix to me (except for the copied-and-pasted, not-quite-on-point comment ;-)). Pushed. regards, tom lane
Re: [HACKERS] Add Roman numeral conversion to to_number
Re: Peter Eisentraut 2017-08-14 > There are probably a bunch of Perl or Python modules that can be > employed for this. https://github.com/ChristophBerg/postgresql-numeral Christoph
Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation
On Mon, Sep 11, 2017 at 04:43:46PM +0900, Yuto Hayamizu wrote: > Hi hackers, > > Currently, cost of a filter with multiple clauses is estimated by > summing up estimated cost of each clause. As long as a filter > consists of simple clauses and its cost is fairly small, it works > fine. However, when there exists some heavy clauses (like SubQuery or > user-defined functions) and most of tuples are filtered by other > simple clauses, total cost is likely to be overestimated. An attached > patch mitigates this overestimation by using selectivity estimation of > subsets of clauses in a filter. I've taken the liberty of adding this to the upcoming commitfest. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
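The overestimation described above is easy to see with numbers. If clauses are evaluated left to right, a later clause only runs on tuples that survived the earlier ones, so its per-tuple cost should be weighted by the combined selectivity of the preceding clauses. A rough Python sketch of the two estimates (illustrative only; the planner's real costing lives in costsize.c, and these function names are mine):

```python
def naive_filter_cost(ntuples, clauses):
    """Sketch of the current behaviour: every clause is charged for every tuple."""
    return ntuples * sum(cost for cost, _sel in clauses)

def weighted_filter_cost(ntuples, clauses):
    """Mitigated estimate: clause i only sees tuples passing clauses 0..i-1."""
    total, surviving = 0.0, float(ntuples)
    for cost, selectivity in clauses:
        total += surviving * cost
        surviving *= selectivity
    return total

# A cheap, highly selective clause followed by an expensive SubPlan-like clause.
clauses = [(0.01, 0.001), (1000.0, 0.5)]
print(naive_filter_cost(1_000_000, clauses))     # ~1e9: grossly overestimated
print(weighted_filter_cost(1_000_000, clauses))  # expensive clause charged for ~1000 tuples only
```

The gap between the two numbers is precisely the overestimation the patch is after: in the naive model the expensive clause is charged for all one million tuples even though the cheap clause filters out 99.9% of them first.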
[HACKERS] Improving DISTINCT with LooseScan node
Hi hackers,

Everybody knows that we have inefficient execution of a query like "SELECT DISTINCT id FROM mytable" if the table has many, many rows and only several unique id values. The query plan looks like Unique + IndexScan.

I have tried to implement this feature in a new type of node called Loose Scan. This node must appear in the plan together with IndexScan or IndexOnlyScan, just like the Unique node does in this case. But instead of filtering rows with equal values, LooseScan must retrieve the first row from the IndexScan, then remember and return it. On all subsequent calls LooseScan must initiate index_rescan via ExecReScan with a search value that is > or < (depending on scan direction) the previous value. The total cost of this path must be something like total_cost = n_distinct_values * subpath->startup_cost. What do you think about this idea?

I was able to create the new LooseScan node, but I ran into difficulties with changing scan keys. I looked (for example) at the ExecReScanIndexOnlyScan function and I find it difficult to understand how I can reach the ioss_ScanKeys through the state of the executor. Can you help me with this?

I also looked at the Nested Loop node, which as I understand must change scan keys, but it didn't become clear to me. The only thought that came to my head is that maybe I incorrectly create this subplan. I create it just like create_upper_unique_plan, and create the subplan for IndexScan via create_plan_recurse. Can you tell me where to look, or maybe somewhere there are examples?

Thanks

Regards,
Dmitriy Sarafannikov
Re: [HACKERS] Range Merge Join v1
On Thu, Aug 31, 2017 at 1:52 AM, Jeff Davis wrote: > Updated patch attached. Changelog: > > * Rebased > * Changed MJCompare to return an enum as suggested, but it has 4 > possible values rather than 3. > * Added support for joining on contains or contained by (@> or <@) and > updated tests. The current patch does not work well with multiple keys, and I believe it's important to solve because it will make it usable for multi-dimension spatial joins. The problem is this: the algorithm for a single key demands that the input is sorted by (lower(r) NULLS FIRST, upper(r) NULLS LAST). That's easy, because that's also the sort order for the range operator class, so everything just works. For multiple keys, the order of the input is: lower(r1) NULLS FIRST, lower(r2) NULLS FIRST, upper(r1) NULLS LAST, upper(r2) NULLS LAST But that can't match up with any opclass, because an opclass can only order one attribute at a time. In this case, the lower bound of r2 is more significant than the upper bound of r1. It's easy enough to adapt the execution to make this work as long as the input is properly sorted. The challenge is making the optimizer choose the sort keys properly. I have tried a few approaches. The current approach that I'm using is: * have a new, special range operator family with no operator classes * in check_mergejoinable(), detect the && operator and set the mergeopfamilies to contain only the special operator family * in try_mergejoin_path(), it will convert the pathkeys using that operator class into pathkeys over a "lower" expression over the same EC; and then add on additional pathkeys for the "upper" expressions onto the end of the pathkey list Any comments or alternative suggestions welcome. This will probably take a few days at least, so I put the patch in "waiting on author" state. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
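For a single range key, the merge strategy can be sketched compactly: with both inputs sorted by (lower, upper), the join keeps a start marker into the inner side that only ever moves forward, because inner ranges ending before the current outer lower bound can never match any later outer range. A toy Python version over half-open (lo, hi) pairs follows; it is a sketch of the idea only, not of the patch's executor code, which operates on PostgreSQL range types and the && operator:

```python
def range_merge_join(outer, inner):
    """Join two lists of (lo, hi) ranges, each sorted by (lo, hi), on overlap.
    Half-open intervals: (lo, hi) overlaps (lo2, hi2) iff lo < hi2 and lo2 < hi."""
    result = []
    start = 0  # inner position where overlaps for the current outer may begin
    for olo, ohi in outer:
        # Inner ranges entirely below the outer range are dead forever,
        # since every later outer range has lo >= olo.
        while start < len(inner) and inner[start][1] <= olo:
            start += 1
        i = start
        # Inner is sorted by lo, so once lo >= ohi nothing further can overlap.
        while i < len(inner) and inner[i][0] < ohi:
            if olo < inner[i][1]:
                result.append(((olo, ohi), inner[i]))
            i += 1
    return result

outer = [(1, 3), (2, 6), (8, 10)]
inner = [(0, 2), (5, 9)]
print(range_merge_join(outer, inner))
# -> [((1, 3), (0, 2)), ((2, 6), (5, 9)), ((8, 10), (5, 9))]
```

The multi-key difficulty described in the mail is exactly that this forward-only marker logic relies on the (lower, upper) sort order, which for multiple range keys interleaves bounds across attributes and therefore cannot come from a single per-attribute opclass.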
Re: [HACKERS] Add Roman numeral conversion to to_number
On Sun, Sep 17, 2017 at 6:43 PM, David Fetter wrote: > On Sat, Sep 16, 2017 at 10:42:49PM +, Douglas Doole wrote: > > Oliver, I took a look at your tests and they look thorough to me. > > > > One recommendation, instead of having 3999 separate selects to test every > > legal roman numeral, why not just do something like this: > > > > do $$ > > declare > > i int; > > rn text; > > rn_val int; > > begin > > for i in 1..3999 loop > > rn := trim(to_char(i, 'rn')); > > rn_val := to_number(rn, 'rn'); > > if (i <> rn_val) then > > raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val; > > end if; > > end loop; > > raise notice 'Tested roman numerals 1..3999'; > > end; > > $$; > > > > It's a lot easier to maintain than separate selects. > > Why not just one SELECT, as in: > > SELECT i, to_char(i, 'rn'), to_number(to_char(i, 'rn'), 'rn'); > FROM generate_series(1,3999) i > Question: What is our definition of a legal Roman numeral? For example sometimes IXX appears in the corpus to refer to 19 even though our standardised notation would be XIX. > > Best, > David. > -- > David Fetter http://fetter.org/ > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter > Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com > > Remember to vote! > Consider donating to Postgres: http://www.postgresql.org/about/donate > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Best Regards, Chris Travers Database Administrator Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com Saarbrücker Straße 37a, 10405 Berlin
Re: [HACKERS] Add Roman numeral conversion to to_number
On Sat, Sep 16, 2017 at 10:42:49PM +, Douglas Doole wrote:
> Oliver, I took a look at your tests and they look thorough to me.
>
> One recommendation, instead of having 3999 separate selects to test every
> legal roman numeral, why not just do something like this:
>
> do $$
> declare
>   i int;
>   rn text;
>   rn_val int;
> begin
>   for i in 1..3999 loop
>     rn := trim(to_char(i, 'rn'));
>     rn_val := to_number(rn, 'rn');
>     if (i <> rn_val) then
>       raise notice 'Mismatch: i=% rn=% rn_val=%', i, rn, rn_val;
>     end if;
>   end loop;
>   raise notice 'Tested roman numerals 1..3999';
> end;
> $$;
>
> It's a lot easier to maintain than separate selects.

Why not just one SELECT, as in:

SELECT i, to_char(i, 'rn'), to_number(to_char(i, 'rn'), 'rn')
FROM generate_series(1,3999) i;

Best,
David.
--
David Fetter http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Re: [HACKERS] Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE
On Wed, Sep 6, 2017 at 4:14 PM, Rafia Sabih wrote: > I worked on this idea of using local queue as a temporary buffer to > write the tuples when master is busy and shared queue is full, and it > gives quite some improvement in the query performance. > I have done some initial review of this patch and I have some comments. /* this is actual size for this tuple which will be written in queue */ + tuple_size = MAXALIGN(sizeof(Size)) + MAXALIGN(iov.len); + + /* create and attach a local queue, if it is not yet created */ + if (mqh->mqh_local_queue == NULL) + mqh = local_mq_attach(mqh); I think we can create the local queue when first time we need it. So basically you can move this code inside else part where we first identify that there is no space in the shared queue. -- + /* write in local queue if there is enough space*/ + if (local_space > tuple_size) I think the condition should be if (local_space >= tuple_size) -- + while(shm_space <= 0) + { + if (shm_mq_is_detached(mqh->mqh_queue)) + return SHM_MQ_DETACHED; + + shm_space = space_in_shm(mqh->mqh_queue); + } Instead of waiting in CPU intensive while loop we should wait on some latch, why can't we use wait latch of the shared queue and whenever some space free, the latch will be set then we can recheck the space and if it's enough we can write to share queue otherwise wait on the latch again. Check other similar occurrences. - + if (read_local_queue(lq, true) && shm_space > 0) + copy_local_to_shared(lq, mqh, false); + Instead of adding shm_space > 0 in check it can be Assert or nothing because above loop will only break if shm_space > 0. + /* + * create a local queue, the size of this queue should be way higher + * than PARALLEL_TUPLE_QUEUE_SIZE + */ + char *mq; + Size len; + + len = 6553600; Create some macro which is multiple of PARALLEL_TUPLE_QUEUE_SIZE, --- + /* this local queue is not required anymore, hence free the space. 
*/ + pfree(mqh->mqh_local_queue); + return; +} I think you can remove the return at the end of the void function. - In empty_queue(shm_mq_handle *mqh) function I see that only last exit path frees the local queue but not all even though local queue is created. Other cosmetic issues. - +/* check the space availability in local queue */ +static uint64 +space_in_local(local_mq *lq, Size tuple_size) +{ + uint64 read, written, used, available, ringsize, writer_offset, reader_offset; + + ringsize = lq->mq_ring_size; + read = lq->mq_bytes_read; + written = lq->mq_bytes_written; + used = written - read; + available = ringsize - used; + Alignment is not correct, I think you can run pgindent once. + /* check is there is required space in shared queue */ statement need refactoring. "check if there is required space in shared queue" ? - + /* write in local queue if there is enough space*/ space missing before comment end. - + +/* Routines required for local queue */ + +/* + * Initialize a new local message queue, this is kept quite similar to shm_mq_create. + */ Header comments formatting is not in sync with other functions. - +} +/* routine to create and attach local_mq to the shm_mq_handle */ one blank line between two functions. - + bool detached; + + detached = false; a better way is bool detached = false; - Compilation warning shm_mq.c: In function ‘write_in_local_queue’: shm_mq.c:1489:32: warning: variable ‘tuple_size’ set but not used [-Wunused-but-set-variable] uint64 bytes_written, nbytes, tuple_size; -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
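The space accounting being reviewed here is the usual ring-buffer arithmetic: used = written - read, available = size - used. A small Python sketch of the fallback the patch implements, spilling tuples to a local queue when the shared queue is full; this is illustrative only (the real code works on shm_mq byte counters and waits on the leader's latch rather than returning):

```python
class RingQueue:
    """Byte-counter ring queue: monotonically increasing read/write offsets."""
    def __init__(self, size):
        self.size = size
        self.bytes_read = 0
        self.bytes_written = 0

    def space(self):
        used = self.bytes_written - self.bytes_read
        return self.size - used

    def write(self, nbytes):
        assert nbytes <= self.space()
        self.bytes_written += nbytes

def send_tuple(shared, local, tuple_size):
    """Prefer the shared queue; spill to the local queue when it is full.
    Note the >= comparisons, matching the review comment above: a tuple
    exactly filling the remaining space must still be writable."""
    if shared.space() >= tuple_size:
        shared.write(tuple_size)
        return "shared"
    if local.space() >= tuple_size:
        local.write(tuple_size)
        return "local"
    return "wait"  # real code should wait on a latch here, not busy-loop

shared, local = RingQueue(100), RingQueue(1000)
print(send_tuple(shared, local, 80))  # -> shared
print(send_tuple(shared, local, 80))  # -> local (shared has only 20 bytes left)
```

The sketch also makes the latch point concrete: when both queues are full the only correct options are blocking on a notification or returning control, never spinning on space_in_shm().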
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
On Mon, Sep 4, 2017 at 11:18 AM, Dilip Kumar wrote:
> On Thu, Aug 31, 2017 at 11:27 PM, Robert Haas wrote:
>
> I have repeated one of the tests after fixing the problems pointed out by
> you, but this time the results are not that impressive. It seems the below
> check was the problem in the previous patch:
>
>    if (tbm->nentries > tbm->maxentries / 2)
>        tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
>
> Because we were lossifying only till tbm->nentries becomes 90% of
> tbm->maxentries, but later we had this check, which will always be true,
> and tbm->maxentries will be doubled; that was the main reason for the huge
> reduction of lossy pages: basically, we started using more work_mem in
> all the cases.
>
> I have taken one reading just to see the impact after fixing the
> problem with the patch.
>
> Work_mem: 40 MB
> (Lossy Pages count)
>
> Query   head      patch
> 6       995223    733087
> 14      337894    206824
> 15      995417    798817
> 20      1654016   1588498
>
> Still, we see a good reduction in lossy pages count. I will perform
> the test at different work_mem and for different values of
> TBM_FILFACTOR and share the numbers soon.

I haven't yet completely measured the performance with the executor lossification change; meanwhile, I have worked on some of the comments on the optimizer change and taken the performance numbers again. I still see good improvement in the performance (almost 2x for some of the queries), and with the new method of lossy pages calculation I don't see a regression in Q14 (now Q14 is not changing its plan). I used lossy_pages = max(0, total_pages - maxentries / 2), as suggested by Alexander.

Performance Results:
Machine: Intel 56 core machine (2 NUMA nodes)
work_mem: varies.
TPCH S.F: 20
Median of 3 runs. 
work_mem = 4MB

Query   Patch(ms)   Head(ms)    Change in plan
4       4686.186    5039.295    PBHS -> PSS
5       26772.192   27500.800   BHS -> SS
6       6615.916    7760.005    PBHS -> PSS
8       6370.611    12407.731   PBHS -> PSS
15      17493.564   24242.256   BHS -> SS

work_mem = 20MB

Query   Patch(ms)   Head(ms)    Change in plan
6       6656.467    7469.961    PBHS -> PSS
8       6116.526    12300.784   PBHS -> PSS
15      17873.726   22913.421   BHS -> PSS

work_mem = 64MB

Query   Patch(ms)   Head(ms)    Change in plan
15      14900.881   27460.093   BHS -> PBHS

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

improve_bitmap_cost_v2.patch
Description: Binary data
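The estimator change used above is simple enough to state directly: when the bitmap would need more entries than fit in work_mem, the overflow is assumed to be absorbed by lossifying pages, with the maxentries / 2 term reflecting that a capped bitmap can hold roughly half its entries as exact pages once lossification kicks in. A hedged Python sketch of the proposed formula (the function name is mine; the real estimate is computed in the planner's bitmap costing):

```python
def estimate_lossy_pages(total_pages, maxentries):
    """Estimated number of lossy pages in a TIDBitmap capped at maxentries,
    per the proposal lossy_pages = max(0, total_pages - maxentries / 2)."""
    return max(0, total_pages - maxentries // 2)

print(estimate_lossy_pages(1000, 5000))  # bitmap fits comfortably: 0 lossy pages
print(estimate_lossy_pages(1000, 600))   # -> 700 pages expected to go lossy
```

Feeding this estimate into the cost model is what lets the planner prefer a plain sequential scan (SS/PSS) once most bitmap pages would be lossy anyway, which matches the plan changes listed in the tables above.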
Re: [HACKERS] SQL/JSON in PostgreSQL
On Sun, Sep 17, 2017 at 11:08 AM, Oleg Bartunov wrote: > On 16 Sep 2017 02:32, "Nikita Glukhov" wrote: > > On 15.09.2017 22:36, Oleg Bartunov wrote: > > On Fri, Sep 15, 2017 at 7:31 PM, Robert Haas >> wrote: >> >>> On Fri, Sep 15, 2017 at 10:10 AM, Daniel Gustafsson >>> wrote: >>> Can we expect a rebased version of this patch for this commitfest? Since it’s a rather large feature it would be good to get it in as early as we can in the process. >>> Again, given that this needs a "major" rebase and hasn't been updated >>> in a month, and given that the CF is already half over, this should >>> just be bumped to the next CF. We're supposed to be trying to review >>> things that were ready to go by the start of the CF, not the end. >>> >> We are supporting v10 branch in our github repository >> https://github.com/postgrespro/sqljson/tree/sqljson_v10 >> >> Since the first post we made a lot of changes, mostly because of >> better understanding the standard and availability of technical report >> (http://standards.iso.org/ittf/PubliclyAvailableStandards/c0 >> 67367_ISO_IEC_TR_19075-6_2017.zip). >> Most important are: >> >> 1.We abandoned FORMAT support, which could confuse our users, since we >> have data types json[b]. >> >> 2. We use XMLTABLE infrastructure, extended for JSON_TABLE support. >> >> 3. Reorganize commits, so we could split one big patch by several >> smaller patches, which could be reviewed independently. >> >> 4. The biggest problem is documentation, we are working on it. >> >> Nikita will submit patches soon. >> > > Attached archive with 9 patches rebased onto latest master. 
> > 0001-jsonpath-v02.patch: > - jsonpath type > - jsonpath execution on jsonb type > - jsonpath operators for jsonb type > - GIN support for jsonpath operators > > 0002-jsonpath-json-v02.patch: > - jsonb-like iterators for json type > - jsonpath execution on json type > - jsonpath operators for json type > > 0003-jsonpath-extensions-v02.patch: > 0004-jsonpath-extensions-tests-for-json-v02.patch: > - some useful standard extensions with tests > 0005-sqljson-v02.patch: > - SQL/JSON constructors (JSON_OBJECT[AGG], JSON_ARRAY[AGG]) > - SQL/JSON query functions (JSON_VALUE, JSON_QUERY, JSON_EXISTS) > - IS JSON predicate > > 0006-sqljson-json-v02.patch: > - SQL/JSON support for json type and tests > > 0007-json_table-v02.patch: > - JSON_TABLE using XMLTABLE infrastructure > > 0008-json_table-json-v02.patch: > - JSON_TABLE support for json type > > 0009-wip-extensions-v02.patch: > - FORMAT JSONB > - jsonb to/from bytea casts > - jsonpath operators > - some unfinished jsonpath extensions > > > Originally, JSON path was implemented only for jsonb type, and I decided to > add jsonb-like iterators for json type for json support implementation with > minimal changes in JSON path code. This solution (see jsonpath_json.c from > patch 0002) looks a little dubious to me, so I separated json support into > independent patches. > > The last WIP patch 0009 is unfinished and contains a lot of FIXMEs. But > the ability to use arbitrary Postgres operators in JSON path with > explicitly > specified types is rather interesting, and I think it should be shown now > to get a some kind of pre-review. 
> We are supporting v11 and v10 branches in our github repository:
>
> https://github.com/postgrespro/sqljson/tree/sqljson
> https://github.com/postgrespro/sqljson/tree/sqljson_wip
> https://github.com/postgrespro/sqljson/tree/sqljson_v10
> https://github.com/postgrespro/sqljson/tree/sqljson_v10_wip
>
> We provide a web interface to our build:
> http://sqlfiddle.postgrespro.ru/#!21/

+1. For experimenting with SQL/JSON, select "PostgreSQL 10dev+SQL/JSON" in the version select field on the top toolbar.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] Add Roman numeral conversion to to_number
On 3 Aug 2017 16:29, "Oliver Ford" wrote:

Adds to the to_number() function the ability to convert Roman numerals to a number. This feature is on the formatting.c TODO list. It is not currently implemented in either Oracle, MSSQL or MySQL so gives PostgreSQL an edge :-)

I see use of this in full text search as a dictionary. It's useful for indexing and searching historical documents. Probably better to have as contrib.

==Usage==

Call: to_number(numerals, 'RN') or to_number(numerals, 'rn').

Example: to_number('MMMXIII', 'RN') returns 3013. to_number('xiv', 'rn') returns 14.

The function is case insensitive for the numerals. It accepts a mix of cases and treats them the same. So to_number('MCI', 'rn') and to_number('McI', 'RN') both return 1101. The format mask must however be either 'RN' or 'rn'. If there are other elements in the mask, those other elements will be ignored. So to_number('MMM', 'FMRN') returns 3000. Whitespace before the numerals is ignored.

==Validation==

The new function roman_to_int() in formatting.c performs the conversion. It strictly validates the numerals based on the following Roman-to-Arabic conversion rules:

1. The power-of-ten numerals (I, X, C, M) can be repeated up to three times in a row. The beginning-with-5 numerals (V, L, D) can each appear only once.
2. Subtraction from a power-of-ten numeral cannot occur if a beginning-with-5 numeral appears later.
3. Subtraction cannot occur if the smaller numeral is less than a tenth of the greater numeral (so IX is valid, but IC is invalid).
4. There cannot be two subtractions in a row.
5. A beginning-with-5 numeral cannot subtract.

If any of these rules are violated, an error is raised.

==Testing==

This has been tested on a Windows build of the master branch with MinGW. The included regression tests positively test every value from 1 to 3999 (the Roman numeral max value) by calling the existing to_char() function to get the Roman value, then converting it back to an Arabic value. 
There are also negative tests for each invalid code path and some positive mixed-case tests. Documentation is updated to include this new feature.

==References==

http://sierra.nmsu.edu/morandi/coursematerials/RomanNumerals.html
Documents the strict Roman numeral standard.
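The strict rules above translate almost mechanically into code. A Python sketch of a case-insensitive converter follows; it is independent of the C implementation in formatting.c (the names here are mine), and it enforces the five rules by requiring the input to round-trip to the canonical spelling, which rejects forms like IIII, IC, VX, and double subtractions:

```python
def int_to_roman(n):
    """Canonical (strict) Roman numeral for 1..3999."""
    vals = [(1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'), (100, 'C'),
            (90, 'XC'), (50, 'L'), (40, 'XL'), (10, 'X'), (9, 'IX'),
            (5, 'V'), (4, 'IV'), (1, 'I')]
    out = []
    for v, sym in vals:
        while n >= v:
            out.append(sym)
            n -= v
    return ''.join(out)

def roman_to_int(s):
    """Strict parse: a smaller numeral before a larger one subtracts, and the
    result must round-trip to the canonical spelling (rules 1-5 above)."""
    s = s.strip().upper()
    digits = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}
    total = 0
    for cur, nxt in zip(s, s[1:] + ' '):
        v = digits[cur]
        total += -v if nxt != ' ' and digits[nxt] > v else v
    if not 1 <= total <= 3999 or int_to_roman(total) != s:
        raise ValueError(f"invalid Roman numeral: {s!r}")
    return total

# Round-trip check over the whole range, mirroring the regression tests above.
assert all(roman_to_int(int_to_roman(i)) == i for i in range(1, 4000))
print(roman_to_int('McI'))  # mixed case accepted -> 1101
```

The round-trip check is the same strategy the regression tests use with to_char() and to_number(), just inverted: any numeral that is not the canonical spelling of its value is rejected.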
Re: [HACKERS] Re: [COMMITTERS] pgsql: Perform only one ReadControlFile() during startup.
On 2017-09-16 13:27:05 -0700, Andres Freund wrote:
> > This does not seem like a problem that justifies a system-wide change
> > that's much more delicate than you thought.
>
> We need one more initialization call during crash-restart - that doesn't
> seem particularly hard a fix.

FWIW, attached is that simple fix. Not quite ready for commit - needs more
comments, more thought, and a glass of wine less to commit.

I'll try to come up with a tap test that tests crash restarts tomorrow - not
sure if there's a decent way to trigger one on windows without writing a C
function. Will play around with that tomorrow.

- Andres

>From 6728a5f4620270c1a116e295f316536861a317f4 Mon Sep 17 00:00:00 2001
From: Andres Freund
Date: Sun, 17 Sep 2017 01:00:39 -0700
Subject: [PATCH] Preliminary fix for crash-restart.

---
 src/backend/access/transam/xlog.c   | 4 ++--
 src/backend/postmaster/postmaster.c | 6 +++++-
 src/backend/tcop/postgres.c         | 2 +-
 src/include/access/xlog.h           | 2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b8f648927a..b475e3fb6b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4808,9 +4808,9 @@ check_wal_buffers(int *newval, void **extra, GucSource source)
  * memory. XLOGShmemInit() will then copy it to shared memory later.
  */
 void
-LocalProcessControlFile(void)
+LocalProcessControlFile(bool reset)
 {
-	Assert(ControlFile == NULL);
+	Assert(reset || ControlFile == NULL);
 	ControlFile = palloc(sizeof(ControlFileData));
 	ReadControlFile();
 }
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index e4f8f597c6..7ccf8dbd9d 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -951,7 +951,7 @@ PostmasterMain(int argc, char *argv[])
 	CreateDataDirLockFile(true);
 
 	/* read control file (error checking and contains config) */
-	LocalProcessControlFile();
+	LocalProcessControlFile(false);
 
 	/*
 	 * Initialize SSL library, if specified.
@@ -3829,6 +3829,10 @@ PostmasterStateMachine(void)
 			ResetBackgroundWorkerCrashTimes();
 
 			shmem_exit(1);
+
+			/* re-read control file into local memory */
+			LocalProcessControlFile(true);
+
 			reset_shared(PostPortNumber);
 
 			StartupPID = StartupDataBase();
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 46b662266b..dfd52b3c87 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3718,7 +3718,7 @@ PostgresMain(int argc, char *argv[],
 	CreateDataDirLockFile(false);
 
 	/* read control file (error checking and contains config) */
-	LocalProcessControlFile();
+	LocalProcessControlFile(false);
 
 	/* Initialize MaxBackends (if under postmaster, was done already) */
 	InitializeMaxBackends();
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e0635ab4e6..7213af0e81 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -261,7 +261,7 @@ extern XLogRecPtr GetFakeLSNForUnloggedRel(void);
 extern Size XLOGShmemSize(void);
 extern void XLOGShmemInit(void);
 extern void BootStrapXLOG(void);
-extern void LocalProcessControlFile(void);
+extern void LocalProcessControlFile(bool reset);
 extern void StartupXLOG(void);
 extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
-- 
2.14.1.536.g6867272d5b.dirty
Re: [HACKERS] SQL/JSON in PostgreSQL
On 16 Sep 2017 02:32, "Nikita Glukhov" wrote:

On 15.09.2017 22:36, Oleg Bartunov wrote:
> On Fri, Sep 15, 2017 at 7:31 PM, Robert Haas wrote:
>> On Fri, Sep 15, 2017 at 10:10 AM, Daniel Gustafsson wrote:
>>> Can we expect a rebased version of this patch for this commitfest?
>>> Since it’s a rather large feature it would be good to get it in as
>>> early as we can in the process.
>>
>> Again, given that this needs a "major" rebase and hasn't been updated
>> in a month, and given that the CF is already half over, this should
>> just be bumped to the next CF. We're supposed to be trying to review
>> things that were ready to go by the start of the CF, not the end.
>
> We are supporting the v10 branch in our github repository:
> https://github.com/postgrespro/sqljson/tree/sqljson_v10
>
> Since the first post we have made a lot of changes, mostly because of a
> better understanding of the standard and the availability of the
> technical report
> (http://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip).
> The most important are:
>
> 1. We abandoned FORMAT support, which could confuse our users, since we
>    have the data types json[b].
>
> 2. We use the XMLTABLE infrastructure, extended for JSON_TABLE support.
>
> 3. We reorganized the commits so that we could split one big patch into
>    several smaller patches, which can be reviewed independently.
>
> 4. The biggest problem is documentation; we are working on it.
>
> Nikita will submit patches soon.

Attached is an archive with 9 patches rebased onto the latest master.
0001-jsonpath-v02.patch:
 - jsonpath type
 - jsonpath execution on jsonb type
 - jsonpath operators for jsonb type
 - GIN support for jsonpath operators

0002-jsonpath-json-v02.patch:
 - jsonb-like iterators for json type
 - jsonpath execution on json type
 - jsonpath operators for json type

0003-jsonpath-extensions-v02.patch:
0004-jsonpath-extensions-tests-for-json-v02.patch:
 - some useful standard extensions, with tests

0005-sqljson-v02.patch:
 - SQL/JSON constructors (JSON_OBJECT[AGG], JSON_ARRAY[AGG])
 - SQL/JSON query functions (JSON_VALUE, JSON_QUERY, JSON_EXISTS)
 - IS JSON predicate

0006-sqljson-json-v02.patch:
 - SQL/JSON support for json type, with tests

0007-json_table-v02.patch:
 - JSON_TABLE using the XMLTABLE infrastructure

0008-json_table-json-v02.patch:
 - JSON_TABLE support for json type

0009-wip-extensions-v02.patch:
 - FORMAT JSONB
 - jsonb to/from bytea casts
 - jsonpath operators
 - some unfinished jsonpath extensions

Originally, JSON path was implemented only for the jsonb type, and I decided
to add jsonb-like iterators for the json type so that json support could be
implemented with minimal changes to the JSON path code. This solution (see
jsonpath_json.c in patch 0002) looks a little dubious to me, so I separated
json support into independent patches.

The last WIP patch, 0009, is unfinished and contains a lot of FIXMEs. But the
ability to use arbitrary Postgres operators in a JSON path with explicitly
specified types is rather interesting, and I think it should be shown now to
get some kind of pre-review.

We are supporting v11 and v10 branches in our github repository:
https://github.com/postgrespro/sqljson/tree/sqljson
https://github.com/postgrespro/sqljson/tree/sqljson_wip
https://github.com/postgrespro/sqljson/tree/sqljson_v10
https://github.com/postgrespro/sqljson/tree/sqljson_v10_wip

We provide a web interface to our build:
http://sqlfiddle.postgrespro.ru/#!21/

The attached patches can be produced simply by combining groups of
consecutive commits from these branches.
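For readers unfamiliar with the feature set above, here is a brief sketch of what the query functions from patches 0005 and 0007 look like in use. The syntax follows the SQL:2016 technical report; these queries only run against a build with the patches applied, and the literal documents here are invented for illustration, not taken from the patch's regression tests:

```sql
-- IS JSON predicate (patch 0005): check that a string is valid JSON
SELECT '{"a": 1}' IS JSON;

-- JSON_EXISTS: does the path match anything in the document?
SELECT JSON_EXISTS('{"a": [1, 2, 3]}', '$.a[2]');

-- JSON_VALUE: extract a scalar, optionally coerced via RETURNING
SELECT JSON_VALUE('{"a": [1, 2, 3]}', '$.a[1]' RETURNING int);

-- JSON_QUERY: extract a JSON fragment rather than a scalar
SELECT JSON_QUERY('{"a": [1, 2, 3]}', '$.a');

-- JSON_TABLE (patch 0007): shred a JSON array into rows and columns,
-- using the same row-generation machinery as XMLTABLE
SELECT jt.*
FROM JSON_TABLE('[{"x": 10}, {"x": 20}]', '$[*]'
                COLUMNS (x int PATH '$.x')) AS jt;
```

The JSON_TABLE example illustrates why reusing the XMLTABLE infrastructure is attractive: both features share the same shape of row-set-returning table function, differing mainly in the path language used to drive row generation.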
--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company