date:20160305

Re: [HACKERS] Proposal: SET ROLE hook

2016-03-05 Thread Pavel Stehule

2016-03-05 21:49 GMT+01:00 Joe Conway :

> On 03/01/2016 08:18 AM, Pavel Stehule wrote:
> > 2016-03-01 16:52 GMT+01:00 Joe Conway:
> > On 03/01/2016 02:09 AM, Pavel Stehule wrote:
> > > > On 01/06/2016 10:36 AM, Tom Lane wrote:
> > > >> I think a design that was actually somewhat robust would
> require two
> > > >> hooks, one at check_role and one at assign_role, wherein
> the first one
> > > >> would do any potentially-failing work and package all
> required info into
> > > >> a blob that could be passed through to the assign hook.
> >
> > > I see following issues:
> > >
> > > 1. Missing the possibility to pass custom data from
> SetRoleCheck_hook to
> > > SetRoleAssign_hook. Tom mentioned it in his comment.
> >
> > You can pass the data between the two plugin hook functions in your
> > extension itself, so I see no need to try to pass custom data through
> > the backend. Do any of the other hooks even do that?
> >
> > I don't know about it, but these hooks are specific. is it ensured a
> > order of calls of these hooks?
>
> > > 2. Missing little bit more comments and an explanation why and
> when to
> > > use these hooks.
> >
> > Doesn't look all that different from existing user hooks to me, but
> > sure, I'll add a bit more to the comments.
>
> > some like "I think the main point was to have two hooks. The
> > potentially-failing
> > work could be done during the check_role() hook and the collected info
> > could be used during the assign_role() hook."
>
> Ok, I added a comment similar to that at the check_role() function hook
> call site. Updated patch attached.
>

the comments are good enough now


>
> I still don't see any point in trying to pass data back from the hooks
> as the extension can maintain that state just fine, although it looks
> like it would be pretty trivial to do using a new void* member added to
> role_auth_extra. Tom (or anyone else), any comment on that?
>

see Tom's comment, I share his opinion.


>
> I do however find myself wishing I could pass the action down from
> set_config_option() to at least the assign_role() hook, but that seems
> more invasive than I'd like.
>

describe this use case, please.

In this case, I don't afraid of some possible API changes in future. This
API is "invisible" for common user. So it should not to handle all possible
use cases.

For me, the possibility to control private data is in category better to
have, the extension can be safer, smaller, but probably 75% of potential
customer will be happy from current state of this patch.

On the other hand, if we can do some simple changes (few or little bit more
than few lines), why don't do it now?

Pavel


>
> Joe
>
> --
> Crunchy Data - http://crunchydata.com
> PostgreSQL Support for Secure Enterprises
> Consulting, Training, & Open Source Development
>

Re: [HACKERS] How can we expand PostgreSQL ecosystem?

2016-03-05 Thread Craig Ringer

On 6 March 2016 at 13:29, MauMau  wrote:

> However, there is a problem.  The number of software is very small that
> the users can know to interoperate with PostgreSQL.
>

Yep, even among many big and popular OSS apps.

> Many applications might be interoperable through standard interfaces like
> JDBC/ODBC, but the case is unknown.
>

Not tons, in my experience, unless they're doing only trivial SQL. Many
*are* PostgreSQL-compatible via higher level abstractions like
Hibernate/JPA, ActiveRecord, etc, but those are also the systems that are
least interesting from a database PoV since they generally use only the
lowest common denominator of DB features, with the DB treated as not much
more than a dumb row store.

There *is* some cool stuff we could do with ORMs if we'd stop sneering at
them for long enough. I'd love to see the ability to omit repeated values
on a LEFT JOIN for example, where we project the left tuple completely the
first time we see it, i.e. when the key changes, then we emit only the key
when we project it again for subsequent right-hand rows. This would save
ORMs an immense amount of computational effort and network bandwidth... and
you only have to patch each ORM to use it, not each app.

Even more interesting would be to emit JSON as a left-join product, so a
typical chained left join to fetch 'customer' with 'account' and 'contact'
for each 'account' would emit

[
  {
customer_id: '42',
...
accounts: [
  {
account_id = 1,
...,
contacts: [
  {
contact_id = 4,
...
  },
  {
contact_id = 5,
...
  }
]
  },
  {
account_id = 2,
...
  }
]
  },
  {
customer_id = 43,
...
  }
]

i.e. emit the requested data in the form the application actually wants,
without needing to produce a giant relational projection and then process
it in the application.

We could help ORMs solve the N+1 SELECTs problem and help them avoid
transferring vast join projections unnecessarily. That'd make PostgreSQL
pretty compelling for exactly the users we're mostly too busy dismissing to
consider.

When you consider the adoption of PostgreSQL in the Rails/ActiveRecord
crowd I think there's some interesting potential there. What there isn't is
funding AFAIK.

I'd be interested in reaching out to some Hibernate/JPA and ActiveRecord
folks about how the DB could help the ORM and have been meaning to explore
this area for a while, but -ENOTIME. If anyone pursues it I'll be really
interested in hearing how things go.

>  Besides, in practice, we probably should increase the number of software
> interoperable with PostgreSQL.  e.g. one customer asked us whether Arcserve
> can be used to back up PostgreSQL databases, but unfortunately we had to
> answer no.  They are using Arcserve to back up Oracle databases and other
> resources.  "Then, you can use NetVault instead" is not the best answer;
> they just want to replace the database.
>

The "we" here is the problem. It's not likely to be folks focused on
PostgreSQL core dev, but ... who, exactly?

> Provide technical assistance to those vendors as an organization so that
> they can support PostgreSQL smoothly.
>

This one is a help. That said, pgsql-general is pretty helpful already...

> * Make a directory of software/services that can be used with PostgreSQL
> on the community web site (wiki.postgresql.org or www.postgresql.org).
> Software/services vendors and PostgreSQL developers/users can edit this
> directory.
>

I thought we had that? Yep.

http://www.postgresql.org/download/product-categories/

It's pretty invisible though, partly due to the postgresql.org landing
page's need for a trim-down and tidy. (Don't even get me started on
http://www.postgresql.org/about/ )

> * How/Where can we get the knowledge of expanding the software ecosystem?
> Is there any OSS project that we can learn from?
>

Mongo.

Despite the real world capabilities being ... patchy ... and the magic
autoscaling claims etc being a bit dubious, they've successfully marketed
it as the magic solution for everything.

This leads to a fair bit of disillusionment, but also massive adoption. I
think they take it too far, but try opening these two pages side by side:

http://www.postgresql.org/
https://www.mongodb.org/

... and tell me which you'd look over first if you were evaluating things.

> How can we attract software vendors to support PostgreSQL?  What words are
> convincing to appeal the increasing potential of PostgreSQL as a good
> replacement for commercial databases?
>

Change the name :p

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Proposal: SET ROLE hook

2016-03-05 Thread Tom Lane

Joe Conway  writes:
> I still don't see any point in trying to pass data back from the hooks
> as the extension can maintain that state just fine, although it looks
> like it would be pretty trivial to do using a new void* member added to
> role_auth_extra. Tom (or anyone else), any comment on that?

The point of packaging GUC-related state into a blob that guc.c
knows about is that then the right things will happen when guc.c handles
something like a SET LOCAL, GUC reversion at subtransaction rollback,
SET clauses attached to functions, yadda yadda.  Are you sure your
extension can, or wants to, track all those possibilities for itself?

I remember thinking that we probably would need to extend role_auth_extra
to make this work, so I have no objection if you're finding that that's
actually the case.  Might need to think about how more than one hook
could include state into the blob.

(Note: I've not actually read this version of your patch, although
I could make time for that next week sometime.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] How can we expand PostgreSQL ecosystem?

2016-03-05 Thread Mark Kirkwood

On 06/03/16 18:29, MauMau wrote:
> As I said in the previous greeting mail, I'd like to discuss how to 
> expand PostgreSQL ecosystem.  Here, ecosystem means "interoperability" 
> -- the software products and cloud services which use/support 
> PostgreSQL.  If pgsql-advocacy or somewhere else is better for this 
> topic, just tell me so.

For cloud - in particular Openstack (which I am working with ATM), the
biggest thing would be:

- multi-master replication

or failing that:

- self managing single master failover (voting/quorum etc)

so that operators can essentially 'set and forget'. We currently use
Mysql+ Galera (multi master) and Mongodb (self managing single master)
and the convenience and simplicity is just so important (Openstack is a
huge complex collection of services - hand holding of any one service is
pretty much a non starter).

regards

Mark

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Amit Kapila

On Sat, Mar 5, 2016 at 10:11 PM, Tom Lane  wrote:
>
> Amit Kapila  writes:
> > On Fri, Mar 4, 2016 at 11:31 PM, Tom Lane  wrote:
> >> (BTW, I found what seemed to be a couple of pre-existing bugs of
> >> the same kind, eg create_mergejoin_path was different from the
> >> other two kinds of join as to setting parallel_degree.)
>
> > I think the reason for keeping parallel_degree as zero for mergejoin
path
> > is that currently it can't participate in parallelism.
>
> Is there some reason why hash and nestloop are safe but merge isn't?
>

I think it is because we consider to pushdown hash and nestloop to workers,
but not merge join and the reason for not pushing mergejoin is that
currently we don't have executor support for same, more work is needed
there.  I think even if we set parallel_degree as you are doing in patch
for merge join is harmless, but ideally there is no need to set it as far
as what we support today in terms of parallelism.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

[HACKERS] How can we expand PostgreSQL ecosystem?

2016-03-05 Thread MauMau

Hello,

As I said in the previous greeting mail, I'd like to discuss how to expand
PostgreSQL ecosystem. Here, ecosystem means "interoperability" -- the software
products and cloud services which use/support PostgreSQL. If pgsql-advocacy or
somewhere else is better for this topic, just tell me so.

THE BACKGROUND
==

Thanks to the long and hard efforts by the community, PostgreSQL has been
evolving to be a really great software comparable to existing strong commercial
products. Open source databases are gaining more popularity to influence the
database market.

Open source threatens to eat the database market
http://www.infoworld.com/article/2916057/open-source-software/open-source-threatens-to-eat-the-database-market.html

"Though the proprietary RDBMS market grew at a sluggish 5.4 percent in 2014,
the open source database market grew 31 percent to hit $562 million."
"As Gartner highlights in a recent research report, open source databases now
consume 25 percent of relational database usage."

Perhaps related to this is that the revenues of Oracle, IBM and Microsoft have
been declining (but I read in an article that SQL Server is gaining more
revenue).

On the other hand, there is a gulf between the two top popular databases --
Oracle and MySQL -- and PostgreSQL. They are nearly five times more popular
than PostgreSQL.

DB-Engines Ranking
http://db-engines.com/en/ranking

Yes, I understand this ranking doesn't necessarily reflect the actual use, but
I also don't think the ranking is far from the real popularity. In fact, some
surveys show that MySQL has been in more widespread use even here in Japan than
PostgreSQL since around 2010 (IIRC).

What should we do to boost the popularity of PostgreSQL? One challenge is to
increase the number of software which supports PostgreSQL. To take advantage
of the trend of shift from commercial products to open source, PostgreSQL needs
to interoperate with many software that are used together with the commercial
databases.

The easily understandable target is Oracle, because it is anticipated that more
users of Oracle will seek another database to avoid the expensive Oracle
Standard Edition 2 and increasing maintenance costs. In addition, PostgreSQL
has affinity for Oracle.

However, there is a problem. The number of software is very small that the
users can know to interoperate with PostgreSQL. That is, when the users want
to migrate from commercial databases to PostgreSQL, they can't get information
on whether they can continue to use their assets with PostgreSQL. Many
applications might be interoperable through standard interfaces like JDBC/ODBC,
but the case is unknown. For example:

* Only 24 open source projects are listed as interoperable.
Open Source Projects Using PostgreSQL
https://wiki.postgresql.org/wiki/OpenSource_Projects_Using_PostgreSQL

* Even EnterpriseDB has only 12 certified application vendors.
http://www.enterprisedb.com/partner-programs/enterprisedb-certified-application-vendors

* PostgreSQL Enterprise Consortium lists only about30 related products
(Japanese only).
https://www.pgecons.org/postgresql-info/business_sw/

* MySQL touts more than 2,000 ISV/OEM/VARs.
http://www.mysql.com/oem/

Besides, in practice, we probably should increase the number of software
interoperable with PostgreSQL. e.g. one customer asked us whether Arcserve can
be used to back up PostgreSQL databases, but unfortunately we had to answer no.
They are using Arcserve to back up Oracle databases and other resources.
"Then, you can use NetVault instead" is not the best answer; they just want to
replace the database.

PROPOSAL
==

Last month, I attended the steering committee of PostgreSQL Enterprise
Consortium (PGECons) for the first time and proposed starting the following
activity. PGECons is a Japanese non-profit organization to promote PostgreSQL
for enterprise use. The members include NTT, SRA OSS (Tatsuo Ishii runs), NEC,
Hitachi, HP, Fujitsu, etc. We concluded that we need to consult the PostgreSQL
community on how to proceed the activity and work in cooperation with the
community.

* Attract and ask product/service vendors to support/use PostgreSQL.
Provide technical assistance to those vendors as an organization so that they
can support PostgreSQL smoothly.
If the vendors aren't proactive, we verify the interoperability with their
software by executing it.

* Make a directory of software/services that can be used with PostgreSQL on the
community web site (wiki.postgresql.org or www.postgresql.org).
Software/services vendors and PostgreSQL developers/users can edit this
directory.
This list not only has the names of software and its vendors, but also other
information such as the level of interoperability (certified by the vendor, or
verified by the community/users) and remarks about configuration, tuning, an

Re: [HACKERS] Greeting for coming back, and where is PostgreSQL going

2016-03-05 Thread MauMau

Thankyou, Michael, Nagayasu-san, Ishii-san, and Joshua.  Your reply gave me 
energy!


I'm relieved to know that community people use Emacs for editing SGML/XML. 
My main editor on Linux is Emacs.



These days, there is a lot of discussion and activity to make Postgres
better at scaling out. There are discussions about backporting stuff
from XC/XL back to core, though that's a tough work. This thread is a
good summary of what is happening lately in this area:
http://www.postgresql.org/message-id/20160223164335.ga11...@momjian.us


Cool, exciting to know this!


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: SET ROLE hook

2016-03-05 Thread Craig Ringer

On 6 March 2016 at 04:49, Joe Conway  wrote:

> I still don't see any point in trying to pass data back from the hooks
> as the extension can maintain that state just fine, although it looks
> like it would be pretty trivial to do using a new void* member added to
> role_auth_extra. Tom (or anyone else), any comment on that?

I think it's possibly worth doing, but since auth is simple and linear I'm
not overly bothered. It'd hardly be the first place Pg relied on passing
information between functions using globals.

If you expect to daisy-chain hooks then a private-data member isn't too
helpful, since a prior hook in the call chain may have set it already and
you don't know what's there or how to manage it. So overall, I'm -1 for a
private_data arg, I don't think it gains us anything in lifetime
management, code clarity etc and it makes daisy-chaining way more complex.

I don't see what the point of SetRoleAssign_hook is, since it returns void
and doesn't have out parameters. If it's expected to takes some action,
what is it? If it's meant to modify myextra->roleid and
myextra->is_superuser prior to their use by SetCurrentRoleId they should be
passed pointer-indirected so they can be modified by the hook.

I'd like to see parameter names specified in the hook type definition.

It needs tests added, along with a note somewhere on usage of the hook that
mentions the usual pattern for using it, possibly in the test/example.
Something like:

static SetRoleCheck_hook_type previous_SetRoleCheck_hook = NULL;

void my_check_role(Oid sessionuserid, Oid wanted_roleid, Oid issuper)
{
  if (previous_SetRoleCheck_hook)
previous_SetRoleCheck_hook(sessionuserid, wanted_roleid, issuper);

  /* do my stuff here */
}

_PG_init()
{
  if (SetRoleCheck_hook)
  previous_SetRoleCheck_hook = SetRoleCheck_hook;
  SetRoleCheck_hook = my_check_role;
}

Also behaviour of each hook should be more completely specified. Can it
elog(ERROR)? What happens if it does? What can it safely change and what
not? Are there restrictions to what it can do in terms of access to
syscache/relcache/heap etc?

Why do you pass GetSessionUserId() to the hook given that the caller can
look it up directly? Just a convenience?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Andres Freund

On 2016-03-05 22:25:36 +0900, Michael Paquier wrote:
> OK, I hacked a v7:
> - Move the link()/rename() group with HAVE_WORKING_LINK into a single
> routine, making the previous link_safe renamed to replace_safe. This
> is sharing a lot of things with rename_safe. I am not sure it is worth
> complicating the code more this way by having a common single routine
> for whole. Thoughts welcome. Honestly, I kind of liked the separation
> with link_safe/rename_safe of previous patches because link_safe could
> have been directly used by extensions and plugins btw.
> - Remove the call of stat() in rename_safe() and implement a logic
> depending on OpenTransientFile()/pg_fsync() to flush any existing
> target file before performing the rename.
> Andres, feel free to use this patch as a base, perhaps that will help.

I started working on this; delayed by taking longer than planned on the
logical decoding stuff (quite a bit complicated by
e1a11d93111ff3fba7a91f3f2ac0b0aca16909a8).  I'm not very happy with the
error handling as it is right now.  For one, you have rename_safe return
error codes, and act on them in the callers, but on the other hand you
call fsync_fname which always errors out in case of failure.  I also
don't like the new messages much.

Will continue working on this tomorrow.

Andres Freund

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix handling of invalid sockets returned by PQsocket()

2016-03-05 Thread Alvaro Herrera

Peter Eisentraut wrote:
> On 2/17/16 10:52 PM, Michael Paquier wrote:
> > On Thu, Feb 18, 2016 at 1:58 AM, Alvaro Herrera
> >  wrote:
> >> Michael Paquier wrote:
> >>> Hi all,
> >>>
> >>> After looking at Alvaro's message mentioning the handling of
> >>> PQsocket() for invalid sockets, I just had a look by curiosity at
> >>> other calls of this routine, and found a couple of issues:
> >>> 1) In vacuumdb.c, init_slot() does not check for the return value of 
> >>> PQsocket():
> >>> slot->sock = PQsocket(conn);
> >>> 2) In isolationtester.c, try_complete_step() should do the same.
> >>> 3) In pg_recvlogical.c for StreamLogicalLog() I am spotting the same 
> >>> problem.
> >>> I guess those ones should be fixed as well, no?
> >>
> >> I patched pgbench to use PQerrorMessage rather than strerror(errno).  I
> >> think your patch should do the same.
> > 
> > OK, this looks like a good idea. I would suggest doing the same in
> > receivelog.c then.
> 
> Let's make the error messages consistent as "invalid socket".  "bad
> socket" isn't really our style, and pg_basebackup saying "socket not
> open" is just plain incorrect.

You're completely right.  Let's patch pgbench that way too.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

I wrote:
> Attached is a version that addresses today's concerns, and also finishes
> filling in the loose ends I'd left before, such as documentation and
> outfuncs.c support.  I think this is in a committable state now, though
> I plan to read through the whole thing again.

Sigh, forgot to press the magic button.  Let's try that again.

regards, tom lane



upper-planner-pathification-3.patch.gz
Description: upper-planner-pathification-3.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

Attached is a version that addresses today's concerns, and also finishes
filling in the loose ends I'd left before, such as documentation and
outfuncs.c support.  I think this is in a committable state now, though
I plan to read through the whole thing again.

regards, tom lane

#application/x-gzip; name="upper-planner-pathification-3.patch.gz" 
[upper-planner-pathification-3.patch.gz] 
/home/tgl/pgsql/upper-planner-pathification-3.patch.gz


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Performance improvement for joins where outer side is unique

2016-03-05 Thread David Rowley

On 5 March 2016 at 10:43, Alvaro Herrera  wrote:
> I wonder why do we have two identical copies of clause_sides_match_join ...

Yeah, I noticed the same a while back, and posted about it. Here was
the response:
http://www.postgresql.org/message-id/26820.1405522...@sss.pgh.pa.us


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fix handling of invalid sockets returned by PQsocket()

2016-03-05 Thread Peter Eisentraut

On 2/17/16 10:52 PM, Michael Paquier wrote:
> On Thu, Feb 18, 2016 at 1:58 AM, Alvaro Herrera
>  wrote:
>> Michael Paquier wrote:
>>> Hi all,
>>>
>>> After looking at Alvaro's message mentioning the handling of
>>> PQsocket() for invalid sockets, I just had a look by curiosity at
>>> other calls of this routine, and found a couple of issues:
>>> 1) In vacuumdb.c, init_slot() does not check for the return value of 
>>> PQsocket():
>>> slot->sock = PQsocket(conn);
>>> 2) In isolationtester.c, try_complete_step() should do the same.
>>> 3) In pg_recvlogical.c for StreamLogicalLog() I am spotting the same 
>>> problem.
>>> I guess those ones should be fixed as well, no?
>>
>> I patched pgbench to use PQerrorMessage rather than strerror(errno).  I
>> think your patch should do the same.
> 
> OK, this looks like a good idea. I would suggest doing the same in
> receivelog.c then.

Let's make the error messages consistent as "invalid socket".  "bad
socket" isn't really our style, and pg_basebackup saying "socket not
open" is just plain incorrect.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-05 Thread Andrew Dunstan




On 03/05/2016 01:31 PM, Michael Paquier wrote:

On Sat, Mar 5, 2016 at 11:34 PM, Andrew Dunstan  wrote:

Here is a translation into perl of the sed script, courtesy of the s2p
incarnation of psed: 
The sed script appears to have been stable for a long time, so I don't think
we need to be too concerned about possibly maintaining two versions.

That's 95% of the work already done, nice! If I finish wrapping up a
patch for this issue at least would you backpatch? It would be saner
to get rid of this dependency everywhere I think regarding compilation
with perl 5.22.


Sure.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: SET ROLE hook

2016-03-05 Thread Joe Conway

On 03/01/2016 08:18 AM, Pavel Stehule wrote:
> 2016-03-01 16:52 GMT+01:00 Joe Conway:
> On 03/01/2016 02:09 AM, Pavel Stehule wrote:
> > > On 01/06/2016 10:36 AM, Tom Lane wrote:
> > >> I think a design that was actually somewhat robust would require 
> two
> > >> hooks, one at check_role and one at assign_role, wherein the 
> first one
> > >> would do any potentially-failing work and package all required 
> info into
> > >> a blob that could be passed through to the assign hook.
> 
> > I see following issues:
> >
> > 1. Missing the possibility to pass custom data from SetRoleCheck_hook to
> > SetRoleAssign_hook. Tom mentioned it in his comment.
> 
> You can pass the data between the two plugin hook functions in your
> extension itself, so I see no need to try to pass custom data through
> the backend. Do any of the other hooks even do that?
> 
> I don't know about it, but these hooks are specific. is it ensured a
> order of calls of these hooks?

> > 2. Missing little bit more comments and an explanation why and when to
> > use these hooks.
> 
> Doesn't look all that different from existing user hooks to me, but
> sure, I'll add a bit more to the comments.

> some like "I think the main point was to have two hooks. The
> potentially-failing
> work could be done during the check_role() hook and the collected info
> could be used during the assign_role() hook."

Ok, I added a comment similar to that at the check_role() function hook
call site. Updated patch attached.

I still don't see any point in trying to pass data back from the hooks
as the extension can maintain that state just fine, although it looks
like it would be pretty trivial to do using a new void* member added to
role_auth_extra. Tom (or anyone else), any comment on that?

I do however find myself wishing I could pass the action down from
set_config_option() to at least the assign_role() hook, but that seems
more invasive than I'd like.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c
index 903b3a6..bcf5cc8 100644
*** a/src/backend/commands/variable.c
--- b/src/backend/commands/variable.c
***
*** 32,37 
--- 32,41 
  #include "utils/timestamp.h"
  #include "mb/pg_wchar.h"
  
+ /* Hooks for plugins to get control in check_role() and assign_role() */
+ SetRoleCheck_hook_type SetRoleCheck_hook = NULL;
+ SetRoleAssign_hook_type SetRoleAssign_hook = NULL;
+ 
  /*
   * DATESTYLE
   */
*** check_role(char **newval, void **extra,
*** 900,905 
--- 904,916 
  	myextra->is_superuser = is_superuser;
  	*extra = (void *) myextra;
  
+ 	/*
+ 	 * Potentially-failing work should be done here in the check_role()
+ 	 * hook and the collected info used during the assign_role() hook.
+ 	 */
+ 	if (SetRoleCheck_hook)
+ 		(*SetRoleCheck_hook) (GetSessionUserId(), roleid, is_superuser);
+ 
  	return true;
  }
  
*** assign_role(const char *newval, void *ex
*** 908,913 
--- 919,931 
  {
  	role_auth_extra *myextra = (role_auth_extra *) extra;
  
+ 	/*
+ 	 * Any hooks defined here must be able to execute in a failed
+ 	 * transaction to restore a prior value of the ROLE GUC variable.
+ 	 */
+ 	if (SetRoleAssign_hook)
+ 		(*SetRoleAssign_hook) (myextra->roleid, myextra->is_superuser);
+ 
  	SetCurrentRoleId(myextra->roleid, myextra->is_superuser);
  }
  
diff --git a/src/include/commands/variable.h b/src/include/commands/variable.h
index 8105951..f3870e9 100644
*** a/src/include/commands/variable.h
--- b/src/include/commands/variable.h
***
*** 12,17 
--- 12,22 
  
  #include "utils/guc.h"
  
+ /* Hooks for plugins to get control in check_role() and assign_role() */
+ typedef void (*SetRoleCheck_hook_type) (Oid, Oid, bool);
+ extern PGDLLIMPORT SetRoleCheck_hook_type SetRoleCheck_hook;
+ typedef void (*SetRoleAssign_hook_type) (Oid, bool);
+ extern PGDLLIMPORT SetRoleAssign_hook_type SetRoleAssign_hook;
  
  extern bool check_datestyle(char **newval, void **extra, GucSource source);
  extern void assign_datestyle(const char *newval, void *extra);


signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] Relaxing SSL key permission checks

2016-03-05 Thread Christoph Berg

Re: Alvaro Herrera 2016-03-04 <20160304205521.GA735336@alvherre.pgsql>
> For the sake of cleanliness, I propose splitting out the checks for
> regular file and for owned-by-root-or-us from the actual chmod-level
> checks at the same time.  That way we can provide more specific error
> messages for each case.  (Furthermore, I'm pretty sure that the check
> for superuserness could be applied on Windows also -- in the attached
> patch it's still #ifdef'ed out, because I don't know how to write it.)

Thanks, this is a reasonable improvement. I've included your patch in
the 9.6 Debian package and also implemented some regression tests in
postgresql-common to make sure the bad permissions are indeed catched.

> After doing that change I started to look at the details of the check
> and found some mistakes:
> 
> * it said "g=w" instead of "g=r", in contradiction with the numeric
> specification: g=w means 020 rather than 040.  We want the file to be
> group-readable, not group-writeable.

Typo on my side, sorry.

> * it failed to check for S_IXUSR, so permissions 0700 were okay, in
> contradiction with what the error message indicates.  This is a
> preexisting bug actually.  Do we want to fix it by preventing a
> user-executable file (possibly breaking compability with existing
> executable key files), or do we want to document what the restriction
> really is?

x on regular files doesn't do anything, so it wouldn't matter, but of
course syncing the code with the error message makes sense. I don't
think 0700 is a case that is seen in the wild (in contrast to 0777
files), and even if so, the error message clearly says what the
problem is.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

I wrote:
> Amit Kapila  writes:
>> I think here we should use rel->consider_parallel to set parallel_safe as
>> is done in create_mergejoin_path.

> Well, the "rel" is going to be an upperrel that will have been
> manufactured by fetch_upper_rel, and it will contain no useful
> information about parallelism, so I'm not real sure what that
> would buy.

Ah, after further study I found where this issue is handled for
joinrels.  I think you're probably right that it'd be a good idea
to include rel->consider_parallel when setting parallel_safe in
upper paths.  In the short term that will have the effect of
marking all upper paths as parallel-unsafe, but that's at least a
safe default that we can improve later.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

Greg Stark  writes:
> What query is that lone data point that took 8ms instead of 6ms to
> plan in both charts (assuming it's the same data point)?

Ah, sorry, I should probably have spent a little more time on making those
charts.  That thing you're looking at isn't a data point, it's gnuplot
showing what symbol it used for the data from this file.

Here's another one with the axes adjusted to keep the labels away from
the data, and with both sets of data overlaid.  This makes it a bit
clearer that the UPPEREL_INITIAL removal moved the distribution slightly
but measurably at the bottom of the time range.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] exposing pg_controldata and pg_config as functions

2016-03-05 Thread Joe Conway

On 02/28/2016 07:45 PM, Michael Paquier wrote:
> On Mon, Feb 29, 2016 at 3:50 AM, Joe Conway  wrote:
>> If there are no other complaints or comments, I will commit the attached
>> sometime this coming week (the the requisite catversion bump).
> 
> Thanks for the updated patch, I have nothing else to say on my side.
> The new version has fixed the MSVC build, I just double-checked it.

Pushed with catversion bump.

Thanks,

Joe


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Greg Stark

On Sat, Mar 5, 2016 at 6:09 PM, Tom Lane  wrote:
> There might be some other things we could do to provide a fast-path for
> particularly trivial cases.  But on the whole I think this plot shows that
> there's no systematic problem, and indeed not really a lot of change at
> all.

Amazing data.

What query is that lone data point that took 8ms instead of 6ms to
plan in both charts (assuming it's the same data point)?

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-05 Thread Michael Paquier

On Sat, Mar 5, 2016 at 11:34 PM, Andrew Dunstan  wrote:
> Here is a translation into perl of the sed script, courtesy of the s2p
> incarnation of psed: 
> The sed script appears to have been stable for a long time, so I don't think
> we need to be too concerned about possibly maintaining two versions.

That's 95% of the work already done, nice! If I finish wrapping up a
patch for this issue at least would you backpatch? It would be saner
to get rid of this dependency everywhere I think regarding compilation
with perl 5.22.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

I wrote:
> Robert Haas  writes:
>> One idea might be to run a whole bunch of queries and record all of
>> the planning times, and then run them all again and compare somehow.
>> Maybe the regression tests, for example.

> That sounds like something we could do pretty easily, though interpreting
> the results might be nontrivial.

I spent some time on this project.  I modified the code to log the runtime
of standard_planner along with decompiled text of the passed-in query
tree.  I then ran the regression tests ten times with cassert-off builds
of both current HEAD and HEAD+pathification patch, and grouped all the
numbers for log entries with identical texts.  (FYI, there are around
1 distinguishable queries in the current tests, most planned only
once or twice, but some as many as 2900 times.)  I had intended to look
at the averages within each group, but that was awfully noisy; I ended up
looking at the minimum times, after discarding a few groups with
particularly awful standard deviations.  I theorize that a substantial
part of the variation in the runtime depends on whether catalog entries
consulted by the planner have been sucked into syscache or not, and thus
that using the minimum is a reasonable way to eliminate cache-loading
effects, which surely ought not be considered in this comparison.

Here is a scatter plot, on log axes, of planning times in milliseconds
with HEAD (x axis) vs those with patch (y axis):

The most noticeable thing about that is that the worst percentage-wise
cases appear near the bottom end of the range.  And indeed inspection
of individual entries showed that trivial cases like

SELECT (ROW(1, 2) < ROW(1, 3)) AS "true"

were hurting the most percentage-wise.  After some study I decided that
the only thing that could explain that was the two rounds of
construct-an-upper-rel-and-add-paths-to-it happening in grouping_planner.
I was able to get rid of one of them by discarding the notion of
UPPERREL_INITIAL altogether, and instead having the code apply the desired
tlist in-place, like this:

sub_target = make_subplanTargetList(root, tlist,
&groupColIdx);

/*
 * Forcibly apply that tlist to all the Paths for the scan/join rel.
 *
 * In principle we should re-run set_cheapest() here to identify the
 * cheapest path, but it seems unlikely that adding the same tlist
 * eval costs to all the paths would change that, so we don't bother.
 * Instead, just assume that the cheapest-startup and cheapest-total
 * paths remain so.  (There should be no parameterized paths anymore,
 * so we needn't worry about updating cheapest_parameterized_paths.)
 */
foreach(lc, current_rel->pathlist)
{
Path   *subpath = (Path *) lfirst(lc);
Path   *path;

Assert(subpath->param_info == NULL);
path = apply_projection_to_path(root, current_rel,
subpath, sub_target);
/* If we had to add a Result, path is different from subpath */
if (path != subpath)
{
lfirst(lc) = path;
if (subpath == current_rel->cheapest_startup_path)
current_rel->cheapest_startup_path = path;
if (subpath == current_rel->cheapest_total_path)
current_rel->cheapest_total_path = path;
}
}

With that fixed, the scatter plot looks like:

There might be some other things we could do to provide a fast-path for
particularly trivial cases.  But on the whole I think this plot shows that
there's no systematic problem, and indeed not really a lot of change at
all.

I won't bother to repost the modified patch right now, but will spend some
time filling in the missing pieces first.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Tom Lane

Amit Kapila  writes:
> On Fri, Mar 4, 2016 at 11:31 PM, Tom Lane  wrote:
>> (BTW, I found what seemed to be a couple of pre-existing bugs of
>> the same kind, eg create_mergejoin_path was different from the
>> other two kinds of join as to setting parallel_degree.)

> I think the reason for keeping parallel_degree as zero for mergejoin path
> is that currently it can't participate in parallelism.

Is there some reason why hash and nestloop are safe but merge isn't?

>> + RecursiveUnionPath *
>> + create_recursiveunion_path(PlannerInfo *root,
>> + ...
>> + pathnode->path.parallel_safe =
>> + leftpath->parallel_safe && rightpath->parallel_safe;

> I think here we should use rel->consider_parallel to set parallel_safe as
> is done in create_mergejoin_path.

Well, the "rel" is going to be an upperrel that will have been
manufactured by fetch_upper_rel, and it will contain no useful
information about parallelism, so I'm not real sure what that
would buy.

This does bring up what seems to me probably a pre-existing bug in
the parallel query planning stuff: what about parallel-safe vs
parallel-unsafe functions in join quals, or other expressions that
have to be evaluated at places above the scan level?  I would expect
to see upper path nodes needing to account for parallel-safety
of the specific expressions they need to execute.  However, the
existing join path node types don't have any provision for this,
so I did not feel that it was incumbent on me to fix it for the
path node types I'm adding.

> +  * It's only needed atop a node that doesn't support projection

> "needed atop a node", seems unclear to me, typo?

Seems perfectly good English to me.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] JPUG wants to have a copyright notice on the translated doc

2016-03-05 Thread Magnus Hagander

On Sat, Mar 5, 2016 at 3:05 AM, Joshua D. Drake 
wrote:

> On 03/04/2016 06:01 PM, Tatsuo Ishii wrote:
>
> Considering they are BSD licensed, I am not sure what abuses could be
>>> taken?
>>>
>>
>> I imagine kind of an extream case: a bad guy removes "Copyright
>> 1996-2016 The PostgreSQL Global Development Group" and replaces it
>> with his/her copyright.
>>
>
> Right but again, what happens when they do that? In a practical sense,
> nothing because we are BSD licensed.
>

Removing the copyright is pretty much the only thing that is not allowed
under the PostgreSQL or BSD license, isn't it?

That said, nothing is likely to actually *happen*, mainly  because I don't
think we actually care that much. But it is not *allowed* to remove it. But
there is nothing that doesn't say you can also add do it.

> In short, no worries, 100% happy that JPUG is taking initiative.
>
>
I also see no problem at all with this.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] The plan for FDW-based sharding

2016-03-05 Thread Kevin Grittner

On Fri, Mar 4, 2016 at 10:10 PM, Craig Ringer  wrote:
> On 28 February 2016 at 06:38, Kevin Grittner  wrote:

>> What I sketched out with the "apparent order of execution"
>> ordering of the transactions (basically, commit order except
>> when one SERIALIZABLE transaction needs to be dragged in front
>> of another due to a read-write dependency) is possibly the
>> simplest approach, but batching may well give better
>> performance.
>
> I'd be really interested in some ideas on how that information might be
> usefully accessed. If we could write info on when to apply commits to the
> xlog in serializable mode that'd be very handy, especially when looking to
> the future with logical decoding of in-progress transactions, parallel
> apply, etc.

Are you suggesting the possibility of holding off on writing the
commit record for a SERIALIZABLE transaction to WAL until it is
known that no other SERIALIZABLE transaction comes ahead of it in
the apparent order of execution?  If so, that's an interesting idea
that I hadn't given much thought to yet -- I had been assuming
current WAL writes, with adjustments to the timing of application
of the records.

> For parallel apply I anticipated that we'd probably have workers applying
> xacts in parallel and committing them in upstream commit order. They'd
> sometimes deadlock with each other; when this happened all workers whose
> xacts committed after the first aborted xact would have to abort and start
> again. Not ideal, but safe.
>
> Being able to avoid that by using SSI information was in the back of my
> mind, but with no idea how to even begin to tackle it. What you've mentioned
> here is helpful and I'd be interested if you could share a bit more of your
> experience in the area.

My thinking so far has been that reordering the application of
transaction commits on a replica would best be done as the minimal
rearrangement possible from commit order which allows the work of
transactions to become visible in an order consistent with some
one-at-a-time run of those transactions.  Partly that is because
the commit order is something that is fairly obvious to see and is
what most people intuitively look at, even when it is wrong.
Deviating from this intuitive order seems likely to introduce
confusion, even when the results are 100% correct.

The only place you *need* to vary from commit order for correctness
is when there are overlapping SERIALIZABLE transactions, one
modifies data and commits, and another reads the old version of the
data but commits later.  Due to the action of SSI on the source
machine, you know that there could not be any SERIALIZABLE
transaction which saw the inconsistent state between the two
commits, but on replicas we don't yet manage that.  The key is that
there is a read-write dependency (a/k/a rw-conflict) between the
two transactions which tells you that the second to commit has to
come before the first in any graph of apparent order of execution.

The tricky part is that when there are two overlapping SERIALIZABLE
transactions and one of them has modified data and committed, and
there is an overlapping SERIALIZABLE transaction which is not READ
ONLY which has not yet reached completion (COMMIT or ROLLBACK) the
correct ordering remains in doubt -- there is no way to know which
might need to commit first, or whether it even matters.  I am
skeptical about whether in logical replication (including MMR), it
is going to be possible to manage this by finding "safe snapshots".
The only alternative I can see, though, is to suspend replication
while correct transaction ordering remains in doubt.  A big READ
ONLY transaction would not cause a replication stall, but a big
READ WRITE transaction could cause an indefinite stall.  Simon
seemed to be saying that this is unacceptable, but I tend to think
it is a viable approach for some workloads, especially if the READ
ONLY transaction property is used when possible.

There might be some wiggle room in terms of letting
non-SERIALIZABLE transactions commit while the ordering of
SERIALIZABLE transactions remain in doubt, but that would involve
allowing bigger deviations from commit order in transaction
application, which may confuse people.  The argument on the other
side is that if they use transaction isolation less strict than
SERIALIZABLE that they are vulnerable to seeing anomalies anyway,
so they must be OK with that.

Hopefully this is in some way helpful

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Greg Stark

On Sat, Mar 5, 2016 at 2:35 PM, Tom Browder  wrote:
>> [Removing all the other xposted lists -- don't do that!]
>
> Okay, sorry.  I thought since the reply was pg-specific it would cut down 
> noise.

I'm sorry I was unclear. I meant, I was removing all the others from
my reply and was saying not to cross-post like that in the first
place. I see you removed them in your response too which is good but I
missed that and responded to the previous message.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Tom Browder

On Sat, Mar 5, 2016 at 7:03 AM, Greg Stark  wrote:
> On Sat, Mar 5, 2016 at 12:59 PM, Greg Stark  wrote:
>> Well. Not dealt with yet. I think it's more or less clear how to
>> tackle it using macros and builtins now but there's a lot of drudgery
>> work to actually rewrite all the checks. I have the reports from Xi
>> Wang's tool saved if anyone else wants to take it up. I would say it's
>> on my TODO list but that's more of an abstract concept than an actual
>> list.
>
> [Removing all the other xposted lists -- don't do that!]

Okay, sorry.  I thought since the reply was pg-specific it would cut down noise.

-Tom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] VS 2015 support in src/tools/msvc

2016-03-05 Thread Andrew Dunstan




On 03/05/2016 12:46 AM, Tom Lane wrote:

Michael Paquier  writes:

On Sat, Mar 5, 2016 at 1:41 PM, Petr Jelinek  wrote:

I vote for just using sed considering we need flex and bison anyway.

OK cool, we could go with something else than sed to generate probes.h
but that seems sensible considering that this should definitely be
back-patched. Not sure what the others think about adding a new file
in the source tarball by default though.

AFAIK, sed flex and bison originate from three separate source projects;
there is no reason to suppose that the presence of flex and bison on a
particular system guarantee the presence of sed.  I thought the proposal
to get rid of the psed dependence in favor of some more perl code was
pretty sane.




Here is a translation into perl of the sed script, courtesy of the s2p 
incarnation of psed: 
 The sed script 
appears to have been stable for a long time, so I don't think we need to 
be too concerned about possibly maintaining two versions.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Freeze avoidance of very large table.

2016-03-05 Thread Masahiko Sawada

On Sat, Mar 5, 2016 at 1:25 AM, Robert Haas  wrote:
> On Wed, Mar 2, 2016 at 6:41 PM, Tom Lane  wrote:
>> Jim Nasby  writes:
>>> On 3/2/16 4:21 PM, Peter Geoghegan wrote:
 I think you should commit this. The chances of anyone other than you
 and Masahiko recalling that you developed this tool in 3 years is
 essentially nil. I think that the cost of committing a developer-level
 debugging tool like this is very low. Modules like pg_freespacemap
 currently already have no chance of being of use to ordinary users.
 All you need to do is restrict the functions to throw an error when
 called by non-superusers, out of caution.

 It's a problem that modules like pg_stat_statements and
 pg_freespacemap are currently lumped together in the documentation,
 but we all know that.
>>
>>> +1.
>>
>> Would it make any sense to stick it under src/test/modules/ instead of
>> contrib/ ?  That would help make it clear that it's a debugging tool
>> and not something we expect end users to use.
>
> I actually think end-users might well want to use it.  Also, I created
> it by hacking up pg_freespacemap, so it may make sense to have it in
> the same place.
> I would also be tempted to add an additional C functions that scan the
> entire visibility map and return counts of the total number of bits of
> each type that are set, and similarly for the page level bits.
> Presumably that would be much faster than

+1.

>
> I am also tempted to change the API to be a bit more friendly,
> although I am not sure exactly how.  This was a quick and dirty hack
> so that I could test, but the hardest thing about making it not a
> quick and dirty hack is probably deciding on a good UI.
>

Does it mean visibility map API in visibilitymap.c?

Regards,

--
Masahiko Sawada


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-05 Thread Michael Paquier

On Sat, Mar 5, 2016 at 7:47 AM, Andres Freund  wrote:
> On 2016-03-05 07:43:00 +0900, Michael Paquier wrote:
>> On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund  wrote:
>> > On 2016-03-04 14:51:50 +0900, Michael Paquier wrote:
>> >> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund  wrote:
>> >> Hm. OK. I don't see any reason why switching to link() even in code
>> >> paths like KeepFileRestoredFromArchive() or pgarch_archiveDone() would
>> >> be a problem thinking about it. Should HAVE_WORKING_LINK be available
>> >> on a platform we can combine it with unlink. Is that in line with what
>> >> you think?
>> >
>> > I wasn't trying to suggest we should replace all rename codepaths with
>> > the link wrapper, just the ones that already have a HAVE_WORKING_LINK
>> > check. The name of the routine I suggested is bad though...
>>
>> So we'd introduce a first routine rename_or_link_safe(), say replace_safe().
>
> Or actually maybe just link_safe(), which falls back to access() &&
> rename() if !HAVE_WORKING_LINK.
>
>> > That's one approach, yes. Combined with the fact that you can't actually
>> > reliably rename across directories, the two could be on different
>> > filesystems after all, that'd be a suitable defense. It just needs to be
>> > properly documented in the function header, not at the bottom.
>>
>> OK. Got it. Or the two could be on the same filesystem.
>
>> Still, link() and rename() do not support doing their stuff on
>> different filesystems (EXDEV).
>
> That's my point ...

OK, I hacked a v7:
- Move the link()/rename() group with HAVE_WORKING_LINK into a single
routine, making the previous link_safe renamed to replace_safe. This
is sharing a lot of things with rename_safe. I am not sure it is worth
complicating the code more this way by having a common single routine
for whole. Thoughts welcome. Honestly, I kind of liked the separation
with link_safe/rename_safe of previous patches because link_safe could
have been directly used by extensions and plugins btw.
- Remove the call of stat() in rename_safe() and implement a logic
depending on OpenTransientFile()/pg_fsync() to flush any existing
target file before performing the rename.
Andres, feel free to use this patch as a base, perhaps that will help.
-- 
Michael
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index f6da673..80ec293 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -417,25 +417,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 	 */
 	TLHistoryFilePath(path, newTLI);
 
-	/*
-	 * Prefer link() to rename() here just to be really sure that we don't
-	 * overwrite an existing file.  However, there shouldn't be one, so
-	 * rename() is an acceptable substitute except for the truly paranoid.
-	 */
-#if HAVE_WORKING_LINK
-	if (link(tmppath, path) < 0)
-		ereport(ERROR,
-(errcode_for_file_access(),
- errmsg("could not link file \"%s\" to \"%s\": %m",
-		tmppath, path)));
-	unlink(tmppath);
-#else
-	if (rename(tmppath, path) < 0)
+	/* And perform the rename */
+	if (replace_safe(tmppath, path) < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
- errmsg("could not rename file \"%s\" to \"%s\": %m",
+ errmsg("could not replace file \"%s\" to \"%s\": %m",
 		tmppath, path)));
-#endif
 
 	/* The history file can be archived immediately. */
 	if (XLogArchivingActive())
@@ -507,25 +494,12 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size)
 	 */
 	TLHistoryFilePath(path, tli);
 
-	/*
-	 * Prefer link() to rename() here just to be really sure that we don't
-	 * overwrite an existing logfile.  However, there shouldn't be one, so
-	 * rename() is an acceptable substitute except for the truly paranoid.
-	 */
-#if HAVE_WORKING_LINK
-	if (link(tmppath, path) < 0)
-		ereport(ERROR,
-(errcode_for_file_access(),
- errmsg("could not link file \"%s\" to \"%s\": %m",
-		tmppath, path)));
-	unlink(tmppath);
-#else
-	if (rename(tmppath, path) < 0)
+	/* And perform the rename */
+	if (replace_safe(tmppath, path) < 0)
 		ereport(ERROR,
 (errcode_for_file_access(),
- errmsg("could not rename file \"%s\" to \"%s\": %m",
+ errmsg("could not replace file \"%s\" to \"%s\": %m",
 		tmppath, path)));
-#endif
 }
 
 /*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 94b79ac..6c4f36d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3298,35 +3298,17 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath,
 		}
 	}
 
-	/*
-	 * Prefer link() to rename() here just to be really sure that we don't
-	 * overwrite an existing logfile.  However, there shouldn't be one, so
-	 * rename() is an acceptable substitute except for the truly paranoid.
-	 */
-#if HAVE_WORKING_LINK
-	if (link(tmppath, path) < 0)
+	/* rename the segment file */
+	if (replace_safe(tmppath, path) < 0)
 	{
 		if (use_lock)
 			LWLockRelease(ControlFileLoc

[HACKERS] Re: Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Greg Stark

On Sat, Mar 5, 2016 at 12:59 PM, Greg Stark  wrote:
> Well. Not dealt with yet. I think it's more or less clear how to
> tackle it using macros and builtins now but there's a lot of drudgery
> work to actually rewrite all the checks. I have the reports from Xi
> Wang's tool saved if anyone else wants to take it up. I would say it's
> on my TODO list but that's more of an abstract concept than an actual
> list.

[Removing all the other xposted lists -- don't do that!]

And fwiw the reason it's not an urgent issue for Postgres is because
we build with -fwrapv, essentially asking the compiler for a C
language that offers more guarantees than the standard (but matches
traditional C environments). So there isn't an active bug on Postgres
with GCC (or I think Clang) but may be with other compilers if they
don't have that option.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Greg Stark

On Sat, Mar 5, 2016 at 12:41 PM, Tomas Vondra
 wrote:
> And it was dealt with

Well. Not dealt with yet. I think it's more or less clear how to
tackle it using macros and builtins now but there's a lot of drudgery
work to actually rewrite all the checks. I have the reports from Xi
Wang's tool saved if anyone else wants to take it up. I would say it's
on my TODO list but that's more of an abstract concept than an actual
list.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Tom Browder

On Sat, Mar 5, 2016 at 6:41 AM, Tomas Vondra
 wrote:
> On Sat, 2016-03-05 at 06:24 -0600, Tom Browder wrote:
>> Interesting article in latest issue of subject titled:
>>
>>   "A Differential Approach to Undefined Behavior Detection"
...
> AFAIK this is not an entirely new tool - it was published a few years
> back (2013?) along with a paper that also mentioned a few issues in
> PostgreSQL. And it was dealt with, see for example this thread
>
> http://www.postgresql.org/message-id/flat/20130715215950.ga4...@eldon.alvh.no-ip.org
>
> Or is this something new?

No, and I think the article mentions that at least one bug was found
in the postgresql code.

Sorry for the false alarm.

Best regards,

-Tom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Tomas Vondra

Hi,

On Sat, 2016-03-05 at 06:24 -0600, Tom Browder wrote:
> Interesting article in latest issue of subject titled:
> 
>   "A Differential Approach to Undefined Behavior Detection"
> 
> which may describe procedures not used in other static analysis programs.
> 
> Article references the authors' website here:
> 
>   http://css.csail.mit.edu/stack
> 
> which contains more info links and a link to the software on github here:
> 
>   https://github.com/xiw/stack
> 
> Best regards,

AFAIK this is not an entirely new tool - it was published a few years
back (2013?) along with a paper that also mentioned a few issues in
PostgreSQL. And it was dealt with, see for example this thread

http://www.postgresql.org/message-id/flat/20130715215950.ga4...@eldon.alvh.no-ip.org

Or is this something new?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Static code checker research worth investigating (Communications of the ACM, 03/2016, Vol. 59, No. 03, p. 99)

2016-03-05 Thread Tom Browder

Interesting article in latest issue of subject titled:

  "A Differential Approach to Undefined Behavior Detection"

which may describe procedures not used in other static analysis programs.

Article references the authors' website here:

  http://css.csail.mit.edu/stack

which contains more info links and a link to the software on github here:

  https://github.com/xiw/stack

Best regards,

-Tom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Move PinBuffer and UnpinBuffer to atomics

2016-03-05 Thread Dilip Kumar

On Wed, Mar 2, 2016 at 11:05 AM, Dilip Kumar  wrote:

> And this latest result (no regression) is on X86 but on my local machine.
>
> I did not exactly saw what this new version of patch is doing different,
> so I will test this version in other machines also and see the results.
>

I tested this on PPC again, This time in various order (sometime patch
first and then base first).
 I tested with latest patch *pinunpin-cas-2.patch* on Power8.

Shared Buffer = 8GB
./pgbench  -j$ -c$ -T300 -M prepared -S postgres

BASE
-
Clientsrun1run2run3
1   212001875420537
2   403313952038746

Patch
-
Clientsrun1run2run3
1   202251980619778
2   398304189836620

I think, here we can not see any regression, (If I take median then it may
looks low with patch so posting all 3 reading).

Note: reverted only ac1d794 commit in my test.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Amit Kapila

On Sat, Mar 5, 2016 at 4:51 PM, Amit Kapila  wrote:
>
> On Fri, Mar 4, 2016 at 11:31 PM, Tom Lane  wrote:
> >
> > OK, here is a version that I think addresses all of the recent comments:
> >
> > * Fixed handling of parallel-query fields in new path node types.
> > (BTW, I found what seemed to be a couple of pre-existing bugs of
> > the same kind, eg create_mergejoin_path was different from the
> > other two kinds of join as to setting parallel_degree.)
> >
>
> I think the reason for keeping parallel_degree as zero for mergejoin path
is that currently it can't participate in parallelism.
>
>
> *** create_unique_path(PlannerInfo *root, Re
> *** 1440,1446 
>   pathnode->path.param_info = subpath-
> >param_info;
>   pathnode->path.parallel_aware = false;
>   pathnode->path.parallel_safe = subpath->parallel_safe;
> !
> pathnode->path.parallel_degree = 0;
>
>   /*
>   * Assume the output is unsorted, since we don't necessarily
> have pathkeys
> --- 1445,1451 
>   pathnode->path.param_info = subpath->param_info;
>   pathnode-
> >path.parallel_aware = false;
>   pathnode->path.parallel_safe = subpath->parallel_safe;
> ! pathnode-
> >path.parallel_degree = subpath->parallel_degree;
>
> Similar to reason for merge join path, I think this should also be set to
0.
>
> Similarly for LimitPath, parallel_degree should be set to 0.
>

Oops, It seems Robert has made comment upthread that we should set
parallel_degree from subpath except for write paths, so I think the above
comment can be ignored.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] WIP: Upper planner pathification

2016-03-05 Thread Amit Kapila

On Fri, Mar 4, 2016 at 11:31 PM, Tom Lane  wrote:
>
> OK, here is a version that I think addresses all of the recent comments:
>
> * Fixed handling of parallel-query fields in new path node types.
> (BTW, I found what seemed to be a couple of pre-existing bugs of
> the same kind, eg create_mergejoin_path was different from the
> other two kinds of join as to setting parallel_degree.)
>

I think the reason for keeping parallel_degree as zero for mergejoin path
is that currently it can't participate in parallelism.


*** create_unique_path(PlannerInfo *root, Re
*** 1440,1446 
  pathnode->path.param_info = subpath-
>param_info;
  pathnode->path.parallel_aware = false;
  pathnode->path.parallel_safe = subpath->parallel_safe;
!
pathnode->path.parallel_degree = 0;

  /*
  * Assume the output is unsorted, since we don't necessarily
have pathkeys
--- 1445,1451 
  pathnode->path.param_info = subpath->param_info;
  pathnode-
>path.parallel_aware = false;
  pathnode->path.parallel_safe = subpath->parallel_safe;
! pathnode-
>path.parallel_degree = subpath->parallel_degree;

Similar to reason for merge join path, I think this should also be set to 0.

Similarly for LimitPath, parallel_degree should be set to 0.


+ RecursiveUnionPath *
+ create_recursiveunion_path(PlannerInfo *root,
+   RelOptInfo *rel,
+   Path *leftpath,
+   Path *rightpath,
+   PathTarget *target,
+   List *distinctList,
+   int wtParam,
+   double numGroups)
+ {
+ RecursiveUnionPath *pathnode = makeNode(RecursiveUnionPath);
+
+ pathnode->path.pathtype = T_RecursiveUnion;
+ pathnode->path.parent = rel;
+ pathnode->path.pathtarget = target;
+ /* For now, assume we are above any joins, so no parameterization */
+ pathnode->path.param_info = NULL;
+ pathnode->path.parallel_aware = false;
+ pathnode->path.parallel_safe =
+ leftpath->parallel_safe && rightpath->parallel_safe;

I think here we should use rel->consider_parallel to set parallel_safe as
is done in create_mergejoin_path.

+  * It's only needed atop a node that doesn't support projection

"needed atop a node", seems unclear to me, typo?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

39 matches

Mail list logo