Re: [HACKERS] [bug fix] Suppress "autovacuum: found orphan temp table" message

2014-07-18 Thread MauMau

From: "Andres Freund" 

On 2014-07-18 23:38:09 +0900, MauMau wrote:

LOG:  autovacuum: found orphan temp table "pg_temp_838"."some_table" in
database "some_db"
LOG:  autovacuum: found orphan temp table "pg_temp_902"."some_table" in
database "some_db"


So they had server crashes of some form before - otherwise they
shouldn't see this because during ordinary shutdown the schema will have
been dropped. C.f. RemoveTempRelationsCallback().


Yes, they are using streaming replication, and experienced a failover.



1. Why and when are these messages output?  Do we have to do
something?


Yes, you should investigate how the situation came to be.


Yes, as mentioned before, I know the reason thanks to past mails on this 
list.  The situation is like this:


1. The applications were using temporary tables.  The rows for the temporary 
tables were created in pg_namespace (one row per pg_temp_n schema) and 
pg_class, and those rows were replicated to the standby.  The data files for 
the temporary tables were not replicated.

2. The server crashed and the standby was promoted to the primary.
3. The new primary performed recovery, but the rows for the temporary tables 
were left behind in the system catalogs.
4. The applications resumed processing.  However, the workload got lighter, 
so the zombie pg_temp_n entries were not recycled.
5. autovacuum workers found the zombie temporary table entries in the system 
catalogs, repeatedly emitting lots of messages.
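For what it's worth, the leftover entries autovacuum complains about can be 
inspected directly in the catalogs; a sketch (the pg_temp_NNN schema names 
are the ones from the log messages above):

```sql
-- List temp-schema relations left behind after the failover; the
-- pg_temp_NNN schemas named in the log messages should show up here.
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relpersistence = 't'
  AND n.nspname LIKE 'pg\_temp\_%';
```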




3. Doesn't the output processing of these messages or its cause affect
performance?  We happen to be facing a performance problem, the cause of
which we haven't found yet.


Meh. If that's the bottleneck you've bigger problems.


I guess the performance problem they are facing is not due to this message 
output, but I don't have evidence.  Anyway, I think flooding users with 
worrying messages is a problem in itself.



So, I propose a simple fix to change the LOG level to DEBUG1.  I don't know
which of DEBUG1-DEBUG5 is appropriate, and any level is OK.  Could you
include this in 9.2.9?


Surely that's the wrong end to tackle this from. Hiding actual problems
is a seriously bad idea.


No, there is no serious problem with the user's operation in this situation. 
Server crashes cannot be avoided, and must be anticipated.  The problem is 
that PostgreSQL makes users worried with lots of (probably) unnecessary 
messages.





It'd be nice if we had infrastructure to do this at startup, but we
don't...


Yes, ideally so.  It is the responsibility of the database server to clean 
up the zombie metadata (catalog entries).  But I understand there is no such 
infrastructure now.  If it's not (easily) possible, the best and only thing 
is to not make users concerned.  Is there any reason to output the message 
from the viewpoint of users, rather than from the viewpoint of developers?
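Until such startup infrastructure exists, the zombie metadata can be removed 
by hand; a hedged sketch, assuming a superuser session and that no live 
backend currently owns the schema:

```sql
-- pg_temp_838 is one of the schemas named in the log messages quoted
-- earlier; CASCADE also removes the orphaned relations under it.
DROP SCHEMA pg_temp_838 CASCADE;
```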


The problem is pressing.  The customer is trying to use PostgreSQL for a very 
mission-critical system, and I hope PostgreSQL will earn a high reputation. 
Could you include this in 9.2.9?


Regards
MauMau




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] subquery in CHECK constraint

2014-07-18 Thread Tatsuo Ishii
> I think the basic problem would be what the check constraint subquery
> meant to the user, and how useful that is expected to be in general. A
> subquery in a check constraint would presumably involve checking the
> subquery using an existing snapshot of the command that required the
> constraint to be verified (say, an INSERT). But why should that
> snapshot be so special? In any case the result of the subquery may not
> be immutable (even in some limited, practical sense), and we expect
> check constraints to be on immutable conditions on constrained columns
> only. In general it would be practically impossible to determine that
> something else had changed the state of the database in such a way as
> to make the check constraint no longer verify successfully on each
> row, so we would not be able to prevent that from happening later on.
> 
> I imagine that you have a very specific case in mind, though. Perhaps
> you can share the details.

No I don't have a specific case. I am just wondering because it's
defined in the standard.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] subquery in CHECK constraint

2014-07-18 Thread Tom Lane
Peter Geoghegan  writes:
> On Fri, Jul 18, 2014 at 7:31 PM, Tatsuo Ishii  wrote:
>> Has anybody tried to implement subqueries in CHECK constraints? If so,
>> what are the issues in implementing it? Or is the feature not worth the
>> effort? Comments and/or opinions are welcome.

> I think the basic problem would be what the check constraint subquery
> meant to the user, and how useful that is expected to be in general.

Yeah.  Check constraints are only well-defined to the extent that they
constrain the contents of the current row independent of anything else.
It's hard to conceive of a use-case for a subquery that wouldn't violate
that in some fashion.

I can certainly conceive of cases in which you want to constrain the
contents of one table in terms of another's contents, sort of like foreign
keys, but let's suppose that the particular invariant you have in mind
isn't expressible as a foreign key.  But you can write a CHECK subquery
that captures what you want.  Now what?  There's a *lot* of complicated
infrastructure needed to implement foreign keys, because they constrain
both tables not just one.  How would you invert a CHECK subquery to figure
out what changes are allowed in the referenced table?

Maybe you're willing to accept the special case in which you don't intend
ever to change the referenced table, or are willing to take responsibility
for not changing it in a way that violates the CHECK constraint for any
existing row in the referencing table.  So fine; all the system is
supposed to do is check the constraint on every insert/update in the
referencing table.  I think the implementation issues would be

(1) there's no support for doing any planning of subqueries in standalone
expressions.  This is probably just a small matter of programming, but
still a hurdle to be jumped.

(2) how would pg_dump deal with check constraints like these?  At minimum
it'd have to understand, or guess at, the dump ordering restrictions
needed to allow data to be reloaded with such a constraint.

I'm not sure this is much easier to solve than the general case of
SQL assertions (which we have not got either).
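For what it's worth, the usual workaround today is to hide the subquery 
inside a function, which illustrates exactly the hazard described above: 
nothing re-checks the constraint when the referenced table changes later. 
A sketch with made-up table and function names:

```sql
-- NOT a real foreign key: the check fires only on INSERT/UPDATE of emp,
-- so later changes to departments can silently invalidate existing rows.
CREATE FUNCTION dept_exists(d integer) RETURNS boolean AS
$$ SELECT EXISTS (SELECT 1 FROM departments WHERE dept_id = d) $$
LANGUAGE sql;

CREATE TABLE emp (
    name text,
    dept integer CHECK (dept_exists(dept))
);
```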

regards, tom lane




Re: [HACKERS] subquery in CHECK constraint

2014-07-18 Thread Peter Geoghegan
Hi,

On Fri, Jul 18, 2014 at 7:31 PM, Tatsuo Ishii  wrote:
> Has anybody tried to implement subqueries in CHECK constraints? If so,
> what are the issues in implementing it? Or is the feature not worth the
> effort? Comments and/or opinions are welcome.

I think the basic problem would be what the check constraint subquery
meant to the user, and how useful that is expected to be in general. A
subquery in a check constraint would presumably involve checking the
subquery using an existing snapshot of the command that required the
constraint to be verified (say, an INSERT). But why should that
snapshot be so special? In any case the result of the subquery may not
be immutable (even in some limited, practical sense), and we expect
check constraints to be on immutable conditions on constrained columns
only. In general it would be practically impossible to determine that
something else had changed the state of the database in such a way as
to make the check constraint no longer verify successfully on each
row, so we would not be able to prevent that from happening later on.

I imagine that you have a very specific case in mind, though. Perhaps
you can share the details.
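Concretely, the syntax in question is rejected outright today; a minimal 
example (table and column names are made up):

```sql
-- PostgreSQL currently refuses a subquery in this position:
CREATE TABLE t (
    x int CHECK (x IN (SELECT allowed FROM whitelist))
);
-- ERROR:  cannot use subquery in check constraint
```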

-- 
Peter Geoghegan




[HACKERS] subquery in CHECK constraint

2014-07-18 Thread Tatsuo Ishii
Hi,

Has anybody tried to implement subqueries in CHECK constraints? If so,
what are the issues in implementing it? Or is the feature not worth the
effort? Comments and/or opinions are welcome.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] WAL format and API changes (9.5)

2014-07-18 Thread Michael Paquier
On Wed, Jul 2, 2014 at 4:23 PM, Michael Paquier
 wrote:
> On Wed, Jul 2, 2014 at 4:09 PM, Michael Paquier 
> wrote:
>>
>> 3) I noticed a bug in gin redo code path when trying to use the WAL replay
>> facility.
This patch has been in "Waiting on author" status for a while; I am marking
it now as "Returned with feedback", as there are still issues with the gin
record construction and redo code paths.

Feel free to disagree if you don't think this is correct.
-- 
Michael




[HACKERS] Issues with dropped columns in views depending on functions

2014-07-18 Thread Tom Lane
I looked into the problem described here:
http://www.postgresql.org/message-id/53c93986.80...@clickware.de

The core issue is that the ruleutils.c code isn't thinking about dropped
columns in the output of a function-in-FROM that returns a composite type.
Pre-9.3, there had been some code there to cope with the case by means of
doing a get_rte_attribute_is_dropped() test on each column as the alias
list was printed, but I'd lost that in the rewrite that improved handling
of column aliases in RTEs with dropped columns.  I looked at putting that
back, but (a) it's slow, and (b) it didn't really work right even when it
was there: it only accidentally failed to crash in EXPLAIN's usage,
because we were applying exprType() to a null funcexpr field.

The next possibility that I looked at was teaching the code to ignore
empty-string items in a function RTE's alias list, since that's what
the parser inserts to correspond to dropped columns at parse time.
This solves the reported problem, almost, except that it exposed an
ancient bug in outfuncs/readfuncs: the code to print a T_String node is
wrong for the case of an empty string.  What it actually prints is "<>"
which of course reads back as <>, not empty.  After repairing that, I had
a patch that takes care of the reported problem, at least for newly
created views (see attached).  It won't do anything to retroactively fix
the problem in existing views, but I think that might be okay given the
small number of complaints.

However, this only fixes the issue for columns that were dropped before
the view was created.  The system will let you drop columns *after* the
view is created, and then we have a problem, as illustrated by the
rather dubious results in the regression test in the attached patch.
I'm of the opinion that this is a bug and we should disallow dropping
a composite type's column if it's explicitly referenced in any view, much
as would happen if there were a direct reference to the table.  However,
that will take some nontrivial surgery on the dependency code, and I
rather doubt that we want to back-patch such a thing --- especially since
the needed pg_depend entries wouldn't be there anyway for existing views
in existing databases.

What I propose to do is apply the attached back through 9.3, so that at
least users are no worse off than they were pre-9.3.  I will look at
fixing the problem of a post-view-creation column drop later, but I
suspect we'll end up not back-patching those changes.

regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 9573a9b..0631bc2 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*** _outValue(StringInfo str, const Value *v
*** 2506,2512 
  			break;
  		case T_String:
  			appendStringInfoChar(str, '"');
! 			_outToken(str, value->val.str);
  			appendStringInfoChar(str, '"');
  			break;
  		case T_BitString:
--- 2506,2513 
  			break;
  		case T_String:
  			appendStringInfoChar(str, '"');
! 			if (value->val.str[0] != '\0')
! 				_outToken(str, value->val.str);
  			appendStringInfoChar(str, '"');
  			break;
  		case T_BitString:
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 0781ac8..7237e5d 100644
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
*** set_relation_column_names(deparse_namesp
*** 3086,3092 
  		i = 0;
  		foreach(lc, rte->eref->colnames)
  		{
! 			real_colnames[i] = strVal(lfirst(lc));
  			i++;
  		}
  	}
--- 3086,3101 
  		i = 0;
  		foreach(lc, rte->eref->colnames)
  		{
! 			/*
! 			 * If the column name shown in eref is an empty string, then it's
! 			 * a column that was dropped at the time of parsing the query, so
! 			 * treat it as dropped.
! 			 */
! 			char	   *cname = strVal(lfirst(lc));
! 
! 			if (cname[0] == '\0')
! 				cname = NULL;
! 			real_colnames[i] = cname;
  			i++;
  		}
  	}
diff --git a/src/test/regress/expected/create_view.out b/src/test/regress/expected/create_view.out
index 06b2037..e2d4276 100644
*** a/src/test/regress/expected/create_view.out
--- b/src/test/regress/expected/create_view.out
*** select pg_get_viewdef('vv6', true);
*** 1332,1337 
--- 1332,1389 
JOIN tt13 USING (z);
  (1 row)
  
+ --
+ -- Check some cases involving dropped columns in a function's rowtype result
+ --
+ create table tt14t (f1 text, f2 text, f3 text, f4 text);
+ insert into tt14t values('foo', 'bar', 'baz', 'quux');
+ alter table tt14t drop column f2;
+ create function tt14f() returns setof tt14t as
+ $$
+ declare
+     rec1 record;
+ begin
+     for rec1 in select * from tt14t
+     loop
+         return next rec1;
+     end loop;
+ end;
+ $$
+ language plpgsql;
+ create view tt14v as select t.* from tt14f() t;
+ select pg_get_viewdef('tt14v', true);
+  pg_get_viewdef 
+ 
+   SELECT t.f1, +
+  

Re: [HACKERS] RLS Design

2014-07-18 Thread Brightwell, Adam
>
> I think we do want a way to modify policies.  However, we tend to
> avoid syntax that involves unnatural word order, as this certainly
> does.  Maybe it's better to follow the example of CREATE RULE and
> CREATE TRIGGER and do something this instead:
>
> CREATE POLICY policy_name ON table_name USING quals;
> ALTER POLICY policy_name ON table_name USING quals;
> DROP POLICY policy_name ON table_name;
>
> The advantage of this is that you can regard "policy_name ON
> table_name" as the identifier for the policy throughout the system.
> You need some kind of identifier of that sort anyway to support
> COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.


Sounds good.  I certainly think it makes a lot of sense to include the
ALTER functionality, if for no other reason than ease of use.

Another item to consider, though I believe it can come later, is per-action
policies.  Following the above suggested syntax, perhaps that might look
like the following?

CREATE POLICY policy_name ON table_name FOR action USING quals;
ALTER POLICY policy_name ON table_name FOR action USING quals;
DROP POLICY policy_name ON table_name FOR action;

I was also giving some thought to the use of "POLICY"; perhaps I am wrong,
but it seems it could be at risk of becoming ambiguous down the road.  I
can't think of any specific examples at the moment, but my concern is what
happens if we want to add another "type" of policy, whatever that might be,
later.  Would it make more sense to go ahead and qualify this a little more
as "ROW SECURITY POLICY"?

Thanks,
Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com


Re: [HACKERS] how to reach D5 in tuplesort.c 's polyphase merge algorithm?

2014-07-18 Thread 土卜皿
2014-07-19 6:26 GMT+08:00 Tom Lane :

> 土卜皿  writes:
> >   for studying the polyphase merge algorithm of tuplesort.c, I use ddd
> > and created a table, which has a schema as follows:
> > ...
> > and has 36684 records, and every record is like:
> > id   code   article  name      department
> > 318  00266  \N       Machault  77
>
> > and for getting into external sort, I type the following command:
>
> > select * from towns order by name desc;
>
> > but I found it did not need to reach D5 and D6 during sorting,
>
> That doesn't sound like enough data to force it to spill to disk at all;
> at least not unless you turn down work_mem to some very small value.
hi, Tom
  thanks a lot!

The work_mem you mentioned reminds me of one more thing I tried: I changed
BLCKSZ to 8192/2 and compiled successfully, but I got an error when
executing initdb.

Dillon


Re: [HACKERS] how to reach D5 in tuplesort.c 's polyphase merge algorithm?

2014-07-18 Thread Tom Lane
土卜皿  writes:
>   for studying the polyphase merge algorithm of tuplesort.c, I use ddd and
> created a table, which has a schema as follows:
> ...
> and has 36684 records, and every record is like:
> id   code   article  name      department
> 318  00266  \N       Machault  77

> and for getting into external sort, I type the following command:

> select * from towns order by name desc;

> but I found it did not need to reach D5 and D6 during sorting,

That doesn't sound like enough data to force it to spill to disk at all;
at least not unless you turn down work_mem to some very small value.
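For the original experiment, something like the following should force the 
spill even on a ~36k-row table; note that trace_sort is only available in 
builds compiled with TRACE_SORT defined:

```sql
-- Shrink work_mem to its minimum so the sort spills to tapes, then
-- watch the run-building and merge steps in the server log.
SET work_mem = '64kB';
SET trace_sort = on;          -- requires a TRACE_SORT-enabled build
SELECT * FROM towns ORDER BY name DESC;
```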

regards, tom lane




[HACKERS] how to reach D5 in tuplesort.c 's polyphase merge algorithm?

2014-07-18 Thread 土卜皿
hi, all
  for studying the polyphase merge algorithm of tuplesort.c, I use ddd and
created a table, which has the following schema:

CREATE TABLE Towns (
    id SERIAL UNIQUE NOT NULL,
    code VARCHAR(10) NOT NULL, -- Only unique inside a department
    article TEXT,
    name TEXT NOT NULL, -- Names are not really unique, for instance 'Sainte-Croix'
    department VARCHAR(4) NOT NULL REFERENCES Departments (code),
    UNIQUE (code, department)
    -- UNIQUE (name, department) -- Not perfectly unique but almost
);

and has 36684 records, and every record is like:
id   code   article  name      department
318  00266  \N       Machault  77

and for getting into external sort, I type the following command:

select * from towns order by name desc;

but I found it did not need to reach D5 and D6 during sorting.  I think the
reason is that the number of runs before merging is only 3 (too small).  To
generate more runs, I shuffled the records when loading them, but got the
same result.

So I would like some help: what schema and how many records do I at least
need to reach D5 and D6?  Any advice will be appreciated!

BEST REGARDS
Dillon


Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-07-18 Thread Alvaro Herrera
Jeff Janes wrote:

> Should we push the refactoring through anyway?  I have a hard time
> believing that pg_dump is going to be the only client program we ever have
> that will need process-level parallelism, even if this feature itself does
> not need it.  Why make the next person who comes along re-invent that
> re-factoring of this wheel?

I gave the refactoring patch a look some days ago, and my conclusion was
that it is reasonably sound but it needed quite some cleanup in order
for it to be committable.  Without any immediate use case, it's hard to
justify going through all that effort.  Maybe we can add a TODO item and
have it point to the posted patch, so that if in the future we see a
need for another parallel client program we can easily rebase the
current patch.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal for updating src/timezone

2014-07-18 Thread Tom Lane
John Cochran  writes:
> Did a diff between the 2010c version of localtime.c and the postgres
> version and saw a lot more differences than what could have been expected
> from simple reformatting and adaptation. So I installed gitk and took a
> look at the change history of localtime.c

> Well, looking at the results, it pretty much put paid to the concept of
> ever importing the IANA code unaltered into the postgres tree. In fact, it
> looks like the original version of localtime.c was based upon one of the
> 2004a thru 2004c versions of the IANA code and since then has been
> independently maintained.

That's certainly true, but I'm not sure that we should give up all that
easily on getting closer to the IANA code again.  For one thing, some of
the thrashing had to do with the fact that we went over to 64-bit
timestamps sooner than the IANA crew did.  I'm assuming that they insist
on 64-bit arithmetic now; we do too, so for instance some of the typedef
hacking might no longer be necessary.

AFAICT from a quick scan of the git logs, the main thing we've done to the
API that's not in IANA is invent pg_next_dst_boundary().  Perhaps that
could be pushed into a separate source file.

Even if we only got to a point where the sources were the same except for
a few narrow patches, it would make tracking their updates much easier.

regards, tom lane




Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-07-18 Thread Jeff Janes
On Wed, Jul 16, 2014 at 5:30 AM, Dilip kumar  wrote:

>  On 16 July 2014 12:13 Magnus Hagander Wrote,
>
> >>Yeah, those are exactly my points. I think it would be significantly
> simpler to do it that way, rather than forking and threading. And also
> easier to make portable...
>
> >>(and as an optimization on Alvaro's suggestion, you can of course reuse
> the initial connection as one of the workers, as long as you got the full
> list of tasks from it up front, which I think you do anyway in order to do
> sorting of tasks...)
>
> Oh, I got your point, I will update my patch and send,
>
> Now we can completely remove vac_parallel.h file and no need of
> refactoring also:)
>
> Thanks & Regards,
>
> Dilip Kumar
>

Should we push the refactoring through anyway?  I have a hard time
believing that pg_dump is going to be the only client program we ever have
that will need process-level parallelism, even if this feature itself does
not need it.  Why make the next person who comes along re-invent that
re-factoring of this wheel?

Cheers,

Jeff


Re: [HACKERS] Proposal for updating src/timezone

2014-07-18 Thread John Cochran
On Fri, Jul 18, 2014 at 1:21 PM, Tom Lane  wrote:

> John Cochran  writes:
> > My proposal is to have the following directory structure
> > ...
> > 1. I would have liked to recommend 2 sub-directories underneath
> ...
> I have exactly zero expectation of using their Makefile, so this is not a
> concern.
>
> > 2. Depositing the IANA code unaltered would also deposit some "junk"
> > files
> ...
> We're not doing that either, IMO.  Why should we invest 500K of tarball
> space on that?

Understandable, and both issues are in the category of "Things I didn't
like".

Did a diff between the 2010c version of localtime.c and the postgres
version and saw a lot more differences than what could have been expected
from simple reformatting and adaptation. So I installed gitk and took a
look at the change history of localtime.c

Well, looking at the results, it pretty much put paid to the concept of
ever importing the IANA code unaltered into the postgres tree. In fact, it
looks like the original version of localtime.c was based upon one of the
2004a thru 2004c versions of the IANA code and since then has been
independently maintained.

In any case, I think my best path forward is to look at the change history
of the source files under src/timezone and determine the actual origin of
each file based upon the git history and the IANA archive. Then see what's
needed to bring the code up to date based upon the IANA code. Due to the
configure option "--with-system-tzdata", I assume that postgres needs to
remain compatible with the output from the IANA zic program. That combined
with the warning messages about some of the entries in the current timezone
data not being compatible with pre-2013 code worries me a bit since the
2010c code from IANA does not support the output from the 2014e zic.
-- 

There are two ways of constructing a software design: One way is to make it
so simple that there are obviously no deficiencies and the other way is to
make it so complicated that there are no obvious deficiencies. — C.A.R.
Hoare


Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Peter Geoghegan
On Fri, Jul 18, 2014 at 11:32 AM, Peter Geoghegan  wrote:
> Well, maybe. If the genericity of this syntax isn't what people want,
> I may go with something else.

By the way, I didn't mention that there is now also the optional
ability to specify a "merge on" unique index directly in DML. It would
be much nicer to specify a sort key instead, but that might be tricky
in the general case. I imagine that other systems solve the problem by
being very restrictive in what will be (implicitly) accepted as a
merge-on index. Seemingly there are problems with all major SQL MERGE
implementations, so I don't necessarily expect to be able to draw
lessons from them in any way here.

-- 
Peter Geoghegan




Re: [HACKERS] RLS Design

2014-07-18 Thread Robert Haas
On Wed, Jul 16, 2014 at 10:04 PM, Brightwell, Adam
 wrote:

> Yes, I just tested it and the following would work from a grammar
> perspective:
>
> ALTER TABLE  POLICY ADD  (policy_quals)
> ALTER TABLE  POLICY DROP 
>
> Though, it would obviously require the addition of POLICY to the list of
> unreserved keywords.  I don't suspect that would be a concern, as it is not
> "reserved", but thought I would point it out just in case.
>
> Another thought I had was, would we also want the following, so that
> policies could be modified?
>
> ALTER TABLE  POLICY ALTER  (policy_quals)

I think we do want a way to modify policies.  However, we tend to
avoid syntax that involves unnatural word order, as this certainly
does.  Maybe it's better to follow the example of CREATE RULE and
CREATE TRIGGER and do something this instead:

CREATE POLICY policy_name ON table_name USING quals;
ALTER POLICY policy_name ON table_name USING quals;
DROP POLICY policy_name ON table_name;

The advantage of this is that you can regard "policy_name ON
table_name" as the identifier for the policy throughout the system.
You need some kind of identifier of that sort anyway to support
COMMENT ON, SECURITY LABEL, and ALTER EXTENSION ADD/DROP for policies.
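Filled in with concrete (made-up) table, qual, and policy names, the 
proposal would read like this; none of it was implemented at the time of 
this thread:

```sql
-- Hypothetical per the proposal: "policy_name ON table_name" identifies
-- the policy throughout the system, as with rules and triggers.
CREATE POLICY account_owner ON accounts USING (owner = current_user);
ALTER POLICY account_owner ON accounts USING (owner = current_user OR admin);
DROP POLICY account_owner ON accounts;
COMMENT ON POLICY account_owner ON accounts IS 'each user sees own rows';
```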

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Peter Geoghegan
On Fri, Jul 18, 2014 at 11:23 AM, Andres Freund  wrote:
> Meh. An understandable syntax wouldn't require the pullups with a special
> scan node and such. I think you're attempting a sort of genericity
> that's making your (important!) goal much harder to reach.

Well, maybe. If the genericity of this syntax isn't what people want,
I may go with something else. At the suggestion of Robert I did start
a dedicated thread on that question months back (explicitly putting
aside the nitty-gritty details of the implementation), but that didn't
go anywhere.

-- 
Peter Geoghegan




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Andres Freund
On 2014-07-18 11:14:34 -0700, Peter Geoghegan wrote:
> On Fri, Jul 18, 2014 at 11:06 AM, Andres Freund  
> wrote:
> > I don't see why you'd need such a node at all if we had a fully builtin
> > UPSERT. The whole stuff with ON CONFLICT SELECT FOR UPDATE and then
> > UPDATE ... FROM c CONFLICTS is too complicated and exposes stuff that
> > barely anybody will understand, let alone use correctly in queries they
> > write themselves.
> 
> I accept that there will be a need for certain restrictions. Most
> obviously, if you update the target table referencing a CTE like this,
> not using the special CONFLICTS clause in the UPDATE (or DELETE) is an
> error. And as I mentioned, you may only join the projected duplicates
> to the UPDATE ModifyTable - an attempt to join any more relations is
> an error. In short, this *is* a fully built-in upsert.

Meh. An understandable syntax wouldn't require the pullups with a special
scan node and such. I think you're attempting a sort of genericity
that's making your (important!) goal much harder to reach.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Built-in binning functions

2014-07-18 Thread Petr Jelinek

On 08/07/14 02:14, Tom Lane wrote:
> Also, the set of functions provided still needs more thought, because the
> resolution of overloaded functions doesn't really work as nicely as all
> that.


I am honestly considering just removing the special case for int8 for now; 
the SQL-standard width_bucket only supports float8 and numeric, and maybe 
that's enough for this one also...


What's your opinion on that?
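For reference, the standard behavior that the float8/numeric variants 
already provide:

```sql
-- Five equal-width buckets over [0, 10); 5.35 falls in the third
-- bucket, [4, 6).
SELECT width_bucket(5.35, 0.0, 10.0, 5);  -- returns 3
```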

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Peter Geoghegan
On Fri, Jul 18, 2014 at 11:06 AM, Andres Freund  wrote:
> I don't see why you'd need such a node at all if we had a fully builtin
> UPSERT. The whole stuff with ON CONFLICT SELECT FOR UPDATE and then
> UPDATE ... FROM c CONFLICTS is too complicated and exposes stuff that
> barely anybody will understand, let alone use correctly in queries they
> write themselves.

I accept that there will be a need for certain restrictions. Most
obviously, if you update the target table referencing a CTE like this,
not using the special CONFLICTS clause in the UPDATE (or DELETE) is an
error. And as I mentioned, you may only join the projected duplicates
to the UPDATE ModifyTable - an attempt to join any more relations is
an error. In short, this *is* a fully built-in upsert.


-- 
Peter Geoghegan




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Andres Freund
On 2014-07-18 10:53:36 -0700, Peter Geoghegan wrote:
> On Fri, Jul 18, 2014 at 10:46 AM, Andres Freund  
> wrote:
> > I think the things that use "weird" visibility semantics are pretty much
> > all doing that internally (things being EvalPlanQual stuff for
> > INSERT/UPDATE/DELETE and the referential integrity triggers). I don't
> > see sufficient reason why we should break away from that here. Yes,
> > there's stuff that cannot be done when it's done internally, but we can
> > live with those not being possible.
> 
> The whole point of what I was proposing was that those semantics would
> only apply to a special tid scan node. Perhaps I missed something, but
> I'm not sure why you'd consider that I was breaking away from that
> here at all.

I don't see why you'd need such a node at all if we had a fully builtin
UPSERT. The whole stuff with ON CONFLICT SELECT FOR UPDATE and then
UPDATE ... FROM c CONFLICTS is too complicated and exposes stuff that
barely anybody will understand, let alone use correctly in queries they
write themselves.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Peter Geoghegan
On Fri, Jul 18, 2014 at 10:46 AM, Andres Freund  wrote:
> I think the things that use "weird" visibility semantics are pretty much
> all doing that internally (things being EvalPlanQual stuff for
> INSERT/UPDATE/DELETE and the referential integrity triggers). I don't
> see sufficient reason why we should break away from that here. Yes,
> there's stuff that cannot be done when it's done internally, but we can
> live with those not being possible.

The whole point of what I was proposing was that those semantics would
only apply to a special tid scan node. Perhaps I missed something, but
I'm not sure why you'd consider that I was breaking away from that
here at all.

-- 
Peter Geoghegan




Re: [HACKERS] Making joins involving ctid work for the benefit of UPSERT

2014-07-18 Thread Andres Freund
Hi,

On 2014-07-17 18:18:41 -0700, Peter Geoghegan wrote:
> I'm working on UPSERT again. I think that in order to make useful
> progress, I'll have to find a better way of overcoming the visibility
> issues (i.e. the problem of what to do about a
> still-in-progress-to-our-snapshot row being locked at READ COMMITTED
> isolation level [1][2]).

Agreed.

> I've made some tentative additions to my earlier patch to change the syntax:
> 
> postgres=# explain analyze
> with c as (
>   insert into ints(key, val) values (1, 'a'), (2, 'b')
>   on conflict
>   select * for update)
>   update ints set val = c.val from c conflicts;

I still don't agree that this is a sane syntax for upsert. Avoiding it
would reduce the scope of the problems you have measurably. We should
provide a syntax that does the UPSERT itself, instead of delegating that
to the user as you're proposing above.  Afaics nearly none of the
problems you're debating in your email would exist if upsert were
entirely done internally.

I think the things that use "weird" visibility semantics are pretty much
all doing that internally (things being EvalPlanQual stuff for
INSERT/UPDATE/DELETE and the referential integrity triggers). I don't
see sufficient reason why we should break away from that here. Yes,
there's stuff that cannot be done when it's done internally, but we can
live with those not being possible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal for updating src/timezone

2014-07-18 Thread Tom Lane
John Cochran  writes:
> My proposal is to have the following directory structure:

> .../src/timezone - Would have only PostgreSQL specific code
> .../src/timezone/tznames - would be unaltered from current (optionally this
> could be removed)
> .../src/timezone/iana - would contain unaltered code and data from IANA

> The .../src/timezone/data directory would be removed.

I am not in favor of doing anything to the data/ directory.  If we can
have unaltered tzcode source files in their own subdirectory, that'd be
great, but I see no advantage to smashing the IANA code and data files
together.  Furthermore, doing so would break the existing ability to just
cherry-pick tzdata update commits into back branches.

Part of my thought here is that we do generally just dump tzdata updates
into our tree without much thought, but I don't think that will ever be
the case for tzcode updates; we'll want to look at those with more care,
and probably we'll not back-patch them.

> 1. I would have liked to recommend 2 sub-directories underneath
> .../src/timezone/iana named "code" and "data", but unfortunately have to
> suggest untarring both the code and data files directly into the iana
> directory. The reason for this is that the IANA code does not compile using
> the IANA Makefile unless the script yearistype.sh is present and that
> script is currently present in the data tar file, not the code tar file.

I have exactly zero expectation of using their Makefile, so this is not a
concern.

> 2. Depositing the IANA code unaltered would also deposit some "junk" files
> into the directory. The files would mainly consist of man pages and html
> files containing documentation for the timezone code. The extra files would
> consume approximately 500 kilobytes above what's actually needed, but
> otherwise wouldn't have any adverse effects.

We're not doing that either, IMO.  Why should we invest 500K of tarball
space on that?

regards, tom lane




Re: [HACKERS] [bug fix] Suppress "autovacuum: found orphan temp table" message

2014-07-18 Thread Andres Freund
Hi,

On 2014-07-18 23:38:09 +0900, MauMau wrote:
> My customer reported a problem that the following message is output too
> often.
> 
> LOG:  autovacuum: found orphan temp table "pg_temp_838"."some_table" in
> database "some_db"
> LOG:  autovacuum: found orphan temp table "pg_temp_902"."some_table" in
> database "some_db"

So they had server crashes of some form before - otherwise they
shouldn't see this because during ordinary shutdown the schema will have
been dropped. C.f. RemoveTempRelationsCallback().

> 1. Why and when are these messages output?  Do we have to do
> something?

Yes, you should investigate how the situation came to be.

> 2. Won't they use up disk space?

Yes.

> 3. Doesn't the output processing of these messages or its cause affect
> performance?  We happen to be facing a performance problem, the cause of
> which we haven't found yet.

Meh. If that's the bottleneck you've bigger problems.

> So, I propose a simple fix to change the LOG level to DEBUG1.  I don't know
> which of DEBUG1-DEBUG5 is appropriate, and any level is OK.  Could you
> include this in 9.2.9?

Surely that's the wrong end to tackle this from. Hiding actual problems
is a seriously bad idea.

It'd be nice if we had infrastructure to do this at startup, but we
don't...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Proposal for updating src/timezone

2014-07-18 Thread John Cochran
Given my recent examination of the src/timezone subtree and the current
code at IANA for timezone information and tools, I am proposing the following
changes and would like to first get group consensus on the change prior to
investing any major effort.

My proposal is to have the following directory structure:

.../src/timezone - Would have only PostgreSQL specific code
.../src/timezone/tznames - would be unaltered from current (optionally this
could be removed)
.../src/timezone/iana - would contain unaltered code and data from IANA

The .../src/timezone/data directory would be removed.

In doing this, any future updates to the IANA code should be trivial to
incorporate into the postgres tree.

Reasons for change.
1. The current timezone code is based upon, according to the comments,
version 2010c from IANA. IANA's latest code is version 2014e and there are
substantial changes between those versions (as well as 35 intermediate
versions ... Yes, IANA changes and updates the code regularly and
frequently)

2. The current timezone data has a few entries that are not properly
handled by any pre-2013 code so an update of some sort is needed anyway.

3. The latest code from IANA compiles with GCC using a kitchen-sink array
of warning options without generating any warnings. This should alleviate
any concerns about not accepting code unless it's warning-free. In
case anyone is interested, the GCC options are: -Dlint -g3 -O3 -fno-common
-fstrict-aliasing  -Wall -Wextra -Wbad-function-cast -Wcast-align
-Wcast-qual -Wformat=2 -Winit-self -Wmissing-declarations
-Wmissing-noreturn -Wmissing-prototypes -Wnested-externs -Wno-address
-Wno-cast-qual -Wno-format-nonliteral -Wno-sign-compare
-Wno-sign-conversion -Wno-type-limits -Wno-unused-parameter
-Woverlength-strings -Wpointer-arith -Wshadow -Wstrict-prototypes
-Wsuggest-attribute=const -Wsuggest-attribute=noreturn
-Wsuggest-attribute=pure -Wtrampolines -Wwrite-strings

Aspects of change that I'm not happy with.
1. I would have liked to recommend 2 sub-directories underneath
.../src/timezone/iana named "code" and "data", but unfortunately have to
suggest untarring both the code and data files directly into the iana
directory. The reason for this is that the IANA code does not compile using
the IANA Makefile unless the script yearistype.sh is present and that
script is currently present in the data tar file, not the code tar file.
And that unfortunately means that splitting the IANA code and data into
different directories would violate the objective of being able to deposit
the files unaltered into postgres.

2. Depositing the IANA code unaltered would also deposit some "junk" files
into the directory. The files would mainly consist of man pages and html
files containing documentation for the timezone code. The extra files would
consume approximately 500 kilobytes above what's actually needed, but
otherwise wouldn't have any adverse effects.

Thank you for reading this proposal,
John Cochran
-- 

There are two ways of constructing a software design: One way is to make it
so simple that there are obviously no deficiencies and the other way is to
make it so complicated that there are no obvious deficiencies. — C.A.R.
Hoare


Re: [HACKERS] gaussian distribution pgbench

2014-07-18 Thread Fabien COELHO


Please find attached 2 patches, which are a split of the patch discussed 
in this thread.


(A) add gaussian & exponential options to pgbench \setrandom
the patch includes sql test files.

There is no change in the *code* from previous already reviewed 
submissions, so I do not think that it needs another review on that 
account.


However I have (yet again) reworked the *documentation* (for Andres Freund 
& Robert Haas), in particular both descriptions now follow the same 
structure (introduction, formula, intuition, rule of thumb and 
constraint). I have differentiated the concept and the option by putting 
the latter in  tags, and added a link to the corresponding 
wikipedia pages.



Please bear in mind that:
 1. English is not my native language.
 2. this is not easy reading... this is maths, to read slowly:-)
 3. word smithing contributions are welcome.

I assume somehow that a user interested in gaussian & exponential 
distributions must know a little bit about probabilities...




(B) add pgbench test variants with gauss & exponential.

I have reworked the patch so as to avoid copy pasting the 3 test cases, as 
requested by Andres Freund, thus this is new, although quite simple, code. 
I have also added explanations in the documentation about how to interpret 
the "decile" outputs, so as to hopefully address Robert Haas comments.


--
Fabien.

diff --git a/contrib/pgbench/README b/contrib/pgbench/README
new file mode 100644
index 000..6881256
--- /dev/null
+++ b/contrib/pgbench/README
@@ -0,0 +1,5 @@
+# gaussian and exponential tests
+# with XXX as "expo" or "gauss"
+psql test < test-init.sql
+./pgbench -M prepared -f test-XXX-run.sql -t 100 -P 1 -n test
+psql test < test-XXX-check.sql
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..a80c0a5 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include <math.h>
 #ifdef HAVE_SYS_SELECT_H
#include <sys/select.h>
 #endif
@@ -98,6 +99,8 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS	5	/* seconds between log messages */
 #define DEFAULT_NXACTS	10		/* default nxacts */
 
+#define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+
 int			nxacts = 0;			/* number of transactions per client */
 int			duration = 0;		/* duration in seconds */
 
@@ -471,6 +474,76 @@ getrand(TState *thread, int64 min, int64 max)
 	return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state));
 }
 
+/*
+ * random number generator: exponential distribution from min to max inclusive.
+ * the threshold is so that the density of probability for the last cut-off max
+ * value is exp(-exp_threshold).
+ */
+static int64
+getExponentialrand(TState *thread, int64 min, int64 max, double exp_threshold)
+{
+	double cut, uniform, rand;
+	assert(exp_threshold > 0.0);
+	cut = exp(-exp_threshold);
+	/* erand in [0, 1), uniform in (0, 1] */
+	uniform = 1.0 - pg_erand48(thread->random_state);
+	/*
+	 * inner expression in (cut, 1] (if exp_threshold > 0),
+	 * rand in [0, 1)
+	 */
+	assert((1.0 - cut) != 0.0);
+	rand = - log(cut + (1.0 - cut) * uniform) / exp_threshold;
+	/* return int64 random number within between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
+/* random number generator: gaussian distribution from min to max inclusive */
+static int64
+getGaussianrand(TState *thread, int64 min, int64 max, double stdev_threshold)
+{
+	double		stdev;
+	double		rand;
+
+	/*
+	 * Get user specified random number from this loop, with
+	 * -stdev_threshold < stdev <= stdev_threshold
+	 *
+	 * This loop is executed until the number is in the expected range.
+	 *
+	 * As the minimum threshold is 2.0, the probability of looping is low:
+	 * sqrt(-2 ln(r)) <= 2 => r >= e^{-2} ~ 0.135, then when taking the average
+	 * sine multiplier as 2/pi, we have an 8.6% looping probability in the
+	 * worst case. For a 5.0 threshold value, the looping probability
+	 * is about e^{-5} * 2 / pi ~ 0.43%.
+	 */
+	do
+	{
+		/*
+		 * pg_erand48 generates [0,1), but for the basic version of the
+		 * Box-Muller transform the two uniformly distributed random numbers
+		 * are expected in (0, 1] (see http://en.wikipedia.org/wiki/Box_muller)
+		 */
+		double rand1 = 1.0 - pg_erand48(thread->random_state);
+		double rand2 = 1.0 - pg_erand48(thread->random_state);
+
+		/* Box-Muller basic form transform */
+		double var_sqrt = sqrt(-2.0 * log(rand1));
+		stdev = var_sqrt * sin(2.0 * M_PI * rand2);
+
+		/*
+		 * we could also use cos, but a bias might be induced if the previous
+		 * value fails the test. To be on the safe side, let us retry.
+		 */
+	}
+	while (stdev < -stdev_threshold || stdev >= stdev_threshold);
+
+	/* stdev is in [-threshold, threshold), normalization to [0,1) */
+	rand = (stdev + stdev_threshold) / (stdev_threshold * 2.0);
+
+	/* return int64 random number within between min and max */
+	return min + (int64)((max - min + 1) * rand);

Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-07-18 Thread MauMau

From: "Magnus Hagander" 
On Fri, Jul 18, 2014 at 5:33 AM, Amit Kapila  
wrote:

On Thu, Jul 17, 2014 at 4:51 PM, Magnus Hagander 
wrote:


Did anyone actually test this patch? :)

I admit I did not build it on Windows specifically because I assumed
that was done as part of the development and review. And the changes
to pg_event.c can never have built, since the file does not include
the required header.


I have tested it on Windows and in fact on Linux as well to see if there
is any side impact before marking it as Ready For Committer.

It seems to me that the required header is removed in last version
(pg_ctl_eventsrc_v11) where MessageBox() related changes have been
removed from patch as per recent discussion.  Sorry for not being able
to check last version posted.


Gotcha. Thanks for clarifying, and I apologize if I came across a bit
harsh even with the smiley.


I'm sorry to have caused both of you trouble.  I have to admit that I didn't 
compile the source when I removed the MessageBox()-related changes.  The 
attached patch fixes that.  I confirmed successful build this time.


Thank you for committing, Magnus-san, and thank you very much for kind and 
repeated review and help, Amit-san.



Regards
MauMau


pgevent.patch
Description: Binary data



Re: [HACKERS] Built-in binning functions

2014-07-18 Thread Petr Jelinek

On 16/07/14 21:35, Pavel Stehule wrote:

The performance difference is about 20% (+/- few depending on the
array size), I don't know if that's bad enough to warrant
type-specific implementation. I personally don't know how to make
the generic implementation much faster than it is now, except maybe
by turning it into aggregate which would cache the deconstructed
version of the array, but that change semantics quite a bit and is
probably not all that desirable.


I am not sure if our API is enough to do it - there is no simple
support for immutable parameters.


Just to clarify, the ~20% performance difference is with a separate 
generic implementation for fixed-width types (most of the time seems to 
be spent in the FunctionCallInvoke part; I even tried to use sortsupport 
instead, but it does not seem to help much).



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] gaussian distribution pgbench

2014-07-18 Thread Fabien COELHO



For example, when we set the number of transactions to 10,000 (-t 10000),
the range of aid is 100,000,
and --exponential is 10, decile percents is under following as you know.

decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
highest/lowest percent of the range: 9.5% 0.0%

They mean that,
#number of accesses in range of aid (from decile percents):
 1 to 10,000  => 6,320 times
 10,001 to 20,000 => 2,330 times
 20,001 to 30,000 => 860 times
 ...
 90,001 to 100,000 => 0 times

#number of accesses in range of aid (from highest/lowest percent of the
range):
 1 to 1,000   => 950 times
 ...
 99,001 to 100,000 => 0 times

that's all.

This information makes the distribution of access probability easy to
understand, doesn't it?
Maybe Fabien-san and I have some knowledge of mathematics, so we think
decile percentages are common sense.
But if they aren't common sense, I agree with adding an explanation of them
to the documentation.


What we are talking about is the "summary" at the end of the run, which is 
expected to be compact, hence the terse few lines.


I'm not sure how to make it explicit without extending the summary too 
much, so it would not be a summary anymore:-)


My initial assumption is that anyone interested enough in changing the 
default uniform distribution for a test would know about decile, but that 
seems to be optimistic.


Maybe it would be okay to keep a terse summary but to expand the 
documentation to explain what it means, as you suggested above...


--
Fabien.




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-07-18 Thread Petr Jelinek

On 18/07/14 13:18, Andres Freund wrote:

On 2014-07-18 13:08:24 +0200, Petr Jelinek wrote:

On 18/07/14 00:41, Andres Freund wrote:

Wouldn't it be better to use sscanf()? That should work with large
inputs across platforms.



Well, sscanf does not do much validation and silently truncates too-big
input (which is why I replaced it with strtoull; the original patch did have
sscanf)


Well, the checks on length you've added should catch that when adapted
properly.


I don't see a portable way to validate too-big input values when 
using sscanf.




Maybe it's time to pursue
http://archives.postgresql.org/message-id/20140603144654.GM24145%40awork2.anarazel.de
further :(



Yes, I've seen that patch and was quite sad that it didn't make it in core.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] New functions in sslinfo module

2014-07-18 Thread Michael Paquier
On Fri, Jul 18, 2014 at 3:12 PM, Воронин Дмитрий
 wrote:
> I made a new version of the patch. I corrected your notes for my previous
> version of the patch. Could you test it? Thank you.

I just had a look at the new version of this patch, and this is
lacking a couple of things.

First, this patch has been developed using a tarball of 9.3.4. sslinfo
is not a module changing frequently so you are lucky that this is not
generating diffs with HEAD but you should try to get yourself familiar
with git and submit patches that are based on HEAD to facilitate their
review and integration. You can have a look here to begin with:
https://wiki.postgresql.org/wiki/Working_with_GIT

Then, here are more comments about the patch:
- Update sslinfo to 1.1. Here are all the details about how to do it:
http://www.postgresql.org/docs/devel/static/extend-extensions.html#AEN57147
Well, basically this is only creating sslinfo--1.0--1.1.sql to be able
to do an upgrade, copying sslinfo--1.0.sql to sslinfo--1.1.sql with the
new objects defined, and bumping the version in sslinfo.control.
- In your previous patches I saw that you defined the new functions in
sslinfo--1.0.sql but in the new version of your patch it is not the
case.
- Please remove all the diff noise in sslinfo.sgml, like this one:
*** 157,167 
 
  
   ssl_client_dn_field(fieldname text) returns text
-  
-   ssl_client_dn_field
-  
  
  
  
--- 157,167 
 

 
+ 
+  ssl_client_dn_field
+ 
Your patch should only include documentation for the new functions.
- Please remove whitespaces, there are quite a lot of them.
- in 9.5, PG_FUNCTION_INFO_V1 does the function declaration for you,
making the following block useless:
[...]
+Datum  ssl_is_used(PG_FUNCTION_ARGS);
+Datum  ssl_version(PG_FUNCTION_ARGS);
+Datum  ssl_cipher(PG_FUNCTION_ARGS);
[...]
- Please avoid if blocks of this type, this is not consistent with the
project style:
if (SRF_IS_FIRSTCALL()) {
[...]
}
One newline for each bracket. Here is the manual page referencing code
conventions:
http://www.postgresql.org/docs/current/static/source.html

Most of those comments have been already mentioned by Andreas in one
of his emails upthread but you have obviously not solved the issues he
has reported.

This looks like a useful feature, but at this point of the commit fest
and considering the number of structural changes still needed I will
mark this patch as "Returned with feedback". Feel free to resubmit to
the next commit fest in August with an updated patch of course!

Regards,
-- 
Michael




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-07-18 Thread Andres Freund
On 2014-07-18 13:08:24 +0200, Petr Jelinek wrote:
> On 18/07/14 00:41, Andres Freund wrote:
> >On 2014-06-27 00:51:02 +0200, Petr Jelinek wrote:
> >>{
> >>switch (c)
> >>{
> >>@@ -227,6 +229,33 @@ main(int argc, char *argv[])
> >>XLogFromFileName(optarg, &minXlogTli, 
> >> &minXlogSegNo);
> >>break;
> >>
> >>+   case 's':
> >>+   if (optarg)
> >>+   {
> >>+#ifdef HAVE_STRTOULL
> >>+   set_sysid = strtoull(optarg, &endptr, 
> >>10);
> >>+#else
> >>+   set_sysid = strtoul(optarg, &endptr, 
> >>10);
> >>+#endif
> >
> >Wouldn't it be better to use sscanf()? That should work with large
> >inputs across platforms.
> >
> 
> Well, sscanf does not do much validation and silently truncates too big
> input (which is why I replaced it with strtoull, original patch did have
> sscanf)

Well, the checks on length you've added should catch that when adapted
properly.

>, also I think the portability of sscanf might not be as good, see
> 9d7ded0f4277f5c0063eca8e871a34e2355a8371 commit.

Fair point. But I don't think the arguments why using strtoul instead of
strtoull is safe hold much water here. In the pg_stat_statement case the
worst that could happen is that the rowcount isn't accurate. Here it's
setting a wrong system identifier. Not really the same.

Maybe it's time to pursue
http://archives.postgresql.org/message-id/20140603144654.GM24145%40awork2.anarazel.de
further :(

Greetings,

Andres Freund




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-07-18 Thread Petr Jelinek

On 18/07/14 00:41, Andres Freund wrote:

On 2014-06-27 00:51:02 +0200, Petr Jelinek wrote:

{
switch (c)
{
@@ -227,6 +229,33 @@ main(int argc, char *argv[])
XLogFromFileName(optarg, &minXlogTli, 
&minXlogSegNo);
break;

+   case 's':
+   if (optarg)
+   {
+#ifdef HAVE_STRTOULL
+   set_sysid = strtoull(optarg, &endptr, 
10);
+#else
+   set_sysid = strtoul(optarg, &endptr, 
10);
+#endif


Wouldn't it be better to use sscanf()? That should work with large
inputs across platforms.



Well, sscanf does not do much validation and silently truncates too-big 
input (which is why I replaced it with strtoull; the original patch did have 
sscanf). Also, I think the portability of sscanf might not be as good; 
see commit 9d7ded0f4277f5c0063eca8e871a34e2355a8371.




+   /*
+* Validate input, we use strspn 
because strtoul(l) accepts
+* negative numbers and silently 
converts them to unsigned
+*/
+   if (set_sysid == 0 || errno != 0 ||
+   endptr == optarg || *endptr != 
'\0' ||
+   strspn(optarg, "0123456789") != 
strlen(optarg))
+   {
+   fprintf(stderr, _("%s: invalid 
argument for option -s\n"), progname);
+   fprintf(stderr, _("Try \"%s --help\" 
for more information.\n"), progname);
+   exit(1);
+   }


Maybe: 'invalid argument \"%s\" ...'?



Ok


@@ -1087,6 +1147,7 @@ usage(void)
printf(_("  -o OID   set next OID\n"));
printf(_("  -O OFFSETset next multitransaction offset\n"));
printf(_("  -V, --versionoutput version information, then exit\n"));
+   printf(_("  -s [SYSID]   set system identifier (or generate 
one)\n"));
printf(_("  -x XID   set next transaction ID\n"));
printf(_("  -?, --help   show this help, then exit\n"));
printf(_("\nReport bugs to .\n"));


I think we usually try to keep the options roughly alphabetically
ordered. Same in the getopt() above.



Ok, attached v4 with those changes.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml
index 34b0606..39ae574 100644
--- a/doc/src/sgml/ref/pg_resetxlog.sgml
+++ b/doc/src/sgml/ref/pg_resetxlog.sgml
@@ -30,6 +30,7 @@ PostgreSQL documentation
-m mxid,mxid
-O mxoff
-l xlogfile
+   -s [sysid]
datadir
   
  
@@ -184,6 +185,13 @@ PostgreSQL documentation
   
 
   
+   The -s option instructs pg_resetxlog to 
+   set the system identifier of the cluster to the specified sysid.
+   If sysid is not provided, a new one will be generated using the
+   same algorithm initdb uses.
+  
+
+  
The -V and --version options print
the pg_resetxlog version and exit.  The
options -? and --help show supported arguments,
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index 915a1ed..c94ccda 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -67,7 +67,9 @@ static MultiXactId set_mxid = 0;
 static MultiXactOffset set_mxoff = (MultiXactOffset) -1;
 static uint32 minXlogTli = 0;
 static XLogSegNo minXlogSegNo = 0;
+static uint64 set_sysid = 0;
 
+static uint64 GenerateSystemIdentifier(void);
 static bool ReadControlFile(void);
 static void GuessControlValues(void);
 static void PrintControlValues(bool guessed);
@@ -111,7 +113,7 @@ main(int argc, char *argv[])
 	}
 
 
-	while ((c = getopt(argc, argv, "fl:m:no:O:x:e:")) != -1)
+	while ((c = getopt(argc, argv, "fl:m:no:O:s::x:e:")) != -1)
 	{
 		switch (c)
 		{
@@ -227,6 +229,33 @@ main(int argc, char *argv[])
 XLogFromFileName(optarg, &minXlogTli, &minXlogSegNo);
 break;
 
+			case 's':
+if (optarg)
+{
+#ifdef HAVE_STRTOULL
+	set_sysid = strtoull(optarg, &endptr, 10);
+#else
+	set_sysid = strtoul(optarg, &endptr, 10);
+#endif
+	/*
+	 * Validate input, we use strspn because strtoul(l) accepts
+	 * negative numbers and silently converts them to unsigned
+	 */
+	if (set_sysid == 0 || errno != 0 ||
+		endptr == optarg || *endptr != '\0' ||
+		strspn(optarg, " 0123456789") != strlen(optarg))
+	{
+		fprintf(stderr, _("%s: invalid argument \"%s\" for option -s\n"), progname, optarg);
		fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);

Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2014-07-18 Thread Magnus Hagander
On Fri, Jul 18, 2014 at 5:33 AM, Amit Kapila  wrote:
> On Thu, Jul 17, 2014 at 4:51 PM, Magnus Hagander 
> wrote:
>>
>> Did anyone actually test this patch? :)
>>
>> I admit I did not build it on Windows specifically because I assumed
>> that was done as part of the development and review. And the changes
>> to pg_event.c can never have built, since the file does not include
>> the required header.
>
> I have tested it on Windows and in fact on Linux as well to see if there
> is any side impact before marking it as Ready For Committer.
>
> It seems to me that the required header is removed in last version
> (pg_ctl_eventsrc_v11) where MessageBox() related changes have been
> removed from patch as per recent discussion.  Sorry for not being able
> to check last version posted.

Gotcha. Thanks for clarifying, and I apologize if I came across a bit
harsh even with the smiley.


>> I have reverted that part of the patch for now, hopefully that'll
>> unbreak the buildfarm.
>
> Do you want me to write a patch to use DEFAULT_EVENT_SOURCE in
> pgevent?

I'm not sure it's worth it. I do like the property that pg_event
doesn't have to pull in the full set of headers. But I guess it's
quite confusing to have *one* place that doesn't use the #define. So I
guess if it does work to just pull in the required header then yes, we
should do it - but if it turns out that header causes any conflicts
with anything else we're doing in the file, then let's just skip it.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gaussian distribution pgbench

2014-07-18 Thread Mitsumasa KONDO
2014-07-18 5:13 GMT+09:00 Fabien COELHO :

>
>  However, ISTM that it is not the purpose of pgbench documentation to be a
>>> primer about what is an exponential or gaussian distribution, so the idea
>>> would yet be to have a relatively compact explanation, and that the
>>> interested but clueless reader would document h..self from wikipedia or a
>>> text book or a friend or a math teacher (who could be a friend as
>>> well:-).
>>>
>>
>> Well, I think it's a balance.  I agree that the pgbench documentation
>> shouldn't try to substitute for a text book or a math teacher, but I
>> also think that you shouldn't necessarily need to refer to a text book
>> or a math teacher in order to figure out how to use pgbench.  Saying
>> "it's complicated, so we don't have to explain it" would be a cop out;
>> we need to *make* it simple.  And if there's no way to do that, then
>> IMHO we should reject the patch in favor of some future patch that
>> implements something that will be easy for users to understand.
>>
>>   [nttcom@localhost postgresql]$ contrib/pgbench/pgbench --exponential=10
> starting vacuum...end.
> transaction type: Exponential distribution TPC-B (sort of)
> scaling factor: 1
> exponential threshold: 10.0
>
> decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
> highest/lowest percent of the range: 9.5% 0.0%
>

 I don't have a clue what that means.  None.

>>>
>>> Maybe we could add in front of the decile/percent
>>>
>>> "distribution of increasing account key values selected by pgbench:"
>>>
>>
>> I still wouldn't know what that meant.  And it misses the point
>> anyway: if the documentation is good, this will be unnecessary.  If
>> the documentation is bad, a printout that tries to illustrate it by
>> example is not an acceptable substitute.
>>
>
> The decile description is quite classic when discussing statistics.

Yeah, maybe. Fabien-san and I don't believe that he really doesn't
understand decile percentages.
However, I do think more description of deciles is needed.

For example, when we set the number of transactions to 10,000 (-t 10000),
the range of aid to 100,000,
and --exponential to 10, the decile percentages are as follows, as you know.

decile percents: 63.2% 23.3% 8.6% 3.1% 1.2% 0.4% 0.2% 0.1% 0.0% 0.0%
highest/lowest percent of the range: 9.5% 0.0%

They mean that,
#number of accesses in each range of aid (from decile percents):
  1 to 10,000   => 6,320 times
  10,001 to 20,000  => 2,330 times
  20,001 to 30,000  => 860 times
  ...
  90,001 to 100,000 => 0 times

#number of accesses in each range of aid (from highest/lowest percent of the
range):
 1 to 1,000   => 950 times
 ...
 99,001 to 100,000 => 0 times

that's all.

This information makes the distribution of access probability easy to
understand, doesn't it?
Maybe Fabien-san and I have mathematics backgrounds, so we consider decile
percentages common sense.
But if they aren't common sense, I agree with adding an explanation of them
to the documentation.

Best regards,
--
Mitsumasa KONDO