Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread Pavel Stehule
2014-06-17 8:46 GMT+02:00 xbzhang :

> One resource owner per tuple; when an error happens, only release the
> resource owner belonging to the error tuple.
> Why would some memory structures be in an undefined state? Can you give
> some examples?
>

Any non-fatal exception can be raised there. I remember writing something
similar without exception handling, and it was very unstable.

Pavel


>
>
> *From:* Pavel Stehule 
> *Date:* 2014-06-17 14:01
> *To:* xbzhang 
> *CC:* Alvaro Herrera ; pgsql-hackers
> 
> *Subject:* Re: Re: [HACKERS] How to implement the skip errors for copy
> from ?
>
>
>
> 2014-06-17 7:30 GMT+02:00 xbzhang :
>
>> Using subtransactions, the tuples that were already inserted into the heap
>> must be inserted again when an exception is raised,
>> which is too expensive.
>> My solution is:
>> 1. delete the tuple that caused the error;
>> 2. release all the resources acquired while inserting the tuple;
>> 3. continue inserting the next tuple.
>> Is it feasible?  Can anybody give me some suggestions?
>>
>
> No, that will not work: after an exception, some memory structures may be
> in an undefined state. Errors in PostgreSQL are destructive, and any error
> must be followed by a ROLLBACK.
>
> A subtransaction for every row is expensive, but a subtransaction per
> block of rows is cheap.
>
> Regards
>
> Pavel
>
>
>>
>> --
>>
>> Zhang Xiaobo (张晓博), R&D Department II
>>
>> Beijing Renda Kingbase Information Technology Co., Ltd.
>>
>> Address: Room 501, Building 4, Shangdi Science & Technology Building, Yard 8, Shangdi West Road, Haidian District, Beijing
>>
>> Postal code: 100085
>>
>> Tel: (010) 5885 1118 - 8450
>>
>> Mobile: 15311394463
>>
>> Email: xbzh...@kingbase.com.cn
>>
>>
>> *From:* Alvaro Herrera 
>> *Date:* 2014-06-17 02:37
>> *To:* Pavel Stehule 
>> *CC:* xbzhang ; pgsql-hackers
>> 
>> *Subject:* Re: [HACKERS] How to implement the skip errors for copy from ?
>> Pavel Stehule wrote:
>> > 2014-06-16 11:46 GMT+02:00 xbzhang :
>> >
>> > >
>> > > I want to implement skip-errors for copy from, like this:
>> > > create table A (c int primary key);
>> > > copy A from stdin;
>> > > 1
>> > > 1
>> > > 2
>> > > \.
>> > >
>> > > copy will fail:
>> > > ERROR: duplicate key violates primary key constraint "CC_PKEY"
>> > > CONTEXT: COPY CC, line 2: "1"
>> > >
>> > > I want to skip the error and continue to copy the rest of the tuples.
>> > > The result will be that there are two rows in table A: 1 and 2.
>> > >
>> > > How can that be implemented? Can anybody give me some suggestions?
>> >
>> > You should reimplement the copy procedure to use subtransactions. Using
>> > a subtransaction for every row is too expensive, but you can do a
>> > subtransaction per 1000 rows and, when some exception is raised, store
>> > the data of that block again with one row per subtransaction.
>>
>> See http://pgloader.io/ for a ready-made solution.
>>
>> --
>> Álvaro Herrerahttp://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Training & Services
>>
>>
>>
>>
>
>


Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread xbzhang






One resource owner per tuple; when an error happens, only release the
resource owner belonging to the error tuple. Why would some memory
structures be in an undefined state? Can you give some examples?


From: Pavel Stehule
Date: 2014-06-17 14:01
To: xbzhang
CC: Alvaro Herrera; pgsql-hackers
Subject: Re: Re: [HACKERS] How to implement the skip errors for copy from ?


2014-06-17 7:30 GMT+02:00 xbzhang :



Using subtransactions, the tuples that were already inserted into the heap
must be inserted again when an exception is raised, which is too expensive.
My solution is:
1. delete the tuple that caused the error;
2. release all the resources acquired while inserting the tuple;
3. continue inserting the next tuple.
Is it feasible?  Can anybody give me some suggestions?


No, that will not work: after an exception, some memory structures may be
in an undefined state. Errors in PostgreSQL are destructive, and any error
must be followed by a ROLLBACK.

A subtransaction for every row is expensive, but a subtransaction per
block of rows is cheap.

Regards

Pavel
 








 

From: Alvaro Herrera
Date: 2014-06-17 02:37
To: Pavel Stehule
CC: xbzhang; pgsql-hackers
Subject: Re: [HACKERS] How to implement the skip errors for copy from ?

Pavel Stehule wrote:
> 2014-06-16 11:46 GMT+02:00 xbzhang :
> 
> >
> > I want to implement skip-errors for copy from, like this:
> > create table A (c int primary key);
> > copy A from stdin;
> > 1
> > 1
> > 2
> > \.
> >
> > copy will fail:
> > ERROR: duplicate key violates primary key constraint "CC_PKEY"
> > CONTEXT: COPY CC, line 2: "1"
> >
> > I want to skip the error and continue to copy the rest of the tuples.
> > The result will be that there are two rows in table A: 1 and 2.
> >
> > How can that be implemented? Can anybody give me some suggestions?
> 
> You should reimplement the copy procedure to use subtransactions. Using
> a subtransaction for every row is too expensive, but you can do a
> subtransaction per 1000 rows and, when some exception is raised, store
> the data of that block again with one row per subtransaction.
 
See http://pgloader.io/ for a ready-made solution.
 
-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
 
 
 
 







Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread Pavel Stehule
2014-06-17 7:30 GMT+02:00 xbzhang :

> Using subtransactions, the tuples that were already inserted into the heap
> must be inserted again when an exception is raised,
> which is too expensive.
> My solution is:
> 1. delete the tuple that caused the error;
> 2. release all the resources acquired while inserting the tuple;
> 3. continue inserting the next tuple.
> Is it feasible?  Can anybody give me some suggestions?
>

No, that will not work: after an exception, some memory structures may be
in an undefined state. Errors in PostgreSQL are destructive, and any error
must be followed by a ROLLBACK.

A subtransaction for every row is expensive, but a subtransaction per
block of rows is cheap.

Regards

Pavel
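The batch-then-retry strategy described above can be sketched outside PostgreSQL. The following is a minimal Python simulation under stated assumptions (it is not real COPY code; `copy_skip_errors` and `unique_check` are invented names, and a list of committed rows stands in for the heap): a whole batch is staged as one "subtransaction", and only when it fails is that batch replayed one row per subtransaction, so that only the offending rows are skipped.

```python
def copy_skip_errors(rows, check, batch_size=1000):
    """Load rows while skipping bad ones, using the batch-then-retry
    strategy: try a whole batch under one subtransaction; if any row
    fails, discard the staged batch (the rollback) and replay it one
    row per subtransaction, so only the offending rows are skipped."""
    committed, skipped = [], []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        try:
            staged = []
            for row in batch:
                check(row, committed + staged)  # may raise, aborting the batch
                staged.append(row)
            committed.extend(staged)            # batch subtransaction commits
        except ValueError:
            for row in batch:                   # replay: one subtransaction per row
                try:
                    check(row, committed)
                    committed.append(row)
                except ValueError:
                    skipped.append(row)
    return committed, skipped

def unique_check(row, table):
    """Toy stand-in for a primary key constraint."""
    if row in table:
        raise ValueError("duplicate key value violates unique constraint")

# The thread's example: COPY of 1, 1, 2 should load 1 and 2, skipping the dup.
print(copy_skip_errors([1, 1, 2], unique_check))  # → ([1, 2], [1])
```

With a large batch size the common, error-free case pays for only one subtransaction per batch; the per-row cost is incurred only for batches that actually contain a bad row.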


>
> --
>
>
>
> *From:* Alvaro Herrera 
> *Date:* 2014-06-17 02:37
> *To:* Pavel Stehule 
> *CC:* xbzhang ; pgsql-hackers
> 
> *Subject:* Re: [HACKERS] How to implement the skip errors for copy from ?
> Pavel Stehule wrote:
> > 2014-06-16 11:46 GMT+02:00 xbzhang :
> >
> > >
> > > I want to implement skip-errors for copy from, like this:
> > > create table A (c int primary key);
> > > copy A from stdin;
> > > 1
> > > 1
> > > 2
> > > \.
> > >
> > > copy will fail:
> > > ERROR: duplicate key violates primary key constraint "CC_PKEY"
> > > CONTEXT: COPY CC, line 2: "1"
> > >
> > > I want to skip the error and continue to copy the rest of the tuples.
> > > The result will be that there are two rows in table A: 1 and 2.
> > >
> > > How can that be implemented? Can anybody give me some suggestions?
> >
> > You should reimplement the copy procedure to use subtransactions. Using
> > a subtransaction for every row is too expensive, but you can do a
> > subtransaction per 1000 rows and, when some exception is raised, store
> > the data of that block again with one row per subtransaction.
>
> See http://pgloader.io/ for a ready-made solution.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
>
>
>
>
>


Re: [HACKERS] rm_desc signature

2014-06-16 Thread Heikki Linnakangas

On 06/17/2014 04:19 AM, Jeff Janes wrote:

This commit, or a related one, changed the default (i.e. commented out)
nature of:

#define WAL_DEBUG


Oops. Fixed, thanks!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread xbzhang

Using subtransactions, the tuples that were already inserted into the heap
must be inserted again when an exception is raised, which is too expensive.
My solution is:
1. delete the tuple that caused the error;
2. release all the resources acquired while inserting the tuple;
3. continue inserting the next tuple.
Is it feasible?  Can anybody give me some suggestions?




From: Alvaro Herrera
Date: 2014-06-17 02:37
To: Pavel Stehule
CC: xbzhang; pgsql-hackers
Subject: Re: [HACKERS] How to implement the skip errors for copy from ?

Pavel Stehule wrote:
> 2014-06-16 11:46 GMT+02:00 xbzhang :
> 
> >
> > I want to implement skip-errors for copy from, like this:
> > create table A (c int primary key);
> > copy A from stdin;
> > 1
> > 1
> > 2
> > \.
> >
> > copy will fail:
> > ERROR: duplicate key violates primary key constraint "CC_PKEY"
> > CONTEXT: COPY CC, line 2: "1"
> >
> > I want to skip the error and continue to copy the rest of the tuples.
> > The result will be that there are two rows in table A: 1 and 2.
> >
> > How can that be implemented? Can anybody give me some suggestions?
> 
> You should reimplement the copy procedure to use subtransactions. Using
> a subtransaction for every row is too expensive, but you can do a
> subtransaction per 1000 rows and, when some exception is raised, store
> the data of that block again with one row per subtransaction.
 
See http://pgloader.io/ for a ready-made solution.
 
-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
 
 
 
 



Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-16 Thread Shigeru Hanada
Fujii-san,

I agree not to backpatch, but I noticed that the 9.3 document about
stats collector doesn't mention $PGDATA/pg_stat.

http://www.postgresql.org/docs/current/static/monitoring-stats.html

It just says:
> When the server shuts down, a permanent copy of the statistics data is stored 
> in the global subdirectory, so that statistics can be retained across server 
> restarts.

I'm not sure whether we should modify the 9.3 document at the moment,
but the description actually confused me a bit.


2014-05-29 12:22 GMT+09:00 Fujii Masao :
> On Thu, May 29, 2014 at 4:55 AM, Tomas Vondra  wrote:
>> On 28.5.2014 19:52, Fujii Masao wrote:
>>> On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan  wrote:
>>>> On Wed, May 28, 2014 at 7:01 AM, Fujii Masao  wrote:
>>>>> But pg_stat_statements file is saved under $PGDATA/global yet.
>>>>> Is this intentional or just oversight?
>>>>
>>>> I think it's an oversight.
>>>
>>> OK, patch attached.
>>>
>>> I'm afraid that it's not okay to change the file layout in $PGDATA at this
>>> beta1 stage, because that change basically requires initdb. Otherwise
>>> something like a "no such file or directory" error can happen. But in this
>>> case all we need to change is the location of the pg_stat_statements
>>> permanent stats file. So, without initdb, the server will not be able to
>>> find the pg_stat_statements stats file, but this is not so harmful. The
>>> only problem is that the pg_stat_statements stats which were collected in
>>> the past would disappear. OTOH, the server can keep running successfully
>>> from then on, and no critical data will disappear. Therefore I think we
>>> can commit this patch even at beta1. Thoughts?
>>
>> For HEAD, probably yes. But what about backpatching 9.3?
>
> I think No. So we should not backpatch this to 9.3.
>
> Regards,
>
> --
> Fujii Masao
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Shigeru HANADA




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-16 Thread Abhijit Menon-Sen
At 2014-06-17 09:49:35 +0530, amit.kapil...@gmail.com wrote:
>
> By the above, do you mean that the patch should allow GUC_NOT_IN_SAMPLE,
> or something else? If the former, I have removed it as per the comment
> from Fujii-san.

Yes, what you've done in v3 of the patch is what I meant. I've marked
this as ready for committer now.

-- Abhijit




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Peter Geoghegan
On Mon, Jun 16, 2014 at 8:56 PM, Tom Lane  wrote:
> Not having looked at the patch, but: I think the probability of
> useless-noise HINTs could be substantially reduced if the code prints a
> HINT only when there is a single available alternative that is clearly
> better than the others in Levenshtein distance.  I'm not sure how much
> better is "clearly better", but I exclude "zero" from that.  I see that
> the original description of the patch says that it will arbitrarily
> choose one alternative when there are several with equal Levenshtein
> distance, and I'd say that's a bad idea.

I disagree. I happen to think that making some guess is better than no
guess at all here, given the fact that there aren't too many
possibilities to choose from. I think that it might be particularly
annoying to not show some suggestion in the event of a would-be
ambiguous column reference where the column name is itself wrong,
since both mistakes are common. For example, "order_id" was specified
instead of one of either "o.orderid" or "ol.orderid", as in my
original examples. If some correct alias was specified, that would
make the new code prefer the appropriate Var, but it might not be, and
that should be okay in my view.

I'm not trying to remove the need for human judgement here. We've all
heard stories about people who did things like input "Portland" into
their GPS only to end up in Maine rather than Oregon, but I think in
general you can only go so far in worrying about those cases.

-- 
Peter Geoghegan




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Peter Geoghegan
On Mon, Jun 16, 2014 at 8:38 PM, Ian Barwick  wrote:
> I mean, does it come up with a suggestion in every case, even if there is
> no remotely similar column? E.g. would
>
>   SELECT foo FROM some_table
>
> bring up column "bar" as a suggestion if "bar" is the only column in
> the table?

Yes, it would, but I think that's the correct behavior.

> Anyway, is there an up-to-date version of the patch available? The one from
> March doesn't seem to apply cleanly to HEAD.

Are you sure? I think it might just be that the patch tool is confused about
the deleted file contrib/fuzzystrmatch/levenshtein.c.

-- 
Peter Geoghegan




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-16 Thread Amit Kapila
On Mon, Jun 16, 2014 at 7:06 PM, Fujii Masao  wrote:
> > Developer options are mainly for debugging information or might help in
> > one of the situations, so I thought somebody might not want them to be
> > part of server configuration once they are set.  We already disallow
> > parameters like config_file, transaction_isolation, etc. which are
> > disallowed to be set in postgresql.conf.  Could you please explain to me
> > in which situations/scenarios you think that allowing developer options
> > via ALTER SYSTEM can be helpful?
>
> I think that's helpful. I've heard that some users enable the developer
> option "trace_sort" in postgresql.conf in order to collect information
> about sorting. They might want to set trace_sort via ALTER SYSTEM, for
> example.

Okay, if there are use cases where people use such parameters in
postgresql.conf, then it makes sense.  I have removed that restriction
from the patch.

> I suggest the following wording:
>
> This command may not be used to set
> 
> or any parameters that are not allowed in postgresql.conf.

I think we should not use *may* in this line, because under no
circumstances will we allow this command for the parameters mentioned.
I have changed it as per your suggestion.

> > + /*
> > +  * Disallow parameter's that are excluded or disallowed in
> > +  * postgresql.conf.
> > +  */
>
> "parameters", no apostrophe.
Okay. I have also changed the above comment, since the earlier version
mentioned developer options, which we no longer need.

> >   if ((record->context == PGC_INTERNAL) ||
> > - (record->flags & GUC_DISALLOW_IN_FILE))
> > - ereport(ERROR,
> > - (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
> > -  errmsg("parameter \"%s\" cannot be changed",
> > - name)));
> > + (record->flags & GUC_DISALLOW_IN_FILE) ||
> > + (record->flags & GUC_DISALLOW_IN_AUTO_FILE) ||
> > + (record->flags & GUC_NOT_IN_SAMPLE))
> > + ereport(ERROR,
> > + (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
> > +  errmsg("parameter \"%s\" cannot be changed",
> > + name)));
>
> I looked at the settings that are marked GUC_NOT_IN_SAMPLE but neither
> PGC_INTERNAL nor GUC_DISALLOW_IN_*FILE. I don't feel strongly about it,
> but I don't see any particularly good reason to exclude them here.

By the above, do you mean that the patch should allow GUC_NOT_IN_SAMPLE,
or something else? If the former, I have removed it as per the comment
from Fujii-san.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


prohibit_data_dir_by_alter_system-v3.patch
Description: Binary data



Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Tom Lane
Peter Geoghegan  writes:
> On Mon, Jun 16, 2014 at 7:09 PM, Ian Barwick  wrote:
>> However in this particular use case, as long as it doesn't produce false
>> positives (I haven't looked at the patch) I don't think it would cause
>> any problems (of the kind which would require actively excluding certain
>> languages/character sets), it just wouldn't be quite as useful.

> I'm not sure what you mean by false positives. The patch just shows a
> HINT, where before there was none. It's possible for any number of
> reasons that it isn't the most useful possible suggestion, since
> Levenshtein distance is used as opposed to any other scheme that might
> be better sometimes. I think that the hint given is a generally useful
> piece of information in the event of an ERRCODE_UNDEFINED_COLUMN
> error. Obviously I think the patch is worthwhile, but fundamentally
> the HINT given is just a guess, as with the existing HINTs.

Not having looked at the patch, but: I think the probability of
useless-noise HINTs could be substantially reduced if the code prints a
HINT only when there is a single available alternative that is clearly
better than the others in Levenshtein distance.  I'm not sure how much
better is "clearly better", but I exclude "zero" from that.  I see that
the original description of the patch says that it will arbitrarily
choose one alternative when there are several with equal Levenshtein
distance, and I'd say that's a bad idea.

You could possibly answer this objection by making the HINT list *all*
the alternatives meeting the minimum Levenshtein distance.  But I think
that's probably overcomplicated and of uncertain value anyhow.  I'd rather
have a rule that "we print only the choice that is at least K units better
than any other choice", where K remains to be determined exactly.
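For illustration, the rule proposed here can be sketched in Python (a hypothetical stand-alone sketch, not the patch's actual C code; `suggest` and its parameter `k` are invented names): compute the Levenshtein distance from the misspelled name to every candidate column, and print a hint only when the best candidate beats the runner-up by at least K units.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance: insertions, deletions,
    and substitutions each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

def suggest(name, candidates, k=2):
    """Hint only when the closest candidate beats every other candidate
    by at least k units of edit distance; otherwise stay silent."""
    scored = sorted((levenshtein(name, c), c) for c in candidates)
    if len(scored) == 1 or scored[1][0] - scored[0][0] >= k:
        return scored[0][1]
    return None

print(suggest("order_id", ["orderid", "orderdate"]))  # → orderid
print(suggest("xyz", ["abc", "abd"], k=1))            # → None (ambiguous)
```

Under this rule, ties (the case Tom objects to) and near-ties produce no hint at all, at the cost of staying silent in Peter's ambiguous-column example.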

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Ian Barwick
On 14/06/17 11:57, Peter Geoghegan wrote:
> On Mon, Jun 16, 2014 at 7:09 PM, Ian Barwick  wrote:
>> However in this particular use case, as long as it doesn't produce false
>> positives (I haven't looked at the patch) I don't think it would cause
>> any problems (of the kind which would require actively excluding certain
>> languages/character sets), it just wouldn't be quite as useful.
> 
> I'm not sure what you mean by false positives. The patch just shows a
> HINT, where before there was none. It's possible for any number of
> reasons that it isn't the most useful possible suggestion, since
> Levenshtein distance is used as opposed to any other scheme that might
> be better sometimes. I think that the hint given is a generally useful
> piece of information in the event of an ERRCODE_UNDEFINED_COLUMN
> error. Obviously I think the patch is worthwhile, but fundamentally
> the HINT given is just a guess, as with the existing HINTs.

I mean, does it come up with a suggestion in every case, even if there is
no remotely similar column? E.g. would

  SELECT foo FROM some_table

bring up column "bar" as a suggestion if "bar" is the only column in
the table?

Anyway, is there an up-to-date version of the patch available? The one from
March doesn't seem to apply cleanly to HEAD.


Thanks

Ian Barwick


-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Peter Geoghegan
On Mon, Jun 16, 2014 at 7:09 PM, Ian Barwick  wrote:
> However in this particular use case, as long as it doesn't produce false
> positives (I haven't looked at the patch) I don't think it would cause
> any problems (of the kind which would require actively excluding certain
> languages/character sets), it just wouldn't be quite as useful.

I'm not sure what you mean by false positives. The patch just shows a
HINT, where before there was none. It's possible for any number of
reasons that it isn't the most useful possible suggestion, since
Levenshtein distance is used as opposed to any other scheme that might
be better sometimes. I think that the hint given is a generally useful
piece of information in the event of an ERRCODE_UNDEFINED_COLUMN
error. Obviously I think the patch is worthwhile, but fundamentally
the HINT given is just a guess, as with the existing HINTs.


-- 
Peter Geoghegan




Re: [HACKERS] Audit of logout

2014-06-16 Thread Abhijit Menon-Sen
At 2014-06-13 10:29:24 -0400, t...@sss.pgh.pa.us wrote:
>
> I wonder whether we should just get rid of log_disconnections as a
> separate variable, instead logging disconnections when log_connections
> is set.

I like that idea.

-- Abhijit




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-16 Thread Noah Misch
On Sat, Jun 14, 2014 at 10:37:36AM -0400, Tom Lane wrote:
> After giving somebody advice, for the Nth time, to install a
> memory-consumption ulimit instead of leaving his database to the tender
> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
> provide a built-in feature for that, comparable to the "ulimit -c max"
> option that already exists in pg_ctl.

In principle, I would like to have such a feature.  The only limit I've found
reliable on the occasions I wanted this was RLIMIT_AS; RLIMIT_DATA has been
ineffective on Linux+GNU libc.  RLIMIT_AS has its own problems, of course:
address space usage is only tenuously connected to the definition of "memory
consumption" folks actually want.  Worse, one can often construct a query to
crash an RLIMIT_AS-affected backend.  Make the query use heap space until just
shy of the limit, then burn stack space until RLIMIT_AS halts stack growth.

I would welcome a feature for configuring each RLIMIT_* or some selection
thereof.  Then it's up to the administrator to know the (possibly
platform-specific) benefits and risks of each limit.  I don't think a high
level "limit memory consumption" feature is within reach, though.
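As a concrete illustration of configuring one such limit, here is a sketch using Python's resource module rather than backend C code (the function name is invented; the mechanics mirror what a backend could do once at startup if told to by a GUC):

```python
import resource

def cap_address_space(max_bytes):
    """Apply an RLIMIT_AS cap to the current process (clamped to the hard
    limit) and return the previous (soft, hard) pair for later restore."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # An unprivileged process may not raise soft above hard.
    new_soft = max_bytes if hard == resource.RLIM_INFINITY else min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    return soft, hard

# Example: cap this process's address space at 4 GiB.
previous = cap_address_space(4 * 1024 ** 3)
```

Once the cap is reached, allocations fail outright (malloc returns NULL), which a server can report as an ordinary "out of memory" error instead of attracting the OOM killer; as noted above, stack growth can still run into the same limit and crash the process.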

> A reasonably low-overhead way
> to do that would be to define it as something a backend process sets
> once at startup, if told to by a GUC.  The GUC could possibly be
> PGC_BACKEND level though I'm not sure if we want unprivileged users
> messing with it.

Letting unprivileged users raise the limit is somewhat like allowing them to
raise work_mem.  On the other hand, while one can craft queries to consume
arbitrary multiples of work_mem, the combination of RLIMIT_AS and CONNECTION
LIMIT should be effective to cap a user's memory consumption.  Overall,
PGC_SUSET sounds good to me, for the reasons Craig gave.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Ian Barwick
On 14/06/17 9:53, Tom Lane wrote:
> Michael Paquier  writes:
>> On Tue, Jun 17, 2014 at 9:30 AM, Ian Barwick  wrote:
>>> From what I've seen in the wild in Japan, Roman/ASCII characters are
>>> widely used for object/attribute names, as generally it's much less
>>> hassle than switching between input methods, dealing with different
>>> encodings etc. The only place where I've seen Japanese characters widely
>>> used is in tutorials, examples etc. However that's only my personal
>>> observation for one particular non-Roman language.
> 
>> And I agree with this remark; it's a PITA to manage database object
>> names with Japanese characters directly. I have seen some applications
>> defining objects that way in the past, though not *that* many, I
>> concur.
> 
> What exactly is the rationale for thinking that Levenshtein distance is
> useless in non-Roman alphabets?  AFAIK it just counts insertions and
> deletions of characters, which seems like a concept rather independent
> of what those characters are.

With Japanese (which doesn't have an alphabet, but two syllabaries and
a bunch of logographic characters), Levenshtein distance is pretty useless
for examining similarities with words which can be written in either
syllabary (Michael's "ramen" example earlier in the thread); and when
catching "typos" caused by erroneous conversion from phonetic input to
characters - e.g. intending to input "成長" (seichou, growth) but
accidentally selecting "清聴" (seichou, courteous attention).

However in this particular use case, as long as it doesn't produce false
positives (I haven't looked at the patch) I don't think it would cause
any problems (of the kind which would require actively excluding certain
languages/character sets); it just wouldn't be quite as useful.


Regards

Ian Barwick

-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-16 Thread Craig Ringer
On 06/17/2014 02:02 AM, Shreesha wrote:
> But I believe, I am looking forward for the exact opposite of it. In
> other words, a possible work around for a root user to execute certain
> executable(s) as an unprivileged user. 
> 

So you want the su or sudo commands?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] btreecheck extension

2014-06-16 Thread Peter Geoghegan
As discussed at the developer meeting at pgCon, I think that there is
a lot to be said for a tool that checks nbtree index invariants on
live systems.

Attached prototype patch adds contrib extension, btreecheck. This
extension provides SQL-callable functions for checking these
conditions on nbtree indexes on live systems. This code generally
works by using the same insertion ScanKey infrastructure that the
B-Tree code itself uses in servicing all kinds of index scans.

There are two major goals for this patch. In order of their importance
in my mind:

1. Improve testing around the B-Tree code. Give 9.4 beta testers the
tools needed to detect subtle errors in a semi-automated way. I'm
quite encouraged by recent efforts along these lines by Jeff Janes,
Heikki, and others. Beyond 9.4, I think that stress testing is likely
to become indispensable as more complex performance optimizations are
pursued. Personally, I see some good opportunities to improve the
performance of the B-Tree code, and if those ideas are ever to be put
into action, there is clearly a big need for automated stress-testing.
I see this tool as a start. I'm not just talking about my recent work
on sort support for text, and the idea of generalizing that; I think
that there are many techniques that we could pursue for decreasing the
I/O and CPU overhead of using B-Tree indexes that are just too risky
to pursue today because of the subtle interactions of many moving
parts. I'd like to lower the risk.

2. Allow users to sanity check indexes in production. This is not my
immediate goal here, though; I will not add this patch to the next
CommitFest. That's something to think about in more detail a bit
later. Clearly there is a big demand for this kind of thing, though.

Since I believe the B-Tree code changes of 9.4 are generally thought
to be the most plausible source of serious bugs in 9.4 purely because
of their subtlety and fundamental importance, it seemed valuable to
get this prototype into the hands of beta testers as soon as possible.
For that reason, I have not given some aspects of the design as much
thought as I would have preferred. I haven't really come up with a
plausible plan of attack for finding bugs in the B-Tree code. Rather,
I've more or less just considered what invariant conditions exist that
can be correctly verified without too much effort, and then
implemented checks for each. I've also stuck to a design that only has
the tool share lock a single B-Tree buffer at a time, and copy the
contents into local memory before operating on that (relation level
heavyweight locks are used in various different ways too), in a
similar manner to contrib/pageinspect. Copying the entire page into
local memory is probably the right thing to do even from a performance
perspective, since in each case the entire page is inspected, which
takes much longer than, say, a binary search. Even though it's
supposed to be possible to run this on a live system without causing
too much disruption, subtle buffer locking protocols were avoided
(although there is a rationale for why things are okay in the module,
that considers current buffer locking protocols particular to the
B-Tree code). This is a principle that I think I can stick to, and
would certainly prefer to stick to for obvious reasons.

The checks performed (the entire set of invariant conditions checked
by all SQL-callable functions in the patch) are currently:

* That data items on each B-Tree page are in logical order.

* That non-rightmost pages (which are the only pages that have a high
key) have only data items that respect that high key as an upper
bound.

* That a down-link data item in a parent page points to a child page
where the parent down-link data item is respected as a lower bound.

* That a down-link data item in a parent page points to a child page
whose high key (if any) is equal to the next data item on the parent
page (if any), while taking into account incomplete page splits and
right-most-ness.

* That a down-link data item in a parent page points to a child page
whose right link (if any) points to the same block/page as pointed to
by the next data item on the parent page (if any). In other words, the
children and their parent are in agreement about the children being
siblings, while taking into account incomplete page splits and
right-most-ness. Put another way, the second phase of a page
split, where a downlink is inserted into the parent, was sane (so this
is similar to the previous invariant check listed).

The two prior checks may not be sufficient, though, if we want to
operate with weak, non-disruptive locks -- what about cousins and
grandparents, and even great-grandparents and second cousins? It's a
good start, though.

* That on each level, starting at the true root level and working down
to the leaf level (level 0), there is mutual agreement between
siblings that they are siblings (that is, their right and left links
comport).
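As a rough illustration of the first two per-page invariants above --
items in logical order, and the high key as an upper bound on
non-rightmost pages -- here is a hedged Python sketch. This is not the
tool's actual C code (which operates on a copied page image and
compares keys with the index's scankey machinery); it only shows the
shape of the checks, with plain comparable keys standing in for index
tuples:

```python
def check_page(items, high_key=None):
    """Verify two per-page B-Tree invariants on a (copied) page.

    items: index keys in physical item order.
    high_key: upper bound for non-rightmost pages, or None for the
    rightmost page on its level (which has no high key).
    """
    # Invariant 1: data items must be in logical (sorted) order.
    for prev, cur in zip(items, items[1:]):
        if prev > cur:
            raise AssertionError("items out of order: %r > %r" % (prev, cur))
    # Invariant 2: every item must respect the high key as an upper bound.
    if high_key is not None:
        for it in items:
            if it > high_key:
                raise AssertionError("item %r exceeds high key %r" % (it, high_key))
    return True
```

The cross-page checks (down-links, sibling links) follow the same
pattern, but must additionally cope with concurrent splits, which is
where the weak-locking caveats above come in.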

I won't go into locking pr

Re: [HACKERS] rm_desc signature

2014-06-16 Thread Jeff Janes
On Friday, June 13, 2014, Heikki Linnakangas 
wrote:

> As part of the WAL-format changing patch I've been working on, I changed
> the signature of the rm_desc function from:
>
> void (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
>
> to:
>
> void (*rm_desc) (StringInfo buf, XLogRecord *record);
>
> The WAL-format patch needed that because it added more functions/macros
> for working with XLogRecords, also used by rm_desc routines, but it seems
> like a sensible change anyway. IMHO it was always a bit strange that
> rm_desc was passed the info field and record payload separately.
>
> So I propose to do that change as a separate commit. Per attached. This
> has no functional changes, it's just refactoring.
>
> Any objections?
>

This commit, or a related one, changed the default (i.e. commented out)
nature of:

#define WAL_DEBUG

Cheers,

Jeff


Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Tom Lane
Michael Paquier  writes:
> On Tue, Jun 17, 2014 at 9:30 AM, Ian Barwick  wrote:
>> From what I've seen in the wild in Japan, Roman/ASCII characters are
>> widely used for object/attribute names, as generally it's much less
>> hassle than switching between input methods, dealing with different
>> encodings etc. The only place where I've seen Japanese characters widely
>> used is in tutorials, examples etc. However that's only my personal
>> observation for one particular non-Roman language.

> And I agree with this remark; it's a PITA to manage database object
> names with Japanese characters directly. I have seen some
> applications defining objects that way in the past, though not
> *that* many, I concur.

What exactly is the rationale for thinking that Levenshtein distance is
useless in non-Roman alphabets?  AFAIK it just counts insertions and
deletions of characters, which seems like a concept rather independent
of what those characters are.
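[For reference, a minimal Levenshtein implementation makes the
character-set independence concrete: the only operation on characters
is an equality test, so insertions, deletions, and substitutions are
counted identically for Kanji and ASCII. This is an illustrative
sketch, not PostgreSQL's implementation:]

```python
def levenshtein(a, b):
    # Classic dynamic-programming formulation; prev[j] holds the
    # distance between the first i-1 chars of a and first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```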

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Michael Paquier
On Tue, Jun 17, 2014 at 9:30 AM, Ian Barwick  wrote:
> From what I've seen in the wild in Japan, Roman/ASCII characters are
> widely used for object/attribute names, as generally it's much less
> hassle than switching between input methods, dealing with different
> encodings etc. The only place where I've seen Japanese characters widely
> used is in tutorials, examples etc. However that's only my personal
> observation for one particular non-Roman language.
And I agree with this remark; it's a PITA to manage database object
names with Japanese characters directly. I have seen some
applications defining objects that way in the past, though not
*that* many, I concur.
-- 
Michael




Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-16 Thread Tom Lane
Gurjeet Singh  writes:
> Please find attached the patch. It includes the doc changes as well.

What exactly is the point of the static state you added here?  There
is no situation where that could possibly be useful, because this
code is executed at most once per process, and not at all in the
postmaster.  AFAICS, that's just complication (and permanent waste
of a kilobyte of static data space) for no return.

Also, this seems to try to write the file whether or not the environment
variable was set, which wasn't the plan.

I also don't really see the point of parsing the value variable into an
int.  Why not just spit it out as the original string?  It's not ours to
legislate whether the kernel will take a value that's not an integer.
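[A hedged sketch of the behavior described here: write the environment
variable's raw string to the new oom_score_adj file, only when the
variable is actually set, with no integer parsing. The variable name
PG_OOM_ADJUST_VALUE and the helper are illustrative, not necessarily
what the patch uses:]

```python
import os

def adjust_oom_score(env_var="PG_OOM_ADJUST_VALUE",
                     path="/proc/self/oom_score_adj"):
    """Write the env var's value verbatim; let the kernel validate it."""
    value = os.environ.get(env_var)
    if value is None:
        return None              # variable not set: do nothing at all
    try:
        with open(path, "w") as f:
            f.write(value)       # no int parsing: pass the string through
    except OSError:
        pass                     # e.g. insufficient privilege; non-fatal
    return value
```

Done this way there is no static state to keep, and a non-numeric
value is simply rejected (or not) by the kernel itself.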

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Ian Barwick
On 14/06/17 8:31, Peter Geoghegan wrote:
> On Mon, Jun 16, 2014 at 4:04 PM, Josh Berkus  wrote:
>> Question: How should we handle the issues with East Asian languages
>> (i.e. Japanese, Chinese) and this Hint?  Should we just avoid hinting
>> for a selected list of languages which don't work well with levenshtein?
>>  If so, how do we get that list?
> 
> I think that how useful Levenshtein distance is for users based in
> east Asia generally, and how useful this patch is to those users are
> two distinct questions. I have no idea how common it is for Japanese
> users to just use Roman characters as table and attribute names. Since
> they're very probably already writing application code that uses Roman
> characters (except in the comments, user strings and so on), it might
> make sense to do the same in the database. I would welcome further
> input on that question. I don't know what the trends are in the real
> world.

From what I've seen in the wild in Japan, Roman/ASCII characters are
widely used for object/attribute names, as generally it's much less
hassle than switching between input methods, dealing with different
encodings etc. The only place where I've seen Japanese characters widely
used is in tutorials, examples etc. However that's only my personal
observation for one particular non-Roman language.


Regards

Ian Barwick

-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PG Manual: Clarifying the repeatable read isolation example

2014-06-16 Thread Bruce Momjian
On Sun, Jun  8, 2014 at 09:33:04AM +1200, Gavin Flower wrote:
> I know that I first look at the docs & seldom look at the Wiki - in
> fact it was only recently that I became aware of the Wiki, and it is
> still not the first thing I think of when I want to know something,
> and I often forget it exists.  I suspect many people are like me in
> this!
> 
> Also the docs have a more authoritative air, and probably
> automatically assumed to be more up-to-date and relevant to the
> version of Postgres used.
> 
> So I suggest that the docs should have an appropriate coverage of
> such topics, possibly mostly in an appendix with brief references in
> affected parts of the main docs) if it does not quite fit into the
> rest of the documentation (affects many different features, so no
> one place in the main docs is appropriate - or too detailed, or too
> much).  Also links to the Wiki, and to the more academic papers,
> could be provided for the really keen.

You can link to the wiki from our docs.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-16 Thread zhouqq.postg...@gmail.com
I like this feature but am wondering how to use it. If it uses one value
across all backends, we may have to set it conservatively to avoid the OOM
killer, but that does not promote resource sharing. If we set it per
backend, what's the suggested value? One way is recommending that users sort
their queries into a small-q group and a big-q group, where big-q
connections set a higher ulimit w.r.t. work_mem. But this too has limited
mileage.

An ideal way is PGC_SIGHUP, which implies that all server processes added up
shall respect this setting, and that it is adjustable. I am not sure how to
implement that, as setrlimit() does not seem to support process groups (and
what about Windows?). Even if it did, a small issue is that this might
increase the chance we hit OOM in some inconvenient places. For example, here:

/* Special case for startup: use good ol' malloc */ 
node = (MemoryContext) malloc(needed); 
Assert(node != NULL); 

I wonder how far we want to go along this line. Consider this case: with some
concurrent big-q and med-q queries, the system may comfortably allow 1 big-q
to run with 2 or 3 med-qs soaking up the left-over memory. With query
throttling, we could hopefully make all queries run to completion without
mid-execution failures, with ulimit guarding the bottom line if anything goes
wrong. This may lead to a discussion of more complete workload management
support.

Regards,
Qingqing

From: Tom Lane
Date: 2014-06-14 22:37
To: pgsql-hackers
Subject: [HACKERS] Built-in support for a memory consumption ulimit?
After giving somebody advice, for the Nth time, to install a
memory-consumption ulimit instead of leaving his database to the tender
mercies of the Linux OOM killer, it occurred to me to wonder why we don't
provide a built-in feature for that, comparable to the "ulimit -c max"
option that already exists in pg_ctl.  A reasonably low-overhead way
to do that would be to define it as something a backend process sets
once at startup, if told to by a GUC.  The GUC could possibly be
PGC_BACKEND level though I'm not sure if we want unprivileged users
messing with it.
 
Thoughts?
 
regards, tom lane
 
 


Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-16 Thread Tom Lane
Robert Haas  writes:
> On further review, this is definitely the way to go: it's a
> straight-up win.  The isnull array is never more than one element in
> length, so testing the single element is quite trivial.   The
> attached, revised patch provides a modest but useful speedup for both
> hash and btree index builds.

> Anyone see any reason NOT to do this?  So far it looks like a
> slam-dunk from here.

If index_form_tuple leaks any memory in the sort context, this would be
not so much a slam-dunk win as a complete disaster.  I'm not sure that
no-leak is a safe assumption, especially in cases where we do toast
compression of the index datums.  (In fact, I'm pretty sure that the
reason it was coded like this originally was exactly to avoid that
assumption.)

Assuming that you inspect the code and determine it's safe, the patch
had better decorate index_form_tuple with dire warnings about not leaking
memory; because even if it's safe today, somebody might break it tomorrow.

On a micro-optimization level, it might be worth passing the TID as
ItemPointer not ItemPointerData (ie, pass a pointer until we get to
the point of actually inserting the TID into the index tuple).
I'm not sure that copying odd-size structs should be assumed to be
efficient.

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Peter Geoghegan
On Mon, Jun 16, 2014 at 4:04 PM, Josh Berkus  wrote:
> Question: How should we handle the issues with East Asian languages
> (i.e. Japanese, Chinese) and this Hint?  Should we just avoid hinting
> for a selected list of languages which don't work well with levenshtein?
>  If so, how do we get that list?

I think that how useful Levenshtein distance is for users based in
east Asia generally, and how useful this patch is to those users are
two distinct questions. I have no idea how common it is for Japanese
users to just use Roman characters as table and attribute names. Since
they're very probably already writing application code that uses Roman
characters (except in the comments, user strings and so on), it might
make sense to do the same in the database. I would welcome further
input on that question. I don't know what the trends are in the real
world.

Also note that the patch scans the range table parse state to pick the
most probable candidate among all Vars/columns that already appear
there. The query would raise an error at an earlier point if a
non-existent relation was referenced, for example. We're only choosing
from a minimal list of possibilities, and pick one that is very
probably what was intended. Even if Levenshtein distance works badly
with Kanji (which is not obviously the case, at least to me), it might
not matter here.
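[A toy sketch of that selection step -- not the patch's actual code or
heuristics: given only the columns visible in the range table, suggest
the one with the smallest edit distance, subject to an illustrative
cutoff so that wildly different names produce no hint at all:]

```python
from typing import List, Optional

def levenshtein(a, b):
    # Standard DP edit distance (insert/delete/substitute all cost 1).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def suggest_column(misspelled: str, visible_columns: List[str],
                   max_distance: int = 3) -> Optional[str]:
    """Return the closest visible column, or None if nothing is close.

    max_distance=3 is an arbitrary illustrative cutoff, not the
    threshold the patch actually uses.
    """
    best = min(visible_columns,
               key=lambda c: levenshtein(misspelled, c), default=None)
    if best is not None and levenshtein(misspelled, best) <= max_distance:
        return best
    return None
```

Because the candidate set is this small, even a crude metric tends to
pick the intended column.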

-- 
Peter Geoghegan




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-16 Thread Josh Berkus
On 04/02/2014 01:16 PM, Andres Freund wrote:
> On 2014-04-02 21:08:47 +0100, Greg Stark wrote:
>> Normally I'm not for adding gucs that just gate new features. But I think a
>> simple guc to turn this on or off would be fine and alleviate any concerns.
>> I think users would appreciate it quite a lot
> 
> I don't have strong feelings about the feature, but introducing a guc
> for it feels entirely ridiculous to me. This is a minor detail in an
> error message, not more.
> 
>> It would even have a positive effect of helping raise awareness of the
>> feature. I often scan the list of config options to get an idea of new
>> features when I'm installing new software or upgrading.
> 
> Really? Should we now add GUCs for every feature then?

-1 for having a GUC for this.

+1 on the feature.

Review with functional test coming up.

Question: How should we handle the issues with East Asian languages
(i.e. Japanese, Chinese) and this Hint?  Should we just avoid hinting
for a selected list of languages which don't work well with levenshtein?
 If so, how do we get that list?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-06-16 Thread Jeff Janes
On Wed, Jun 11, 2014 at 12:59 PM, Greg Stark  wrote:

> I think this whole exercise has mostly just convinced me we should
> implement an HTTP interface and reimplement psql as a browser app.

I certainly hope not.  I've seen lots of browser apps that were nice
enough to use for casual use by a casual user.  I've never seen one
that was an effective power tool for power users, the way psql is.
Now maybe they are out there and I just don't know about them, but I
have my doubts.

As an additional tool, to each his own.  But a browser-based
replacement for psql, -1 from me.

And as far browser-based things apply to this patch, I must say I've
tried micromanaging the way large amounts of data wrap in a HTML table
when I found the default to be inadequate, and I have not found that
to be noticeably easy, either.

The original version of this patch was only a few lines long and did
one very simple and useful thing: avoiding the printing of whole
screens full of hyphens when in 'expanded mode'.  If we end up
reverting the other parts of this patch, hopefully we don't lose that
part.

Cheers,

Jeff




Re: [HACKERS] ALTER TABLESPACE MOVE command tag tweak

2014-06-16 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Alvaro Herrera  writes:
> > ALTER TABLESPACE MOVE is a glorified ALTER TABLE.  If ALTER TABLESPACE
> > MOVE returned ALTER TABLE as a tag, I think it'd work well too; but not
> > ALTER TABLESPACE.  Individually, since the implementation works by
> > calling AlterTableInternal(), it already works.
> 
> > Now if you state that the current design in event_triggers that works by
> > slicing CommandTag and comparing the pieces is broken, I don't disagree
> > and I think I have now (in the patch posted in a nearby thread) some
> > more infrastructure to do it differently.  But even if we do that, I
> > think we're going to need a way to differentiate ALTER TABLESPACE MOVE
> > from other forms of ALTER TABLESPACE.  I haven't given this much
> > thought, though.
> 
> Yeah, I'd not paid much attention to it either.  Now that I look at it,
> ALTER TABLESPACE MOVE seems like a pretty unfortunate choice of naming
> all around, because (unless I'm misunderstanding) it doesn't actually
> alter any property of the tablespace itself.  It might be a bit late
> to propose this, but I wonder if some syntax along the lines of

I'm not against changing it- doing operations on a whole tablespace felt
like it would make sense under 'ALTER TABLESPACE' to me (hence the
implementation) but you're right, it's not actually changing properties
of the tablespaces themselves.

> MOVE ALL [ TABLES | INDEXES | ... ] IN TABLESPACE foo TO bar

I'm not a huge fan of new top-level constructs and re-using MOVE feels
completely wrong to me as that's used for cursors..

> wouldn't be less confusing.  Not sure what we'd use as command tag
> for it though (not MOVE, since that's taken).

I would have thought something under ALTER TABLE would make more sense,
if we're going to change it, eg:

ALTER TABLE ALL [ TABLES | INDEXES | ... ] IN TABLESPACE SET TABLESPACE ...

or perhaps something like

ALTER TABLES IN TABLESPACE ...

Thanks,

Stephen




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-16 Thread Tom Lane
Robert Haas  writes:
> On Sat, Jun 14, 2014 at 10:37 AM, Tom Lane  wrote:
>> After giving somebody advice, for the Nth time, to install a
>> memory-consumption ulimit instead of leaving his database to the tender
>> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
>> provide a built-in feature for that, comparable to the "ulimit -c max"
>> option that already exists in pg_ctl.  A reasonably low-overhead way
>> to do that would be to define it as something a backend process sets
>> once at startup, if told to by a GUC.  The GUC could possibly be
>> PGC_BACKEND level though I'm not sure if we want unprivileged users
>> messing with it.

> What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

We'd presumably get NULL returns from malloc, which'd be reported
as elog(ERROR, "out of memory").  It's not anything new for the
code to handle.

One issue if we allow this to be set on something other than PGC_BACKEND
timing is that somebody might try to reduce the setting to less than his
session is currently using.  I'm not sure what setrlimit would do in
such cases --- possibly fail, or maybe just set a limit that would cause
all subsequent malloc's to fail.  I doubt it'd actually take away existing
memory though.
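[Mechanically, the backend side of this boils down to a single
setrlimit() call at startup. A hedged Python sketch of the mechanism,
assuming RLIMIT_AS as the limited resource -- in Python the failed
allocation surfaces as MemoryError rather than a NULL malloc return:]

```python
import resource

def set_memory_ulimit(max_bytes):
    """Cap this process's address space; allocations past the cap fail
    (malloc returns NULL in C; bytearray raises MemoryError here)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # An unprivileged process cannot raise the soft limit above the
    # hard limit, so clamp the request.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))
    return max_bytes
```

That asymmetry -- lowering is always allowed, raising may not be -- is
one more reason to apply the limit exactly once at backend start (i.e.
PGC_BACKEND timing) rather than letting it float during a session.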

regards, tom lane




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-16 Thread Greg Stark
On Mon, Jun 16, 2014 at 9:08 PM, Robert Haas  wrote:
> What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

Well presumably it just makes malloc return NULL which causes an
ERROR. One advantage to setting it via a GUC is that it might be
possible to, for example, raise it automatically in critical sections
or during error unwinding.


-- 
greg




Re: [HACKERS] Autonomous Transaction (WIP)

2014-06-16 Thread Alvaro Herrera
What's the status of this patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] delta relations in AFTER triggers

2014-06-16 Thread Alvaro Herrera
Kevin Grittner wrote:
> Attached is a WIP patch for implementing the capture of delta
> relations for a DML statement, in the form of two tuplestores --
> one for the old versions and one for the new versions.  In the
> short term it is intended to make these relations available in
> trigger functions, although the patch so far doesn't touch any PLs
> -- it just takes things as far as providing the relations as
> tuplestores in the TriggerData structure when appropriate, for the
> PLs to pick up from there.  It seemed best to get agreement on the
> overall approach before digging into all the PLs.  This is
> implemented only for INSERT, UPDATE, and DELETE since it wasn't
> totally clear what the use cases and proper behavior was for other
> triggers.  Opinions on whether we should try to provide deltas for
> other cases, and if so what the semantics are, are welcome.

TRUNCATE triggers shouldn't have delta relations, that's for sure, or
it'd completely negate the point of TRUNCATE triggers.  I don't think we
have any other event, do we?  People who get delta relations for
deleting all rows should be using DELETE, I think.

I am not sure about providing delta relations in the FOR EACH ROW case.
For what cases are they useful?  In all cases I think NEW and OLD are
sufficient.  I didn't read the standard, but if it's saying that in FOR
EACH ROW the new/deleted/changed row should be accessible by way of a
delta relation, then perhaps we should look into making NEW and OLD be
accessible via that relation rather than adding delta relations.  It
might be more code, though.

Now, if you only have delta relations in FOR EACH STATEMENT, then it
seems to me you can optimize obtaining them only when such triggers
are defined; this lets you do away with the reloption entirely, doesn't
it?

I noticed that GetCurrentFDWTuplestore() changed its identity without
getting its comment updated.  The new name seems awfully generic, and I
don't think it really describes what the function does.  I think it
needs more renaming.

> (1)  My first impulse was to capture this delta data in the form of
> tuplestores of just TIDs, and fetching the tuples themselves from
> the heap on reference.  In earlier discussions others have argued
> for using tuplestores of the actual rows themselves.

Can you please supply pointers to such discussion?  I don't see any
point in not just storing TIDs, but perhaps I'm missing something.

> (2)  Do we want to just pick names for these in the PLs rather than
> using the standard syntax?  Implementing the standard syntax seemed
> to require three new (unreserved) keywords, changes to the catalogs
> to store the chosen relations names, and some way to tie the
> specified relation names in to the various PLs.

I think the only one for which we have a compulsion to follow someone
in this area would be PL/pgSQL, which probably needs to follow PL/SQL's
lead if there is one.  Other than that I don't think we need to do
anything standard.  We don't (yet) have PL/PSM which would need to have
the standard-mandated syntax.

> The way I have
> gone here just adds two new fields to the TriggerData structure and
> leaves it to each PL how to deal with that.  Failure to do anything
> in a PL just leaves it at the status quo with no real harm done --
> it just won't have the new delta relations available.

It seems to me that plpgsql support is mandatory, but other PLs can add
support as interested parties weigh in with patches.  As with event
triggers, I don't feel the author of the main feature is responsible for
patching all PLs.


> +
> + generate_deltas (boolean)
> + 
> +  
> +   Declare that a table generates delta relations when modified.  This
> +   allows AFTER triggers to reference the set of rows 
> modified
> +   by a statement. See
> +for details.
> +  
> + 
> +

Are you intentionally omitting the corresponding sql-createtrigger patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-06-16 Thread Greg Stark
On Mon, Jun 16, 2014 at 9:05 PM, Robert Haas  wrote:
> So, it seems like we need to do something about this one way or
> another.  Who's working on that?

So I'm fine finishing what I started. I've just been a bit busy this past week.

My inclination is to try to push forward and commit this patch,
document the changes and make sure we check for any consequences of
them.

The alternate plan is to revert it for 9.4 and commit the changes to
9.5 and that gives us more time to be sure we're ok with them.


-- 
greg




Re: [HACKERS] Audit of logout

2014-06-16 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Fujii Masao  writes:
> > That's harmful for audit purpose. I think that we should make
> > log_disconnections PGC_SUSET rather than PGC_BACKEND in order
> > to forbid non-superusers from changing its setting. Attached
> > patch does this.

I tend to agree with this- those are things which regular users really
shouldn't be able to modify.  Policy enforcement can be done when there
isn't system enforcement, but you really don't want to fall back to
policy enforcement when you don't need to.

> TBH, this complaint seems entirely artificial.  If somebody needs to audit
> disconnections, and they see a connection record with no disconnection
> record but the session's no longer there, they certainly know that a
> disconnection occurred.  And if there wasn't a server crash to explain it,
> they go slap the wrist of whoever violated corporate policy by turning off
> log_disconnections.  Why do we need system-level enforcement of this?

Going through every log file, and writing something to handle log file
changes, to hunt for those cases is no small amount of effort; you're
asking every administrator with an audit requirement to write code to
provide something we make it appear as if we're doing for them.  It's
certainly a violation of POLA that users can
decide on their own that their disconnections don't get logged.

> Moreover, your proposed fix breaks the entire point of the PGC_BACKEND
> setting, which was to try to prevent disconnections from being logged
> or not logged when the corresponding connection was not logged or logged
> respectively.  If an auditor wants the system to enforce that there are
> matched pairs of those records, he's not exactly going to be satisfied by
> being told that only superusers can cause them to not match.

This is only accurate when a superuser exists in the system and even so,
superuser access can be much more easily reviewed as it's going to be a
lot less traffic and there may be other auditing mechanisms in place for
that access.

> I wonder whether we should just get rid of log_disconnections as a
> separate variable, instead logging disconnections when log_connections
> is set.
> 
> Another answer is to make both variables PGC_SIGHUP, on the grounds
> that it doesn't make much sense for them not to be applied system-wide;
> except that I think there was some idea that logging might be enabled
> per-user or per-database using ALTER ROLE/DATABASE.

Both of these options look pretty reasonable to me as a way to improve
the current situation.  I'm not really sure that I see the use-case for
only logging connections/disconnections on a per-user or per-database
basis; in my experience it's not a lot of traffic to log it all and I
don't recall ever seeing anyone set those anything other than
system-wide.

I like the idea of flexibility in what's logged, I just don't see this
specific case as really needing it. 

Thanks,

Stephen




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-16 Thread Robert Haas
On Sat, Jun 14, 2014 at 10:37 AM, Tom Lane  wrote:
> After giving somebody advice, for the Nth time, to install a
> memory-consumption ulimit instead of leaving his database to the tender
> mercies of the Linux OOM killer, it occurred to me to wonder why we don't
> provide a built-in feature for that, comparable to the "ulimit -c max"
> option that already exists in pg_ctl.  A reasonably low-overhead way
> to do that would be to define it as something a backend process sets
> once at startup, if told to by a GUC.  The GUC could possibly be
> PGC_BACKEND level though I'm not sure if we want unprivileged users
> messing with it.
>
> Thoughts?

What happens if the limit is exceeded?  ERROR?  FATAL?  PANIC?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-06-16 Thread Robert Haas
On Wed, Jun 11, 2014 at 10:16 PM, Tom Lane  wrote:
> Noah Misch  writes:
>> Based on the commit message and procedural history, I thought commit 6513633
>> was changing behavior solely for the combination of "\pset expanded" and
>> "\pset format wrapped".  Peter's and my test cases show that it also changed
>> behavior for "\pset expanded" alone.  That's a bug, unless someone sees to
>> argue that the new "\pset expanded" behavior is a desirable improvement in
>> spite of its origin as an accident.  Altering an entrenched psql output 
>> format
>> is a big deal.
>
> TBH I'm wondering if we shouldn't just revert that patch (and the
> subsequent fix attempts).  It was not a major feature and I'm thinking
> we have better things to do right now than try to fix the multiple
> logic holes it evidently has.  The author's certainly welcome to try
> again with a more carefully thought-through patch for 9.5.

So, it seems like we need to do something about this one way or
another.  Who's working on that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2014-06-16 Thread Tom Lane
I wrote:
> Gurjeet Singh  writes:
>> I tried to eliminate the 'pending' list, but I don't see a way around it.
>> We need temporary storage somewhere to store the branches encountered on
>> the right; in recursion case the call stack was serving that purpose.

> I still think we should fix this in the grammar, rather than introducing
> complicated logic to try to get rid of the recursion later.  For example,
> as attached.

I went looking for (and found) some additional obsoleted comments, and
convinced myself that ruleutils.c is okay as-is, and pushed this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
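For context: the committed fix makes the grammar build flat N-argument
AND/OR lists, so later processing never recurses once per operand; the
reviewed patch instead kept two-argument nodes and walked them with a
'pending' list serving as an explicit stack. A toy illustration of that
explicit-stack technique follows (Python, not PostgreSQL code; the tuple
node representation is an assumption for illustration):

```python
# Represent a parse tree of two-argument AND nodes as nested tuples:
# ("AND", left, right). A naive recursive walk overflows the call
# stack on deep chains; an explicit stack (the 'pending' list) does not.

def build_chain(n):
    """a AND a AND ... with n operands, nested to the left."""
    node = "a"
    for _ in range(n - 1):
        node = ("AND", node, "a")
    return node

def flatten(node):
    """Collect all operands of a nested AND chain without recursion."""
    operands = []
    pending = [node]          # explicit stack replaces the call stack
    while pending:
        cur = pending.pop()
        if isinstance(cur, tuple) and cur[0] == "AND":
            pending.append(cur[1])
            pending.append(cur[2])
        else:
            operands.append(cur)
    return operands

# Far deeper than CPython's default recursion limit (~1000):
deep = build_chain(100_000)
print(len(flatten(deep)))
```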


Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-16 Thread Robert Haas
On Tue, Jun 3, 2014 at 4:38 PM, Robert Haas  wrote:
> On Sun, Jun 1, 2014 at 3:26 AM, Amit Kapila  wrote:
>> On Tue, May 6, 2014 at 12:04 AM, Robert Haas  wrote:
>>> On Mon, May 5, 2014 at 2:13 PM, Andres Freund 
>>> wrote:
>>> > On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
>>> >> Today, I discovered that when building a btree index, the btree code
>>> >> uses index_form_tuple() to create an index tuple from the heap tuple,
>>> >> calls tuplesort_putindextuple() to copy that tuple into the sort's
>>> >> memory context, and then frees the original one it built.  This seemed
>>> >> inefficient, so I wrote a patch to eliminate the tuple copying.  It
>>> >> works by adding a function tuplesort_putindextuplevalues(), which
>>> >> builds the tuple in the sort's memory context and thus avoids the need
>>> >> for a separate copy.  I'm not sure if that's the best approach, but
>>> >> the optimization seems worthwhile.
>>> >
>>> > Hm. It looks like we could quite easily just get rid of
>>> > tuplesort_putindextuple(). The hash usage doesn't look hard to convert.
>>>
>>> I glanced at that, but it wasn't obvious to me how to convert the hash
>>> usage.  If you have an idea, I'm all ears.
>>
>> I also think it's possible to have similar optimization for hash index
>> incase it has to spool the tuple for sorting.
>>
>> In function hashbuildCallback(), when buildstate->spool is true, we
>> can avoid to form index tuple. To check for nulls before calling
>>
>> _h_spool(), we can traverse the isnull array.
>
> Hmm, that might work.  Arguably it's less efficient, but on the other
> hand if it avoids forming the tuple sometimes it might be MORE
> efficient.  And anyway the difference might not be enough to matter.

On further review, this is definitely the way to go: it's a
straight-up win.  The isnull array is never more than one element in
length, so testing the single element is quite trivial.   The
attached, revised patch provides a modest but useful speedup for both
hash and btree index builds.

Anyone see any reason NOT to do this?  So far it looks like a
slam-dunk from here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 7abb7a4..c30c6f9 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -142,26 +142,23 @@ hashbuildCallback(Relation index,
 	HashBuildState *buildstate = (HashBuildState *) state;
 	IndexTuple	itup;
 
-	/* form an index tuple and point it at the heap tuple */
-	itup = _hash_form_tuple(index, values, isnull);
-	itup->t_tid = htup->t_self;
-
 	/* Hash indexes don't index nulls, see notes in hashinsert */
-	if (IndexTupleHasNulls(itup))
-	{
-		pfree(itup);
+	if (isnull[0])
 		return;
-	}
 
 	/* Either spool the tuple for sorting, or just put it into the index */
 	if (buildstate->spool)
-		_h_spool(itup, buildstate->spool);
+		_h_spool(buildstate->spool, htup->t_self, values, isnull);
 	else
+	{
+		/* form an index tuple and point it at the heap tuple */
+		itup = _hash_form_tuple(index, values, isnull);
+		itup->t_tid = htup->t_self;
 		_hash_doinsert(index, itup);
+		pfree(itup);
+	}
 
 	buildstate->indtuples += 1;
-
-	pfree(itup);
 }
 
 /*
@@ -184,10 +181,6 @@ hashinsert(PG_FUNCTION_ARGS)
 #endif
 	IndexTuple	itup;
 
-	/* generate an index tuple */
-	itup = _hash_form_tuple(rel, values, isnull);
-	itup->t_tid = *ht_ctid;
-
 	/*
 	 * If the single index key is null, we don't insert it into the index.
 	 * Hash tables support scans on '='. Relational algebra says that A = B
@@ -197,11 +190,12 @@ hashinsert(PG_FUNCTION_ARGS)
 	 * NOTNULL scans, but that's an artifact of the strategy map architecture
 	 * chosen in 1986, not of the way nulls are handled here.
 	 */
-	if (IndexTupleHasNulls(itup))
-	{
-		pfree(itup);
+	if (isnull[0])
 		PG_RETURN_BOOL(false);
-	}
+
+	/* generate an index tuple */
+	itup = _hash_form_tuple(rel, values, isnull);
+	itup->t_tid = *ht_ctid;
 
 	_hash_doinsert(rel, itup);
 
diff --git a/src/backend/access/hash/hashsort.c b/src/backend/access/hash/hashsort.c
index c0d6fec..3a70bc1 100644
--- a/src/backend/access/hash/hashsort.c
+++ b/src/backend/access/hash/hashsort.c
@@ -90,9 +90,10 @@ _h_spooldestroy(HSpool *hspool)
  * spool an index entry into the sort file.
  */
 void
-_h_spool(IndexTuple itup, HSpool *hspool)
+_h_spool(HSpool *hspool, ItemPointerData self, Datum *values, bool *isnull)
 {
-	tuplesort_putindextuple(hspool->sortstate, itup);
+	tuplesort_putindextuplevalues(hspool->sortstate, hspool->index,
+  self, values, isnull);
 }
 
 /*
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 36dc6c2..1ea4a2c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -171,28 +171,21 @@ btbuildCallback(Relation index,
 void *state)
 {
 	BTBuildState *buildstate = (BTBuildState *) state;
-	IndexTuple	itup;
-
-	/* form an index tuple and 

Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build

2014-06-16 Thread Bruce Momjian
On Fri, Jun  6, 2014 at 01:18:24PM -0400, Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > On Wed, May 28, 2014 at 12:29:28PM -0400, Tom Lane wrote:
> > > Bruce Momjian  writes:
> > > > I think this is caused because the variable is not defined as SOCKET. 
> > > > The attached patch fixes this.  This should prevent the warning.
> > > 
> > > Surely that's just going to move the errors somewhere else.  The call
> > > site still expects the argument to be int[].
> > 
> > Ah, yes, you are right.  This is a similar problem I had with libpq
> > where PQsocket() had to return an int.
> > 
> > Attached is an updated patch which follows my previous coding of
> > checking for PGINVALID_SOCKET, and if not equal, assigns the value to an
> > integer handle.  I would also like to rename variable 's' to
> > 'listen_sock', but that is not in the patch, for clarity reasons.
> > 
> > Should this be held for 9.5?  I think it is only warning removal.  On
> > the other hand, portability is what we do during beta testing.
> 
> I think this should go in 9.4, but as you say it's only warning removal
> so there is probably little point in patching further back (this code
> dates back to 9.3.)

Agreed and applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-16 Thread Stephen Frost
* Ronan Dunklau (ronan.dunk...@dalibo.com) wrote:
> The problem with checking if the type is the same is deciding where to stop. 
> For composite types, sure it should be easy. But what about built-in types ? 

Haven't we already decided to trust these based on server version
information?  Look at what quals we find acceptable to push down to the
remote side...  Certainly, composites built off of built-in types should
work.

> Or types provided by an extension / a native library ? These could 
> theoretically 
> change from one release to another.

This is definitely an issue which we really need to figure out how to
address.  I don't have any great ideas off-hand, but it feels like we'll
need a catalog and appropriate SQL-fu to allow users and extensions to
add other types and functions which can be pushed down..  For example,
it'd be great if PostGIS queries could be pushed down to the remote
server- and have that set up by the postgis extension on installation
(perhaps via a call-back hook which gives the extension a handle to the
remote server where it could interrogate the server and then a way to
store the information about what is allowed to be pushed to the remote
side, and how?).

That's certainly a rather complicated bit to address and I don't think
we should hold this up for that- but let's definitely be thinking about
how to add these things later and try to avoid putting anything in place
which would cause problems for us later...

Will try to come back to the rest of your questions later.. :)

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread Alvaro Herrera
Pavel Stehule wrote:
> 2014-06-16 11:46 GMT+02:00 xbzhang :
> 
> >
> > I want to implement the skip errors for copy from, like:
> > create table A (c int primary key);
> > copy A from stdin;
> > 1
> > 1
> > 2
> > \.
> >
> > copy will failed:
> > ERROR: duplicate key violates primary key constraint "CC_PKEY"
> > CONTEXT: COPY CC, line 2: "1"
> >
> > I want to skip the error and continue to copy the rest of the tuples. The result
> > will be that there are two rows in table A: 1 and 2.
> >
> > how to implement that ? Anybody give me some suggestion?
> 
> you should reimplement the copy procedure to use subtransactions. Using a
> subtransaction for every row is too expensive, but you can do a subtransaction
> per 1000 rows, and when some exception is raised, then store the data one
> row per subtransaction.

See http://pgloader.io/ for a ready-made solution.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
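The batching scheme Pavel describes above — one subtransaction per block
of rows, falling back to one subtransaction per row only when a block
fails — can be sketched with SAVEPOINTs, which is how subtransactions are
exposed at the SQL level. The sketch below uses SQLite via Python's
stdlib purely so it is self-contained and runnable; the table name,
column, and batch size are illustrative assumptions, not PostgreSQL or
pgloader code:

```python
import sqlite3

def copy_skip_errors(conn, rows, batch=1000):
    """Insert rows, skipping ones that violate constraints.

    Cheap path: try a whole batch under one savepoint. Only if that
    fails, redo the batch row by row with a savepoint per row.
    """
    cur = conn.cursor()
    skipped = 0
    for start in range(0, len(rows), batch):
        chunk = rows[start:start + batch]
        cur.execute("SAVEPOINT b")
        try:
            cur.executemany("INSERT INTO a(c) VALUES (?)", chunk)
            cur.execute("RELEASE SAVEPOINT b")
        except sqlite3.IntegrityError:
            # Undo the partial batch, then retry one row at a time.
            cur.execute("ROLLBACK TO SAVEPOINT b")
            cur.execute("RELEASE SAVEPOINT b")
            for row in chunk:
                cur.execute("SAVEPOINT r")
                try:
                    cur.execute("INSERT INTO a(c) VALUES (?)", row)
                    cur.execute("RELEASE SAVEPOINT r")
                except sqlite3.IntegrityError:
                    cur.execute("ROLLBACK TO SAVEPOINT r")
                    cur.execute("RELEASE SAVEPOINT r")
                    skipped += 1
    return skipped

# autocommit mode, so SAVEPOINT alone controls transactions
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE a (c INTEGER PRIMARY KEY)")
skipped = copy_skip_errors(conn, [(1,), (1,), (2,)], batch=2)
print(skipped, [r[0] for r in conn.execute("SELECT c FROM a ORDER BY c")])
```

This mirrors the thread's example: rows 1, 1, 2 against a primary key
leave the table containing 1 and 2, with one row skipped.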


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-16 Thread Shreesha
@Craig Ringer, I am afraid I didn't understand your solution.
The scenario I have is a system whose only user is root. I have
all the permissions and privileges of the system, since the user is root. But
pgsql doesn't allow initdb to be executed by root.
I think the solution you proposed gives root permissions to a non-root
user.
But I believe I am looking for the exact opposite of that. In other
words, a possible workaround for a root user to execute certain
executable(s) as an unprivileged user.
Please clarify if I am wrong in my understanding.


On Sun, Jun 15, 2014 at 9:49 PM, Craig Ringer  wrote:

> On 06/13/2014 07:08 AM, Shreesha wrote:
> > I need to initialize the db as the root and start the database server
>
> Assuming there's no way around doing this (it's generally not a good
> idea), you can just use the simple program 'fakeroot'.
>
> This program changes the return values from system calls via LD_PRELOAD,
> so PostgreSQL thinks that the user it is running as isn't root. It's
> commonly used in testing and packaging systems.
>
> http://man.he.net/man1/fakeroot
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
~Shreesha.


Re: [HACKERS] Why is it "JSQuery"?

2014-06-16 Thread David E. Wheeler
On Jun 15, 2014, at 1:58 PM, Josh Berkus  wrote:

> In other words, what I'm saying is: I don't think there's an existing,
> poplular syntax we could reasonably use.

Okay, I’m good with that. Would be handy to document it in such a way as to 
kind of put it forward as a standard. :-)

D



signature.asc
Description: Message signed with OpenPGP using GPGMail


[HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-06-16 Thread Abhijit Menon-Sen
At 2014-05-30 16:04:33 +0700, the.ap...@gmail.com wrote:
>
> While developing some XML processing queries, i stumbled on an old bug
> mentioned in http://wiki.postgresql.org/wiki/Todo#XML: Fix Nested or
> repeated xpath() that apparently mess up namespaces.

Thanks for the patch, and welcome to Postgres development.

I can confirm that it works fine. I have attached here a very slightly
tweaked version of the patch (removed trailing whitespace, and changed
some comment text).

I'm marking this ready for committer.

-- Abhijit
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 422be69..f9ad598 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3602,19 +3602,28 @@ xml_xmlnodetoxmltype(xmlNodePtr cur)
 	if (cur->type == XML_ELEMENT_NODE)
 	{
 		xmlBufferPtr buf;
+		xmlNodePtr cur_copy;
 
 		buf = xmlBufferCreate();
+
+		/* the result of xmlNodeDump won't contain namespace definitions
+		 * from parent nodes, but xmlCopyNode duplicates a node along
+		 * with its required namespace definitions.
+		 */
+		cur_copy = xmlCopyNode(cur, 1);
 		PG_TRY();
 		{
-			xmlNodeDump(buf, NULL, cur, 0, 1);
+			xmlNodeDump(buf, NULL, cur_copy, 0, 1);
 			result = xmlBuffer_to_xmltype(buf);
 		}
 		PG_CATCH();
 		{
+			xmlFreeNode(cur_copy);
 			xmlBufferFree(buf);
 			PG_RE_THROW();
 		}
 		PG_END_TRY();
+		xmlFreeNode(cur_copy);
 		xmlBufferFree(buf);
 	}
 	else
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 382f9df..a6d26f7 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -584,6 +584,12 @@ SELECT xpath('//loc:piece/@id', 'http://127.0.0.1";>http://127.0.0.1";>number one', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+ xpath  
+
+ {"http://127.0.0.1\"; id=\"1\">number one","http://127.0.0.1\"; id=\"2\"/>"}
+(1 row)
+
 SELECT xpath('//b', 'one two three etc');
   xpath  
 -
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index a34d1f4..c7bcf91 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -498,6 +498,12 @@ LINE 1: SELECT xpath('//loc:piece/@id', 'http://127.0.0.1";>number one', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('//loc:piece', 'http://127.0.0.1";>number one');
 SELECT xpath('//loc:piece/@id', 'http://127.0.0.1";>number one', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
+SELECT xpath('//loc:piece', 'http://127.0.0.1";>number one', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
 SELECT xpath('//b', 'one two three etc');
 SELECT xpath('//text()', '<');
 SELECT xpath('//@value', '');

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
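The underlying issue is that libxml2's xmlNodeDump() serializes a node
without the namespace declarations it inherits from ancestor nodes, which
is why the patch dumps a copy made with xmlCopyNode() — the copy carries
the required declarations with it. For comparison, Python's stdlib
ElementTree already re-declares inherited namespaces when a subtree is
serialized on its own; a small illustration (the sample document is an
assumption, loosely modeled on the regression test):

```python
import xml.etree.ElementTree as ET

doc = ('<local:data xmlns:local="http://127.0.0.1">'
       '<local:piece id="1">number one</local:piece>'
       '</local:data>')

root = ET.fromstring(doc)
piece = root[0]

# Serializing just the subtree: the namespace declaration from the
# parent is re-emitted on the subtree root, so the fragment remains
# well-formed on its own -- the behavior the xpath() fix provides.
fragment = ET.tostring(piece, encoding="unicode")
print(fragment)
```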


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-16 Thread Stephen Frost
* Kevin Grittner (kgri...@ymail.com) wrote:
> Stephen Frost  wrote:
> > Any dump not run by a superuser is already in doubt, imv.  That
> > is a problem we already have which really needs to be addressed,
> > but I view that as an independent issue.
> 
> I'm not seeing that.  If the user can't dump, you get an error and
> pg_dump returns something other than SUCCESS.

We've outlined an approach with RLS which would do the same.

I'm still of the opinion that, today, we have a problem that only a
superuser-run dump has any chance of success (and even if you get it
working today it'll probably break tomorrow, and you had better be
paying attention).  I'd like to fix that situation, but it's an
independent effort from this.  We've had issues in the past with pg_dump
creating things that can't be restored and they're certainly bugs but
trying to make that work with a regular user as a whole system backup
strategy, today, is just asking for trouble.

> > I agree with avoiding adding another superuser-only capability;
> > see the other sub-thread about making this a per-user capability.
> 
> It should be possible to design something which does not have this
> risk.  

The risk that pg_dump might create a dump which can't be restored?
Agreed, and I'd love to hear your thoughts on the proposal.

> What I was saying was that what was being described at that
> point wasn't it, and IMV was not acceptable.  I think that there
> should never be any doubt that a pg_dump run which completes
> without error copied all requested tables in their entirety, not a
> subset of the rows in the tables.

pg_dump needs to be able to have an option to go either way on this
case, as I can see value in running pg_dump in "RLS-enforcing" mode, but
it could default to "error-if-RLS".

> A GUC which only caused an error on the attempt to actually read
> specific rows which the user does not have permission to see would
> leak too much information.  A GUC which caused a SELECT or COPY
> from a table to throw an error if the user was not entitled to see
> all rows in the table could work.

Right- this would be the 'DIRECT SELECT' which would allow bypassing all
RLS and therefore mean that the user is allowed to see ALL rows of a
table.  That's one of the reasons why I agree with Dean's approach,
because we really need to know at the outset if the calling user is
allowed to extract all rows from a table or not- we can't go looking
through the entire table testing each row before we start running the
query.

>   Another thing which could work,
> if it can be coded, would be a GUC which would throw an error if
> the there were not quals on the query to prohibit seeing rows which
> the security conditions would prohibit, whether or not any matching
> rows actually existed.  

If I'm following you correctly, this would be an optimization that
allows avoiding RLS in the case where some information about the user
causes the overall qual to always return 'true', correct?  I'd certainly
like to see what happens in that case today and agree that it'd be great
to optimize for and perhaps even allow a user for which that is true to
not need the 'DIRECT SELECT' privilege, but in practice, I don't think
it'll be possible in most cases (certainly not in the case where an
external security module is deciding the access) and the optimization
may not be worth it.

> The latter would match the behavior of
> column level security -- you get an error when trying to select a
> prohibited column even if there are no rows in the table.

Agreed, but that would be a relaxation of the proposed approach and
therefore something which could be added later, if it's deemed
worthwhile.

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] [REVIEW] Re: postgresql.auto.conf read from wrong directory

2014-06-16 Thread Abhijit Menon-Sen
Hi.

Just a few minor comments about your patch:

At 2014-06-13 11:46:21 +0530, amit.kapil...@gmail.com wrote:
>
> +  Notes
> +
> +  
> +This command will not allow to set parameters that are disallowed or
> +excluded in postgresql.conf. It also disallows to set configuration
> +parameter .
> +  
> + 

I suggest the following wording:

This command may not be used to set

or any parameters that are not allowed in postgresql.conf.

> + /*
> +  * Disallow parameter's that are excluded or disallowed in
> +  * postgresql.conf.
> +  */

"parameters", no apostrophe.

>   if ((record->context == PGC_INTERNAL) ||
> - (record->flags & GUC_DISALLOW_IN_FILE))
> - ereport(ERROR,
> - (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
> -  errmsg("parameter \"%s\" cannot be changed",
> - name)));
> + (record->flags & GUC_DISALLOW_IN_FILE) ||
> + (record->flags & GUC_DISALLOW_IN_AUTO_FILE) ||
> + (record->flags & GUC_NOT_IN_SAMPLE))
> +  ereport(ERROR,
> +  (errcode(ERRCODE_CANT_CHANGE_RUNTIME_PARAM),
> +   errmsg("parameter \"%s\" cannot be changed",
> +  name)));

I looked at the settings that are marked GUC_NOT_IN_SAMPLE but neither
PGC_INTERNAL nor GUC_DISALLOW_IN_*FILE. I don't feel strongly about it,
but I don't see any particularly good reason to exclude them here.

(I also agree with Fujii-san that it isn't worth making extensive
changes to avoid data_directory being offered via tab-completion.)

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-16 Thread Kevin Grittner
Stephen Frost  wrote:
> Kevin Grittner (kgri...@ymail.com) wrote:

>> The proposed approach would leave the validity of any dump which
>> was not run as a superuser in doubt.  The last thing we need, in
>> terms of improving security, is another thing you can't do
>> without connecting as a superuser.
>
> Any dump not run by a superuser is already in doubt, imv.  That
> is a problem we already have which really needs to be addressed,
> but I view that as an independent issue.

I'm not seeing that.  If the user can't dump, you get an error and
pg_dump returns something other than SUCCESS.

test=# create user bob;
CREATE ROLE
test=# create user tom;
CREATE ROLE
test=# set role bob;
SET
test=> create table person(person_id int primary key, name text not null, ssn 
text);
CREATE TABLE
test=> insert into person values (1, 'Stephen Frost', '123-45-6789');
INSERT 0 1
test=> insert into person values (2, 'Kevin Grittner');
INSERT 0 1
test=> grant select (person_id, name) on person to tom;
GRANT
test=> \q
kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U bob test >bob-backup.sql
kgrittn@Kevin-Desktop:~/pg/master$ pg_dump -U tom test >tom-backup.sql
pg_dump: [archiver (db)] query failed: ERROR:  permission denied for relation 
person
pg_dump: [archiver (db)] query was: LOCK TABLE public.person IN ACCESS SHARE 
MODE
kgrittn@Kevin-Desktop:~/pg/master$ echo $?
1

> I agree with avoiding adding another superuser-only capability;
> see the other sub-thread about making this a per-user capability.

It should be possible to design something which does not have this
risk.  What I was saying was that what was being described at that
point wasn't it, and IMV was not acceptable.  I think that there
should never be any doubt that a pg_dump run which completes
without error copied all requested tables in their entirety, not a
subset of the rows in the tables.

A GUC which only caused an error on the attempt to actually read
specific rows which the user does not have permission to see would
leak too much information.  A GUC which caused a SELECT or COPY
from a table to throw an error if the user was not entitled to see
all rows in the table could work.  Another thing which could work,
if it can be coded, would be a GUC which would throw an error if
there were not quals on the query to prohibit seeing rows which
the security conditions would prohibit, whether or not any matching
rows actually existed.  The latter would match the behavior of
column level security -- you get an error when trying to select a
prohibited column even if there are no rows in the table.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [REVIEW] Re: Re: BUG #9578: Undocumented behaviour for temp tables created inside query language (SQL) functions

2014-06-16 Thread Abhijit Menon-Sen
(Cc: to pgsql-bugs dropped.)

At 2014-03-17 18:24:55 +1100, kommi.harib...@gmail.com wrote:
>
> *** a/doc/src/sgml/xfunc.sgml
> --- b/doc/src/sgml/xfunc.sgml
> ***
> *** 153,159  SELECT clean_emp();
> --- 153,186 
>   (\) (assuming escape string syntax) in the body of
>   the function (see ).
>  
> +
> +
> + Parsing mechanism of a function
>   
> +
> + function
> + parsing mechanism
> +

I suggest "Catalog changes within functions" instead of the above title.

> + 
> +  The body of an SQL function is parsed as if it were a single - 
> multi-part
> +  statement and thus uses a constant snapshot of the system catalogs for
> +  every sub-statement therein. Commands that alter the catalog will 
> likely not
> +  work as expected.
> + 
> + 
> +   
> +  For example: Issuing "CREATE TEMP TABLE" within an SQL function will 
> add
> +  the table to the catalog but subsequent statements in the same 
> function will
> +  not see those additions and thus the temporary table will be invisible 
> to them.
> + 
> + 
> +   
> +  Thus it is generally advised that PL/pgSQL be used, 
> instead of
> +  SQL, when any catalog visibilities are required in 
> the same function.
> + 
> +

I don't think that much text is warranted. I suggest something like the
following condensed wording:


 The body of an SQL function is parsed as if it were a single
 multi-part statement, using a constant snapshot of the system
 catalogs. The effect of any commands that alter the catalogs
 (e.g. "CREATE TEMP TABLE") will therefore not be visible to
 subsequent commands in the function body.



 The recommended workaround is to use PL/PgSQL.


Does that seem sensible to you?

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-16 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 13 June 2014 01:13, Stephen Frost  wrote:
> > This approach was suggested by an existing user testing out this RLS
> > approach, to be fair, but it looks pretty sane to me as a way to address
> > some of these concerns. Certainly open to other ideas and thoughts though.
> 
> Yeah, I was thinking something like this could work, but I would go
> further. Suppose you had separate GRANTable privileges for direct
> access to individual tables, bypassing RLS, e.g.
> 
>   GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

This is certainly an interesting idea and I'm glad we're getting this
level of discussion early on in the 9.5 cycle as I'd really like to see
a good solution implemented for 9.5.

I've been going back-and-forth about this and what's really swaying me
right now is that it'd be nearly impossible to determine if a given RLS
qual actually allows full access to a table for a given user without
going through the entire table and testing the qual against each row.
With this GRANT ability, we'd be able to completely avoid calling the
RLS quals when the user is granted this right.

Not sure offhand how many bits we've got left at the per-table level
though; we added TRUNCATE rights not that long ago and this looks like
another good right to add, but there are only so many bits available..
At the same time, I do think this is something we could also add later,
perhaps after figuring out a good way to extend the set of bits
available for privileges on tables.

> Combined with the GUC (direct_table_access, say) to request direct
> access to all tables. Then with direct_table_access = true/required, a
> SELECT from a table would error if the user hadn't been granted the
> DIRECT SELECT privilege on all the tables referenced in the query.

I can see this working.  One thing I'm curious about is if we would want
to support this inside of the SELECT statement (or perhaps COPY?)
directly, rather than making a user have to flip a GUC back and forth
while they're doing something.  I can imagine, during testing, a session
looking like this:

select * from table;
@#@!$!
set direct_table_access = true;
select * from table;
select * from table where blah = x;
alter table set row level security blah = x;
select * from table;
select * from table;
select * from table;
@!#$!@#!
set direct_table_access = false;
select * from table;
...

Would 'select direct' or 'select * from DIRECT table' (or maybe 'ONLY'?)
be workable?  There's certainly SQL standard concerns to be thought of
here which might preclude anything we do with SELECT, but we could
support something with COPY.

> Tools like pg_dump would require direct_table_access, but there might
> be other levels of access that didn't error out.

pg_dump would need an option to set direct_table_access or not.  Having
it ask by default is acceptable to me, but I do think we need to be able
to tell it to *not* set that.

> I think if I were using RLS, I would definitely want/expect this level
> of fine-grained control over permissions on a per-table basis, rather
> than the superuser/non-superuser level of control, or having
> RLS-exempt users.

I agree that it'd be great to have- and we need to make sure we don't
paint ourselves into a corner with the initial versions.  What I'm
worried about is that we're going to end up feature-creeping this to
death and ending up with nothing in 9.5.  I'll try to get a wiki page
going to discuss these items (as mentioned up-thread) and we can look at
prioritizing them and looking at what dependencies exist on other parts
of the system and seeing what's required for the initial version.

> Actually, given the fact that the majority of users won't be using
> RLS, I would be tempted to invert the above logic and have the new
> privilege be for LIMITED access (via RLS quals). So a user granted the
> normal SELECT privilege would be able to bypass RLS, but a user only
> granted LIMITED SELECT wouldn't.

This I don't agree with- it goes against what is done on existing
systems afaik and part of the idea is that you can minimize changes to
the applications or users but still be able to curtail what they can
see.  Making regular SELECTs start erroring if they haven't set some GUC
because RLS has been implemented on a given table would be quite
annoying, imv.

Now, that said, wouldn't the end user be able to control this for their
particular environment by setting the GUC accordingly in
postgresql.conf?  I'd still argue that it should be defaulted to what I
view as the 'normal' case, where RLS is applied unless you asked for
your queries to error instead, but if a user wants to have it flipped
around the other way, they could update their postgresql.conf to make it
so.

Thanks,

Stephen




Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-16 Thread Fujii Masao
On Mon, Jun 16, 2014 at 12:14 PM, Amit Kapila  wrote:
> On Sun, Jun 15, 2014 at 6:29 PM, Christoph Berg  wrote:
>>
>> Re: Amit Kapila 2014-06-13
>> 
>> > Agreed, I had mentioned in Notes section of document.  Apart from that
>> > I had disallowed parameters that are excluded from postgresql.conf by
>> > initdb (Developer options) and they are recommended in user manual
>> > to be not used in production.
>>
>> Excluding developer options seems too excessive to me. ALTER SYSTEM
>> should be useful for developing too.
>
> Developer options are mainly for debugging information or might help in one
> of the situations, so I thought somebody might not want them to be part of
> server configuration once they are set.  We already disallow parameters like
> config_file, transaction_isolation, etc. which are disallowed to be set in
> postgresql.conf.  Could you please explain me a bit in which
> situations/scenarios, do you think that allowing developer options via Alter
> System can be helpful?

I think that's helpful. I've heard that some users enable the developer
option "trace_sort" in postgresql.conf in order to collect information
about sorting. They might want to set trace_sort via ALTER SYSTEM, for
example.

> This information is not stored in pg_settings.  One way is to specify
> manually all the parameters which are disallowed but it seems the query
> will become clumsy, another could be to have another column in pg_settings.
> Do you think of any other way?

I like the latter idea, i.e., exposing the flags of GUC parameters in
pg_settings, but it seems overkill just for tab-completion purposes. I'd
withdraw my comment.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clarification of FDW API Documentation

2014-06-16 Thread Etsuro Fujita
(2014/06/14 2:46), Tom Lane wrote:
> Jason Petersen  writes:

>> Even if there is no guarantee that `IterateForeignScan` is called exactly
>> once before each `ExecForeignDelete` call (which would remove the ability
>> to have them cooperate using this single cursor), one could easily devise
>> other storage backends that don't need "junk" columns to perform `DELETE`
>> operations.
> 
> Such as?  I could imagine having an optimization that works like you
> suggest for simple scan cases, but it's not there now, and it could not
> be the only mode.

Do you mean the optimization described in the following comment for
postgresPlanForeignModify?

/*
 * postgresPlanForeignModify
 *  Plan an insert/update/delete operation on a foreign table
 *
 * Note: currently, the plan tree generated for UPDATE/DELETE will always
 * include a ForeignScan that retrieves ctids (using SELECT FOR UPDATE)
 * and then the ModifyTable node will have to execute individual remote
 * UPDATE/DELETE commands.  If there are no local conditions or joins
 * needed, it'd be better to let the scan node do UPDATE/DELETE RETURNING
 * and then do nothing at ModifyTable.  Room for future optimization ...
 */

I think this would be very useful, so I plan to submit a patch for it to
the 2014-08 commitfest.

Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-16 Thread Andres Freund
On 2014-06-15 03:12:21 +0200, Andres Freund wrote:
> Hi,
> 
> At this year's developer meeting we discussed the atomics abstraction
> which is necessary for some future improvements. We concluded that an
> overview of the hardware capabilities of the supported platforms would
> be helpful. I've started with that at:
> https://wiki.postgresql.org/wiki/Atomics
> 
> Does somebody want other columns in there?
> 
> From that and previous discussions
> (e.g. 
> http://archives.postgresql.org/message-id/20131013004658.GG4056218%40alap2.anarazel.de
> ) I think we should definitely remove some platforms:

> 3) sparcv8: Last released model 1997.

Just as a bit of context: support for anything below sparcv9 was dropped
from Solaris 10. In 2005.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_receivexlog add synchronous mode

2014-06-16 Thread furuyao
> You introduced a state machine using the flag "flush_flg" into
> pg_receivexlog. That's complicated and reduces the readability of the
> source code. I think that the logic should be simpler, like
> walreceiver's.
> 
> Maybe I found one problematic path as follows:
> 
> 1. WAL is written and flush_flg is set to 1
> 2. PQgetCopyData() returns 0 and flush_flg is incremented to 2
> 3. PQconsumeInput() is executed
> 4. PQgetCopyData() reads a keepalive message
> 5. After processing the keepalive message, PQgetCopyData() returns 0
> 6. Since flush_flg is 2, WAL is flushed and flush_flg is reset to 0
>
> But a new message can arrive while the keepalive message is being
> processed. Such new messages should be processed before flushing WAL.
For readability, I changed the logic to follow the same loop structure as
walreceiver.
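
The ordering being discussed - drain every message that is already
buffered, and only then flush - can be sketched roughly as follows. This is
a simplified illustration in Python, not pg_receivexlog's actual code; the
message representation and function names are invented for the sketch:

```python
def receive_loop(get_copy_data, write_wal, flush_wal):
    """Drain every message already buffered before flushing, so a WAL
    message that arrives while a keepalive is being processed is not
    left unflushed.  get_copy_data() returns ("wal", data) or
    ("keepalive", None), or None when no buffered data remains."""
    wrote = False
    while True:
        msg = get_copy_data()
        if msg is None:            # nothing more buffered right now
            break
        kind, data = msg
        if kind == "wal":
            write_wal(data)        # write, but do not flush yet
            wrote = True
        # keepalive messages are processed without triggering a flush
    if wrote:
        flush_wal()                # one flush after the buffer is drained
```

With this shape there is no multi-valued flag to track: the flush decision
is made once, after the receive buffer has been fully drained.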

> +Enables synchronous mode. If replication slot is disabled then
> +this setting is irrelevant.
> 
> Why is that irrelevant in that case?
> 
> Even when replication slot is not used, thanks to this feature,
> pg_receivexlog can flush WAL more proactively and which may improve the
> durability of WAL which pg_receivexlog writes.
It means whether the flush position is reported or not: if a replication
slot is not used, the flush position is not reported.
Fixed so that the flush position is reported only when a slot is used.

> +printf(_("  -m, --sync-modesynchronous mode\n"));
> 
> I think that calling this feature "synchronous mode" is confusing.
I changed the description from "synchronous mode" to one describing the
behavior: "after writing some records, flush them to disk".

Regards,

-- 
Furuya Osamu


pg_receivexlog-add-synchronous-mode-v4.patch
Description: pg_receivexlog-add-synchronous-mode-v4.patch



Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread Pavel Stehule
2014-06-16 11:46 GMT+02:00 xbzhang :

>
> I want to implement the skip errors for copy from, like:
> create table A (c int primary key);
> copy A from stdin;
> 1
> 1
> 2
> \.
>
> copy will fail:
> ERROR: duplicate key violates primary key constraint "CC_PKEY"
> CONTEXT: COPY CC, line 2: "1"
>
> I want to skip the error and continue to copy the rest of the tuples. The
> result will be that there are two rows in table A: 1 and 2.
>
> How to implement that? Can anybody give me some suggestions?
>

You would need to reimplement the COPY procedure to use subtransactions.
Using a subtransaction for every row is too expensive, but you can use one
subtransaction per 1000 rows, and when some exception is raised, fall back
to one row per subtransaction for that batch.
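
To illustrate the batching strategy, here is a toy model in Python (not
backend code): the "table" enforces a primary-key constraint, and its
snapshot()/rollback() methods stand in for SAVEPOINT / ROLLBACK TO
SAVEPOINT. A whole batch gets one cheap savepoint; only a failing batch is
retried with one savepoint per row:

```python
class UniqueTable:
    """Toy table with a primary-key constraint; snapshot()/rollback()
    stand in for SAVEPOINT / ROLLBACK TO SAVEPOINT."""
    def __init__(self):
        self.rows = []
        self.seen = set()

    def snapshot(self):
        return len(self.rows)

    def rollback(self, snap):
        for r in self.rows[snap:]:
            self.seen.discard(r)
        del self.rows[snap:]

    def add(self, row):
        if row in self.seen:
            raise ValueError("duplicate key")
        self.seen.add(row)
        self.rows.append(row)


def copy_with_skip_errors(rows, table, batch_size=1000):
    """Insert in batches; if a batch fails, roll it back and retry it
    one row per 'subtransaction', skipping the rows that error."""
    skipped = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        snap = table.snapshot()            # cheap: one savepoint per batch
        try:
            for row in batch:
                table.add(row)
        except ValueError:
            table.rollback(snap)           # discard the partial batch
            for row in batch:              # slow path, taken only on error
                row_snap = table.snapshot()
                try:
                    table.add(row)
                except ValueError:
                    table.rollback(row_snap)
                    skipped.append(row)
    return skipped
```

On the example from this thread (input 1, 1, 2), the table ends up with
rows 1 and 2 and the duplicate is skipped, while clean input pays only one
savepoint per batch.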

Regards

Pavel Stehule


>
> --
>
> 张晓博   研发二部
>
> 北京人大金仓信息技术股份有限公司
>
> 地址:北京市海淀区上地西路八号院上地科技大厦4号楼501
>
> 邮编:100085
>
> 电话:(010) 5885 1118 - 8450
>
> 手机:15311394463
>
> 邮箱:xbzh...@kingbase.com.cn
>


[HACKERS] How to implement the skip errors for copy from ?

2014-06-16 Thread xbzhang








I want to implement the skip errors for copy from, like:

create table A (c int primary key);
copy A from stdin;
1
1
2
\.

copy will fail:
ERROR: duplicate key violates primary key constraint "CC_PKEY"
CONTEXT: COPY CC, line 2: "1"

I want to skip the error and continue to copy the rest of the tuples. The
result will be that there are two rows in table A: 1 and 2.

How to implement that? Can anybody give me some suggestions?


张晓博   研发二部
北京人大金仓信息技术股份有限公司
地址:北京市海淀区上地西路八号院上地科技大厦4号楼501
邮编:100085
电话:(010) 5885 1118 - 8450
手机:15311394463
邮箱:xbzh...@kingbase.com.cn



Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-16 Thread Christoph Berg
Re: Amit Kapila 2014-06-16 

> Developer options are mainly for debugging information or might help in one
> of the situations, so I thought somebody might not want them to be part of
> server configuration once they are set.  We already disallow parameters like
> config_file, transaction_isolation, etc. which are disallowed to be set in
> postgresql.conf.  Could you please explain me a bit in which
> situations/scenarios, do you think that allowing developer options via Alter
> System can be helpful?

Oh if these are disallowed in postgresql.conf, they should be
disallowed in auto.conf too. I meant to say that there shouldn't be
additional restrictions for auto.conf, except for cases like
data_directory which won't work at all (or are highly confusing).

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] [v9.5] Custom Plan API

2014-06-16 Thread Shigeru Hanada
Kaigai-san,

I've just applied the v1 patch and tried to build and install it, but I
found two issues:

1) contrib/ctidscan is not automatically built/installed because it's
not listed in contrib/Makefile.  Is this expected behavior?
2) I got the error message below when building the documentation.

$ cd doc/src/sgml
$ make
openjade  -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
-d stylesheet.dsl -t sgml -i output-html -V html-index postgres.sgml
openjade:catalogs.sgml:2525:45:X: reference to non-existent ID
"SQL-CREATECUSTOMPLAN"
make: *** [HTML.index] Error 1
make: *** Deleting file `HTML.index'

I'll review another part of the patch, including the design.


2014-06-14 10:59 GMT+09:00 Kohei KaiGai :
> According to the discussion upthread, I revised the custom-plan patch
> to focus on regular relation scan but no join support right now, and to
> support DDL command to define custom-plan providers.
>
> Planner integration with custom logic to scan a particular relation is
> simple enough, unlike the various join cases. It's similar to what the
> built-in logic does now - a custom-plan provider adds a path node with
> its cost estimation if it can offer an alternative way to scan the
> referenced relation. (If it has nothing to offer, it does not need to
> add any paths.)
>
> A new DDL syntax I'd like to propose is below:
>
>   CREATE CUSTOM PLAN <name> FOR <class> PROVIDER <function_name>;
>
> <name> is a literal; put a unique identifier here.
> <class> is the workload type to be offered by this custom-plan provider.
> "scan" is the only option right now, meaning base relation scan.
> <function_name> is also a literal; it names the function that performs
> as the custom-plan provider.
>
> A custom-plan provider function is assumed to take an argument of
> "internal" type that delivers the planner information needed to
> construct a custom-plan path node.
> For the "scan" class, a pointer to a customScanArg object is delivered
> when the custom-plan provider is invoked.
>
> typedef struct {
>     uint32          custom_class;
>     PlannerInfo    *root;
>     RelOptInfo     *baserel;
>     RangeTblEntry  *rte;
> } customScanArg;
>
> When the invoked custom-plan provider function decides it can offer an
> alternative scan path on the "baserel" relation, the things to do are:
> (1) construct a CustomPath (or a data type inheriting from it) with a
> table of callback function pointers, (2) fill in its own cost
> estimation, and (3) call add_path() to register this path as an
> alternative.
>
> Once the custom path is chosen by the query planner, its
> CreateCustomPlan callback is called to populate a CustomPlan node based
> on the path node. The plan node also has a table of callback function
> pointers to handle the planner's various jobs in setrefs.c and so on.
>
> Similarly, its CreateCustomPlanState callback is called to populate a
> CustomPlanState node based on the plan node. It also has a table of
> callback function pointers to handle the executor's various jobs during
> query execution.
>
> Most of the callback design is unchanged from the prior proposal in the
> v9.4 development cycle; however, here are a few changes.
>
> * CustomPlan now inherits from Scan, and CustomPlanState inherits from
>   ScanState. Because some useful routines for implementing scan logic,
>   like ExecScan, expect the state node to have ScanState as its base
>   type, this is kinder to the extension side. (I'd like to avoid each
>   extension reinventing ExecScan by copy & paste!)
>   I'm not sure whether it should be a union with Join in the future;
>   however, a layout compatible with Scan/ScanState is a reasonable
>   choice for implementing alternative "scan" logic.
>
> * Exporting static functions - I still don't have a graceful answer
>   here. However, it is quite natural for extensions to follow interface
>   updates in future versions of PostgreSQL.
>   It shall probably become clear in later discussion which class of
>   functions should be exported and which should be re-implemented on
>   the extension side.
>   Right now, I exported the minimum ones needed to implement an
>   alternative scan method - the contrib/ctidscan module.
>
> Items to be discussed later:
> * Planner integration for relation joins - we may define new
>   custom-plan classes as alternatives to hash-join, merge-join and
>   nest-loop. If the core knows a custom plan is an alternative to
>   hash-join, it can utilize core code to check the legality of the join.
> * Generic key-value style options in the custom-plan definition - as
>   Hanada-san proposed to me off-list - like foreign data wrappers. It
>   may enable configuring multiple behaviors in one binary.
> * Ownership and access control of custom plans. Right now, only a
>   superuser can create/drop custom-plan provider definitions, so there
>   is no explicit ownership or access control. That seems a reasonable
>   assumption to me; however, there may be use cases that need custom
>   plans usable by unprivileged users.
>
> Thanks,
>
> 2014-05-12 10:09 GMT+09:00 Kou

Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-16 Thread Michael Paquier
On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau  wrote:
> Hello,
>
> Since my last proposal didn't get any strong rebuttal, please find attached a
> more complete version of the IMPORT FOREIGN SCHEMA statement.
>
> I tried to follow the SQL-MED specification as closely as possible.
>
> This adds discoverability to foreign servers. The structure of the statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> The import_foreign_schema patch adds the infrastructure, and a new FDW
> routine:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
> This routine must return a list of CreateForeignTableStmt mirroring whatever
> tables were found on the remote side, which will then be executed.
>
> The import_foreign_schema_postgres_fdw patch proposes an implementation of
> this API for postgres_fdw. It will import a foreign schema using the right
> types as well as nullable information.
>
> Regarding documentation, I don't really know where it should have been put. If
> I missed something, let me know and I'll try to correct it.

I have been playing a bit with this patch... And got a couple of comments.

1) Perhaps somebody could guide me to a better spec but by looking at
the spec here => http://farrago.sourceforge.net/design/sqlmed.html, it
seems that the FROM SERVER clause is misplaced, it should be placed
after the table list.
2) The query I am seeing in this spec offers the possibility to query
TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes, are
you planning to add it too? I imagine that this could be quite handy for
users importing from a foreign schema tables that have the same name
prefix, for example. ImportForeignSchemaRestrictionType could be
extended with an IMPORT_LIKE type. Extending the spec a bit... there
could be a list of LIKE patterns; this matches your code by adding all
of them to table_names. I am not voting to add TABLE_NAME to the list of
reserved keywords though, so something like "TABLE LIKE 'pattern'" would
be nice.
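
A hypothetical sketch of how such LIKE-based filtering of remote table
names could behave (the helper names are invented for illustration and are
not part of the patch):

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% = any run of characters,
    _ = exactly one character) into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")


def filter_tables(table_names, like_patterns):
    """Keep only the remote table names matching any of the LIKE
    patterns, preserving the original order."""
    regexes = [like_to_regex(p) for p in like_patterns]
    return [t for t in table_names if any(r.match(t) for r in regexes)]
```

For the prefix use case mentioned above, `filter_tables(names, ["tab%"])`
would select every remote table whose name starts with "tab".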
3) This has already been raised in this thread, but something missing
from this patch is the possibility to pass options when executing an
import. This could allow a FDW to do more fine-grained operations on the
underlying objects of the table. There would be two levels of options:
table level and schema level. For example, with the current feature it
is not possible to rename imported tables on the fly. I imagine that
this would be useful.
4) Something not mandatory with the core patch but always nice to
have: tab completion for this command in psql.
5) The error message returned to user when import fails because of a
missing type is rather unfriendly. Let's imagine with two nodes and
postgres_fdw... Node 1:
create type toto as (a int, b int);
create table toto_tab (a toto);
Node 2:
=# import foreign schema public from server postgres_server into public;
ERROR:  42704: type "public.toto" does not exist
LOCATION:  parseTypeString, parse_type.c:797
I would rather imagine something like "IMPORT failed because of
missing type defined on remote but not locally".
6) Import does not fail if a table specified in LIMIT TO is not
defined on remote-side:
=# import foreign schema public from server postgres_server limit to
(tab_not_there) into public;
IMPORT FOREIGN SCHEMA
=# \d
No relations found.
7) A small thing: code comments do not respect the project format:
http://www.postgresql.org/docs/9.1/interactive/source-format.html
8) In fetch_remote_tables@postgres_fdw.c, you are missing materialized views:
WHERE relkind IN ('r', 'v', 'f')
9) The changes in pg_type.h adding new OID defines should be a
separate patch IMO
10) Code has many whitespaces.
11) Sometimes the import goes strangely:
11-1) After some testing I noticed that tables with incorrect names
were imported when using LIMIT TO. For example on remote a table
called "ab" is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
local entry called "other_ab" while it should create nothing.
11-2) Tables with empty names can be created locally. On foreign server:
=# \d
        List of relations
 Schema |   Name   | Type  | Owner
--------+----------+-------+--------
 public | aa       | table | ioltas
 public | toto_tab | table | ioltas
On local node:
=# import foreign schema public from server postgres_server limit to
(toto_tab, "NULL", aa) into public;
-- (Forget about NULL here, I could reproduce it without as well)
IMPORT FOREIGN SCHEMA
Time: 13.589 ms
=# \d
            List of relations
 Schema |   Name   |     Type      | Owner
--------+----------+---------------+--------
 public |          | foreign table | ioltas
 public | toto_tab | foreign table | ioltas
(2 rows)
Some more testing showed me that as well:
=# \d
      List of relations
 Schema |