Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Andres Freund
On 2012-11-12 19:21:28 +, Simon Riggs wrote:
 On 10 September 2012 17:50, Tom Lane t...@sss.pgh.pa.us wrote:
 
  The point of the proposal that I am making is to have a simple,
  low-maintenance solution for people who need a single-application
  database.  A compromise somewhere in the middle isn't likely to be an
  improvement for anybody.  For instance, if you want to have additional
  connections, you open up a whole collection of communication and
  authentication issues, which potential users of a single-application
  database don't want to cope with.
 
 So the proposal is to implement a database that can't ever have 2 or
 more connections.
 ...
 It's almost impossible to purchase a CPU these days that doesn't have
 multiple cores, so the whole single-process architecture is just dead.
 Yes, we want Postgres installed everywhere, but this isn't the way to
 achieve that.
 
 I agree we should allow a PostgreSQL installation to work for a single
 user, but I don't see that requires other changes. This idea will
 cause endless bugs, thinkos and severely waste our time. So without a
 much better justification, I don't think we should do this.

I personally think that a usable & scriptable --single mode is
justification enough, even if you don't agree with the other
goals. Having to wait for hours just to enter one more command because
--single doesn't support any scripts sucks. Especially in recovery
situations.

I also don't think a single-backend without further child processes is
all that helpful - but I think this might be a very useful stepping
stone.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Heikki Linnakangas

On 12.11.2012 22:53, Tom Lane wrote:

Here's an updated patch that fixes the GIST replay functions as well as
the other minor issues that were mentioned.  Barring objections, I'll
set about back-patching this as far as 9.0.


Ok. It won't help all that much on 9.0, though.


One thing that could use verification is my fix for
gistRedoPageSplitRecord.  AFAICS, the first page listed in the WAL
record is always the original page, and the ones following it are
pages that were split off from it, and can (as yet) only be reached by
following right-links from the original page.  As such, it should be
okay to release locks on the non-first pages as soon as we've written
them.  We have to hold lock on the original page though to avoid letting
readers follow dangling right-links.  Also, the update of
NSN/FOLLOW_RIGHT on the child page (if any) has to be done atomically
with all this, so that has to be done before releasing the original-page
lock as well.  Does that sound right?


Yep.
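
In code form, that ordering would be roughly the following (a sketch with
hypothetical function names, not the actual gistRedoPageSplitRecord code):

    /* first page in the WAL record is the original page: keep it locked */
    restore_page(pages[0]);

    /* split-off pages are reachable only via right-links, so their locks
     * can be dropped as soon as they are written */
    for (i = 1; i < npages; i++)
    {
        restore_page(pages[i]);
        release_lock(pages[i]);
    }

    /* NSN/FOLLOW_RIGHT on the child must be updated before readers can
     * follow links from the original page */
    if (child != NULL)
        update_nsn_and_clear_follow_right(child);

    /* only now may readers follow right-links */
    release_lock(pages[0]);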

- Heikki




Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-11-13 Thread Andres Freund
Hi CK,

On 2012-10-30 21:16:07 +0100, Christian Kruse wrote:
 index b4fcbaf..66ed10f 100644
 --- a/doc/src/sgml/config.sgml
 +++ b/doc/src/sgml/config.sgml

I think a short introduction or at least a reference on how to configure
hugepages would be a good thing.
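
For reference, the minimal usage pattern such docs would need to cover looks
something like this (a generic Linux sketch, not from the patch; pages must be
reserved beforehand, e.g. via "echo 20 > /proc/sys/vm/nr_hugepages"):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int
    main(void)
    {
        size_t  size = 4 * 1024 * 1024; /* multiple of the hugepage size */
        void   *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

        if (p == MAP_FAILED)
        {
            perror("mmap(MAP_HUGETLB)");    /* typically ENOMEM if none reserved */
            return 1;
        }
        memset(p, 0, size);                 /* touch the pages */
        munmap(p, size);
        return 0;
    }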

   <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
    <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
    <indexterm>
 diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
 index df06312..f9de239 100644
 --- a/src/backend/port/sysv_shmem.c
 +++ b/src/backend/port/sysv_shmem.c
 @@ -27,10 +27,14 @@
  #ifdef HAVE_SYS_SHM_H
  #include <sys/shm.h>
  #endif
 +#ifdef MAP_HUGETLB
 +#include <dirent.h>
 +#endif

I think a central #define for the MAP_HUGETLB capability would be a good
idea, akin to HAVE_SYS_SHM_H.

E.g. this:
 --- a/src/backend/utils/misc/guc.c
 +++ b/src/backend/utils/misc/guc.c
 @@ -22,6 +22,7 @@
  #include <limits.h>
  #include <unistd.h>
  #include <sys/stat.h>
 +#include <sys/mman.h>
  #ifdef HAVE_SYSLOG
  #include <syslog.h>
  #endif

is unlikely to fly on Windows.
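
Something like the following would keep the check in one place (a sketch; the
symbol names USE_MAP_HUGETLB and HAVE_SYS_MMAN_H are assumptions, not from the
patch):

    /* in a port/config header, once: */
    #ifdef HAVE_SYS_MMAN_H
    #include <sys/mman.h>
    #endif

    #if defined(MAP_HUGETLB)
    #define USE_MAP_HUGETLB 1
    #endif

    /* call sites then test only the one symbol: */
    #ifdef USE_MAP_HUGETLB
    /* ... hugepage-specific code ... */
    #endif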


 +/*
 + *   static long InternalGetHugepageSize()
 + *
 + * Attempt to get a valid hugepage size from /sys/kernel/mm/hugepages/ by
 + * reading directory contents
 + * Will fail (return -1) if the directory could not be opened or no valid
 + * page sizes are available. Will return the biggest hugepage size on
 + * success.
 + *
 + */

The "biggest" remark is out of date.


 +static long
 +InternalGetHugepageSize()
 +{
 ...
 + if ((smallest_size == -1 || size < smallest_size)
 +     && InternalGetFreeHugepagesCount(ent->d_name) > 0)
 + {
 +     smallest_size = size;
 + }
 ...
 +
 + if (smallest_size == -1)
 + {
 + ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
 +         (errmsg("Could not find a valid hugepage size"),
 +          errhint("This error usually means that either CONFIG_HUGETLB_PAGE "
 +                  "is not in kernel or that your architecture does not "
 +                  "support hugepages or you did not configure hugepages")));
 + }

I think differentiating the error message between "no hugepages found" and
InternalGetFreeHugepagesCount(ent->d_name) always being zero would be a
good idea. Failing this way if
InternalGetFreeHugepagesCount(ent->d_name) < 0 seems fine.
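
In code, the differentiated reporting could look roughly like this (a sketch;
the found_any_size flag is hypothetical and the wording is illustrative):

    if (smallest_size == -1)
    {
        if (!found_any_size)    /* no hugepage size directories at all */
            ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
                    (errmsg("could not find any hugepage size"),
                     errhint("The kernel may lack CONFIG_HUGETLB_PAGE, or the "
                             "architecture may not support hugepages.")));
        else                    /* sizes exist, but no free pages */
            ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
                    (errmsg("hugepages are configured, but none are free"),
                     errhint("Increase vm.nr_hugepages.")));
    }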


Greetings,

Andres Freund




Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-11-13 Thread Andres Freund
Oh, one more thing...

On 2012-10-30 21:16:07 +0100, Christian Kruse wrote:
 ok, I think I implemented all of the changes you requested. All but
 the ia64-dependent one; I have to do more research for that.

I vote for simply not caring about ia64.

This is:

 +#ifdef MAP_HUGETLB
 +#  ifdef __ia64__
 +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
 +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
 +#  else
 +#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
 +#define PG_MAP_HUGETLB MAP_HUGETLB
 +#  endif
 +#else
 +#  define PG_MAP_HUGETLB 0
 +#endif

too much underdocumented craziness for a very minor platform. Should
somebody with the appropriate hardware want to submit an additional patch,
fine.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Markus Wanner
On 11/13/2012 01:22 AM, Greg Smith wrote:
 Once you accept that eventually there need to be online conversion
 tools, there needs to be some easy way to distinguish which pages have
 been processed for several potential implementations.

Agreed. What I'm saying is that this identification doesn't need to be
as fine-grained as a per-page bit. A single "horizon" or "border" is
enough, given an ordering of relations (for example by OID) and an
ordering of pages within each relation (obvious).
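
A sketch of that test, just to make the idea concrete (all names here are
hypothetical, not from any patch): with one stored position, a page's
checksum status follows from a simple comparison.

    /* true if (relid, blkno) lies before the stored horizon, i.e. the
     * conversion has already checksummed that page */
    static bool
    page_is_checksummed(Oid relid, BlockNumber blkno,
                        Oid horizon_relid, BlockNumber horizon_blkno)
    {
        if (relid != horizon_relid)
            return relid < horizon_relid;   /* earlier relation: fully done */
        return blkno < horizon_blkno;       /* same relation: compare pages */
    }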

 All of the table-based checksum enabling ideas ...

This is not really one - it doesn't allow per-table switching. It's just
meant to be a more compact way of representing which pages have been
checksummed and which not.

 I'm thinking of this in some ways like the way creation of a new (but
 not yet valid) foreign key works.  Once that's active, new activity is
 immediately protected moving forward.  And eventually there's this
 cleanup step needed, one that you can inch forward over a few days.

I understand that. However, I question whether users really care. If
corruption is detected, the clever DBA tells his trainee: "immediately
check the file and disk subsystems" - no matter whether the corruption
was on old or new data.

You have a point in that pages with newer data are often more likely
to be re-read and thus get checked, whereas the checksums written
to pages with old data might not be re-read any time soon. Starting to
write checksums from the end of the relation could mitigate this to some
extent, though.

Also keep in mind the "quietly corrupted after being checked once, but
still in the middle of checking the relation" case. Thus a single bit
doesn't really give us the guarantee you ask for. Sure, we can add more
than one bit. And yeah, if done properly, adding more bits exponentially
reduces the likelihood of a corruption inadvertently turning off
checksumming for a page.

All that said, I'm not opposed to using a few bits of the page header. I
wanted to outline an alternative that I think is viable and less intrusive.

 This is why I think any good solution to this problem needs to
 incorporate restartable conversion.

I fully agree to that.

Regards

Markus Wanner




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 06:14, Amit kapila amit.kap...@huawei.com wrote:

I get the installability thang, very very much, I just don't see the
single-process thing as the only solution. At the very least, an
open-minded analysis of the actual problem and ways of solving it is called
for, not just reaching for a close-to-hand solution.

 Some other use case where I have seen it required is in telecom billing apps.
 In the telecom applications where this solution works, other maintenance
 connections are needed as well.
 Some of the reasons for its use are performance and less maintenance
 overhead, and their data requirements are also not very high.
 So even if this solution doesn't meet all the requirements of a
 single-process solution (and I don't think it is written to address them
 all), can't we think of it as a first version and then, based on
 requirements, extend it to have other capabilities:
 a. a mechanism for other background processes (autovacuum, checkpoint, ...)
 b. more needs to be thought of...

Why would we spend time trying to put back something that is already
there? Why not simply avoid removing it in the first place?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Alvaro Herrera
Simon Riggs wrote:

  So even if this solution doesn't meet all the requirements of a
  single-process solution (and I don't think it is written to address them
  all), can't we think of it as a first version and then, based on
  requirements, extend it to have other capabilities:
  a. a mechanism for other background processes (autovacuum,
  checkpoint, ...)
  b. more needs to be thought of...
 
 Why would we spend time trying to put back something that is already
 there? Why not simply avoid removing it in the first place?

Actually, the whole point of this solution originally was just to serve
pg_upgrade's needs, so that it doesn't have to start a complete postmaster
environment just to turn off most of what the postmaster does, and with
enough protections to disallow everyone else from connecting.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 13:05, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Simon Riggs wrote:

  So even if this solution doesn't meet all the requirements of a
  single-process solution (and I don't think it is written to address
  them all), can't we think of it as a first version and then, based on
  requirements, extend it to have other capabilities:
  a. a mechanism for other background processes (autovacuum,
  checkpoint, ...)
  b. more needs to be thought of...

 Why would we spend time trying to put back something that is already
 there? Why not simply avoid removing it in the first place?

 Actually, the whole point of this solution originally was just to serve
 pg_upgrade needs, so that it doesn't have to start a complete postmaster
 environment just to have to turn off most of what postmaster does, and
 with enough protections to disallow everyone else from connecting.

I don't see anything that pg_upgrade is doing that causes the need to
support a special mode.

From other people's comments it's clear that single user mode is
desirable to many and *will* be widely deployed if we allow it. I
support the wish to allow a database server to be limited by
configuration to a single user. However, supporting a specifically
targeted mode that presents single user as an architectural
design/limitation is a regressive step that I am strongly opposed to.

The most popular relational database in the world is Microsoft Access,
not MySQL. Access appears desirable because it allows a single user to
create and use a database (which is very good). But all business
databases have a requirement for at least one of: high availability,
multi-user access or downstream processing in other parts of the
business. Businesses worldwide curse the difficulties caused by having
critical business data in desktop databases. And worldwide, there are
also many that don't understand the problems that disconnected data
causes because they can't see past the initial benefit.

The lessons from that are that it's OK to start with a database used by
a single person, but that database soon needs to allow access from
multiple users or automated agents. Many database systems support
embedded or single user mode as an architectural option. All of those
systems cause headaches in all of the businesses where they are used.
They also cause problems on small detached devices such as phones,
because even on very small systems there is a requirement for multiple
concurrently active processes, each of which may need database access.

PostgreSQL was designed from the ground up as a multi-user database.
This is the very fact that puts us in a good position to become
pervasive. A single database system that works the same on all
devices, with useful replication to connect data together.

The embedded or single mode concept has long been on the "do not want"
list. I believe that is a completely rational and strongly desirable
position. Supporting multiple architectures is extra work, and the
restrictive architecture bites people in the long term. The fact that
it's "an easy patch" is not a great argument for changing that
position, and in fact, it's not easy, since it comes with a request to
make it work on Windows (= extra work). The "easy" part is not proven,
since people are already starting to ask about bgwriter and
autovacuum.

In this release there is much work happening around providing
additional autonomous agents (bgworker) and other work around flexible
replication (BDR), all of which would be nullified by the introduction
and eventual wide usage of a restrictive new architecture.

Single user configuration option, yes. Architecturally limited special
version of PostgreSQL, no.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 12.11.2012 22:53, Tom Lane wrote:
 Here's an updated patch that fixes the GIST replay functions as well as
 the other minor issues that were mentioned.  Barring objections, I'll
 set about back-patching this as far as 9.0.

 Ok. It won't help all that much on 9.0, though.

Well, it won't help GIST much, but the actually-reported-from-the-field
case is in btree, and it does fix that.

It occurs to me that if we're sufficiently scared of this case, we could
probably hack the planner (in 9.0 only) to refuse to use GIST indexes
in hot-standby queries.  That cure might be worse than the disease though.

regards, tom lane




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Merlin Moncure
On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Ok. It won't help all that much on 9.0, though.

 Well, it won't help GIST much, but the actually-reported-from-the-field
 case is in btree, and it does fix that.

 It occurs to me that if we're sufficiently scared of this case, we could
 probably hack the planner (in 9.0 only) to refuse to use GIST indexes
 in hot-standby queries.  That cure might be worse than the disease though.

if anything, it should be documented.  if you do this kind of thing
people will stop installing bugfix releases.

merlin




Re: [HACKERS] [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown

2012-11-13 Thread Fujii Masao
On Tue, Nov 13, 2012 at 1:06 PM, Amit kapila amit.kap...@huawei.com wrote:
 On Monday, November 12, 2012 8:23 PM Fujii Masao wrote:
 On Fri, Nov 9, 2012 at 3:03 PM, Amit Kapila amit.kap...@huawei.com wrote:
 On Thursday, November 08, 2012 10:42 PM Fujii Masao wrote:
 On Thu, Nov 8, 2012 at 5:53 PM, Amit Kapila amit.kap...@huawei.com
 wrote:
  On Thursday, November 08, 2012 2:04 PM Heikki Linnakangas wrote:
  On 19.10.2012 14:42, Amit kapila wrote:
   On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:
   Before implementing the timeout parameter, I think that it's
 better
  to change
   both pg_basebackup background process and pg_receivexlog so that

 BTW, IIRC the walsender has no timeout mechanism during sending
 backup data to pg_basebackup. So it's also useful to implement the
 timeout mechanism for the walsender during backup.

 Yes, it's useful, but for walsender the main problem is that it uses a
 blocking send call to send the data.
 I have tried using tcp_keepalive settings, but the send call doesn't come out
 in case of a network break.
 The only way I could get it out is to change the corresponding setting in
 /proc/sys/net/ipv4/tcp_retries2 by using the command
 echo 8 > /proc/sys/net/ipv4/tcp_retries2
 As per the recommendation, its value should be at least 8 (equivalent to
 100 sec).

 Do you have any idea, how it can be achieved?

 What about using pq_putmessage_noblock()?

 I will try this, but do you know why the blocking mode was used to send
 files in the first place?
 I am asking because I am a little worried about breaking any design
 decision that was originally made when sending of files was made blocking.

I'm afraid I don't know why. I guess that using non-blocking mode complicates
the code, so the blocking mode was adopted in the first version of
pg_basebackup.
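
For illustration, the reason non-blocking mode enables a timeout (a generic
POSIX sketch, not the walsender code): send() on a non-blocking socket
returns EAGAIN instead of hanging, so the wait can be bounded with select().

    #include <errno.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    static int
    send_with_timeout(int fd, const char *buf, size_t len, int timeout_sec)
    {
        while (len > 0)
        {
            ssize_t n = send(fd, buf, len, 0);  /* fd has O_NONBLOCK set */

            if (n < 0)
            {
                fd_set          wfds;
                struct timeval  tv = {timeout_sec, 0};

                if (errno != EAGAIN && errno != EWOULDBLOCK)
                    return -1;                  /* hard error */
                FD_ZERO(&wfds);
                FD_SET(fd, &wfds);
                /* wait until writable, but no longer than the timeout */
                if (select(fd + 1, NULL, &wfds, NULL, &tv) <= 0)
                    return -1;                  /* timeout or error */
                continue;
            }
            buf += n;
            len -= n;
        }
        return 0;
    }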

Regards,

-- 
Fujii Masao




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 OTOH I can't see trying to back-patch a solution like that.   If we want
 to fix this in the back branches (and note the complaint linked above is
 against 8.3), I think we have to do it as attached.

 Thoughts?

I've been using textin(record_out(NEW)) in generic partitioning
triggers, and you can find examples of that trick in the wiki, so I
think we have users of that in the field.

Please indeed do consider backpatching!

I don't have an opinion on the opportunity to use a shorter memory
context; I feel that would need some more involved analysis than my
brainpower of the moment allows me to consider.

Thanks,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 Thoughts?

 I've been using textin(record_out(NEW)) in generic partitioning
 triggers, and you can find examples of that trick in the wiki, so I
 think we have users of that in the field.

I think explicit calls like that actually wouldn't be a problem,
since they'd be run in a per-tuple context anyway.  The cases that
are problematic are hard-coded I/O function calls.  I'm worried
about the ones like, say, plpgsql's built-in conversion operations.
We could probably fix printtup's usage with some confidence, but
there are a lot of other ones.

regards, tom lane




Re: [HACKERS] Index only scans wiki page

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 8:25 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 13 November 2012 01:03, Jeff Janes jeff.ja...@gmail.com wrote:
 https://wiki.postgresql.org/wiki/Index-only_scans

 This page is seriously out of date.  It suggests they are not yet
 implemented, but only being talked about.

 Attached is a piece I wrote on the feature. That might form the basis
 of a new wiki page. Feel free to incorporate this material as you see
 fit.

I found this an interesting read.  As one of the people who worked on
the feature, I'm sort of curious whether people have any experience
yet with how this actually shakes out in the field.  Are you (or is
anyone) aware of positive/negative field experiences with this
feature?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I think explicit calls like that actually wouldn't be a problem,
 since they'd be run in a per-tuple context anyway.  The cases that
 are problematic are hard-coded I/O function calls.  I'm worried
 about the ones like, say, plpgsql's built-in conversion operations.
 We could probably fix printtup's usage with some confidence, but
 there are a lot of other ones.

That's a good reason to get them into a shorter memory context, but
which one? Per-transaction maybe? Shorter?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [PATCH] Patch to compute Max LSN of Data Pages

2012-11-13 Thread Fujii Masao
On Tue, Nov 13, 2012 at 1:23 PM, Amit kapila amit.kap...@huawei.com wrote:
 On Monday, November 12, 2012 9:56 PM Alvaro Herrera wrote:
 Robert Haas wrote:
 On Tue, Jul 31, 2012 at 8:09 AM, Amit kapila amit.kap...@huawei.com wrote:

  I think I can see all of those things being potentially useful.  There
  are a couple of pending patches that will revise the WAL format
  slightly; not sure how much those are likely to interfere with any
  development you might do on (2) in the meantime.
 
  Based on above conclusion, I have prepared a patch which implements 
  Option-1

 I wonder if we shouldn't make this a separate utility, rather than
 something that is part of pg_resetxlog.  Anyone have a thought on that
 topic?

 That thought did cross my mind too.

 One of the reasons for keeping it with pg_resetxlog is that this was
 proposed as a solution for scenarios where a user's db has become corrupt
 and now he wants to start it. To do that, he can find the max LSN and set
 it using pg_resetxlog, which will avoid further corruption of the database
 after it is started.
 If we keep it as a separate utility, then the user needs to first run this
 utility to find the max LSN and then use pg_resetxlog to achieve the same.
 I don't see a big problem in that, but it would be better if there were
 other use cases for it.

We might be able to use this utility to decide whether we need to take
a fresh backup from the master onto the standby, to start the old master
as a new standby after failover.

When starting the new standby after failover, no data page in the standby
may precede the master. Otherwise, the standby cannot catch up with the
master consistently. But the master might write a data page corresponding
to WAL which has not yet been replicated to the standby. So, if failover
happens before that WAL has been replicated, a data page in the old master
would precede the new master (i.e., the old standby), and in this case the
backup is required. OTOH, if the maximum LSN of the data pages in the
standby is less than the master's, the backup is not required.

Without this utility, it's difficult to calculate the maximum LSN of the
data pages, so basically we needed to take a backup when starting the
standby. In the future, thanks to this utility, we can calculate the
maximum LSN and can skip a backup if that LSN is less than the master's
(i.e., the last applied LSN, IOW the timeline switch LSN).
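
The core of such a check is small: every page carries the LSN of the last
WAL record that touched it in the first field of its header, so scanning
the segment files and taking the maximum is enough. A standalone sketch
(not the actual patch; assumes the default 8 kB block size and reads
pd_lsn as 8 raw bytes in host order):

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLCKSZ 8192

    static uint64_t
    max_lsn_of_segment(FILE *f)
    {
        unsigned char   page[BLCKSZ];
        uint64_t        max_lsn = 0;

        while (fread(page, BLCKSZ, 1, f) == 1)
        {
            uint64_t    lsn;

            /* pd_lsn is the first 8 bytes of the page header */
            memcpy(&lsn, page, sizeof(lsn));
            if (lsn > max_lsn)
                max_lsn = lsn;
        }
        return max_lsn;
    }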

Regards,

-- 
Fujii Masao




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 I think explicit calls like that actually wouldn't be a problem,
 since they'd be run in a per-tuple context anyway.  The cases that
 are problematic are hard-coded I/O function calls.  I'm worried
 about the ones like, say, plpgsql's built-in conversion operations.
 We could probably fix printtup's usage with some confidence, but
 there are a lot of other ones.

 That's a good reason to get them into a shorter memory context, but
 which? per transaction maybe? shorter?

It would have to be per-tuple to do any good.  The existing behavior
is per-query and causes problems if lots of rows are output.  In plpgsql
it would be a function-call-lifespan leak.
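
A sketch of what that would look like with the existing memory-context APIs
(the surrounding loop is illustrative, not a specific call site):

    MemoryContext percall =
        AllocSetContextCreate(CurrentMemoryContext,
                              "output function scratch",
                              ALLOCSET_DEFAULT_MINSIZE,
                              ALLOCSET_DEFAULT_INITSIZE,
                              ALLOCSET_DEFAULT_MAXSIZE);

    /* for each tuple: */
    {
        MemoryContext   old = MemoryContextSwitchTo(percall);
        char           *result = OutputFunctionCall(&flinfo, value);

        /* ... copy or transmit result before the reset ... */
        MemoryContextSwitchTo(old);
        MemoryContextReset(percall);    /* frees whatever the call leaked */
    }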

regards, tom lane




Re: [HACKERS] [BUGS] BUG #7656: PL/Perl SPI_freetuptable() segfault

2012-11-13 Thread Tom Lane
pgm...@joh.to writes:
 I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
 down the test case to something sensible, but I do have a backtrace:

219   while (context->firstchild != NULL)
 (gdb) bt
 #0  0x000104e90782 in MemoryContextDeleteChildren (context=0x102bd)
 at mcxt.c:219
 #1  0x000104e906a8 in MemoryContextDelete (context=0x102bd) at
 mcxt.c:174
 #2  0x000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
 spi.c:1003
 #3  0x00011ec9928b in plperl_spi_execute_fetch_result
 (tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
 #4  0x00011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
 0x7f9ae3e3fe50, limit=-439796840) at plperl.c:2821
 #5  0x00011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
 cv=0x7f9ae4148e90) at SPI.c:69

 While trying to narrow down the test case I noticed what the problem was: I
 was calling spi_execute_query() instead of spi_execute_prepared().

Hm.  It looks like SPI_execute failed as expected (note the status
passed to plperl_spi_execute_fetch_result is -6 which is
SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
was probably an already-deleted tuple table.

One theory we could adopt on this is that this is
plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
free a tuple table unless status > 0.

Another theory we could adopt is that SPI functions that are capable of
setting SPI_tuptable ought to clear it at start, to ensure that they
return it as null on failure.

The latter seems like a nicer fix but I'm afraid it might have
unexpected side-effects.  It would certainly be a lot more invasive.
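
In code terms, the two theories come down to (a sketch, not the committed
fix):

    /* theory 1: the caller only frees on success */
    if (status > 0 && tuptable != NULL)
        SPI_freetuptable(tuptable);

    /* theory 2: every SPI entry point that can set SPI_tuptable clears it
     * up front, so callers never see a stale pointer after a failure */
    int
    SPI_execute(const char *src, bool read_only, long tcount)
    {
        SPI_tuptable = NULL;    /* reset before anything that can fail */
        /* ... existing body ... */
    }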

Thoughts?

regards, tom lane




Re: [HACKERS] [BUGS] BUG #7656: PL/Perl SPI_freetuptable() segfault

2012-11-13 Thread Andrew Dunstan


On 11/13/2012 12:17 PM, Tom Lane wrote:

pgm...@joh.to writes:

I have a reproducible segmentation fault in PL/Perl.  I have yet to narrow
down the test case to something sensible, but I do have a backtrace:
219 while (context->firstchild != NULL)
(gdb) bt
#0  0x000104e90782 in MemoryContextDeleteChildren (context=0x102bd)
at mcxt.c:219
#1  0x000104e906a8 in MemoryContextDelete (context=0x102bd) at
mcxt.c:174
#2  0x000104bbefb5 in SPI_freetuptable (tuptable=0x7f9ae4289230) at
spi.c:1003
#3  0x00011ec9928b in plperl_spi_execute_fetch_result
(tuptable=0x7f9ae4289230, processed=1, status=-6) at plperl.c:2900
#4  0x00011ec98f27 in plperl_spi_exec (query=0x7f9ae4155f80
0x7f9ae3e3fe50, limit=-439796840) at plperl.c:2821
#5  0x00011ec9b5f7 in XS__spi_exec_query (my_perl=0x7f9ae40cce00,
cv=0x7f9ae4148e90) at SPI.c:69
While trying to narrow down the test case I noticed what the problem was: I
was calling spi_execute_query() instead of spi_execute_prepared().

Hm.  It looks like SPI_execute failed as expected (note the status
passed to plperl_spi_execute_fetch_result is -6 which is
SPI_ERROR_ARGUMENT), but it did not reset SPI_tuptable, which led to
plperl_spi_execute_fetch_result trying to call SPI_freetuptable on what
was probably an already-deleted tuple table.

One theory we could adopt on this is that this is
plperl_spi_execute_fetch_result's fault and it shouldn't be trying to
free a tuple table unless status > 0.

Another theory we could adopt is that SPI functions that are capable of
setting SPI_tuptable ought to clear it at start, to ensure that they
return it as null on failure.

The latter seems like a nicer fix but I'm afraid it might have
unexpected side-effects.  It would certainly be a lot more invasive.



These aren't mutually exclusive, though, are they? It seems reasonable 
to do the minimal fix for the stable branches (looks like it's just a 
matter of moving the call up a couple of lines in plperl.c) and make the 
nicer fix just for the development branch.


cheers

andrew






Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 The most popular relational database in the world is Microsoft Access,
 not MySQL. Access appears desirable because it allows a single user to
 create and use a database (which is very good). But all business
 databases have a requirement for at least one of: high availability,
 multi-user access or downstream processing in other parts of the
 business.

That's a mighty sweeping claim, which you haven't offered adequate
evidence for.  The fact of the matter is that there is *lots* of demand
for simple single-user databases, and what I'm proposing is at least a
first step towards getting there.

The main disadvantage of approaching this via the existing single-user
mode is that you won't have any autovacuum, bgwriter, etc, support.
But the flip side is that that lack of infrastructure is a positive
advantage for certain admittedly narrow use-cases, such as disaster
recovery and pg_upgrade.  So while I agree that this isn't the only
form of single-user mode that we'd like to support, I think it is *a*
form we'd like to support, and I don't see why you appear to be against
having it at all.

A more reasonable objection would be that we need to make sure that this
isn't foreclosing the option of having a multi-process environment with
a single user connection.  I don't see that it is, but it might be wise
to sketch exactly how that case would work before accepting this.

regards, tom lane




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Josh Berkus
On 11/12/12 7:59 PM, Amit kapila wrote:
 On Monday, November 12, 2012 12:07 PM Greg Smith wrote:
 On 11/9/12 11:59 PM, Amit kapila wrote:
 
 Please let me know if there are any objections or problems in above method 
 of implementation,
 else I can go ahead to prepare the patch for the coming CF.
 
 It may be the case that the locking scheme Robert described is the best
 approach here.  It seems kind of heavy to me though.  I suspect that
 some more thinking about it might come up with something better.

So, here's the problem I'm seeing with having a single .auto file:  when
we write settings to a file, are we writing a *single* setting or *all
of a user's current settings*?

I was imagining writing single, specific settings, which inevitably
leads to one-setting-per-file, e.g.:

SET PERSISTENT work_mem = '256MB';

What Amit seems to be talking about is more like EXPORT SETTINGS, where you
dump all current settings in the session to a file. This seems likely
to produce accidental changes when the user writes out settings they've
forgotten they changed.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 I was imagining writing single, specific settings, which inevitably
 leads to one-setting-per-file, e.g.:

 SET PERSISTENT work_mem = '256MB';

 What Amit seems to be talking about is more EXPORT SETTINGS, where you
 dump all current settings in the session to a file.  This seems likely
 to produce accidental changes when the user writes out settings they've
 forgotten they changed.

Yeah.  It also seems to be unnecessarily different from the existing
model of SET.  I'd go with one-setting-per-command.

regards, tom lane




Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Gavin Flower

On 14/11/12 04:32, Merlin Moncure wrote:

On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Ok. It won't help all that much on 9.0, though.

Well, it won't help GIST much, but the actually-reported-from-the-field
case is in btree, and it does fix that.

It occurs to me that if we're sufficiently scared of this case, we could
probably hack the planner (in 9.0 only) to refuse to use GIST indexes
in hot-standby queries.  That cure might be worse than the disease though.

if anything, it should be documented.  if you do this kind of thing
people will stop installing bugfix releases.

merlin


How about displaying a warning when people try to use the 'feature', as
well as documenting it?


Cheers,
Gavin





Re: [HACKERS] Inadequate thought about buffer locking during hot standby replay

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 10:32 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Nov 13, 2012 at 9:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Ok. It won't help all that much on 9.0, though.

 Well, it won't help GIST much, but the actually-reported-from-the-field
 case is in btree, and it does fix that.

 It occurs to me that if we're sufficiently scared of this case, we could
 probably hack the planner (in 9.0 only) to refuse to use GIST indexes
 in hot-standby queries.  That cure might be worse than the disease though.

 if anything, it should be documented.  if you do this kind of thing
 people will stop installing bugfix releases.

Agreed.  I think doing that in a back-branch release would be
extremely user-hostile.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 I was imagining writing single, specific settings, which inevitably
 leads to one-setting-per-file, e.g.:

 SET PERSISTENT work_mem = '256MB';

 What Amit seems to be talking about is more EXPORT SETTINGS, where you
 dump all current settings in the session to a file.  This seems likely
 to produce accidental changes when the user writes out settings they've
 forgotten they changed.

 Yeah.  It also seems to be unnecessarily different from the existing
 model of SET.  I'd go with one-setting-per-command.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila amit.kap...@huawei.com wrote:
 Is the above opinion about only the locking, or also about the way the
 changed settings are written to a file, as mentioned in point-1 in the mail
 chain upthread?
 (Point-1: 1. While writing the .auto file, it will always assume that the
 .auto file contains all config parameters.
   Now as this .auto file is of fixed format and fixed record size, it can
 directly write a given record to its particular position.)
 My thinking was that if we decide the format and size of each configuration
 entry is fixed, a record can be written directly without doing anything for
 it in memory.

Uh, no, I don't think that's a good idea.  IMHO, what we should do is:

1. Read postgresql.conf.auto and remember all the settings we saw.  If
we see something funky like an include directive, barf.
2. Forget the value we remembered for the particular setting being
changed.  Instead, remember the user-supplied new value for that
parameter.
3. Write a new postgresql.conf.auto based on the information
remembered in steps 1 and 2.
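
A sketch of those three steps as a naive key = value read/modify/rewrite
(plain C with no real GUC parsing and fsync omitted; the real thing would
reuse the configuration-file lexer):

    #include <stdio.h>
    #include <string.h>

    static int
    rewrite_auto_conf(const char *path, const char *name, const char *value)
    {
        char    tmppath[1024];
        char    line[1024];
        FILE   *in;
        FILE   *out;
        size_t  namelen = strlen(name);

        snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
        out = fopen(tmppath, "w");
        if (out == NULL)
            return -1;

        in = fopen(path, "r");
        if (in != NULL)
        {
            while (fgets(line, sizeof(line), in))
            {
                if (strncmp(line, "include", 7) == 0)
                    return -1;          /* step 1: something funky, barf */
                if (strncmp(line, name, namelen) == 0 &&
                    (line[namelen] == ' ' || line[namelen] == '='))
                    continue;           /* step 2: forget the old value */
                fputs(line, out);       /* step 1: keep what we saw */
            }
            fclose(in);
        }
        fprintf(out, "%s = '%s'\n", name, value);   /* step 2: new value */
        fclose(out);
        return rename(tmppath, path);   /* step 3: swap in the new file */
    }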

Of course, if we go with one-file-per-setting, then this becomes even
simpler: just clobber the file for the single setting being updated -
creating it if it doesn't exist - and ignore all the rest.  I don't
personally favor that approach because I think it's clunky to
manage, but YMMV.

With either approach, it's worth noting that a RESET variant of this
could be useful - which would either remove the chosen setting from
postgresql.conf.auto, or remove the file containing the
automatically-set value for that setting.  I think my personal
favorite syntax is:

ALTER SYSTEM .. SET wunk = 'thunk';
ALTER SYSTEM .. RESET wunk;

But I'm OK with something else if there's consensus.  I don't
particularly like SET PERSISTENT because I think this is more like
ALTER DATABASE .. SET than it is like SET LOCAL, but IJWH.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 The most popular relational database in the world is Microsoft Access,
 not MySQL. Access appears desirable because it allows a single user to
 create and use a database (which is very good). But all business
 databases have a requirement for at least one of: high availability,
 multi-user access or downstream processing in other parts of the
 business.

 That's a mighty sweeping claim, which you haven't offered adequate
 evidence for.  The fact of the matter is that there is *lots* of demand
 for simple single-user databases, and what I'm proposing is at least a
 first step towards getting there.

 The main disadvantage of approaching this via the existing single-user
 mode is that you won't have any autovacuum, bgwriter, etc, support.
 But the flip side is that that lack of infrastructure is a positive
 advantage for certain admittedly narrow use-cases, such as disaster
 recovery and pg_upgrade.  So while I agree that this isn't the only
 form of single-user mode that we'd like to support, I think it is *a*
 form we'd like to support, and I don't see why you appear to be against
 having it at all.

 A more reasonable objection would be that we need to make sure that this
 isn't foreclosing the option of having a multi-process environment with
 a single user connection.  I don't see that it is, but it might be wise
 to sketch exactly how that case would work before accepting this.

I'm not particularly excited about providing more single-user mode
options, but I think it's worth having this particular thing because
it makes pg_upgrade more robust.  Whether we do anything else is
something we can litigate when the time comes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Simon Riggs
On 13 November 2012 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 The most popular relational database in the world is Microsoft Access,
 not MySQL. Access appears desirable because it allows a single user to
 create and use a database (which is very good). But all business
 databases have a requirement for at least one of: high availability,
 multi-user access or downstream processing in other parts of the
 business.

 That's a mighty sweeping claim, which you haven't offered adequate
 evidence for.  The fact of the matter is that there is *lots* of demand
 for simple single-user databases, and what I'm proposing is at least a
 first step towards getting there.

I agree there is lots of demand for simple single-user databases and I
wish that too. What I don't agree with is something that casts that
requirement in stone by architecturally/permanently disallowing
secondary connections.

Evidence for claims:
* The whole Business Intelligence industry relies on being able to
re-purpose existing data, forming integrated webs of interconnecting
databases. All of that happens after the initial developers write the
first version of the database application.
* Everybody wants a remote backup, whether it's for your mobile phone
contact list or your enterprise datastore.

People are migrating away from embedded databases in droves for these
very reasons.

 The main disadvantage of approaching this via the existing single-user
 mode is that you won't have any autovacuum, bgwriter, etc, support.
 But the flip side is that that lack of infrastructure is a positive
 advantage for certain admittedly narrow use-cases, such as disaster
 recovery and pg_upgrade.  So while I agree that this isn't the only
 form of single-user mode that we'd like to support, I think it is *a*
 form we'd like to support, and I don't see why you appear to be against
 having it at all.

I have no problem with people turning things off, I reject the idea
that we should encourage people to never be able to turn them back on.

 A more reasonable objection would be that we need to make sure that this
 isn't foreclosing the option of having a multi-process environment with
 a single user connection.  I don't see that it is, but it might be wise
 to sketch exactly how that case would work before accepting this.

Whatever we provide will become the norm. I don't have a problem with
you providing BOTH the proposed single user mode AND the multi-process
single user connection mode in this release. But if you provide just
one of them and it's the wrong one, we will be severely hampered in the
future.

Yes, I am very much against this project producing a new DBMS
architecture that works on top of PostgreSQL data files, yet prevents
maintenance, backup, replication and multi-user modes.

I see this decision as a critical point for this project, so please
consider this objection and where it comes from.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Variable length array element encoding…

2012-11-13 Thread Sean Chittenden
[ Not subscribed, please keep me in the CC list ]

Is there a standard idiom for encoding small variable-length data in an array?
I wrote the varint extension[1] that encodes data using a variable-width
encoding scheme[2] for signed and unsigned integers[3]. Right now the extension
is mostly of use in skinny tables that have at least 4-5 columns, all of which
are INT or INT8. If you have only 5 columns of INT8, you can save ~50% of
your table space.

But, to get larger savings, it's necessary to bypass the per-tuple overhead by
aggregating data into an array (i.e. aggregate all time series data for a 5min
window of time into a single varuint[]).

The problem with that being that each varint takes 8 bytes in an array because
of padding and alignment. Is there a way to prevent that, or, more
realistically, are there standard ways of encoding this data into a BYTEA and
then manually scanning and unpacking it? Random access into the array isn't a
concern. I was thinking about adding a BYTEA to varint[] cast, but am fishing
for a better idea.

Any hints or thoughts? Thanks in advance. -sc
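
One common answer is the protobuf-style scheme: zigzag the sign into the low
bit, then emit 7 payload bits per byte with the high bit as a continuation
flag, so a packed buffer needs no padding and decodes sequentially. A generic
sketch (not necessarily the encoding the varint extension uses):

    #include <stddef.h>
    #include <stdint.h>

    /* zigzag maps small negative numbers to small unsigned ones */
    static uint64_t
    zigzag(int64_t v)
    {
        return ((uint64_t) v << 1) ^ (uint64_t) (v >> 63);
    }

    /* returns bytes written (1..10); no padding, no alignment */
    static size_t
    encode_varint(uint64_t u, unsigned char *out)
    {
        size_t  n = 0;

        while (u >= 0x80)
        {
            out[n++] = (unsigned char) (u | 0x80);  /* 7 bits + continuation */
            u >>= 7;
        }
        out[n++] = (unsigned char) u;               /* final byte, high bit 0 */
        return n;
    }

    /* sequential decode: the "scan and unpack" a bytea cast would do */
    static size_t
    decode_varint(const unsigned char *in, uint64_t *u)
    {
        size_t  n = 0;
        int     shift = 0;

        *u = 0;
        do
        {
            *u |= (uint64_t) (in[n] & 0x7F) << shift;
            shift += 7;
        } while (in[n++] & 0x80);
        return n;
    }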


[1] https://github.com/sean-/postgresql-varint

[2] SELECT varint64,  pg_column_size(varint64) FROM varint64_table ORDER BY 
varint64 ASC;
   varint64   |  pg_column_size
--+-
 -4611686018427387905 |  11
 -4611686018427387904 |  10
 -36028797018963969   |  10
 -36028797018963968   |   9
 -281474976710657 |   9
 -281474976710656 |   8
 -219902323   |   8
 -219902322   |   7
 -17179869185 |   7
 -17179869184 |   6
 -134217729   |   6
 -134217728   |   5
 -1048577 |   5
 -1048576 |   4
 -8193|   4
 -8192|   3
 -65  |   3
 -64  |   2
 -1   |   2
 0|   2
 1|   2
 63   |   2
 64   |   3
 8191 |   3
 8192 |   4
 1048575  |   4
 1048576  |   5
 134217727|   5
 134217728|   6
 17179869183  |   6
 17179869184  |   7
 219902321|   7
 219902322|   8
 281474976710655  |   8
 281474976710656  |   9
 36028797018963967|   9
 36028797018963968|  10
 4611686018427387903  |  10
 4611686018427387904  |  11
(39 rows)

SELECT varuint64,  pg_column_size(varuint64) FROM varuint64_table ORDER BY 
varuint64 ASC;
  varuint64  |  pg_column_size
-+-
 0   |   2
 127 |   2
 128 |   3
 16383   |   3
 16384   |   4
 2097151 |   4
 2097152 |   5
 268435455   |   5
 268435456   |   6
 34359738367 |   6
 34359738368 |   7
 4398046511103   |   7
 4398046511104   |   8
 562949953421311 |   8
 562949953421312 |   9
 72057594037927935   |   9
 72057594037927936   |  10
 9223372036854775807 |  10

[3] I know the unsigned int only goes up to 2^63 atm; it will go to 2^64 once 
I get around to setting up a test methodology. Using INT8 internally was too 
convenient at the time.

--
Sean Chittenden
s...@chittenden.org





Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder though if we ought to think about running output functions in
 a short-lived memory context instead of the executor's main context.
 We've considered that before, I think, and it's always been the path
 of least resistance to fix the output functions instead --- but there
 will always be another leak I'm afraid.

Such is the lot of people who code in C.  I worry that the number of
memory contexts we're kicking around already is imposing a significant
distributed overhead on the system that is hard to measure but
nevertheless real, and that this will add to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 13 November 2012 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
 ...  The fact of the matter is that there is *lots* of demand
 for simple single-user databases, and what I'm proposing is at least a
 first step towards getting there.

 I agree there is lots of demand for simple single-user databases and I
 wish that too. What I don't agree with is something that casts that
 requirement in stone by architecturally/permanently disallowing
 secondary connections.

If you want secondary connections, then I think you want a postmaster.
We already have umpteen ways to limit who can connect (for example,
putting the socket in a directory with limited access rights), and in
that sort of situation I don't see why you'd really want a database
that is only accessible when the main client is running.

The case that this patch is meant to address is one where there is only
one client application, period, and you'd rather that the database
starts and stops automatically with that application instead of needing
any management complexity.  Now we can debate whether we want only one
process or multiple processes underneath the client application, but
I think the restriction to one client connection is a key *feature*
not a bug, precisely because it removes a whole bunch of user-visible
complexity that we cannot escape otherwise.

 People are migrating away from embedded databases in droves for these
 very reasons.

[ shrug... ]  If they don't want an embedded database, they won't want
this either, but there are still plenty of people left who do want an
embedded database.  We've never had an adequate offering for those
people before.  If we ratchet up the management complexity of single
user mode then it still won't be an adequate offering for them.

 I see this decision as a critical point for this project, so please
 consider this objection and where it comes from.

I think this is nonsense.  It's not critical; it's a very small patch
that provides a feature of interest to a limited audience.  And I don't
believe it's foreclosing providing other operating modes later, unless
maybe people feel this is almost good enough and lose motivation to
work on those other operating modes.  But if that happens, then I'd say
the demand for the other modes isn't as high as you think.

regards, tom lane




Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder though if we ought to think about running output functions in
 a short-lived memory context instead of the executor's main context.
 We've considered that before, I think, and it's always been the path
 of least resistance to fix the output functions instead --- but there
 will always be another leak I'm afraid.

 Such is the lot of people who code in C.  I worry that the number of
 memory contexts we're kicking around already is imposing a significant
 distributed overhead on the system that is hard to measure but
 nevertheless real, and that this will add to it.

Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
than a bunch of retail pfree's, but it's hard to prove anything without
actually coding and testing it --- and on modern machines, effects like
cache locality could swamp pure instruction-count gains anyway.

Anyway, I committed the narrow fix for the moment.

regards, tom lane




Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Robert Haas
On Sun, Nov 11, 2012 at 5:52 PM, Jeff Davis pg...@j-davis.com wrote:
 Per-database does sound easier than per-table. I'd have to think about
 how that would affect shared catalogs though.

 For now, I'm leaning toward an offline utility to turn checksums on or
 off, called pg_checksums. It could do so lazily (just flip a switch to
 enabling in pg_control), or it could do so eagerly and turn it into a
 fully-protected instance.

 For the first patch, it might just be an initdb-time option for
 simplicity.

It'd be pretty easy to write a pg_checksums utility to turn checksums
on/off on a database that is shut down, since the hard part of all of
this is to change the state while the database is running.  But I
think even that doesn't need to be part of the first patch.  A small
patch that gets committed is better than a big one that doesn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Christopher Browne
Preface: I think there's some great commentary here, and find myself
agreeing pretty wholeheartedly.

On Tue, Nov 13, 2012 at 2:45 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 13 November 2012 17:38, Tom Lane t...@sss.pgh.pa.us wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  The most popular relational database in the world is Microsoft Access,
  not MySQL. Access appears desirable because it allows a single user to
  create and use a database (which is very good). But all business
  databases have a requirement for at least one of: high availability,
  multi-user access or downstream processing in other parts of the
  business.
 
  That's a mighty sweeping claim, which you haven't offered adequate
  evidence for.  The fact of the matter is that there is *lots* of demand
  for simple single-user databases, and what I'm proposing is at least a
  first step towards getting there.

 I agree there is lots of demand for simple single-user databases and I
 wish that too. What I don't agree with is something that casts that
 requirement in stone by architecturally/permanently disallowing
 secondary connections.

 Evidence for claims:
 * The whole Business Intelligence industry relies on being able to
 re-purpose existing data, forming integrated webs of interconnecting
 databases. All of that happens after the initial developers write the
 first version of the database application.
 * Everybody wants a remote backup, whether its for your mobile phone
 contact list or your enterprise datastore.

 People are migrating away from embedded databases in droves for these
 very reasons.


There seems to be a continuum of different sorts of scenarios, from
more to less concurrency, that are desirable for different reasons.
From most to least, I can see:

1 - Obviously, there's the case that Postgres is eminently good at, of
  supporting many users concurrently using a database.  We love that,
  let's not break it :-).

2 - We have found it useful to have some extra work processes that do
  some useful internal things, such as vacuuming, forcing background
  writes, collecting statistics.  And an online backup requires having
  a second process.

3 - People doing embedded systems find it attractive to attach all the
  data to the singular user running the system.  Witness the *heavy*
  deployment of SQLite on Android and iOS.  People make an assumption
  that this is a single-process thing, but I am inclined to be a bit
  skeptical.  What they *do* know is that it's convenient to not spawn
  extra processes and do IPC.  That's not quite the same thing as it
  being a certainty that they definitely want not to have more than
  one process.

4 - There are times when there *is* certainty about not wanting there
  to be more than one process.  When running pg_upgrade, or, at certain
  times, when doing streaming replication node status switches, one
  might have that certainty.  Or when reindexing system tables, which
  needs single user mode.

For us to conflate the 3rd and 4th items seems like a mistake to me.


  The main disadvantage of approaching this via the existing single-user
  mode is that you won't have any autovacuum, bgwriter, etc, support.
  But the flip side is that that lack of infrastructure is a positive
  advantage for certain admittedly narrow use-cases, such as disaster
  recovery and pg_upgrade.  So while I agree that this isn't the only
  form of single-user mode that we'd like to support, I think it is *a*
  form we'd like to support, and I don't see why you appear to be against
  having it at all.

 I have no problem with people turning things off, I reject the idea
 that we should encourage people to never be able to turn them back on.


Yep.  That seems like conflating #2 with #4.

It's mighty attractive to have a forcible single process mode to add
safety to certain activities.

I think we need a sharper knife, though, so we don't ablate off stuff
like #2, just because someone imagined that "Must Have Single
Process!!!" was the right doctrine.


  A more reasonable objection would be that we need to make sure that this
  isn't foreclosing the option of having a multi-process environment with
  a single user connection.  I don't see that it is, but it might be wise
  to sketch exactly how that case would work before accepting this.

 Whatever we provide will become the norm. I don't have a problem with
 you providing BOTH the proposed single user mode AND the multi-process
 single user connection mode in this release. But if you provide just
 one of them and its the wrong one, we will be severely hampered in the
 future.

 Yes, I am very much against this project producing a new DBMS
 architecture that works on top of PostgreSQL data files, yet prevents
 maintenance, backup, replication and multi-user modes.

 I see this decision as a critical point for this project, so please
 consider this objection and where it comes from.


I don't think we're necessarily *hugely* hampered by doing one of 

Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Robert Haas
On Mon, Nov 12, 2012 at 4:44 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 That'll make it hard for VACUUM, hint-bit setting, etc to
 opportunistically checksum pages whenever they're doing a page write anyway.

 Is it absurd to suggest using another bitmap, like the FSM or visibility
 map, to store information on page checksumming while checksumming is
 enabled but incomplete? As a much smaller file the bitmap could itself
 be very quickly generated in one pass when checksumming is enabled, with
 its starting state showing no pages having checksums.

Hmm... what if we took this a step further and actually stored the
checksums in a separate relation fork?  That would make it pretty
simple to support enabling/disabling checksums for particular
relations.  It would also allow us to have a wider checksum, like 32
or 64 bits rather than 16.  I'm not scoffing at a 16-bit checksum,
because even that's enough to catch a very high percentage of errors,
but it wouldn't be terrible to be able to support a wider one, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 3:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Nov 13, 2012 at 12:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder though if we ought to think about running output functions in
 a short-lived memory context instead of the executor's main context.
 We've considered that before, I think, and it's always been the path
 of least resistance to fix the output functions instead --- but there
 will always be another leak I'm afraid.

 Such is the lot of people who code in C.  I worry that the number of
 memory contexts we're kicking around already is imposing a significant
 distributed overhead on the system that is hard to measure but
 nevertheless real, and that this will add to it.

 Yeah, perhaps.  I'd like to think that a MemoryContextReset is cheaper
 than a bunch of retail pfree's, but it's hard to prove anything without
 actually coding and testing it --- and on modern machines, effects like
 cache locality could swamp pure instruction-count gains anyway.

Yeah.  The thing that concerns me is that I think we have a pretty
decent number of memory contexts where the expected number of
allocations is very small ... and we have the context *just in case*
we do more than that in certain instances.  I've seen profiles where
the setup/teardown costs of memory contexts are significant ... which
doesn't mean that those examples would perform better with fewer
memory contexts, but it's enough to make me pause for thought.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to compute Max LSN of Data Pages

2012-11-13 Thread Robert Haas
On Tue, Nov 13, 2012 at 11:46 AM, Fujii Masao masao.fu...@gmail.com wrote:
 Without this utility, it's difficult to calculate the maximum LSN of
 a data page, so
 basically we needed to take a backup when starting the standby. In the future,
 thanks to this utility, we can calculate the maximum LSN, and can skip a 
 backup
 if that LSN is less than the master (i.e., last applied LSN, IOW,
 timeline switch LSN).

Doesn't the minimum recovery point give us that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling Checksums

2012-11-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Hmm... what if we took this a step further and actually stored the
 checksums in a separate relation fork?  That would make it pretty
 simple to support enabling/disabling checksums for particular
 relations.  It would also allow us to have a wider checksum, like 32
 or 64 bits rather than 16.  I'm not scoffing at a 16-bit checksum,
 because even that's enough to catch a very high percentage of errors,
 but it wouldn't be terrible to be able to support a wider one, either.

What happens when you get an I/O failure on the checksum fork?  Assuming
you're using 8K pages there, that would mean you can no longer verify
the integrity of between one and four thousand pages of data.

Not to mention the race condition problems associated with trying to be
sure the checksum updates hit the disk at the same time as the data-page
updates.

I think you really have to store the checksums *with* the data they're
supposedly protecting.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign key locks

2012-11-13 Thread Alvaro Herrera
Noah Misch wrote:
 On Wed, Oct 31, 2012 at 05:22:10PM -0300, Alvaro Herrera wrote:

  Not really sure about the proposed syntax, but yes clearly we need some
  other syntax to mean FOR NON KEY UPDATE.  I would rather keep FOR
  UPDATE to mean what I currently call FOR KEY UPDATE.  More proposals for
  the other (weaker) lock level welcome (but if you love FOR NON KEY
  UPDATE, please chime in too)
 
 Agree on having FOR UPDATE without any FOR KEY UPDATE synonym.  For the
 weaker lock, I mildly preferred the proposal of FOR NO KEY UPDATE.  NON KEY
 captures the idea better in English, but NO is close enough and already part
 of the SQL lexicon.

This is the proposal I like best; however there is an asymmetry, because
the locking options now are

FOR KEY SHARE
FOR SHARE
FOR NO KEY UPDATE
FOR UPDATE

I used to have comments such as

/* Currently, SELECT ... FOR [KEY] UPDATE/SHARE requires UPDATE privileges */
#define ACL_SELECT_FOR_UPDATE   ACL_UPDATE

but now they are slightly incorrect because the NO is not illustrated.
I guess I could use SELECT ... FOR [NO KEY] UPDATE/SHARE but this leaves
out the FOR KEY SHARE case (and can be thought to introduce a FOR NO
KEY SHARE case).  And getting much more verbose than that is probably
not warranted.  In some places I would like to use a phrase like "the
locking clause", but I'm not sure that it's clear enough.
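
Spelled out, the four levels would look like this in a query, weakest
to strongest (a sketch of the proposed syntax; "accounts" is a made-up
table used only for illustration):

SELECT * FROM accounts WHERE id = 1 FOR KEY SHARE;
SELECT * FROM accounts WHERE id = 1 FOR SHARE;
SELECT * FROM accounts WHERE id = 1 FOR NO KEY UPDATE;
SELECT * FROM accounts WHERE id = 1 FOR UPDATE;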

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Process waiting for ExclusiveLock on INSERT

2012-11-13 Thread Jehan-Guillaume de Rorthais
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello,

I had a surprising question today from a customer about exclusive
locks on INSERT. His log file shows something like:


==
  LOG:  process 1881 still waiting for ExclusiveLock on extension of
relation 168318652 of database 58429185 after 1000.065 ms
  STATEMENT:  INSERT INTO  (...cols...) VALUES (...values...)
==


While asking on #postgresql and investigating in the code, I came up
with a theory. When the system is I/O bound and backends are doing
shared buffer cleanups, at some point they have to extend a relation
with new pages, which requires an exclusive lock on the relation to
forbid anyone else from extending it at the same time. At that point,
if multiple backends try to extend the relation, one wins and all
others wait for the lock, leading to messages in the log file when
log_lock_waits is enabled.

This lock comes from src/backend/access/heap/hio.c:432:
==
  /*
   * Have to extend the relation.
   *
   * We have to use a lock to ensure no one else is extending the rel at
   * the same time, else we will both try to initialize the same new page.
   * We can skip locking for new or temp relations, however, since no one
   * else could be accessing them.
   */
   needLock = !RELATION_IS_LOCAL(relation);

   if (needLock)
   LockRelationForExtension(relation, ExclusiveLock);
==


Is this theory correct, or should this issue be discussed a bit more on
this list?
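
As an aside, such waits can be observed live in pg_locks, where
relation extension locks show up with locktype 'extend'. A minimal
sketch, run from a second session while the INSERTs are going on:

==
  SELECT locktype, relation::regclass AS rel, pid, mode, granted
    FROM pg_locks
   WHERE locktype = 'extend';
==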

PFA a small bash script, with a small PostgreSQL configuration in
comments, that reproduces this behavior very often on my laptop, e.g.:


==
  2012-11-13 23:15:51 CET [23137]: [1-1] user=postgres,db=test LOG:
process 23137 still waiting for ExclusiveLock on extension of relation
28118 of database 28115 after 100.086 ms
  2012-11-13 23:15:51 CET [23137]: [2-1] user=postgres,db=test
STATEMENT:  insert into test select i, md5(i::text) from
generate_series(1,1000) AS i
==

Regards,
- -- 
Jehan-Guillaume de Rorthais
http://www.dalibo.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAlCiyT4ACgkQXu9L1HbaT6KbdgCgslQiKjP5bovr/eN5gi1TJB6i
9pcAoI9BpfD/4306xSUZTPUcQTLYHJS3
=HgzB
-END PGP SIGNATURE-


test_exclusivelock_on_insert.sh
Description: application/shellscript

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol

2012-11-13 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I agree there is lots of demand for simple single-user databases and I
 wish that too. What I don't agree with is something that casts that
 requirement in stone by architecturally/permanently disallowing
 secondary connections.

 If you want secondary connections, then I think you want a postmaster.

I would agree. I think you're both talking past each other, and that
what Simon is worried about (but I haven't asked him about that before
sending that email) is how to change the application setup to switch
from single user mode to multi user mode.

IIRC the way to implement single user mode in your application is quite
low-level with this patch, so switching to multi-user mode is not about
just changing the connection string, or is it?

 The case that this patch is meant to address is one where there is only
 one client application, period, and you'd rather that the database
 starts and stops automatically with that application instead of needing
 any management complexity.  Now we can debate whether we want only one
 process or multiple processes underneath the client application, but
 I think the restriction to one client connection is a key *feature*
 not a bug, precisely because it removes a whole bunch of user-visible
 complexity that we cannot escape otherwise.

Well, I think your patch would be easier to accept as-is if it were
documented only as a psql-friendly single-user mode. I would really
welcome that.

 embedded database.  We've never had an adequate offering for those
 people before.  If we ratchet up the management complexity of single
 user mode then it still won't be an adequate offering for them.

Now, if we're talking about single user mode as in an embedded database,
I really do think this patch should include a solution to run online
maintenance, logical and physical backups, replication, archiving and
all the production-grade features you expect from PostgreSQL.

And then I understand Simon's POV about code complexity and bgworkers,
for example, which will *need* to be taken care of in that solution.

 I think this is nonsense.  It's not critical; it's a very small patch
 that provides a feature of interest to a limited audience.  And I don't

Yes, it's providing full psql capabilities where we only had that bizarre
postgres --single interface. Maybe it will make initdb and debugging it
easier too.

 believe it's foreclosing providing other operating modes later, unless
 maybe people feel this is almost good enough and lose motivation to
 work on those other operating modes.  But if that happens, then I'd say
 the demand for the other modes isn't as high as you think.

Again, my concern on that point after reading Simon's comments is only
about the production procedure you have to follow to switch your
application from single user mode to multi user mode.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index only scans wiki page

2012-11-13 Thread Peter Geoghegan
On 13 November 2012 16:37, Robert Haas robertmh...@gmail.com wrote:
 I found this an interesting read.  As one of the people who worked on
 the feature, I'm sort of curious whether people have any experience
 yet with how this actually shakes out in the field.  Are you (or is
 anyone) aware of positive/negative field experiences with this
 feature?

Unfortunately, I don't think that I have any original insight about
the problems with index-only scans in the field right now.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Yeah.  The thing that concerns me is that I think we have a pretty
 decent number of memory contexts where the expected number of
 allocations is very small ... and we have the context *just in case*
 we do more than that in certain instances.  I've seen profiles where
 the setup/teardown costs of memory contexts are significant ... which
 doesn't mean that those examples would perform better with fewer
 memory contexts, but it's enough to make me pause for thought.

So, for my 2c, I'm on the other side of this, personally.  We have
memory contexts for more-or-less exactly this issue.  It's one of the
great things about PG -- it's resilient and very unlikely to have large or
bad memory leaks in general, much of which can, imv, be attributed to
our use of memory contexts.

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread Craig Ringer
Hi all

Andreas Heiduk on -bugs suggested that we add the functions provided by
contrib modules to the function index in the docs, so it's easier to go
from, say, "what the heck is idx(...)" to finding it in the intarray
contrib module.

This seems like a good idea and I'd like to pop it in the TODO until I
get time to check it out, flagged as a minor/newbie-friendly problem.
Any objections?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread David Johnston
 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Craig Ringer
 Sent: Tuesday, November 13, 2012 6:15 PM
 To: PostgreSQL Hackers
 Subject: [HACKERS] Add contrib module functions to docs' function index
 
 Hi all
 
 Andreas Heiduk on -bugs suggested that we add the functions provided by
 contrib modules to the function index in the docs, so it's easier to go
from,
  say, "what the heck is idx(...)" to finding it in the intarray contrib
module.
 
 This seems like a good idea and I'd like to pop it in the TODO until I
 get time to check it out, flagged as a minor/newbie-friendly problem.
 Any objections?
 

For clarity, does this proposal refer to Chapter 9 of the documentation,
the Index, or both?

If modifying Chapter 9, the function and operator tables should be
extended to include a "source" column with values of "base" or
"contrib: module name" or something similar.

As to the desirability of such a change I concur that it would be a nice
usability enhancement to consider beyond just updating the actual Index.

David J.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add contrib module functions to docs' function index

2012-11-13 Thread Craig Ringer
On 11/14/2012 07:56 AM, David Johnston wrote:

 For clarity does this proposal refer to Chapter 9 of the documentation, the
 Index, or both.

 If modifying Chapter 9 the function and operator tables should be extended
  to include a "source" column with values of "base" or "contrib: module
  name" or something similar.

 As to the desirability of such a change I concur that it would be a nice
 usability enhancement to consider beyond just updating the actual Index.
Sorry I was unclear.

I'm talking about making sure that contrib module functions (and
settings) appear in the documentation index
(http://www.postgresql.org/docs/current/static/bookindex.html) so it's easy
to find a function by name whether it's in core or contrib. This is what
I want to add to TODO.

Separately, it might also be nice to add the contrib functions to the
section 9 tables with an extra column showing their origin, but that's
less clearly a good thing. Even if there's a column saying "intarray"
for intarray functions in the array functions list, people will still
try to use them without loading the extension and get confused when
they're not found. It'll also bloat the listings of core functions.
Rather than do that, I'd probably prefer to add a note to relevant
sections. For example, in array functions I'd want to add "Additional
functions that operate only on arrays of integers are available in the
<a href="...">intarray extension</a>."
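
For instance, assuming the intarray extension is installed, the idx()
function mentioned above works like this (a sketch):

CREATE EXTENSION intarray;
SELECT idx(ARRAY[11, 22, 33, 22], 22);  -- 2, position of the first match (0 if none)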

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-13 Thread Bruce Momjian
On Tue, Nov 13, 2012 at 07:03:51PM -0500, Bruce Momjian wrote:
 I am attaching an updated pg_upgrade patch, which I believe is ready for
 application for 9.3.

Correction, here is the proper patch.  The previously posted version
had pending merges from the master branch.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/file.c b/contrib/pg_upgrade/file.c
new file mode 100644
index a5d92c6..d8cd8f5
*** a/contrib/pg_upgrade/file.c
--- b/contrib/pg_upgrade/file.c
*** copy_file(const char *srcfile, const cha
*** 221,281 
  #endif
  
  
- /*
-  * load_directory()
-  *
-  * Read all the file names in the specified directory, and return them as
-  * an array of char * pointers.  The array address is returned in
-  * *namelist, and the function result is the count of file names.
-  *
-  * To free the result data, free each (char *) array member, then free the
-  * namelist array itself.
-  */
- int
- load_directory(const char *dirname, char ***namelist)
- {
- 	DIR		   *dirdesc;
- 	struct dirent *direntry;
- 	int			count = 0;
- 	int			allocsize = 64;		/* initial array size */
- 
- 	*namelist = (char **) pg_malloc(allocsize * sizeof(char *));
- 
- 	if ((dirdesc = opendir(dirname)) == NULL)
- 		pg_log(PG_FATAL, "could not open directory \"%s\": %s\n",
- 			   dirname, getErrorText(errno));
- 
- 	while (errno = 0, (direntry = readdir(dirdesc)) != NULL)
- 	{
- 		if (count >= allocsize)
- 		{
- 			allocsize *= 2;
- 			*namelist = (char **)
- 		pg_realloc(*namelist, allocsize * sizeof(char *));
- 		}
- 
- 		(*namelist)[count++] = pg_strdup(direntry->d_name);
- 	}
- 
- #ifdef WIN32
- 	/*
- 	 * This fix is in mingw cvs (runtime/mingwex/dirent.c rev 1.4), but not in
- 	 * released version
- 	 */
- 	if (GetLastError() == ERROR_NO_MORE_FILES)
- 		errno = 0;
- #endif
- 
- 	if (errno)
- 		pg_log(PG_FATAL, "could not read directory \"%s\": %s\n",
- 			   dirname, getErrorText(errno));
- 
- 	closedir(dirdesc);
- 
- 	return count;
- }
- 
- 
  void
  check_hard_link(void)
  {
--- 221,226 
diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index 3058343..f35ce75
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
***
*** 7,13 
  
  #include <unistd.h>
  #include <assert.h>
- #include <dirent.h>
  #include <sys/stat.h>
  #include <sys/time.h>
  
--- 7,12 
*** const char *setupPageConverter(pageCnvCt
*** 366,372 
  typedef void *pageCnvCtx;
  #endif
  
- int			load_directory(const char *dirname, char ***namelist);
  const char *copyAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
    const char *dst, bool force);
  const char *linkAndUpdateFile(pageCnvCtx *pageConverter, const char *src,
--- 365,370 
diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
new file mode 100644
index 33a867f..d763ba7
*** a/contrib/pg_upgrade/relfilenode.c
--- b/contrib/pg_upgrade/relfilenode.c
***
*** 17,25 
  
  static void transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size);
! static void transfer_relfile(pageCnvCtx *pageConverter,
!  const char *fromfile, const char *tofile,
!  const char *nspname, const char *relname);
  
  
  /*
--- 17,24 
  
  static void transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size);
! static void transfer_relfile(pageCnvCtx *pageConverter, FileNameMap *map,
! 			 const char *suffix);
  
  
  /*
*** static void
*** 131,185 
  transfer_single_new_db(pageCnvCtx *pageConverter,
  	   FileNameMap *maps, int size)
  {
- 	char		old_dir[MAXPGPATH];
- 	char		file_pattern[MAXPGPATH];
- 	char		**namelist = NULL;
- 	int			numFiles = 0;
  	int			mapnum;
! 	int			fileno;
! 	bool		vm_crashsafe_change = false;
! 
! 	old_dir[0] = '\0';
! 
! 	/* Do not copy non-crashsafe vm files for binaries that assume crashsafety */
  	if (old_cluster.controldata.cat_ver < VISIBILITY_MAP_CRASHSAFE_CAT_VER &&
  		new_cluster.controldata.cat_ver >= VISIBILITY_MAP_CRASHSAFE_CAT_VER)
! 		vm_crashsafe_change = true;
  
  	for (mapnum = 0; mapnum < size; mapnum++)
  	{
! 		char		old_file[MAXPGPATH];
! 		char		new_file[MAXPGPATH];
! 
! 		/* Changed tablespaces?  Need a new directory scan? */
! 		if (strcmp(maps[mapnum].old_dir, old_dir) != 0)
! 		{
! 			if (numFiles > 0)
! 			{
! for (fileno = 0; fileno < numFiles; fileno++)
! 	pg_free(namelist[fileno]);
! pg_free(namelist);
! 			}
! 
! 			snprintf(old_dir, sizeof(old_dir), "%s", maps[mapnum].old_dir);
! 			numFiles = load_directory(old_dir, &namelist);
! 		}
! 
! 		/* Copying files might take some time, so give feedback. */
! 
! 		snprintf(old_file, sizeof(old_file), "%s/%u", maps[mapnum].old_dir,
!  maps[mapnum].old_relfilenode);
! 		snprintf(new_file, 

Re: [HACKERS] Doc patch, index search_path where it's used to secure functions

2012-11-13 Thread Peter Eisentraut
On Fri, 2012-09-28 at 12:17 -0500, Karl O. Pinc wrote:
 On 09/28/2012 11:28:39 AM, Karl O. Pinc wrote:
 
  Doc patch, index search_path where it's used to secure functions.
  search_path-securing.patch
 
 Second version.  Should be indexing the concept, not the run-time 
 setting.

Well, I'm not sure.  We currently have three index entries on the topic:

search path
search_path
search_path configuration parameter

I think I'd put them all under search_path.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, further describe and-mask nature of the permission system

2012-11-13 Thread Peter Eisentraut
On Sat, 2012-09-29 at 01:16 -0500, Karl O. Pinc wrote:
 This patch makes some sweeping statements.

Unfortunately, they are wrong.  What you term the additive nature is
really only a special case in the relationship between table and column
privileges.  Schema and database privileges are completely separate
things.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-13 Thread Ants Aasma
On Wed, Nov 14, 2012 at 2:03 AM, Bruce Momjian br...@momjian.us wrote:
 At 64k I see pg_upgrade taking 12% of the duration time, if I subtract
 out the dump/restore times.

My percentage numbers only included CPU time and I used SSD storage.
For the most part there was no IO wait to speak of, but it's
completely expected that thousands of link calls are not free.

 Postgres time itself breaks down with 10% for shutdown checkpoint and
 90% for regular running, consisting of 16% parsing, 13% analyze, 20%
 plan, 30% execute, 11% commit (AtEOXact_RelationCache) and 6% network.

 That SVG graph was quite impressive.

I used perf and Gprof2Dot for this. I will probably do a blog post on
how to generate these graphs. It's much more useful for me than a
plain flat profile as I don't know by heart which functions are called
by which.

 It looks to me that most benefit could be had from introducing more
 parallelism. Are there any large roadblocks to pipelining the dump and
 restore to have them happen in parallel?

 I talked to Andrew Dunstan about parallelization in pg_restore.  First,
 we currently use pg_dumpall, which isn't in the custom format required
 for parallel restore, but if we changed to custom format, create table
 isn't done in parallel, only create index/check constraints, and trigger
 creation, etc.  Not sure if it's worth pursuing this just for pg_upgrade.

I agree that parallel restore for schemas is a hard problem. But I
didn't mean parallelism within the restore, I meant that we could
start both postmasters and pipe the output from dump directly to
restore. This way the times for dumping and restoring can overlap.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] recursive view syntax

2012-11-13 Thread Peter Eisentraut
I noticed we don't implement the recursive view syntax, even though it's
part of the standard SQL feature set for recursive queries.  Here is a
patch to add that.  It basically converts

CREATE RECURSIVE VIEW name (columns) AS SELECT ...;

to

CREATE VIEW name AS WITH RECURSIVE name (columns) AS (SELECT ...) SELECT
columns FROM name;

diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 838bf48..c13f3ec 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -21,7 +21,7 @@
 
  refsynopsisdiv
 synopsis
-CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] VIEW replaceable class=PARAMETERname/replaceable [ ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) ]
+CREATE [ OR REPLACE ] [ TEMP | TEMPORARY ] [ RECURSIVE ] VIEW replaceable class=PARAMETERname/replaceable [ ( replaceable class=PARAMETERcolumn_name/replaceable [, ...] ) ]
 [ WITH ( replaceable class=PARAMETERview_option_name/replaceable [= replaceable class=PARAMETERview_option_value/replaceable] [, ... ] ) ]
 AS replaceable class=PARAMETERquery/replaceable
 /synopsis
@@ -81,6 +81,23 @@ titleParameters/title
/varlistentry
 
varlistentry
+termliteralRECURSIVE//term
+listitem
+ para
+  Creates a recursive view.  The syntax
+synopsis
+CREATE RECURSIVE VIEW replaceablename/ (replaceablecolumns/) AS SELECT replaceable.../;
+/synopsis
+  is equivalent to
+synopsis
+CREATE VIEW replaceablename/ AS WITH RECURSIVE replaceablename/ (replaceablecolumns/) AS (SELECT replaceable.../) SELECT replaceablecolumns/ FROM replaceablename/;
+/synopsis
+  A view column list must be specified for a recursive view.
+ /para
+/listitem
+   /varlistentry
+
+   varlistentry
 termreplaceable class=parametername/replaceable/term
 listitem
  para
@@ -191,6 +208,16 @@ titleExamples/title
literal*/ was used to create the view, columns added later to
the table will not be part of the view.
   /para
+
+  para
+   Create a recursive view consisting of the numbers from 1 to 100:
+programlisting
+CREATE RECURSIVE VIEW nums_1_100 (n) AS
+VALUES (1)
+UNION ALL
+SELECT n+1 FROM nums_1_100 WHERE n < 100;
+/programlisting
+  /para
  /refsect1
 
  refsect1
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e4ff76e..159096a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -164,6 +164,7 @@ static void SplitColQualList(List *qualList,
 static void processCASbits(int cas_bits, int location, const char *constrType,
 			   bool *deferrable, bool *initdeferred, bool *not_valid,
 			   bool *no_inherit, core_yyscan_t yyscanner);
+static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
 %}
 
@@ -7834,6 +7835,30 @@ ViewStmt: CREATE OptTemp VIEW qualified_name opt_column_list opt_reloptions
 	n->options = $8;
 	$$ = (Node *) n;
 }
+		| CREATE OptTemp RECURSIVE VIEW qualified_name '(' columnList ')' opt_reloptions
+AS SelectStmt
+{
+	ViewStmt *n = makeNode(ViewStmt);
+	n->view = $5;
+	n->view->relpersistence = $2;
+	n->aliases = $7;
+	n->query = makeRecursiveViewSelect(n->view->relname, n->aliases, $11);
+	n->replace = false;
+	n->options = $9;
+	$$ = (Node *) n;
+}
+		| CREATE OR REPLACE OptTemp RECURSIVE VIEW qualified_name '(' columnList ')' opt_reloptions
+AS SelectStmt
+{
+	ViewStmt *n = makeNode(ViewStmt);
+	n->view = $7;
+	n->view->relpersistence = $4;
+	n->aliases = $9;
+	n->query = makeRecursiveViewSelect(n->view->relname, n->aliases, $13);
+	n->replace = true;
+	n->options = $11;
+	$$ = (Node *) n;
+}
 		;
 
 opt_check_option:
@@ -13541,6 +13566,60 @@ processCASbits(int cas_bits, int location, const char *constrType,
 	}
 }
 
+/*--
+ * Recursive view transformation
+ *
+ * Convert
+ *
+ * CREATE RECURSIVE VIEW relname (aliases) AS query
+ *
+ * to
+ *
+ * CREATE VIEW relname (aliases) AS
+ * WITH RECURSIVE relname (aliases) AS (query)
+ * SELECT aliases FROM relname
+ *
+ * Actually, just the WITH ... part, which is then inserted into the original
+ * view definition as the query.
+ * --
+ */
+static Node *
+makeRecursiveViewSelect(char *relname, List *aliases, Node *query)
+{
+	SelectStmt *s = makeNode(SelectStmt);
+	WithClause *w = makeNode(WithClause);
+	CommonTableExpr *cte = makeNode(CommonTableExpr);
+	List	   *tl = NIL;
+	ListCell   *lc;
+
+	cte->ctename = relname;
+	cte->aliascolnames = aliases;
+	cte->ctequery = query;
+	cte->location = -1;
+
+	w->recursive = true;
+	w->ctes = list_make1(cte);
+	w->location = -1;
+
+	foreach (lc, aliases)
+	{
+		ResTarget *rt = makeNode(ResTarget);
+
+		rt->name = NULL;
+		rt->indirection = NIL;
+		rt->val = makeColumnRef(strVal(lfirst(lc)), NIL, -1, 0);
+		rt->location = -1;
+
+		tl = lappend(tl, rt);
+	}
+
+	s->targetList = tl;
+	s->fromClause = list_make1(makeRangeVar(NULL, relname, -1));
+	s->withClause = w;
+
+	return (Node *) s;

Re: [HACKERS] recursive view syntax

2012-11-13 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 I noticed we don't implement the recursive view syntax, even though it's
 part of the standard SQL feature set for recursive queries.  Here is a
 patch to add that.

Can't you simplify that by using SELECT * FROM name?
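
I.e., roughly (a sketch of the suggested simplification):

CREATE VIEW name AS
  WITH RECURSIVE name (columns) AS (SELECT ...)
  SELECT * FROM name;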

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Amit Kapila
On Wednesday, November 14, 2012 12:25 AM Robert Haas wrote:
 On Mon, Nov 12, 2012 at 10:59 PM, Amit kapila amit.kap...@huawei.com
 wrote:
  Is the above opinion about only locking or even on a way to write the
 changed things in a file as mentioned in point-1 in mail chain upthread.
  (Point-1:  1. While writing .auto file, it will always assume that
 .auto file contain
  all config parameters.
Now as this .auto file is of fixed format and fixed record size, it
 can
  directly write a given record to its particular position.)
  What my thinking was that if we can decide that the format and size of
 each configuration is fixed, it can be directly written without doing
 anything for it in memory.
 
 Uh, no, I don't think that's a good idea.  IMHO, what we should do is:
 
 1. Read postgresql.conf.auto and remember all the settings we saw.  If
 we see something funky like an include directive, barf.
 2. Forget the value we remembered for the particular setting being
 changed.  Instead, remember the user-supplied new value for that
 parameter.
 3. Write a new postgresql.conf.auto based on the information
 remembered in steps 1 and 2.

I am okay with implementing it the above way because, as I understand it,
this is very similar to what I mentioned in my initial proposal
(Point-5 in the algorithm of Alter System Set ...).
http://archives.postgresql.org/pgsql-hackers/2012-10/msg01509.php

However, as Greg has now suggested exploring the GUC concept as well, I
would like to check the feasibility of that method.

The only reason I mentioned the fixed-format and fixed-record-size
concept is that during previous discussions about writing the file with
GUCs, the question came up of whether it is possible to write the file
without reading it in the current session.
(-- It seems to me that we ought to be able to rewrite a machine-generated
configuration file without loading those values into the current session.)
On second thought, it seems to me that what you meant by the above
comment was without loading the values into session-specific GUCs.

 
 Of course, if we go with one-file-per-setting, then this becomes even
 simpler: just clobber the file for the single setting being updated -
 creating it if it exists - and ignore all the rest.  I don't
 personally favor that approach because I think I think it's clunky to
 manage, but YMMV.


 With either approach, it's worth noting that a RESET variant of this
 could be useful - which would either remove the chosen setting from
 postgresql.conf.auto, or remove the file containing the
 automatically-set value for that setting.  I think my personal
 favorite syntax is:
 
 ALTER SYSTEM .. SET wunk = 'thunk';
 ALTER SYSTEM .. RESET wunk;
 
 But I'm OK with something else if there's consensus.  I don't
 particularly like SET PERSISTENT because I think this is more like
 ALTER DATABASE .. SET than it is like SET LOCAL, but IJWH.

I think there are multiple ways to do this: one is ALTER SYSTEM ..., the
other is to provide it through a built-in function.
For the first version, maybe I will go with the built-in function
approach; then, if there is consensus to provide it through
ALTER SYSTEM, we can change it.
One advantage I see in your above suggestion is that providing RESET
will work better with ALTER SYSTEM than with a built-in function. To
achieve the same through a built-in, I think one way is to provide a
separate function.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, index search_path where it's used to secure functions

2012-11-13 Thread Karl O. Pinc
On 11/13/2012 08:46:19 PM, Peter Eisentraut wrote:
 
 Well, I'm not sure.  We currently have three index entries on the
 topic:
 
 search path
 search_path
 search_path configuration parameter
 
 I think I'd put them all under search_path.

Ok.

I think you are right that they need to all be
under one index entry.

It might make sense to have a "search path"
entry that says see "search_path", but since
the two entries would be right next to each
other in the index this seems like overkill.

I'm going to send this in as a single patch
that fixes all the search path related
index entries:

  search_path-index.patch

(replaces search_path-normalize.patch
  and search_path-securing_v2.patch)

This new patch also changes the
search path index in doc/src/sgml/func.sgml.
Perhaps I just don't understand the logic
in the indexing in that file but the
search path index there seems to point to the
entire file/nowhere useful.  Since there are 2 places
in the file that are concerned with
search path I've removed the global
index into the file and added new
index entries.  This also moved the
"schema, current" index target.
(Since that too involves search path.)

I've also added a seealso entry to "schema, current"
to the top-level search_path index entry.

If you want this broken up into smaller
patches let me know.

Regards,

Karl k...@meme.com
Free Software:  You don't pay back, you pay forward.
 -- Robert A. Heinlein

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 88cea3d..3bad24c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4769,7 +4769,8 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
  varlistentry id=guc-search-path xreflabel=search_path
   termvarnamesearch_path/varname (typestring/type)/term
   indexterm
-   primaryvarnamesearch_path/ configuration parameter/primary
+   primaryvarnamesearch_path/varname/primary
+   secondaryconfiguration parameter/secondary
   /indexterm
   indextermprimarypath/secondaryfor schemas//
   listitem
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index d6e5d64..99ebd10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1698,7 +1698,8 @@ CREATE TABLE public.products ( ... );
titleThe Schema Search Path/title
 
indexterm
-primarysearch path/primary
+primaryvarnamesearch_path/varname/primary
+seealsoschema, current/seealso
/indexterm
 
indexterm
@@ -1735,7 +1736,8 @@ CREATE TABLE public.products ( ... );
/para
 
indexterm
-primarysearch_path/primary
+primaryvarnamesearch_path/varname/primary
+secondaryruntime variable/secondary
/indexterm
 
para
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f8f63d8..dc6ca1e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12944,16 +12944,6 @@ postgres=# SELECT * FROM unnest2(ARRAY[[1,2],[3,4]]);
/indexterm
 
indexterm
-primaryschema/primary
-secondarycurrent/secondary
-   /indexterm
-
-   indexterm
-primarysearch path/primary
-secondarycurrent/secondary
-   /indexterm
-
-   indexterm
 primarysession_user/primary
/indexterm
 
@@ -12980,6 +12970,11 @@ postgres=# SELECT * FROM unnest2(ARRAY[[1,2],[3,4]]);
 the current user is the quoteeffective user/quote.
/para
 
+   indexterm
+primaryschema/primary
+secondarycurrent/secondary
+   /indexterm
+
para
 functioncurrent_schema/function returns the name of the schema that is
 first in the search path (or a null value if the search path is
@@ -13493,6 +13488,13 @@ SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute');
 are immediately available without doing commandSET ROLE/.
/para
 
+   indexterm
+primaryvarnamesearch_path/varname/primary
+secondary
+  object visibility given an unqualified name
+/secondary
+   /indexterm
+
   para
xref linkend=functions-info-schema-table shows functions that
determine whether a certain object is firsttermvisible/ in the
diff --git a/doc/src/sgml/ref/create_function.sgml b/doc/src/sgml/ref/create_function.sgml
index 4336e4b..cb5ee91 100644
--- a/doc/src/sgml/ref/create_function.sgml
+++ b/doc/src/sgml/ref/create_function.sgml
@@ -673,6 +673,11 @@ SELECT * FROM dup(42);
  refsect1 id=sql-createfunction-security
   titleWriting literalSECURITY DEFINER/literal Functions Safely/title
 
+   indexterm
+  primaryvarnamesearch_path/varname/
+  secondaryuse in securing functions/
+   /indexterm
+
para
 Because a literalSECURITY DEFINER/literal function is executed
 with the privileges of the user that created it, care is needed to


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-13 Thread Amit Kapila
On Tuesday, November 13, 2012 11:43 PM Josh Berkus wrote:
 On 11/12/12 7:59 PM, Amit kapila wrote:
  On Monday, November 12, 2012 12:07 PM Greg Smith wrote:
  On 11/9/12 11:59 PM, Amit kapila wrote:
 
  Please let me know if there are any objections or problems in above
 method of implementation,
  else I can go ahead to prepare the patch for the coming CF.
 
  It may be the case that the locking scheme Robert described is the
 best
  approach here.  It seems kind of heavy to me though.  I suspect that
  some more thinking about it might come up with something better.
 
 So, here's the problem I'm seeing with having a single .auto file:  when
 we write settings to a file, are we writing a *single* setting or *all
 of a user's current settings*?

Single setting.
 
 I was imagining writing single, specific settings, which inevitably
 leads to one-setting-per-file, e.g.:
 
 SET PERSISTENT work_mem = 256MB;

Yes, from the beginning what I was discussing was setting a single
config parameter, as in your example.
However, it can be done with one file for all variables as well.
I have already mentioned two ways of doing it: one is a fixed-format,
fixed-size file; the other is similar to what Robert has detailed
in his mail (http://archives.postgresql.org/pgsql-hackers/2012-11/msg00572.php).



 What Amit seems to be talking about is more EXPORT SETTINGS, where you
 dump all current settings in the session to a file.  This seems likely
 to produce accidental changes when the user writes out settings they've
 forgotten they changed.

I think maybe I was not clear enough in my previous mails, but what I
am talking about is definitely never related to "dump all current
settings in the session to a file".
In fact, both my ideas (the fixed-format file and the initial proposal)
were not to touch or check the current session parameters.
There is only one approach -- seeing whether we can write the file from
the GUC structures -- that involves writing multiple parameters.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory leaks in record_out and record_send

2012-11-13 Thread Martijn van Oosterhout
On Tue, Nov 13, 2012 at 05:50:08PM -0500, Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
  Yeah.  The thing that concerns me is that I think we have a pretty
  decent number of memory contexts where the expected number of
  allocations is very small ... and we have the context *just in case*
  we do more than that in certain instances.  I've seen profiles where
  the setup/teardown costs of memory contexts are significant ... which
  doesn't mean that those examples would perform better with fewer
  memory contexts, but it's enough to make me pause for thought.
 
 So, for my 2c, I'm on the other side of this, personally.  We have
 memory contexts for more-or-less exactly this issue.  It's one of the
  great things about PG -- it's resilient and very unlikely to have large or
 bad memory leaks in general, much of which can, imv, be attributed to
 our use of memory contexts.

If the problem is that we create memory context overhead which is not
necessary in many cases, perhaps we can reduce the overhead somehow. 
IIRC we have a separate function for resetting a context and freeing it
entirely.  If there were a quick test we could do such that resetting a
context did nothing unless at least (say) 16k had been allocated, that
might reduce the cost for many very small allocations.

Of course, unless someone comes up with a way to measure the cost this
is all handwaving, but it might be a nice project for someone interested
in learning to hack postgres.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature