Re: [HACKERS] Avoiding timeline generation
On 25.03.2011 03:00, Daniel Farina wrote:
> Here is the mechanism: I want to author a recovery.conf to perform some
> amount of restore_command or streaming-replication-based recovery, but I
> do *not* want to generate a new timeline. Rather, I want to stay in hot
> standby mode to allow read-only connections.

That's exactly what the standby mode is for. Add standby_mode=on to recovery.conf, and the server will do exactly that. Perhaps the documentation is not clear on this. Any suggestions on how to improve it?

> InArchiveRecovery gets set to true as soon as readRecoveryCommandFile
> completes basically normally, and it looks as though that will ensure we
> will get a new timeline. If one tries a bizarre hack, like ensuring the
> restore_command does not terminate, one never finishes recovery

That's what pg_standby does. It was the only option before standby_mode was introduced in version 9.0, although we didn't have hot standby until 9.0 either.

> -- as one may expect -- and one cannot connect to the server -- which
> one may not expect is necessarily the case, presuming hot standby, if
> the server was terminated cleanly.

That's not true. As long as you enable hot standby, the server will accept connections while the restore command is running.

> The things I want to do with the ability to suppress a new timeline:
>
> * Offline WAL application -- I want to be able to bring up a second
>   server, perform some amount of point-in-time recovery, and then stop
>   and archive. It would be nice to support read-only queries in this
>   case to test the recovered database. The goal of this is to reduce
>   recovery time in a disaster scenario without tying up resources on a
>   live server.

Yep, that can be done with standby_mode=on.

> * The ability to quiesce a system by bringing it into a read-only state
>   that generates no new WAL, while still being able to ship old WAL.

In theory it should be possible to stop a server, put it into hot standby mode by creating a recovery.conf file, and restart, but it won't try to ship the old WAL after that. When you stop a server, it will try to archive all existing WAL before exiting, though.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
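For concreteness, the standby setup described above amounts to a recovery.conf along these lines (the archive path and connection string are placeholders, not values from this thread):

```
# recovery.conf -- stay in standby mode; no new timeline until promoted
standby_mode = 'on'
restore_command = 'cp /path/to/archive/%f "%p"'
# and/or pull WAL over streaming replication:
primary_conninfo = 'host=primary.example.com port=5432 user=replicator'
```

With hot_standby = on in postgresql.conf, the server accepts read-only connections while it continues to replay WAL.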
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On 24.03.2011 23:54, Stephen Frost wrote:
> * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
>> The problem is that you still need to track which queries within the
>> transaction can see the tuples. For example:
>
> Wow, that's a good point wrt cursors. Sounds more and more like we'd
> need a special data-loading mode for this, where we'd have to disallow
> those options. I've been thinking that's a frowned-upon approach in
> general, but let me ask specifically: are we uninterested in such a
> special 'data-load' mode? Or do we expect that the limitations would be
> too great to make it useful enough for users? That last I don't think I
> agree with...

I don't think we should put the onus on the user to choose the right data-loading mode. If we can reliably detect the cases where it's safe to do these tricks, we can apply them transparently when possible. I would be cool with tricks that apply only in narrow cases, as long as we don't require the user to do anything.

That said, it can be surprising if some trivial change to the schema happens to disable the optimization, and your query performance after a data load suddenly plummets. That can be a pain for the DBA to debug.

I'm skeptical of changes that need any extra checks in the fast path of HeapTupleSatisfiesMVCC. That is a hot spot already; any extra cycles there would add up and hurt performance. Careful performance testing is required.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Avoiding timeline generation
On Fri, Mar 25, 2011 at 12:38 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> On 25.03.2011 03:00, Daniel Farina wrote:
>> Here is the mechanism: I want to author a recovery.conf to perform some
>> amount of restore_command or streaming replication based recovery, but
>> I do *not* want to generate a new timeline. Rather, I want to stay in
>> hot standby mode to allow read-only connections.
>
> That's exactly what the standby mode is for. Add standby_mode=on to
> recovery.conf, and the server will do exactly that. Perhaps the
> documentation is not clear on this. Any suggestions on how to improve
> that?

I was actually pretty well aware of this option; if that is the case, I fat-fingered something or had a thinko (mental bit flip?), and then what I thought I knew about standby_mode=on was invalidated (perhaps incorrectly). I will confirm tomorrow.

-- 
fdr
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On 25.03.2011 09:51, Heikki Linnakangas wrote:
> I don't think we should put the onus on the user to choose the right
> data-loading mode. If we can reliably detect the cases where it's safe
> to do these tricks, we can apply them transparently when possible. I
> would be cool with tricks that apply only in narrow cases, as long as
> we don't require the user to do anything.

I believe the conditions for being able to set hint bits immediately at COPY are:

1. The table has been created or truncated in the same transaction.
2. We are not in a subtransaction (or the table was created or truncated in the same subtransaction).
3. There are no open portals.
4. Executing the COPY doesn't need to run any unsafe code that might access the same table. This includes triggers, check constraints and input functions. An expression is safe if it is immutable.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
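The four conditions above could be gathered into a single predicate; here is an illustrative Python sketch (the parameter names are invented for illustration and do not correspond to PostgreSQL internals):

```python
def copy_can_preset_hints(created_in_subxact, current_subxact,
                          open_portal_count, has_unsafe_code):
    """Sketch of the four safety conditions for pre-setting hint bits
    at COPY time. All names here are hypothetical."""
    # 1. The table must have been created or truncated in this transaction.
    if created_in_subxact is None:
        return False
    # 2. Not in a subtransaction, unless the create/truncate happened in
    #    the same subtransaction as the COPY.
    if created_in_subxact != current_subxact:
        return False
    # 3. No open portals (cursors) that could scan the table mid-load.
    if open_portal_count > 0:
        return False
    # 4. No triggers, check constraints, or non-immutable expressions
    #    that might access the same table during the COPY.
    if has_unsafe_code:
        return False
    return True
```

The first two checks model conditions 1 and 2; the portal count and unsafe-code flags stand in for the harder-to-compute conditions 3 and 4.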
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Fri, Mar 25, 2011 at 1:00 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> 1. The table has been created or truncated in the same transaction
> 2. We are not in a subtransaction (or the table was created and
>    truncated in the same subtransaction)
> 3. There are no open portals
> 4. Executing the COPY doesn't need to run any unsafe code that might
>    access the same table. This includes triggers, check constraints and
>    input functions. An expression is safe if it is immutable.

It sounds like these cases would, at the very least, be met by some forms of pg_restore... that's a highly desirable property, I feel.

-- 
fdr
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On 25.03.2011 00:15, Stephen Frost wrote:
> At the start of a load, we check if the table was created in the
> current transaction. If so, we check if we've already done a load which
> used the frozen XID. If we have, then we use the normal mechanics. If
> we haven't, then we stuff what the XID would have been in memory
> somewhere, mark that we've used the frozen XID, and load the data using
> the frozen XID. On subsequent queries, if we come across a frozen XID
> in a table created in this transaction, we use the XID we've stored
> instead of the frozen XID.

The tricky part here is how to check whether the table was created in the same transaction, within HeapTupleSatisfiesMVCC, with minimal overhead. If you do it naively, the check will be executed on every single tuple read in the system. It has to be really, really fast. I don't want to discourage you; maybe it's possible with some clever code arrangement. However, it needs a lot of performance testing.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] maximum digits for NUMERIC
On Sun, Mar 20, 2011 at 08:14:21PM -0400, Noah Misch wrote:
> On Fri, Mar 11, 2011 at 11:36:14AM +, Gianni Ciolli wrote:
>> maybe we should change the 1000 digits here:
>> http://developer.postgresql.org/pgdocs/postgres/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL
>> because ISTM that up to 2^17 digits are supported (which makes more
>> sense than 1000).
>
> Agreed. The documentation is suggestive of this limit:
>
>   # CREATE TABLE n (c numeric(1001,0));
>   ERROR:  NUMERIC precision 1001 must be between 1 and 1000
>   LINE 1: CREATE TABLE n (c numeric(1001,0));
>
> However, that's indeed just a limit of the numeric typmod
> representation, not the data type itself. An unqualified numeric column
> hits no such limit.

For the record, the limits I found from my tests are:

* 2^17 - 1 maximum total digits
* 2^14 - 1 maximum fractional digits

(I did tests because I couldn't extract any obvious limit from the source code of numeric.c.)

Best regards,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni.cio...@2ndquadrant.it | www.2ndquadrant.it
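As a quick sanity check on the reported limits, the two powers of two work out to:

```python
# Limits reported above for NUMERIC, expressed as concrete values.
max_total_digits = 2**17 - 1       # maximum total digits
max_fractional_digits = 2**14 - 1  # maximum fractional digits
print(max_total_digits, max_fractional_digits)  # 131071 16383
```

Both are far beyond the typmod-imposed limit of 1000 that the documentation quotes.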
Re: [HACKERS] Re: making write location work (was: Efficient transaction-controlled synchronous replication)
On Fri, Mar 25, 2011 at 6:11 AM, Robert Haas robertmh...@gmail.com wrote:
> On Thu, Mar 24, 2011 at 2:28 PM, Simon Riggs si...@2ndquadrant.com wrote:
>> The protocol supports sending two values, so two are displayed. If you
>> wish to remove one from the display then that only makes sense if you
>> also prevent the protocol from supporting two values. There is no
>> benefit in doing that, so why do it? We are going to put that back in
>> 9.2 if you remove it now. Why not leave it, so we don't need to
>> rewrite all the monitoring tools that will use this view?

What are you planning to use write_location for? BTW, I'm thinking to add recv_location (not write_location) in 9.2, to support another sync rep mode which makes transactions wait until the standby has received (not fsync'd) the WAL. You're planning the same?

> If we're going to put it back in 9.2, then we shouldn't remove it now.
> We should just make it work. It's a three-line patch. If 9.2 is going
> to meaningfully distinguish between write location and flush location,
> then we may as well do the same thing in 9.1. Then we'll be ahead of
> the game: not only will the view have the same columns in both
> releases, but they'll actually have the same semantics in both
> releases.

+1

Regards,
-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] really lazy vacuums?
2011/3/24 Jim Nasby j...@nasby.net:
> On Mar 22, 2011, at 11:46 AM, Cédric Villemain wrote:
>> 2011/3/22 Greg Stark gsst...@mit.edu:
>>> On Mon, Mar 21, 2011 at 6:08 PM, Jim Nasby j...@nasby.net wrote:
>>>> Has anyone looked at the overhead of measuring how long IO requests
>>>> to the kernel take? If we did that, not only could we get an idea of
>>>> what our IO workload looked like, we could also figure out whether a
>>>> block came out of cache or not. That information could potentially
>>>> be useful to the planner, but even if the database couldn't use that
>>>> knowledge itself it would be a damn useful statistic to have...
>>>> IMHO, far more useful than our current hit rate statistics.
>>>
>>> I've done this -- actually better, I used mincore to actually check
>>> whether the block was in cache before issuing the read -- but it
>>> turns out you can't get what you're looking for this way.
>>
>> The linux fincore() syscall never got into the kernel, maybe something
>> to revive...
>
> Is there an equivalent in other OSes? Could we use time measurement as
> an alternative if not?

fincore() syscall is a shortcut for mmap+mincore calls, suggested by people working on libprefetch; see http://lwn.net/Articles/371538/

The alternative via time measurement is interesting; it should be easy to output both measures in pg_statio_* and see what happens...

>>> It turns out when you do this you see one block being read from disk
>>> followed by n blocks that all appear to be cache hits. Because
>>> they've been prefetched by the kernel.
>>
>> I did the same. I now believe that it is not very important to have
>> the very exact numbers. Prefetched blocks *are* in memory when we
>> request them; the first read access reads more than one block because
>> the cost is the same.
>
> Yeah... there's places in the planner where we make guesses as to the
> likelihood of something being in cache. If we could actually track the
> complete hit rate over time (PG buffers + FS cache), then we wouldn't
> have to guess at things anymore. And having this info in pg_stats
> would be extremely valuable.

Yes; Robert also wrote some interesting items to keep in mind when thinking about that, in another thread, recently. A fs-cache snapshot, or just a 'percent_in_cache' per relation/file (?), is easy to add to some auto-analyze daemon. *But* making good use of it in the planner is not as trivial as it looks (i.e. without breaking what is working well). Once I get time to add hooks in costsize.c, a simple extension can do the trick (it just needs some shared_buffers to keep FS-cache stats and hooks to use them in some places).

>>> What you end up with is actually something like the number of iops,
>>> which is also an interesting measure but not really what you were
>>> looking for. My getrusage patch, which I should still dig out though
>>> it's rather too late to be committing now unless someone tells me
>>> otherwise, would tell you how much i/o a plan node actually did. But
>>> you won't know which blocks did the i/o since I was only tracking
>>> totals for the plan node.
>
> That's probably what you're looking for here.

Please show us the patch :)

> -- 
> Jim C. Nasby, Database Architect   j...@nasby.net
> 512.569.9461 (cell)                http://jim.nasby.net

-- 
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Fri, Mar 25, 2011 at 8:09 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> The tricky part here is how to check if the table was created in the
> same transaction, within HeapTupleSatisfiesMVCC, with minimal overhead.
> If you do it naively, the check will be executed at every single tuple
> read in the system. It has to be really really fast.

You could have a single global boolean variable to indicate whether any tables have been created in this transaction and inserted into using this frozen-xid hack yet. That would be branch-predicted as false by any decent processor, which shouldn't cost anything. If it's true, then you go off and check a small hash table.

The question is how often the check would trigger with false positives. That is, how often do people create tables, insert a small number of records, then go off and do complex performance-sensitive things in the same transaction? I suppose this does happen due to the fairly common pattern of "I want temporary tables, but they need to be visible to this other transaction" -- for example, for batch jobs where you want to be able to display some sort of progress indicator.

-- 
greg
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
* Greg Stark (gsst...@mit.edu) wrote:
> You could have a single global boolean variable to indicate whether any
> tables have been created in this transaction and inserted into using
> this frozenxid hack in this transaction yet.

This was exactly where I was going, and, honestly, I was wondering if there'd be a way to possibly even avoid running through HeapTupleIsVisible for those relations entirely... up to the point where some further action modifies the table again and forces us to have to go through that path. I'm not sure I feel it's a real 'bad' hack. :)

> The question is how often would the insert trigger with false
> positives. That is, how often do people create tables, insert a small
> number of records, then go off and do complex performance-sensitive
> things in the same transaction. I suppose this does happen due to the
> fairly common pattern of "I want temporary tables but they need to be
> visible to this other transaction". For example for batch jobs where
> you want to be able to display some sort of progress indicator.

Right, that kind of issue is why I had been thinking it might be a different mode, but it also may not be all that bad to use an approach like what you described above. We could decide not to do it for temporary tables, perhaps, or, if we could possibly avoid having to check for visibility at the tuple level for these relations, perhaps it'd be faster in those use-cases?

We would also need to figure out whether people are doing batch operations, e.g. loading batches of records into a table one after another. We wouldn't want to lose these optimizations in those cases, provided they're done one after another.

Thanks,
Stephen
Re: [HACKERS] [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
On Sat, Mar 19, 2011 at 12:07 AM, Robert Haas robertmh...@gmail.com wrote:
> On Fri, Mar 18, 2011 at 10:55 AM, Greg Stark gsst...@mit.edu wrote:
>> On Thu, Mar 17, 2011 at 5:46 PM, Robert Haas robertmh...@gmail.com wrote:
>>> What makes more sense to me after having thought about this more
>>> carefully is to simply make a blanket rule that when
>>> synchronous_replication=on, synchronous_commit has no effect. That is
>>> easy to understand and document.
>>
>> For what it's worth, "has no effect" doesn't make much sense to me.
>> It's a boolean: either commits are going to block or they're not. What
>> happened to the idea of a three-way switch?
>>
>>   synchronous_commit = off
>>   synchronous_commit = disk
>>   synchronous_commit = replica
>>
>> With "on" being a synonym for "disk" for backwards compatibility. Then
>> we could add more options later for more complex conditions, like
>> waiting for one server in each data centre, or waiting for one of a
>> certain set of servers while ignoring the less reliable mirrors, etc.
>
> This is similar to what I suggested upthread, except that I suggested
> on/local/off, with the default being on. That way if you set
> synchronous_standby_names, you get synchronous replication without
> changing another setting, but you can say "local" instead if for some
> reason you want the middle behavior. If we're going to do it all with
> one GUC, I think that way makes more sense. If you're running sync rep,
> you might still have some transactions that you don't care about, but
> that's what async commit is for. It's a funny kind of transaction that
> we're OK with losing if we have a failover but not OK with losing if we
> have a local crash from which we recover without failing over.

I'm OK with this.

Regards,
-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
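For illustration, Robert's on/local/off variant would make a configuration look something like this sketch (the exact accepted values were still under discussion in this thread, so treat the comments as the proposal, not settled behavior):

```
# postgresql.conf
synchronous_standby_names = 'standby1'   # enables synchronous replication

synchronous_commit = on      # wait for the standby to acknowledge
# synchronous_commit = local # wait only for the local WAL flush
# synchronous_commit = off   # don't wait at all (async commit)
```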
Re: [HACKERS] Avoiding timeline generation
On Mar 24, 2011, at 9:00 PM, Daniel Farina dan...@heroku.com wrote:
> * Offline WAL application -- I want to be able to bring up a second
>   server, perform some amount of point-in-time recovery, and then stop
>   and archive. It would be nice to support read-only queries in this
>   case to test the recovered database. The goal of this is to reduce
>   recovery time in a disaster scenario without tying up resources on a
>   live server.

The new 9.1 feature pause_at_recovery_target seems like it might be what you need here.

...Robert
Re: [HACKERS] Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
On Sat, Mar 19, 2011 at 4:29 AM, Simon Riggs si...@2ndquadrant.com wrote:
> On Fri, 2011-03-18 at 20:19 +0100, Markus Wanner wrote:
>> Simon,
>>
>> On 03/18/2011 05:19 PM, Simon Riggs wrote:
>>> In PostgreSQL other users cannot observe the commit until an
>>> acknowledgement has been received.
>>
>> On other nodes as well? To me that means the standby needs to hold
>> back COMMIT of an ACKed transaction until it receives a re-ACK from
>> the master that it committed the transaction there. How else could the
>> slave know when to commit its ACKed transactions?
>
> We could do that easily enough, actually, if we wished. Do we wish? No.

I'm not sure what the problem is with seeing, from the standby, data which is not yet visible on the master. And I'm really not sure that problem can be solved by making the data visible on the master before the standby. If we really want to see consistent data from each node, we should implement and use a cluster-wide snapshot, as Postgres-XC does.

Regards,
-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
[HACKERS] How to Make a pg_filedump
Hi All,

I am trying to write a pg_filedump to read dead rows in pgsql. I am relatively new to PostgreSQL and have been trying to find out how to write and compile a pg_filedump to help display dead rows. Please can anyone help me:

i) Explain how I can write a pg_filedump to display dead rows, or how to write pg_filedump
ii) Explain how to compile a pg_filedump

Thanks
Re: [HACKERS] 2nd Level Buffer Cache
On Tue, Mar 22, 2011 at 3:53 PM, Robert Haas robertmh...@gmail.com wrote:
> On Tue, Mar 22, 2011 at 11:24 AM, Jeff Janes jeff.ja...@gmail.com wrote:
>> On Fri, Mar 18, 2011 at 9:19 AM, Robert Haas robertmh...@gmail.com wrote:
>>> On Fri, Mar 18, 2011 at 11:14 AM, Kevin Grittner
>>> kevin.gritt...@wicourts.gov wrote:
>>>> Maybe the thing to focus on first is the oft-discussed "benchmark
>>>> farm" (similar to the build farm), with a good mix of loads, so that
>>>> the impact of changes can be better tracked for multiple workloads
>>>> on a variety of platforms and configurations. Without something like
>>>> that it is very hard to justify the added complexity of an idea like
>>>> this in terms of the performance benefit gained.
>>>
>>> A related area that could use some looking at is why performance tops
>>> out at shared_buffers ~8GB and starts to fall thereafter.
>>
>> Under what circumstances does this happen? Can a simple pgbench -S
>> with a large scaling factor elicit this behavior?
>
> To be honest, I'm mostly just reporting what I've heard Greg Smith say
> on this topic. I don't have any machine with that kind of RAM.

I can sponsor a few hours (say 10) of one High-Memory On-Demand Quadruple Extra Large instance (26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform). That's the largest-memory instance AWS has. Let me know if I can help.

Regards,
-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Thu, Mar 24, 2011 at 23:47, Stephen Frost sfr...@snowman.net wrote:
> I'd be happy with a data-loading mode that even disallowed
> subtransactions if necessary to achieve the write-once (well, plus WAL
> if you're archiving) operation...

Note that there's already an extension on pgFoundry for a data-loading mode; have you looked at pg_bulkload? http://pgbulkload.projects.postgresql.org/

Regards,
Marti
Re: [HACKERS] SSI bug?
YAMAMOTO Takashi y...@mwd.biglobe.ne.jp wrote:
> hi,
>
>>> thanks for quickly fixing problems.
>>
>> Thanks for the rigorous testing. :-)
>
>>> i tested the later version (a2eb9e0c08ee73208b5419f5a53a6eba55809b92)
>>> and the only errors i got were "out of shared memory". i'm not sure
>>> if it was caused by SSI activities or not.
>>>
>>> PG_DIAG_SEVERITY: WARNING
>>> PG_DIAG_SQLSTATE: 53200
>>> PG_DIAG_MESSAGE_PRIMARY: out of shared memory
>>> PG_DIAG_SOURCE_FILE: shmem.c
>>> PG_DIAG_SOURCE_LINE: 190
>>> PG_DIAG_SOURCE_FUNCTION: ShmemAlloc
>>>
>>> PG_DIAG_SEVERITY: ERROR
>>> PG_DIAG_SQLSTATE: 53200
>>> PG_DIAG_MESSAGE_PRIMARY: out of shared memory
>>> PG_DIAG_SOURCE_FILE: dynahash.c
>>> PG_DIAG_SOURCE_LINE: 925
>>> PG_DIAG_SOURCE_FUNCTION: hash_search_with_hash_value
>>
>> Nor am I. Some additional information would help.
>>
>> (1) Could you post the non-default configuration settings?
>
> none. it can happen with a just initdb+createdb'ed database.
>
>> (2) How many connections are in use in your testing?
>
> 4.
>
>> (3) Can you give a rough categorization of how many of what types of
>> transactions are in the mix?
>
> all transactions are SERIALIZABLE. no transactions are with READ ONLY.
> (but some of them are actually select-only.)
>
>> (4) Are there any long-running transactions?
>
> no.
>
>> (5) How many of these errors do you get in what amount of time?
>
> once it starts happening, i see them somewhat frequently.
>
>> (6) Does the application continue to run relatively sanely, or does it
>> fall over at this point?
>
> my application just exits on the error. if i re-run the application
> without rebooting postgres, it seems that i get the error sooner than
> on the first run. (but it might be just a matter of luck)
>
>> (7) The message hint would help pin it down, or a stack trace at the
>> point of the error would help more. Is it possible to get either?
>> Looking over the code, it appears that in the only places SSI could
>> generate that error, it would cancel that transaction with the hint
>> "You might need to increase max_pred_locks_per_transaction." and
>> otherwise allow normal processing.
>
> no message hints. these errors are not generated by SSI code, at least
> not directly. (please look at PG_DIAG_SOURCE_xxx in the above log.)
>
> YAMAMOTO Takashi

Even with the above information it may be far from clear where allocations are going past their maximum, since one HTAB could grab more than its share and starve another which is staying below its maximum. I'll take a look at the possibility of adding a warning or some such when an HTAB expands past its maximum size.

-Kevin
[HACKERS] Set hint bits upon eviction from BufMgr
Maybe I'm being overly simplistic or incorrect here, but I was thinking that there might be a route to reducing hint-bit impact for the main sufferers of the feature without adding too much pain in the general case. I'm unfortunately convinced there is no getting rid of them -- in fact their utility will become even more apparent with faster storage, as the pendulum of optimization swings back to the CPU side.

My idea is to reserve a bit in the page header, say PD_ALL_SAME_XMIN, that indicates all the tuples are from the same transaction: set it when the first inserted tuple hits the page, and clear it when any tuple is added with another xmin, or is touched or deleted. The point here is to set up a cheap page-level check that we can make when a page is getting evicted from the bufmgr. If the bit is set, we grab the xmin of the first tuple on the page and test it for visibility (assuming the hint bit is not already set). If we get a thumbs-up on the transaction, we can lock the page and set all tuple hints during the page evict/sync process. We don't need to worry about logging/crash safety on the 'all same' hint because it's only interesting to this bufmgr check (it can even be cleared when the page is loaded).

Without this bit, the only way to set hint bits during bufmgr eviction would be to do a visibility check on every tuple, which would probably be prohibitively expensive. Since OLTP environments would rarely see this bit, they would not have to pay for the check. Also, we could maybe tweak the bufmgr to prefer not to evict pages with this bit set if it's known they are not yet written out to primary storage.

Maybe this is impossible or not logical... just thinking out loud. Anyway, if this actually works, shared buffers can start to play a role in mitigating hint-bit I/O, as long as the transaction resolves before pages start jumping out to storage. If you couple this with a facility to do bulk loads that break up transactions at regular intervals, you have a good shot at getting all your hint bits written out properly in a large-load situation. You might be able to do similar tricks with deletes -- I haven't thought about that. Also there might be some interplay with vacuum or some other deal-breaker -- curious to see if I have something worth further thought here.

merlin
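The eviction-time check proposed above might look like the following sketch. Everything here is an invented stand-in for illustration, not PostgreSQL code; only the HEAP_XMIN_COMMITTED value (0x0100) is borrowed from htup.h:

```python
from dataclasses import dataclass

HEAP_XMIN_COMMITTED = 0x0100  # hint-bit value, borrowed from htup.h

@dataclass
class Tuple:
    xmin: int
    infomask: int = 0

@dataclass
class Page:
    tuples: list
    all_same_xmin: bool  # the proposed PD_ALL_SAME_XMIN header bit

def on_buffer_evict(page, is_xid_committed):
    """Cheap page-level hint-bit pass at bufmgr eviction (sketch)."""
    # Only pages written by a single transaction qualify; OLTP pages
    # touched by many xids skip the whole pass via this one cheap test.
    if not page.all_same_xmin or not page.tuples:
        return
    # One visibility check on the first tuple's xmin covers every
    # tuple on the page.
    if is_xid_committed(page.tuples[0].xmin):
        for tup in page.tuples:
            tup.infomask |= HEAP_XMIN_COMMITTED
```

The key property is that the per-eviction cost is a single bit test plus, at most, one transaction-status lookup for the whole page, rather than one per tuple.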
Re: [HACKERS] Re: making write location work (was: Efficient transaction-controlled synchronous replication)
On Fri, Mar 25, 2011 at 5:03 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Fri, Mar 25, 2011 at 6:11 AM, Robert Haas robertmh...@gmail.com wrote:
>> If we're going to put it back in 9.2, then we shouldn't remove it now.
>> We should just make it work. It's a three-line patch. If 9.2 is going
>> to meaningfully distinguish between write location and flush location,
>> then we may as well do the same thing in 9.1. Then we'll be ahead of
>> the game: not only will the view have the same columns in both
>> releases, but they'll actually have the same semantics in both
>> releases.
>
> +1

I think we have adequate consensus on this topic, so committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Set hint bits upon eviction from BufMgr
On Mar 25, 2011, at 9:52 AM, Merlin Moncure wrote: Without this bit, the only way to set hint bits going during bufmgr eviction is to do a visibility check on every tuple, which would probably be prohibitively expensive. Since OLTP environments would rarely see this bit, they would not have to pay for the check. IIRC one of the biggest costs is accessing the CLOG, but what if the bufmgr.c/bgwriter didn't use the same CLOG lookup mechanism as backends did? Unlike when a backend is inspecting visibility, it's not necessary for something like bgwriter to know exact visibility as long as it doesn't mark something as visible when it shouldn't. If it uses a different CLOG caching/accessing method that lags behind the real CLOG then the worst-case scenario is that there's a delay on setting hint bits. But getting bgwriter to do this would likely still be a huge win over forcing backends to worry about it. It's also possible that the visibility check itself could be simplified. BTW, I don't think you want to play these games when a backend is evicting a page because you'll be slowing a real backend down. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Transactional DDL, but not Serializable
Greetings, We have a curious situation, consider this:

Process 1:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
CREATE TABLE table1 (i integer);
INSERT INTO table1 VALUES (13);

Process 2:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
CREATE TABLE table2 (i integer);
INSERT INTO table2 VALUES (123);
COMMIT;

Process 1:
SELECT * FROM pg_class WHERE relname = 'table2';
-- no rows returned
SELECT * FROM table2;
-- works?! but at least no records returned
INSERT INTO table2 VALUES (456);
-- also works.. now we have a tuple in the table which appears to
-- have been added before the table existed..
COMMIT;

This happens, of course, because we use SysCache to look up table names to Oids and that uses SnapshotNow. In my view, this violates the basic principle of least surprise and means that while we have transactional DDL, it's not really serializable (no, I don't particularly care about that). What I do worry about is that the bulk load discussion going on could be shot down because of this. We won't let the earlier transaction see any records in the table today because those tuples have a later xmin, but if we were to insert those tuples with the frozen XID (as I proposed in the other thread) then they'd be visible. I don't believe fixing this would be terribly difficult and, I believe, would be similar to what we've done elsewhere (eg: with indexes) - basically, add a column to pg_class with the 'createdxmin' and then compare that against our transaction whenever we're doing table lookups. Thoughts? Thanks, Stephen
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Mar 24, 2011, at 4:42 PM, Greg Stark wrote: On Thu, Mar 24, 2011 at 9:39 PM, Greg Stark gsst...@mit.edu wrote: We could conceivably deal with that by not setting the frozenxid but setting the hint bit for those tuples and having a documented special case that if the hint bit is set but it's the same xid as your own you have to treat it as not-committed. Or I suppose we could set the frozenxid but maintain a hash table of relations that we use to remember which relations we created in this transaction and are treating this way. For any table in that hash table we ignore the xmin and just look at cmin/cmax. I'm not sure this solves the cases of subtransactions -- but perhaps we just wouldn't store frozenxid if we're in a subtransaction. And I'm not sure we have access to the relation id when we're doing visibility checks. I think we do. This would involve no serious ugliness in the actual on-disk storage and it would make data loads a write-once operation which is the holy grail :) I have an even broader idea... If we had the ability to specify in a transaction exactly what tables it was going to read from, enforce that it only read from those tables, and published that information to other backends, vacuum could then ignore long running transactions that are guaranteed not to hit the table they're vacuuming. This would eliminate a large pain-point for anyone that has a high-velocity table in an environment that also has to support longer running transactions. If we had that mechanism you could also make use of it for this idea; you would specify that your transaction will never read from the new table, which means that any data inserted into that table can be pre-frozen (assuming we make sure that no one else can see that table exists yet). -- Jim C. 
Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
Stephen Frost sfr...@snowman.net writes: I don't believe fixing this would be terribly difficult and, I believe, would be similar to what we've done elsewhere (eg: with indexes) - basically, add a column to pg_class with the 'createdxmin' and then compare that against our transaction whenever we're doing table lookups. Making DDL serializable is *not* simple, and half-baked hacks won't make that situation better ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql \dt and table size
Excerpts from Pavel Stehule's message of vie mar 25 02:48:49 -0300 2011: 2011/3/24 Robert Haas robertmh...@gmail.com: On Wed, Mar 23, 2011 at 4:50 PM, Pavel Stehule pavel.steh...@gmail.com wrote: can we enhance a detail for table and show more accurate numbers? table size: xxx toast size: xxx indexes size: xxx Only if we don't mind going beyond 80 columns. sure, it is on new lines That could get very long ... are you proposing something like \d++ ? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 2nd Level Buffer Cache
On Mar 25, 2011, at 10:07 AM, Gurjeet Singh wrote: On Tue, Mar 22, 2011 at 3:53 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 22, 2011 at 11:24 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Fri, Mar 18, 2011 at 9:19 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Mar 18, 2011 at 11:14 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Maybe the thing to focus on first is the oft-discussed benchmark farm (similar to the build farm), with a good mix of loads, so that the impact of changes can be better tracked for multiple workloads on a variety of platforms and configurations. Without something like that it is very hard to justify the added complexity of an idea like this in terms of the performance benefit gained. A related area that could use some looking at is why performance tops out at shared_buffers ~8GB and starts to fall thereafter. Under what circumstances does this happen? Can a simple pgbench -S with a large scaling factor elicit this behavior? To be honest, I'm mostly just reporting what I've heard Greg Smith say on this topic. I don't have any machine with that kind of RAM. I can sponsor a few hours (say 10) of one High-memory on-demand Quadruple Extra Large instance (26 EC2 Compute Units (8 virtual cores with 3.25 EC2 Compute Units each), 1690 GB of local instance storage, 64-bit platform). That's the largest memory AWS has. Related to that... after talking to Greg Smith at PGEast last night, he felt it would be very valuable just to profile how much time is being spent waiting/holding the freelist lock in a real environment. I'm going to see if we can do that on one of our slave databases. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] psql \dt and table size
2011/3/25 Alvaro Herrera alvhe...@commandprompt.com: Excerpts from Pavel Stehule's message of vie mar 25 02:48:49 -0300 2011: 2011/3/24 Robert Haas robertmh...@gmail.com: On Wed, Mar 23, 2011 at 4:50 PM, Pavel Stehule pavel.steh...@gmail.com wrote: can we enhance a detail for table and show more accurate numbers? table size: xxx toast size: xxx indexes size: xxx Only if we don't mind going beyond 80 columns. sure, it is on new lines That could get very long ... are you proposing something like \d++ ? \d++ is a good idea. I don't think it's necessary for the detail about sizes, but it can be used for super detail: * sizes of all indexes * usage statistics * index statistics, maybe - it is just an idea. Pavel -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
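Until psql grows such a display, the breakdown being discussed is already available from the size functions added in 9.0; an illustrative query (the table name is hypothetical):

```sql
-- Size breakdown for one table: heap alone, heap+TOAST, indexes, total.
SELECT pg_size_pretty(pg_relation_size('some_table'))        AS heap_only,
       pg_size_pretty(pg_table_size('some_table'))           AS table_and_toast,
       pg_size_pretty(pg_indexes_size('some_table'))         AS indexes,
       pg_size_pretty(pg_total_relation_size('some_table'))  AS total;
```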
Re: [HACKERS] How to Make a pg_filedump
Excerpts from aaronenabs's message of vie mar 25 10:43:49 -0300 2011: Hi All I am trying to write a pg_filedump to read dead rows in pgsql, I am relatively new to postgresql and have been trying to find out how to write and compile a pg_filedump to help display dead rows. Please can anyone help me i) Explain how i can write a pg_filedump to display dead rows or how to write pg_filedump ii) Explain how to compile a pg_filedump Are you aware that pg_filedump already exists? The source code lives at http://pgfoundry.org/projects/pgfiledump (for 9.0) and http://sources.redhat.com/rhdb (for older releases) Moreover, pg_filedump will display all rows, dead or otherwise. -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 2nd Level Buffer Cache
On Thu, Mar 24, 2011 at 7:51 PM, Greg Stark gsst...@mit.edu wrote: On Thu, Mar 24, 2011 at 11:33 PM, Jeff Janes jeff.ja...@gmail.com wrote: I tried under the circumstances I thought were most likely to show a time difference, and I was unable to detect a reliable difference in timing between free list and clock sweep. It strikes me that it shouldn't be terribly hard to add a profiling option to Postgres to dump out a list of precisely which blocks of data were accessed in which order. Then it's fairly straightforward to process that list using different algorithms to measure which generates the fewest cache misses. It is pretty easy to get the list by adding a couple of elog calls. To be safe you probably also need to record pins and unpins, as you can't evict a pinned buffer no matter how otherwise eligible it might be. For most workloads you might be able to get away with just assuming that if it is eligible for replacement under any reasonable strategy, then it is very unlikely to still be pinned. Also, if the list is derived from a concurrent environment, then the order of access you see under a particular policy might no longer be the same if a different policy were adopted. But whose workload would you use to do the testing? The ones I was testing were simple enough that I just know what the access pattern is: the root and 1st-level branch blocks are almost always in shared buffers, the leaf and table blocks almost never are. Here my concern was not how to choose which block to replace in a conceptual way, but rather how to code that selection in a way that is fast and concurrent and low latency for the latency-sensitive processes. Either method will evict the same blocks, with the exception of differences introduced by race conditions that get resolved differently. A benefit of focusing on the implementation rather than the high level selection strategy is that improvements in implementation are more likely to carry over to other workloads.
My high level conclusions were that the running of the selection is generally not a bottleneck, and in the cases where it was, the bottleneck was due to contention on the LWLock, regardless of what was done under that lock. Changing who does the clock-sweep is probably not meaningful unless it facilitates a lock-strength reduction or other contention reduction. I have also played with simulations of different algorithms for managing the usage_count, and I could get improvements but they weren't big enough or general enough to be very exciting. It was generally the case that if the data size was X, the improvement was maybe 30% over the current, but if the data size was 0.8X or 1.2X, there was no difference. So not very general. Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] When and how many times does ExecSetParamPlan executes?
Hello all, I was going through the Expression Evaluator and was trying to understand how the expressions are formed and evaluated. I was informed on the IRC channel that the PARAM nodes are quite important and many well written client applications use PARAMs for sending queries to the backend. I found while looking at the ExecEvalParam function that a call is made to ExecSetParamPlan for the first time when the PARAM is caught in the ExprState tree. I am confused about how many times the call is made to the ExecSetParamPlan function. Since the executor runs the expression evaluator for each tuple (trying to find qual satisfaction) while the expression tree is created only once, the possibility (from my viewpoint) is that ExecSetParamPlan might be called once only; which would be when the first PARAM is encountered during the query execution. OR, it might get called individually for each tuple (when the PARAM node is caught in the tree). So I am confused about which case it is. Does ExecSetParamPlan get called on a per-tuple basis, or on first encounter during qual/expression evaluation? Kindly help. Regards, Vaibhav -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set hint bits upon eviction from BufMgr
On Fri, Mar 25, 2011 at 10:34 AM, Jim Nasby j...@nasby.net wrote: On Mar 25, 2011, at 9:52 AM, Merlin Moncure wrote: Without this bit, the only way to set hint bits going during bufmgr eviction is to do a visibility check on every tuple, which would probably be prohibitively expensive. Since OLTP environments would rarely see this bit, they would not have to pay for the check. IIRC one of the biggest costs is accessing the CLOG, but what if the bufmgr.c/bgwriter didn't use the same CLOG lookup mechanism as backends did? Unlike when a backend is inspecting visibility, it's not necessary for something like bgwriter to know exact visibility as long as it doesn't mark something as visible when it shouldn't. If it uses a different CLOG caching/accessing method that lags behind the real CLOG then the worst-case scenario is that there's a delay on setting hint bits. But getting bgwriter to do this would likely still be a huge win over forcing backends to worry about it. It's also possible that the visibility check itself could be simplified. BTW, I don't think you want to play these games when a backend is evicting a page because you'll be slowing a real backend down. Well, I'm not so sure -- as noted above, you only pay for the check when all the records in a page are new, and only once per page, not once per tuple. Basically, only when you are bulk jamming records through the buffers. The amortized cost of the clog lookup is going to be near zero (maybe you could put a fuse in that would get tripped if there weren't enough tuples in the page to justify the check). If you are bulk loading more data than you have shared buffers, then you get zero benefit. However, you might have the makings of a strategy for dealing with hint bit i/o in user land (by breaking up transactions, tweaking shared buffers, etc.).
merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: q-gram GIN and GiST indexes
Hackers, I would like to ask you about the currency of the work described below. I propose to develop GIN and GiST q-gram index functionality with the following features: 1) Handle edit distance (e.g. Levenshtein distance) and LIKE/ILIKE queries (using GIN partial match if no full q-grams can be extracted from the wildcard) 2) Support of various q 3) Support of positional q-grams in GIN (for more effective edit distance filtering) 4) Various signature sizes in GiST As you can see, there are some significant differences from pg_trgm. Do you see this functionality as useful? If you think it useful, where would you like to see it: a separate project, a contrib module, or core (of course, only if the code is of sufficient quality)? I have a strong confidence level about the implementability of this project in a few months. That's why I could propose this as a GSoC project. With best regards, Alexander Korotkov. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
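For comparison, the existing pg_trgm module already covers the q = 3 case, including GIN support for LIKE/ILIKE wildcard search in the 9.1 development branch; a sketch of current usage (table and index names are hypothetical):

```sql
-- pg_trgm indexes trigrams (q-grams with q = 3); a GIN index over them
-- supports similarity search, and in 9.1-era code also wildcard search.
CREATE EXTENSION pg_trgm;

CREATE TABLE words (w text);
CREATE INDEX words_trgm_idx ON words USING gin (w gin_trgm_ops);

SELECT w FROM words WHERE w % 'postgres';   -- trigram similarity match
SELECT w FROM words WHERE w LIKE '%gres%';  -- wildcard match via GIN
```

The proposal above generalizes this to arbitrary q, positional q-grams, and edit-distance queries, which pg_trgm does not handle.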
Re: [HACKERS] Transactional DDL, but not Serializable
* Tom Lane (t...@sss.pgh.pa.us) wrote: Making DDL serializable is *not* simple, and half-baked hacks won't make that situation better ... Sorry, that obviously didn't come across clearly (I think I've just been talking to Kevin far too much). I'm not interested in making them serializable. I'd like to not have tables randomly appear during a serializable transaction. Thanks, Stephen
Re: [HACKERS] Transactional DDL, but not Serializable
Making DDL serializable is *not* simple, and half-baked hacks won't make that situation better ... That seemed unnecessary. Whether or not you approve of Stephen's solution, he is dealing with a real issue. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
* Joshua Berkus (j...@agliodbs.com) wrote: That seemed unnecessary. Whether or not you approve of Stephen's solution, he is dealing with a real issue. The solution felt, to me at least, to have a lot of parallels to an index's indcheckxmin. We've dealt with this issue there and have a precedent for how to deal with it. Based on discussions with other folks it sounds like we may be forced to do it for constraints also, and I think we'd want to try to deal with all of them in a similar way. Perhaps the current solution for indexes is a hack and should be tossed out with a wholesale replacement which solves all these problems, which would certainly be quite a bit of work, but if that's necessary then let's discuss it and get an idea down on a wiki somewhere about what that should look like. Thanks, Stephen
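The index precedent mentioned here is visible in the catalogs: pg_index.indcheckxmin marks indexes whose pg_index row's xmin must be tested against the calling query's snapshot before the index may be used. A sketch of inspecting it:

```sql
-- Indexes flagged indcheckxmin are ignored by queries whose snapshot
-- cannot see the pg_index row yet -- the same style of xmin-based
-- visibility test being proposed above for newly created tables.
SELECT indexrelid::regclass AS index_name, indcheckxmin
FROM pg_index
WHERE indcheckxmin;
```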
Re: [HACKERS] 2nd Level Buffer Cache
On Mar 25, 2011, at 11:58 AM, Jim Nasby j...@nasby.net wrote: Related to that... after talking to Greg Smith at PGEast last night, he felt it would be very valuable just to profile how much time is being spent waiting/holding the freelist lock in a real environment. I'm going to see if we can do that on one of our slave databases. Yeah, that would be great. Also, some LWLOCK_STATS output or oprofile output would be definitely be useful. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] When and how many times does ExecSetParamPlan executes?
Vaibhav Kaushal vaibhavkaushal...@gmail.com writes: Hello all, I was going through the Expression Evaluator and was trying to understand how the expressions are formed and evaluated. I was informed on the IRC channel that the PARAM nodes are quite important and many well written client applications use PARAMs for sending query to the backend. I found while looking at the ExecEvalParam function that a call is made to the ExecSetParamPlan for the first time when the PARAM is caught in the ExprState tree. I am confused about how many times the call is made to the ExecSetParamPlan funtion. Indeed ... ExecSetParamPlan has nothing at all to do with Params representing values coming from a client (PARAM_EXTERN parameters). It's used in connection with PARAM_EXEC parameters that represent the outputs of InitPlan subplans (that is, execute-only-once subplans). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
Stephen Frost sfr...@snowman.net writes: Sorry, that obviously didn't come across clearly (I think I've just been talking to Kevin far too much). I'm not interested in making them serializable. I'd like to not have tables randomly appear during a serializable transaction. Well, basically, you can't have that. Example: you have an existing table with primary key, and while you're in the middle of doing some long transaction, somebody else creates a table with a foreign-key reference to the one you're about to do a delete from. Being serializable does not excuse you from the obligation to check for FK violations in that invisible table. It might be acceptable to fail entirely, but not to act as though the table isn't there. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
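Tom's scenario can be written out concretely as a two-session sketch (the parent/child tables are hypothetical):

```sql
-- Session 1: a long-running serializable transaction.
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM parent;   -- snapshot taken here

-- Session 2, concurrently: creates and commits a referencing table.
CREATE TABLE child (parent_id integer REFERENCES parent (id));
INSERT INTO child VALUES (1);

-- Session 1 again: child is invisible to its snapshot, yet this DELETE
-- must still check child's rows to enforce the foreign key -- so the
-- system cannot simply behave as though child does not exist.
DELETE FROM parent WHERE id = 1;
COMMIT;
```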
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Fri, Mar 25, 2011 at 4:00 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 25.03.2011 09:51, Heikki Linnakangas wrote: I don't think we should put the onus on the user to choose the right data loading mode. If we can reliably detect the cases where it's safe to do these tricks, we can transparently apply them when possible. I would be cool with tricks that apply only in narrow cases, as long as we don't require the user to do anything. I believe the conditions for being able to set hint bits immediately at COPY are:

1. The table has been created or truncated in the same transaction
2. We are not in a subtransaction (or the table was created and truncated in the same subtransaction)
3. There are no open portals
4. Executing the COPY doesn't need to run any unsafe code that might access the same table. This includes triggers, check constraints and input functions. An expression is safe if it is immutable.

That's not enough... some other transaction could see the data before the transaction commits. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
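The loading pattern these conditions describe is the familiar create-and-load-in-one-transaction shape (the table and file names here are hypothetical); whether it is actually sufficient is exactly what the thread is debating:

```sql
-- The table is created in the same transaction as the COPY, so no other
-- transaction can even resolve its name until COMMIT -- the case in
-- which the proposed hint-bit/freeze tricks might apply.
BEGIN;
CREATE TABLE load_target (i integer);
COPY load_target FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;
```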
Re: [HACKERS] When and how many times does ExecSetParamPlan executes?
Thanks for the reply, Tom. So, I think that the function ExecSetParamPlan (as the code suggests too) is called _once_ in any plan/expression, and that should mostly be for a sub-select query. Kindly correct me if I am wrong. Since I am not able to understand this use case completely, a sample query which is capable of calling this function (ExecSetParamPlan) could shed some light. It would be really kind of you / anyone to show me a query executable through psql which can actually call ExecSetParamPlan and involves the use of an on-disk relation. The reason I am asking for a query which I could run on psql is that I am using gdb to debug (and observe the steps taken by) PG. Regards, Vaibhav On Fri, 2011-03-25 at 14:37 -0400, Tom Lane wrote: Vaibhav Kaushal vaibhavkaushal...@gmail.com writes: Hello all, I was going through the Expression Evaluator and was trying to understand how the expressions are formed and evaluated. I was informed on the IRC channel that the PARAM nodes are quite important and many well written client applications use PARAMs for sending queries to the backend. I found while looking at the ExecEvalParam function that a call is made to ExecSetParamPlan for the first time when the PARAM is caught in the ExprState tree. I am confused about how many times the call is made to the ExecSetParamPlan function. Indeed ... ExecSetParamPlan has nothing at all to do with Params representing values coming from a client (PARAM_EXTERN parameters). It's used in connection with PARAM_EXEC parameters that represent the outputs of InitPlan subplans (that is, execute-only-once subplans). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
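For what it's worth, an uncorrelated scalar sub-select is the classic way to get an InitPlan, and hence a PARAM_EXEC parameter whose value is filled in by ExecSetParamPlan; a sketch one could step through under gdb (table names hypothetical):

```sql
-- The sub-select does not reference the outer query, so the planner
-- turns it into an InitPlan: it runs once, stores its result in a
-- PARAM_EXEC slot, and each outer-row qual evaluation just reads the
-- parameter. EXPLAIN shows it as "InitPlan 1 (returns $0)".
CREATE TABLE t1 (a integer);
CREATE TABLE t2 (b integer);
INSERT INTO t1 SELECT generate_series(1, 1000);
INSERT INTO t2 SELECT generate_series(1, 100);

EXPLAIN SELECT * FROM t1 WHERE a > (SELECT max(b) FROM t2);
```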
Re: [HACKERS] SSI bug?
On Fri, Mar 18, 2011 at 5:57 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: I'm still looking at whether it's sane to try to issue a warning when an HTAB exceeds the number of entries declared as its max_size when it was created. I think this does it. If nothing else, it might be instructive to use it while testing the SSI patch. Would it make any sense to slip this into 9.1, or should I add it to the first 9.2 CF? I don't think it's too late to commit something like this, but I'm not clear on whether we want it. Is this checking for what should be a can't-happen case, or are these soft limits that we expect to be exceeded from time to time? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
On Fri, Mar 18, 2011 at 4:51 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Dan Ports d...@csail.mit.edu wrote: I am surprised to see that error message without SSI's hint about increasing max_predicate_locks_per_xact. After reviewing this, I think something along the following lines might be needed, for a start. I'm not sure the Asserts are actually needed; they basically are checking that the current behavior of hash_search doesn't change. I'm still looking at whether it's sane to try to issue a warning when an HTAB exceeds the number of entries declared as its max_size when it was created. I don't see much advantage in changing these to asserts - in a debug build, that will promote ERROR to PANIC; whereas in a production build, they'll cause a random failure somewhere downstream. The HASH_ENTER to HASH_ENTER_NULL changes look like they might be needed, though. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Fri, Mar 25, 2011 at 7:06 PM, Robert Haas robertmh...@gmail.com wrote: 1. The table has been created or truncated in the same transaction ,,, That's not enough... some other transaction could see the data before the transaction commits. How? -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set hint bits upon eviction from BufMgr
On 25.03.2011 16:52, Merlin Moncure wrote: Without this bit, the only way to set hint bits going during bufmgr eviction is to do a visibility check on every tuple, which would probably be prohibitively expensive. I don't think the naive approach of scanning all tuples would be too bad, actually. The hint bits only need to be set once, and it'd be bgwriter shouldering the overhead. The problem with setting hint bits when a buffer is evicted is that it doesn't help with the bulk load case. The hint bits can't be set for a bulk load until the load is finished and the transaction commits. Maybe it would still be worthwhile to have bgwriter set hint bits, to reduce I/O caused by hint bit updates in an OLTP workload, but that's not what people usually complain about. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Attached is a WIP patch that allows SQL-language functions to reference their parameters by name. It uses p_post_columnref_hook, so potentially ambiguous references prefer the column... that seems to make the most sense, both because it avoids a backwards incompatibility, and it conforms with SQL's usual notion of assuming you mean the nearest name. It allows the parameter name to be qualified with the function name, for when you really mean you want the parameter. This being my first foray into the PostgreSQL source, I expect the code is horribly wrong in a variety of ways. Other than that, the regression tests I've been using are a slight modification of existing queries; I imagine they should look measurably different. And finally, conspicuously absent are the documentation changes that will obviously need to accompany a real patch. (This builds and passes `make check` on current HEAD, a4425e3.) Thanks! Matthew -- matt...@trebex.net

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
new file mode 100644
index ce3b77b..be71fbb
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
*** typedef SQLFunctionCache *SQLFunctionCac
*** 116,122
--- 116,124
   */
  typedef struct SQLFunctionParseInfo
  {
+ 	char	   *name;		/* function's name */
  	Oid		   *argtypes;	/* resolved types of input arguments */
+ 	char	  **argnames;	/* names of input arguments */
  	int			nargs;		/* number of input arguments */
  	Oid			collation;	/* function's input collation, if known */
  } SQLFunctionParseInfo;
*** typedef struct SQLFunctionParseInfo
*** 124,129
--- 126,133
  /* non-export function prototypes */
  static Node *sql_fn_param_ref(ParseState *pstate, ParamRef *pref);
+ static Node *sql_fn_post_column_ref(ParseState *pstate, ColumnRef *cref, Node *var);
+ static Node *sql_fn_param_ref_num(ParseState *pstate, SQLFunctionParseInfoPtr pinfo, int paramno, int location);
  static List *init_execution_state(List *queryTree_list,
  				 SQLFunctionCachePtr fcache,
  				 bool lazyEvalOK);
*** prepare_sql_fn_parse_info(HeapTuple proc
*** 163,168
--- 167,173
  	int			nargs;

  	pinfo = (SQLFunctionParseInfoPtr) palloc0(sizeof(SQLFunctionParseInfo));
+ 	pinfo->name = NameStr(procedureStruct->proname);

  	/* Save the function's input collation */
  	pinfo->collation = inputCollation;
*** prepare_sql_fn_parse_info(HeapTuple proc
*** 201,206
--- 206,241
  		pinfo->argtypes = argOidVect;
  	}

+ 	if (nargs > 0)
+ 	{
+ 		Datum		proargnames;
+ 		Datum		proargmodes;
+ 		int			argnum;
+ 		int			n_arg_names;
+ 		bool		isNull;
+
+ 		proargnames = SysCacheGetAttr(PROCNAMEARGSNSP, procedureTuple,
+ 									  Anum_pg_proc_proargnames,
+ 									  &isNull);
+ 		if (isNull)
+ 			proargmodes = PointerGetDatum(NULL);	/* just to be sure */
+
+ 		proargmodes = SysCacheGetAttr(PROCNAMEARGSNSP, procedureTuple,
+ 									  Anum_pg_proc_proargmodes,
+ 									  &isNull);
+ 		if (isNull)
+ 			proargmodes = PointerGetDatum(NULL);	/* just to be sure */
+
+ 		n_arg_names = get_func_input_arg_names(proargnames, proargmodes, &pinfo->argnames);
+
+ 		if (n_arg_names < nargs)
+ 			pinfo->argnames = NULL;
+ 	}
+ 	else
+ 	{
+ 		pinfo->argnames = NULL;
+ 	}
+
  	return pinfo;
  }
*** prepare_sql_fn_parse_info(HeapTuple proc
*** 210,223
  void
  sql_fn_parser_setup(struct ParseState *pstate, SQLFunctionParseInfoPtr pinfo)
  {
- 	/* Later we might use these hooks to support parameter names */
  	pstate->p_pre_columnref_hook = NULL;
! 	pstate->p_post_columnref_hook = NULL;
  	pstate->p_paramref_hook = sql_fn_param_ref;
  	/* no need to use p_coerce_param_hook */
  	pstate->p_ref_hook_state = (void *) pinfo;
  }

  /*
   * sql_fn_param_ref		parser callback for ParamRefs ($n symbols)
   */
--- 245,354
  void
  sql_fn_parser_setup(struct ParseState *pstate, SQLFunctionParseInfoPtr pinfo)
  {
  	pstate->p_pre_columnref_hook = NULL;
! 	pstate->p_post_columnref_hook = sql_fn_post_column_ref;
  	pstate->p_paramref_hook = sql_fn_param_ref;
  	/* no need to use p_coerce_param_hook */
  	pstate->p_ref_hook_state = (void *) pinfo;
  }

+ static Node *
+ sql_fn_resolve_name(ParseState *pstate, SQLFunctionParseInfoPtr pinfo, const char *paramname, int location)
+ {
+ 	int			i;
+
+ 	for (i = 0; i < pinfo->nargs; i++)
+ 		if (pinfo->argnames[i] && strcmp(pinfo->argnames[i], paramname) == 0)
+ 			return sql_fn_param_ref_num(pstate, pinfo, i + 1, location);
+
+ 	return NULL;
+ }
+
+ /*
+  * sql_fn_post_column_ref		parser callback for ColumnRefs
+  */
+ static Node *
+ sql_fn_post_column_ref(ParseState *pstate, ColumnRef *cref, Node *var)
+ {
+ 	SQLFunctionParseInfoPtr pinfo = (SQLFunctionParseInfoPtr) pstate->p_ref_hook_state;
+ 	int			names;
+ 	Node	   *field1;
+ 	Node	   *subfield = NULL;
+ 	const char *pname;
+ 	Node	   *param;
+
+ 	if (var != NULL)
+ 		return NULL;			/* there's a table column, prefer
Re: [HACKERS] Set hint bits upon eviction from BufMgr
On Fri, Mar 25, 2011 at 2:32 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 25.03.2011 16:52, Merlin Moncure wrote: Without this bit, the only way to set hint bits during bufmgr eviction is to do a visibility check on every tuple, which would probably be prohibitively expensive. I don't think the naive approach of scanning all tuples would be too bad, actually. The hint bits only need to be set once, and it'd be bgwriter shouldering the overhead. The problem with setting hint bits when a buffer is evicted is that it doesn't help with the bulk load case. The hint bits can't be set for a bulk load until the load is finished and the transaction commits. Not in the true bulk load case. However, if you can break up a load into multiple transactions and sneak out 10-100mb of pages into the buffer per transaction, you have a good chance of getting most/all the bits set correctly before bgwriter eats them up. I was also thinking of teaching bgwriter to keep xmin-flagged pages in a separate, lower-priority pool so that it didn't race to them before the transaction had a chance to go in. Long term, I'm imagining more direct transaction control in the backend, either via autonomous transactions, or stored procedures with explicit transaction control, so we don't have to load N gigabytes in a single transaction. Maybe it would still be worthwhile to have bgwriter set hint bits, to reduce I/O caused by hint bit updates in an OLTP workload, but that's not what people usually complain about. well, if bgwriter does it, you lose the ability to bail out of the clog check via TransactionIdIsCurrentTransactionId, right? If it's done in the bufmgr you at least have a chance to not have to go all the way out. Either way though, you at least have to teach bgwriter to be more cooperative. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Matthew Draper matt...@trebex.net writes: Attached is a WIP patch that allows SQL-language functions to reference their parameters by name. It uses p_post_columnref_hook, so potentially ambiguous references prefer the column... that seems to make the most sense, both because it avoids a backwards incompatibility, and it conforms with SQL's usual notion of assuming you mean the nearest name. Personally I'd vote for *not* having any such dangerous semantics as that. We should have learned better by now from plpgsql experience. I think the best idea is to throw error for ambiguous references, period. That means you do need ways to disambiguate in both directions. For column references you can just qualify with the table name/alias. If the parameter reference is intended, allow qualification with the function name. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] When and how many times does ExecSetParamPlan executes?
Vaibhav Kaushal vaibhavkaushal...@gmail.com writes: So, I think that the function ExecSetParamPlan (as the code suggests too) is called _once_ in any plan/expression and that should be mostly for a sub-select query. Kindly correct me if I am wrong. Since I am not able to understand this usecase completely, a sample query which is capable of calling this function (ExecSetParamPlan) could show some light. It would be really kind of you / anyone to show me a query executable through psql which can actually call ExecSetParamPlan and involves the use of an on-disk relation.

regression=# explain verbose select *, (select sum(f1) from int4_tbl) ss from int8_tbl;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Seq Scan on public.int8_tbl  (cost=1.07..2.12 rows=5 width=16)
   Output: int8_tbl.q1, int8_tbl.q2, $0
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=1.06..1.07 rows=1 width=4)
           Output: sum(int4_tbl.f1)
           ->  Seq Scan on public.int4_tbl  (cost=0.00..1.05 rows=5 width=4)
                 Output: int4_tbl.f1
(7 rows)

$0 here represents the PARAM_EXEC Param.

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
Robert Haas robertmh...@gmail.com writes: On Fri, Mar 18, 2011 at 5:57 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: I'm still looking at whether it's sane to try to issue a warning when an HTAB exceeds the number of entries declared as its max_size when it was created. I don't think it's too late to commit something like this, but I'm not clear on whether we want it. We do *not* want that. Up to now, I believe the lockmgr's lock table is the only shared hash table that is expected to grow past the declared size; that can happen anytime a session exceeds max_locks_per_transaction, which we consider to be only a soft limit. I don't know whether SSI has introduced any other hash tables that behave similarly. (If it has, we might want to rethink the amount of slop space we leave in shmem...) There might perhaps be some value in adding a warning like this if it were enabled per-table (and not enabled by default). But I'd want to see a specific reason for it, not just let's see if we can scare users with a WARNING appearing out of nowhere. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
2011/3/25 Tom Lane t...@sss.pgh.pa.us: Matthew Draper matt...@trebex.net writes: Attached is a WIP patch that allows SQL-language functions to reference their parameters by name. It uses p_post_columnref_hook, so potentially ambiguous references prefer the column... that seems to make the most sense, both because it avoids a backwards incompatibility, and it conforms with SQL's usual notion of assuming you mean the nearest name. Personally I'd vote for *not* having any such dangerous semantics as that. We should have learned better by now from plpgsql experience. I think the best idea is to throw error for ambiguous references, period. That means you do need ways to disambiguate in both directions. For column references you can just qualify with the table name/alias. If the parameter reference is intended, allow qualification with the function name. I agree with Tom. There could be a GUC for controlling whether parameter names are used. I am for a GUC, because there will be a billion conflicts. But talk about priorities - SQL identifier or parameter - is useless. Regards Pavel Stehule regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Set hint bits upon eviction from BufMgr
On Fri, Mar 25, 2011 at 3:32 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 25.03.2011 16:52, Merlin Moncure wrote: Without this bit, the only way to set hint bits during bufmgr eviction is to do a visibility check on every tuple, which would probably be prohibitively expensive. I don't think the naive approach of scanning all tuples would be too bad, actually. The hint bits only need to be set once, and it'd be bgwriter shouldering the overhead. I was thinking the same thing. The only thing I'm worried about is whether it'd make the bgwriter less responsive; we already have some issues in that department. The problem with setting hint bits when a buffer is evicted is that it doesn't help with the bulk load case. The hint bits can't be set for a bulk load until the load is finished and the transaction commits. Maybe it would still be worthwhile to have bgwriter set hint bits, to reduce I/O caused by hint bit updates in an OLTP workload, but that's not what people usually complain about. Yeah. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Pavel Stehule pavel.steh...@gmail.com writes: 2011/3/25 Tom Lane t...@sss.pgh.pa.us: I think the best idea is to throw error for ambiguous references, period. There could be a GUC for controlling whether parameter names are used. I am for a GUC, because there will be a billion conflicts. But talk about priorities - SQL identifier or parameter - is useless. GUCs are not tremendously helpful for problems such as this. If we actually wanted to preserve full backwards compatibility, we'd need to think of a way to mark SQL functions per-function as to what to do. But I don't think that's necessary. Up to now there's been relatively little use for naming the parameters of SQL functions, so I think there will be few conflicts in the field if we just change the behavior. The mess and complication we have for the comparable behavior in plpgsql seemed necessary because of the number of existing usages that would certainly break --- but I doubt that SQL-language functions will have anywhere near as big a problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On Fri, Mar 25, 2011 at 3:29 PM, Greg Stark gsst...@mit.edu wrote: On Fri, Mar 25, 2011 at 7:06 PM, Robert Haas robertmh...@gmail.com wrote: 1. The table has been created or truncated in the same transaction ... That's not enough... some other transaction could see the data before the transaction commits. How? Hmm. Maybe I'm wrong. I guess the XID would still be in the MVCC snapshot of every other running transaction, so maybe it would be OK. *scratches head* -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI bug?
On Fri, Mar 25, 2011 at 4:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Fri, Mar 18, 2011 at 5:57 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: I'm still looking at whether it's sane to try to issue a warning when an HTAB exceeds the number of entries declared as its max_size when it was created. I don't think it's too late to commit something like this, but I'm not clear on whether we want it. We do *not* want that. Up to now, I believe the lockmgr's lock table is the only shared hash table that is expected to grow past the declared size; that can happen anytime a session exceeds max_locks_per_transaction, which we consider to be only a soft limit. I don't know whether SSI has introduced any other hash tables that behave similarly. (If it has, we might want to rethink the amount of slop space we leave in shmem...) There might perhaps be some value in adding a warning like this if it were enabled per-table (and not enabled by default). But I'd want to see a specific reason for it, not just let's see if we can scare users with a WARNING appearing out of nowhere. What about a system view that shows declared and actual sizes of all these hash tables? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pre-set Hint bits/VACUUM FREEZE on data load..?
On 25.03.2011 22:21, Robert Haas wrote: On Fri, Mar 25, 2011 at 3:29 PM, Greg Stark gsst...@mit.edu wrote: On Fri, Mar 25, 2011 at 7:06 PM, Robert Haas robertmh...@gmail.com wrote: 1. The table has been created or truncated in the same transaction ... That's not enough... some other transaction could see the data before the transaction commits. How? Hmm. Maybe I'm wrong. I guess the XID would still be in the MVCC snapshot of every other running transaction, so maybe it would be OK. *scratches head* Right. You can't mark the tuples as frozen, but you can set the xmin-committed hint bits. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Problem with streaming replication, backups, and recovery (9.0.x)
hi,

So, I hit a strange problem with Streaming Replication that I cannot explain.

Executive summary: when using a hot backup made on a streaming replication slave, sometimes (depending on load) the generated backup is created in such a way that, while it can be brought back as a standalone Pg, and it can be brought back as a streaming slave, such a slave (created off the backup) cannot be promoted to standalone.

Disclaimer: I know that making hot backups on a slave is not the suggested way, yet I was doing it without any problem on earlier Postgres versions (8.2, 8.3, 8.4), and I do not have this problem with backups generated from the masters, so the problem I hit now is so peculiar that I thought it might be just an effect of some underlying, more serious, condition.

Longer explanation: First, let me explain how omnipitr-backup-slave works, because it's the tool that I use to make backups on the slave. Steps that it does:

1. gets pg_controldata for $PGDATADIR
2. compresses $PGDATA to data tar.gz, putting inside a backup_label file, which contains:

   START WAL LOCATION: %s (file %s)
   CHECKPOINT LOCATION: %s
   START TIME: %s
   LABEL: OmniPITR_Slave_Hot_Backup

   where START WAL LOCATION uses the value of "Latest checkpoint's REDO location" from pg_controldata from step #1, CHECKPOINT LOCATION is taken from "Latest checkpoint location" from pg_controldata taken in step #1, and START TIME is based on the current timestamp (as of before starting compression of $PGDATA)
3. gets another copy of pg_controldata for $PGDATA
4. repeats step #3 until the value of "Latest checkpoint location" changes
5. waits until the file containing the WAL location from "Minimum recovery ending location" from pg_controldata from step #4 becomes available
6. creates a .backup file which is named based on START WAL LOCATION (from step #2), and contains the same lines as the backup_label file from step #2, plus two more lines:

   STOP WAL LOCATION: %s (file %s)
   STOP TIME: %s

   where STOP WAL LOCATION is taken from "Minimum recovery ending location" from pg_controldata from step #4, and STOP TIME is the current timestamp as of before starting compression of WAL segments
7. compresses xlogs plus the .backup file generated in step #6

This approach worked for a long time on various hosts, systems, versions, etc. But now, it fails. I'm using for tests PostgreSQL 9.0.2 and 9.0.3 (mostly 9.0.2 as this is the most critical for me, but I tested on 9.0.3 too, and the problem is the same), on Linux (Ubuntu), 64bit.

I do the procedure as always, and it produces a backup. With this backup I can set up a new standalone server, and it works. I can also set up a streaming slave, and it also works, but when I create the trigger file to promote this slave to master, it fails with this error:

2011-03-24 21:01:58.051 CET @ 9680 LOG: trigger file found: /home/depesz/slave2/finish.recovery
2011-03-24 21:01:58.051 CET @ 9930 FATAL: terminating walreceiver process due to administrator command
2011-03-24 21:01:58.151 CET @ 9680 LOG: redo done at 0/1F58
2011-03-24 21:01:58.151 CET @ 9680 LOG: last completed transaction was at log time 2011-03-24 20:58:25.836333+01
2011-03-24 21:01:58.238 CET @ 9680 FATAL: WAL ends before consistent recovery point

Which is interesting, because this particular backup was done using a .backup file containing:

START WAL LOCATION: 0/A20 (file 0001000A)
STOP WAL LOCATION: 0/12C9D7E8 (file 00010012)
CHECKPOINT LOCATION: 0/B803050
START TIME: 2011-03-24 20:52:46 CET
STOP TIME: 2011-03-24 20:53:41 CET
LABEL: OmniPITR_Slave_Hot_Backup

Which means that the minimum recovery ending location was in fact reached (it was 0/12C9D7E8, and recovery continued till 0/1F58).
I have a set of scripts that can be used to replicate the problem, but the test takes some time (~ 30 minutes). What's most interesting is that this problem does not happen always. It happens only when there was non-trivial load on the db server - this is in my tests where both master and slave are the same machine. I think that in normal cases load on the slave is more important. If anyone would be able to help, I can give you access to a test machine and/or provide the set of scripts which (usually) replicate the problem. Alternatively - if there is anything I can do to help you solve the mystery - I'd be very willing to. Best regards, depesz -- The best thing about modern society is how easy it is to avoid contact with it. http://depesz.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
* Tom Lane (t...@sss.pgh.pa.us) wrote: Well, basically, you can't have that. Example: you have an existing table with primary key, and while you're in the middle of doing some long transaction, somebody else creates a table with a foreign-key reference to the one you're about to do a delete from. Being serializable does not excuse you from the obligation to check for FK violations in that invisible table. It might be acceptable to fail entirely, but not to act as though the table isn't there. That's an excellent example and point. Is there a 'right' answer (with regard to the SQL spec, what other databases do, etc)? When you go to delete a record from the existing table you could get a FK violation due to the invisible table, which could end up being rolled back and removed. It seems like the semantics around this would call for the adding-FK transaction to be treated as if the table did already exist and then handle this case as we would if there wasn't any DDL involved. Another approach might be to wait till commit to check the FK, but that'd probably be considered unkind. If the spec doesn't dictate anything and/or we can't find anyone else's semantics that make sense, I suppose we'll need to define our own. To that end, perhaps we should put up something on a wiki or similar to start capturing these and considering what the 'right' answer would be. Apologies for my ignorance on this. Thanks, Stephen
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
2011/3/25 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: 2011/3/25 Tom Lane t...@sss.pgh.pa.us: I think the best idea is to throw error for ambiguous references, period. There could be a GUC for controlling whether parameter names are used. I am for a GUC, because there will be a billion conflicts. But talk about priorities - SQL identifier or parameter - is useless. GUCs are not tremendously helpful for problems such as this. If we actually wanted to preserve full backwards compatibility, we'd need to think of a way to mark SQL functions per-function as to what to do. But I don't think that's necessary. Up to now there's been relatively little use for naming the parameters of SQL functions, so I think there will be few conflicts in the field if we just change the behavior. The mess and complication we have for the comparable behavior in plpgsql seemed necessary because of the number of existing usages that would certainly break --- but I doubt that SQL-language functions will have anywhere near as big a problem. It would be nice to have some converting tool for pg_dump or pg_upgrade; it could dump SQL functions with only qualified identifiers. Pavel regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] maximum digits for NUMERIC
On Fri, Mar 25, 2011 at 08:46:17AM +0000, Gianni Ciolli wrote: On Sun, Mar 20, 2011 at 08:14:21PM -0400, Noah Misch wrote: Agreed. The documentation is suggestive of this limit:

# CREATE TABLE n (c numeric(1001,0));
ERROR: NUMERIC precision 1001 must be between 1 and 1000
LINE 1: CREATE TABLE n (c numeric(1001,0));

However, that's indeed just a limit of the numeric typmod representation, not the data type itself. An unqualified numeric column hits no such limit. For the record, the limits I found from my tests are:

* 2^17 - 1 maximum total digits
* 2^14 - 1 maximum fractional digits

(I did tests as I couldn't extract any obvious limit from the source code of numeric.c)

The attached patch follows up on this short discussion.

Best regards, Dr. Gianni Ciolli - 2ndQuadrant Italia PostgreSQL Training, Services and Support gianni.cio...@2ndquadrant.it | www.2ndquadrant.it

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 0bb6594..259523d 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -476,7 +476,7 @@
     </indexterm>
 
     <para>
-     The type <type>numeric</type> can store numbers with up to 1000
+     The type <type>numeric</type> can store numbers with up to 131071
      digits of precision and perform calculations exactly. It is
      especially recommended for storing monetary amounts and other
      quantities where exactness is required. However, arithmetic on
@@ -493,7 +493,7 @@
     the whole number, that is, the number of digits to both sides of the
     decimal point. So the number 23.5141 has a precision of 6 and a
     scale of 4. Integers can be considered to have a scale of
-    zero.
+    zero. The maximum allowed scale is 16383.
    </para>
 
    <para>
@@ -525,6 +525,15 @@
     NUMERIC
     explicitly.)
    </para>
 
+   <note>
+    <para>
+     The maximum allowed precision when explicitly specified in the
+     type declaration is 1000; otherwise the <type>NUMERIC</type>
+     data type supports a maximum precision of 131071 and a maximum
+     scale of 16383.
+    </para>
+   </note>
+
    <para>
     If the scale of a value to be stored is greater than the declared
     scale of the column, the system will round the value to the specified

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: Well, basically, you can't have that. Example: you have an existing table with primary key, and while you're in the middle of doing some long transaction, somebody else creates a table with a foreign-key reference to the one you're about to do a delete from. Being serializable does not excuse you from the obligation to check for FK violations in that invisible table. It might be acceptable to fail entirely, but not to act as though the table isn't there. That's an excellent example and point. Is there a 'right' answer (with regard to the SQL spec, what other databases do, etc)? I'm not aware that anybody's got an amazingly satisfactory solution. PG's answer is of course to use up-to-the-minute DDL regardless of what the transaction might see for other purposes, which certainly has got disadvantages if you're hoping for truly serializable behavior. But I'm not sure there's a better answer. You could make an argument for failing any serializable transaction that's affected by DDL changes that happen after it started. I don't know whether that cure would be worse than the disease. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Transactional DDL, but not Serializable
Tom Lane wrote: Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: Well, basically, you can't have that. Example: you have an existing table with primary key, and while you're in the middle of doing some long transaction, somebody else creates a table with a foreign-key reference to the one you're about to do a delete from. Being serializable does not excuse you from the obligation to check for FK violations in that invisible table. It might be acceptable to fail entirely, but not to act as though the table isn't there. That's an excellent example and point. Is there a 'right' answer (with regard to the SQL spec, what other databases do, etc)? I'm not aware that anybody's got an amazingly satisfactory solution. PG's answer is of course to use up-to-the-minute DDL regardless of what the transaction might see for other purposes, which certainly has got disadvantages if you're hoping for truly serializable behavior. But I'm not sure there's a better answer. You could make an argument for failing any serializable transaction that's affected by DDL changes that happen after it started. I don't know whether that cure would be worse than the disease. If transaction A commits successfully before transaction B commits, regardless of when transaction B started, and transaction A changes/adds/etc any constraints on the database, then I would expect transaction B to only commit successfully if all of its data changes pass those new/changed constraints. If B were allowed to commit without that being the case, then it would leave the database in an inconsistent state, that is a state where its data doesn't conform to its constraints. A database should always be consistent on transaction boundaries, at the very least, if not on statement boundaries. 
As to whether B's failure happens when it tries to commit or happens earlier, based on visibility issues with A's changes, doesn't matter to me so much (do what works best for you/others), but it should fail at some point if it would otherwise cause inconsistencies. -- Darren Duncan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 9.1 Beta
Judging by the number of new threads about development for 9.2, I think it's time we declared 9.1 Beta. I just had a conversation with some Debian developers about how PostgreSQL 9.0 got pulled out of their release because we delayed by 3 weeks. So we missed our slot to deliver useful new features to our very best supporters by 2 years. I really hope that was deliberate. I've never understood why we timebox useful development, yet tweaking is allowed to go on without limit. Personally, I don't see the rationale to allow developers some kind of priority over their input. This tweaking period is essentially a time when insiders can put their votes in, but nobody else can. Beta is where we get feedback from a wider audience. The sooner we declare Beta, the sooner people will test. Then we will have user feedback, bugs to fix etc. Everybody is very clearly sitting idle. With a longer bug list we will make faster progress to release. We're just wasting time. If we had a hard date for feature freeze, let's have a hard date for Beta of +2 months (next time), and +2.5 months now. (I know +1 month was suggested, well that's just unrealistic.) Beta is a great time to resolve difficult decisions, by opening the floor to wider debate and feedback. Delaying beta because we still have unresolved issues is exactly backwards of what we should be doing. Let's hear from a wider audience. Vox populi, vox dei -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] locale operation to be invoked, but no collation was derived (in trigger)
Stefan Huehner ste...@huehner.org writes: first i am not sure how the state of the collation work in current git is supposed to be with all the discussion going on here... but wanted to get out that bug report:

create table test (test_id varchar(32), name varchar(32));

create function test_trg() RETURNS TRIGGER LANGUAGE plpgsql AS $function$
DECLARE
BEGIN
  --Check tab name starts with a upper case letter
  IF (not (substr(new.Name,1,1) between 'A' and 'Z')) THEN
    RAISE EXCEPTION '%', '@TabName1stCharUpper@' ; --OBTG:-2--
  END IF;
END ;
$function$ ;

create trigger test_trg after update on test for each row execute procedure test_trg();

insert into test values ('1', 'test');
update test set name = 'test2' where test_id = '1';
ERROR: locale operation to be invoked, but no collation was derived
CONTEXT: PL/pgSQL function test_trg line 4 at IF

Fixed, thanks for the test case. (I have a feeling I'd better grep the whole backend for other places that are likewise neglecting to set the collation fields in manually-created expression nodes ...)

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
On Mar 25, 2011, at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: GUCs are not tremendously helpful for problems such as this. If we actually wanted to preserve full backwards compatibility, we'd need to think of a way to mark SQL functions per-function as to what to do. But I don't think that's necessary. Up to now there's been relatively little use for naming the parameters of SQL functions, so I think there will be few conflicts in the field if we just change the behavior. Oh wow, I don't agree with that at all. People may name the parameters for documentation purposes, and then have things like WHERE foo = $1, foo happening also to be the name associated with $1. Boom! In any case, I think this is 9.2 material. We need to get a beta out the door, and I emphatically think we should be focusing on resolving the issues with features already committed, not committing new ones. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Robert Haas robertmh...@gmail.com writes: On Mar 25, 2011, at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: But I don't think that's necessary. Up to now there's been relatively little use for naming the parameters of SQL functions, so I think there will be few conflicts in the field if we just change the behavior. Oh wow, I don't agree with that at all. People may name the parameters for documentation purposes, and then have things like WHERE foo = $1, foo happening also to be the name associated with $1. Boom! Well, maybe, but it's not like it's subtle or hard to fix. In any case, I think this is 9.2 material. Oh, of course. It *is* just a WIP patch, anyway. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: making write location work (was: Efficient transaction-controlled synchronous replication)
On Wed, Mar 23, 2011 at 6:22 PM, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> Specifically, if we're not going to remove write location, then I think we need to apply something like the attached.
>> The protocol supports different write/fsync values, so the view should display them.
> That's exactly the point.

No, it's not.

> Currently, we have a protocol that supports different write and fsync values, but the code as written does not actually ever send a reply at any time when the two values can ever be different. So there is no point in sending both of them. The write location is completely redundant with the fsync location and therefore completely useless. We shouldn't bother sending the value twice, or displaying it twice, if it's absolutely 100% guaranteed to be identical in every case.

As of 9.1, we now support other tools that use the protocol, so you cannot assume you know what is being sent just because one sender has certain characteristics.

> The point of the patch that I posted is that it restores the previous behavior, where we send an update before flushing WAL and again after flushing WAL. If we do that, then the write location can be ahead of the flush location when we've written but not flushed. If we don't do that, and only send replies after flushing everything, then the two fields are perforce always the same on the master. I don't see that as being a useful behavior, and in fact I think it could be quite confusing. Someone might assume that if we bother to expose both a write_location and a flush_location, they are somehow different.

They can be in 9.1, and almost certainly will be in 9.2.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Re: making write location work (was: Efficient transaction-controlled synchronous replication)
On Thu, Mar 24, 2011 at 3:22 AM, Robert Haas robertmh...@gmail.com wrote:
> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>> Specifically, if we're not going to remove write location, then I think we need to apply something like the attached.
>> The protocol supports different write/fsync values, so the view should display them.
> That's exactly the point. Currently, we have a protocol that supports different write and fsync values, but the code as written does not actually ever send a reply at any time when the two values can ever be different. So there is no point in sending both of them. The write location is completely redundant with the fsync location and therefore completely useless. We shouldn't bother sending the value twice, or displaying it twice, if it's absolutely 100% guaranteed to be identical in every case. The point of the patch that I posted is that it restores the previous behavior, where we send an update before flushing WAL and again after flushing WAL. If we do that, then the write location can be ahead of the flush location when we've written but not flushed. If we don't do that, and only send replies after flushing everything, then the two fields are perforce always the same on the master. I don't see that as being a useful behavior, and in fact I think it could be quite confusing. Someone might assume that if we bother to expose both a write_location and a flush_location, they are somehow different.

I agree with Robert. It's useless and confusing to redundantly send the same location as both flush_location and write_location. We should either remove write_location from the pg_stat_replication view and the protocol altogether, or apply the proposed patch.

Regards,

-- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] Re: making write location work (was: Efficient transaction-controlled synchronous replication)
On Thu, Mar 24, 2011 at 11:45 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Thu, Mar 24, 2011 at 3:22 AM, Robert Haas robertmh...@gmail.com wrote:
>> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs si...@2ndquadrant.com wrote:
>>>> Specifically, if we're not going to remove write location, then I think we need to apply something like the attached.
>>> The protocol supports different write/fsync values, so the view should display them.
>> That's exactly the point. Currently, we have a protocol that supports different write and fsync values, but the code as written does not actually ever send a reply at any time when the two values can ever be different. So there is no point in sending both of them. The write location is completely redundant with the fsync location and therefore completely useless. We shouldn't bother sending the value twice, or displaying it twice, if it's absolutely 100% guaranteed to be identical in every case. The point of the patch that I posted is that it restores the previous behavior, where we send an update before flushing WAL and again after flushing WAL. If we do that, then the write location can be ahead of the flush location when we've written but not flushed. If we don't do that, and only send replies after flushing everything, then the two fields are perforce always the same on the master. I don't see that as being a useful behavior, and in fact I think it could be quite confusing. Someone might assume that if we bother to expose both a write_location and a flush_location, they are somehow different.
> I agree with Robert. It's useless and confusing to redundantly send the same location as both flush_location and write_location. We should either remove write_location from the pg_stat_replication view and the protocol altogether, or apply the proposed patch.

You may be confused, but that doesn't mean it's useless. The protocol supports sending two values, so two are displayed.

If you wish to remove one from the display, that only makes sense if you also prevent the protocol from supporting two values. There is no benefit in doing that, so why do it? We are going to put that back in 9.2 if you remove it now. Why not leave it, so we don't need to rewrite all the monitoring tools that will use this view? Just say this in the docs:

    Currently, standbys return the same value for write and flush locations, so in most cases these values will be the same. It is possible to write a user program that sends replies at different times, in which case the values would differ. In a later release these values will differ for standbys, when more options are available.

It's almost certain that Magnus will fix the bug in pg_xlogstream (or whatever it's called) and a tool will be available, so let's leave this.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
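[For reference, the columns being argued over are visible in the monitoring view; a sketch using the 9.1-era column names (these were renamed to *_lsn in later releases).]

```sql
-- Run on the primary: one row per connected standby.  Under the
-- committed behavior, write_location and flush_location are reported
-- identical; Robert's patch would let write_location run ahead of
-- flush_location while the standby is still flushing.
SELECT application_name, state,
       sent_location, write_location, flush_location, replay_location
FROM pg_stat_replication;
```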
Re: [HACKERS] 9.1 Beta
Simon Riggs si...@2ndquadrant.com writes:
> Judging by the number of new threads about development for 9.2, I think it's time we declared 9.1 Beta. I just had a conversation with some Debian developers about how PostgreSQL 9.0 got pulled out of their release because we delayed by 3 weeks. So we missed our slot to deliver useful new features to our very best supporters by 2 years.

I really hope that was deliberate. We do *not* make release decisions based on Debian's schedule. Even if we wanted to, going beta is hardly likely to affect their decisions. The correct question is whether we're ready for beta, and I would say the answer is clearly no, unless you have a pretty low standard for what "ready for beta" means. Perhaps it would be suitable to discuss what the standard for that really ought to be; but I don't agree in the slightest that we ought to decide based on predetermined calendar dates rather than the state of the code.

> If we had a hard date for feature freeze, let's have a hard date for Beta of +2 months (next time), and +2.5 months now. (I know +1 month was suggested; well, that's just unrealistic.) Beta is a great time to resolve difficult decisions, by opening the floor to wider debate and feedback.

The reason we get wider testing during beta is that people have some confidence (perhaps misplaced) that the database won't eat their data. Releasing alpha-grade code and calling it "beta" isn't going to get us wider testing ... at least, not more than once.

regards, tom lane
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
On Mar 25, 2011, at 7:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Well, maybe, but it's not like it's subtle or hard to fix.

Depends how much of it you have. I've become very skeptical of anything that breaks pg_dump-and-reload-ability. And doubly so now that such problems also mean breaking pg_upgrade after the old cluster has been shut down.

...Robert
Re: [HACKERS] GSoC 2011 - Mentors? Projects?
On 8.3.2011 07:44, Selena Deckelmann wrote:
> Hi! PostgreSQL is applying for GSoC again this year. We're looking for: * Mentors * Project ideas. Would you like to mentor? Please let me know! Our application closes on Friday, so please contact me *before* Friday. I've started a wiki page: http://wiki.postgresql.org/wiki/GSoC_2011

Hi, I spoke to a teacher from a local university last week, mainly as we were looking for a place where a local PUG could meet regularly. I realized this could be a good opportunity to head-hunt some students to participate in this GSoC. Are we still interested in new students?

I've never been involved in GSoC before, so I'm not quite sure how all this works. Should the student choose a project idea from the wiki, contact one of the mentors, write to pgsql-students, or how is that supposed to work? I've noticed the student application deadline is April 8, so it would be a bit of a rush, I guess. Is it worth a try?

regards Tomas
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Robert Haas robertmh...@gmail.com writes:
> On Mar 25, 2011, at 7:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Well, maybe, but it's not like it's subtle or hard to fix.
> Depends how much of it you have. I've become very skeptical of anything that breaks pg_dump-and-reload-ability.

This wouldn't break pg_dump scripts, because they disable check_function_bodies. You would get a failure on first *use* of a function, which is something different.

Basically my concern here is that, in the name of easing a short-term conversion issue, we'll be condemning users to a future of subtle, hard-to-find bugs due to ambiguous names. How many hundreds of reports have we seen about the equivalent problem in plpgsql? You could argue that the frequency of plpgsql issues was at least partly due to having a poor choice of which way to resolve the ambiguity, but I don't think it can be entirely blamed on that.

regards, tom lane
Re: [HACKERS] GSoC 2011 - Mentors? Projects?
Tomas,

> I spoke to a teacher from a local university last week, mainly as we were looking for a place where a local PUG could meet regularly. I realized this could be a good opportunity to head-hunt some students to participate in this GSoC. Are we still interested in new students?

Yes, please! We have had students from Charles University several times before, and would be glad to have more. The wiki page has links to the information about the program. Talk to Zdenek if you have more questions.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
[HACKERS] race condition in sync rep
I believe I've figured out why synchronous replication has such terrible performance with fsync=off: it has a nasty race condition. It may happen - if the standby responds very quickly - that the standby acks the commit record and awakens waiters before the committing backend actually begins to wait. There's no cross-check for this: the committing backend waits unconditionally, with no regard to whether the necessary ACK has already arrived. At this point we may be in for a very long wait: another ACK will be required to release waiters, and that may not be immediately forthcoming.

I had thought that the next ACK (after at most wal_receiver_status_interval) would do the trick, but it appears to be even worse than that: by making the standby win the race, I was easily able to get the master to hang for over a minute, and it only got released when I committed another transaction. Had I been sufficiently patient, the next checkpoint probably would have done the trick. Of course, with fsync=off on the standby, it's much easier for the standby to win the race.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
On Fri, Mar 25, 2011 at 8:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Mar 25, 2011, at 7:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> Well, maybe, but it's not like it's subtle or hard to fix.
>> Depends how much of it you have. I've become very skeptical of anything that breaks pg_dump-and-reload-ability.
> This wouldn't break pg_dump scripts, because they disable check_function_bodies. You would get a failure on first *use* of a function, which is something different. Basically my concern here is that in the name of easing a short-term conversion issue, we'll be condemning users to a future of subtle, hard-to-find bugs due to ambiguous names. How many hundreds of reports have we seen about the equivalent problem in plpgsql? You could argue that the frequency of plpgsql issues was at least partly due to having a poor choice of which way to resolve the ambiguity, but I don't think it can be entirely blamed on that.

As I've said before, I believe that the root cause of this problem is that using the same syntax for variables and column names is a bad idea in the first place. If we used $foo or ?foo or ${foo} or $.foo or foo!!$#? to mean the parameter called foo, then this would all be a non-issue.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Tom,

> Personally I'd vote for *not* having any such dangerous semantics as that. We should have learned better by now from plpgsql experience. I think the best idea is to throw error for ambiguous references, period.

As a likely heavy user of this feature, I agree with Tom here. I really don't want the column being silently preferred in SQL functions when PL/pgSQL functions are throwing an error; I'd end up spending hours debugging this. Also, I don't understand why this would be a dump/reload issue if $1 and $2 continue to work.

-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com San Francisco
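[The PL/pgSQL behavior Josh refers to can be seen with a small invented example; since 9.0, plpgsql's default is to raise an error when a name could be either a variable/parameter or a column.]

```sql
CREATE TABLE accounts (balance numeric);

CREATE FUNCTION is_overdrawn(balance numeric) RETURNS boolean AS $$
BEGIN
    -- At execution time this raises:
    --   ERROR: column reference "balance" is ambiguous
    -- unless the references are qualified (accounts.balance,
    -- is_overdrawn.balance) or plpgsql.variable_conflict is changed
    -- from its default of "error".
    RETURN EXISTS (SELECT 1 FROM accounts WHERE balance < balance);
END;
$$ LANGUAGE plpgsql;
```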
Re: [HACKERS] Replication server timeout patch
On Wed, Mar 23, 2011 at 6:33 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
> On 16.03.2011 11:11, Fujii Masao wrote:
>> On Wed, Mar 16, 2011 at 4:49 PM, Fujii Masao masao.fu...@gmail.com wrote:
>>> Agreed. I'll change the patch.
>> Done. I attached the updated patch.
> I don't much like the API for this. Walsender shouldn't need to know about the details of the FE/BE protocol, and pq_putbytes_if_available() seems too low level to be useful. I think a better API would be to have a non-blocking version of pq_putmessage(). We can make the output buffer in pqcomm.c resizeable, so that when the message doesn't fit in the output buffer in pq_putmessage(), the buffer is enlarged instead of trying to flush it. Attached is a patch using that approach. This is a much smaller patch, and easier to understand. I'm not totally happy with the walsender main loop; it seems to work as it is, but the logic has become quite complicated. Ideas welcome on how to simplify that.

Heikki, are you planning to commit this, either with or without further revisions?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GSoC 2011 - Mentors? Projects?
On 26.3.2011 02:05, Joshua Berkus wrote:
> Tomas,
>> I spoke to a teacher from a local university last week, mainly as we were looking for a place where a local PUG could meet regularly. I realized this could be a good opportunity to head-hunt some students to participate in this GSoC. Are we still interested in new students?
> Yes, please! We have had students from Charles University several times before, and would be glad to have more. The wiki page has links to the information about the program. Talk to Zdenek if you have more questions.

I know Zdenek was mentoring some students in the previous years, but he's been a bit hard to reach recently. And the deadline is near.

I've read some info about the program on the wiki, but I'm not sure what the students should do. Let's say they read the list of project ideas on the wiki and choose one or two of them. What should they do next? Should they write to the pgsql-students mailing list? I guess most of the students won't have much experience with PostgreSQL, and most of the ideas are described only very briefly, so they'll need help with their proposals.

regards Tomas
Re: [HACKERS] 9.1 Beta
On Fri, Mar 25, 2011 at 6:18 PM, Simon Riggs si...@2ndquadrant.com wrote:
> I've never understood why we timebox useful development, yet tweaking is allowed to go on without limit. Personally, I don't see the rationale to allow developers some kind of priority over their input. This tweaking period is essentially a time when insiders can put their votes in, but nobody else can. Beta is where we get feedback from a wider audience.

I think 9.0 got delayed quite a bit by the fact that we need approximately 347 people to wrap a release, and they all had vacations at different times over the summer. The code was pretty stable by July 1; I think we could easily have released in August if we had a slightly less awkward process for getting these things out the door.

> The sooner we declare Beta, the sooner people will test. Then we will have user feedback, bugs to fix, etc. Everybody is very clearly sitting idle. With a longer bug list we will make faster progress to release. We're just wasting time.

I can't resist observing that if you want beta to happen sooner, it would be better not to commit major and largely unreviewed patches three weeks after the end of the last CommitFest. Before you insist that it was reviewed: the version that was actually committed bore so little resemblance to the versions that were posted earlier that any earlier review was basically meaningless in terms of ensuring that the final product was bug-free, and it wasn't and isn't. I complained *repeatedly* about the need to get both collation support and sync rep finished and committed sooner, for exactly this reason. We are now reaping the entirely predictable fruit of having failed to make that happen. But for those two patches, we would likely be in beta already, or darn close.

http://archives.postgresql.org/pgsql-hackers/2010-12/msg01257.php
http://archives.postgresql.org/pgsql-hackers/2011-01/msg01209.php
http://archives.postgresql.org/pgsql-hackers/2011-01/msg02811.php
http://archives.postgresql.org/pgsql-hackers/2011-02/msg00438.php

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 9.1 Beta
On Fri, Mar 25, 2011 at 8:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> The correct question is whether we're ready for beta, and I would say the answer is clearly no, unless you have a pretty low standard for what "ready for beta" means. Perhaps it would be suitable to discuss what the standard for that really ought to be; but I don't agree in the slightest that we ought to decide based on predetermined calendar dates rather than the state of the code.

Agreed. I think some discussion of which of the things on the open item lists need to be done before beta might be helpful, and we ought to add any items that are not there but are blockers.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
On Mar 25, 2011, at 9:22 PM, Joshua Berkus j...@agliodbs.com wrote:
> Tom,
>> Personally I'd vote for *not* having any such dangerous semantics as that. We should have learned better by now from plpgsql experience. I think the best idea is to throw error for ambiguous references, period.
> As a likely heavy user of this feature, I agree with Tom here. I really don't want the column being silently preferred in SQL functions, when PL/pgSQL functions are throwing an error. I'd end up spending hours debugging this. Also, I don't understand why this would be a dump/reload issue if $1 and $2 continue to work.

Because an identifier that previously referred unambiguously to a column might now be ambiguous, if there is a parameter with the same name.

...Robert
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Robert Haas robertmh...@gmail.com writes:
> On Mar 25, 2011, at 9:22 PM, Joshua Berkus j...@agliodbs.com wrote:
>> Also, I don't understand why this would be a dump/reload issue if $1 and $2 continue to work.
> Because an identifier that previously referred unambiguously to a column might now be ambiguous, if there is a parameter with the same name.

Yes, a function that previously worked might now throw an error, if we make ambiguous names be errors. But this is *not* a failure that would occur during dump/reload. You'd have to actually run the function.

regards, tom lane
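[Tom's point rests on a setting that pg_dump emits near the top of its output; a sketch of the relevant behavior, with the function and table names invented for illustration.]

```sql
-- pg_dump output disables body checking, so a function whose body
-- contains a newly-ambiguous reference still restores cleanly ...
SET check_function_bodies = false;

CREATE FUNCTION f(x integer) RETURNS integer AS $$
    SELECT x FROM some_table   -- not parsed at CREATE time
$$ LANGUAGE sql;

-- ... and any ambiguity error surfaces only when the function is
-- first executed, e.g. SELECT f(1);
```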
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
On Mar 25, 2011, at 9:12 PM, Robert Haas wrote:
> As I've said before, I believe that the root cause of this problem is that using the same syntax for variables and column names is a bad idea in the first place. If we used $foo or ?foo or ${foo} or $.foo or foo!!$#? to mean the parameter called foo, then this would all be a non-issue.

Yes *please*. Man, that would make maintenance of such functions easier.

Best, David
Re: [HACKERS] WIP: Allow SQL-language functions to reference parameters by parameter name
Robert Haas robertmh...@gmail.com writes:
> As I've said before, I believe that the root cause of this problem is that using the same syntax for variables and column names is a bad idea in the first place. If we used $foo or ?foo or ${foo} or $.foo or foo!!$#? to mean the parameter called foo, then this would all be a non-issue.

If this were PL/perl, or PL/almost-anything-except-SQL, I could get behind such a proposal. But it's not, it's SQL; and SQL doesn't do things that way. SQL's idea of disambiguation is qualified names. And even more to the point: to the extent you think that weird syntax might be a suitable solution, you have to keep in mind that the SQL committee could take over any such syntax at the drop of a hat. See the recent unpleasantness concerning => ...

regards, tom lane
[HACKERS] resolving SQL ambiguity (was Re: WIP: Allow SQL-lang funcs to ref params by param name)
Robert Haas wrote:
> On Mar 25, 2011, at 9:22 PM, Joshua Berkus j...@agliodbs.com wrote:
>> Tom,
>>> Personally I'd vote for *not* having any such dangerous semantics as that. We should have learned better by now from plpgsql experience. I think the best idea is to throw error for ambiguous references, period.
>> As a likely heavy user of this feature, I agree with Tom here. I really don't want the column being silently preferred in SQL functions, when PL/pgSQL functions are throwing an error. I'd end up spending hours debugging this. Also, I don't understand why this would be a dump/reload issue if $1 and $2 continue to work.
> Because an identifier that previously referred unambiguously to a column might now be ambiguous, if there is a parameter with the same name.

I mention 2 possible solutions here, both of which involve syntax alterations; each is set off between the ---- lines below. I personally like the second/lower option more.

----
Might it be reasonable, perhaps as a 9.2 feature, to add top-level namespaces (TLNs) so that one could always explicitly qualify what they are referring to? For example, you could have the 3 TLNs sch, lex, attr (I may have missed some useful ones). The sch TLN would unambiguously refer directly to a schema object, such as a database table. The lex TLN would unambiguously refer directly to a lexical, either a parameter of the current routine or a lexical variable. The attr TLN would unambiguously refer to a table/etc column/attribute in the manner typical for SQL. Use them like:

    sch.foo  - the table/etc foo
    lex.foo  - the lexical variable foo
    attr.foo - the column foo

Use of these TLNs is optional where there is no ambiguity. The TLNs are not reserved words, but if one has an entity named the same, then references to it must be TLN-qualified; eg: lex.sch, lex.lex, lex.attr. Now these are just examples; you may find a different set works better.
----

There are also alternate solutions.

----
For example, it could be mandated that lexical-scope aliases for any data/var-like schema object are required in routines, where the aliases are distinct from all lexical vars/params/etc, and then all SQL/code in the routines may only refer to the schema objects by the aliases. Effectively this makes it so that routines can no longer see non-lexical vars except for those from parameters, and this aliasing defines a parameter whose argument is supplied by the DBMS automatically rather than as an explicit routine-caller argument. That way, inside a routine body there are only lexical names for things, and so no namespace qualification is ever needed by the regular SQL.

Similarly, if you always think of table column names as referring to an attribute or element of a table variable, then just reference the column qualified by the table name (or the lexical alias thereof), same as you do in any other programming language. Of course, sometimes you don't have to qualify column-name references, as context could make them unambiguous. Or, a shorthand like a simple leading . could unambiguously say you're referring to a column of the particular table in context. With those in place, all unqualified references are straight to lexical variables or parameters. And so, this is also an effective way to resolve the ambiguity, and I prefer this latter design personally. Here's an example in quasi-PL/PgSQL:

    create function myrtn (myparam integer, mytbl ::= mydb.myschema.mytbl) as
    $body$
    declare
      myvar integer := 5;
    begin
      select (.mycol + myvar * myparam) as mynewcol from mytbl;
    end;
    $body$

Note that I've already thought through this last example, as these methods of avoiding ambiguity are loosely speaking how my language Muldis D avoids the problem faced by many SQL procedures. The .mycol syntax specifically was inspired originally for me by Perl 6, where the lack of anything just before the . means that the implicit topic variable is referred to, as if you said $_.mycol. A Perl 6 analogy being something like:

    $mytbl.map:{ .mycol + $myvar * $myparam }

aka:

    $mytbl.map:{ $_.mycol + $myvar * $myparam }
----

-- Darren Duncan
[HACKERS] Open issues for collations
Robert Haas robertmh...@gmail.com writes:
> I think some discussion of which of the things on the open item lists need to be done before beta might be helpful, and we ought to add any items that are not there but are blockers.

Here's a quick enumeration of some things I think need discussion about the collations patch:

* Are we happy yet with the collation assignment behavior (see parse_collate.c)? A couple of specific subtopics:

** Selecting a field from a record-returning function's output. Currently, we'll use the field's declared collation; except that if the field has default collation, we'll replace that with the common collation of the function's inputs, if any. Is either part of that sane? Do we need to make this work for functions invoked with other syntax than a plain function call, eg operator or cast syntax?

** What to do with domains whose declaration includes a COLLATE clause? Currently, we'll impute that collation to the result of a cast to the domain type --- even if the cast's input expression includes an explicit COLLATE clause. It's not clear that that's per spec. If it is correct, should we behave similarly for functions that are declared to return a domain type? Should it matter if the cast-to-domain is explicit or implicit? Perhaps it'd be best if domain collations only mattered for columns declared with that domain type. Then we'd have a general rule that collations only come into play in an expression as a result of (a) the declared type of a column reference or (b) an explicit COLLATE clause.

* In plpgsql, is it OK for declared local variables to inherit the function's input collation? Should we provide a COLLATE option in variable declarations to let that be overridden? If Oracle understands COLLATE, probably we should look at what they do in PL/SQL.

* RI triggers should insert COLLATE clauses in generated queries to satisfy SQL2008 9.13 SR 4a, which says that RI comparisons use the referenced column's collation. Right now you may get either table's collation depending on which query type is involved. I think an obvious failure may not be possible so long as equality means the same thing in all collations, but it's definitely possible that the planner might decide it can't use the referenced column's unique index, which would suck for performance. (Note: this rule seems to prove that the committee assumes equality can mean different things in different collations, else they'd not have felt the need to specify.)

* It'd sure be nice if we had some nontrivial test cases that work in encodings besides UTF8. I'm still bothered that the committed patch failed to cover single-byte-encoding cases in upper/lower/initcap.

* Remove initdb's warning about useless locales? Seems like pointless noise, or at least something that can be relegated to debug mode.

* Is it worth adding a cares-about-collation flag to pg_proc? Probably too late to be worrying about such refinements for 9.1.

There are a bunch of other minor issues that I'm still working through, but these are the ones that seem to merit discussion.

regards, tom lane
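[For readers following along, a minimal sketch of the two sources of collation Tom proposes as the general rule; table, column, and locale names are invented, and available locale names vary by platform.]

```sql
-- (a) collation from the declared type of a column reference ...
CREATE TABLE names (n text COLLATE "en_US");

-- ... (b) overridden in an expression by an explicit COLLATE clause.
SELECT n FROM names ORDER BY n COLLATE "C";
```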