Re: [HACKERS] Any reason why the default_with_oids GUC is still there?

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 04:18, Josh Berkus wrote:

... or did we just forget to remove it?


Backwards-compatibility? ;-) There hasn't been any pressing reason to 
remove it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Simon Riggs
On Mon, 2010-09-20 at 22:42 +0100, Thom Brown wrote:
 On 20 September 2010 22:14, Robert Haas robertmh...@gmail.com wrote:
  Well, if you need to talk to all the other standbys and see who has
  the furthest-advanced xlog pointer, it seems like you have to have a
  list somewhere of who they all are.
 
 When they connect to the master to get the stream, don't they in
 effect, already talk to the primary with the XLogRecPtr being relayed?
  Can the connection IP, port, XLogRecPtr and request time of the
 standby be stored from this communication to track the states of each
 standby?  They would in effect be registering upon WAL stream
 request... and no doubt this is a horrifically naive view of how it
 works.

It's not viable to record information at the chunk level in that way.

But the overall idea is fine. We can track who was connected and how to
access their LSNs. They don't need to be registered ahead of time on the
master to do that. They can register and deregister each time they
connect.

This discussion is reminiscent of the discussion we had when Fujii first
suggested that the standby should connect to the master. At first I
thought "don't be stupid, the master needs to connect to the standby!".
It stood everything I had thought about on its head and that hurt, but
there was no logical reason to oppose. We could have used standby
registration on the master to handle that, but we didn't. I'm happy that
we have a more flexible system as a result.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 06:00, Tom Lane t...@sss.pgh.pa.us wrote:
 Back here I asked what we were going to do about .gitignore files:
 http://archives.postgresql.org/pgsql-hackers/2010-08/msg01232.php
 The thread died off when the first git conversion attempt crashed and
 burned; but not before it became apparent that we didn't have much
 consensus.  It seemed that there was lack of agreement as to:

 1. Whether to keep the per-subdirectory ignore files (which CVS
 insisted on, but git doesn't) or centralize in a single ignore file.

Both :-)

Wildcard patterns should go in a central one (*.o, for instance, though
I believe that one is excluded by default).

Direct build targets should go in a local one - alongside the Makefile
that builds them.


 2. Whether to have the ignore files ignore common cruft such as
 editor backup files, or only expected build product files.

Editor backup files: no. That should be done locally, because everyone
has a different editor which may have different ideas about that.
Expected build product files: yes, because everybody gets those.


 Although this point wasn't really brought up during that thread, it's
 also the case that the existing implementation is far from consistent
 about ignoring build products.  We really only have .cvsignore entries
 for files that are not in CVS but are meant to be present in
 distribution tarballs.  CVS will, of its own accord, ignore certain
 build products such as .o files; but it doesn't ignore executables for
 instance.  So unless you do a "make distclean" before "cvs update",
 you will get notices about non-ignored files.  That never bothered me
 particularly but I believe it annoys some other folks.  So really there
 is a third area of disagreement:

 3. What are the ignore filesets *for*, in particular should they list
 just the derived files expected in a distribution tarball, or all the
 files in the set of build products in a normal build?

I would like to see us exclude all build products. That'll make "git
status" a lot more useful (which it can be, whereas "cvs status" is
always annoying), particularly if you're working with multiple
branches and stashes and so on.


I assume once we have a decision, we're backporting this to all active
branches, right?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Fujii Masao
On Sat, Sep 18, 2010 at 4:36 AM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On Fri, 2010-09-17 at 21:20 +0900, Fujii Masao wrote:
 What synchronization level does each combination of sync_replication
 and sync_replication_service lead to?

 There are only 4 possible outcomes. There is no combination, so we don't
 need a table like that above.

 The service specifies the highest request type available from that
 specific standby. If someone requests a higher service than is currently
 offered by this standby, they will either
 a) get that service from another standby that does offer that level
 b) automatically downgrade the sync rep mode to the highest available.

 I like the a) part, I can't say the same about the b) part. There's no
 reason to accept to COMMIT a transaction when the requested durability
 is known not to have been reached, unless the user said so.

Yep, I can imagine that some people want to ensure that *all* the
transactions are synchronously replicated to the synchronous standby,
without regard to sync_replication. So I'm not sure automatic
downgrade/upgrade of the mode makes sense. Should we introduce a new
parameter specifying whether to allow automatic downgrade/upgrade?
It seems complicated, though.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] libpq changes for synchronous replication

2010-09-21 Thread Boszormenyi Zoltan
Hi,

Tom Lane wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
   
 It doesn't feel right to always accept PQputCopyData in COPY OUT mode, 
 though. IMHO there should be a new COPY IN+OUT mode.
 

 Yeah, I was going to make the same complaint.  Breaking basic
 error-checking functionality in libpq is not very acceptable.
   

if you looked at my sync replication patch, basically I only added
the checking in PQputCopyData that it's allowed in COPY IN mode
iff the pgconn was set up for replication. I introduced a new libpq
function PQsetDuplexCopy() at the time, but Fujii's idea was that
it could be omitted by using the conn->replication pointer instead.
It seems he forgot about it. Something like this might work:

if (conn->asyncStatus != PGASYNC_COPY_IN &&
    !(conn->asyncStatus == PGASYNC_COPY_OUT &&
      conn->replication && conn->replication[0]))
	...

This way the original error checking is still in place and only
a replication client can do a duplex COPY.

 It should be pretty safe to add a CopyInOutResponse message to the 
 protocol without a protocol version bump. Thoughts on that?
 

 Not if it's something that an existing application might see.  If
 it can only happen in replication mode it's OK.
   

My PQsetDuplexCopy() call was only usable for a replication client,
it resulted in an unknown protocol message for a regular client.
For a replication client, walsender sent an ack and libpq set
the duplex copy flag, so it allowed PQputCopyData while in
COPY OUT. I'd like a little comment from you on whether that's a
good idea, or whether the above check is enough.

 Personally I think this demonstrates that piggybacking replication
 data transfer on the COPY protocol was a bad design to start with.
 It's probably time to split them apart.
   

Best regards,
Zoltán Böszörményi

-- 
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] libpq changes for synchronous replication

2010-09-21 Thread Boszormenyi Zoltan
Simon Riggs wrote:
 On Fri, 2010-09-17 at 18:22 +0900, Fujii Masao wrote:
   
 On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 
 That said, there's a few small things that can be progressed regardless of
 the details of synchronous replication. There's the changes to trigger
 failover with a signal, and it seems that we'll need some libpq changes to
 allow acknowledgments to be sent back to the master regardless of the rest
 of the design. We can discuss those in separate threads in parallel.
   
 Agreed. The attached patch introduces new function which is used
 to send ACK back from walreceiver. The function sends a message
 to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData
 to be called even during COPY OUT.
 

 Does this differ from Zoltan's code?
   

Somewhat. See my other mail to Tom.

Best regards,
Zoltán Böszörményi

-- 
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] bg worker: general purpose requirements

2010-09-21 Thread Markus Wanner
On 09/21/2010 02:49 AM, Robert Haas wrote:
 OK.  At least for me, what is important is not only how many GUCs
 there are but how likely they are to require tuning and how easy it
 will be to know what the appropriate value is.  It seems fairly easy
 to tune the maximum number of background workers, and it doesn't seem
 hard to tune an idle timeout, either.  Both of those are pretty
 straightforward trade-offs between, on the one hand, consuming more
 system resources, and on the other hand, better throughput and/or
 latency.

Hm.. I thought of it the other way around. It's more obvious and direct
for me to determine a min and max for the number of parallel jobs I want
to perform at once, based on the number of spindles, CPUs and/or nodes
in the cluster (in the case of Postgres-R). Admittedly, not necessarily per
database, but at least overall.

I wouldn't know what to set a timeout to. And you didn't make a good
argument for any specific value so far, nor offer any reasoning for
how to find one. It's certainly very workload and feature specific.

 On the other hand, the minimum number of workers to keep
 around per-database seems hard to tune.  If performance is bad, do I
 raise it or lower it?

Same applies for the timeout value.

 And it's certainly not really a hard minimum
 because it necessarily bumps up against the limit on overall number of
 workers if the number of databases grows too large; one or the other
 has to give.

I'd consider the case of min_spare_background_workers * number of
databases > max_background_workers to be a configuration error, about
which the coordinator should warn.

 I think we need to look for a way to eliminate the maximum number of
 workers per database, too.

Okay, might make sense, yes.

Dropping both of these per-database GUCs, we'd simply end up with having
max_background_workers around all the time.

A timeout would mainly help to limit the max amount of time workers sit
around idle. I fail to see how that's more helpful than the proposed
min/max. Quite the opposite, it's impossible to get any useful guarantees.

It assumes that the workload remains the same over time, but doesn't
cope well with sudden spikes and changes in the workload. Unlike the
proposed min/max combination, which forks new bgworkers in advance, even
if the database already uses lots of them. And after the spike, it
quickly reduces the number of spare bgworkers to a certain max. While
not perfect, it's definitely more adaptive to the workload (at least in
the usual case of having only few databases).

Maybe we need a more sophisticated algorithm in the coordinator. For
example measuring the average number of concurrent jobs per database over
time and adjust the number of idle backends according to that, the
current workload and the max_background_workers, or some such. The
min/max GUCs were simply easier to implement, but I'm open to a more
sophisticated thing.

Regards

Markus Wanner



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 So, here, we have two quite different things to be concerned
 about. First is the configuration, and I say that managing a distributed
 setup will be easier for the DBA.

 Yeah, I disagree with that, but I suppose it's a question of opinion.

I'd be willing to share your thoughts if it was only for the initial
setup. This one is hard enough to sketch on the paper that you prefer an
easy way to implement it afterwards, and in some cases a central setup
would be just that.

The problem is that I'm concerned with upgrading the setup once the
system is live. Not at the best time for that in the project, either,
but when you finally get the budget to expand the number of servers.

From experience with skytools, no manual registration works best. But…

 I think that without standby registration it will be tricky to display
 information like the last time that standby foo was connected.
 Yeah, you could set a standby name on the standby server and just have
 the master remember details for every standby name it's ever seen, but
 then how do you prune the list?

… I now realize there are 2 parts under the registration bit. What I
don't see helping is manual registration. For some use cases you're
talking about, maintaining a list of known servers sounds important, and
that's also what londiste is doing.

Pruning the list would be done with some admin function. You need one to
see the current state already, add some other one to unregister a known
standby.

In londiste, that's how it works, and events are kept in the queues for
all known subscribers. For the ones that won't ever connect again,
that's of course a problem, so you SELECT pgq.unregister_consumer(…);.

 Heikki mentioned another application for having a list of the current
 standbys only (rather than every standby that has ever existed)
 upthread: you can compute the exact amount of WAL you need to keep
 around.

Well, either way, the system cannot decide on its own whether a
currently unavailable standby is going to join the party again later
on.

 Now it seems to me that all you need here is the master sending one more
 information with each WAL segment, the currently fsync'ed position,
 which pre-9.1 is implied as being the current LSN from the stream,
 right?

 I don't see how that would help you.

I think you want to refrain from applying any WAL segment you receive at
the standby, and instead only advance as far as the master is known to
have reached. And you want this information to be safe against slave
restart, too: don't replay any WAL you have in pg_xlog or in the archive.

The other part of your proposal is another story (having slaves talk to
each-other at master crash).

 Well, if you need to talk to all the other standbys and see who has
 the furtherst-advanced xlog pointer, it seems like you have to have a
 list somewhere of who they all are.

Ah sorry I was thinking on the other part of the proposal only (sending
WAL segments that have not been fsync'ed yet on the master). So, yes.

But I thought you were saying that replicating a (shared?) catalog of
standbys is technically hard (or impossible), so how would you go about
it? As it's all about making things simpler for the users, you're not
saying that they should keep the main setup in sync manually on all the
standby servers, right?

 Maybe there's some way to get
 this to work without standby registration, but I don't really
 understand the resistance to the idea

In fact I'm now realising what I don't like is having to manually do the
registration work: as I already have to set up the slaves, it only
appears like a useless burden on me, giving information the system
already has.

Automatic registration I'm fine with, I now realize.

Regards,
-- 
Dimitri Fontaine
PostgreSQL DBA, Architecte



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Fujii Masao
On Sun, Sep 19, 2010 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Sep 18, 2010 at 5:42 PM, Josh Berkus j...@agliodbs.com wrote:
 There are considerable benefits to having a standby registry with a
 table-like interface.  Particularly, one where we could change
 replication via UPDATE (or ALTER STANDBY) statements.

 I think that using a system catalog for this is going to be a
 non-starter, but we could use a flat file that is designed to be
 machine-editable (and thus avoid repeating the mistake we've made with
 postgresql.conf).

Yep, the standby registration information should be accessible and
changeable while the server is not running. So using only a system
catalog is not an answer.

My patch has implemented the standbys.conf file which was proposed before.
Its format is almost the same as pg_hba.conf's. Do you think this is
machine-editable? If not, should we change the format to
something like XML?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Thom Brown
On 21 September 2010 09:29, Fujii Masao masao.fu...@gmail.com wrote:
 On Sun, Sep 19, 2010 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Sep 18, 2010 at 5:42 PM, Josh Berkus j...@agliodbs.com wrote:
 There are considerable benefits to having a standby registry with a
 table-like interface.  Particularly, one where we could change
 replication via UPDATE (or ALTER STANDBY) statements.

 I think that using a system catalog for this is going to be a
 non-starter, but we could use a flat file that is designed to be
 machine-editable (and thus avoid repeating the mistake we've made with
 postgresql.conf).

 Yep, the standby registration information should be accessible and
 changeable while the server is not running. So using only a system
 catalog is not an answer.

 My patch has implemented the standbys.conf file which was proposed before.
 Its format is almost the same as pg_hba.conf's. Do you think this is
 machine-editable? If not, should we change the format to
 something like XML?

I really don't think an XML config would improve anything.  In fact it
would just introduce more ways to break the config by the mere fact it
has to be well-formed.  I'd be in favour of one similar to
pg_hba.conf, because then, at least, we'd still only have 2 formats of
configuration.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Dave Page
On Tue, Sep 21, 2010 at 9:34 AM, Thom Brown t...@linux.com wrote:
 I really don't think an XML config would improve anything.  In fact it
 would just introduce more ways to break the config by the mere fact it
 has to be well-formed.  I'd be in favour of one similar to
 pg_hba.conf, because then, at least, we'd still only have 2 formats of
 configuration.

Want to spend a few days hacking on a config editor for pgAdmin, and
then re-evaluate that comment?

:-)

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] SHOW TABLES

2010-09-21 Thread Boszormenyi Zoltan
Guillaume Lelarge wrote:
 On 15/07/2010 17:48, Joshua D. Drake wrote:
   
 On Thu, 2010-07-15 at 16:20 +0100, Simon Riggs wrote:
 
 On Thu, 2010-07-15 at 11:05 -0400, Tom Lane wrote:
   
 Simon Riggs si...@2ndquadrant.com writes:
 
 The biggest turn off that most people experience when using PostgreSQL
 is that psql does not support memorable commands.
   
 I would like to implement the following commands as SQL, allowing them
 to be used from any interface.
   
 SHOW TABLES
 SHOW COLUMNS
 SHOW DATABASES
   
 This has been discussed before, and rejected before.  Please see
 archives.
 
 Many years ago. I think it's worth revisiting now in light of the number
 of people now joining the PostgreSQL community and the greater
 prevalence of other ways of doing it. The world has changed, we have not.

 I'm not proposing any change in function, just a simpler syntax to allow
 the above information to be available, for newbies.

 Just for the record, I've never ever met anyone that said "Oh, this \d
 syntax makes so much sense. I'm a real convert to Postgres now you've
 shown me this." The reaction is always the opposite one; always
 negative. Which detracts from our efforts elsewhere.
   
 I have to agree with Simon here. \d is ridiculous for the common user.

 SHOW TABLES and SHOW COLUMNS make a lot of sense, just as something like
 DESCRIBE TABLE foo makes a lot more sense than \d.

 

 And would you add the complete syntax? I mean:

   SHOW [OPEN] TABLES [FROM db_name] [LIKE 'pattern']

 I'm wondering what one can do with the [FROM db_name] clause :)
   

I think it's related to making this work:
SELECT * FROM db.schema.table;

Best regards,
Zoltán Böszörményi

-- 
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Thom Brown
On 21 September 2010 09:37, Dave Page dp...@pgadmin.org wrote:
 On Tue, Sep 21, 2010 at 9:34 AM, Thom Brown t...@linux.com wrote:
 I really don't think an XML config would improve anything.  In fact it
 would just introduce more ways to break the config by the mere fact it
 has to be well-formed.  I'd be in favour of one similar to
 pg_hba.conf, because then, at least, we'd still only have 2 formats of
 configuration.

 Want to spend a few days hacking on a config editor for pgAdmin, and
 then re-evaluate that comment?

It would be quicker to add in support for a config format we don't use
yet than to duplicate support for a new config in the same format as
an existing one?  Plus it's a compromise between user-screw-up-ability
and machine-readability.

My fear would be that standby.conf would be edited by users who don't
really know XML and then we'd have 3 different styles of config to
tell the user to edit.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935



Re: [HACKERS] ALTER TYPE extensions

2010-09-21 Thread KaiGai Kohei
Sorry, I missed a bug: creating a typed table using a composite
type which has been altered fails.

  postgres=# CREATE TYPE comp_1 AS (x int, y int, z int);
  CREATE TYPE
  postgres=# ALTER TYPE comp_1 DROP ATTRIBUTE y;
  ALTER TYPE
  postgres=# CREATE TABLE t1 OF comp_1;
  ERROR:  cache lookup failed for type 0
  postgres=# SELECT attname, attnum, attisdropped FROM pg_attribute
WHERE attrelid = 'comp_1'::regclass;
       attname    | attnum | attisdropped
   ----------------+--------+--------------
    x              |      1 | f
    pg.dropped.2   |      2 | t
    z              |      3 | f
   (3 rows)

Perhaps we also need to patch transformOfType() to skip
attributes with attisdropped set.

An additional question: it seems to me that we can remove all the
attributes from a composite type, although CREATE TYPE prohibits
creating a composite type without any attributes.
What does a composite type with no attributes mean?
Or do we need a restriction to prevent dropping the last attribute?

Rest of comments are below.

(2010/09/18 5:44), Peter Eisentraut wrote:
 On fre, 2010-09-17 at 18:15 +0900, KaiGai Kohei wrote:
 * At the ATPrepAddColumn(), it seems to me someone added a check
to prevent adding a new column to typed table, as you try to
add in this patch.
 
 Good catch.  Redundant checks removed.
 
OK,

 * At the ATPrepAlterColumnType(), you enclosed an existing code
block by if (tab-relkind == RELKIND_RELATION) { ... }, but
it is not indented to appropriate level.
 
 Yeah, just to keep the patch small. ;-)
 
Hmm...
Although I expect the patched routine should also follow the common
coding style in spite of the patch size, it may not be a thing that
I should decide here.
So I'd like to entrust this decision to the committer. OK?

 * RENAME ATTRIBUTE ... TO ...

Even if the composite type to be altered is in use, we can alter
the name of attribute. Is it intended?
 
 No.  Added a check for it now.
 
OK,

 BTW, is there any requirement from the SQL standard about the behavior
 when we try to add/drop an attribute of a composite type in use?
 This patch always prohibits it, using find_typed_table_dependencies()
 and find_composite_type_dependencies().
 However, it seems to me it would not be difficult to alter columns of
 typed tables subsequent to this ALTER TYPE, although it might not be
 easy to alter definitions of embedded composite types already
 in use.
 Of course, it may be our future work. If so, it's good.
 
 The prohibition on altering types that are used in typed tables is
 actually from the SQL standard.  But for now it's just because it's not
 implemented; I plan to work on extending that later.
 
 The restriction by find_composite_type_dependencies() was already there
 for altering tables, and I just kept it the same for now.
 
Thanks for your explanation. It made me clear.

Thanks,
-- 
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Fujii Masao
On Mon, Sep 20, 2010 at 3:27 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 However, the wait forever behavior becomes useful if you have a monitoring
 application outside the DB that decides when enough is enough and tells the
 DB that the slave can be considered dead. So wait forever actually means
 wait until I tell you that you can give up. The monitoring application can
 STONITH to ensure that the slave stays down, before letting the master
 proceed with the commit.

This is also useful for preventing a failover from causing data loss
by promoting a lagged standby to master. To avoid any data loss, we
must STONITH the standby before any transactions resume on the master,
whenever the replication connection is terminated or the standby crashes.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 11:52, Thom Brown wrote:

My fear would be standby.conf would be edited by users who don't
really know XML and then we'd have 3 different styles of config to
tell the user to edit.


I'm not a big fan of XML either. That said, the format could use some 
hierarchy. If we add many more per-server options, one server per line 
will quickly become unreadable.


Perhaps something like the ini-file syntax Robert Haas just made up 
elsewhere in this thread:


---
globaloption1 = value

[servername1]
synchronization_level = async
option1 = value

[servername2]
synchronization_level = replay
option2 = value1
---

I'm not sure I like the ini-file style much, but the two-level structure 
it provides seems like a perfect match.


Then again, maybe we should go with something like JSON or YAML, which 
would allow deeper hierarchies for the sake of future expandability. Oh, 
and there's Dimitri's idea of service levels for per-transaction control 
(http://archives.postgresql.org/message-id/m2sk1868hb@hi-media.com):



  sync_rep_services = {critical: recv=2, fsync=2, replay=1;
   important: fsync=3;
   reporting: recv=2, apply=1}


We'll need to accommodate something like that too.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Git conversion status

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 05:38, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Ok, I've pushed a new repository to both gitmaster and the
 postgresql-migration.git mirror, that has this setting.
 NOTE! Do a complete wipe of your repository before you clone this
 again - it's a completely new repo that will have different SHA1s.

 AFAICT this version is good: it passes comparisons against all the
 historical tarballs I have, as well as against my checked-out copies of
 branch tips.  History looks sane as best I can tell, too.  I'm ready
 to sign off on this.

Great. All my scripts and manual checks says it's fine too, so...


 NOTE: Magnus told me earlier that the new repository isn't ready to
 accept commits, so committers please hold your fire till he gives
 the all-clear.  It looks okay to clone this and start working locally,
 though.

It is now ready to go. The scripts should be in place, and I've
verified both disallowed and allowed commits. Commit messages seem to
be working.

Do keep an eye on things in the beginning, of course. And remember
that if you do a commit, it might end up getting graylisted by the
antispam servers the first time, so it might not show up right away.


 For the archives' sake, below are the missing historical tags that
 match available tarballs, plus re-instantiation of the Release_2_0
 and Release_2_0_0 tags on non-manufactured commits.  I will push
 these up to the repo once it's open for pushing.

Go for it.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I suppose you already know my votes, but here they are again just in case.
 ...
 Centralize.
 ...
 All the build products in a normal build.

 I don't understand your preference for this together with a centralized
 ignore file.  That will be completely unmaintainable IMNSHO.  A
 centralized file would work all right if it's limited to the couple
 dozen files that are currently listed in .cvsignore's, but I can't see
 doing it that way if it has to list every executable and .so built
 anywhere in the tree.  You'd get merge conflicts from
 completely-unrelated patches, not to mention the fundamental
 action-at-a-distance nastiness of a top-level file that knows about
 everything going on in every part of the tree.

Oh.  I was just figuring it would be pretty easy to regenerate from
the output of git status.  You might have merge conflicts but they'll
be trivial.  But then again, the effort of breaking up the output of
git status into individual per-directory files is probably largely a
one-time effort, so maybe it doesn't matter very much.
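The splitting Robert describes is mechanical; here's a sketch of the idea
(function name invented), grouping the untracked entries from `git status
--porcelain` output by directory:

```python
from collections import defaultdict

def split_untracked(porcelain_lines):
    """Group untracked paths from 'git status --porcelain' output
    into per-directory ignore lists."""
    by_dir = defaultdict(list)
    for line in porcelain_lines:
        # Untracked entries are prefixed with "?? "; skip everything else.
        if not line.startswith("?? "):
            continue
        path = line[3:].rstrip("/")
        directory, _, name = path.rpartition("/")
        by_dir[directory or "."].append(name)
    return dict(by_dir)

status = [
    "?? src/timezone/zic",
    "?? src/interfaces/ecpg/ecpglib/libecpg.so.6",
    "?? GNUmakefile",
]
print(split_untracked(status))
```

Each key would then become one per-directory .gitignore file.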

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] SHOW TABLES

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 4:52 AM, Boszormenyi Zoltan z...@cybertec.at wrote:
 I think it's related to making this work:
    SELECT * FROM db.schema.table;

Which is a non-starter, I think.  Every function in the system that
thinks an OID uniquely identifies a database object would need to be
modified, or else you'd need unique indices that can span tables in
multiple different databases.  It would also require blowing a massive
hole in the isolation wall between databases, and reengineering of
every place that thinks a backend can be connected to only one
database at a time.  None of which would be good for either code
stability or performance.

The only way I can imagine making this work is if any references of
that type got treated like foreign tables: spawn a helper backend
connected to the correct DB (failing if you haven't permissions), and
then stream the tuples back to the main backend from there.
Considering the amount of effort that would be required for the amount
of benefit you'd actually derive from it, I doubt anyone is likely to
tackle this any time soon...
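For what it's worth, contrib/dblink already gives a crude form of the
helper-backend approach described above: a second connection runs the query
in the other database and streams the tuples back. A sketch (database,
table, and column names are invented):

```sql
-- Rough stand-in for "SELECT * FROM otherdb.public.some_table":
-- dblink opens a helper connection to the other database and the
-- result is streamed back as a record set we must type explicitly.
SELECT *
FROM dblink('dbname=otherdb',
            'SELECT id, name FROM public.some_table')
     AS t(id integer, name text);
```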

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 13:12, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Sep 21, 2010 at 1:06 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I suppose you already know my votes, but here they are again just in case.
 ...
 Centralize.
 ...
 All the build products in a normal build.

 I don't understand your preference for this together with a centralized
 ignore file.  That will be completely unmaintainable IMNSHO.  A
 centralized file would work all right if it's limited to the couple
 dozen files that are currently listed in .cvsignore's, but I can't see
 doing it that way if it has to list every executable and .so built
 anywhere in the tree.  You'd get merge conflicts from
 completely-unrelated patches, not to mention the fundamental
 action-at-a-distance nastiness of a top-level file that knows about
 everything going on in every part of the tree.

 Oh.  I was just figuring it would be pretty easy to regenerate from
 the output of git status.  You might have merge conflicts but they'll
 be trivial.  But then again, the effort of breaking up the output of
 git status into individual per-directory files is probably largely a
 one-time effort, so maybe it doesn't matter very much.

Breaking it up was quite trivial. Here's what I came up with after
building on my box. I'm sure there are some on other platforms showing
up, but this should be the majority.

I just realized it does not include contrib, but that's a mechanical
copy of the same thing.

So if we want to go this way, I have the scripts/changes ready :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


gitignore.patch
Description: Binary data



Re: [HACKERS] Basic JSON support

2010-09-21 Thread Itagaki Takahiro
On Mon, Sep 20, 2010 at 1:38 PM, Joseph Adams
joeyadams3.14...@gmail.com wrote:
 I have written a patch that amends the basic_json-20100915.patch .

Thanks. I merged your patch and added json_to_array() as a demonstration
of json_stringify(). In the current code, json_stringify(json) just returns
the input text as-is, but json_stringify(json, NULL) trims all unnecessary
whitespace. We could do that in json_in() and json_parse() instead and
always store values in a compressed representation. That's still open for
discussion.
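To illustrate what the whitespace-trimming form amounts to (the idea only,
not the patch's C implementation):

```python
import json

def compact(text):
    """Re-serialize a JSON document with all insignificant
    whitespace removed, keeping the value itself unchanged."""
    return json.dumps(json.loads(text), separators=(",", ":"))

print(compact('{ "a" : [ 1 , 2 ] }'))  # → {"a":[1,2]}
```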

I also merged json_test_strings.sql into the main test file.
I slimmed down the tests a bit -- many cases seemed to be duplicated.

 I went ahead and added json_validate() now because it's useful for
 testing (my test strings use it).

Good idea, but how about calling it json_is_well_formed()? We have
similar name of functions for xml type. I renamed it in the patch.


 Here's one thing I'm worried about: the bison/flex code in your patch
 looks rather similar to the code in
 http://www.jsonlint.com/bin/jsonval.tgz , which is licensed under the
 GPL.  In particular, the incorrect number regex I discussed above can
 also be found in jsonval verbatim.  However, because there are a lot
 of differences in both the bison and flex code now,  I'm not sure
 they're close enough to be copied, but I am not a lawyer.  It might
 be a good idea to contact Ben Spencer and ask him for permission to
 license our modified version of the code under PostgreSQL's more
 relaxed license, just to be on the safe side.

Sorry for my carelessness. I did indeed read his code.
Do you know his contact address? I cannot find it...

-- 
Itagaki Takahiro


basic_json-20100921.patch
Description: Binary data



Re: [HACKERS] Basic JSON support

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 8:38 AM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 Sorry for my carelessness. I did indeed read his code.
 Do you know his contact address? I cannot find it...

It alarms me quite a bit that someone who is a committer on this
project would accidentally copy code from another project with a
different license into PostgreSQL.  How does that happen?  And how
much got copied, besides the regular expression?  I would be inclined
to flush this patch altogether rather than take ANY risk of GPL
contamination.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Basic JSON support

2010-09-21 Thread Itagaki Takahiro
On Tue, Sep 21, 2010 at 9:54 PM, Robert Haas robertmh...@gmail.com wrote:
 It alarms me quite a bit that someone who is a committer on this
 project would accidentally copy code from another project with a
 different license into PostgreSQL.  How does that happen?  And how
 much got copied, besides the regular expression?  I would be inclined
 to flush this patch altogether rather than take ANY risk of GPL
 contamination.

Only the regular expressions in the scanner, so I thought it was OK --
but I should have been more careful. Sorry.

-- 
Itagaki Takahiro



Re: [HACKERS] bg worker: general purpose requirements

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 4:23 AM, Markus Wanner mar...@bluegap.ch wrote:
 On 09/21/2010 02:49 AM, Robert Haas wrote:
 OK.  At least for me, what is important is not only how many GUCs
 there are but how likely they are to require tuning and how easy it
 will be to know what the appropriate value is.  It seems fairly easy
 to tune the maximum number of background workers, and it doesn't seem
 hard to tune an idle timeout, either.  Both of those are pretty
 straightforward trade-offs between, on the one hand, consuming more
 system resources, and on the other hand, better throughput and/or
 latency.

 Hm.. I thought of it the other way around. It's more obvious and direct
 for me to determine a min and max of the amount of parallel jobs I want
 to perform at once. Based on the number of spindles, CPUs and/or nodes
 in the cluster (in case of Postgres-R). Admittedly, not necessarily per
 database, but at least overall.

Wait, are we in violent agreement here?  An overall limit on the
number of parallel jobs is exactly what I think *does* make sense.
It's the other knobs I find odd.

 I wouldn't known what to set a timeout to. And you didn't make a good
 argument for any specific value so far. Nor did you offer a reasoning
 for how to find one. It's certainly very workload and feature specific.

I think my basic contention is that it doesn't matter very much, so
any reasonable value should be fine.  I think 5 minutes will be good
enough for 99% of cases.  But if you find that this leaves too many
extra backends around and you start to run out of file descriptors or
your ProcArray gets too full, then you might want to drop it down.
Conversely, if you want to fine-tune your system for sudden load
spikes, you could raise it.

 I'd consider the case of min_spare_background_workers * number of
 databases > max_background_workers to be a configuration error, about
 which the coordinator should warn.

The number of databases isn't a configuration parameter.  Ideally,
users shouldn't have to reconfigure the system because they create
more databases.

 I think we need to look for a way to eliminate the maximum number of
 workers per database, too.

 Okay, might make sense, yes.

 Dropping both of these per-database GUCs, we'd simply end up with having
 max_background_workers around all the time.

 A timeout would mainly help to limit the max amount of time workers sit
 around idle. I fail to see how that's more helpful than the proposed
 min/max. Quite the opposite, it's impossible to get any useful guarantees.

 It assumes that the workload remains the same over time, but doesn't
 cope well with sudden spikes and changes in the workload.

I guess we differ on the meaning of "cope well"...  being able to spin
up 18 workers in one second seems very fast to me.  How many do you
expect to ever need?!!

 Unlike the
 proposed min/max combination, which forks new bgworkers in advance, even
 if the database already uses lots of them. And after the spike, it
 quickly reduces the amount of spare bgworkers to a certain max. While
 not perfect, it's definitely more adaptive to the workload (at least in
 the usual case of having only few databases).

 Maybe we need a more sophisticated algorithm in the coordinator. For
 example measuring the avg. amount of concurrent jobs per database over
 time and adjust the number of idle backends according to that, the
 current workload and the max_background_workers, or some such. The
 min/max GUCs were simply easier to implement, but I'm open to a more
 sophisticated thing.

Possibly, but I'm still having a hard time understanding why you need
all the complexity you already have.  The way I'd imagine doing this
is:

1. If a new job arrives, and there is an idle worker available for the
correct database, then allocate that worker to that job.  Stop.
2. Otherwise, if the number of background workers is less than the
maximum number allowable, then start a new worker for the appropriate
database and allocate it to the new job.  Stop.
3. Otherwise, if there is at least one idle background worker, kill it
and start a new one for the correct database.  Allocate that new
worker to the new job.  Stop.
4. Otherwise, you're already at the maximum number of background
workers and they're all busy.  Wait until some worker finishes a job,
and then try again beginning with step 1.

When a worker finishes a job, it hangs around for a few minutes to see
if it gets assigned a new job (as per #1) and then exits.
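The four steps above can be sketched in Python (the worker representation
here is invented for illustration; real bgworkers would of course live in
shared memory and be tracked by the coordinator):

```python
def assign_worker(job_db, workers, max_workers):
    """Allocate a worker for a job in database job_db, following the
    four steps above.  Each worker is a dict {'db': str, 'busy': bool}.
    Returns the assigned worker, or None if the caller must wait."""
    # 1. Idle worker already connected to the right database?
    for w in workers:
        if not w['busy'] and w['db'] == job_db:
            w['busy'] = True
            return w
    # 2. Below the maximum: start a fresh worker for this database.
    if len(workers) < max_workers:
        w = {'db': job_db, 'busy': True}
        workers.append(w)
        return w
    # 3. At the maximum: kill an idle worker in some other database
    #    and start a replacement in the correct one.
    for w in workers:
        if not w['busy']:
            workers.remove(w)
            new = {'db': job_db, 'busy': True}
            workers.append(new)
            return new
    # 4. All workers busy: caller waits for one to finish, then retries.
    return None
```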

Although there are other tunables that can be exposed, I would expect,
in this design, that the only thing most people would need to adjust
would be the maximum pool size.

It seems (to me) like your design is being driven by start-up latency,
which I just don't understand.  Sure, 50 ms to start up a worker isn't
fantastic, but the idea is that it won't happen much because there
will probably already be a worker in that database from previous
activity.  The only exception is when there's a sudden surge of activity.

Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Peter Eisentraut
On tis, 2010-09-21 at 00:55 -0400, Robert Haas wrote:
 One of the infelicities of
 git is that 'git status' shows the untracked files at the bottom.  So
 if you have lots of unignored stuff floating around, the information
 about which files you've actually changed or added to the index
 scrolls right off the screen.

Perhaps you knew this, but 'git status -uno' is moderately useful
against that.




Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 16:27, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-21 at 00:55 -0400, Robert Haas wrote:
 One of the infelicities of
 git is that 'git status' shows the untracked files at the bottom.  So
 if you have lots of unignored stuff floating around, the information
 about which files you've actually changed or added to the index
 scrolls right off the screen.

 Perhaps you knew this, but 'git status -uno' is moderately useful
 against that.

It is, but that one has the problem of not showing any untracked files
- so if you forgot to add a file/directory you *wanted* to be added,
it will also be hidden with -uno.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Peter Eisentraut
On tis, 2010-09-21 at 00:00 -0400, Tom Lane wrote:
 3. What are the ignore filesets *for*, in particular should they list
 just the derived files expected in a distribution tarball, or all the
 files in the set of build products in a normal build?

My personal vote: Forget the whole thing.

I have never found the .cvsignore files useful for anything, but they
have only been a small annoyance when someone else quietly updated them
when I supposedly forgot.  Some of the newly proposed schemes
for .gitignore appear to be significantly more involved.





Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 8:12 AM, Magnus Hagander mag...@hagander.net wrote:
 Breaking it up was quite trivial. Here's what I came up with after
 building on my box. I'm sure there are some on other platforms showing
 up, but this should be the majority.

 I just realized it does not include contrib, but that's a mechanical
 copy of the same thing.

 So if we want to go this way, I have the scripts/changes ready :)

Sounds good to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On tis, 2010-09-21 at 00:00 -0400, Tom Lane wrote:
 3. What are the ignore filesets *for*, in particular should they list
 just the derived files expected in a distribution tarball, or all the
 files in the set of build products in a normal build?

 My personal vote: Forget the whole thing.

The folks who are more familiar with git than I seem to be pretty clear
that we need to ignore all build products.  I don't think that "ignore
nothing" is going to work pleasantly at all.  On reflection I realize
that cvs ignore and git ignore are considerably different because they
come into play at different times: cvs ignore really only matters while
doing cvs update to pull in new code, while git ignore matters while
you're constructing a commit.  So you really do need git ignore to
ignore all build products; otherwise you'll have lots of chatter in
git status.

regards, tom lane



Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Robert Haas
On Mon, Sep 20, 2010 at 11:31 AM, Colin 't Hart colinth...@gmail.com wrote:
 I think to_date is the wrong gadget to use here. You should probably be 
 using the date input routine and trapping any data exception. e.g.:

    test_date := date_in(textout(some_text));

 In plpgsql you'd put that inside a begin/exception/end block that traps 
 SQLSTATE '22000' which is the class covering data exceptions.

 So it's not possible using pure SQL unless one writes a function?

I think that is true.

 Are the is_type family of functions still desired?

I think it would be useful to have a way of testing whether a cast to
a given type will succeed.  The biggest problem with the
exception-catching method is not that it requires writing a function
(which, IMHO, is no big deal) but that exception handling is pretty
slow and inefficient.  You end up doing things like... write a regexp
to see whether the data is in approximately the right format and then
if it is try the cast inside an exception block.  Yuck.
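For the archives, a minimal sketch of the begin/exception/end approach
described earlier in the thread -- the function name is invented, and
`data_exception` is the condition name for SQLSTATE class 22 (data
exceptions):

```sql
-- Hypothetical is_date(): true if the text casts cleanly to date.
CREATE OR REPLACE FUNCTION is_date(s text) RETURNS boolean AS $$
BEGIN
    PERFORM s::date;       -- attempt the cast, discard the result
    RETURN true;
EXCEPTION WHEN data_exception THEN
    RETURN false;          -- any class-22 error means "not a date"
END;
$$ LANGUAGE plpgsql STABLE;

SELECT is_date('2010-09-21');  -- true
SELECT is_date('not a date');  -- false
```

This works, but as noted above, each trapped exception sets up a
subtransaction, which is why it's slow for bulk validation.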

(On the other hand, whether the work that was done in 2002 is still
relevant to today's code is questionable.  Things have changed a lot.)

 Also, where are the to_type conversions done?

I think maybe you are looking for the type input functions?

select typname, typinput::regprocedure from pg_type;

There are also some functions with names of the form to_type.  You
can get a list of those with the following psql command:

\dfS to_*

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 11:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On tis, 2010-09-21 at 00:00 -0400, Tom Lane wrote:
 3. What are the ignore filesets *for*, in particular should they list
 just the derived files expected in a distribution tarball, or all the
 files in the set of build products in a normal build?

 My personal vote: Forget the whole thing.

 The folks who are more familiar with git than I seem to be pretty clear
 that we need to ignore all build products.  I don't think that ignore
 nothing is going to work pleasantly at all.  On reflection I realize
 that cvs ignore and git ignore are considerably different because they
 come into play at different times: cvs ignore really only matters while
 doing cvs update to pull in new code, while git ignore matters while
 you're constructing a commit.  So you really do need git ignore to
 ignore all build products; otherwise you'll have lots of chatter in
 git status.

Back when I used CVS for anything, I used to use 'cvs -q update -d'
somewhat the way I now use 'git status', so I've always been in favor
of ignoring all the build products.  But it is true that you tend to
use 'git status' even a bit more, because you typically want to make
sure you've staged everything correctly before committing (unless, of
course, you always just do git commit -a, but that doesn't describe my
workflow very well).  At any rate, whatever the reasons, I'll be very,
very happy if we can settle on a rule to ignore all build products.
FWIW, man gitignore has these comments.

# A project normally includes such .gitignore files
# in its repository, containing patterns for files generated as part
# of the project build.

and further down:

# Patterns which a user wants git to
# ignore in all situations (e.g., backup or temporary files generated by
# the user's editor of choice) generally go into a file specified by
# core.excludesfile in the user's ~/.gitconfig.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Peter Eisentraut
On tis, 2010-09-21 at 14:12 +0200, Magnus Hagander wrote:
 Breaking it up was quite trivial. Here's what I came up with after
 building on my box. I'm sure there are some on other platforms showing
 up, but this should be the majority.

Note that shared library names are platform dependent.




Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21/09/10 11:52, Thom Brown wrote:
 My fear would be standby.conf would be edited by users who don't
 really know XML and then we'd have 3 different styles of config to
 tell the user to edit.

 I'm not a big fan of XML either.
 ...
 Then again, maybe we should go with something like json or yaml

The fundamental problem with all those machine editable formats is
that they aren't people editable.  If you have to have a tool (other
than a text editor) to change a config file, you're going to be very
unhappy when things are broken at 3AM and you're trying to fix it
while ssh'd in from your phone.

I think the ini file format suggestion is probably a good one; it
seems to fit this problem, and it's something that people are used to.
We could probably shoehorn the info into a pg_hba-like format, but
I'm concerned about whether we'd be pushing that format beyond what
it can reasonably handle.
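For concreteness, an ini-style standby.conf along these lines might look
something like this -- section names and keys are purely invented, not a
settled design:

```ini
# Hypothetical standby.conf: one section per standby,
# with optional per-server attributes.
[standby1]
synchronous = on
timeout = 30s

[standby2]
synchronous = off
```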

regards, tom lane



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 18:02, Tom Lane wrote:

Peter Eisentrautpete...@gmx.net  writes:

On tis, 2010-09-21 at 00:00 -0400, Tom Lane wrote:

3. What are the ignore filesets *for*, in particular should they list
just the derived files expected in a distribution tarball, or all the
files in the set of build products in a normal build?



My personal vote: Forget the whole thing.


The folks who are more familiar with git than I seem to be pretty clear
that we need to ignore all build products.  I don't think that ignore
nothing is going to work pleasantly at all.  On reflection I realize
that cvs ignore and git ignore are considerably different because they
come into play at different times: cvs ignore really only matters while
doing cvs update to pull in new code, while git ignore matters while
you're constructing a commit.  So you really do need git ignore to
ignore all build products; otherwise you'll have lots of chatter in
git status.


Agreed. It's not a big deal though; until now I've just always used "git
status | less" and scrolled up to the beginning, ignoring the chatter.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21/09/10 11:52, Thom Brown wrote:
 My fear would be standby.conf would be edited by users who don't
 really know XML and then we'd have 3 different styles of config to
 tell the user to edit.

 I'm not a big fan of XML either.
 ...
 Then again, maybe we should go with something like json or yaml

 The fundamental problem with all those machine editable formats is
 that they aren't people editable.  If you have to have a tool (other
 than a text editor) to change a config file, you're going to be very
 unhappy when things are broken at 3AM and you're trying to fix it
 while ssh'd in from your phone.

Agreed.  Although, if things are broken at 3AM and I'm trying to fix
it while ssh'd in from my phone, I reserve the right to be VERY
unhappy no matter what format the file is in.  :-)

 I think the ini file format suggestion is probably a good one; it
 seems to fit this problem, and it's something that people are used to.
 We could probably shoehorn the info into a pg_hba-like format, but
 I'm concerned about whether we'd be pushing that format beyond what
 it can reasonably handle.

It's not clear how many attributes we'll want to associate with a
server.  Simon seems to think we can keep it to zero; I think it's
positive but I can't say for sure how many there will eventually be.
It may also be that a lot of the values will be optional things that
are frequently left unspecified.  Both of those make me think that a
columnar format is probably not best.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Breaking it up was quite trivial. Here's what I came up with after
 building on my box. I'm sure there are some on other platforms showing
 up, but this should be the majority.

 I just realized it does not include contrib, but that's a mechanical
 copy of the same thing.

 So if we want to go this way, I have the scripts/changes ready :)

This works for me, modulo some things:

If we are going to ignore *.so at the top level, we also need to ignore
*.sl (for HPUX) and *.dll (for Windows).  I also wonder why we have
entries like this:

 +libecpg.a
 +libecpg.so.*

rather than global ignore patterns for *.a and *.so.[0-9]

We should probably ignore src/Makefile.custom, since that is still a
supported way to customize builds (and some of us still use it).

 diff --git a/src/timezone/.gitignore b/src/timezone/.gitignore
 new file mode 100644
 index 000..f844c9f
 --- /dev/null
 +++ b/src/timezone/.gitignore
 @@ -0,0 +1 @@
 +/zic

Why does this entry have a / when none of the rest do?  Shouldn't
we be consistent about that?
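On the leading slash: a pattern with a leading "/" is anchored to the
directory containing that .gitignore file, while a bare name also matches
in any subdirectory below it, so the two styles aren't equivalent. For
example:

```gitignore
# Top-level .gitignore: global build-product patterns
*.o
*.a
*.so
*.so.[0-9]
*.sl
*.dll

# src/timezone/.gitignore: "/zic" matches only src/timezone/zic,
# whereas a bare "zic" would also match a zic in any subdirectory.
/zic
```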

regards, tom lane



[HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Kevin Grittner
I just went to do my usual merge from the git version of HEAD (at
git://git.postgresql.org/git/postgresql.git), and it seemed to be
doing an awful lot of work to prepare to attempt the merge.  That
leads me to think that the newly converted git, or a copy of it, is
now at that location, which is cool.  But I have concerns about what
to do with my development branch off the old one.
 
I'm afraid that in spite of several attempts, I don't yet properly
have my head around the git approach, and fear that I'll muck things
up without a little direction; and I'd be surprised if I'm the only
one in this position.
 
Can someone give advice, preferably in the form of a recipe, for
how to set up a new repo here based on the newly converted repo, and
merge the work from my branch (with all the related history) into a
branch off the new repo?
 
Thanks for any advice.
 
-Kevin



Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-21 Thread David Fetter
On Mon, Sep 20, 2010 at 05:48:40PM -0700, fazool mein wrote:
 Hi,
 
 I want to shut down the server under certain conditions that can be
 checked inside a backend process.  For instance, while running
 symmetric

Synchronous?

 replication, if the primary dies, I want the the walreceiver to
 detect that and shutdown the standby.  The reason for shutdown is
 that I want to execute some other stuff before I start the standby
 as a primary.  Creating a trigger file doesn't help as it converts
 the standby into primary at run time.
 
 Using proc_exit() inside walreceiver only terminates the walreceiver
 process, which postgres starts again.  The other way I see is using
 ereport(PANIC, ...).  Is there some other way to shutdown the main
 server from within a backend process?

Perhaps I've misunderstood, but since there's already Something
Else(TM) which takes actions, why not send a message to it so it can
take appropriate action on the node, starting with shutting it down?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] bg worker: general purpose requirements

2010-09-21 Thread Markus Wanner
On 09/21/2010 03:46 PM, Robert Haas wrote:
 Wait, are we in violent agreement here?  An overall limit on the
 number of parallel jobs is exactly what I think *does* make sense.
 It's the other knobs I find odd.

Note that the max setting I've been talking about here is the maximum
number of *idle* workers allowed. It does not include busy bgworkers.

 I guess we differ on the meaning of "cope well"...  being able to spin
 up 18 workers in one second seems very fast to me.  

Well, it's obviously use case dependent. For Postgres-R (and sync
replication) in general, people are very sensitive to latency. There's
the network latency already, but adding a 50ms latency for no good
reason is not going to make these people happy.

 How many do you expect to ever need?!!

Again, very different. For Postgres-R, easily a couple dozen. The same
applies to parallel querying with multiple concurrent parallel
queries.

 Possibly, but I'm still having a hard time understanding why you need
 all the complexity you already have.

To make sure we only pay the startup cost on very rare occasions, and
not every time the workload changes a bit (or doesn't conform to an
arbitrary timeout).

(BTW the min/max is hardly any more complex than a timeout. It doesn't
even need a syscall).

 It seems (to me) like your design is being driven by start-up latency,
 which I just don't understand.  Sure, 50 ms to start up a worker isn't
 fantastic, but the idea is that it won't happen much because there
 will probably already be a worker in that database from previous
 activity.  The only exception is when there's a sudden surge of
 activity.

I'm less optimistic about the consistency of the workload.

 But I don't think that's the case to optimize for.  If a
 database hasn't had any activity in a while, I think it's better to
 reclaim the memory and file descriptors and ProcArray slots that we're
 spending on it so that the rest of the system can run faster.

Absolutely. It's what I call a change in workload. The min/max approach
is certainly faster at reclaiming unused workers, but (depending on the
max setting) doesn't necessarily ever go down to zero.

Regards

Markus Wanner



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Andrew Dunstan



On 09/21/2010 11:20 AM, Heikki Linnakangas wrote:

On 21/09/10 18:02, Tom Lane wrote:

Peter Eisentrautpete...@gmx.net  writes:

On tis, 2010-09-21 at 00:00 -0400, Tom Lane wrote:

3. What are the ignore filesets *for*, in particular should they list
just the derived files expected in a distribution tarball, or all the
files in the set of build products in a normal build?



My personal vote: Forget the whole thing.


The folks who are more familiar with git than I seem to be pretty clear
that we need to ignore all build products.  I don't think that ignore
nothing is going to work pleasantly at all.  On reflection I realize
that cvs ignore and git ignore are considerably different because they
come into play at different times: cvs ignore really only matters while
doing cvs update to pull in new code, while git ignore matters while
you're constructing a commit.  So you really do need git ignore to
ignore all build products; otherwise you'll have lots of chatter in
git status.


Agreed. It's not a big deal though, until now I've just always used 
"git status | less" and scrolled up to the beginning, ignoring the 
chatter.




FWIW, the buildfarm's git mode does not rely on ignore files any more, 
unlike what we had for CVS. This came about after I followed up on a 
suggestion Robert made at pgCon to use git clean.


cheers

andrew



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 18:28, Kevin Grittner wrote:

I just went to do my usual merge from the git version of HEAD (at
git://git.postgresql.org/git/postgresql.git), and it seemed to be
doing an awful lot of work to prepare to attempt the merge.  That
leads me to think that the newly converted git, or a copy of it, is
now at that location, which is cool.  But I have concerns about what
to do with my development branch off the old one.

I'm afraid that in spite of several attempts, I don't yet properly
have my head around the git approach, and fear that I'll muck things
up without a little direction; and I'd be surprised if I'm the only
one in this position.

Can someone give advice, preferably in the form of a recipe, for
how to set up a new repo here based on the newly converted repo, and
merge the work from my branch (with all the related history) into a
branch off the new repo?


Some ideas:

A) Generate a patch in the old repo, and apply it to the new one. 
Simple, but you lose the history.


B) git rebase. First git fetch the new upstream repository into your 
local repository, and use git rebase to apply all the commits in your 
private branch over the new upstream branch. You will likely get some 
conflicts and will need to resolve them by hand, but if you're lucky 
it's not a lot of work.
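For the record, method (B) can be sketched end-to-end with throwaway local repositories (all paths, branch names, and the "newrepo" remote name here are invented for illustration; in practice the remote would be the real postgresql.org URL, and `git init -b` needs git 2.28 or later):

```shell
# Simulate an "old" upstream with a private dev branch, plus a freshly
# converted "new" upstream, then replay the private commits with
# git rebase --onto.
set -e
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.org \
       GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.org
work=$(mktemp -d)

git init -q -b master "$work/old" && cd "$work/old"
git commit -q --allow-empty -m "old upstream"
git checkout -q -b dev
echo "my change" > patch.txt
git add patch.txt && git commit -q -m "private work"

# The "new" repository: equivalent content, different commit ids.
git init -q -b master "$work/new"
git -C "$work/new" commit -q --allow-empty -m "new upstream"

# Fetch the new repository and replay dev's commits on top of it.
git remote add newrepo "$work/new"
git fetch -q newrepo
git rebase -q --onto newrepo/master master dev
git log --format=%s    # private work, new upstream
```

The `master..dev` range (implied by the `master dev` arguments) is what gets replayed, which is where hand-resolved conflicts would appear.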


C) Git grafts. I just tested this method for our internal EDB 
repository, and it seems to work pretty well. You will need one line in 
your .git/info/grafts file for each merge commit with upstream that you 
have made. On each line you have 1. commitid of the merge commit 2. 
commitid of the old PostgreSQL commit that was merged 3. commitid of the 
corresponding PostgreSQL commit in the new repository. This lets you 
continue working on your repository as you used to, merging and all, but 
git diff will show that all the $PostgreSQL$ are different from the new 
upstream repository.
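To make the grafts mechanism concrete, here is a toy sketch (commit ids are whatever git generates locally; newer git versions deprecate the grafts file in favor of git replace, so treat this purely as an illustration of the format):

```shell
# Each line in .git/info/grafts is "<commit> <parent> [<parent>...]":
# git then pretends the commit has exactly those parents.
set -e
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.org \
       GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.org
repo=$(mktemp -d)/repo
git init -q -b master "$repo" && cd "$repo"
git commit -q --allow-empty -m one   && c1=$(git rev-parse HEAD)
git commit -q --allow-empty -m two
git commit -q --allow-empty -m three && c3=$(git rev-parse HEAD)

# Pretend "three" was committed directly on top of "one".
echo "$c3 $c1" > .git/info/grafts
git -c advice.graftFileDeprecated=false log --format=%s   # three, one
```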


I'd suggest that you just do A) and keep the old repository around for 
reference.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Do we need a ShmList implementation?

2010-09-21 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 I'm not excited about inventing an API with just one use-case;
 it's unlikely that you actually end up with anything generally
 useful.  (SHM_QUEUE seems like a case in point...)  Especially
 when there are so many other constraints on what shared memory is
 usable for.  You might as well just do this internally to the
 SERIALIZABLEXACT management code.
  
 Fair enough.  I'll probably abstract it within the SSI patch
 anyway, just because it will keep the other code cleaner where the
 logic is necessarily kinda messy anyway, and I think it'll reduce
 the chance of weird memory bugs.  I just won't get quite so formal
 about the interface.
 
OK, I'd say it's a little rough yet, but it works.  Is this
reasonable?:
 
http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=b8eca245ab63725d0fbfc3b5969f4a17fc765f2c
 
In particular, I'm a little squeamish about how I allocated the
shared memory for the list, but I couldn't think of anything that
seemed better.
 
-Kevin



Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I think it would be useful to have a way of testing whether a cast to
 a given type will succeed.  The biggest problem with the
 exception-catching method is not that it requires writing a function
 (which, IMHO, is no big deal) but that exception handling is pretty
 slow and inefficient.  You end up doing things like... write a regexp
 to see whether the data is in approximately the right format and then
 if it is try the cast inside an exception block.  Yuck.

The problem here is that putting the exception handling in C doesn't
make things any better: it's still slow and inefficient.  And in the
general case the only way to be sure that a string will be accepted by
the input function is to try it.

regards, tom lane



Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think it would be useful to have a way of testing whether a cast to
 a given type will succeed.  The biggest problem with the
 exception-catching method is not that it requires writing a function
 (which, IMHO, is no big deal) but that exception handling is pretty
 slow and inefficient.  You end up doing things like... write a regexp
 to see whether the data is in approximately the right format and then
 if it is try the cast inside an exception block.  Yuck.

 The problem here is that putting the exception handling in C doesn't
 make things any better: it's still slow and inefficient.  And in the
 general case the only way to be sure that a string will be accepted by
 the input function is to try it.

Given the current API, that is true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] bg worker: general purpose requirements

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 11:31 AM, Markus Wanner mar...@bluegap.ch wrote:
 On 09/21/2010 03:46 PM, Robert Haas wrote:
 Wait, are we in violent agreement here?  An overall limit on the
 number of parallel jobs is exactly what I think *does* make sense.
 It's the other knobs I find odd.

 Note that the max setting I've been talking about here is the maximum
 number of *idle* workers allowed. It does not include busy bgworkers.

Oh, wow.  Is there another limit on the total number of bgworkers?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Andrew Dunstan



On 09/21/2010 11:28 AM, Kevin Grittner wrote:

I just went to do my usual merge from the git version of HEAD (at
git://git.postgresql.org/git/postgresql.git), and it seemed to be
doing an awful lot of work to prepare to attempt the merge.  That
leads me to think that the newly converted git, or a copy of it, is
now at that location, which is cool.  But I have concerns about what
to do with my development branch off the old one.

I'm afraid that in spite of several attempts, I don't yet properly
have my head around the git approach, and fear that I'll muck things
up without a little direction; and I'd be surprised if I'm the only
one in this position.

Can someone give advice, preferably in the form of a recipe, for
how to set up a new repo here based on the newly converted repo, and
merge the work from my branch (with all the related history) into a
branch off the new repo?


I was just mentioning to Magnus a couple of hours ago on chat that this 
would create headaches for some people.


Basically, AIUI, you have to move the old repo aside and freshly clone 
the new repo.


I haven't migrated my development trees yet, but I'm planning on simply 
applying a diff from the old repo to a newly created branch in the new 
repo. However, that does mean losing the private commit history. I'm not 
sure much can be done about that, unless you migrate each commit 
separately, which could be painful. Maybe some of the git gurus have 
better ideas, though.



cheers

andrew




Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Aidan Van Dyk
* Andrew Dunstan and...@dunslane.net [100921 11:59]:


 I was just mentioning to Magnus a couple of hours ago on chat that this  
 would create headaches for some people.

 Basically, AIUI, you have to move the old repo aside and freshly clone  
 the new repo.

 I haven't migrated my development trees yet, but I'm planning on simply  
 applying a diff from the old repo to a newly created branch in the new  
 repo. However, that does mean losing the private commit history. I'm not  
 sure much can be done about that, unless you migrate each commit  
 separately, which could be painful. Maybe some of the git gurus have  
 better ideas, though.

Someone mentioned git rebase.  That's probably going to be slow on
distinct repositories too.  The grafts mentioned will speed that up.

But probably the easiest way, if you have a nice clean history, is to
use git format-patch.  This produces a nice series of patches, with
your commit message, and content, and dates, all preserved, ready for
re-applying (git am can do that automatically on the new branch), or
emailing, or whatever.

If your history is a complicated tangle of merges because you
constantly just re-merge the CVS HEAD into your dev branch, then it
might be time to just do a massive diff and apply anyways ;-)
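A runnable sketch of that round-trip, using throwaway repositories (names invented; in the real case the mailbox would come from your old PostgreSQL clone):

```shell
# Export private commits as a patch mailbox, then replay them in a
# second repository with git am; message, author, and dates survive.
set -e
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.org \
       GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.org
work=$(mktemp -d)

git init -q -b master "$work/old" && cd "$work/old"
git commit -q --allow-empty -m base
git checkout -q -b dev
echo hello > f.txt && git add f.txt && git commit -q -m "dev: add f.txt"
git format-patch --stdout master..dev > "$work/patches.mbox"

git init -q -b master "$work/new" && cd "$work/new"
git commit -q --allow-empty -m "converted base"
git am -q "$work/patches.mbox"
git log -1 --format=%s    # dev: add f.txt
```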

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 
 Basically, AIUI, you have to move the old repo aside and freshly
 clone the new repo.
 
I was assuming that, but it's good to have confirmation.  What about
my repo at
 
http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git ?
 
Can that be reset to a copy of the new repo?  (Or is that not really
beneficial?)
 
 I haven't migrated my development trees yet, but I'm planning on
 simply applying a diff from the old repo to a newly created branch
 in the new repo. However, that does mean losing the private commit
 history.
 
Yeah, I'd really rather not lose that.
 
 I'm not sure much can be done about that, unless you migrate each
 commit separately, which could be painful.
 
Perhaps.  I might be able to use grep and sed to script it, though. 
Right now I think I'd be alright to just pick off commits where the
committer was myself or Dan Ports.  My bash-fu is tolerably good for
such purposes.
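The grep-and-sed step may not even be needed: git log can filter by committer directly. A small sketch with invented names (the hashes it lists could then be fed to git cherry-pick or format-patch):

```shell
# List, oldest first, only the branch commits made by a given committer.
set -e
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.org \
       GIT_COMMITTER_NAME=kevin GIT_COMMITTER_EMAIL=k@example.org
repo=$(mktemp -d)/repo
git init -q -b master "$repo" && cd "$repo"
git commit -q --allow-empty -m base
git checkout -q -b dev
git commit -q --allow-empty -m "kevin's commit"
GIT_COMMITTER_NAME=other GIT_COMMITTER_EMAIL=o@example.org \
    git commit -q --allow-empty -m "someone else's commit"

git log --reverse --committer=kevin --format=%s master..dev
```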
 
 Maybe some of the git gurus have better ideas, though.
 
I'm all ears.  ;-)
 
-Kevin



Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Dan S
A starvation scenario is what worries me:

Let's say we have a slow, complex transaction with many tables involved,
while smaller transactions concurrently begin and commit.

Wouldn't it be possible for a starvation scenario where the slower
transaction never runs to completion but gives a serialization failure
over and over again on retry?

If I know at which SQL statement the serialization failure occurs, can I
then conclude that some of the tables in that exact query were involved
in the conflict?

If the serialization failure occurs at commit time, what can I conclude
then?  They can occur at commit time, right?

What is the likelihood that there exists an update pattern that always
gives the failure in the slow transaction?

How would one break such a recurring pattern?  You could maybe try to
lock each table used in the slow transaction, but that would be
prohibitively costly for concurrency.  But what else, if there is no way
of knowing what the slow transaction conflicts against?

As things with concurrency involved have a tendency to pop up in
production and not in test, I think it is important to start thinking
about them as soon as possible.

Best Regards
Dan S


Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Elvis Pranskevichus
On September 21, 2010 12:08:49 pm Kevin Grittner wrote:
 Andrew Dunstan and...@dunslane.net wrote:
  Basically, AIUI, you have to move the old repo aside and freshly
  clone the new repo.
 
 I was assuming that, but it's good to have confirmation.  What about
 my repo at
 
 http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git ?
 
 Can that be reset to a copy of the new repo?  (Or is that not really
 beneficial?)
 
  I haven't migrated my development trees yet, but I'm planning on
  simply applying a diff from the old repo to a newly created branch
  in the new repo. However, that does mean losing the private commit
  history.
 
 Yeah, I'd really rather not lose that.
 
  I'm not sure much can be done about that, unless you migrate each
  commit separately, which could be painful.
 
 Perhaps.  I might be able to use grep and sed to script it, though.
 Right now I think I'd be alright to just pick off commits where the
 committer was myself or Dan Ports.  My bash-fu is tolerably good for
 such purposes.
 
  Maybe some of the git gurus have better ideas, though.
 
 I'm all ears.  ;-)
 
 -Kevin

Here's a quick and easy way to move dev history to a new repo:

$ cd postgresql.old
$ git checkout yourbranch

# stream your commits into a patch mailbox
$ git format-patch --stdout master..HEAD > patches.mbox

# switch to the new repo
$ cd ../postgresql

# create a branch if not already
$ git checkout -b yourbranch 

# apply the patch mailbox
$ git am ../postgresql.old/patches.mbox

That should do the trick.  Your dev history will be kept.


Elvis




Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Andrew Dunstan



On 09/21/2010 12:07 PM, Aidan Van Dyk wrote:

But probably the easiest way, if you have a nice clean history, is to
use git format-patch.  This produces a nice series of patches, with
your commit message, and content, and dates, all preserved, ready for
re-applying (git am can do that automatically on the new branch), or
emailing, or whatever.


Ah. I thought there was something like this but for some reason when I 
went looking for it just now I failed to find it.


Thanks for the info. This looks like the best way to go.

cheers

andrew



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Abhijit Menon-Sen
At 2010-09-21 11:59:09 -0400, and...@dunslane.net wrote:

 However, that does mean losing the private commit history. I'm not
 sure much can be done about that, unless you migrate each commit
 separately, which could be painful.

It doesn't have to be painful.

Determine what patches from the old repository you want to apply, and
create a branch in the newly-cloned repository to apply them to. Then
use "(cd ../oldrepo; git format-patch -k --stdout R1..R2) | git am -3 -k"
to apply a series of patches (between revisions R1 and R2; adjust as
needed) to your branch (i.e. when you have it checked out).

See git-format-patch(1) and git-am(1) for more details (or feel free
to ask if you need more help).

-- ams



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Abhijit Menon-Sen
At 2010-09-21 11:02:30 -0400, t...@sss.pgh.pa.us wrote:

 So you really do need git ignore to ignore all build products;
 otherwise you'll have lots of chatter in git status.

Right.

I usually put build products into a top-level build directory and put
build/ in my top-level .gitignore (but I haven't tried to figure out
how hard it would be to do that with the Postgres Makefiles, so it's
just a thought, not a serious suggestion).
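As a tiny illustration of the idea (a hypothetical layout — the PostgreSQL Makefiles don't currently put products in one directory):

```shell
# One .gitignore line hides an entire build/ tree from git status.
set -e
repo=$(mktemp -d)/repo
git init -q -b master "$repo" && cd "$repo"
echo "build/" > .gitignore
mkdir build && touch build/foo.o     # stand-in build product
git status --porcelain               # only the .gitignore shows up
```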

-- ams



Re: [HACKERS] Git conversion status

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 05:38, Tom Lane t...@sss.pgh.pa.us wrote:
 For the archives' sake, below are the missing historical tags that
 match available tarballs, plus re-instantiation of the Release_2_0
 and Release_2_0_0 tags on non-manufactured commits.  I will push
 these up to the repo once it's open for pushing.

 Go for it.

Done.  The commit hook seems to be a bit verbose about that sort of
thing ... is it worth trying to collapse the pgsql-committers messages
into one email?

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 12:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 05:38, Tom Lane t...@sss.pgh.pa.us wrote:
 For the archives' sake, below are the missing historical tags that
 match available tarballs, plus re-instantiation of the Release_2_0
 and Release_2_0_0 tags on non-manufactured commits.  I will push
 these up to the repo once it's open for pushing.

 Go for it.

 Done.  The commit hook seems to be a bit verbose about that sort of
 thing ... is it worth trying to collapse the pgsql-committers messages
 into one email?

I was thinking the same thing, until I realized that pushing a whole
boatload of tags at the same time is probably going to be an extremely
rare event.

And I am STRONGLY of the opinion that we do NOT want to collapse
multiple *commits* into a single email, at least not unless we start
merging or something.  The scripts EDB uses internally do this and it
is, at least IMO, just awful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Git conversion status

2010-09-21 Thread Tom Lane
I wrote:
 Magnus Hagander mag...@hagander.net writes:
 Go for it.

 Done.

Having done that, I now realize that the historical tag release-6-3
is identical to what I applied as REL6_3.  It would probably be
reasonable to remove release-6-3, if that's still possible, but
I'm not clear on how.
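For reference, removing a tag both locally and on a remote can be sketched like this, with throwaway repositories standing in for the real ones (whether the server-side setup permits the deleting push is a separate question):

```shell
# Delete a tag locally with "git tag -d", then on the remote by pushing
# an empty source to the tag ref.
set -e
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.org \
       GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.org
work=$(mktemp -d)
git init -q --bare "$work/central"
git init -q -b master "$work/clone" && cd "$work/clone"
git commit -q --allow-empty -m base
git tag release-6-3
git remote add origin "$work/central"
git push -q origin master release-6-3

git tag -d release-6-3                      # gone locally
git push -q origin :refs/tags/release-6-3   # gone on the remote
git ls-remote --tags origin                 # prints nothing
```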

regards, tom lane

PS: this page is slightly amazing:
http://git.postgresql.org/gitweb?p=postgresql.git;a=tags

Fourteen years of project history.  Wow.



Re: [HACKERS] Git conversion status

2010-09-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Sep 21, 2010 at 12:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Done.  The commit hook seems to be a bit verbose about that sort of
 thing ... is it worth trying to collapse the pgsql-committers messages
 into one email?

 I was thinking the same thing, until I realized that pushing a whole
 boatload of tags at the same time is probably going to be an extremely
 rare event.

True.  We will be creating four or five tags at a time during
back-branch update cycles, but those might well arrive in separate
pushes anyway, depending on how Marc chooses to arrange his workflow.

regards, tom lane



Re: [HACKERS] Git conversion status

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 18:47, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Sep 21, 2010 at 12:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Done.  The commit hook seems to be a bit verbose about that sort of
 thing ... is it worth trying to collapse the pgsql-committers messages
 into one email?

 I was thinking the same thing, until I realized that pushing a whole
 boatload of tags at the same time is probably going to be an extremely
 rare event.

 True.  We will be creating four or five tags at a time during
 back-branch update cycles, but those might well arrive in separate
 pushes anyway, depending on how Marc chooses to arrange his workflow.

I could look into whether it's possible to group the tags together if they
come in a single push. I'm not entirely sure it's possible (I don't
know if the commitmsg script gets called once in total or once for
each), but I could look into it.

However, I agree with Robert I doubt it's worth it. I definitely don't
want to group the commits together, and then suddenly tags and commits
are handled differently...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion status

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 18:47, Tom Lane t...@sss.pgh.pa.us wrote:
 True.  We will be creating four or five tags at a time during
 back-branch update cycles, but those might well arrive in separate
 pushes anyway, depending on how Marc chooses to arrange his workflow.

 I could look into whether it's possible to group the tags together if they
 come in a single push. I'm not entirely sure it's possible (I don't
 know if the commitmsg script gets called once in total or once for
 each), but I could look into it.

 However, I agree with Robert I doubt it's worth it.

Agreed.  It's definitely not something to spend time on before the
update workflow becomes clear --- the case may never arise anyway.

regards, tom lane



Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Kevin Grittner
Dan S strd...@gmail.com wrote:
 
 A starvation scenario is what worries me:
 
 Let's say we have a slow, complex transaction with many tables
 involved, while smaller transactions concurrently begin and commit.
 
 Wouldn't it be possible for a starvation scenario where the slower
 transaction never runs to completion but gives a serialization
 failure over and over again on retry?
 
At least theoretically, yes.  One of the reasons I want to try
converting the single conflict reference to a list is to make for a
better worst-case situation.  Since anomalies can only occur when
the TN transaction (convention used in earlier post) commits first,
and by definition TN has done writes, with a list of conflicts you
could make sure that some transaction which writes has successfully
committed before any transaction rolls back.  So progress with
writes would be guaranteed.  There would also be a guarantee that if
you restart a canceled transaction, it would not immediately fail
again on conflicts *with the same transactions*.  Unfortunately,
with the single field for tracking conflicts, the self-reference on
multiple conflicting transactions loses detail, and you lose these
guarantees.
 
Now, could the large, long-running transaction still be the
transaction canceled?  Yes.  Are there ways to ensure it can
complete?  Yes.  Some are prettier than others.  I've already come
up with some techniques to avoid some classes of rollbacks with
transactions flagged as READ ONLY, and with the conflict lists there
would be a potential to recognize de facto read-only transactions and
apply similar logic, so a long-running transaction which didn't
write to any permanent tables (or at least not to ones which other
transactions were reading) would be pretty safe -- and with one of
our R&D points, you could guarantee its safety by blocking the
acquisition of its snapshot until certain conditions were met.
 
With conflict lists we would also always have two candidates for
cancellation at the point where we found something needed to be
canceled.  Right now I'm taking the coward's way out and always
canceling the transaction active in the process which detects the
need to roll something back.  As long as one process can cancel
another, we can use other heuristics for that.  Several possible
techniques come to mind to try to deal with the situation you raise.
 
If all else fails, the transaction could acquire explicit table
locks up front, but that sort of defeats the purpose of having an
isolation level which guarantees full serializable behavior without
adding any blocking to snapshot isolation.  :-(
 
 If I know at what sql-statement the serialization failure occurs
 can i then conclude that some of the tables in that exact query
 were involved in the conflict ?
 
No.  It could be related to any statements which had executed in the
transaction up to that point.
 
 If the serialization failure occurs at commit time what can I
 conclude then ?
 
That a dangerous combination of read-write dependencies occurred
which involved this transaction.
 
 They can  occur at commit time right ?
 
Yes.  Depending on the heuristics chosen, it could happen while
"idle in transaction".  (We can kill transactions in that state now,
right?)
 
 What is the likelihood that there exists an update pattern that
 always gives the failure in the slow transaction?
 
I don't know how to quantify that.  I haven't seen it yet in
testing, but many of my tests so far have been rather contrived.  We
desperately need more testing of this patch with realistic
workloads.
 
 How would one break such a recurring pattern ?
 
As mentioned above, the conflict list enhancement would help ensure
that *something* is making progress.  As mentioned above, we could
tweak the heuristics on *what* gets canceled to try to deal with
this.
 
 You could maybe try to lock each table used in the slow
 transaction but that would be prohibitively costly for
 concurrency.
 
Exactly.
 
 But what else if there is no way of knowing what the slow
 transaction conflicts against.
 
Well, that is supposed to be the situation where this type of
approach is a good thing.  The trick is to get enough experience
with different loads to make sure we're using good heuristics to
deal with various loads well.  Ultimately, there may be some loads
for which this technique is just not appropriate.  Hopefully those
cases can be addressed with the techniques made possible with
Florian's patch.
 
 As things with concurrency involved have a tendency to pop up in
 production and not in test I think it is important to start
 thinking about them as soon as possible.
 
Oh, I've been thinking about it a great deal for quite a while.  The
problem is exactly as you state -- it is very hard to construct
tests which give a good idea of what the impact will be in
production loads.  I'm sure I could construct a test which would
make the patch look glorious.  I'm sure I could construct a test
which would make the patch look 

Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar sep 21 11:56:51 -0400 2010:
 On Tue, Sep 21, 2010 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:

  The problem here is that putting the exception handling in C doesn't
  make things any better: it's still slow and inefficient.  And in the
  general case the only way to be sure that a string will be accepted by
  the input function is to try it.
 
 Given the current API, that is true.

So we could refactor the input functions so that there's an internal
function that returns the accepted datum in the OK case and an ErrorData
for the failure case.  The regular input function would just throw the
error data in the latter case; but this would allow another function to
just return whether it worked or not.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Serializable snapshot isolation error logging

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 12:57 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 What is the likelihood that there exists an update pattern that
 always gives the failure in the slow transaction?

 I don't know how to quantify that.  I haven't seen it yet in
 testing, but many of my tests so far have been rather contrived.  We
 desperately need more testing of this patch with realistic
 workloads.

I'm really hoping that Tom or Heikki will have a chance to take a
serious look at this patch soon with a view to committing it.  It
sounds like Kevin has done a great deal of testing on his own, but
we're not going to really get field experience with this until it's in
the tree.  It would be nice to get this in well before feature freeze
so that we have a chance to see what shakes out while there's still
time to adjust it.  Recall that Hot Standby was committed in December
and we were still adjusting the code in May.  It would be much nicer
to commit in September and finish up adjusting the code in February.
It helps get the release out on schedule.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] english parser in text search: support for multiple words in the same position

2010-09-21 Thread Sushant Sinha
 I looked at this patch a bit.  I'm fairly unhappy that it seems to be
 inventing a brand new mechanism to do something the ts parser can
 already do.  Why didn't you code the url-part mechanism using the
 existing support for compound words? 

I am not familiar with compound word implementation and so I am not sure
how to split a url with compound word support. I looked into the
documentation for compound words and that does not say much about how to
identify components of a token. Is a compound word split by matching
against a list of words? If so, we will not be able to use that, as we
do not know all the words that can appear in a url/host/email/file.

I think another approach can be to use the dict_regex dictionary
support. However, we will have to match the regex with something that
parser is doing. 

The current patch is not inventing any new mechanism. It uses the
special handler mechanism already present in the parser. For example,
when the current parser finds a URL it runs a special handler called
SpecialFURL which resets the parser position to the start of token to
find hostname. After finding the host it moves to finding the path. So
you first get the URL and then the host and finally the path.

Similarly, we are resetting the parser to the start of the token on
finding a url to output url parts. Then before entering the state that
can lead to a url we output the url part. The state machine modification
is similar for other tokens like file/email/host.


 The changes made to parsetext()
 seem particularly scary: it's not clear at all that that's not breaking
 unrelated behaviors.  In fact, the changes in the regression test
 results suggest strongly to me that it *is* breaking things.  Why are
 there so many diffs in examples that include no URLs at all?
 

I think some of the difference is coming from the fact that pos now
starts at 0 whereas it used to be 1. That is easily fixable
though.

 An issue that's nearly as bad is the 100% lack of documentation,
 which makes the patch difficult to review because it's hard to tell
 what it intends to accomplish or whether it's met the intent.
 The patch is not committable without documentation anyway, but right
 now I'm not sure it's even usefully reviewable.

I did not provide any explanation as I could not find any place in the
code to put the documentation (it was just a modification of the state
machine). Should I do a separate write-up to explain the desired output
and the changes to achieve it?

 
 In line with the lack of documentation, I would say that the choice of
 the name parttoken for the new token type is not helpful.  Part of
 what?  And none of the other token type names include the word token,
 so that's not a good decision either.  Possibly url_part would be a
 suitable name.
 

I can modify it to output url-part/host-part/email-part/file-part if
there is an agreement over the rest of the issues. So let me know if I
should go ahead with this.

-Sushant.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 17:27, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Breaking it up was quite trivial. Here's what I came up with after
 building on my box. I'm sure there are some on other platforms showing
 up, but this should be the majority.

 I just realized it does not include contrib, but that's a mechanical
 copy of the same thing.

 So if we want to go with this way, i have the scripts/changes ready :)

 This works for me, modulo some things:

 If we are going to ignore *.so at the top level, we also need to ignore
 *.sl (for HPUX) and *.dll (for Windows).  I also wonder why we have

*.sl was missing because I didn't know about it.
*.dll was missing because on msvc we always build out of tree. And I
forgot about mingw not doing that :-)


 entries like this:

 +libecpg.a
 +libecpg.so.*

 rather than global ignore patterns for *.a and *.so.[0-9]

Yeah, that seems better.


 We should probably ignore src/Makefile.custom, since that is still a
 supported way to customize builds (and some of us still use it).

Ok, added.


 diff --git a/src/timezone/.gitignore b/src/timezone/.gitignore
 new file mode 100644
 index 000..f844c9f
 --- /dev/null
 +++ b/src/timezone/.gitignore
 @@ -0,0 +1 @@
 +/zic

 Why does this entry have a / when none of the rest do?  Shouldn't
 we be consistent about that?

We should. I've removed it.

The difference is that "zic" matches zic in any subdirectory and
"/zic" matches just in the top dir. But we don't have any other
thing called zic further down - it's really only a potential problem
at the top level.

How's this?


Btw, what's the stamp-h file? Should that be excluded globally?
-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 000..5118035
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,14 @@
+# Global excludes across all subdirectories
+*.o
+*.so
+*.so.*
+*.sl
+*.dll
+*.a
+objfiles.txt
+.deps/
+
+# Local excludes in root directory
+GNUmakefile
+config.log
+config.status
diff --git a/contrib/adminpack/.gitignore b/contrib/adminpack/.gitignore
new file mode 100644
index 000..07d3199
--- /dev/null
+++ b/contrib/adminpack/.gitignore
@@ -0,0 +1 @@
+adminpack.sql
diff --git a/contrib/btree_gin/.gitignore b/contrib/btree_gin/.gitignore
new file mode 100644
index 000..8e9f4c4
--- /dev/null
+++ b/contrib/btree_gin/.gitignore
@@ -0,0 +1 @@
+btree_gin.sql
diff --git a/contrib/btree_gist/.gitignore b/contrib/btree_gist/.gitignore
new file mode 100644
index 000..cc855cf
--- /dev/null
+++ b/contrib/btree_gist/.gitignore
@@ -0,0 +1 @@
+btree_gist.sql
diff --git a/contrib/chkpass/.gitignore b/contrib/chkpass/.gitignore
new file mode 100644
index 000..2427d62
--- /dev/null
+++ b/contrib/chkpass/.gitignore
@@ -0,0 +1 @@
+chkpass.sql
diff --git a/contrib/citext/.gitignore b/contrib/citext/.gitignore
new file mode 100644
index 000..cb8c4d9
--- /dev/null
+++ b/contrib/citext/.gitignore
@@ -0,0 +1 @@
+citext.sql
diff --git a/contrib/cube/.cvsignore b/contrib/cube/.cvsignore
deleted file mode 100644
index 19ecc85..000
--- a/contrib/cube/.cvsignore
+++ /dev/null
@@ -1,2 +0,0 @@
-cubeparse.c
-cubescan.c
diff --git a/contrib/cube/.gitignore b/contrib/cube/.gitignore
new file mode 100644
index 000..3d15800
--- /dev/null
+++ b/contrib/cube/.gitignore
@@ -0,0 +1,3 @@
+cubeparse.c
+cubescan.c
+cube.sql
diff --git a/contrib/dblink/.gitignore b/contrib/dblink/.gitignore
new file mode 100644
index 000..c5f6774
--- /dev/null
+++ b/contrib/dblink/.gitignore
@@ -0,0 +1 @@
+dblink.sql
diff --git a/contrib/dict_int/.gitignore b/contrib/dict_int/.gitignore
new file mode 100644
index 000..b1fe21b
--- /dev/null
+++ b/contrib/dict_int/.gitignore
@@ -0,0 +1 @@
+dict_int.sql
diff --git a/contrib/dict_xsyn/.gitignore b/contrib/dict_xsyn/.gitignore
new file mode 100644
index 000..f639d69
--- /dev/null
+++ b/contrib/dict_xsyn/.gitignore
@@ -0,0 +1 @@
+dict_xsyn.sql
diff --git a/contrib/earthdistance/.gitignore b/contrib/earthdistance/.gitignore
new file mode 100644
index 000..35e7437
--- /dev/null
+++ b/contrib/earthdistance/.gitignore
@@ -0,0 +1 @@
+earthdistance.sql
diff --git a/contrib/fuzzystrmatch/.gitignore b/contrib/fuzzystrmatch/.gitignore
new file mode 100644
index 000..8006def
--- /dev/null
+++ b/contrib/fuzzystrmatch/.gitignore
@@ -0,0 +1 @@
+fuzzystrmatch.sql
diff --git a/contrib/hstore/.gitignore b/contrib/hstore/.gitignore
new file mode 100644
index 000..acaeaa1
--- /dev/null
+++ b/contrib/hstore/.gitignore
@@ -0,0 +1 @@
+hstore.sql
diff --git a/contrib/intarray/.gitignore b/contrib/intarray/.gitignore
new file mode 100644
index 000..17a6d14
--- /dev/null
+++ b/contrib/intarray/.gitignore
@@ -0,0 +1 @@
+_int.sql
diff --git a/contrib/isn/.gitignore b/contrib/isn/.gitignore
new file mode 100644
index 000..3352289
--- /dev/null
+++ b/contrib/isn/.gitignore
@@ -0,0 +1 @@
+isn.sql
diff --git 

Re: [HACKERS] Git conversion status

2010-09-21 Thread Alvaro Herrera
Excerpts from Magnus Hagander's message of lun sep 20 12:49:28 -0400 2010:

 Committers can (and should! please test!) clone from git clone
 ssh://g...@gitmaster.postgresql.org/postgresql.git.
 
 Please do *NOT* commit or push anything to this repository yet though:
 The repo is there - all the scripts to manage it are *not*. So don't
 commit until I confirm that it is.
 
 But please clone and verify the stuff we have now.

I tried to follow the instructions on the Wiki but they didn't work.
The ones under the heading "Dependent Clone per Branch, Pushing and
Pulling From a Local Repository", that is.

What I find is that after doing the local clone for the branch, i.e.
  git clone postgresql REL9_0_STABLE
this clones only the master branch somehow, not the other branches; so
when I later run 
  git checkout REL9_0_STABLE
on that clone, it fails with this message:

$ git checkout REL9_0_STABLE
error: pathspec 'REL9_0_STABLE' did not match any file(s) known to git.


So I first need to checkout each branch on the postgresql clone (the
one tracking the remote), and then do the local clone.  So the
instructions are:

branches="REL9_0_STABLE REL8_4_STABLE REL8_3_STABLE REL8_2_STABLE REL8_1_STABLE
REL8_0_STABLE REL7_4_STABLE"

pushd postgresql/
for i in $branches; do git checkout $i; done
popd

for i in $branches; do git clone postgresql $i --branch $i; done

and then set the config variables on each clone, as specified.



-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuring Text Search parser?

2010-09-21 Thread Sushant Sinha
Your changes mostly work: they will get you tokens with _
characters in them. However, it is not nice to mix your new token with
an existing token like NUMWORD. Give a new name to your new type of
token .. probably UnderscoreWord. Then, on seeing _, move to a state
that can identify the new token. If you finally recognize that token,
then output it.

In order to extract portions of the newly created token,  you can write
a special handler for the token that resets the parser position to the
start of the token to get parts of it. And then modify the state machine
to output the part-token before going into the state that can lead to
the token that was identified earlier.


Look at these changes to the text parser as well:

http://archives.postgresql.org/pgsql-hackers/2010-09/msg4.php

-Sushant.


On Mon, 2010-09-20 at 16:01 +0200, jes...@krogh.cc wrote:
 Hi.
 
 I'm trying to migrate an application off an existing Full Text Search engine
 and onto PostgreSQL .. one of my main (remaining) headaches are the
 fact that PostgreSQL treats _ as a seperation charachter whereas the existing
 behaviour is to not split. That means:
 
 testdb=# select ts_debug('database_tag_number_999');
ts_debug
 --
  (asciiword,"Word, all ASCII",database,{english_stem},english_stem,{databas})
  (blank,"Space symbols",_,{},,)
  (asciiword,"Word, all ASCII",tag,{english_stem},english_stem,{tag})
  (blank,"Space symbols",_,{},,)
  (asciiword,"Word, all ASCII",number,{english_stem},english_stem,{number})
  (blank,"Space symbols",_,{},,)
  (uint,"Unsigned integer",999,{simple},simple,{999})
 (7 rows)
 
 Where the incoming data, by design contains a set of tags which includes _
 and are expected to be one lexeme.
 
 I've tried patching my way out of this using this patch.
 
 $ diff -w -C 5 src/backend/tsearch/wparser_def.c.orig
 src/backend/tsearch/wparser_def.c
 *** src/backend/tsearch/wparser_def.c.orig2010-09-20 15:58:37.06460
 +0200
 --- src/backend/tsearch/wparser_def.c 2010-09-20 15:58:41.193335577 +0200
 ***
 *** 967,986 
 --- 967,988 
 
   static const TParserStateActionItem actionTPS_InNumWord[] = {
   {p_isEOF, 0, A_BINGO, TPS_Base, NUMWORD, NULL},
   {p_isalnum, 0, A_NEXT, TPS_InNumWord, 0, NULL},
   {p_isspecial, 0, A_NEXT, TPS_InNumWord, 0, NULL},
 + {p_iseqC, '_', A_NEXT, TPS_InNumWord, 0, NULL},
   {p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL},
   {p_iseqC, '/', A_PUSH, TPS_InFileFirst, 0, NULL},
   {p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL},
   {p_iseqC, '-', A_PUSH, TPS_InHyphenNumWordFirst, 0, NULL},
   {NULL, 0, A_BINGO, TPS_Base, NUMWORD, NULL}
   };
 
   static const TParserStateActionItem actionTPS_InAsciiWord[] = {
   {p_isEOF, 0, A_BINGO, TPS_Base, ASCIIWORD, NULL},
   {p_isasclet, 0, A_NEXT, TPS_Null, 0, NULL},
 + {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL},
   {p_iseqC, '.', A_PUSH, TPS_InHostFirstDomain, 0, NULL},
   {p_iseqC, '.', A_PUSH, TPS_InFileNext, 0, NULL},
   {p_iseqC, '-', A_PUSH, TPS_InHostFirstAN, 0, NULL},
   {p_iseqC, '-', A_PUSH, TPS_InHyphenAsciiWordFirst, 0, NULL},
   {p_iseqC, '@', A_PUSH, TPS_InEmail, 0, NULL},
 ***
 *** 995,1004 
 --- 997,1007 
 
   static const TParserStateActionItem actionTPS_InWord[] = {
   {p_isEOF, 0, A_BINGO, TPS_Base, WORD_T, NULL},
   {p_isalpha, 0, A_NEXT, TPS_Null, 0, NULL},
   {p_isspecial, 0, A_NEXT, TPS_Null, 0, NULL},
 + {p_iseqC, '_', A_NEXT, TPS_Null, 0, NULL},
   {p_isdigit, 0, A_NEXT, TPS_InNumWord, 0, NULL},
   {p_iseqC, '-', A_PUSH, TPS_InHyphenWordFirst, 0, NULL},
   {NULL, 0, A_BINGO, TPS_Base, WORD_T, NULL}
   };
 
 
 
 This will obviously break other peoples applications, so my questions would
 be: If this should be made configurable.. how should it be done?
 
 As a sidenote... Xapian doesn't split on _ .. Lucene does.
 
 Thanks.
 
 -- 
 Jesper
 
 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git conversion status

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 20:32, Alvaro Herrera wrote:

What I find is that after doing the local clone for the branch, i.e.
   git clone postgresql REL9_0_STABLE
this clones only the master branch somehow, not the other branches; so
when I later run
   git checkout REL9_0_STABLE
on that clone, it fails with this message:


It clones all branches, but it only creates a local tracking branch for 
master automatically. The others you'll have to create manually:


 git branch REL9_0_STABLE origin/REL9_0_STABLE

Try also git branch -a, it is quite enlightening.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 On Tue, Sep 21, 2010 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The problem here is that putting the exception handling in C doesn't
 make things any better:

 So we could refactor the input functions so that there's an internal
 function that returns the accepted datum in the OK case and an ErrorData
 for the failure case.

This makes the untenable assumption that there are no elog(ERROR)s in
the internal input function *or anything it calls*.  Short of truly
massive restructuring, including uglifying many internal APIs to have
error return codes instead of allowing elog within the callee, you will
never make this work for anything more complicated than say float8in().

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Git conversion status

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 1:32 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Magnus Hagander's message of lun sep 20 12:49:28 -0400 2010:

 Committers can (and should! please test!) clone from git clone
 ssh://g...@gitmaster.postgresql.org/postgresql.git.

 Please do *NOT* commit or push anything to this repository yet though:
 The repo is there - all the scripts to manage it are *not*. So don't
 commit until I confirm that it is.

 But please clone and verify the stuff we have now.

 I tried to follow the instructions on the Wiki but they didn't work.
  The ones under the heading "Dependent Clone per Branch, Pushing and
  Pulling From a Local Repository", that is.

 What I find is that after doing the local clone for the branch, i.e.
  git clone postgresql REL9_0_STABLE
 this clones only the master branch somehow, not the other branches; so
 when I later run
  git checkout REL9_0_STABLE
 on that clone, it fails with this message:

 $ git checkout REL9_0_STABLE
 error: pathspec 'REL9_0_STABLE' did not match any file(s) known to git.

Oops.  I left out a step.  Fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Greg Stark
On Tue, Sep 21, 2010 at 6:02 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 So we could refactor the input functions so that there's an internal
 function that returns the accepted datum in the OK case and an ErrorData
 for the failure case.  The regular input function would just throw the
 error data in the latter case; but this would allow another function to
 just return whether it worked or not.

You're assuming the input function won't have any work it has to undo
which it would need the savepoint for anyways. For most of the
built-in datatypes -- all of the ones intended for holding real data
-- that's true. But for things like regclass or regtype it might not
be and for user-defined data types who knows?

Of course all people really want is to test whether something is a
valid integer, floating point value, etc.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 17:27, Tom Lane t...@sss.pgh.pa.us wrote:
 Why does this entry have a / when none of the rest do?  Shouldn't
 we be consistent about that?

 We should. I've removed it.

 The difference is that zic matches zic in any subdirectory and
 /zic matches just in the top dir. But we're not having any other
 thing called zic further down - it's really only a potential problem
 at the top level.

Hmm.  In leaf subdirectories it doesn't matter of course, but I'm
worried about .gitignore files in non-leaf subdirectories accidentally
excluding files further down the tree.  Wouldn't it be better to
standardize on always using the slash, rather than not using it?

 Btw, what's the stamp-h file? Should that be excluded globally?

I'd say no, there's only one or two instances.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-21 Thread fazool mein
On Mon, Sep 20, 2010 at 9:44 PM, Fujii Masao masao.fu...@gmail.com wrote:

 On Tue, Sep 21, 2010 at 9:48 AM, fazool mein fazoolm...@gmail.com wrote:
  Hi,
 
  I want to shut down the server under certain conditions that can be
 checked
  inside a backend process. For instance, while running symmetric
 replication,
  if the primary dies, I want the walreceiver to detect that and
 shutdown
  the standby. The reason for shutdown is that I want to execute some other
  stuff before I start the standby as a primary. Creating a trigger file
  doesn't help as it converts the standby into primary at run time.
 
  Using proc_exit() inside walreceiver only terminates the walreceiver
  process, which postgres starts again. The other way I see is using
  ereport(PANIC, ...). Is there some other way to shutdown the main server
  from within a backend process?

 Are you going to change the source code? If yes, you might be able to
 do that by making walreceiver send the shutdown signal to postmaster.


Yes, I'll be modifying the code. In the walreceiver, I used the following to
send a shutdown to the postmaster:

kill(getppid(), SIGTERM);


 If no, I think that a straightforward approach is to use a clusterware
 like pacemaker. That is, you need to make a clusterware periodically
 check the master and cause the standby to end when detecting the crash
 of the master.


This was another option, but I have to modify the code for this particular
case.

Thanks for your help.

Regards,


Re: [HACKERS] Shutting down server from a backend process, e.g. walrceiver

2010-09-21 Thread fazool mein
On Tue, Sep 21, 2010 at 8:32 AM, David Fetter da...@fetter.org wrote:

 On Mon, Sep 20, 2010 at 05:48:40PM -0700, fazool mein wrote:
  Hi,
 
  I want to shut down the server under certain conditions that can be
  checked inside a backend process.  For instance, while running
  symmetric

 Synchronous?


I meant streaming :), but the question is in general for any process forked
by the postmaster.



  replication, if the primary dies, I want the walreceiver to
  detect that and shutdown the standby.  The reason for shutdown is
  that I want to execute some other stuff before I start the standby
  as a primary.  Creating a trigger file doesn't help as it converts
  the standby into primary at run time.
 
  Using proc_exit() inside walreceiver only terminates the walreceiver
  process, which postgres starts again.  The other way I see is using
  ereport(PANIC, ...).  Is there some other way to shutdown the main
  server from within a backend process?

 Perhaps I've misunderstood, but since there's already Something
 Else(TM) which takes actions, why not send a message to it so it can
 take appropriate action on the node, starting with shutting it down?


(wondering)

Thanks.


Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Robert Haas
On Tue, Sep 21, 2010 at 1:45 PM, Greg Stark gsst...@mit.edu wrote:
 On Tue, Sep 21, 2010 at 6:02 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 So we could refactor the input functions so that there's an internal
 function that returns the accepted datum in the OK case and an ErrorData
 for the failure case.  The regular input function would just throw the
 error data in the latter case; but this would allow another function to
 just return whether it worked or not.

 You're assuming the input function won't have any work it has to undo
 which it would need the savepoint for anyways. For most of the
 built-in datatypes -- all of the ones intended for holding real data
 -- that's true. But for things like regclass or regtype it might not
 be and for user-defined data types who knows?

 Of course all people really want is to test whether something is a
 valid integer, floating point value, etc.

Right.  Or a date - that's a case that comes up for me pretty
frequently.  It's not too hard to write a regular expression to test
whether something is an integer -- although there is the question of
whether it will overflow, which is sometimes relevant -- but a date or
timestamp field is a bit harder.

I don't understand the argument that we need type input functions to
be protected by a savepoint.  That seems crazy to me.  We're taking a
huge performance penalty here to protect against something that seems
insane to me in the first instance.  Not to mention cutting ourselves
off from really important features, like the ability to recover from
errors during COPY.  I don't understand why we can't just make some
rules about what type input functions are allowed to do.  And if you
break those rules then you get to keep both pieces.  Why is this
unreasonable?  A savepoint can hardly protect you against damage
inflicted by the execution of arbitrary code; IOW, we're already
relying on the user to follow some rules.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Kevin Grittner
Elvis Pranskevichus e...@prans.net wrote:
 
 Here's a quick and easy way to move dev history to a new repo:
 
 $ cd postgresql.old
 $ git checkout yourbranch
 
 # stream your commits into a patch mailbox
  $ git format-patch --stdout master..HEAD > patches.mbox
 
 # switch to the new repo
 $ cd ../postgresql
 
 # create a branch if not already
 $ git checkout -b yourbranch 
 
 # apply the patch mailbox
 $ git am ../postgresql.old/patches.mbox
 
 That should do the trick.  Your dev history will be kept.
 
Thanks for the recipe.  (And thanks to all others who responded.)
 
That still leaves me wondering how I get that out to my public git
repo without someone resetting it on the server.  Or do I have the
ability to clean out the old stuff at:
 
ssh://g...@git.postgresql.org/users/kgrittn/postgres.git
 
so that I can push the result of the above to it cleanly?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:01, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Elvis Pranskevichus e...@prans.net wrote:

 Here's a quick and easy way to move dev history to a new repo:

 $ cd postgresql.old
 $ git checkout yourbranch

 # stream your commits into a patch mailbox
 $ git format-patch --stdout master..HEAD > patches.mbox

 # switch to the new repo
 $ cd ../postgresql

 # create a branch if not already
 $ git checkout -b yourbranch

 # apply the patch mailbox
 $ git am ../postgresql.old/patches.mbox

 That should do the trick.  Your dev history will be kept.

 Thanks for the recipe.  (And thanks to all others who responded.)

 That still leaves me wondering how I get that out to my public git
 repo without someone resetting it on the server.  Or do I have the
 ability to clean out the old stuff at:

 ssh://g...@git.postgresql.org/users/kgrittn/postgres.git

 so that I can push the result of the above to it cleanly?

a git push *should* work, but we've seen issues with that.

The cleanest is probably if I wipe the repo on git.postgresql.org for
you, and you then re-push from scratch. Does that work for you?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configuring synchronous replication

2010-09-21 Thread Simon Riggs
On Tue, 2010-09-21 at 16:58 +0900, Fujii Masao wrote:
 On Sat, Sep 18, 2010 at 4:36 AM, Dimitri Fontaine
 dfonta...@hi-media.com wrote:
  Simon Riggs si...@2ndquadrant.com writes:
  On Fri, 2010-09-17 at 21:20 +0900, Fujii Masao wrote:
  What synchronization level does each combination of sync_replication
  and sync_replication_service lead to?
 
  There are only 4 possible outcomes. There is no combination, so we don't
  need a table like that above.
 
  The service specifies the highest request type available from that
  specific standby. If someone requests a higher service than is currently
  offered by this standby, they will either
  a) get that service from another standby that does offer that level
  b) automatically downgrade the sync rep mode to the highest available.
 
  I like the a) part, I can't say the same about the b) part. There's no
  reason to accept a COMMIT of a transaction when the requested durability
  is known not to have been reached, unless the user said so.

Hmm, no reason? The reason is that the alternative is that the session
would hang until a standby arrived that offered that level of service.
Why would you want that behaviour? Would you really request that option?

 Yep, I can imagine that some people want to ensure that *all* the
 transactions are synchronously replicated to the synchronous standby,
 without regard to sync_replication. So I'm not sure if automatic
 downgrade/upgrade of the mode makes sense. Should we introduce a new
 parameter specifying whether to allow automatic downgrade/upgrade?
 It seems complicated though.

I agree, but I'm not against any additional parameter if people say they
really want them *after* the consequences of those choices have been
highlighted.

IMHO we should focus on the parameters that deliver key use cases.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What happened to the is_type family of functions proposal?

2010-09-21 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar sep 21 13:41:32 -0400 2010:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  On Tue, Sep 21, 2010 at 11:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  The problem here is that putting the exception handling in C doesn't
  make things any better:
 
  So we could refactor the input functions so that there's an internal
  function that returns the accepted datum in the OK case and an ErrorData
  for the failure case.
 
 This makes the untenable assumption that there are no elog(ERROR)s in
 the internal input function *or anything it calls*.  Short of truly
 massive restructuring, including uglifying many internal APIs to have
 error return codes instead of allowing elog within the callee, you will
 never make this work for anything more complicated than say float8in().

... which is what people want anyway.  I mean, the day someone requests
is_sthcomplex, we could happily tell them that they need to use the
expensive workaround involving savepoints.  I don't think we really need
to support the ones that would require truly expensive refactoring; the
simple ones would cover 99% of the use cases.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 21:01, Kevin Grittner wrote:

That still leaves me wondering how I get that out to my public git
repo without someone resetting it on the server.  Or do I have the
ability to clean out the old stuff at:

ssh://g...@git.postgresql.org/users/kgrittn/postgres.git

so that I can push the result of the above to it cleanly?


git push --force allows you to push the new branches over the old ones.

I don't think it will automatically garbage collect the old stuff 
though, so the repository will be bloated until git gc runs (with 
--aggressive or something, not sure). I don't know when or how that 
happens in the public repos.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Git conversion status

2010-09-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Sep 21, 2010 at 1:32 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 I tried to follow the instructions on the Wiki but they didn't work.

 Oops.  I left out a step.  Fixed.

While we're discussing possible errors on that page ... at the bottom of
the page under the multiple workdirs alternative are these recipes for
re-syncing your local checkouts:

git checkout REL9_0_STABLE
git pull

git checkout master
git reset --hard origin/master

Are the git checkout steps really needed, considering each workdir would
normally be on its target branch all the time?  If so, what are they
accomplishing exactly?  I don't think I've entirely internalized what
that command does.

regards, tom lane



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 19:46, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 17:27, Tom Lane t...@sss.pgh.pa.us wrote:
 Why does this entry have a / when none of the rest do?  Shouldn't
 we be consistent about that?

 We should. I've removed it.

 The difference is that "zic" matches "zic" in any subdirectory and
 "/zic" matches just in the top dir. But we're not having any other
 thing called "zic" further down - it's really only a potential problem
 at the top level.

 Hmm.  In leaf subdirectories it doesn't matter of course, but I'm
 worried about .gitignore files in non-leaf subdirectories accidentally
 excluding files further down the tree.  Wouldn't it be better to
 standardize on always using the slash, rather than not using it?

Yeah, good point. I just took the path of least resistance :-) I'll
update with that before commit.

Have we decided to do this? If so, I'll start backpatching it...

 Btw, what's the stamp-h file? Should that be excluded globally?

 I'd say no, there's only one or two instances.

Ok. Since I don't know what it is, I didn't know if it's likely to pop
up anywhere else under different circumstances.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Kevin Grittner
Magnus Hagander mag...@hagander.net wrote:
 
 The cleanest is probably if I wipe the repo on git.postgresql.org
 for you, and you then re-push from scratch. Does that work for
 you?
 
Sure.  Thanks.
 
-Kevin



Re: [HACKERS] Git conversion status

2010-09-21 Thread Heikki Linnakangas

On 21/09/10 21:10, Tom Lane wrote:

While we're discussing possible errors on that page ... at the bottom of
the page under the multiple workdirs alternative are these recipes for
re-syncing your local checkouts:

git checkout REL9_0_STABLE
git pull

git checkout master
git reset --hard origin/master

Are the git checkout steps really needed, considering each workdir would
normally be on its target branch all the time?


No, you're right, they're not really needed. Those steps were 
copy-pasted from the Committing Using a Single Clone recipe, and not 
adjusted.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Peter Eisentraut
On tis, 2010-09-21 at 11:27 -0400, Tom Lane wrote:
 rather than global ignore patterns for *.a and *.so.[0-9]

Probably rather *.so.[0-9.]+




Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:16, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Magnus Hagander mag...@hagander.net wrote:

 The cleanest is probably if I wipe the repo on git.postgresql.org
 for you, and you then re-push from scratch. Does that work for
 you?

 Sure.  Thanks.

done, should be available for push now.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] trailing whitespace in psql table output

2010-09-21 Thread Peter Eisentraut
Everyone using git diff in color mode will already or soon be aware that
psql, for what I can only think is an implementation oversight, produces
trailing whitespace in the table headers, like this:

 two |     f1     $
-----+------------$
     | asdfghjkl;$
     | d34aaasdf$
(2 rows)$

($ is the line end; cf. cat -A).  Note that this only applies to
headers, not content cells.

Attached is a patch to fix that.

diff --git a/src/bin/psql/print.c b/src/bin/psql/print.c
index e55404b..da23b7b 100644
--- a/src/bin/psql/print.c
+++ b/src/bin/psql/print.c
@@ -817,20 +817,24 @@ print_aligned_text(const printTableContent *cont, FILE *fout)
 		nbspace = width_wrap[i] - this_line->width;
 
 		/* centered */
-		fprintf(fout, "%-*s%s%-*s",
-				nbspace / 2, "", this_line->ptr, (nbspace + 1) / 2, "");
+		fprintf(fout, "%-*s%s",
+				nbspace / 2, "", this_line->ptr);
 
 		if (!(this_line + 1)->ptr)
 		{
 			more_col_wrapping--;
-			header_done[i] = 1;
+			header_done[i] = true;
 		}
+
+		if (i < cont->ncolumns - 1 || !header_done[i])
+			fprintf(fout, "%-*s",
+					(nbspace + 1) / 2, "");
 	}
 	else
 		fprintf(fout, "%*s", width_wrap[i], "");
 
 	if (opt_border != 0 || format->wrap_right_border == true)
-		fputs(!header_done[i] ? format->header_nl_right : " ",
+		fputs(!header_done[i] ? format->header_nl_right : (i < cont->ncolumns - 1 ? " " : ""),
 			  fout);
 
 	if (opt_border != 0 && i < col_count - 1)



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Peter Eisentraut
On tis, 2010-09-21 at 20:04 +0200, Magnus Hagander wrote:
 The cleanest is probably if I wipe the repo on git.postgresql.org for
 you, and you then re-push from scratch.

We probably need a solution that doesn't require manual intervention for
everyone separately.




Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Have we decided to do this? If so, I'll start backpatching it...

Yeah, go for it.

BTW, a look at the recommended GitExclude on the wiki suggests that
we need these two additional global exclusions:

	*.mo		... for NLS builds
	*.dylib		... Darwin spelling of *.so

regards, tom lane



Re: [HACKERS] bg worker: general purpose requirements

2010-09-21 Thread Markus Wanner
On 09/21/2010 05:59 PM, Robert Haas wrote:
 Oh, wow.  Is there another limit on the total number of bgworkers?

There currently are three GUCs that control bgworkers:

max_background_workers
min_spare_background_workers
max_spare_background_workers

The first replaces the former autovacuum_max_workers GUC. As before, it
is an overall limit, much like max_connections.

The latter two are additional. They are per-database lower and upper
limits for the number of idle workers at any point in time. These latter
two are what I'm referring to as the min/max approach, and what I'm
arguing cannot be replaced by a timeout without losing functionality.

Regards

Markus Wanner



Re: [HACKERS] Git conversion status

2010-09-21 Thread Andrew Dunstan



On 09/21/2010 02:10 PM, Tom Lane wrote:

Robert Haasrobertmh...@gmail.com  writes:

On Tue, Sep 21, 2010 at 1:32 PM, Alvaro Herrera
alvhe...@commandprompt.com  wrote:

I tried to follow the instructions on the Wiki but they didn't work.

Oops.  I left out a step.  Fixed.

While we're discussing possible errors on that page ... at the bottom of
the page under the multiple workdirs alternative are these recipes for
re-syncing your local checkouts:

git checkout REL9_0_STABLE
git pull

git checkout master
git reset --hard origin/master

Are the git checkout steps really needed, considering each workdir would
normally be on its target branch all the time?  If so, what are they
accomplishing exactly?  I don't think I've entirely internalized what
that command does.



What I'm planning (unless someone convinces me it's a really bad idea) 
doesn't quite match any of the patterns on the wiki.


It's kinda like this:

   git clone --mirror ssh://g...@gitmaster.postgresql.org/postgresql.git
   git clone postgresql.git pg_rel9_0

   cd pg_rel9_0
   git checkout REL9_0_STABLE
   git remote set-url --push origin ssh://g...@gitmaster.postgresql.org/postgresql.git


with a cron job to do a git fetch -q fairly frequently on the mirror. 
That way I'll pull from the local mirror but push back to the remote 
master. (git remote set-url --push is a relatively recent addition to 
the ever changing git landscape.)


cheers

andrew


Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:21, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-21 at 11:27 -0400, Tom Lane wrote:
 rather than global ignore patterns for *.a and *.so.[0-9]

 Probably rather *.so.[0-9.]+

Any particular reason not to just do .so.*?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] moving development branch activity to new git repo

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:28, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-21 at 20:04 +0200, Magnus Hagander wrote:
 The cleanest is probably if I wipe the repo on git.postgresql.org for
 you, and you then re-push from scratch.

 We probably need a solution that doesn't require manual intervention for
 everyone separately.

Are there really that many? If nothing else, it's a good way to figure
out which repos are actually used ;)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] Make tuples_per_page pr. table configureable.

2010-09-21 Thread Jesper Krogh

Hi.

This is a follow up and updated patch on several old discussions:
http://archives.postgresql.org/pgsql-hackers/2009-07/msg01065.php
http://archives.postgresql.org/pgsql-admin/2010-04/msg00164.php
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00831.php
First patch:
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00096.php

Currently the aim for the number of tuples per page is 4, meaning an
individual tuple must exceed 2KB in size before the tuple toaster kicks
in. This patch makes that target tunable on a per-table basis.

The main reasoning is that if people have knowledge about the usage
pattern of their database, they can have huge benefit in tuning
TOAST to be more or less aggressive. This is obviously true if:

* The dataset isn't entirely memory cached, and
* columns stored in main (and plain visibility checking) are accessed more
  frequently than data columns stored in TOAST.

But even in the case where the dataset is entirely memory cached, this
tunable can give the database widely different performance numbers than
currently. This typically happens in cases where only visibility checks
are done (select count(*)) and when aggregates on columns stored in main
are used.

I must admit that I have chosen a poor test data set, since based on the
average length of the tuple the sweet spot is just around the current
default, but constructing a dataset with an average tuple size above
2.5KB would absolutely benefit. But I hope that people can see the
benefit anyway. The dataset is 500,000 records in a table with:

id serial,
code text, (small text block)
entry text (larger text block)

where code satisfies length(code) < 10, and entry:

          avg          | max  | min
-----------------------+------+------
 3640.2042755914488171 | 8708 | 1468

The queries are run multiple times and numbers are based on runs where
iowait was 0 while the query executed, so these are entirely memory- and
CPU-bound numbers:


testdb=# select * from data order by tuples_per_page;
 time_sum_length | time_count | tuples_per_page | main_size | toast_size
-----------------+------------+-----------------+-----------+------------
        5190.258 |     689.34 |               1 | 1981MB    | 0MB
        5478.519 |    660.841 |               2 | 1894MB    | 0MB
        9740.768 |    481.822 |               3 | 1287MB    | 4MB
       12875.479 |     73.895 |     (default) 4 | 79MB      | 1226MB
       13082.768 |     58.023 |               8 | 29MB      | 1276MB
(5 rows)

time_sum_length = select sum(length(entry)) from data;
time_count = select count(*) from data;
All timings are in ms.

With this data

Command to set tuples_per_page is:
ALTER TABLE tablename set (tuples_per_page = X)
where 1 <= X <= 32.

The patch really needs some feedback. I've tried to address Tom Lane's
earlier comment about fixing the place where it figures out whether it
needs a toast table (and actually tested that it works).

While there surely is more that can be done to improve the flexibility
in this area, I do think that there is sufficient benefit.

This is my second shot at coding C, so please let me know if I have been
doing anything wrong. Comments are all welcome.

Thanks.

--
Jesper
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 52b2dc8..ba36923 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -848,6 +848,27 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } ] TABLE [ IF NOT EXISTS ] repl
    </varlistentry>
 
    <varlistentry>
+    <term><literal>tuples_per_page</> (<type>integer</>)</term>
+    <listitem>
+     <para>
+      The tuples_per_page setting for a table is an integer between 1 and 32.
+      It instructs the database to aim for this number of tuples per page
+      (8KB) when updating or inserting rows, thereby tuning how aggressively
+      columns will be compressed and/or transferred to the corresponding
+      TOAST table. The default is 4, which aims for tuple sizes of less than
+      2KB. Tuning the number of tuples per page up will increase the density
+      of tuples in the main table, giving more speed for queries that only
+      fetch simple values or check visibility, at the cost of slower access
+      to the larger entries. Tuning it down will keep more tuple data in the
+      main table and thus give faster access to data that would otherwise
+      have been moved to TOAST. This functionality can be viewed as a way to
+      vertically partition data into two files.
+     </para>
+    </listitem>
+   </varlistentry>
+
+
+   <varlistentry>
     <term><literal>autovacuum_enabled</>, <literal>toast.autovacuum_enabled</literal> (<type>boolean</>)</term>
     <listitem>
      <para>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 1e619b1..6e6d0eb 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -15,6 +15,7 @@
 
 #include "postgres.h"
 

Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:29, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 Have we decided to do this? If so, I'll start backpatching it...

 Yeah, go for it.

 BTW, a look at the recommended GitExclude on the wiki suggests that
 we need these two additional global exclusions:

        *.mo            ... for NLS builds
        *.dylib         ... Darwin spelling of *.so

Added to list.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 20:21, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-21 at 11:27 -0400, Tom Lane wrote:
 rather than global ignore patterns for *.a and *.so.[0-9]
 
 Probably rather *.so.[0-9.]+

 Any particular reason not to just do .so.*?

Just paranoia, I guess.  I can't actually see a reason why we'd have
any committable files in the tree matching that pattern.  OTOH, we
probably also need the same type of pattern for .sl and .dylib,
so at some point a more conservative pattern would be wise.

regards, tom lane



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 20:59, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 20:21, Peter Eisentraut pete...@gmx.net wrote:
 On tis, 2010-09-21 at 11:27 -0400, Tom Lane wrote:
 rather than global ignore patterns for *.a and *.so.[0-9]

 Probably rather *.so.[0-9.]+

 Any particular reason not to just do .so.*?

 Just paranoia, I guess.  I can't actually see a reason why we'd have
 any committable files in the tree matching that pattern.  OTOH, we
 probably also need the same type of pattern for .sl and .dylib,
 so at some point a more conservative pattern would be wise.

Do we know what the exact pattern would be for .sl and .dylib? Are
they following the same basic pattern of .sl.major.minor?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Git conversion status

2010-09-21 Thread Abhijit Menon-Sen
At 2010-09-21 12:45:20 -0400, t...@sss.pgh.pa.us wrote:
 
 Having done that, I now realize that the historical tag release-6-3
 is identical to what I applied as REL6_3.  It would probably be
 reasonable to remove release-6-3, if that's still possible, but
 I'm not clear on how.

You can safely delete the tag from the upstream repository with:

git push origin :refs/tags/release-6-3

New clones of the repository will not see that tag, but existing clones
will continue to have it. Anyone who runs git push --tags from such a
clone without deleting the tag manually (git tag -d release-6-3) will,
however, restore the tag upstream.

I'd say it's not worth the bother.

-- ams



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 20:59, Tom Lane t...@sss.pgh.pa.us wrote:
 Just paranoia, I guess.  I can't actually see a reason why we'd have
 any committable files in the tree matching that pattern.  OTOH, we
 probably also need the same type of pattern for .sl and .dylib,
 so at some point a more conservative pattern would be wise.

 Do we know what the exact pattern would be for .sl and .dylib? Are
 they following the same basic pattern of .sl.major.minor?

Yes, they'll be just the same --- Makefile.shlib treats all those
extensions alike.

regards, tom lane



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Magnus Hagander
On Tue, Sep 21, 2010 at 21:32, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Tue, Sep 21, 2010 at 20:59, Tom Lane t...@sss.pgh.pa.us wrote:
 Just paranoia, I guess.  I can't actually see a reason why we'd have
 any committable files in the tree matching that pattern.  OTOH, we
 probably also need the same type of pattern for .sl and .dylib,
 so at some point a more conservative pattern would be wise.

 Do we know what the exact pattern would be for .sl and .dylib? Are
 they following the same basic pattern of .sl.major.minor?

 Yes, they'll be just the same --- Makefile.shlib treats all those
 extensions alike.

Hmm. Hold on.

My gitignore manpage doesn't say anything about supporting regular
expressions at all. And actually adding the line proposed by Peter
doesn't work.

What works is adding all of:
*.so
*.so.[0-9]
*.so.[0-9].[0-9]

That will break if there's a two-digit number, I guess. Do we want to
go with that anyway?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] .gitignore files, take two

2010-09-21 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 My gitignore manpage doesn't say anything about supporting regular
 expressions at all. And actually adding the line proposed by Peter
 doesn't work.

Yeah, I was wondering about that.  They're meant to be shell patterns
not regexps, I think.

 What works is adding all of:
 *.so
 *.so.[0-9]
 *.so.[0-9].[0-9]

 That will break if there's a two-digit number, i guess. Do we want to
 go with that anyway?

What we can do, when and if any of those numbers get to two digits,
is add

*.so.[0-9][0-9]

etc etc.  Which would not need to be back-patched.  So let's just go in
that direction.

regards, tom lane



  1   2   >