Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Greg Stark
On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote:
 I feel like we're getting off in the weeds, here.  Obviously, the user
 would ideally like the connection to the master to last forever, but
 equally obviously, if the master unexpectedly reboots, they'd like the
 slave to notice - ideally within some reasonable time period - that it
 needs to reconnect.



  There's no perfect way to distinguish "the master
 croaked" from "the network administrator unplugged the Ethernet cable
 and is planning to plug it back in any hour now", so we'll just need
 to pick some reasonable timeout and go with it.



-- 
greg



Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-21 Thread Fujii Masao
On Wed, Jun 16, 2010 at 5:06 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jun 15, 2010 at 3:57 PM, Josh Berkus j...@agliodbs.com wrote:
 I wonder if it would be possible to jigger things so that we send the
 WAL to the standby as soon as it is generated, but somehow arrange
 things so that the standby knows the last location that the master has
 fsync'd and never applies beyond that point.

 I can't think of any way which would not require major engineering.  And
 you'd be slowing down replication *in general* to deal with a fairly
 unlikely corner case.

 I think the panic is the way to go.

 I have yet to convince myself of how likely this is to occur.  I tried
 to reproduce this issue by crashing the database, but I think in 9.0
 you need an actual operating system crash to cause this problem, and I
 haven't yet set up an environment in which I can repeatedly crash the
 OS.  I believe, though, that in 9.1, we're going to want to stream
 from WAL buffers as proposed in the patch that started out this
 thread, and then I think this issue can be triggered with just a
 database crash.

 In 9.0, I think we can fix this problem by (1) only streaming WAL that
 has been fsync'd and (2) PANIC-ing if the problem occurs anyway.  But
 in 9.1, with sync rep and the performance demands that entails, I
 think that we're going to need to rethink it.

The problem is not that the master streams non-fsync'd WAL, but that the
standby can replay that. So I'm thinking that we can send non-fsync'd WAL
safely if the standby makes the recovery wait until the master has fsync'd
WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
location to walreceiver, and the standby applies only the WAL which the
master has already fsync'd. Thought?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-21 Thread Thom Brown
On 6 June 2010 21:04, Alexander Korotkov aekorot...@gmail.com wrote:
 Hello hackers,
 I would like to share some of my thoughts about the usage of multidimensional
 indexes for queries which deal with ordinal unidimensional data types. I
 think that gist indexes (especially with knngist) can produce great benefit
 for complex multi-criterion queries.
 Let's consider some example. I use postgresql-9.0beta1 with the knngist patch.
 Also I have created a simple patch that allows knngist to be used for ordinal
 sorting in the cube extension (the patch is attached). The *> operator was
 introduced in my patch. The first operand is the cube and the second operand
 is the number n. If n = 2*k then ascending ordering by the k-th dimension
 occurs. If n = 2*k + 1 then descending ordering by the k-th dimension occurs.
 Currently this operator has a limitation and works only with nonnegative
 coordinate values.
 Let's create a table with 3 floating-point columns and fill it with 10M rows:
 create table test (id serial primary key, v1 double precision, v2 double
 precision, v3 double precision);
 insert into test (v1,v2,v3) (select random()*1000, random()*1000,
 random()*1000 from generate_series(1,10000000,1));
 Now, let's create 3 separate btree indexes and one gist cube index.
 create index test_v1_idx on test(v1);
 create index test_v2_idx on test(v2);
 create index test_v3_idx on test(v3);
 create index test_cube_idx on test using gist(cube(ARRAY[v1,v2,v3]));
 Let's consider some complex query with filtering, ordering and limit.
 test=# select * from test where v1 between 480 and 500 and v2 between 480
 and 500 order by v3 limit 10;
     id    |        v1        |        v2        |        v3
 ----------+------------------+------------------+--------------------
  12283631 | 485.982828773558 | 496.795611456037 | 0.213871244341135
   4936086 | 497.239370364696 | 491.878624074161 |  1.26481195911765
   8963067 | 484.963194001466 | 497.094289399683 |  1.30057940259576
  12435440 | 498.670902103186 | 498.667187988758 |  1.33110675960779
  11667415 | 494.398592971265 | 497.440234292299 |  1.44533207640052
   8530558 |  482.85893118009 | 496.267838869244 |  1.48530444130301
   4004942 | 483.679085504264 | 489.547223784029 |  1.57393841072917
  14897796 |  491.37338064611 | 487.47524273 |  1.81775307282805
   4105759 | 489.506138022989 |  486.91446846351 |  1.94038823246956
  12895656 | 499.508572742343 | 487.065799534321 |  2.34963605180383
 (10 rows)
 test=# explain analyze select * from test where v1 between 480 and 500 and
 v2 between 480 and 500 order by v3 limit 10;

                                     QUERY PLAN
 --------------------------------------------------------------------------------
  Limit  (cost=22786.73..22786.75 rows=10 width=28) (actual
 time=3242.135..3242.162 rows=10 loops=1)
     ->  Sort  (cost=22786.73..22797.59 rows=4345 width=28) (actual
 time=3242.131..3242.144 rows=10 loops=1)
           Sort Key: v3
           Sort Method:  top-N heapsort  Memory: 25kB
           ->  Bitmap Heap Scan on test  (cost=8755.91..22692.83 rows=4345
 width=28) (actual time=1281.030..3234.934 rows=4027 loops=1)
                 Recheck Cond: ((v1 >= 480::double precision) AND (v1 <=
 500::double precision) AND (v2 >= 480::double precision) AND (v2 <=
 500::double precision))
                 ->  BitmapAnd  (cost=8755.91..8755.91 rows=4345 width=0)
 (actual time=1280.783..1280.783 rows=0 loops=1)
                       ->  Bitmap Index Scan on test_v1_idx
  (cost=0.00..4243.12 rows=202177 width=0) (actual time=644.702..644.702
 rows=200715 loops=1)
                             Index Cond: ((v1 >= 480::double precision) AND
 (v1 <= 500::double precision))
                       ->  Bitmap Index Scan on test_v2_idx
  (cost=0.00..4510.37 rows=214902 width=0) (actual time=630.085..630.085
 rows=200200 loops=1)
                             Index Cond: ((v2 >= 480::double precision) AND
 (v2 <= 500::double precision))
  Total runtime: 3242.253 ms
 (12 rows)
 This query can be rewritten in order to let planner use gist cube index.
 test=# select * from test where cube(array[v1,v2,v3]) <@
 cube(array[480,480,'-Infinity'::float],array[500,500,'+Infinity'::float])
 order by cube(array[v1,v2,v3]) *> 4 limit 10;
     id    |        v1        |        v2        |        v3
 ----------+------------------+------------------+--------------------
  12283631 | 485.982828773558 | 496.795611456037 | 0.213871244341135
   4936086 | 497.239370364696 | 491.878624074161 |  1.26481195911765
   8963067 | 484.963194001466 | 497.094289399683 |  1.30057940259576
  12435440 | 498.670902103186 | 498.667187988758 |  1.33110675960779
  11667415 | 494.398592971265 | 497.440234292299 |  1.44533207640052
   8530558 |  482.85893118009 | 496.267838869244 |  1.48530444130301
   4004942 | 483.679085504264 | 489.547223784029 |  1.57393841072917
  14897796 |  491.37338064611 | 487.47524273 |  1.81775307282805
   4105759 | 489.506138022989 |  486.91446846351 |  

Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-21 Thread Heikki Linnakangas

On 21/06/10 12:08, Fujii Masao wrote:

On Wed, Jun 16, 2010 at 5:06 AM, Robert Haas robertmh...@gmail.com  wrote:

In 9.0, I think we can fix this problem by (1) only streaming WAL that
has been fsync'd and (2) PANIC-ing if the problem occurs anyway.  But
in 9.1, with sync rep and the performance demands that entails, I
think that we're going to need to rethink it.


The problem is not that the master streams non-fsync'd WAL, but that the
standby can replay that. So I'm thinking that we can send non-fsync'd WAL
safely if the standby makes the recovery wait until the master has fsync'd
WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
location to walreceiver, and the standby applies only the WAL which the
master has already fsync'd. Thought?


I guess, but you have to be very careful to correctly refrain from 
applying the WAL. For example, a naive implementation might write the 
WAL to disk in walreceiver immediately, but refrain from telling the 
startup process about it. If walreceiver is then killed because the 
connection is broken (and it will be because the master just crashed), 
the startup process will read the streamed WAL from the file in pg_xlog, 
and go ahead to apply it anyway.


So maybe there's some room for optimization there, but given the 
round-trip required for the acknowledgment anyway it might not buy you 
much, and the implementation is not very straightforward. This is 
clearly 9.1 material, if worth optimizing at all.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 4:37 AM, Greg Stark gsst...@mit.edu wrote:
 On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote:
 I feel like we're getting off in the weeds, here.  Obviously, the user
 would ideally like the connection to the master to last forever, but
 equally obviously, if the master unexpectedly reboots, they'd like the
 slave to notice - ideally within some reasonable time period - that it
 needs to reconnect.



   There's no perfect way to distinguish "the master
 croaked" from "the network administrator unplugged the Ethernet cable
 and is planning to plug it back in any hour now", so we'll just need
 to pick some reasonable timeout and go with it.

Eh... was there supposed to be some text here?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-21 Thread Greg Stark
On Mon, Jun 21, 2010 at 10:40 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I guess, but you have to be very careful to correctly refrain from applying
 the WAL. For example, a naive implementation might write the WAL to disk in
 walreceiver immediately, but refrain from telling the startup process about
 it. If walreceiver is then killed because the connection is broken (and it
 will be because the master just crashed), the startup process will read the
 streamed WAL from the file in pg_xlog, and go ahead to apply it anyway.

So the goal is that when you *do* failover to the standby it replays
these additional records. So whether the startup process obeys this
limit would have to be conditional on whether it's still in standby
mode.

 So maybe there's some room for optimization there, but given the round-trip
 required for the acknowledgment anyway it might not buy you much, and the
 implementation is not very straightforward. This is clearly 9.1 material, if
 worth optimizing at all.

I don't see any need for a round-trip acknowledgement -- no more than
currently. The master just includes the flush location in every
response. It might have to send additional responses, though, when
fsyncs happen, to update the flush location even if no additional
records are sent. Otherwise a hot standby might spend a long time with
out-dated data even if on failover it would be up to date, which seems
non-ideal for the hot standby users.

I think this would be a good improvement for databases processing
large batch updates so the standby doesn't have an increased risk of
losing a large amount of data if there's a crash after processing such
a large query. I agree it's 9.1 material.

Earlier we made a change to the WAL streaming protocol on the basis
that we wanted to get the protocol right even if we don't use the
change right away. I'm not sure I understand that -- it's not like
we're going to stream WAL from 9.0 to 9.1. But if that was true then
perhaps we need to add the WAL flush location to the protocol now even
if we're not going to use it yet?

-- 
greg



Re: [HACKERS] Keepalive for max_standby_delay

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 12:20 AM, Ron Mayer
rm...@cheapcomplexdevices.com wrote:
 Robert Haas wrote:
 On Wed, Jun 16, 2010 at 9:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sorry, I've been a bit distracted by other responsibilities (libtiff
 security issues for Red Hat, if you must know).  I'll get on it shortly.

 What?  You have other things to do besides hack on PostgreSQL?  Shocking!  
 :-)

 I suspect you're kidding, but in case some on the list didn't realize,
 Tom's probably as famous (if not moreso) in the image compression
 community as he is in the database community:

 http://www.jpeg.org/jpeg/index.html
 Probably the largest and most important contribution however was the work
  of the Independent JPEG Group (IJG), and Tom Lane in particular.

 http://www.w3.org/TR/PNG-Credits.html , http://www.w3.org/TR/PNG/
 PNG (Portable Network Graphics) Specification
  Version 1.0
  ...
  Contributing Editor
  Tom Lane, t...@sss.pgh.pa.us

 http://www.fileformat.info/format/tiff/egff.htm
 ... by Dr. Tom Lane of the Independent JPEG Group, a member of the
  TIFF Advisory Committee

Yes, I was joking, hence the smiley.  I did know he was involved in
the above, although I confess I didn't know to what degree... or that
he had a doctorate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 5:19 AM, Thom Brown thombr...@gmail.com wrote:
 I can't answer this, but is anyone else able to provide Alexander some 
 feedback?

It seems like you can get more or less the same benefit from a
multicolumn btree index.  On my system, with the individual btree
indices, the query ran in 7625 ms; with an additional index on (v1,
v2, v3), it ran in 94 ms.  I didn't get the same plans as Alexander
did, though, so it may not really be apples to apples.  See attached
session trace.

Having said that, I'm very interested in hearing what other ideas
people have for using indices to speed up ORDER BY operations.
Currently, we support only ORDER BY <indexed-value>.  KNNGIST will
allow ORDER BY <indexed-value> <op> <constant>, but why stop there?
In theory, an index might know how to order the data by any arbitrary
expression the user might happen to enter.  If the user asks for
ORDER BY (<indexed-value> <op1> <constant1>) <op2> <constant2>, who
is to say that we can't use an index scan to get that ordering
quickly?  (As a trivial case, suppose both ops are +, but there could
easily be more interesting ones.)  Or what about ORDER BY
somefunc(<indexed-value>)?  The trouble is that it's hard to think of
a way of teaching the planner about these cases without hard-coding
lots and lots of special-case kludges into the planner.  Still, if
someone has a clever idea...
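
To make the limitation concrete (using the test table and the test_v1_idx
index from Alexander's example upthread; the expression index name below is
made up for illustration): a btree on v1 already returns rows in an order
that satisfies ORDER BY v1 + 10, but the planner has no general way to know
that an arbitrary expression preserves the index ordering, so today an index
only helps if it matches the ORDER BY expression verbatim.

explain analyze select * from test order by v1 + 10 limit 10;
-- falls back to a full scan plus top-N sort, even though test_v1_idx
-- would deliver exactly the desired order

create index test_v1_plus_10_idx on test ((v1 + 10));
explain analyze select * from test order by v1 + 10 limit 10;
-- now the expression index can produce the ordering directly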

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
rhaas=# create table test (id serial primary key, v1 double precision, v2 
double precision, v3 double precision);
NOTICE:  CREATE TABLE will create implicit sequence "test_id_seq" for serial 
column "test.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for 
table "test"
CREATE TABLE
rhaas=# insert into test (v1,v2,v3) (select random()*1000, random()*1000, 
random()*1000 from generate_series(1,10000000,1));
INSERT 0 10000000
rhaas=# 
rhaas=# select * from test where v1 between 480 and 500 and v2 between 480 and 
500 order by v3 limit 10;
    id   |        v1        |        v2        |         v3         
---------+------------------+------------------+--------------------
 7707160 | 499.468160793185 | 497.105565853417 | 0.0262199901044369
 9836934 | 488.509620074183 | 489.591513760388 | 0.0412175431847572
 8299674 | 488.991918507963 | 494.098918512464 |  0.524408183991909
 4962322 | 484.455766621977 | 496.633755043149 |  0.686612911522388
 5253466 | 493.753412738442 | 481.965465471148 |  0.731946900486946
 3642389 |  496.36858003214 | 483.764411881566 |  0.750890467315912
 3332916 | 486.504513770342 | 492.682197596878 |  0.930964481085539
 3063189 | 483.532963320613 | 481.065005529672 |1.0284804739058
 6368341 | 493.383017368615 | 497.755419462919 |   1.20219821110368
 1587774 |  496.25625833869 | 484.364923555404 |   1.58307375386357
(10 rows)

rhaas=# explain analyze select * from test where v1 between 480 and 500 and v2 
between 480 and 500 order by v3 limit 10;
                                   QUERY PLAN                                   
---------------------------------------------------------------------------------
 Limit  (cost=273617.93..273617.96 rows=10 width=28) (actual 
time=7625.039..7625.042 rows=10 loops=1)
   ->  Sort  (cost=273617.93..273627.92 rows=3995 width=28) (actual 
time=7625.037..7625.039 rows=10 loops=1)
         Sort Key: v3
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Seq Scan on test  (cost=0.00..273531.60 rows=3995 width=28) 
(actual time=5.934..7620.583 rows=3995 loops=1)
               Filter: ((v1 >= 480::double precision) AND (v1 <= 500::double 
precision) AND (v2 >= 480::double precision) AND (v2 <= 500::double precision))
 Total runtime: 7625.101 ms
(7 rows)

rhaas=# create index x on test(v1,v2,v3);
CREATE INDEX
rhaas=# explain analyze select * from test where v1 between 480 and 500 and v2 
between 480 and 500 order by v3 limit 10;
  
                                   QUERY PLAN                                   
---------------------------------------------------------------------------------
 Limit  (cost=19892.64..19892.66 rows=10 width=28) (actual time=73.735..73.738 
rows=10 loops=1)
   ->  Sort  (cost=19892.64..19902.62 rows=3995 width=28) (actual 
time=73.733..73.735 rows=10 loops=1)
         Sort Key: v3
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on test  (cost=6850.60..19806.31 rows=3995 
width=28) (actual time=34.682..72.238 rows=3995 loops=1)
               Recheck Cond: ((v1 >= 480::double precision) AND (v1 <= 
500::double precision) AND (v2 >= 480::double precision) AND (v2 <= 500::double 
precision))
               ->  Bitmap Index Scan on x  

Re: [HACKERS] Proposal for 9.1: WAL streaming from WAL buffers

2010-06-21 Thread Simon Riggs
On Mon, 2010-06-21 at 18:08 +0900, Fujii Masao wrote:

 The problem is not that the master streams non-fsync'd WAL, but that the
 standby can replay that. So I'm thinking that we can send non-fsync'd WAL
 safely if the standby makes the recovery wait until the master has fsync'd
 WAL. That is, walsender sends not only non-fsync'd WAL but also WAL flush
 location to walreceiver, and the standby applies only the WAL which the
 master has already fsync'd. Thought?

Yes, good thought. The patch just applied seems too much.

I had the same thought, though it would mean you'd need to send two xlog
end locations, one for write, one for fsync. Though it's not really clear why
we send the current end of WAL on the server anyway, so maybe we can
just alter that.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Patch: psql \whoami option

2010-06-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Jun 20, 2010 at 10:51 PM, Steve Singer ssinger...@sympatico.ca 
 wrote:
 One comment I have on the output format is that values (ie the database
 name) are enclosed in double quotes but the values being quoted can contain
 double quotes that are not being escaped.

This is the same as standard practice in just about every other
message...

 It seems like for user and database it might be sensible to apply
 PQescapeIdentifier to the value before printing it.

I think this would actually be a remarkably bad idea in this particular
instance, because in the majority of cases psql does not apply
identifier dequoting rules to user and database names.  What is printed
should be the same as what you'd need to give to \connect, for example.

 The port is, I guess, being stored as a string, but doesn't it have to
 be an integer?  In which case, why quote it at all?

Agreed, no need for quotes there.

regards, tom lane



Re: [HACKERS] extensible enum types

2010-06-21 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:
  
Another thought: could we add a column to pg_type with a flag that's 
true if the oids are in sort order? Then the comparison routines could 
just look that up in the type cache and if it's true (as it often will 
be) just return the oid comparison.



Well, having to do a cache lookup already makes it a couple orders of
magnitude more expensive than an OID comparison.  However, it's hard to
say how much that matters in terms of total application performance.
We really could do with a bit of performance testing here ...


  


I have done some. The performance hit is fairly horrible. Adding cache 
lookups for the enum rows to the comparison routines made a REINDEX on a 
1m row table where the index is on an enum column (the enum has 500 
randomly ordered labels) jump from around 10s to around 70s. I think 
that probably rules out doing anything like this for the existing enum 
types. I think the most we can reasonably do there is to allow adding a 
label to the end of the enum list. I'm fairly resistant to doing 
something which will have a major performance impact, as I know there 
are users who are relying on enums for performance reasons. I'm also 
fairly resistant to doing things which will require table rewriting.
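
For anyone who wants to try something similar, here is a rough sketch of the
kind of test described above (the type, table, and index names are mine, not
the actual test script, and timings will obviously vary):

do $$
begin
  -- build an enum with 500 labels added in random order
  execute 'create type big_enum as enum (' ||
          (select string_agg(quote_literal('label_' || i), ',' order by random())
             from generate_series(1, 500) as i) || ')';
end $$;

create table enum_test (v big_enum);
insert into enum_test
  select ('label_' || (1 + (random() * 499))::int)::big_enum
  from generate_series(1, 1000000);
create index enum_test_v_idx on enum_test (v);

reindex index enum_test_v_idx;  -- time this, e.g. with psql's \timing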


So the question then is: do we want to allow lots of flexibility for 
positioning new labels, with significant degradation in comparison 
performance for a new enum variant, or have a new variant with some 
restrictions which probably won't impact most users but would have 
equivalent performance to the current enum family, or do nothing?



cheers

andrew



Re: [HACKERS] Patch: psql \whoami option

2010-06-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Is there really a point to the non-DSN format or should we just use
 the DSN format always?

BTW, didn't have an opinion on that to start with, but after thinking
about it I'd turn it around.  psql doesn't deal in DSN format anywhere
else, so why should it do so here?  To make the point more obvious,
what's the justification for printing DSN format and not, say, JDBC URL
format?  I'd vote for removing the DSN printout option, not the other
way round.  If there was some mechanically readable format to offer to
print, it would be conninfo string format, which you can actually use
with psql if you have a mind to.

regards, tom lane



Re: [HACKERS] extensible enum types

2010-06-21 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Tom Lane wrote:
 Well, having to do a cache lookup already makes it a couple orders of
 magnitude more expensive than an OID comparison.  However, it's hard to
 say how much that matters in terms of total application performance.
 We really could do with a bit of performance testing here ...

 I have done some. The performance hit is fairly horrible. Adding cache 
 lookups for the enum rows to the comparison routines made a REINDEX on a 
 1m row table where the index is on an enum column (the enum has 500 
 randomly ordered labels) jump from around 10s to around 70s.

Hmmm... that's bad, but I bet it's still less than the cost of comparing
NUMERICs.  Also, did you make any attempt to avoid repetitive cache
lookups by storing a pointer in fn_extra (cf array comparisons)?

regards, tom lane



[HACKERS] deprecating =>, take two

2010-06-21 Thread Robert Haas
By consensus, we have removed the new-to-9.0 operator text[] => text[]
and renamed the hstore => text[] operator.  (The current name is %,
but there is some discussion of %, some yet other name, or getting
rid of it altogether; please comment on that thread if you wish to
weigh in.)  This means that the only remaining => operator in CVS is
the text => text operator which constructs a single-element hstore,
which has been around since 8.2.  In lieu of providing a substitute
operator, Tom Lane proposed that we simply encourage people to use the
hstore(text, text) function which does the same thing:

http://archives.postgresql.org/pgsql-hackers/2010-06/msg00711.php
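
As a quick illustration of the substitution (assuming contrib/hstore is
installed; output shown as expected rather than verified here), both of the
following produce the single-pair hstore "a"=>"b":

select 'a' => 'b';          -- deprecated operator spelling
select hstore('a', 'b');    -- recommended function spelling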

Per that email, and subsequent concurrence, here is a series of
patches which does the following:

1. In CVS HEAD, document the hstore(text, text) function and adjust
CREATE OPERATOR to throw a warning when => is used as an operator
name, using the wording previously suggested by Tom.
2. In the back branches, add an hstore(text, text) function.  These
branches already have a tconvert(text, text) function which does the
same thing, but the consensus seemed to be that we do not want to go
back to the name tconvert() for this functionality, and that
back-patching the new name was preferable.
3. In 8.4 and 8.3, also add hstore(text, text) to the documentation.
8.2 appears to have no contrib documentation.

Barring vigorous objections, I will apply these tomorrow so that we
can consider deprecating => as an operator name in 9.1, for better
compliance with the SQL standard.

http://archives.postgresql.org/pgsql-hackers/2010-05/msg01501.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


hstore_82.patch
Description: Binary data


hstore_83.patch
Description: Binary data


hstore_84.patch
Description: Binary data


hstore_90.patch
Description: Binary data



Re: [HACKERS] extensible enum types

2010-06-21 Thread Andrew Dunstan



Tom Lane wrote:
Adding cache 
lookups for the enum rows to the comparison routines made a REINDEX on a 
1m row table where the index is on an enum column (the enum has 500 
randomly ordered labels) jump from around 10s to around 70s.



Hmmm... that's bad, but I bet it's still less than the cost of comparing
NUMERICs.  Also, did you make any attempt to avoid repetitive cache
lookups by storing a pointer in fn_extra (cf array comparisons)?


  


No. Will work on that. Thanks.

cheers

andrew



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread David E. Wheeler
On Jun 21, 2010, at 9:20 AM, Robert Haas wrote:

 Per that email, and subsequent concurrence, here is a series of
 patches which does the following:
 
 1. In CVS HEAD, document the hstore(text, text) function and adjust
 CREATE OPERATOR to throw a warning when => is used as an operator
 name, using the wording previously suggested by Tom.
 2. In the back branches, add an hstore(text, text) function.  These
 branches already have a tconvert(text, text) function which does the
 same thing, but the consensus seemed to be that we do not want to go
 back to the name tconvert() for this functionality, and that
 back-patching the new name was preferable.
 3. In 8.4 and 8.3, also add hstore(text, text) to the documentation.
 8.2 appears to have no contrib documentation.

+1

I was just about to email asking where this was. Glad I checked for new mail 
first. :-)

 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

So will the CREATE OPERATOR code be updated to issue the warning, rather than 
just for the case of hstore's => operator?

Best,

David




Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Alvaro Herrera
Excerpts from Robert Haas's message of lun jun 21 12:20:59 -0400 2010:

 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

Maybe this is just a matter of semantics, but I thought we were going to
deprecate => in 9.0 so that people started to avoid its use altogether.
Why wait till 9.1 to recommend avoidance?

I had imagined that 9.1 was going to ban => altogether.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Upgrade procedure for 9.0 with HS/SR ... ?

2010-06-21 Thread Marc G. Fournier


What is the recommended procedure for this?

For instance, normally I would do a dump, upgrade, reload, when dealing 
with a single server, just to make sure all my system tables and such are 
clean ... but, if I have HS/SR setup to a slave, what is the recommended 
method of doing an upgrade?


This will be of more concern later, I imagine, when we're dealing with a 
9.0 -> 9.1 upgrade ...



Marc G. FournierHub.Org Hosting Solutions S.A.
scra...@hub.org http://www.hub.org

Yahoo: yscrappy    Skype: hub.org    ICQ: 7615664    MSN: scra...@hub.org



Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Robert Haas
On Sun, Jun 20, 2010 at 5:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On a quick read, I think I see a problem with this: if a parameter is
 specified with a non-zero value and there is no OS support available
 for that parameter, it's an error.  Presumably, for our purposes here,
 we'd prefer to simply ignore any parameters for which OS support is
 not available.  Given the nature of these parameters, one might argue
 that's a more useful behavior in general.

 Also, what about Windows?

 Well, of course that patch hasn't been reviewed yet ... but shouldn't we
 just be copying the existing server-side behavior, as to both points?

The existing server-side behavior is apparently to do elog(LOG) if a
given parameter is unsupported; I'm not sure what the equivalent for
libpq would be.

The current code does not seem to have any special cases for Windows
in this area, but that doesn't tell me whether it works or not.  It
looks like Windows must at least report success when you ask to turn
on keepalives, but whether it actually does anything, and whether
the extra parameters exist/work, I can't tell.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



[HACKERS] Cannot cancel the change of a tablespace

2010-06-21 Thread Guillaume Lelarge
Hi,

Today, I tried to cancel the change of a tablespace for a table (ALTER
TABLE ... SET TABLESPACE). I got the "Cancel request sent" message, but the
query continued and finally succeeded. It was a big issue for my customer,
and I wanted to look more into it. So, I took a look at the source code
and found we didn't check for interrupts in this part of the code. I
added them, and it seems to work as I wanted.

I added a CHECK_FOR_INTERRUPTS call in the copy_relation_data(),
copy_dir(), and copy_file() functions. Works for me on ALTER TABLE ...
SET TABLESPACE and ALTER DATABASE ... SET TABLESPACE, in 9.0 and 8.4.
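
To reproduce the scenario (hypothetical table and tablespace names; on 9.0
the pg_stat_activity columns are procpid and current_query), run the move in
one session and try to cancel it from another -- without the added checks the
statement simply runs to completion:

-- session 1
alter table big_table set tablespace other_ts;

-- session 2
select pg_cancel_backend(procpid)
  from pg_stat_activity
 where current_query like 'alter table big_table set tablespace%';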

Not sure we really want that change, and it doesn't feel like a bug to me.
Should I add it to the next commitfest?

Comments?


-- 
Guillaume
 http://www.postgresql.fr
 http://dalibo.com
Index: src/backend/commands/tablecmds.c
===
RCS file: /opt/cvsroot_postgresql/pgsql/src/backend/commands/tablecmds.c,v
retrieving revision 1.330
diff -c -p -c -r1.330 tablecmds.c
*** src/backend/commands/tablecmds.c	28 Apr 2010 16:10:41 -	1.330
--- src/backend/commands/tablecmds.c	21 Jun 2010 16:33:30 -
*** copy_relation_data(SMgrRelation src, SMg
*** 7049,7054 
--- 7049,7057 
  
  	for (blkno = 0; blkno < nblocks; blkno++)
  	{
+ /* If we got a cancel signal during the copy of the data, quit */
+ CHECK_FOR_INTERRUPTS();
+ 
  		smgrread(src, forkNum, blkno, buf);
  
  		/* XLOG stuff */
Index: src/port/copydir.c
===
RCS file: /opt/cvsroot_postgresql/pgsql/src/port/copydir.c,v
retrieving revision 1.36
diff -c -p -c -r1.36 copydir.c
*** src/port/copydir.c	1 Mar 2010 14:54:00 -	1.36
--- src/port/copydir.c	21 Jun 2010 16:33:30 -
***
*** 23,28 
--- 23,29 
  #include <sys/stat.h>
  
  #include "storage/fd.h"
+ #include "miscadmin.h"
  
  /*
   *	On Windows, call non-macro versions of palloc; we can't reference
*** copydir(char *fromdir, char *todir, bool
*** 67,72 
--- 68,76 
  
  	while ((xlde = ReadDir(xldir, fromdir)) != NULL)
  	{
+ /* If we got a cancel signal during the copy of the directory, quit */
+ CHECK_FOR_INTERRUPTS();
+ 
  		struct stat fst;
  
  		if (strcmp(xlde->d_name, ".") == 0 ||
*** copy_file(char *fromfile, char *tofile)
*** 172,177 
--- 176,184 
  	 */
  	for (offset = 0;; offset += nbytes)
  	{
+ /* If we got a cancel signal during the copy of the file, quit */
+ CHECK_FOR_INTERRUPTS();
+ 
  		nbytes = read(srcfd, buffer, COPY_BUF_SIZE);
  		if (nbytes < 0)
  			ereport(ERROR,



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 12:40 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of lun jun 21 12:20:59 -0400 2010:

 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

 Maybe this is just a matter of semantics, but I thought we were going to
 deprecate => in 9.0 so that people started to avoid its use altogether.
 Why wait till 9.1 to recommend avoidance?

 I had imagined that 9.1 was going to ban => altogether.

Sorry, bad phrasing on my part.  Your understanding matches mine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 12:37 PM, David E. Wheeler da...@kineticode.com wrote:
 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

 So will the CREATE OPERATOR code be updated to issue the warning, rather than 
 just for the case of hstore's => operator?

Yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] extensible enum types

2010-06-21 Thread Kevin Grittner
Peter Geoghegan peter.geoghega...@gmail.com wrote:
 
 In my experience, lookup tables generally have two columns, an
 integer PK and a description/state.
 
Eek.  If that's what you consider a lookup table, I wouldn't
advocate their use for anything.  Ever.  Period.
 
-Kevin



Re: [HACKERS] About tapes

2010-06-21 Thread mac_man2...@hotmail.it

On 21/06/2010 04:25, Tom Lane wrote:

No.  You could do that if the rate at which you need to write data to
the file is <= the rate at which you extract it.  But for what we
are doing, namely merging runs from several tapes into one output run,
it's pretty much guaranteed that you need new space faster than you are
consuming data from any one input tape.  It balances out as long as you
keep *all* the tapes in one operating-system file; otherwise not.

regards, tom lane

   

Tom, I hope you can clarify the issue of the rates.

During the initialisation phase (loading blocks into the heap) of course we 
can mark as garbage more space than we are consuming (since we haven't 
yet begun merging blocks). The time to do that is after prereading as 
many tuples as possible. Of course even during the algorithm we cannot 
output more tuples than we preread. So there is no problem in terms of 
the total number of tuples read and output: at each time, read tuples are >= 
output tuples.


Of course, in this case, output blocks should be placed in the free 
space spread around the various files and we should keep track of this 
placement.


But, recall that even in case of using a LogicalTapeSet we should keep 
track of the output blocks, as Robert said in his example.


What's wrong in my picture?

Thank you.
Manolo.



Re: [HACKERS] extensible enum types

2010-06-21 Thread Simon Riggs
On Mon, 2010-06-21 at 12:04 -0500, Kevin Grittner wrote:
 Peter Geoghegan peter.geoghega...@gmail.com wrote:
  
  In my experience, lookup tables generally have two columns, an
  integer PK and a description/state.
  
 Eek.  If that's what you consider a lookup table, I wouldn't
 advocate their use for anything.  Ever.  Period.

Do you mean you don't use relational modelling, or do you mean you would
never implement your physical database that way because of the
performance impact of RI on PostgreSQL? Or?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

Two documentation comments:

1. Perhaps, rather than

 +   The <literal>=&gt;</literal> operator is deprecated and may be removed in a
 +   future release.  The use of the <literal>hstore(text, text)</literal>
 +   function is recommended as an alternative.

write

 +   The <literal>=&gt;</literal> operator is deprecated and will be removed in a
 +   future release.  Use the <literal>hstore(text, text)</literal>
 +   function instead.

in particular, s/may/will/ and avoid passive voice in the second sentence.

2. The 8.4 and 8.3 doc patches should include this same paragraph.

regards, tom lane



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 1:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Barring vigorous objections, I will apply these tomorrow so that we
 can consider deprecating => as an operator name in 9.1, for better
 compliance with the SQL standard.

 Two documentation comments:

 1. Perhaps, rather than

 +   The <literal>=&gt;</literal> operator is deprecated and may be removed in a
 +   future release.  The use of the <literal>hstore(text, text)</literal>
 +   function is recommended as an alternative.

 write

 +   The <literal>=&gt;</literal> operator is deprecated and will be removed in a
 +   future release.  Use the <literal>hstore(text, text)</literal>
 +   function instead.

 in particular, s/may/will/ and avoid passive voice in the second sentence.

Avoiding the passive voice is a good idea, and I like your suggested
phrasing.  I'm reluctant to say what we will do in a future release
because we don't always do what we claim we'll do.  For example, we're
planning to remove contrib/xml2 in 8.4.

http://developer.postgresql.org/pgdocs/postgres/xml2.html

Still, I haven't heard too many arguments against disallowing => as an
operator, so perhaps it's safe to bank on it actually happening in
this case?

 2. The 8.4 and 8.3 doc patches should include this same paragraph.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] hstore == and deprecate =>

2010-06-21 Thread David E. Wheeler
On Jun 17, 2010, at 1:30 PM, Florian Pflug wrote:

 How about turning it into a function
hstore hstore(hstore, text[])
 instead?

I just searched through the 2008 spec for a slice/subset operator and came up 
empty. It seems to define a bunch of predicates for multisets, but not much for 
arrays.

And looking again at the options, I'm *okay* with %, but not keen on % anymore 
(I could see a future where % and % act as complements to @> and <@ by confirming 
the presence of keys in an hstore:

bool = hstore % text[];

So, frankly, I'm coming back to what Florian has suggested here. What about 
calling it slice?

hstore = slice(hstore, text[]);

It'd be nice to have one for arrays, too:

anyarray[] = slice(anyarray[], int[]);

An operator could always be added later if a good one appeared.
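
For what it's worth, a sketch of what a call to the proposed function might
look like (semantics as described in this thread -- keep only the requested
keys; output shown as expected, not verified):

select slice('a=>1, b=>2, c=>3'::hstore, array['b', 'c']);
-- expected result: "b"=>"2", "c"=>"3"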

Okay, no more bikeshedding for me on this issue. I'm covered in paint.

Best,

David


Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:

 By consensus, we have removed the new-to-9.0 operator text[] => text[]
 and renamed the hstore => text[] operator.  (The current name is %,
 but there is some discussion of %, some yet other name, or getting
 rid of it altogether; please comment on that thread if you wish to
 weigh in.)  

Hey, you're asking for bikeshedding! % would be my choice too.
-- 
dim



Re: [HACKERS] hstore == and deprecate =>

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 1:37 PM, David E. Wheeler da...@kineticode.com wrote:
 On Jun 17, 2010, at 1:30 PM, Florian Pflug wrote:

 How about turning it into a function
    hstore hstore(hstore, text[])
 instead?

 I just searched through the 2008 spec for a slice/subset operator and came up 
 empty. It seems to define a bunch of predicates for multisets, but not much 
 for arrays.

 And looking again at the options, I'm *okay* with %, but not keen on % 
 anymore (I could see a future where % and % act as complements to @> and <@ by 
 confirming the presence of keys in an hstore:

    bool = hstore % text[];

 So, frankly, I'm coming back to what Florian has suggested here. What about 
 calling it slice?

    hstore = slice(hstore, text[]);

 It'd be nice to have one for arrays, too:

    anyarray[] = slice(anyarray[], int[]);

 An operator could always be added later if a good one appeared.

 Okay, no more bikeshedding for me on this issue. I'm covered in paint.

I don't much like hstore(hstore, text[]) because it's not strictly a
constructor.  But I could certainly live with something based on the
word slice.  The existing SQL function backing the operator is called
slice_hstore(), whereas I would probably prefer hstore_slice() or just
slice(), but I can't talk about it right now because I have to go
finish laundering the paint out of my entire wardrobe.  Having already
written three patches to rename this operator (to three different
names), I'm in no hurry to write a fourth unless the degree of
consensus is sufficient to convince me I shan't need to write a fifth
one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 1:46 PM, Dimitri Fontaine
dfonta...@hi-media.com wrote:
 Robert Haas robertmh...@gmail.com writes:

 By consensus, we have removed the new-to-9.0 operator text[] => text[]
 and renamed the hstore => text[] operator.  (The current name is %,
 but there is some discussion of %, some yet other name, or getting
 rid of it altogether; please comment on that thread if you wish to
 weigh in.)

 Hey, you're asking for bikeshedding! % would be my choice too.

The point was that if you want to bikeshed, please do it on the OTHER
thread, not this one.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 21, 2010 at 1:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 in particular, s/may/will/ and avoid passive voice in the second sentence.

 Avoiding the passive voice is a good idea, and I like your suggested
 phrasing.  I'm reluctant to say what we will do in a future release
 because we don't always do what we claim we'll do.

True.

 Still, I haven't heard too many arguments against disallowing => as an
 operator, so perhaps it's safe to bank on it actually happening in
 this case?

AFAICS, the only way we won't do it is if the SQL committee reverses
field on the syntax they're using.  I'm not going to promise that it
will change in 9.1 --- we might wait longer --- but once => is official
standard syntax the pressure to do it will be high.

regards, tom lane



Re: [HACKERS] hstore == and deprecate =>

2010-06-21 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 So, frankly, I'm coming back to what Florian has suggested here. What about 
 calling it slice?

 hstore = slice(hstore, text[]);

+1, particularly seeing that our solution for the other two cases also
comes down to use the function instead.

regards, tom lane



Re: [HACKERS] hstore == and deprecate =>

2010-06-21 Thread David E. Wheeler
On Jun 21, 2010, at 10:46 AM, Robert Haas wrote:

 I don't much like hstore(hstore, text[]) because it's not strictly a
 constructor.  But I could certainly live with something based on the
 word slice.  The existing SQL function backing the operator is called
 slice_hstore(), whereas I would probably prefer hstore_slice() or just
 slice(), but I can't talk about it right now because I have to go
 finish laundering the paint out of my entire wardrobe.  Having already
 written three patches to rename this operator (to three different
 names), I'm in no hurry to write a fourth unless the degree of
 consensus is sufficient to convince me I shan't need to write a fifth
 one.

That seems wise. :-)

Best,

David




Re: [HACKERS] deprecating =>, take two

2010-06-21 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 The point was that if you want to bikeshed, please do it on the OTHER
 thread, not this one.  :-)

Ouch, asking for bikeshed and understanding what you read around… I
call that a trap ;)
-- 
dim



Re: [HACKERS] Upgrade procedure for 9.0 with HS/SR ... ?

2010-06-21 Thread Heikki Linnakangas

On 21/06/10 19:41, Marc G. Fournier wrote:

What is the recommended procedure for this?

For instance, normally I would do a dump, upgrade, reload, when dealing
with a single server, just to make sure all my system tables and such
are clean ... but, if I have HS/SR setup to a slave, what is the
recommended method of doing an upgrade?

This will be of more concern later, I imagine, when we're dealing with a
9.0 -> 9.1 upgrade ...


WAL shipping doesn't work across major versions, and I wouldn't 
recommend doing it across different minor versions either. So:


1. On master: dump, upgrade binaries, reload master,
2. On slave: Shutdown, upgrade binaries, take new base backup from 
master and reinitialize slave from it.


On a new minor version, I'd imagine that you upgrade the binaries on 
the slave first, then the master. The newer version can generally read WAL from 
the old version, but not necessarily vice versa. But check the release notes.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] dividing money by money

2010-06-21 Thread Kevin Grittner
Andy Balholm a...@balholm.com wrote:
 On May 30, 2010, at 6:53 AM, Kevin Grittner wrote:
 You would then generate a diff in context format and post to the
 -hackers list with that file as an attachment.  
 
 Here it is
 
=
Submission review
=
 
* Is the patch in context diff format?
 
Yes, although the line endings are Windows format (CR/LF).  The
patch utility on my system just ignored the CRs, but if they can be
filtered, all the better.
 
* Does it apply cleanly to the current CVS HEAD?
 
It does.
 
* Does it include reasonable tests, necessary doc patches, etc?
 
The doc patches seemed reasonable to me.  There were no test
patches; I'm not sure if they're necessary.
 

Usability review

 
** Read what the patch is supposed to do, and consider:
 
* Does the patch actually implement that?
 
Yes.
 
* Do we want that?
 
I think we do -- it allows easy casting between money and numeric,
and allows one number to be divided by another to get a ratio.
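
For illustration, the sort of usage the patch is meant to enable (a sketch
only; the exact result types are whatever the patch under review defines):

select '100.00'::money::numeric;          -- money to numeric
select 12.34::numeric::money;             -- numeric to money
select '20.00'::money / '8.00'::money;    -- ratio of one amount to another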
 
* Do we already have it?
 
There are work-arounds, but they are clumsy and error-prone.
 
* Does it follow SQL spec, or the community-agreed behavior?
 
There was discussion on the lists, and this patch implements the
consensus, as far as I can determine.
 
* Does it include pg_dump support (if applicable)?
 
Not applicable.
 
* Are there dangers?
 
None that I can see.
 
* Have all the bases been covered?
 
The only possible issue is that the cast from numeric to money lets
overflow be noticed and handled by the numeric_int8 function, which
puts out an error message on overflow that might be confusing
("ERROR:  bigint out of range").
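
A hypothetical example of the message in question (not actually run against
the patch, just inferred from the description above; money is stored as a
64-bit integer count of currency subunits, so a sufficiently large numeric
overflows it):

select 123456789012345678.90::numeric::money;
-- ERROR:  bigint out of range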
 

Feature test

 
** Apply the patch, compile it and test:
 
* Does the feature work as advertised?
 
Yes.
 
* Are there corner cases the author has failed to consider?
 
Just the content of the error message on the cast from numeric to
money (see above).  I'm not sure whether it's worth addressing that
since the money class silently yields the wrong value everywhere
else.  For example, if you cast the numeric to text and then cast it
to money, you'll quietly get the wrong amount rather than an error
-- the behavior of this patch on the cast from numeric seem like an
improvement compared to that; perhaps we should create a TODO entry
to include overflow checking with reasonable errors in *all* money
functions?  Alternatively, we could modify this cast to behave the
same as the cast from text, but that hardly seems like an
improvement.
 
* Are there any assertion failures or crashes?
 
No.
 
==
Performance review
==

* Does the patch slow down simple tests?
 
No.  It seems to provide a very slight performance improvement for
the tests run.  For example, a loop through a million casts of a
money literal to text runs about 1% slower than a cast of the same
money literal to numeric and then to text; which is reasonable
because it avoids the need to insert commas and a dollar sign. 
Given the number of tests, there's maybe a 10% chance that the
apparent slight improvement was just noise, but given the nature of
the patch, it seems reasonable to expect that there would be a
slight improvement.
 
* If it claims to improve performance, does it?
 
It makes no such claim.
 
* Does it slow down other things?
 
No.
 
=
Coding review
=
 
** Read the changes to the code in detail and consider:
 
* Does it follow the project coding guidelines?
 
The only issue is with the general guideline to make the new code
blend in with existing code:
 
http://wiki.postgresql.org/wiki/Submitting_a_Patch
 
| Generally, try to blend in with the surrounding code.
 
| Comments are for clarification not for delineating your code from
| the surroundings.
 
There are comments to set off the new code, and some of the new DATA
lines (and similar) are separated away from where one would expect
them to be if they had been included with the rest.  Moving a few
lines and deleting a few comment lines would resolve it.
 
* Are there portability issues?
 
I don't think so.
 
* Will it work on Windows/BSD etc?
 
I think so.
 
* Are the comments sufficient and accurate?
 
They seem so to me.
 
* Does it do what it says, correctly?
 
It looks like it both in reading the code and in testing.
 
* Does it produce compiler warnings?
 
No.
 
* Can you make it crash?
 
No.
 
===
Architecture review
===
 
** Consider the changes to the code in the context of the project as
** a whole:
 
* Is everything done in a way that fits together coherently with
* other features/modules?
 
Yes.
 
* Are there interdependencies that can cause problems?
 
No.
 
=
Review review
=
 
** Did the reviewer cover all the things that kind of reviewer is
** supposed to do?
 
I think so.
 
I'm going to set this back to Waiting on Author for the minor
rearrangement suggested in Coding review.  I welcome any comments

Re: [HACKERS] About tapes

2010-06-21 Thread Tom Lane
mac_man2...@hotmail.it mac_man2...@hotmail.it writes:
 Of course, in this case, output blocks should be placed in the free 
 space spread around the various files and we should keep track of this 
 placement.

And once you've done that, what benefit have you got over the current
design?  None that I can see.  It's only more complicated.

regards, tom lane



Re: [HACKERS] About tapes

2010-06-21 Thread mac_man2...@hotmail.it

Tom, you are right: it is just more complicated.

In fact, I did not mean to claim that it was easier or faster 
using one file per tape.
As you may remember, I just did not understand why you said it was 
*impossible* to recycle space in that case.


So, the conclusion is: you can recycle space when using one file per 
tape, but it is just more complicated than the current design, isn't it?


PD: are we sure it is more complicated?

Thanks.

Manolo.


Il 21/06/2010 21:27, Tom Lane ha scritto:


And once you've done that, what benefit have you got over the current
design?  None that I can see.  It's only more complicated.

regards, tom lane

   


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] what exactly is a PlaceHolderVar?

2010-06-21 Thread Robert Haas
I can't find any good documentation of this in the source tree
anywhere.  placeholder.c just says:

 *PlaceHolderVar and PlaceHolderInfo manipulation routines

and placeholder.h says:

 *prototypes for optimizer/util/placeholder.c.

...which is less than informative.  The commit message that introduced
them has a few details:

Add a concept of placeholder variables to the planner.  These
are variables
that represent some expression that we desire to compute below the top level
of the plan, and then let that value bubble up as though it were a plain
Var (ie, a column value).

...but I'm still having a hard time wrapping my head around it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] server authentication over Unix-domain sockets

2010-06-21 Thread Peter Eisentraut
On fre, 2010-06-11 at 08:07 -0400, Stephen Frost wrote:
 Having the option wouldn't do much unless users know of it and use it
 and it strikes me that this will very often not be the case.

That situation is the same as with SSL over TCP/IP with certificate
validation.  I don't think we can make either of these the default
without risking breaking a lot of things.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dividing money by money

2010-06-21 Thread Andy Balholm
On Jun 21, 2010, at 11:47 AM, Kevin Grittner wrote:

 Yes, although the line endings are Windows format (CR/LF).  

The line endings must have gotten changed in transit. My original diff used 
just LF. I made it on a Mac.

 The only issue is with the general guideline to make the new code
 blend in with existing code:
 
 http://wiki.postgresql.org/wiki/Submitting_a_Patch
 
 | Generally, try to blend in with the surrounding code.
 
 | Comments are for clarification not for delineating your code from
 | the surroundings.
 
 There are comments to set off the new code, and some of the new DATA
 lines (and similar) are separated away from where one would expect
 them to be if they had been included with the rest.  Moving a few
 lines and deleting a few comment lines would resolve it.

I deleted the excess comments and moved some lines around. Here it is with the 
changes.


dividing-money.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-21 Thread Alexander Korotkov
On Mon, Jun 21, 2010 at 5:42 PM, Robert Haas robertmh...@gmail.com wrote:

 It seems like you can get more or less the same benefit from a
 multicolumn btree index.  On my system, with the individual btree
 indices, the query ran in 7625 ms; with an additional index on (v1,
 v2, v3), it ran in 94 ms.  I didn't get the same plans as Alexander
 did, though, so it may not really be apples to apples.  See attached
 session trace.

The benefit of the multicolumn btree index was more or less the same as the
cube benefit because of very bad picksplit behavior in this case. I attached
a patch which significantly improves cube index search performance:

test=# explain (analyze, buffers) select * from test where
cube(ARRAY[v1,v2,v3]) <@ cube(ARRAY[480,480,'-inf'::float8],
ARRAY[500,500,'+inf'::float8]) order by cube(ARRAY[v1,v2,v3]) <*> 4 LIMIT
10;
 QUERY PLAN



 Limit  (cost=0.00..38.07 rows=10 width=28) (actual time=0.495..0.570
rows=10 loops=1)
   Buffers: shared hit=21
   ->  Index Scan using test_cube_idx on test  (cost=0.00..38064.52
rows=1 width=28) (actual time=0.489..0.537 rows=10 loops=1)
 Index Cond: (cube(ARRAY[v1, v2, v3]) <@ '(480, 480, -inf),(500,
500, inf)'::cube)
 Sort Cond: (cube(ARRAY[v1, v2, v3]) <*> 4)
 Buffers: shared hit=21
 Total runtime: 0.659 ms
(7 rows)

Now this patch greatly increases tree construction time, but I believe a
picksplit implementation can be found that is good enough for both tree
search and tree construction.
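
(For anyone wanting to try this, a sketch of the kind of schema the plan
above assumes -- the table and index names and the cube(ARRAY[v1,v2,v3])
expression come from the plan itself; the row count and value range are
only illustrative, and contrib/cube plus the patch must be installed:)

  CREATE TABLE test (v1 float8, v2 float8, v3 float8);
  INSERT INTO test
      SELECT random()*1000, random()*1000, random()*1000
      FROM generate_series(1, 1000000);
  CREATE INDEX test_cube_idx ON test USING gist (cube(ARRAY[v1, v2, v3]));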


 The trouble is that it's hard to think of
 a way of teaching the planner about these cases without hard-coding
 lots and lots of special-case kludges into the planner.  Still, if
 someone has a clever idea...


I think that two things can be done to improve the situation:
1) Make knngist deal with negative values. I think this would make it easier
to use knngist just for sorting, not only for k-neighbor searching.
2) Let the gist interface methods take care of multicolumn indexes. I think
that if the cube index from the example above were built on the separate
columns v1, v2, v3, it would be easier for the planner to use it for
queries with filters on those columns. I don't know exactly how to do that.


cube.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using multidimensional indexes in ordinal queries

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 5:20 PM, Alexander Korotkov
aekorot...@gmail.com wrote:
 1) Make knngist deal with negative values. I think this will make easier
 using knngist just for sorting, not only k-neighbor searching.

It doesn't?   I didn't think it was making any assumptions about the
ordering data type beyond the fact that it had a default btree
opclass.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cannot cancel the change of a tablespace

2010-06-21 Thread Simon Riggs
On Mon, 2010-06-21 at 18:46 +0200, Guillaume Lelarge wrote:

 Today, I tried to cancel the change of a tablespace for a table (ALTER
 TABLE ... SET TABLESPACE). I got the Cancel request sent but the query
 continued and finally succeeded. It was a big issue for my customer, and I
 wanted to look more into that issue. So, I got a look at the source code
 and found we didn't check for interrupts in this part of the code. I
 added them, and it seems to work as I wanted.
 
 I added a CHECK_FOR_INTERRUPTS call in the copy_relation_data(),
 copy_dir(), and copy_file() functions. Works for me on ALTER TABLE ...
 SET TABLESPACE and ALTER DATABASE ... SET TABLESPACE, in 9.0 and 8.4.
 
 Not sure we really want that change, and it doesn't feel like a bug to me.
 Should I add it to to the next commitfest?

Patch looks fine to me. Seems important.

Will apply tomorrow to 9.0, barring objections.

-- 
 Simon Riggs   www.2ndQuadrant.com



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] Server crash while trying to read expression using pg_get_expr()

2010-06-21 Thread Heikki Linnakangas

On 15/06/10 10:31, Heikki Linnakangas wrote:

You could avoid changing the meaning of fn_expr by putting the check in
the parse analysis phase, into transformFuncCall(). That would feel
safer at least for back-branches.


Here's a patch using that approach.

I grepped through PostgreSQL and pgadmin source code to find the system 
columns where valid node-strings are stored:


pg_index.indexprs
pg_index.indpred
pg_attrdef.adbin
pg_proc.proargdefaults
pg_constraint.conbin

Am I missing anything?
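
(For context, the legitimate call pattern that has to keep working is a
client reading the expression straight out of one of those catalog
columns, e.g. roughly what psql does for column defaults:

  SELECT a.attname, pg_get_expr(d.adbin, d.adrelid)
  FROM pg_attrdef d
  JOIN pg_attribute a ON a.attrelid = d.adrelid AND a.attnum = d.adnum;

The check below only rejects calls whose first argument is not a direct
reference to one of the listed columns.)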

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 5e60374..7c375a9 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -16,6 +16,9 @@
 #include "postgres.h"
 
 #include "catalog/pg_type.h"
+#include "catalog/pg_attrdef.h"
+#include "catalog/pg_constraint.h"
+#include "catalog/pg_proc.h"
 #include "commands/dbcommands.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -30,6 +33,7 @@
 #include "parser/parse_target.h"
 #include "parser/parse_type.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/xml.h"
 
@@ -1210,6 +1214,7 @@ transformFuncCall(ParseState *pstate, FuncCall *fn)
 {
 	List	   *targs;
 	ListCell   *args;
+	Node	   *result;
 
 	/* Transform the list of arguments ... */
 	targs = NIL;
@@ -1220,7 +1225,7 @@ transformFuncCall(ParseState *pstate, FuncCall *fn)
 	}
 
 	/* ... and hand off to ParseFuncOrColumn */
-	return ParseFuncOrColumn(pstate,
+	result = ParseFuncOrColumn(pstate,
 			 fn->funcname,
 			 targs,
 			 fn->agg_order,
@@ -1230,6 +1235,58 @@ transformFuncCall(ParseState *pstate, FuncCall *fn)
 			 fn->over,
 			 false,
 			 fn->location);
+
+	/* FIXME explain why this hack is needed */
+	if (result && IsA(result, FuncExpr) && !superuser())
+	{
+		FuncExpr *fe = (FuncExpr *) result;
+		if (fe->funcid == F_PG_GET_EXPR || fe->funcid == F_PG_GET_EXPR_EXT)
+		{
+			Expr *arg = linitial(fe->args);
+			bool allowed = false;
+
+			/*
+			 * Check that the argument came directly from one of the
+			 * allowed system catalog columns
+			 */
+			if (IsA(arg, Var))
+			{
+				Var *var = (Var *) arg;
+				RangeTblEntry *rte;
+
+				rte = GetRTEByRangeTablePosn(pstate,
+											 var->varno, var->varlevelsup);
+
+				switch (rte->relid)
+				{
+					case IndexRelationId:
+						if (var->varattno == Anum_pg_index_indexprs ||
+							var->varattno == Anum_pg_index_indpred)
+							allowed = true;
+						break;
+
+					case AttrDefaultRelationId:
+						if (var->varattno == Anum_pg_attrdef_adbin)
+							allowed = true;
+						break;
+
+					case ProcedureRelationId:
+						if (var->varattno == Anum_pg_proc_proargdefaults)
+							allowed = true;
+						break;
+					case ConstraintRelationId:
+						if (var->varattno == Anum_pg_constraint_conbin)
+							allowed = true;
+						break;
+				}
+			}
+			if (!allowed)
+				ereport(ERROR,
+						(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+						 errmsg("argument to pg_get_expr() must come from system catalogs")));
+		}
+	}
+	return result;
 }
 
 static Node *
diff --git a/src/backend/tcop/fastpath.c b/src/backend/tcop/fastpath.c
index 575fa86..32c4fa9 100644
--- a/src/backend/tcop/fastpath.c
+++ b/src/backend/tcop/fastpath.c
@@ -29,6 +29,7 @@
 #include tcop/fastpath.h
 #include tcop/tcopprot.h
 #include utils/acl.h
+#include utils/fmgroids.h
 #include utils/lsyscache.h
 #include utils/snapmgr.h
 #include utils/syscache.h
@@ -347,6 +348,11 @@ HandleFunctionRequest(StringInfo msgBuf)
 		aclcheck_error(aclresult, ACL_KIND_PROC,
 	   get_func_name(fid));
 
+	if ((fid == F_PG_GET_EXPR || fid == F_PG_GET_EXPR_EXT) && !superuser())
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				 errmsg("argument to pg_get_expr() must come from system catalogs")));
+
 	/*
 	 * Prepare function call info block and insert arguments.
 	 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-06-21 Thread gabrielle
On Thu, 2010-06-17 at 14:50 -0400, Alvaro Herrera asked:
 How does it play with ON_ERROR_STOP/ROLLBACK?

With ON_ERROR_STOP=ON, psql issues an error when it encounters one,
stops processing the file that contains the error, and then continues
to process any remaining files.

I'm still investigating ON_ERROR_ROLLBACK.  I need to tinker with it
some more before I say anything concrete.

On Fri, Jun 18, 2010 at 1:48 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Also, how does it play with --single-transaction.
That was buried in our original report :)  BEGIN-COMMIT statements
within the files cause warnings when the command is wrapped in a
transaction with the -1 switch (as specified in the patch submission).

To expand upon that a bit: when psql encounters a file that contains
a BEGIN statement, you get the expected "WARNING: there is already a
transaction in progress" message.  The COMMIT at the end of that file
(assuming the user doesn't forget it) really does commit at that point.
Commands after that commit, or in any remaining files to be processed,
are dealt with according to the user's autocommit settings:
- if autocommit is ON, statements in the remaining files are processed
and committed; the implicit COMMIT at the end of the whole thing then
generates a "WARNING: there is no transaction in progress" message
- if autocommit is OFF, statements in the remaining files generate
"ERROR: current transaction is aborted, commands ignored until end of
transaction block" messages.
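
To make the sequence above concrete, here is a hypothetical pair of
files (contents invented purely for illustration), run as
psql -1 -f one.sql -f two.sql:

  -- one.sql
  BEGIN;
  CREATE TABLE t1 (id int);
  COMMIT;

  -- two.sql
  CREATE TABLE t2 (id int);

With -1, psql opens its own transaction first, so the BEGIN in one.sql
draws the "already a transaction in progress" warning, the COMMIT in
one.sql ends the wrapping transaction early, and what happens to two.sql
then depends on the autocommit setting as described above.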

 I would like multiple -c commands also, as well as a mix of -f and -c.
 Can we add that at the same time please?

I'll leave this one for someone else to answer. :)

gabrielle

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] server authentication over Unix-domain sockets

2010-06-21 Thread KaiGai Kohei
I've checked on this patch.

As you note in the source code comments quoted below, the code is
currently Linux-only because of the getsockopt() API it uses.

+   // TODO: currently Linux-only code, needs to be made
+   // portable; see backend/libpq/auth.c

I expect this will be fixed (perhaps reusing the code from ident_unix()?)
before commit.


I'd like to point out one other issue.
The patch uses getpwuid() to translate a user identifier into a user name,
but that returns a pointer to a static buffer inside glibc,
so it is not thread-safe. I recommend using getpwnam_r() instead.

Except for that issue, the patch looks fine to me.

* The patch applies cleanly to the head of the git repository.
* We can build the code without any warnings/errors.
* It works as described in the documentation.

  [kai...@saba ~]$ php -r 'pg_connect("dbname=postgres requirepeer=");'
  PHP Warning:  pg_connect(): Unable to connect to PostgreSQL server: invalid 
connection option "requirepeer" in Command line code on line 1
=> Existing library, so not supported.

  [kai...@saba ~]$ env LD_LIBRARY_PATH=/usr/local/pgsql/lib/ \
   php -r 'pg_connect("dbname=postgres requirepeer=");'
  PHP Warning:  pg_connect(): Unable to connect to PostgreSQL server: 
requirepeer failed (actual: kaigai != required: ) in Command line code on 
line 1
  LOG:  incomplete startup packet
=> Patched library, so it prevents an unexpected user-id of the server process.

  [kai...@saba ~]$ env LD_LIBRARY_PATH=/usr/local/pgsql/lib/ \
   php -r 'pg_connect("dbname=postgres requirepeer=kaigai");'
=> Patched library, so it does not prevent anything for the expected user-id.

  [kai...@saba ~]$ env LD_LIBRARY_PATH=/usr/local/pgsql/lib/ \
   php -r 'pg_connect("dbname=postgres");'
=> No requirepeer, so it does not prevent anything.

  [kai...@saba ~]$ env LD_LIBRARY_PATH=/usr/local/pgsql/lib/ \
   env PGREQUIREPEER=xyz php -r 'pg_connect("dbname=postgres");'
  PHP Warning:  pg_connect(): Unable to connect to PostgreSQL server: 
requirepeer failed (actual: kaigai != required: xyz) in Command line code on 
line 1
  LOG:  incomplete startup packet
=> PGREQUIREPEER environment variable, instead of the requirepeer option. Same 
result.

  [kai...@saba ~]$ env LD_LIBRARY_PATH=/usr/local/pgsql/lib/ \
   env PGREQUIREPEER=kaigai php -r 
'pg_connect("dbname=postgres");'
=> PGREQUIREPEER environment variable, instead of the requirepeer option. Same 
result.

Thanks,

(2010/05/30 20:00), Peter Eisentraut wrote:
 It has been discussed several times in the past that there is no way for
 a client to authenticate a server over Unix-domain sockets.  So
 depending on circumstances, a local user could easily insert his own
 server and collect passwords and data.  Suggestions for possible
 remedies included:
 
 You can put the socket file in a sufficiently write-protected directory.
 But that would strongly deviate from the default setup, and anyway the
 client still cannot readily verify that the server is the right one.
 
 You can also run SSL over Unix-domain sockets.  This is currently
 disabled in the code, but it would work just fine.  But it's obviously
 kind of awkward, and the connection overhead was noticeable in tests.
 
 Then it was suggested to use the local ident mechanism in reverse, so
 the client could verify what user the server runs under.  I have
 implemented a prototype of this.  You can put, e.g.,
 
 requirepeer=postgres
 
 into the connection parameters, and the connection will be rejected
 unless the process at the other end of the socket is running as
 postgres.
 
 The patch needs some portability work and possible refactoring because
 of that, but before I embark on that, comments on the concept?
 
 
 
 
 


-- 
KaiGai Kohei kai...@ak.jp.nec.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 7:51 PM, gabrielle gor...@gmail.com wrote:
 On Thu, 2010-06-17 at 14:50 -0400, Alvaro Herrera asked:
 How does it play with ON_ERROR_STOP/ROLLBACK?

 With ON_ERROR_STOP=ON, psql issues an error when it encounters one,
 stops processing the file that contains the error, and then continues
 to process any remaining files.

 I'm still investigating ON_ERROR_ROLLBACK.  I need to tinker with it
 some more before I say anything concrete.

 On Fri, Jun 18, 2010 at 1:48 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Also, how does it play with --single-transaction.
 That was buried in our original report :) BEGIN-COMMIT statements
 within the files cause warnings when the command is wrapped in a
 transaction with the -1 switch (as specified in the patch submission)

 To expand upon that a bit: when psql encounters a file that contains
 a BEGIN statement, you get the expected "WARNING: there is already a
 transaction in progress" message.  The COMMIT at the end of that file
 (assuming the user doesn't forget it) really does commit at that point.
 Commands after that commit, or in any remaining files to be processed,
 are dealt with according to the user's autocommit settings:
 - if autocommit is ON, statements in the remaining files are processed
 and committed; the implicit COMMIT at the end of the whole thing then
 generates a "WARNING: there is no transaction in progress" message
 - if autocommit is OFF, statements in the remaining files generate
 "ERROR: current transaction is aborted, commands ignored until end of
 transaction block" messages.

So none of the above sounds like desired behavior to me...  is that just me?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Explicit psqlrc

2010-06-21 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 So none of the above sounds like desired behavior to me...  is that just me?

Yeah, I'm not really thrilled with this..  I mentioned earlier what I
thought would be a useful feature (basically, a switch which would
ignore the main psqlrc and turn on the various options that make sense
for a script), but that seems to have fallen to the wayside..

Thanks,

Stephen




Re: [HACKERS] Explicit psqlrc

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 9:13 PM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 So none of the above sounds like desired behavior to me...  is that just me?

 Yeah, I'm not really thrilled with this..  I mentioned earlier what I
 thought would be a useful feature (basically, a switch which would
 ignore the main psqlrc and turn on the various options that make sense
 for a script), but that seems to have fallen to the wayside..

Well, that might be a good idea, too, but my expectation is that:

psql -f one -f two -f three

ought to behave in a manner fairly similar to:

cat one two three  all
psql -f all

and it sounds like with this patch that's far from being the case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] missing else in postmaster.c?

2010-06-21 Thread Robert Haas
In pmdie(), we have the following code, which doesn't seem to make
much sense.  If the state is PM_RECOVERY at the top of this section it
will get changed to PM_WAIT_BACKENDS and then to PM_WAIT_BACKENDS
again.  Either the two if statements should be merged (and both bits
should be handled with the same block of code) or the second one
should say else if.  Or at least, I think so...

    if (pmState == PM_RECOVERY)
    {
        /* only bgwriter is active in this state */
        pmState = PM_WAIT_BACKENDS;
    }
    if (pmState == PM_RUN ||
        pmState == PM_WAIT_BACKUP ||
        pmState == PM_WAIT_READONLY ||
        pmState == PM_WAIT_BACKENDS ||
        pmState == PM_HOT_STANDBY)
    {
        ereport(LOG,
                (errmsg("aborting any active transactions")));
        /* shut down all backends and autovac workers */
        SignalSomeChildren(SIGTERM,
                           BACKEND_TYPE_NORMAL | BACKEND_TYPE_AUTOVAC);
        /* and the autovac launcher too */
        if (AutoVacPID != 0)
            signal_child(AutoVacPID, SIGTERM);
        /* and the walwriter too */
        if (WalWriterPID != 0)
            signal_child(WalWriterPID, SIGTERM);
        pmState = PM_WAIT_BACKENDS;
    }

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-21 Thread Andrew Dunstan



Tom Lane wrote:

 Andrew Dunstan and...@dunslane.net writes:
  Tom Lane wrote:
   Well, having to do a cache lookup already makes it a couple orders of
   magnitude more expensive than an OID comparison.  However, it's hard to
   say how much that matters in terms of total application performance.
   We really could do with a bit of performance testing here ...

  I have done some. The performance hit is fairly horrible. Adding cache
  lookups for the enum rows to the comparison routines made a REINDEX on a
  1m row table where the index is on an enum column (the enum has 500
  randomly ordered labels) jump from around 10s to around 70s.

 Hmmm... that's bad, but I bet it's still less than the cost of comparing
 NUMERICs.  Also, did you make any attempt to avoid repetitive cache
 lookups by storing a pointer in fn_extra (cf array comparisons)?

 regards, tom lane

OK, making a bit of progress. Attached is a sort of proof of concept 
patch that does that. It stores a bsearchable list of {enum, sort_order} 
pairs in fn_extra, along with a flag that indicates if the oids are  in 
fact ordered. This flag, which would be maintained in and populated from 
pg_type, would allow avoidance of any significant performance penalty in 
such cases by relying on straight Oid comparison. We'd probably need to 
keep a count of labels in pg_type too so we could size the cache 
appropriately. This approach just about buys the best of both worlds. 
The execution time for the test mentioned above is down from around 70s 
to around 20s. I think for a worst case that's not too bad, especially 
when it is completely avoided unless we have perturbed the sort order.


If anyone wants to play along, my test set is available at 
http://developer.postgresql.org/~adunstan/enumtest.dmp It's about 8.5Mb.
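
Alternatively, a much smaller-scale sketch of the kind of test being
timed (the real test uses an enum with 500 randomly ordered labels and a
1m row table; everything below is just illustrative):

  CREATE TYPE testenum AS ENUM ('delta', 'alpha', 'echo', 'charlie', 'bravo');
  CREATE TABLE enumtest AS
      SELECT (ARRAY['delta','alpha','echo','charlie','bravo'])
             [1 + (random()*4)::int]::testenum AS v
      FROM generate_series(1, 1000000);
  CREATE INDEX enumtest_v_idx ON enumtest (v);
  \timing on
  REINDEX INDEX enumtest_v_idx;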


cheers

andrew
Index: enum.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/enum.c,v
retrieving revision 1.11
diff -c -r1.11 enum.c
*** enum.c	26 Feb 2010 02:01:08 -	1.11
--- enum.c	22 Jun 2010 02:16:48 -
***
*** 14,19 
--- 14,20 
  #include "postgres.h"
  
  #include "catalog/pg_enum.h"
+ #include "catalog/pg_type.h"
  #include "fmgr.h"
  #include "utils/array.h"
  #include "utils/builtins.h"
***
*** 22,27 
--- 23,52 
  #include "libpq/pqformat.h"
  #include "miscadmin.h"
  
+ typedef struct 
+ {
+ 	Oid  enum_oid;
+ 	uint32   sort_order;
+ } enum_sort;
+ 
+ typedef struct 
+ {
+ 	bool  oids_are_sorted;
+ 	int   sort_list_length;
+ 	enum_sort sort_order_list[1024];
+ } enum_sort_cache;
+ 	
+ 
+ static int 
+ enum_sort_cmp(void * es1, void * es2)
+ {
+ 	enum_sort *p1, *p2;
+ 	p1 = (enum_sort *)es1;
+ 	p2 = (enum_sort *)es2;
+ 	return p1->enum_oid - p2->enum_oid;
+ }
+ 
+ 
  
  static ArrayType *enum_range_internal(Oid enumtypoid, Oid lower, Oid upper);
  static int	enum_elem_cmp(const void *left, const void *right);
***
*** 155,167 
  
  /* Comparison functions and related */
  
  Datum
  enum_lt(PG_FUNCTION_ARGS)
  {
  	Oid			a = PG_GETARG_OID(0);
  	Oid			b = PG_GETARG_OID(1);
  
! 	PG_RETURN_BOOL(a < b);
  }
  
  Datum
--- 180,283 
  
  /* Comparison functions and related */
  
+ static inline int 
+ enum_ccmp(Oid arg1, Oid arg2, FunctionCallInfo fcinfo)
+ {
+ 
+ 	enum_sort_cache * mycache;
+ 	enum_sort *es1, *es2;
+ 	int sort1, sort2;
+ 	bool added = false;
+ 	HeapTuple	tup;
+ 	Form_pg_enum en;
+ 	Oid typeoid;
+ 	Form_pg_type typ;
+ 
+ 	if (arg1 == arg2)
+ 		return 0;
+ 
+ 	mycache = (enum_sort_cache *) fcinfo->flinfo->fn_extra;
+ 	if (mycache == NULL )
+ 	{
+ 		fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
+ 													  sizeof(enum_sort_cache));
+ 		mycache = (enum_sort_cache *) fcinfo->flinfo->fn_extra;
+ 		mycache->sort_list_length = 1;
+ 		tup = SearchSysCache1(ENUMOID, ObjectIdGetDatum(arg1));
+ 		en = (Form_pg_enum) GETSTRUCT(tup);
+ 		mycache->sort_order_list[0].enum_oid = arg1;
+ 		mycache->sort_order_list[0].sort_order = arg1;
+ 		typeoid = en->enumtypid;
+ 		ReleaseSysCache(tup);
+ 		tup = SearchSysCache1(TYPEOID, ObjectIdGetDatum(typeoid));
+ 		typ = (Form_pg_type)  GETSTRUCT(tup);
+ 		if (typ->typtype != 'e')
+ 			elog(ERROR, "wrong type for oid %u", typeoid);
+ 		/* XXX TODO fill in oids_are_sorted property from type tuple here */
+ 		mycache->oids_are_sorted = false;
+ 		ReleaseSysCache(tup);
+ 	}
+ 
+ 	if (mycache->oids_are_sorted)
+ 		return arg1 - arg2;
+ 
+ 	es1 = bsearch(&arg1, mycache->sort_order_list, mycache->sort_list_length,
+ 				  sizeof(enum_sort), enum_sort_cmp);
+ 	es2 = bsearch(&arg2, mycache->sort_order_list, mycache->sort_list_length,
+ 				  sizeof(enum_sort), enum_sort_cmp);
+ 
+ 	if (es1 == NULL)
+ 	{
+ 
+ 		tup = SearchSysCache1(ENUMOID, ObjectIdGetDatum(arg1));
+ 		en = (Form_pg_enum) GETSTRUCT(tup);
+ 		mycache->sort_order_list[mycache->sort_list_length].enum_oid = arg1;
+ 		sort1 = 

Re: [HACKERS] what exactly is a PlaceHolderVar?

2010-06-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 ...but I'm still having a hard time wrapping my head around it.

The fundamental point is to be able to force a value to go to NULL
when outer-join logic says it ought to.  Consider

CREATE VIEW foo AS SELECT x,y,'zed' FROM bar;

SELECT * FROM baz LEFT JOIN foo ON (baz.a = foo.x);

If you try to flatten the view then you end up with a constant 'zed'
that needs to be replaced by NULL whenever baz.a hasn't got a match
in bar.x.  There's no way to make a constant go to NULL though: it's
a constant, n'est-ce pas?  Instead, we have the idea of an expression
PlaceHolderVar(foo, 'zed').  This will go to null if variables from foo
ought to go to null.  Otherwise it produces 'zed'.  Sort of an
anti-COALESCE.
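
A toy example (tables invented here just to see the behaviour the
placeholder has to preserve; a column alias and an explicit text cast
are added so the view creates cleanly):

  CREATE TABLE bar (x int, y int);
  CREATE TABLE baz (a int);
  INSERT INTO bar VALUES (1, 10);
  INSERT INTO baz VALUES (1), (2);
  CREATE VIEW foo AS SELECT x, y, 'zed'::text AS z FROM bar;

  SELECT * FROM baz LEFT JOIN foo ON (baz.a = foo.x) ORDER BY a;
  --  a | x | y  |  z
  -- ---+---+----+-----
  --  1 | 1 | 10 | zed
  --  2 |   |    |
  --
  -- On the second row the flattened constant must come out NULL, which
  -- is exactly what the PlaceHolderVar arranges.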

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] system views for walsender activity

2010-06-21 Thread Takahiro Itagaki

Tom Lane t...@sss.pgh.pa.us wrote:

 I'm of the opinion that this is a 9.1 problem.  It needs more thought
 than we can put into it now --- one obvious question is what about
 monitoring on the slave side?  Another is who should be able to see the
 data?

Sure. We should research users' demands for monitoring and management
of replication. Here are some of the requests I'm hearing from users at
the moment:

* Managers often ask DBAs "How far are the standby servers behind the
  master?"  We should provide methods for DBAs to answer that. We have
  the pg_xlog_location() family of functions, but they should be
  improved in two ways:
- The returned values are xxx/yyy texts, but the more useful information
  is the difference between two values. Subtraction functions are
  required (a sketch of the arithmetic users currently have to hand-roll
  follows below).
- For easier management, the master server should provide not only
  sent/flush locations but also received/replayed locations for each
  standby server. Users don't want to have to query both the master and
  the slaves.

* Some developers want to pause and restart replication from the master
  server. They're planning to use replication for application version
  management. They'll pause all replication, test their new features
  on the master, and then restart replication to spread the changes to
  the slaves.
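
As a sketch of the arithmetic users have to hand-roll today to get that
difference (the function name and sample value are hypothetical; assumes
the current 9.0 layout, where each logical xlog file holds 0xFF000000
bytes):

  CREATE FUNCTION xlog_location_to_bytes(loc text) RETURNS numeric AS $$
      SELECT ('x' || lpad(split_part($1, '/', 1), 16, '0'))::bit(64)::bigint::numeric
                 * 4278190080           -- 0xFF000000 bytes per logical xlog file
           + ('x' || lpad(split_part($1, '/', 2), 16, '0'))::bit(64)::bigint::numeric;
  $$ LANGUAGE sql IMMUTABLE STRICT;

  -- e.g. bytes of WAL the master has written beyond what a standby reports:
  SELECT xlog_location_to_bytes(pg_current_xlog_location())
       - xlog_location_to_bytes('6/AB04C370') AS lag_bytes;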

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers