Re: [HACKERS] Packages in oracle Style

2008-06-01 Thread Harald Armin Massa
Joe and all,

 The below listed tarball is out of date at this point, but I have updated
 code laying around if someone really wanted it:
  http://www.joeconway.com/sessfunc.tar.gz
 I've used variations of this over the years on several projects.

Is someone able and willing to provide this tarball compiled into a
win32 DLL usable with PostgreSQL 8.3.1?

ATM I am using session variables in pure PL/SQL via temp tables. ..:)
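(For readers who haven't seen the temp-table workaround Harald mentions, a minimal sketch of session variables done this way - all names are illustrative, not his actual code. Dynamic SQL via EXECUTE is used deliberately: in 8.3-era plpgsql, cached plans that reference a temp table from an earlier session would otherwise break.)

```sql
-- Hypothetical sketch: session variables faked with a per-session temp table.
CREATE OR REPLACE FUNCTION set_var(_name text, _value text) RETURNS void AS $$
BEGIN
  BEGIN
    CREATE TEMP TABLE session_vars(name text PRIMARY KEY, value text);
  EXCEPTION WHEN duplicate_table THEN
    NULL;                                -- already created in this session
  END;
  EXECUTE 'DELETE FROM session_vars WHERE name = ' || quote_literal(_name);
  EXECUTE 'INSERT INTO session_vars VALUES ('
          || quote_literal(_name) || ', ' || quote_literal(_value) || ')';
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION get_var(_name text) RETURNS text AS $$
DECLARE _v text;
BEGIN
  EXECUTE 'SELECT value FROM session_vars WHERE name = '
          || quote_literal(_name) INTO _v;
  RETURN _v;
END;
$$ LANGUAGE plpgsql;

-- SELECT set_var('user_id', '42');
-- SELECT get_var('user_id');  -- returns the value for the rest of the session
```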


Harald


-- 
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
no fx, no carrier pidgeon
-
EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Packages in oracle Style

2008-06-01 Thread Pavel Stehule
2008/6/1 Harald Armin Massa [EMAIL PROTECTED]:
 Joe and all,

 The below listed tarball is out of date at this point, but I have updated
 code laying around if someone really wanted it:
  http://www.joeconway.com/sessfunc.tar.gz
 I've used variations of this over the years on several projects.

 is someone able and willing to provide this tarball compiled to a
 PostgreSQL-8.3.1 usable win32-dll ?

 ATM I am using session variables in pure PL/SQL via temp tables. ..:)



It should be a contrib module

Pavel

 Harald


 --
 GHUM Harald Massa
 persuadere et programmare
 Harald Armin Massa
 Spielberger Straße 49
 70435 Stuttgart
 0173/9409607
 no fx, no carrier pidgeon
 -
 EuroPython 2008 will take place in Vilnius, Lithuania - Stay tuned!




[HACKERS] explain doesn't work with execute using

2008-06-01 Thread Pavel Stehule
Hello

I found the following bug - using EXPLAIN in stored procedures like:

CREATE OR REPLACE FUNCTION test(int)
RETURNS void AS $$
DECLARE s varchar;
BEGIN
  FOR s IN EXECUTE 'EXPLAIN SELECT * FROM o WHERE a = $1+1' USING $1 LOOP
RAISE NOTICE '%', s;
  END LOOP;
END; $$
LANGUAGE plpgsql;

produces a wrong result. The real plan is correct, i.e. the variables
are substituted, but this EXPLAIN shows the variables instead. The
reason is a difference in pflags: the planner works with
PARAM_FLAG_CONST variables, but EXPLAIN (in ExplainQuery) gets the
variables from the Portal, where the PARAM_FLAG_CONST flag is lost.

Portal
SPI_cursor_open_with_args(const char *name,
                          const char *src,
                          int nargs, Oid *argtypes,
                          Datum *Values, const char *Nulls,
                          bool read_only, int cursorOptions)
{
    ...
    paramLI = _SPI_convert_params(nargs, argtypes,
                                  Values, Nulls,
                                  PARAM_FLAG_CONST);

    // variables are correct here

but

    result = SPI_cursor_open(name, plan, Values, Nulls, read_only);
    // result->portalParams has lost the flags

Portal
SPI_cursor_open(const char *name, SPIPlanPtr plan,
                Datum *Values, const char *Nulls,
                bool read_only)
{
    CachedPlanSource *plansource;
    CachedPlan *cplan;
    List       *stmt_list;
    char       *query_string;
    ParamListInfo paramLI;

    if (plan->nargs > 0)
    {
        /* sizeof(ParamListInfoData) includes the first array element */
        paramLI = (ParamListInfo) palloc(sizeof(ParamListInfoData) +
                                         (plan->nargs - 1) * sizeof(ParamExternData));
        paramLI->numParams = plan->nargs;

        for (k = 0; k < plan->nargs; k++)
        {
            ParamExternData *prm = &paramLI->params[k];

            prm->ptype = plan->argtypes[k];

            /***/
            prm->pflags = 0;    /* the correct flags are overwritten here */
            /***/
            prm->isnull = (Nulls && Nulls[k] == 'n');
            if (prm->isnull)
            {
                /* nulls just copy */
                prm->value = Values[k];
            }

So this is a strange bug - EXECUTE USING uses the right plan, but it
isn't possible to verify it.

Regards
Pavel Stehule



Re: [HACKERS] synchronized scans for VACUUM

2008-06-01 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Jeff Davis [EMAIL PROTECTED] writes:
 The objections to synchronized scans for VACUUM as listed in that thread
 (summary):

 2. vacuum takes breaks from the scan to clean up the indexes when it
 runs out of maintenance_work_mem.

 2. There have been suggestions about a more compact representation for
 the tuple id list. If this works, it will solve this problem.

 It will certainly not solve the problem.  What it will do is mean that
 the breaks are further apart and longer, which seems to me to make the
 conflict with syncscan behavior worse not better.

How would it make them longer? They still have the same amount of i/o to do
scanning the indexes. I suppose they would dirty more pages which might slow
them down?

In any case I think the representation you proposed back when this idea last
came up was so compact that pretty much any size table ought to be
representable in a reasonable work_mem -- at least for the kind of machine
which would normally be dealing with that size table.

 It still seems to me that vacuum is unlikely to be a productive member
 of a syncscan herd --- it just isn't going to have similar scan-speed
 behavior to typical queries.

That's my thinking too. Our general direction has been toward reducing
vacuum's i/o bandwidth requirements, not worrying about making it run as fast
as possible.

That said if it happened to latch on to a sync scan herd it would have very
few cache misses which would cause it to rack up very few vacuum cost delay
points. Perhaps the vacuum cost delay for a cache hit ought to be 0?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] synchronized scans for VACUUM

2008-06-01 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 It will certainly not solve the problem.  What it will do is mean that
 the breaks are further apart and longer, which seems to me to make the
 conflict with syncscan behavior worse not better.

 How would it make them longer? They still have the same amount of i/o to do
 scanning the indexes. I suppose they would dirty more pages which might slow
 them down?

More tuples to delete = more writes (in WAL, if not immediately in the
index itself) = longer to complete the indexscan.  It's still cheaper
than doing multiple indexscans, of course, but my point is that the
index-fixing work gets concentrated.

regards, tom lane



Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 I found following bug - using explain in stored procedures like:
 ...
 produces a wrong result. The real plan is correct, i.e. the variables
 are substituted, but this EXPLAIN shows the variables instead.

This seems to be correctable with a one-line patch: make SPI_cursor_open
set the CONST flag on parameters it puts into the portal (attached).
I'm not entirely sure if it's a good idea or not --- comments?

regards, tom lane

Index: src/backend/executor/spi.c
===
RCS file: /cvsroot/pgsql/src/backend/executor/spi.c,v
retrieving revision 1.195
diff -c -r1.195 spi.c
*** src/backend/executor/spi.c  12 May 2008 20:02:00 -  1.195
--- src/backend/executor/spi.c  1 Jun 2008 15:33:13 -
***************
*** 997,1003 ****
  			ParamExternData *prm = &paramLI->params[k];
  
  			prm->ptype = plan->argtypes[k];
! 			prm->pflags = 0;
  			prm->isnull = (Nulls && Nulls[k] == 'n');
  			if (prm->isnull)
  			{
--- 997,1010 ----
  			ParamExternData *prm = &paramLI->params[k];
  
  			prm->ptype = plan->argtypes[k];
! 			/*
! 			 * We mark the parameters as const.  This has no effect for simple
! 			 * execution of a plan, but if more planning happens within the
! 			 * portal (eg via EXPLAIN), the effect will be to treat the
! 			 * parameters as constants.  This is good and correct as long as
! 			 * no plan generated inside the portal is used outside it.
! 			 */
! 			prm->pflags = PARAM_FLAG_CONST;
  			prm->isnull = (Nulls && Nulls[k] == 'n');
  			if (prm->isnull)
  			{



Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 This seems to be correctable with a one-line patch: make SPI_cursor_open
 set the CONST flag on parameters it puts into the portal (attached).
 I'm not entirely sure if it's a good idea or not --- comments?

 We can do a less invasive patch - it's much more ugly, but it doesn't
 change any other behaviour. I am afraid that the one-line patch could
 change the behaviour of EXPLAIN statements in some cases where using
 variables is correct.

If you can name a case where that is correct, then I'll worry about
this, but offhand I don't see one.

What do you think a less invasive patch would be, anyway?  I don't
buy that, say, having SPI_cursor_open_with_args set the flag but
SPI_cursor_open not do so is any safer.  There is no difference between
the two as to what might get executed, so if there's a problem then
both would be at risk.

regards, tom lane



Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Pavel Stehule
2008/6/1 Tom Lane [EMAIL PROTECTED]:
 Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 This seems to be correctable with a one-line patch: make SPI_cursor_open
 set the CONST flag on parameters it puts into the portal (attached).
 I'm not entirely sure if it's a good idea or not --- comments?

 We can do a less invasive patch - it's much more ugly, but it doesn't
 change any other behaviour. I am afraid that the one-line patch could
 change the behaviour of EXPLAIN statements in some cases where using
 variables is correct.

 If you can name a case where that is correct, then I'll worry about
 this, but offhand I don't see one.

This is a case where the variables are correct:

postgres=# create or replace function foo(_a integer) returns void as
$$declare s varchar; begin for s in explain select * from o where a =
_a loop raise notice '%', s; end loop; end; $$ language plpgsql;
CREATE FUNCTION
Time: 43,138 ms
postgres=# select foo(20);
NOTICE:  Index Scan using o_pkey on o  (cost=0.00..8.27 rows=1 width=4)
NOTICE:    Index Cond: (a = 20)    -- wrong :(
 foo
-

(1 row)



 What do you think a less invasive patch would be, anyway?  I don't
 buy that, say, having SPI_cursor_open_with_args set the flag but
 SPI_cursor_open not do so is any safer.  There is no difference between
 the two as to what might get executed, so if there's a problem then
 both would be at risk.

SPI_cursor_open_with_args is a new function; it's used only in the FOR
EXECUTE statement - and in this context the variables really are
constants.

Pavel


regards, tom lane




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Robert Hodges
Hi Merlin,

My point here is that with reasonably small extensions to the core you can 
build products that are a lot better than SLONY.   Triggers do not cover DDL, 
among other issues, and it's debatable whether they are the best way to 
implement quorum policies like Google's semi-synchronous replication.  As I 
mentioned separately this topic deserves another thread which I promise to 
start.

It is of course possible to meet some of these needs with an appropriate client 
interface to WAL shipping.  There's no a-priori reason why built-in PostgreSQL 
slaves need to be the only client.  I would put a vote in for covering this 
possibility in the initial replication design.  We are using a very similar 
approach in our own master/slave replication product.

Thanks, Robert

P.S., No offense intended to Jan Wieck et al.  There are some pretty cool 
things in SLONY.

On 5/29/08 8:16 PM, Merlin Moncure [EMAIL PROTECTED] wrote:

On Thu, May 29, 2008 at 3:05 PM, Robert Hodges
[EMAIL PROTECTED] wrote:
 Third, you can't stop with just this feature.  (This is the BUT part of the
 post.)  The use cases not covered by this feature are actually pretty
 large.  Here are a few that concern me:

 1.) Partial replication.
 2.) WAN replication.
 3.) Bi-directional replication.  (Yes, this is evil but there are problems
 where it is indispensable.)
 4.) Upgrade support.  Aside from database upgrade (how would this ever
 really work between versions?), it would not support zero-downtime app
 upgrades, which depend on bi-directional replication tricks.
 5.) Heterogeneous replication.
 6.) Finally, performance scaling using scale-out over large numbers of
 replicas.  I think it's possible to get tunnel vision on this-it's not a big
 requirement in the PG community because people don't use PG in the first
 place when they want to do this.  They use MySQL, which has very good
 replication for performance scaling, though it's rather weak for
 availability.

These types of things are what Slony is for.  Slony is trigger based.
This makes it more complex than log shipping style replication, but
provides lots of functionality.

wal shipping based replication is maybe the fastest possible
solution...you are already paying the overhead so it comes virtually
for free from the point of view of the master.

mysql replication is imo nearly worthless from a backup standpoint.

merlin



--
Robert Hodges, CTO, Continuent, Inc.
Email:  [EMAIL PROTECTED]
Mobile:  +1-510-501-3728  Skype:  hodgesrm


Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Pavel Stehule
hello

2008/6/1 Tom Lane [EMAIL PROTECTED]:
 Pavel Stehule [EMAIL PROTECTED] writes:
 I found following bug - using explain in stored procedures like:
 ...
 produces a wrong result. The real plan is correct, i.e. the variables
 are substituted, but this EXPLAIN shows the variables instead.

 This seems to be correctable with a one-line patch: make SPI_cursor_open
 set the CONST flag on parameters it puts into the portal (attached).
 I'm not entirely sure if it's a good idea or not --- comments?

We can do a less invasive patch - it's much more ugly, but it doesn't
change any other behaviour. I am afraid that the one-line patch could
change the behaviour of EXPLAIN statements in some cases where using
variables is correct.

Regards
Pavel Stehule


regards, tom lane

 Index: src/backend/executor/spi.c
 ===
 RCS file: /cvsroot/pgsql/src/backend/executor/spi.c,v
 retrieving revision 1.195
 diff -c -r1.195 spi.c
 *** src/backend/executor/spi.c  12 May 2008 20:02:00 -  1.195
 --- src/backend/executor/spi.c  1 Jun 2008 15:33:13 -
 ***************
 *** 997,1003 ****
   			ParamExternData *prm = &paramLI->params[k];
 
   			prm->ptype = plan->argtypes[k];
 ! 			prm->pflags = 0;
   			prm->isnull = (Nulls && Nulls[k] == 'n');
   			if (prm->isnull)
   			{
 --- 997,1010 ----
   			ParamExternData *prm = &paramLI->params[k];
 
   			prm->ptype = plan->argtypes[k];
 ! 			/*
 ! 			 * We mark the parameters as const.  This has no effect for simple
 ! 			 * execution of a plan, but if more planning happens within the
 ! 			 * portal (eg via EXPLAIN), the effect will be to treat the
 ! 			 * parameters as constants.  This is good and correct as long as
 ! 			 * no plan generated inside the portal is used outside it.
 ! 			 */
 ! 			prm->pflags = PARAM_FLAG_CONST;
   			prm->isnull = (Nulls && Nulls[k] == 'n');
   			if (prm->isnull)
   			{

*** ./src/backend/executor/spi.c.orig	2008-06-01 17:26:19.0 +0200
--- ./src/backend/executor/spi.c	2008-06-01 17:35:01.0 +0200
***************
*** 63,68 ****
--- 63,71 ----
  static MemoryContext _SPI_execmem(void);
  static MemoryContext _SPI_procmem(void);
  static bool _SPI_checktuples(void);
+ static Portal _SPI_cursor_open(const char *name, SPIPlanPtr plan,
+ 				Datum *Values, const char *Nulls,
+ 				bool read_only, int pflags);
  
  
  /* === interface functions === */
***************
*** 908,921 ****
  
  
  /*
!  * SPI_cursor_open()
!  *
!  *	Open a prepared SPI plan as a portal
   */
! Portal
! SPI_cursor_open(const char *name, SPIPlanPtr plan,
  				Datum *Values, const char *Nulls,
! 				bool read_only)
  {
  	CachedPlanSource *plansource;
  	CachedPlan *cplan;
--- 911,923 ----
  
  
  /*
!  * _SPI_cursor_open()
!  *	Open a prepared SPI plan as a portal, allows setting the parameters' pflags
   */
! static Portal
! _SPI_cursor_open(const char *name, SPIPlanPtr plan,
  				Datum *Values, const char *Nulls,
! 				bool read_only, int pflags)
  {
  	CachedPlanSource *plansource;
  	CachedPlan *cplan;
***************
*** 997,1003 ****
  			ParamExternData *prm = &paramLI->params[k];
  
  			prm->ptype = plan->argtypes[k];
! 			prm->pflags = 0;
  			prm->isnull = (Nulls && Nulls[k] == 'n');
  			if (prm->isnull)
  			{
--- 999,1005 ----
  			ParamExternData *prm = &paramLI->params[k];
  
  			prm->ptype = plan->argtypes[k];
! 			prm->pflags = pflags;
  			prm->isnull = (Nulls && Nulls[k] == 'n');
  			if (prm->isnull)
  			{
***************
*** 1130,1135 ****
--- 1132,1154 ----
  
  
  /*
+  * SPI_cursor_open()
+  *
+  *	Open a prepared SPI plan as a portal
+  */
+ Portal
+ SPI_cursor_open(const char *name, SPIPlanPtr plan,
+ 				Datum *Values, const char *Nulls,
+ 				bool read_only)
+ {
+ 	return _SPI_cursor_open(name, plan,
+ 							Values, Nulls,
+ 							read_only, 0);
+ }
+ 
+ 
+ /*
   * SPI_cursor_open_with_args()
   *
   * Parse and plan a query and open it as a portal.  Like SPI_execute_with_args,
***************
*** 1177,1183 ****
  	/* SPI_cursor_open expects to be called in procedure memory context */
  	_SPI_procmem();
  
! 	result = SPI_cursor_open(name, plan, Values, Nulls, read_only);
  
  	/* And clean up */
  	_SPI_curid++;
--- 1196,1203 ----
  	/* SPI_cursor_open expects to be called in procedure memory context */
  	_SPI_procmem();
  
! 	/* all params have the PARAM_FLAG_CONST flag */
! 	result = _SPI_cursor_open(name, plan, Values, Nulls, read_only, PARAM_FLAG_CONST);
  
  	/* And clean up */
  	_SPI_curid++;


Re: [HACKERS] Overhauling GUCS

2008-06-01 Thread Ron Mayer

Gregory Stark wrote:


I think we do a pretty good job of this already. Witness things like
effective_cache_size -- imagine if this were nested_loop_cache_hit_rate for
example, good luck figuring out what to set it to.


I think either of these is fine if we describe how to measure
them.  Ideally, if we had a GUC called log_nested_loop_cache_hit_rate
that enabled some timing code (understandably with lots of overhead)
and made an attempt to measure the hit rate, it'd be easier to figure
out than effective_cache_size, no?


The vacuum cost delay factors are probably ripe for such a recast though. I
think we need just one parameter vacuum_io_bandwidth or something like that.


+1; though perhaps the inverse of that is more useful.  When my
machines are idle I'd be happy if they vacuumed more.  Wouldn't we be
better served by specifying the I/O bandwidth of each device/tablespace
and letting vacuum use whatever portion would otherwise be idle?


The bgwriter parameters might also be a candidate but I'm less certain.





Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 What do you think a less invasive patch would be, anyway?  I don't
 buy that, say, having SPI_cursor_open_with_args set the flag but
 SPI_cursor_open not do so is any safer.  There is no difference between
 the two as to what might get executed, so if there's a problem then
 both would be at risk.

 SPI_cursor_open_with_args is new function, it's used only in FOR
 EXECUTE statement - and in this context variables are really
 constants.

This argument seems entirely bogus.  How are they any more constant
than in the other case?  The value isn't going to change for the life
of the portal in either case.

ISTM you're expecting EXPLAIN to behave in some magic way that has
got little to do with correctness.

regards, tom lane



Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Pavel Stehule
2008/6/1 Tom Lane [EMAIL PROTECTED]:
 Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 What do you think a less invasive patch would be, anyway?  I don't
 buy that, say, having SPI_cursor_open_with_args set the flag but
 SPI_cursor_open not do so is any safer.  There is no difference between
 the two as to what might get executed, so if there's a problem then
 both would be at risk.

 SPI_cursor_open_with_args is new function, it's used only in FOR
 EXECUTE statement - and in this context variables are really
 constants.

 This argument seems entirely bogus.  How are they any more constant
 than in the other case?  The value isn't going to change for the life
 of the portal in either case.

This is true, Tom, but the problem is in EXPLAIN. I think both my
solution and yours are a little bit incorrect: we fix the result, not
the cause. We have the problem because the plan doesn't carry the
parameters' flags, and with EXPLAIN the planner is called two times
with different param flags.



 ISTM you're expecting EXPLAIN to behave in some magic way that has
 got little to do with correctness.


This is the first time I have done something with EXPLAIN and I don't
understand it well, but I would like to correct the EXPLAIN output:
when the original plan uses variables I would like to see variables in
the plan, and when the plan uses constants I would like to see
constants.


regards, tom lane




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Merlin Moncure
On Sun, Jun 1, 2008 at 11:58 AM, Robert Hodges
[EMAIL PROTECTED] wrote:
 Hi Merlin,

 My point here is that with reasonably small extensions to the core you can
 build products that are a lot better than SLONY.   Triggers do not cover
 DDL, among other issues, and it's debatable whether they are the best way to
 implement quorum policies like Google's semi-synchronous replication.  As I
 mentioned separately this topic deserves another thread which I promise to
 start.

These issues are much discussed and well understood.  At this point,
the outstanding points of discussion are technical...how to make this
thing work.

merlin



Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 This argument seems entirely bogus.  How are they any more constant
 than in the other case?  The value isn't going to change for the life
 of the portal in either case.

 This is true, Tom, but the problem is in EXPLAIN. I think both my
 solution and yours are a little bit incorrect: we fix the result, not
 the cause. We have the problem because the plan doesn't carry the
 parameters' flags, and with EXPLAIN the planner is called two times
 with different param flags.

[ shrug... ]  Well, I'm willing to change the code as you suggest,
but if you're thinking that this will make EXPLAIN exactly reproduce
the plan that would be generated for a plain SELECT invoked in the
same context, you're still mistaken.  It doesn't account for the
effects of the fast-start-cursor option.  And for what you seem to
want EXPLAIN to do here, it probably shouldn't.  The whole thing
seems pretty unprincipled to me ...

regards, tom lane



Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 On Sun, Jun 1, 2008 at 11:58 AM, Robert Hodges
 [EMAIL PROTECTED] wrote:
 My point here is that with reasonably small extensions to the core you can
 build products that are a lot better than SLONY.

 These issues are much discussed and well understood.

Well, what we know is that previous attempts to define replication hooks
to be added to the core have died for lack of interest.  Maybe Robert
can start a new discussion that will actually get somewhere; if so, more
power to him.  (Is the replica-hooks-discuss list still working?)  But
that is entirely orthogonal to what is proposed in this thread, which
is to upgrade the existing PITR support into a reasonably useful
replication feature.

regards, tom lane



Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Dawid Kuroczko
On Thu, May 29, 2008 at 4:12 PM, Tom Lane [EMAIL PROTECTED] wrote:
 The Postgres core team met at PGCon to discuss a few issues, the largest
 of which is the need for simple, built-in replication for PostgreSQL.
[...]
 We believe that the most appropriate base technology for this is
 probably real-time WAL log shipping, as was demoed by NTT OSS at PGCon.
 We hope that such a feature can be completed for 8.4.  Ideally this
 would be coupled with the ability to execute read-only queries on the
 slave servers, but we see technical difficulties that might prevent that
 from being completed before 8.5 or even further out.  (The big problem
 is that long-running slave-side queries might still need tuples that are
 vacuumable on the master, and so replication of vacuuming actions would
 cause the slave's queries to deliver wrong answers.)

 Again, this will not replace Slony, pgPool, Continuent, Londiste, or
 other systems for many users, as it will be not be highly scalable nor
 support long-distance replication nor replicating less than an entire
 installation.  But it is time to include a simple, reliable basic
 replication feature in the core system.

Hello!

I thought I would share a few thoughts of my own about the issue.
I have hands-on experience with Oracle and MySQL apart from
PostgreSQL, so I hope it will be a bit interesting.

The former has a feature called physical standby, which looks
quite like our WAL-shipping based replication.  Simply put, archived
logs are replayed on the standby database.  The primary database
and standby database are connected, and can stream the logs
directly.  They either copy the log when it's finished (as we do now)
or can do it in a continuous manner (as I hope we will be able to).

It is possible to have synchronous replication (where COMMIT
on the primary database succeeds only when the data is safely stored
on the standby database).  I think such a feature would be a great
advantage for PostgreSQL (where you cannot afford to lose
any transactions).

Their standby database is not accessible.  It can be opened read-only,
but during that time replication stops.  So PostgreSQL having
read-only and still replicating standby database would be great.

The other method is logical standby which works by dissecting
WAL-logs and recreating DDLs/DMLs from it.  Never seen anyone
use it. ;-)

Then we have a mysql replication -- done by replaying actual DDLs/DMLs
on the slaves.  This approach has issues, most notably when slaves are
highly loaded and lag behind the master -- so you end up with infrastructure
to monitor lags and turn off slaves which lag too much.  Also it is painful
to setup -- you have to stop, copy, configure and run.

* Back to PostgreSQL world

As for PostgreSQL solutions we have Slony-I, which is great as long as
you don't have too many people managing the database and/or your
schema doesn't change too frequently.  Perhaps it would be more easily
maintainable if there were a way to capture DDLs (via DDL triggers or
similar).  Its main advantages for me are the ability to prepare
complex setups and to easily add new slaves.  The pgpool solution is
quite nice, but then again adding a new slave is not so easy.  And
being a filtering layer between client and server it feels a bit
fragile (I know it is not, but then again it is harder to convince
someone that yes, it will work 100% right all the time).

* How I would like PostgreSQL WAL-replication to evolve:

First of all it would be great if a slave/standby would contact the master
and maintain the state with it (tell it its xmin, request a log to stream,
go online-streaming).  Especially I hope that it should be possible
to make a switchover (where the two databases exchange roles),
and in this the direct connection between the two should help.

In detail, I think it should go like this:
* A slave database starts up, checks that it works as a replica
(hopefully it would not be a postgresql.conf constant, but rather
some file maintained by the database).
* It would connect to the master database, tell where in the WAL
it is now, and request a log N.
* If log N is not available, request a log from external supplied
script (so that it could be fetched from log archive repository
somewhere, recovered from a backup tape, etc).
* Continue asking, until we get to the logs which are available
at master database.
* Continue replaying until we get within max_allowed_replication_lag
time, and open our slave for read-only queries.
* If we start lagging too much perhaps close the read-only access
to the database (perhaps configurable?).

I think that replication should be easy to set up.  I think our
archive_command is quite easy, but many a person comes
with a lot of misconceptions about how it works (and it takes time
to explain to them how it actually works, especially what
archive_command is for, and that pg_start_backup() doesn't
actually _do_ a backup, but just tells PostgreSQL that a
backup is being done).
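(For reference, the archiving setup whose misconceptions are described above amounts to a couple of postgresql.conf settings in 8.3-era syntax; the archive path here is illustrative. pg_start_backup() only brackets the file-level base backup - the copying is done externally.)

```
# postgresql.conf (8.3): hand each finished WAL segment to an archive script
archive_mode = on
archive_command = 'cp %p /mnt/server/archive/%f'   # %p = segment path, %f = file name
```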

Easy to setup and easy to switchover (change the 

Re: [HACKERS] proposal: table functions and plpgsql

2008-06-01 Thread Pavel Stehule
Hello

After some days I think the idea of local types is wrong. Maybe we can
register output types for our SRF functions (maybe only for table
functions), but this mechanism is redundant with explicit custom types.
Local function types are nice - they allow a better compile-time check -
but they are unnecessary.

Sample:
CREATE OR REPLACE FUNCTION foo(a integer)
RETURNS TABLE(a integer,  b integer) AS $$
DECLARE r record;
BEGIN
  FOR i IN 1..a LOOP
r := ROW(i, i+1);
RETURN NEXT r;
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;

or
-- more in SQL/PSM character
CREATE OR REPLACE FUNCTION foo(a integer)
RETURNS TABLE(a integer, b integer) AS $$
BEGIN
  RETURN TABLE SELECT i, i+1
   FROM generate_series(1,a) g(i);
  RETURN;
END;
$$ LANGUAGE plpgsql;
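Under the proposed semantics, either variant should behave like an ordinary set-returning function; e.g. for the first one:

```sql
SELECT * FROM foo(3);
--  a | b
-- ---+---
--  1 | 2
--  2 | 3
--  3 | 4
-- (3 rows)
```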

Any comments?

Regards
Pavel Stehule
2008/5/21 Pavel Stehule [EMAIL PROTECTED]:
 Hello

 I am returning to my patch and older proposal
 http://archives.postgresql.org/pgsql-hackers/2007-02/msg00318.php .

 Neil Conway did some work on this
 http://archives.postgresql.org/pgsql-hackers/2007-07/msg00501.php and
 he committed half of this patch - the RETURN QUERY part.

 The problematic part of my patch is the implementation. Tom Lane
 proposed implementing RETURNS TABLE only as syntactic sugar for
 RETURNS SETOF RECORD. This is not compatible with my potential
 implementation, because it adds some default variables. My solution
 was a special argmode, so I was able to avoid creating default
 variables for the output. My solution wasn't the best either. It was
 ugly for current plpgsql, where the RETURN NEXT statement is often
 used (PSM doesn't know a similar statement). I dislike default
 variables - they are a simple way to get collisions between variable
 and column names.

 I propose following syntax for plpgsql:

 CREATE OR REPLACE FUNCTION foo(m integer)
 RETURNS TABLE (a integer, b integer) AS $$
 DECLARE r foo; -- same name as function, this type has local visibility
 BEGIN
  FOR i IN 1..m LOOP
r.a := i; r.b := i + 1;
RETURN NEXT r;
  END LOOP;
  RETURN;
 END;
 $$ LANGUAGE plpgsql;

 In my proposal I don't create any default variables. The result type is
 only virtual - I don't need to write it to the system catalogs. I think
 it's better than using some specific predeclared type such as RESULTTYPE
 or RESULTSET.

 What do you think about it?

 Regards
 Pavel Stehule


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where can I find the doxyfile?

2008-06-01 Thread Zdenek Kotala
Xin Wang wrote:
 Hi,
 I don't know where I can find the doxyfile which generate
 doxygen.postgresql.org web site. I found that when reading code the
 doxygen source code is quite helpful. However, I want to generate an
 off-line copy of doxygen docs myself, but I can't find the doxyfile in
 the lastest source release.

I think it is a good idea. Stefan, what about putting it on the wiki?

Zdenek





Re: [HACKERS] explain doesn't work with execute using

2008-06-01 Thread Pavel Stehule
2008/6/1 Tom Lane [EMAIL PROTECTED]:
 Pavel Stehule [EMAIL PROTECTED] writes:
 2008/6/1 Tom Lane [EMAIL PROTECTED]:
 This argument seems entirely bogus.  How are they any more constant
 than in the other case?  The value isn't going to change for the life
 of the portal in either case.

 this is true, Tom, but the problem is in EXPLAIN. I think both my
 solution and yours are a little bit incorrect. We address the symptom,
 not the cause. We have a problem because the plan doesn't carry the
 parameters' flags, and with EXPLAIN the planner is called twice with
 different parameter flags.

 [ shrug... ]  Well, I'm willing to change the code as you suggest,
 but if you're thinking that this will make EXPLAIN exactly reproduce
 the plan that would be generated for a plain SELECT invoked in the
 same context, you're still mistaken.  It doesn't account for the
 effects of the fast-start-cursor option.  And for what you seem to
 want EXPLAIN to do here, it probably shouldn't.  The whole thing
 seems pretty unprincipled to me ...


It's not ideal, and it's a surprise to me that EXPLAIN can differ
from the real plan. It's the basic tool for identifying performance
problems in plpgsql procedures. So could this be a short-term fix,
plus a TODO item?

Regards
Pavel Stehule

regards, tom lane




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread James Mansion

David Fetter wrote:

This part is a deal-killer.  It's a giant up-hill slog to sell warm
standby to those in charge of making resources available because the
warm standby machine consumes SA time, bandwidth, power, rack space,
etc., but provides no tangible benefit, and this feature would have
exactly the same problem.

IMHO, without the ability to do read-only queries on slaves, it's not
worth doing this feature at all.
  
That's not something that squares with my experience *at all*, which
admittedly is entirely in investment banks. Business continuity is king,
and in some places the warm standby replication from the database vendor
is trusted more than block-level replication from the SAN vendor (though
that may be changing to some extent in favour of the SAN).

James




Re: [HACKERS] replication hooks

2008-06-01 Thread James Mansion

Marko Kreen wrote:

There is this tiny matter of replicating schema changes asynchronously,
but I suspect nobody actually cares.  Few random points about that:
  
I'm not sure I follow you - the Sybase 'warm standby' replication of
everything is really useful for business continuity.  The per-table
replication is more effective for publishing reference data, but is
painful to maintain.

Not having something that automagically replicates a complete copy,
including DDL (except for temp tables), is a major weakness IMO.

James




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread James Mansion

Aidan Van Dyk wrote:

The whole single-threaded WAL replay problem is going to rear it's ugly
head here too, and mean that a slave *won't* be able to keep up with a
busy master if it's actually trying to apply all the changes in
real-time.
Is there a reason to commit at the same points that the master
committed?  Wouldn't relaxing that mean that at least you would get
'big' commits and some economy of scale?  It might not be too bad.
All I can say is that Sybase warm standby is useful, even though the
replication for an update that changes a hundred rows is a hundred
updates keyed on primary key, which is pretty sucky in terms of T-SQL
performance.




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Hannu Krosing
On Thu, 2008-05-29 at 13:37 -0400, Tom Lane wrote:
 David Fetter [EMAIL PROTECTED] writes:
  On Thu, May 29, 2008 at 08:46:22AM -0700, Joshua D. Drake wrote:
  The only question I have is... what does this give us that PITR
  doesn't give us?
 
  It looks like a wrapper for PITR to me, so the gain would be ease of
  use.
 
 A couple of points about that:
 
 * Yeah, ease of use is a huge concern here.  We're getting beat up
 because people have to go find a separate package (and figure out
 which one they want), install it, learn how to use it, etc.  It doesn't
 help that the most mature package is Slony which is, um, not very
 novice-friendly or low-admin-complexity.  I personally got religion
 on this about two months ago when Red Hat switched their bugzilla
 from Postgres to MySQL because the admins didn't want to deal with Slony
 any more.  People want simple.
 
 * The proposed approach is trying to get to real replication
 incrementally.  Getting rid of the loss window involved in file-by-file
 log shipping is step one, 

Actually, we can already do better than file-by-file shipping by using
pg_xlogfile_name_offset(), which was added sometime in 2006. SkyTools,
for example, uses it to get a failure window of no more than a few seconds.
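For reference, here is a minimal sketch (in Python, not PostgreSQL source) of how a WAL location such as '0/2000000' maps to the segment file name and byte offset that pg_xlogfile_name_offset() reports. This assumes the default 16 MB segment size and the pre-9.3 naming scheme (timeline, log id, segment, each as 8 hex digits); the function names are mine:

```python
# 16 MB is the default WAL segment size (a compile-time option in PostgreSQL).
XLOG_SEG_SIZE = 16 * 1024 * 1024

def wal_filename(timeline, lsn):
    """Map an LSN like '0/2000000' to a WAL segment file name.

    The name is TTTTTTTTLLLLLLLLSSSSSSSS: timeline, log id (the part
    before the slash), and segment number within that log id.
    """
    logid_hex, offset_hex = lsn.split("/")
    logid = int(logid_hex, 16)
    seg = int(offset_hex, 16) // XLOG_SEG_SIZE
    return "%08X%08X%08X" % (timeline, logid, seg)

def wal_offset(lsn):
    """Byte position of the LSN inside its segment file."""
    return int(lsn.split("/")[1], 16) % XLOG_SEG_SIZE
```

So a daemon shipping partial segments only needs the (file name, offset) pair to know how many bytes of the current segment are safe to copy.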

Doing this synchronously would of course be better.

Probably we should use the same modes/protocols as DRBD when
determining when a synchronous WAL write is done.

quote from 
http://www.slackworks.com/~dkrovich/DRBD/usingdrbdsetup.html#AEN76


Table 1. DRBD Protocols

Protocol A - A write operation is complete as soon as the data is
written to disk and sent to the network.

Protocol B - A write operation is complete as soon as a reception
acknowledgement arrives.

Protocol C - A write operation is complete as soon as a write
acknowledgement arrives.

There are also additional parameters you can pass to the disk and net
options. See the drbdsetup man page for additional information.

/end quote
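The three completion rules quoted above can be sketched as a toy model of what "synchronous" could mean for WAL shipping; the event names below are made up for illustration, not taken from DRBD or PostgreSQL:

```python
# Which events must have happened before a write counts as complete,
# per DRBD-style protocol (A, B, C as in the quoted table).
REQUIRED = {
    "A": {"local_disk_write", "sent_to_network"},       # fire and forget
    "B": {"local_disk_write", "remote_receive_ack"},    # slave received it
    "C": {"local_disk_write", "remote_disk_write_ack"}, # slave fsynced it
}

def write_complete(protocol, events):
    """True once every event the chosen protocol demands has occurred."""
    return REQUIRED[protocol] <= set(events)
```

Protocol C is the only one that survives losing both machines' caches at once, which is why it is the natural candidate for a no-data-loss WAL sync.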

 and I suspect that step two is going to be
 fixing performance issues in WAL replay to ensure that slaves can keep
 up.  After that we'd start thinking about how to let slaves run
 read-only queries.  But even without read-only queries, this will be
 a useful improvement for HA/backup scenarios.
 
   regards, tom lane
 




Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Hannu Krosing
On Fri, 2008-05-30 at 15:16 -0400, Robert Treat wrote:
 On Friday 30 May 2008 01:10:20 Tom Lane wrote:
  Greg Smith [EMAIL PROTECTED] writes:
   I fully accept that it may be the case that it doesn't make technical
   sense to tackle them in any order besides sync-read-only slaves because
   of dependencies in the implementation between the two.
 
  Well, it's certainly not been my intention to suggest that no one should
  start work on read-only-slaves before we finish the other part.  The
  point is that I expect the log shipping issues will be done first
  because they're easier, and it would be pointless to not release that
  feature if we had it.
 
  But since you mention it: one of the plausible answers for fixing the
  vacuum problem for read-only slaves is to have the slaves push an xmin
  back upstream to the master to prevent premature vacuuming.  The current
  design of pg_standby is utterly incapable of handling that requirement.
  So there might be an implementation dependency there, depending on how
  we want to solve that problem.
 
 
 Sure, but whose to say that after synchronous wal shipping is finished it 
 wont need a serious re-write due to new needs from the hot standby feature. I 
 think going either way carries some risk. 

The simplest form of synchronous WAL shipping would not even need
PostgreSQL running on the slave, just a small daemon which reports when
WAL blocks are a) received and b) synced to disk.

This setup would just guarantee no data loss on single-machine failure.
From there on you could add various features, including support for both
switchover and failover, async replication to multiple slaves, etc.

The only thing that needs anything additional from the slave's
WAL-receiving daemon is when you want the kind of WAL sync which
guarantees that a read-only query on the slave, issued after a commit
returns on the master, sees the latest data. For this kind of guarantee
you need at least feedback about WAL replay, but possibly also shared
transaction numbers and shared snapshots, to be sure that OLTP-type
queries see the latest data and OLAP queries are not denied access to
rows VACUUMed on the master.

--
Hannu





Re: [HACKERS] Overhauling GUCS

2008-06-01 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160


Andrew Dunstan wrote:

 We already have include directives, and have had since 8.2.

Heh, thanks - which proves how useless they are to me. :)

Joshua Drake wrote:

 I kind of agree with this but actually think we should have the bare
 minimum comments in the file. Why? Because our documentation links are
 static. Each setting should have the URL to the full documentation on a
 particular setting.

Ugh, why so much context switching? Put the docs next to the setting. URLs
are nice but not necessary. If you are arguing for minimum comments in
conf files, please make a patch for pg_hba.conf ;)

 * Create a tool, or at least a best practices, for controlling and tracking
 changes to the file.

 This I disagree with. There are plenty of tools to handle this should
 someone really want to. SVN, CVS, parrot, etc... Let systems management
 be the domain of systems management.

Well, perhaps just a note in the docs at least that one might want to put
postgresql.conf in version control. I've seen people not doing so more often
than you would think. Perhaps because they are DBAs and not sysadmins? I also
meant a tool to do things like verify that the changes are valid, as someone
else mentioned elsewhere in this thread.

 * It might be nice to mention other ways to reload the file, such as
 'service postgresql reload', or whatever Windows uses.

 I think a url to the docs is a better idea here.

Good point. Maybe a sort of DBA basics page in the docs is warranted for
things like this.

 * Since the executable is now named postgres (thank goodness we got
 rid of postmaster), the file should be named 'postgres.conf'. This would
 also be a way to quickly distinguish 'old' vs 'new' style conf files if
 we end up making major changes to it.

 It was never postmaster.conf (that I can recall). I don't see the issue
 here. Consider apache... It isn't apache.conf.

Not saying it ever was postmaster.conf: just that I'm glad we finally
changed the name. As for the Apache project, the httpd executable reads the
httpd.conf file. Hence, one might expect the postgres executable to read a
postgres.conf file.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
End Point Corporation
PGP Key: 0x14964AC8 200806011656
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-BEGIN PGP SIGNATURE-

iEYEAREDAAYFAkhDEJMACgkQvJuQZxSWSsgeogCfT0g69NDoxyWGiWmDcB3PxH8h
wJ8AnjzssA7aIk0rBdJzL+bB5vSQSeBV
=lgZG
-END PGP SIGNATURE-





Re: [HACKERS] Add dblink function to check if a named connection exists

2008-06-01 Thread Joe Conway

Tom Lane wrote:

Tommy Gildseth [EMAIL PROTECTED] writes:
One obvious disadvantage of this approach, is that I need to connect and 
disconnect in every function. A possible solution to this, would be 
having a function f.ex dblink_exists('connection_name') that returns 
true/false depending on whether the  connection already exists.


Can't you do this already?

SELECT 'myconn' = ANY (dblink_get_connections());

A dedicated function might be a tad faster, but it probably isn't going
to matter compared to the overhead of sending a remote query.


I agree. The above is about as simple as
  SELECT dblink_exists('dtest1');
and probably not measurably slower. If you still think a dedicated 
function is needed, please send the output of some performance testing 
to justify it.


If you really want the notational simplicity, you could use an SQL 
function to wrap it:


CREATE OR REPLACE FUNCTION dblink_exists(text)
RETURNS bool AS $$
  SELECT $1 = ANY (dblink_get_connections())
$$ LANGUAGE sql;

contrib_regression=# SELECT dblink_exists('dtest1');
 dblink_exists
---
 f
(1 row)

I guess it might be worthwhile adding the SQL function definition to 
dblink.sql.in as an enhancement in 8.4.


Joe




[HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler

Howdy,

I'm sure I'm just showing off my ignorance here, but here goes…

I really need case-insensitive string comparison in my database.  
Ideally there'd be a nice ITEXT data type (and friends, ichar,  
ivarchar, etc.). But of course there isn't, and for years I've just  
used LOWER() on indexes and queries to get the same result.
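The LOWER()-on-both-sides idiom is an expression index plus a folded comparison. The sketch below uses Python's bundled sqlite3 just to keep it runnable (SQLite's built-in LOWER() is likewise ASCII-only, which is exactly the limitation this thread is about); the table and data are made up, but the SQL pattern is the same one used against Postgres:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES ('David', 'DAVID@Example.COM')")
# Expression index so the case-folded lookup can use an index scan
# instead of folding every row at query time.
conn.execute("CREATE INDEX users_email_lower ON users (LOWER(email))")

def find_user_ci(conn, email):
    # Fold both sides: matches regardless of stored case (ASCII only).
    cur = conn.execute(
        "SELECT name FROM users WHERE LOWER(email) = LOWER(?)", (email,))
    return cur.fetchone()

row = find_user_ci(conn, "david@example.com")
```

The catch, as the script below demonstrates, is that this folding only works for unaccented ASCII letters.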


Only it turns out that I'm of course not getting the same result. This  
script:


#!/usr/local/bin/perl -w

use strict;
use warnings;
use utf8;
binmode STDOUT, ':utf8';
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=try', 'postgres', '',
    { pg_enable_utf8 => 1 });

for my $char (qw( À Á Â Ã Ä Å Ç Ć Č Ĉ Ċ Ď Đ A B C D )) {
    print "$char: ", $dbh->selectrow_array('SELECT LOWER(?)', undef, $char), $/;
}

Yields this output:

À: À
Á: Á
Â: Â
Ã: Ã
Ä: Ä
Å: Å
Ç: Ç
Ć: Ć
Č: Č
Ĉ: Ĉ
Ċ: Ċ
Ď: Ď
Đ: Đ
A: a
B: b
C: c
D: d

So it doesn't really work on anything other than ASCII, it looks like.  
So I have two questions:


1. Does the use of the tolower() C function in the citext data type on  
pgfoundry basically give me the same results as using lower() in my  
SQL has for all these years? IOW, does it convert letters to lowercase  
in the same way that the LOWER() SQL function does? If so, I think I  
might start to use it for my case-insensitive columns and simplify my  
SQL a bit.


  http://pgfoundry.org/projects/citext/

2. Isn't the ICU library distributed with PostgreSQL? And if so, could  
it not be used to create proper case conversions in LOWER() and  
friends and, ultimately, to create a case-insensitive text type in  
core? I'm seeing that it has a constant named U_COMPARE_IGNORE_CASE  
that can be used with its unorm_compare() function:


  
http://www.icu-project.org/apiref/icu4c/unorm_8h.html#6cc4c8b0d5be7ce1ac4b600ace7817f5
  
http://www.icu-project.org/apiref/icu4c/unorm_8h.html#991e0fe6f0d062dd6e8e924517f3f437

I don't really know C, but if that's stuff there, can't we take  
advantage of it for proper case-insensitive comparisons (and  
conversions)?


Thanks,

David




Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
 I really need case-insensitive string comparison in my database.  

Okay ... according to whose locale?

 Ideally there'd be a nice ITEXT data type (and friends, ichar,  
 ivarchar, etc.). But of course there isn't, and for years I've just  
 used LOWER() on indexes and queries to get the same result.

 Only it turns out that I'm of course not getting the same result.

I think that means you're not using the right locale.

 1. Does the use of the tolower() C function in the citext data type on  
 pgfoundry basically give me the same results as using lower() in my  
 SQL has for all these years?

[ broken record... ]  Kinda depends on your locale.  However, tolower()
is 100% guaranteed not to work for multibyte encodings, so citext is
quite useless if you're using UTF8.  This is fixable, no doubt, but
it's not fixed in the project as it stands.
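A quick way to see the multibyte problem Tom describes, sketched in Python: the helper below mimics what a per-byte tolower() does in the C locale, and a UTF-8 encoded character simply passes through untouched because none of its bytes fall in the ASCII 'A'-'Z' range:

```python
def ascii_tolower_bytes(data: bytes) -> bytes:
    """Mimic C's tolower() in the C locale: only bytes 'A'-'Z' change."""
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in data)

utf8 = "À".encode("utf-8")            # b'\xc3\x80' -- two bytes, neither ASCII
folded = ascii_tolower_bytes(utf8)    # unchanged: no byte is in 'A'-'Z'
```

Compare a Unicode-aware fold, where "À".lower() does produce "à". A byte-at-a-time tolower() can never get there, regardless of locale tweaking.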

 2. Isn't the ICU library distributed with PostgreSQL?

Nope, it is not, and we have already pretty much determined that we
do not want to make Postgres depend on ICU.  See the archives.

regards, tom lane



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Zdenek Kotala

David E. Wheeler wrote:

Howdy,

I'm sure I'm just showing off my ignorance here, but here goes…

I really need case-insensitive string comparison in my database.


Per-database collation support should help you. It is now under development, and 
I hope it will be part of 8.4. You can see 
http://archives.postgresql.org/pgsql-hackers/2008-05/msg00857.php


Zdenek



Re: [HACKERS] Overhauling GUCS

2008-06-01 Thread Joshua D. Drake

Greg Sabino Mullane wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: RIPEMD160



Joshua Drake wrote:


I kind of agree with this but actually think we should have the bare
minimum comments in the file. Why? Because our documentation links are
static. Each setting should have the URL to the full documentation on a
particular setting.


Ugh, why so much context switching? Put the docs next to the setting. URLs
are nice but not necessary. If you are arguing for minimum comments in
conf files, please make a patch for pg_hba.conf ;)


Hah! Well, I don't know that a minimum of comments is what I am arguing 
for so much as not too many comments. The comments in postgresql.conf are 
generally useless unless you have previous knowledge. I really think that 
if we take advantage of the fact that we have static URLs, life would be 
easier overall.





* Create a tool, or at least a best practices, for controlling and tracking
changes to the file.



This I disagree with. There are plenty of tools to handle this should
someone really want to. SVN, CVS, parrot, etc... Let systems management
be the domain of systems management.


Well, perhaps just a note in the docs at least that one might want to put
postgresql.conf in version control.


I could certainly buy into this. No reason we can't help people become 
better administrators. I would suggest a link to a static wiki page (on 
wiki.pg) that would link to each option?



I've seen people not doing so more often
than you would think. Perhaps because they are DBAs and not sysadmins? I also
meant a tool to do things like verify that the changes are valid, as someone
else mentioned elsewhere in this thread.


pg_ctl -D data check?

I would +1 that. Including (in later releases):

WARNING: You specified 66536 for shared_buffers but you only have 131072 
of memory. Consider decreasing the parameter.

Obviously we would need friendlier, less math-heavy wording.




* It might be nice to mention other ways to reload the file, such as
'service postgresql reload', or whatever Windows uses.



I think a url to the docs is a better idea here.


Good point. Maybe a sort of DBA basics page in the docs is warranted for
things like this.



Yeah I could buy into this.


  * Since the executable is now named postgres (thank goodness we got

rid of postmaster), the file should be named 'postgres.conf'. This would
also be a way to quickly distinguish 'old' vs 'new' style conf files if
we end up making major changes to it.



It was never postmaster.conf (that I can recall). I don't see the issue
here. Consider apache... It isn't apache.conf.


Not saying it ever was postmaster.conf: just that I'm glad we finally
changed the name. As for the Apache project, the httpd executable reads the
httpd.conf file. Hence, one might expect the postgres executable to read a
postgres.conf file.


Maybe, but I think I would need more convincing.

Sincerely,

Joshua D. Drake



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler

On Jun 1, 2008, at 21:08, Tom Lane wrote:


David E. Wheeler [EMAIL PROTECTED] writes:

I really need case-insensitive string comparison in my database.


Okay ... according to whose locale?


I'm using C. Of course you're correct that it depends on the locale; I  
always forget that. But doesn't the Unicode standard offer up some sort  
of locale-independent case-insensitivity, so that it gets it right some  
large percentage of the time?
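For what it's worth, Unicode does define a default, locale-independent case folding (the known exceptions are language-specific tailorings such as Turkish dotless i). Python's str.casefold() implements that default folding and illustrates the "right a large percentage of the time" behaviour; a minimal check:

```python
# Pairs of upper/lower spellings that default Unicode case folding
# should treat as equal, without any locale being set.
pairs = [("À", "à"), ("STRASSE", "strasse"), ("Ćirić", "ćirić")]

# casefold() is a locale-independent, aggressive lower(): it also maps
# ß -> ss, which plain lower() does not.
ok = all(a.casefold() == b.casefold() for a, b in pairs)
```

This is the kind of folding an itext-style type could use if it did not want to depend on the server locale at all.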



Ideally there'd be a nice ITEXT data type (and friends, ichar,
ivarchar, etc.). But of course there isn't, and for years I've just
used LOWER() on indexes and queries to get the same result.



Only it turns out that I'm of course not getting the same result.


I think that means you're not using the right locale.


What locale is right? If I have a Web app, there could be data in many  
different languages in a single table/column.


1. Does the use of the tolower() C function in the citext data type  
on

pgfoundry basically give me the same results as using lower() in my
SQL has for all these years?


[ broken record... ]  Kinda depends on your locale.  However,  
tolower()

is 100% guaranteed not to work for multibyte encodings, so citext is
quite useless if you're using UTF8.  This is fixable, no doubt, but
it's not fixed in the project as it stands.


Right, okay; thanks. I'm thinking about using it for email addresses  
and domain names, however, so it might be adequate for those  
applications.



2. Isn't the ICU library distributed with PostgreSQL?


Nope, it is not, and we have already pretty much determined that we
do not want to make Postgres depend on ICU.  See the archives.


Damn. Okay, thanks.

David



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Tom Lane
David E. Wheeler [EMAIL PROTECTED] writes:
 On Jun 1, 2008, at 21:08, Tom Lane wrote:
 Okay ... according to whose locale?

 I'm using C. Of course you're correct that it depends on the locale, I  
 always forget that. But does not the Unicode standard offer up some  
 sort locale-independent case-insensitivity, so that it gets it right  
 some large percentage of the time?

Not really, and in any case the C locale completely disables any
knowledge of Unicode.  C locale knows about 7-bit ASCII and nothing
more.

regards, tom lane



Re: [HACKERS] Overhauling GUCS

2008-06-01 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Greg Sabino Mullane wrote:
 Ugh, why so much context switching? Put the docs next to the setting. URLs
 are nice but not necessary. If you are arguing for minimum comments in
 conf files, please make a patch for pg_hba.conf ;)

 Hah! Well I don't know that a minimum of comments is what I am arguing 
 as much as not too much comments. The comments in general in the 
 postgresql.conf are useless unless you have previous knowledge. I really 
 think that if we take advantage of the fact that we have static URLs 
 that life would be easier overall.

Yeah ... this thread started out with the idea of reducing duplication,
so how did we arrive at wanting to duplicate the SGML documentation into
postgresql.conf.sample?  I think we need to integrate, not duplicate.

(pg_hba.conf.sample is a pretty good example of what not to do, I think
... and it used to be even worse ...)

regards, tom lane




Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Oleg Bartunov

David,

We wrote a contrib module (mchar) for one customer, which is porting its
application from MSSQL to Postgres. It does case-insensitive comparison 
for the new data type 'mchar' and links with ICU for a system-independent locale.


Oleg




Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler

On Jun 1, 2008, at 22:18, Tom Lane wrote:

I'm using C. Of course you're correct that it depends on the  
locale, I

always forget that. But does not the Unicode standard offer up some
sort locale-independent case-insensitivity, so that it gets it right
some large percentage of the time?


Not really, and in any case the C locale completely disables any
knowledge of Unicode.  C locale knows about 7-bit ASCII and nothing
more.


And the locale can only be set by initdb?

I don't suppose that there are any collations that sort and index
case-insensitively, are there? I don't see anything suggestive in
`locale -a`…


Thanks,

David


Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread David E. Wheeler

On Jun 1, 2008, at 22:21, Oleg Bartunov wrote:


David,

we wrote contrib module (mchar) for one customer, which ports its
application from mssql to postgres. It does case-insensitive  
comparison for new data type 'mchar' and linked with ICU for system  
independent locale.


That sounds promising. I don't suppose that it has been released, has  
it?


Thanks,

David




Re: [HACKERS] Case-Insensitve Text Comparison

2008-06-01 Thread Oleg Bartunov

On Sun, 1 Jun 2008, David E. Wheeler wrote:


On Jun 1, 2008, at 22:21, Oleg Bartunov wrote:


David,

we wrote contrib module (mchar) for one customer, which ports its
application from mssql to postgres. It does case-insensitive comparison for 
new data type 'mchar' and linked with ICU for system independent locale.


That sounds promising. I don't suppose that it has been released, has it?


It's available as part of a patch; see (use Google Translate):
http://v8.1c.ru/overview/postgres_patches_notes.htm

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Core team statement on replication in PostgreSQL

2008-06-01 Thread Hannu Krosing
On Thu, 2008-05-29 at 12:05 -0700, Robert Hodges wrote:
 Hi everyone, 
 
 First of all, I’m absolutely delighted that the PG community is
 thinking seriously about replication.  
 
 Second, having a solid, easy-to-use database availability solution
 that works more or less out of the box would be an enormous benefit to
 customers.  Availability is the single biggest problem for customers
 in my experience and as other people have commented the alternatives
 are not nice.  It’s an excellent idea to build off an existing feature
 —PITR is already pretty useful and the proposed features are solid
 next steps.  The fact that it does not solve all problems is not a
 drawback but means it’s likely to get done in a reasonable timeframe. 
 
 Third, you can’t stop with just this feature.  (This is the BUT part
 of the post.)  The use cases not covered by this feature are actually
 pretty large.  Here are a few that concern me: 
 
 1.) Partial replication. 
 2.) WAN replication. 

1.) & 2.) are better done async; that is the domain of Slony-I/Londiste.

 3.) Bi-directional replication.  (Yes, this is evil but there are
 problems where it is indispensable.) 

Sure, it is also a lot harder, and it always has several dimensions
(performance/availability/locking) which play against each other.

 4.) Upgrade support.  Aside from database upgrade (how would this ever
 really work between versions?), it would not support zero-downtime app
 upgrades, which depend on bi-directional replication tricks. 

Or you could use zero-downtime app upgrade techniques that don't depend
on this :P

 5.) Heterogeneous replication. 
 6.) Finally, performance scaling using scale-out over large numbers of
 replicas.  I think it’s possible to get tunnel vision on this—it’s not
 a big requirement in the PG community because people don’t use PG in
 the first place when they want to do this.  They use MySQL, which has
 very good replication for performance scaling, though it’s rather weak
 for availability.  

Again, scale-out over a large number of replicas should either be async,
or for sync use some broadcast channel to all slaves (and it would still
be a performance problem on the master, which has to wait for the
slowest slave).

 As a consequence, I don’t see how you can get around doing some sort
 of row-based replication like all the other databases. 

Isn't WAL-based replication some sort of row-based replication?

  Now that people are starting to get religion on this issue I would
 strongly advocate a parallel effort to put in a change-set extraction
 API that would allow construction of comprehensive master/slave
 replication. 

Triggers. See pgQ's logtrigga()/logutrigga(), and the slides from Marko
Kreen's presentation at PGCon 2008.
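A minimal sketch (plain Python, not the pgQ API; the table, queue, and event
shape are invented for illustration) of what trigger-based change-set
extraction does: every INSERT/UPDATE/DELETE fires a log trigger that appends
an event describing the change to a queue, which a consumer can replay in
order on a replica:

```python
# Sketch of trigger-style change-set extraction, in the spirit of
# pgQ's log triggers. Not the pgQ API -- illustration only.

change_queue = []  # stands in for a pgQ event queue

def log_trigger(op, table, row):
    # A real log trigger fires inside the database on each DML
    # statement; here we invoke it by hand for a simulated workload.
    change_queue.append({"op": op, "table": table, "row": dict(row)})

# Simulated workload on a hypothetical 'users' table
log_trigger("INSERT", "users", {"id": 1, "name": "hannu"})
log_trigger("UPDATE", "users", {"id": 1, "name": "Hannu"})
log_trigger("DELETE", "users", {"id": 1})

# A consumer replays the queue in order to reproduce the changes
# on a replica (here, a plain dict keyed by primary key).
replica = {}
for ev in change_queue:
    key = (ev["table"], ev["row"]["id"])
    if ev["op"] in ("INSERT", "UPDATE"):
        replica[key] = ev["row"]
    else:  # DELETE
        replica.pop(key, None)

print(len(change_queue))  # 3 events captured
print(replica)            # {} -- the row was deleted again
```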

  (Another approach would be to make it possible for third party apps
 to read the logs and regenerate SQL.) 

Which logs? WAL, or SQL command logs?

 There are existing models for how to do change set extraction; we have
 done it several times at my company already.  There are also research
 projects like GORDA that have looked fairly comprehensively at this
 problem.

pgQ with its triggers does a pretty good job of change-set extraction.

--
Hannu


