Re: [HACKERS] [pgsql-www] commitfest.postgresql.org

2009-07-05 Thread Brendan Jurd
2009/7/3 Robert Haas robertmh...@gmail.com:
 The application stamps comments with the
 community login of the person who left them, but the import stamped
 them with names instead.  This is actually of some significance, since
 the app will allow you to edit your own comments but not those of
 other people.  We could probably fix this if someone can give us
 access to (or a dump of) the realname to username mappings from the
 community login DB.

Nobody came forward with a mapping of real names to community logins,
so I went ahead and did this the dumb, slow way (eyeballing the wiki
history and manually creating a mapping for the names we have in the
commitfest app so far).

I've updated the patch comment creator field with the login names I
was able to map, but there are a few contributors who either don't
have a community login, or haven't used the wiki.  In those cases I
left the name as-is.

You should now be able to edit comments that you created.  If you
think you should be able to edit a comment, but you can't because the
login name is wrong, let me know and I'll fix it up.

Cheers,
BJ

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] problem with varlena and extended type

2009-07-05 Thread Pavel Stehule
2009/7/4 Greg Stark gsst...@mit.edu:
 On Sat, Jul 4, 2009 at 10:31 PM, Greg Stark gsst...@mit.edu wrote:
 It's pretty hard to guess where your bug is sitting here with no code
 and no idea even what you've done to trigger it.

 At a guess there's someplace you haven't detoasted a datum that had to
 be detoasted. But like I said, that's just a guess.


 Actually on further thought I think this smells like a memory
 management bug. Maybe you've either prematurely freed this data
 structure or realloc'd it without tracking the new pointer and have
 returned a pointer to the freed object.

Good shot - I had a problem with repalloc.

thank you very much
Pavel



 --
 greg
 http://mit.edu/~gsstark/resume.pdf




Re: [HACKERS] Multi-Dimensional Histograms

2009-07-05 Thread Gregory Maxwell
On Mon, Jun 29, 2009 at 10:22 PM, Robert Haas robertmh...@gmail.com wrote:
 I'm finding myself unable to follow all the terminology on this thread.
 What's dimension reduction?  What's PCA?
[snip]

Imagine you have a dataset with two variables, say height in inches
and age in years. For the purpose of discussion let's pretend for a
moment that all the people in your sample have height the same as
their age.

You could create a 2d histogram of your data:

 |0002
 |0060
a|0300
g|4000
e|0003
 |0010
 |0100
 |
  height

You could store this 2d histogram as-is and use it for all the things
you'd use histograms for, or you could observe the structure, apply a
rotation and flattening to the data, and convert it to a 1d histogram:

[0113426200...], which is far more compact.

Data often has significant correlation, so it's usually possible to
reduce the dimensionality without greatly reducing the selectivity of
the histogram.

This becomes tremendously important as the number of dimensions goes
up, because the volume of an N-dimensional space increases incredibly
fast as the number of dimensions increases.

PCA is used as one method of dimensionality reduction. In PCA you find
a linear transformation (scaling, rotation) of the data that aligns
the data so that the axis lines cut through the data-space in the
orientations with the greatest variance.

I have no clue how you would apply PCA to postgresql histograms, since
to build the PCA transform you need to do some non-trivial operations
with the data.  Perhaps PCA could be done on a random sample of a
table, then that transformation could be stored and used to compute
the histograms. I'm sure there has been a lot of research on this.
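To make the rotate-and-flatten idea above concrete, here is a small stand-alone Python sketch (purely illustrative; the sample data and variable names are invented and have nothing to do with PostgreSQL internals). It centres a perfectly correlated height/age sample, projects it onto the diagonal principal axis, and checks that flattening to one dimension loses none of the variance:

```python
import math

# Toy sample mirroring the email's example: height (inches) equals age (years).
people = [(a, a) for a in [4, 4, 4, 4, 10, 10, 10, 25, 25, 25, 40, 60, 60, 61, 61]]

# Centre the data.
n = len(people)
mx = sum(p[0] for p in people) / n
my = sum(p[1] for p in people) / n
centred = [(x - mx, y - my) for x, y in people]

# 2x2 covariance matrix entries.
sxx = sum(x * x for x, _ in centred) / n
syy = sum(y * y for _, y in centred) / n
sxy = sum(x * y for x, y in centred) / n

# For perfectly correlated data the principal axis is the diagonal (1,1)/sqrt(2);
# projecting onto it keeps all of the variance, so one coordinate suffices.
axis = (1 / math.sqrt(2), 1 / math.sqrt(2))
projected = [x * axis[0] + y * axis[1] for x, y in centred]
residual = [x * axis[1] - y * axis[0] for x, y in centred]  # off-axis component

var_kept = sum(v * v for v in projected) / n   # variance along the principal axis
var_lost = sum(v * v for v in residual) / n    # variance thrown away by flattening
```

With real, imperfectly correlated data `var_lost` would be small rather than zero, and that is the price paid for storing a 1d histogram instead of the 2d one.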



Re: [HACKERS] Did COPY performance regression solve in 8.4rc2?

2009-07-05 Thread Toshihiro Kitagawa
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  - shared_buffers = 128MB
  
 What happens with a larger value for shared_buffers?

COPY performance of PostgreSQL 8.4.0 was a little bit better than
PostgreSQL 8.3.0 when shared_buffers was 1GB.
My server has 2GB RAM.

* Shared_buffers = 1GB
- 8.4.0
real31m13.873s
real30m17.180s
real29m16.170s
- 8.4rc2
real29m46.035s
real28m31.467s
real29m5.781s
- 8.4rc1
real29m35.403s
real28m44.221s
real29m20.309s
- 8.3.0
real31m10.434s
real32m39.912s
real32m8.221s

* Shared_buffers = 512MB
- 8.4.0
real28m37.817s
real29m44.449s
real28m10.886s
- 8.4rc2
real28m0.657s
real29m50.888s
real28m28.037s
- 8.4rc1
real28m58.592s
real28m25.756s
real30m11.641s
- 8.3.0
real23m59.923s
real24m13.717s
real24m40.246s

Regards,

-- 
Toshihiro Kitagawa kitag...@sraoss.co.jp
SRA OSS, Inc. Japan




[HACKERS] Merge Append Patch merged up to 85devel

2009-07-05 Thread Gregory Stark

Here's a copy of the merge-append patch that I sent months ago merged up to
head. I haven't really added any additional functionality since then.

Heikki suggested I separate the Append and MergeAppend nodes into two executor
nodes. I had that half done in my tree, but looking it over it leads to a lot
of duplicated code and a strange effect: there would be one Path node but two
Executor nodes. I'm not sure which way to go here, but at least for now I'm
leaving it this way since it's less code to write. If we want it the other way
to commit then I'll do it.

The other pending question is the same I had back when I originally submitted
it. I don't really understand what's going on with eclasses and what
invariants we're aiming to maintain with them. I don't see a problem tossing
all the child relation attributes into the same eclass even though they're not
strictly speaking equivalent. No join above the append path is going to see
the child attributes anyways. But that might be shortsighted as I'm not really
sure what the consequences are and what other uses we have envisioned for
eclasses in the future.

diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 0938e94..cf0f3a1 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -59,9 +59,25 @@
 
 #include "executor/execdebug.h"
 #include "executor/nodeAppend.h"
+#include "utils/lsyscache.h"
+#include "access/nbtree.h"
+
+/* It gets quite confusing having a heap array (indexed by integers) which
+ * contains integers which index into the slots array. These typedefs try to
+ * clear it up but without making simple inline accessing functions they don't
+ * actually produce any warnings on mistakes */
+
+typedef int SlotNumber;
+typedef int HeapPosition;
+
+#define WHICHPLAN_PLANS_UNINITIALIZED (-1)
 
 static bool exec_append_initialize_next(AppendState *appendstate);
 
+static int heap_compare_slots(AppendState *node, SlotNumber slot1, SlotNumber slot2);
+
+static void heap_siftup_slot(AppendState *node);
+static void heap_insert_slot(AppendState *node, SlotNumber new_slot);
 
 /* ----------------------------------------------------------------
  *		exec_append_initialize_next
@@ -177,12 +193,14 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 
 		appendstate->as_firstplan = tplan;
 		appendstate->as_lastplan = tplan;
+		appendstate->as_is_ordered = false;
 	}
 	else
 	{
 		/* normal case, scan all subplans */
 		appendstate->as_firstplan = 0;
 		appendstate->as_lastplan = nplans - 1;
+		appendstate->as_is_ordered = node->isOrdered;
 	}
 
 	/*
@@ -224,11 +242,50 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
 	ExecAssignResultTypeFromTL(&appendstate->ps);
 	appendstate->ps.ps_ProjInfo = NULL;
 
-	/*
-	 * return the result from the first subplan's initialization
-	 */
-	appendstate->as_whichplan = appendstate->as_firstplan;
-	exec_append_initialize_next(appendstate);
+	if (!appendstate->as_is_ordered)
+	{
+		/*
+		 * return the result from the first subplan's initialization
+		 */
+		appendstate->as_whichplan = appendstate->as_firstplan;
+		exec_append_initialize_next(appendstate);
+	} else {
+		/* set up scan keys and initialize *all* the subnodes */
+		int i;
+
+		appendstate->as_nkeys = node->numCols;
+		appendstate->as_scankeys = palloc(sizeof(ScanKeyData) * node->numCols);
+		appendstate->as_slots = palloc(sizeof(TupleTableSlot *) * nplans);
+		appendstate->as_heap = palloc(sizeof(int) * nplans);
+		appendstate->as_heap_size = 0;
+
+		for (i = 0; i < nplans; i++)
+		{
+			appendstate->as_whichplan = i;
+			exec_append_initialize_next(appendstate);
+		}
+
+		appendstate->as_whichplan = WHICHPLAN_PLANS_UNINITIALIZED;
+
+		for (i = 0; i < node->numCols; i++)
+		{
+			Oid sortFunction;
+			bool reverse;
+
+			get_compare_function_for_ordering_op(node->sortOperators[i],
+												 &sortFunction, &reverse);
+
+			ScanKeyInit(&appendstate->as_scankeys[i],
+						node->sortColIdx[i],
+						InvalidStrategy,
+						sortFunction,
+						(Datum) 0);
+			if (reverse)
+				appendstate->as_scankeys[i].sk_flags |= SK_BT_DESC;
+			if (node->nullsFirst[i])
+				appendstate->as_scankeys[i].sk_flags |= SK_BT_NULLS_FIRST;
+		}
+	}
 
 	return appendstate;
 }
@@ -253,47 +310,168 @@ ExecCountSlotsAppend(Append *node)
 TupleTableSlot *
 ExecAppend(AppendState *node)
 {
-	for (;;)
-	{
-		PlanState  *subnode;
-		TupleTableSlot *result;
+	if (!node->as_is_ordered)
+		for (;;)
+		{
+			PlanState  *subnode;
+			TupleTableSlot *result;
 
-		/*
-		 * figure out which subplan we are currently processing
-		 */
-		subnode = node->appendplans[node->as_whichplan];
+			/*
+			 * figure out which subplan we are currently processing
+			 */
+			subnode = node->appendplans[node->as_whichplan];
 
-		/*
-		 * get a tuple from the subplan
-		 */
-		result = ExecProcNode(subnode);
+			/*
+			 * get a tuple from the subplan
+			 */
+			result = ExecProcNode(subnode);
+
+			if (!TupIsNull(result))
+			{
+				/*
+				 * If the subplan gave 

Re: [HACKERS] Merge Append Patch merged up to 85devel

2009-07-05 Thread Robert Haas

On Jul 5, 2009, at 10:02 AM, Gregory Stark st...@mit.edu wrote:
Here's a copy of the merge-append patch that I sent months ago merged up to
head. I haven't really added any additional functionality since then.

Heikki suggested I separate the Append and MergeAppend nodes into two
executor nodes. I had that half done in my tree, but looking it over it
leads to a lot of duplicated code and a strange effect: there would be one
Path node but two Executor nodes. I'm not sure which way to go here, but at
least for now I'm leaving it this way since it's less code to write. If we
want it the other way to commit then I'll do it.

The other pending question is the same I had back when I originally
submitted it. I don't really understand what's going on with eclasses and
what invariants we're aiming to maintain with them. I don't see a problem
tossing all the child relation attributes into the same eclass even though
they're not strictly speaking equivalent. No join above the append path is
going to see the child attributes anyways. But that might be shortsighted
as I'm not really sure what the consequences are and what other uses we
have envisioned for eclasses in the future.


Can you provide some more details about the objective of this patch?   
Or a link to previous discussion?


Thanks,

...Robert



Re: [HACKERS] Feedback on writing extensible modules

2009-07-05 Thread Dimitri Fontaine

Hi,

On 31 May 09 at 18:21, Tom Lane wrote:

The reason this doesn't work is that SPI can only be invoked inside a
transaction, and you're not inside one when a library is being
preloaded.


Please find attached a small patch which runs
process_local_preload_libraries from within a transaction.


The following patch to preprepare makes it work fine when the GUC
preprepare.at_init is on and the module is being loaded from
local_preload_libraries:

  
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/preprepare/preprepare/pre_prepare.c.diff?r1=1.1r2=1.2


You could maybe make this work by executing your own transaction
to do it


I took the option of having PostgreSQL provide the same context when
preloading as when loading later: in both cases _PG_init() runs inside
an already existing transaction. I don't see a current way for
_PG_init() to distinguish between being called at backend fork() time
or from a user transaction, so I figured it was better this way.



but I really have to wonder if it's a good idea.  One
point to think about is that elog(ERROR) still means elog(FATAL)
at this point, so any brokenness in the queries you're trying to
prepare will result in locking all users out of the database.



Well, loading custom code at init time is a foot-gun reserved for
superusers with access to the local file system (to install the custom
.so) who are allowed to signal the postmaster. You're right that a
failure in the module init routine will prevent anyone from connecting
to the server, but the cure is to clean local_preload_libraries and
then restart.
Or, with the preprepare example, to set preprepare.at_init to off and
then reload.


Regards,
--
dim



preload-spi.diff
Description: Binary data




PS: sorry, I don't have the toolsuite to provide a context diff tonight,
but given the size of the patch I figured I'd send it anyway. Will cook
a context diff tomorrow if needed; it's running late here. Oh, we're
already tomorrow, even.


Re: [HACKERS] Feedback on writing extensible modules

2009-07-05 Thread Tom Lane
Dimitri Fontaine dfonta...@hi-media.com writes:
 Please find attached a small patch which runs
 process_local_preload_libraries from within a transaction.

This is inevitably going to break other people's code.  Put the
transaction wrapper in your own stuff if you have to have it.

regards, tom lane



[HACKERS] 8.4, One-Time Filter and subquery ( ... FROM function() union all ... )

2009-07-05 Thread Sergey Burladyan

I am testing some of my queries with 8.4 and found some performance decline.

8.4 always executes the functions in this subquery, even if the result does
not need them. 8.3 correctly optimizes this and does not execute these
functions; here is an example:

create function foo() returns int language sql as $$ select pg_sleep(5); select 1 $$;

PostgreSQL 8.3.7 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 
4.3.3-5) 4.3.3

EXPLAIN ANALYZE select * from (select 1 as i, r from foo() r union all select 
2, r from foo() r) as x where i = 3;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=0.00..0.54 rows=2 width=8) (actual time=0.006..0.006 rows=0 loops=1)
   ->  Append  (cost=0.00..0.54 rows=2 width=8) (actual time=0.003..0.003 rows=0 loops=1)
         ->  Result  (cost=0.00..0.26 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=1)
               One-Time Filter: false
               ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=4) (never executed)
         ->  Result  (cost=0.00..0.26 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=1)
               One-Time Filter: false
               ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=4) (never executed)
 Total runtime: 0.053 ms

PostgreSQL 8.4.0 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 
4.3.3-13) 4.3.3, 32-bit

EXPLAIN ANALYZE select * from (select 1 as i, r from foo() r union all select 
2, r from foo() r) as x where i = 3;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=0.00..0.53 rows=2 width=36) (actual time=10007.352..10007.352 rows=0 loops=1)
   ->  Append  (cost=0.00..0.53 rows=2 width=36) (actual time=10007.351..10007.351 rows=0 loops=1)
         ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=36) (actual time=5003.342..5003.342 rows=0 loops=1)
               Filter: (1 = 3)
         ->  Function Scan on foo r  (cost=0.00..0.26 rows=1 width=36) (actual time=5004.004..5004.004 rows=0 loops=1)
               Filter: (2 = 3)
 Total runtime: 10007.464 ms

BTW, if I move the function from FROM to SELECT, 8.4 correctly optimizes it
like 8.3:

EXPLAIN ANALYZE select * from (select 1 as i, foo() as r union all select 2, 
foo()) as x where i = 3;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=0.00..0.54 rows=2 width=8) (actual time=0.005..0.005 rows=0 loops=1)
   ->  Append  (cost=0.00..0.54 rows=2 width=8) (actual time=0.003..0.003 rows=0 loops=1)
         ->  Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)
               One-Time Filter: false
         ->  Result  (cost=0.00..0.26 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=1)
               One-Time Filter: false
 Total runtime: 0.048 ms


Is this expected behavior? Can 8.4 optimize the first query like 8.3 does?

Thinks !

PS: no response in pgsql-performance, so I'm asking in pgsql-hackers.

-- 
Sergey Burladyan



Re: [HACKERS] 8.4, One-Time Filter and subquery ( ... FROM function() union all ... )

2009-07-05 Thread Sergey Burladyan
Sergey Burladyan eshkin...@gmail.com writes:

 Thinks !

Th_a_nks ! :)

-- 
Sergey Burladyan



Re: [HACKERS] Merge Append Patch merged up to 85devel

2009-07-05 Thread Greg Stark
On Sun, Jul 5, 2009 at 10:34 PM, Robert Haas robertmh...@gmail.com wrote:
 On Jul 5, 2009, at 10:02 AM, Gregory Stark st...@mit.edu wrote:

 Here's a copy of the merge-append patch that I sent months ago merged up to
 head. I haven't really added any additional functionality since then.

 Can you provide some more details about the objective of this patch?  Or a
 link to previous discussion?


It's one piece of the puzzle of how to deal with partitioned tables
more completely.

The basic problem is that currently the planner assumes all Append
nodes produce unordered output. That means there's no way for the
planner to use any indexes on partitions to produce a desired
ordering. That means it's impossible to do a merge join or to satisfy
an ORDER BY clause without introducing a sort even if you have
matching indexes on every partition.

Lots of common cases arise with partitioned tables where this kicks
in. Many of them will be eliminated as our partitioning support gets
more intelligent and is able to recognize how the partitioning key
relates to the requested order, or to collapse singleton partition
scans in cases where the parent table currently interferes. However
there will always be cases where those mechanisms fail to simplify the
plan sufficiently and we end up with an Append node of unordered
partitions, and we need this as a back-stop to avoid awful plans.

The merge-append node works by teaching the Append node what sort
order it's aiming to produce. The Append node then keeps a heap of
slots and returns the tuple from the top child plan, replacing it in
the heap with that plan's next tuple.
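The heap-of-slots loop described above can be sketched in a few lines of Python (purely illustrative: the `merge_append` name and the lists standing in for child plans are made up, and the real executor compares tuples with per-column scan keys rather than the language's built-in ordering):

```python
import heapq

def merge_append(subplans):
    """Yield rows from several already-sorted 'subplans' in global sort order.

    Mirrors the patch's strategy: pull one row per child into a heap,
    then repeatedly emit the smallest and refill from that same child.
    """
    iters = [iter(p) for p in subplans]
    heap = []
    for i, it in enumerate(iters):
        first = next(it, None)          # prime the heap with each child's first row
        if first is not None:
            heap.append((first, i))
    heapq.heapify(heap)

    while heap:
        row, i = heapq.heappop(heap)    # top of the heap is the globally smallest row
        yield row
        nxt = next(iters[i], None)      # replace it with that same child's next row
        if nxt is not None:
            heapq.heappush(heap, (nxt, i))

# Three "partitions", each individually sorted, merged into one ordered stream:
parts = [[1, 4, 9], [2, 3, 10], [5, 6, 7]]
merged = list(merge_append(parts))
```

Each step does O(log n) work for n children, which is why the node can produce ordered output without a full sort.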

This produces plans like this:

                               QUERY PLAN
---------------------------------------------------------------------------
 Result  (cost=0.20..489.00 rows=9600 width=4)
   ->  Append  (cost=0.20..489.00 rows=9600 width=4)
         ->  Index Scan using p_pkey on p  (cost=0.00..80.25 rows=2400 width=4)
         ->  Index Scan using p1_pkey on p1 p  (cost=0.00..80.25 rows=2400 width=4)
         ->  Index Scan using p2_pkey on p2 p  (cost=0.00..80.25 rows=2400 width=4)
         ->  Index Scan using p3_pkey on p3 p  (cost=0.00..80.25 rows=2400 width=4)
(6 rows)

Instead of plans like this which is the best we can do today:

                            QUERY PLAN
--------------------------------------------------------------------
 Sort  (cost=770.98..794.98 rows=9600 width=4)
   Sort Key: public.p.i
   ->  Result  (cost=0.00..136.00 rows=9600 width=4)
         ->  Append  (cost=0.00..136.00 rows=9600 width=4)
               ->  Seq Scan on p  (cost=0.00..34.00 rows=2400 width=4)
               ->  Seq Scan on p1 p  (cost=0.00..34.00 rows=2400 width=4)
               ->  Seq Scan on p2 p  (cost=0.00..34.00 rows=2400 width=4)
               ->  Seq Scan on p3 p  (cost=0.00..34.00 rows=2400 width=4)
(8 rows)


-- 
greg
http://mit.edu/~gsstark/resume.pdf



[HACKERS] WIP: generalized index constraints

2009-07-05 Thread Jeff Davis
This is a follow up to my old proposal here:

http://archives.postgresql.org/pgsql-hackers/2008-06/msg00404.php

Top pointed out a few problems here:

http://archives.postgresql.org/pgsql-hackers/2008-06/msg00427.php

Here are my updated answers:

1. Not a problem with the new design, which checks the constraints from
ExecInsertIndexTuples().

2. Not a problem for similar reasons.

3. I don't have an answer here yet, but I have a few thoughts. I see it
as a separate proposal. My hand-waving answer is that it should be just
as possible as before to append index constraint failures to a big list,
and loop through it as long as we're making progress. If I need a more
solid proposal for this problem before my generalized constraints
proposal is considered, let me know.

To try out my patch:

(1) Apply patch to 8.5-devel and Init DB

(2) Install contrib/btree_gist (only necessary for this example, patch
works with Btree and GIN, too).

(3)
  => create table test(i int, c circle);
  => create index test_idx on test using gist(i, c);
  => UPDATE pg_index SET indconstrats = '3 3'
     WHERE indexrelid='test_idx'::regclass;

  In the above query, 3 is the equality strategy number for the GiST
opclass for integers, and 3 is also the overlaps strategy number for
the GiST opclass for circles, so we put a 3 for each attribute. What
this will mean is that it will reject any new tuple when there is
already another tuple in the table with an equal value of i AND an
overlapping value of c. Concurrency should behave identically to UNIQUE
on a btree.

(4) Now, try some inserts (concurrent or otherwise) and see what
happens.
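The conflict rule from step (3) can be modelled in a few lines of Python (illustrative only; `conflicts`, `insert`, the dict-based tuples, and the circle representation are all invented for the sketch and correspond to nothing in the patch):

```python
import math

def circles_overlap(c1, c2):
    """Overlap test for circles given as ((x, y), r)."""
    (x1, y1), r1 = c1
    (x2, y2), r2 = c2
    return math.hypot(x1 - x2, y1 - y2) <= r1 + r2

def conflicts(t1, t2):
    """The generalized constraint from the example: two tuples conflict
    when i is equal AND c overlaps. Note the predicate is symmetric,
    as the proposal requires."""
    return t1["i"] == t2["i"] and circles_overlap(t1["c"], t2["c"])

def insert(table, tup):
    # Reject the new tuple if any existing tuple conflicts with it.
    if any(conflicts(tup, existing) for existing in table):
        raise ValueError("index constraint violated")
    table.append(tup)

table = []
insert(table, {"i": 1, "c": ((0, 0), 1)})
insert(table, {"i": 1, "c": ((5, 0), 1)})    # same i, but circles don't overlap: OK
insert(table, {"i": 2, "c": ((0, 0), 1)})    # overlapping c, but different i: OK
try:
    insert(table, {"i": 1, "c": ((1, 0), 1)})  # equal i AND overlapping c: rejected
except ValueError:
    pass
```

A plain UNIQUE btree constraint is the special case where every column uses equality as its conflict operator.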

Ultimately, I think the language for this might shape up something like:

CREATE INDEX test_idx ON test USING gist
  (i CONSTRAINT =, c CONSTRAINT &&);

which would avoid the need for updating the catalog, of course.

Limitations:

 * Still not deferrable, even 'til the end of the command.
 * Your constraint must be symmetric (if tuple A conflicts with tuple B,
tuple B must conflict with tuple A).
 * The types have to match between the left and right arguments in the
operator class and the type of the column in the table. This is normally
true, but the GIN Array opclass works on type anyarray, but the table
has a normal type, which causes a problem. Maybe it's possible to be
smarter about this, but the workaround is to just create more opclasses
(I believe).

Any input is appreciated (design problems, implementation, language
ideas, or anything else). I'd like to get it into shape for the July 15
commitfest if no major problems are found.

Regards,
Jeff Davis
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 1515d9f..eedb456 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -26,6 +26,7 @@
  *		index_vacuum_cleanup	- post-deletion cleanup of an index
  *		index_getprocid - get a support procedure OID
  *		index_getprocinfo - get a support procedure's lookup info
+ *  index_check_constraint - check index constraints
  *
  * NOTES
  *		This file contains the index_ routines which used
@@ -64,9 +65,13 @@
 
 #include "access/relscan.h"
 #include "access/transam.h"
+#include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "utils/lsyscache.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/tqual.h"
@@ -116,6 +121,19 @@ do { \
 static IndexScanDesc index_beginscan_internal(Relation indexRelation,
 		 int nkeys, ScanKey key);
 
+typedef struct
+{
+	Oid	relid;
+	TransactionId		xid;
+	ItemPointerData		tid;
+} CurrentIndexInsertEntry;
+
+static CurrentIndexInsertEntry *CurrentIndexInsertsTable = NULL;
+
+static bool index_check_constraint_conflict(TupleTableSlot *slot,
+			HeapTuple tup, int2 *heap_attnums,
+			int2 index_natts,
+			Oid *constraint_procs);
 
 /* ----------------------------------------------------------------
  *   index_ interface functions
@@ -846,3 +864,278 @@ index_getprocinfo(Relation irel,
 
 	return locinfo;
 }
+
+void
+index_check_constraint(Relation heap, Relation index,
+		ItemPointer tid, TupleTableSlot *slot)
+{
+		IndexScanDesc	 index_scan;
+		HeapTuple		 tup;
+		ScanKeyData		*scankeys;
+		int2vector		*constr_strats;
+		Oid*constr_procs;
+		int i;
+		int2			*heap_attnums = index->rd_index->indkey.values;
+		int2			 index_natts  = index->rd_index->indnatts;
+		SnapshotData	 DirtySnapshot;
+		int nkeys		  = 0;
+
+		CurrentIndexInsertEntry *MyIndexInsertEntry;
+		CurrentIndexInsertEntry	 potential_conflicts[MaxBackends];
+		int		 n_potential_conflicts = 0;
+
+		/* Find constraint strategy numbers */
+		constr_strats = RelationGetIndexConstraintStrategies(index);
+
+		/* return if no constraint */
+		if (constr_strats == NULL)
+			return;
+
+		/*
+		 * if any of the indexed columns are NULL, the constraint
+		 * is satisfied
+		 */
+		

[HACKERS] REINDEX is not a btree

2009-07-05 Thread Vanessa Lopez

Hello!

I got into a situation I don't know how to get out of.
First, I could not access my biggest database in Postgres anymore
because it suddenly gave this error (after a long time working with no
problems):


ERROR: could not open relation 1663/392281/530087: No such file  
or directory


After trying with several backups with no success, I did a vacuum and  
I tried to REINDEX the database (in the standalone back-end).


Unfortunately the process was interrupted, and when I tried to start  
postgres again  I got the error:


'SQL select * from pg_database order by datname failed : index  
pg_authid_rolname_index is not a btree


I connected as a standalone mode again to REINDEX the database:
pg_ctl stop -D /data/pgsql/data
/usr/bin/postgres -D /data/pgsql/data dbpedia_infoboxes
REINDEX database dbpedia_infoboxes

The REINDEX was successful this time but I was still having the "is
not a btree" problem, so I tried again with:

pg_ctl stop -D /data/pgsql/data
/usr/bin/postgres -D /data/pgsql/data dbpedia_infoboxes
REINDEX SYSTEM dbpedia_infoboxes

The process finished, but I was still having the "is not a btree" problem.
Even worse, not only is the "is not a btree" problem still there, but
I also cannot connect in standalone mode anymore:


bash-3.2$ /usr/bin/postgres -D /data/pgsql/data dbpedia_infoboxes
FATAL:  index pg_database_datname_index is not a btree
(I tried with other databases as well, with the same result.)

I don't know much about Postgres, and I have no clue what else I can do.
Please, any help is very much appreciated. I have lots of databases and
months of work in Postgres (also lots of backups for the data in /data),
but I don't know how to make Postgres work again.

(It is running on Red Hat Linux.)

Millions of thanks in advance, solving this problem is crucial for me.
Vanessa