Re: [PATCHES] Updated bitmap index patch

2007-05-04 Thread Mark Kirkwood

Mark Kirkwood wrote:



I have applied this to todays HEAD performed some quick tests - looks 
good! I have to re-create a TPC-H dataset to test one of the previous 
bugs, so I'll probably look at that tomorrow or so.




The TPC-H query query that previously produced a SIGSEGV now runs and 
gives the correct answer.


Cheers

Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Updated bitmap index patch

2007-05-04 Thread Finlay Thompson
Hi Gavin

Thanks for the new patch! 

I ran some address matching on the patched code and have generated
another ERROR:  out of memory problem. 

The strange thing is that it runs over 150 queries without problem and
then crashes.

I have attached the logfile (well some of it).

If you want I can send you the schema as well


Finlay


On Thu, 2007-05-03 at 13:51 +1000, Gavin Sherry wrote:
 Hi all,
 
 Attached is an updated bitmap index patch. It contains bug fixes, API
 changes, binary changes (page identifier to distinguish it from other
 indexes) and has been brought up to HEAD.
 
 I worked on a few approaches to VACUUM, none very satisfactory. The
 problem is, breaking a compressed word representing matches can have
 serious consequences -- at the least, creation of new words, at the worst,
 creation of a new page. If a lot of this were to happen, REINDEX would be
 much more efficient (this is what earlier patches did).
 
 One approach I looked at was modifying the existing read API to be able to
 do something like kill prior tuple. This, I think, made the API quite
 complex and it was hard to implement, since the existing mechanism
 decompresses words on the fly and it would be hard to identify which TID
 is no longer a match. So, I dropped this idea pretty quickly.
 
 The second approach is to just manually traverse each vector and change
 matches to non-matches where necessary. The complexity then is in managing
 the consequences of breaking compressed words, doing WAL (efficiently) and
 calculating free space. I've only partially implemented this approach. At
 this stage, I don't have time to finish it due to other commitments.
 
 Thanks,
 
 Gavin
 ---(end of broadcast)--- TIP 
 2: Don't 'kill -9' the postmaster

NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index input_10_pkey 
for table input_10
TopMemoryContext: 10951272 total in 335 blocks; 20248 free (388 chunks); 
10931024 used
TopTransactionContext: 8192 total in 1 blocks; 6816 free (0 chunks); 1376 used
RI compare cache: 8192 total in 1 blocks; 5904 free (0 chunks); 2288 used
RI query cache: 24576 total in 2 blocks; 14136 free (5 chunks); 10440 used
Local Buffer Lookup Table: 24576 total in 2 blocks; 12984 free (4 chunks); 
11592 used
Prepared Queries: 8192 total in 1 blocks; 1800 free (0 chunks); 6392 used
Operator class cache: 8192 total in 1 blocks; 3848 free (0 chunks); 4344 used
Type information cache: 8192 total in 1 blocks; 1800 free (0 chunks); 6392 used
MessageContext: 1048576 total in 8 blocks; 491792 free (4 chunks); 556784 used
smgr relation table: 24576 total in 2 blocks; 16080 free (4 chunks); 8496 used
TransactionAbortContext: 32768 total in 1 blocks; 32752 free (0 chunks); 16 used
Portal hash: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 4096 total in 1 blocks; 1912 free (0 chunks); 2184 used
ExecutorState: 2765031488 total in 19271 blocks; 7629976 free (15 chunks); 
2757401512 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used
HashBitmap: 8192 total in 1 blocks; 3912 free (0 chunks); 4280 used

-- snip --
19085 more lines like that
-- snip --

HashBitmap: 24576 total in 2 blocks; 14112 free (4 chunks); 10464 used
HashBitmap: 24576 total in 2 blocks; 14112 free (4 chunks); 10464 used
HashBitmap: 24576 total in 2 blocks; 14112 free (4 chunks); 10464 used
HashBitmap: 24576 total in 2 blocks; 14112 free (4 chunks); 10464 used
TupleSort: 1335472 total in 12 blocks; 1014144 free (16387 chunks); 321328 used
TupleSort: 24600 total in 2 blocks; 7520 free (0 chunks); 17080 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Unique: 0 total in 0 blocks; 0 free (0 chunks); 0 used
Relcache by OID: 8192 total in 1 blocks; 1816 free (0 chunks); 6376 used
CacheMemoryContext: 1191760 

Re: [PATCHES] document plperl argument and return value representation

2007-05-04 Thread Alvaro Herrera
Andrew Dunstan wrote:
 
 
 Alvaro Herrera wrote:
 Andrew Dunstan wrote:
   
 The attached docs patch makes clearer how arguments and return values in 
 pl/perl are escaped. This is to clarify the situation that Theo 
 Schlossnagle recently reported on -bugs.
 
 
 I find the mix of arguments and results a bit confusing.  Maybe you
 could put them in separate paragraphs.
 
 Is this better?

Yup, looks very nice.

This message also serves to test whether I removed the annoying
Mail-Followup-To header.  It was actually a _feature_ of mutt
(followup_to, which fired when I replied to a list to which I'm
subscribed), which I disabled.  Sorry about that.

ETOOMANYWHICH, sorry about that as well :-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PATCHES] small patch for guc issues

2007-05-04 Thread Joachim Wieland
The appended patch addresses the outstanding issues of the recent guc patch.
It makes PGCLIENTENCODING work again and uses bsearch() instead of
iterating over the array of guc variables in guc_get_index().

Joachim
Index: src/backend/utils/misc/guc.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
retrieving revision 1.390
diff -c -r1.390 guc.c
*** src/backend/utils/misc/guc.c	4 May 2007 01:13:44 -	1.390
--- src/backend/utils/misc/guc.c	4 May 2007 16:14:21 -
***
*** 2692,2709 
  	return true;
  }
  
- static int
- guc_get_index(const char *name)
- {
- 	int i;
- 
- 	for (i = 0; i  num_guc_variables; i++)
- 		if (guc_name_compare(name, guc_variables[i]-name) == 0)
- 			return i;
- 
- 	return -1;
- }
- 
  /*
   * Create and add a placeholder variable. It's presumed to belong
   * to a valid custom variable class at this point.
--- 2692,2697 
***
*** 2843,2848 
--- 2831,2860 
  }
  
  
+ static int
+ guc_get_index(const char *name)
+ {
+ 	const char **key = name;
+ 	struct config_generic **res;
+ 
+ 	Assert(name);
+ 
+ 	/*
+ 	 * By equating const char ** with struct config_generic *, we are assuming
+ 	 * the name field is first in config_generic.
+ 	 */
+ 	res = (struct config_generic **) bsearch((void *) key,
+ 			 (void *) guc_variables,
+ 			 num_guc_variables,
+ 			 sizeof(struct config_generic *),
+ 			 guc_var_compare);
+ 	if (!res)
+ 		return -1;
+ 
+ 	return res - guc_variables;
+ }
+ 
+ 
  /*
   * Initialize GUC options during program startup.
   *
***
*** 4095,4103 
  	 * we can't set the variable itself.  There's one exception to
  	 * this rule: if we want to apply the default value to variables
  	 * that were removed from the configuration file.  This is
! 	 * indicated by source == PGC_S_DEFAULT.
  	 */
! 	if (record-source  source  source != PGC_S_DEFAULT)
  	{
  		if (changeVal  !makeDefault)
  		{
--- 4107,4116 
  	 * we can't set the variable itself.  There's one exception to
  	 * this rule: if we want to apply the default value to variables
  	 * that were removed from the configuration file.  This is
! 	 * indicated by source == PGC_S_DEFAULT and context == PGC_SIGHUP.
  	 */
! 	if (record-source  source
! 		 (source != PGC_S_DEFAULT || context != PGC_SIGHUP))
  	{
  		if (changeVal  !makeDefault)
  		{

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] updated SORT/LIMIT patch

2007-05-04 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Gregory Stark [EMAIL PROTECTED] writes:
 Updated patch against cvs update in case it makes applying easier.

 Applied with revisions --- notably, I avoided adding any overhead to
 HEAPCOMPARE() by the expedient of reversing the logical sort order
 before heapify'ing.  We couldn't have done that before the NULLS_FIRST
 patch went in, but now it's trivial to make the sort order reverse
 fully.

Hum. The major change I see is the bit related to rescans where you made it
resort if the bound had changed. But surely the only way the bound can change
is if it's a parameter, and if there is a parameter then surely the executor
must be doing more than just a plain rescan? The sort key could have changed
if it depends on the parameter.

What does the executor do differently in the case of a subplan with a
parameter that makes it re-execute the plan from scratch and not just do a
simple rescan?

 Since you didn't include any documentation patch for the
 optimize_bounded_sort GUC variable, I assumed it was meant only for
 debugging and hid it behind #ifdef DEBUG_BOUNDED_SORT.

Sure, I originally had it #ifdef'd on TRACE_SORT but took it out for reasons
that I don't recall.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] updated SORT/LIMIT patch

2007-05-04 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Hum. The major change I see is the bit related to rescans where you made it
 resort if the bound had changed. But surely the only way the bound can change
 is if it's a parameter, and if there is a parameter then surely the executor
 must be doing more than just a plain rescan?

The problem is that a parameter change in the LIMIT's expression would
not normally be propagated below the LIMIT.  In this case, since we're
allowing its effects to bubble down one more level of the tree, we need
to make sure that that level is recomputed too.

 What does the executor do differently in the case of a subplan with a
 parameter that makes it re-execute the plan from scratch and not just do a
 simple rescan?

Look at the chgParam signaling.  Since a Sort node itself has no
parameters, it historically has only had to re-sort if its input node
suffers a parameter change, which it checks in ExecReScanSort.  But now
the bound effectively acts like a parameter, and has to force a
recomputation.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Updated bitmap index patch

2007-05-04 Thread Jie Zhang
Finlay,

Thanks for testing. If you can send me the schema, that would be great.

Thanks,
Jie


On 5/4/07 5:41 AM, Finlay Thompson [EMAIL PROTECTED] wrote:

 Hi Gavin
 
 Thanks for the new patch!
 
 I ran some address matching on the patched code and have generated
 another ERROR:  out of memory problem.
 
 The strange thing is that it runs over 150 queries without problem and
 then crashes.
 
 I have attached the logfile (well some of it).
 
 If you want I can send you the schema as well
 
 
 Finlay
 
 
 On Thu, 2007-05-03 at 13:51 +1000, Gavin Sherry wrote:
 Hi all,
 
 Attached is an updated bitmap index patch. It contains bug fixes, API
 changes, binary changes (page identifier to distinguish it from other
 indexes) and has been brought up to HEAD.
 
 I worked on a few approaches to VACUUM, none very satisfactory. The
 problem is, breaking a compressed word representing matches can have
 serious consequences -- at the least, creation of new words, at the worst,
 creation of a new page. If a lot of this were to happen, REINDEX would be
 much more efficient (this is what earlier patches did).
 
 One approach I looked at was modifying the existing read API to be able to
 do something like kill prior tuple. This, I think, made the API quite
 complex and it was hard to implement, since the existing mechanism
 decompresses words on the fly and it would be hard to identify which TID
 is no longer a match. So, I dropped this idea pretty quickly.
 
 The second approach is to just manually traverse each vector and change
 matches to non-matches where necessary. The complexity then is in managing
 the consequences of breaking compressed words, doing WAL (efficiently) and
 calculating free space. I've only partially implemented this approach. At
 this stage, I don't have time to finish it due to other commitments.
 
 Thanks,
 
 Gavin
 ---(end of broadcast)--- TIP
 2: Don't 'kill -9' the postmaster
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] updated SORT/LIMIT patch

2007-05-04 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Look at the chgParam signaling.  Since a Sort node itself has no
 parameters, it historically has only had to re-sort if its input node
 suffers a parameter change, which it checks in ExecReScanSort.  But now
 the bound effectively acts like a parameter, and has to force a
 recomputation.

 Hm, that all makes sense now. But then there's something mysterious going on
 still as the regression test I tried to write for this actually does work:

Yeah, because in this example nodeSort doesn't ask for randomAccess to
the sort result, and so ExecReScanSort is forced to repeat the sort
anyway.

[ greps a bit... ]  It looks like the only way that you could expose the
bug in the current state of the system would be if the sort/limit with
the outer parameter were the inside of a nestloop join in the subplan.
nodeNestloop would set EXEC_FLAG_REWIND, causing nodeSort to set
randomAccess, allowing ExecReScanSort to suppose that it could rewind
the sort.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly