Re: [HACKERS] autovacuum default parameters

2007-07-24 Thread Gregory Stark

Alvaro Herrera [EMAIL PROTECTED] writes:

> We didn't, but while I agree with the idea, I think 5% is too low.  I
> don't want autovacuum to get excessively aggressive.  Is 10% not enough?

Well let me flip it around. Would you think a default fillfactor of 10% would
be helpful or overkill? I think it would nearly always be overkill and waste
heap space and therefore cache hit rate and i/o bandwidth.

I get my 5% intuition from the TPCC stock table which has about 20 tuples per
page. That means a fillfactor or vacuum at 5% both translate into trying to
maintain a margin of one tuple's worth of space per page. Enough for an update
to happen without migrating to a new page.

That's actually a fairly wide table though. A narrower table could easily have
50-100 tuples per page, which would require only 1-2% of dead space overhead.

<idle speculation>

Perhaps the two parameters should be tied together and we should make the
autovacuum parameter: max(1%, min(10%, fillfactor(table))) and make the
default fill factor 5%.

Hm. We have the width of the table in the stats don't we? We could actually
calculate the 1 tuple's worth of space percentage automatically on a
per-table basis. Or for that matter instead of calculating it as a percentage
of the whole table, just compare the number of updates/deletes with the number
of pages in the table.

</speculation>
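
To make that per-table idea concrete, here is a minimal sketch using only
the planner stats already in pg_class (it assumes relpages/reltuples are
reasonably fresh, and is an illustration rather than proposed code):

-- Estimate tuples per page, and from that the percentage of a page that
-- one tuple's worth of free space represents.
SELECT relname,
       reltuples / relpages         AS est_tuples_per_page,
       100.0 * relpages / reltuples AS pct_for_one_tuple
  FROM pg_class
 WHERE relkind = 'r' AND relpages > 0 AND reltuples > 0;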

> How about the analyze scale factor, should we keep the current 10%?  I
> have less of a problem with reducing it further since analyze is cheaper
> than vacuum.

My "try to maintain one tuple's worth of space" model doesn't answer this
question at all. It depends entirely on whether the DML is changing the data
distribution.

Perhaps this should be 1/max(stats_target) for the table. So the default would
be 10% but if you raise the stats_target for a column to 100 it would go down
to 1% or so.

The idea being that if you have ten buckets then updating 1/10th of the rows
stands an even chance of doubling or halving the size of your bucket. Except
there's no math behind that intuition at all and I rather doubt it makes much
sense.

Actually I feel like there should be a factor of 2 or more in there as well.
If you modify 1/10th of the rows and you have 10 buckets then we should be
analyzing *before* the distribution has a chance to be modified beyond
recognition.
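
For what it's worth, the rule (with the factor of 2 folded in) can be
sketched against the existing catalogs; this is only an illustration, using
the fact that attstattarget is -1 when a column uses the default:

-- Hedged sketch: per-table analyze scale factor = 1 / (2 * max stats target).
SELECT c.relname,
       1.0 / (2 * GREATEST(MAX(CASE WHEN a.attstattarget > 0
                                    THEN a.attstattarget
                                    ELSE current_setting('default_statistics_target')::int
                               END), 1)) AS analyze_scale_factor
  FROM pg_class c
  JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum > 0
 WHERE c.relkind = 'r'
 GROUP BY c.relname;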

Perhaps I shouldn't have closed the speculation tag so early :) The problem
if we try to calculate reasonable defaults like this is that it makes it
unclear how to expose any knob for the user to adjust it if they need to.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] GucContext of log_autovacuum

2007-07-24 Thread Simon Riggs
On Tue, 2007-07-24 at 13:50 +0900, ITAGAKI Takahiro wrote:
> The GucContext of log_autovacuum is PGC_BACKEND in the CVS HEAD,
> but should it be PGC_SIGHUP? We cannot modify the variable on-the-fly
> because the parameter is used only by autovacuum worker processes.
> The similar variables, like autovacuum_vacuum_scale_factor, are
> defined as PGC_SIGHUP.

Agreed, PGC_SIGHUP seems a much better setting. The PGC_BACKEND setting
originated with me and probably nobody thought to change that aspect of
the patch. 

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] avoiding WAL logging in 8.3

2007-07-24 Thread Simon Riggs
On Tue, 2007-07-24 at 13:04 +0900, Tatsuo Ishii wrote:

> I noticed in 8.3 there are chances where we can avoid WAL logging. For
> example, 8.3's pgbench was modified to use TRUNCATE right before
> COPY. Is there any documentation which describes that kind of
> techniques? If there's none, I would volunteer the work to create such
> a documentation since I think this is valuable information for DBAs
> who wish to migrate to 8.3.

The Performance Tips section has been modified to describe this. Would
you like me to add something elsewhere also? Multiple entry points to
information helps everybody, so I'll happily add more.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread peter . trautmeier
Hi all,

I am using version 8.2.4 of the source and compiled it with
both OPTIMIZER_DEBUG and EXEC_EVALDEBUG enabled to take a look
at how quals are evaluated by the executor.

However, when I issue a query like

SELECT name FROM city WHERE population < 100000 LIMIT 10;

I get the following debug output from postgres:

After canonicalize_qual()
   {OPEXPR 
   :opno 97 
   :opfuncid 66 
   :opresulttype 16 
   :opretset false 
   :args (
  {VAR 
  :varno 1 
  :varattno 4 
  :vartype 23 
  :vartypmod -1 
  :varlevelsup 0 
  :varnoold 1 
  :varoattno 4
  }
  {CONST 
  :consttype 23 
  :constlen 4 
  :constbyval true 
  :constisnull false 
  :constvalue 4 [ -96 -122 1 0 ]
  }
   )
   }

RELOPTINFO (1): rows=1018 width=88
baserestrictinfo: city.population < 100000
path list:
SeqScan(1) rows=1018 cost=0.00..66.16

cheapest startup path:
SeqScan(1) rows=1018 cost=0.00..66.16

cheapest total path:
SeqScan(1) rows=1018 cost=0.00..66.16

WARNING:  could not dump unrecognized node type: 404
ExecQual: qual is (
   {
   }
)


WARNING:  could not dump unrecognized node type: 404
ExecQual: qual is (
   {
   }
)

... and many more of these WARNINGs.

What happens to the OpExpr on its way from canonicalize_qual() to ExecQual()
that makes _outNode() stumble over it when it is encountered in ExecQual()?

Regards,
Peter

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] avoiding WAL logging in 8.3

2007-07-24 Thread Tatsuo Ishii
> > I noticed in 8.3 there are chances where we can avoid WAL logging. For
> > example, 8.3's pgbench was modified to use TRUNCATE right before
> > COPY. Is there any documentation which describes that kind of
> > techniques? If there's none, I would volunteer the work to create such
> > a documentation since I think this is valuable information for DBAs
> > who wish to migrate to 8.3.
> 
> The Performance Tips section has been modified to describe this. Would
> you like me to add something elsewhere also? Multiple entry points to
> information helps everybody, so I'll happily add more.

Thanks for pointing it out. I found the following:

COPY is fastest when used within the same transaction as an earlier
CREATE TABLE or TRUNCATE command. In such cases no WAL needs to be
written, because in case of an error, the files containing the newly
loaded data will be removed anyway.

Sounds great!
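
For anyone skimming, the pattern the docs describe boils down to this
minimal sketch (the table and file names are illustrative, and WAL
archiving must be off for the optimization to apply):

BEGIN;
TRUNCATE accounts;
COPY accounts FROM '/tmp/accounts.dat';
COMMIT;
-- Because the TRUNCATE and the COPY share a transaction, an error rolls
-- back to the pre-TRUNCATE state and the newly written files are simply
-- removed, so the loaded rows never need WAL.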

BTW, I noticed the COPY, CLUSTER, and B-Tree split logging improvements
in Josh's presentation in Tokyo. Are they just internal changes that
have nothing to do with a DBA's job?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] avoiding WAL logging in 8.3

2007-07-24 Thread Simon Riggs
On Tue, 2007-07-24 at 18:45 +0900, Tatsuo Ishii wrote:
> > > I noticed in 8.3 there are chances where we can avoid WAL logging. For
> > > example, 8.3's pgbench was modified to use TRUNCATE right before
> > > COPY. Is there any documentation which describes that kind of
> > > techniques? If there's none, I would volunteer the work to create such
> > > a documentation since I think this is valuable information for DBAs
> > > who wish to migrate to 8.3.
> > 
> > The Performance Tips section has been modified to describe this. Would
> > you like me to add something elsewhere also? Multiple entry points to
> > information helps everybody, so I'll happily add more.
> 
> Thanks for pointing it out. I found the following:
> 
> COPY is fastest when used within the same transaction as an earlier
> CREATE TABLE or TRUNCATE command. In such cases no WAL needs to be
> written, because in case of an error, the files containing the newly
> loaded data will be removed anyway.
> 
> Sounds great!
> 
> BTW, I noticed the COPY, CLUSTER, and B-Tree split logging improvements
> in Josh's presentation in Tokyo. Are they just internal changes that
> have nothing to do with a DBA's job?

Cluster is also mentioned lower down
http://developer.postgresql.org/pgdocs/postgres/populate.html#POPULATE-PITR

The b-tree split logging is an algorithmic reduction in WAL, so isn't
user visible or optional in any way.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Kerberos warnings on win32

2007-07-24 Thread Magnus Hagander
When building with Kerberos support (or GSSAPI, but not SSPI) on Win32, a
whole bunch of warnings come out due to redefinitions of macros in the
kerberos headers. The reason for this is that Kerberos leaks the
HAVE_something macros from autoconf into the header files that are
included by PostgreSQL. 

The attached file removes this by undefing the macros before we include the
kerberos files. But this is perhaps just too ugly to deal with and we
should live with the warnings instead?

For MSVC, we can suppress the warnings with a #pragma around the include,
but I don't know if that's possible in mingw/gcc. It'll look something like
the second patch attached (I only did that for one of the places that'd need
it, to show what it looks like).

Thoughts? Worth doing anything about?

//Magnus
Index: src/backend/libpq/auth.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/libpq/auth.c,v
retrieving revision 1.155
diff -c -r1.155 auth.c
*** src/backend/libpq/auth.c24 Jul 2007 09:00:27 -  1.155
--- src/backend/libpq/auth.c24 Jul 2007 10:21:26 -
***
*** 96,101 
--- 96,113 
   *
   */

+ #ifdef WIN32
+ /*
+  * Kerberos on windows leaks header definitions from autoconf, causing a
+  * bunch of warnings. Undefine those here and let krb redefine them.
+  */
+ #undef HAVE_NETINET_IN_H
+ #undef HAVE_STRING_H
+ #undef HAVE_STRDUP
+ #undef HAVE_STRERROR
+ #undef HAVE_SYS_TYPES_H
+ #undef HAVE_STDLIB_H
+ #endif
  #include <krb5.h>
  /* Some old versions of Kerberos do not include com_err.h in krb5.h */
  #if !defined(__COM_ERR_H) && !defined(__COM_ERR_H__)
Index: src/include/libpq/libpq-be.h
===
RCS file: /projects/cvsroot/pgsql/src/include/libpq/libpq-be.h,v
retrieving revision 1.62
diff -c -r1.62 libpq-be.h
*** src/include/libpq/libpq-be.h23 Jul 2007 10:16:54 -  1.62
--- src/include/libpq/libpq-be.h24 Jul 2007 10:16:25 -
***
*** 30,35 
--- 30,47 
  #endif

  #ifdef ENABLE_GSS
+ #ifdef WIN32
+ /*
+  * Kerberos on windows leaks header definitions from autoconf, causing a
+  * bunch of warnings. Undefine those here and let krb redefine them.
+  */
+ #undef HAVE_NETINET_IN_H
+ #undef HAVE_STRING_H
+ #undef HAVE_STRDUP
+ #undef HAVE_STRERROR
+ #undef HAVE_SYS_TYPES_H
+ #undef HAVE_STDLIB_H
+ #endif
  #if defined(HAVE_GSSAPI_H)
  #include <gssapi.h>
  #else
Index: src/interfaces/libpq/fe-auth.c
===
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-auth.c,v
retrieving revision 1.131
diff -c -r1.131 fe-auth.c
*** src/interfaces/libpq/fe-auth.c  24 Jul 2007 09:00:27 -  1.131
--- src/interfaces/libpq/fe-auth.c  24 Jul 2007 10:21:03 -
***
*** 54,59 
--- 54,71 
   * MIT Kerberos authentication system - protocol version 5
   */

+ #ifdef WIN32
+ /*
+  * Kerberos on windows leaks header definitions from autoconf, causing a
+  * bunch of warnings. Undefine those here and let krb redefine them.
+  */
+ #undef HAVE_NETINET_IN_H
+ #undef HAVE_STRING_H
+ #undef HAVE_STRDUP
+ #undef HAVE_STRERROR
+ #undef HAVE_SYS_TYPES_H
+ #undef HAVE_STDLIB_H
+ #endif
  #include <krb5.h>
  /* Some old versions of Kerberos do not include com_err.h in krb5.h */
  #if !defined(__COM_ERR_H) && !defined(__COM_ERR_H__)
Index: src/interfaces/libpq/libpq-int.h
===
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.126
diff -c -r1.126 libpq-int.h
*** src/interfaces/libpq/libpq-int.h23 Jul 2007 18:59:50 -  1.126
--- src/interfaces/libpq/libpq-int.h24 Jul 2007 10:14:28 -
***
*** 45,50 
--- 45,62 
  #include pqexpbuffer.h

  #ifdef ENABLE_GSS
+ #ifdef WIN32
+ /*
+  * Kerberos on windows leaks header definitions from autoconf, causing a
+  * bunch of warnings. Undefine those here and let krb redefine them.
+  */
+ #undef HAVE_NETINET_IN_H
+ #undef HAVE_STRING_H
+ #undef HAVE_STRDUP
+ #undef HAVE_STRERROR
+ #undef HAVE_SYS_TYPES_H
+ #undef HAVE_STDLIB_H
+ #endif
  #if defined(HAVE_GSSAPI_H)
  #include <gssapi.h>
  #else
Index: src/interfaces/libpq/libpq-int.h
===
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.126
diff -c -r1.126 libpq-int.h
*** src/interfaces/libpq/libpq-int.h23 Jul 2007 18:59:50 -  1.126
--- src/interfaces/libpq/libpq-int.h24 Jul 2007 10:29:02 -
***
*** 45,55 
--- 45,61 
  #include pqexpbuffer.h
  
  #ifdef ENABLE_GSS
+ #ifdef WIN32_ONLY_COMPILER
+ #pragma warning(disable:4005)
+ #endif
  #if defined(HAVE_GSSAPI_H)
  #include <gssapi.h>
  #else
  #include <gssapi/gssapi.h>
  #endif
+ #ifdef 

[HACKERS] DLLIMPORT definition

2007-07-24 Thread Magnus Hagander
The DLLIMPORT definition used on Win32 conflicts with the headers in TCL,
at least, and possibly others.

One way to fix it is similar to the HAVE_xyz ones that I talk about in my
other email. Another way to do it would be for us to use PGDLLIMPORT
instead of DLLIMPORT. That way we'd be sure not to conflict with any
*other* third party modules as well, which could happen if you have a
server-side module that links pg to something.

We seem to have a little over 100 entries of DLLIMPORT including comments
and the ecpg regression tests that duplicate it in the output files,
so it's not a huge thing to change.

Thoughts?

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Zdenek Kotala
Stefan reported to me that the pgcrypto regression test fails on Solaris 10 
with OpenSSL support. I investigated this problem and the result is that 
Solaris 10 only delivers support for short keys up to 128 bits. Strong crypto 
(the SUNWcry and SUNWcryr packages) is available on the web download pages. 
(It is a result of US crypto export policy.)


However, on a default installation (which is commonly used) it is a 
problem. The regression test cannot be fixed because it tests strong 
ciphers, but there are two very strange issues:


1) The first issue is the Blowfish cipher. Because pgcrypto uses the old 
interface instead of the new EVP one, it calls the BF_set_key function, which 
does not return any output and silently truncates the key if it is too long. See 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/bf/bf_skey.c

line 84.

If a user installs strong crypto he will not be able to decrypt data which 
was encrypted before.


The fix for this issue is ugly, because there is no way to verify the 
supported key length with the old OpenSSL API; only the new API returns an 
error if a length is not supported.



2) The AES cipher crashes when the key is longer. It happens because the 
return value from AES_set_encrypt_key is ignored and AES_encrypt is called 
with an uninitialized structure.
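
Both problems can be triggered from SQL with over-long keys. These calls are
hypothetical illustrations (the key material is made up), run against a stock
Solaris 10 OpenSSL:

-- 448-bit Blowfish key: silently truncated to 128 bits, so the result
-- will not decrypt on a server with a full-strength OpenSSL.
SELECT encrypt('secret'::bytea, decode(repeat('cd', 56), 'hex'), 'bf-cbc');
-- 256-bit AES key: AES_set_encrypt_key fails, the failure is ignored,
-- and AES_encrypt then runs on an uninitialized key schedule.
SELECT encrypt('secret'::bytea, decode(repeat('ab', 32), 'hex'), 'aes-cbc');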



I attach a patch which fixes both issues, but the main problem is that the 
old OpenSSL API is used and the supported key lengths are hardcoded. I think 
we can add rewriting pgcrypto to use the EVP OpenSSL interface to the TODO 
list.



Any comments?

Zdenek

Index: openssl.c
===
RCS file: /projects/cvsroot/pgsql/contrib/pgcrypto/openssl.c,v
retrieving revision 1.30
diff -c -r1.30 openssl.c
*** openssl.c	4 Oct 2006 00:29:46 -	1.30
--- openssl.c	24 Jul 2007 11:20:02 -
***
*** 380,385 
--- 380,399 
  {
  	ossldata   *od = c->ptr;
  
+ 	/* Test if the key length is supported.  BF_set_key silently cuts
+ 	   large keys, and that could be a problem when a user transfers
+ 	   encrypted data from one server to another. */
+ 	EVP_CIPHER_CTX ctx;
+ 	EVP_CIPHER_CTX_init(&ctx);
+ 	EVP_EncryptInit_ex(&ctx, EVP_bf_cbc(), NULL, NULL, NULL);
+ 	EVP_CIPHER_CTX_set_key_length(&ctx, klen);
+ 	if (!EVP_EncryptInit_ex(&ctx, NULL, NULL, key, NULL))
+ 	{
+ 		EVP_CIPHER_CTX_cleanup(&ctx);
+ 		return PXE_KEY_TOO_BIG;
+ 	}
+ 	EVP_CIPHER_CTX_cleanup(&ctx);
+ 
+ 	/* Key len is supported. We can use it. */
  	BF_set_key(&od->u.bf.key, klen, key);
  	if (iv)
  		memcpy(od->iv, iv, BF_BLOCK);
***
*** 692,705 
  	return 0;
  }
  
! static void
  ossl_aes_key_init(ossldata * od, int type)
  {
  	if (type == AES_ENCRYPT)
! 		AES_set_encrypt_key(od->key, od->klen * 8, &od->u.aes_key);
  	else
! 		AES_set_decrypt_key(od->key, od->klen * 8, &od->u.aes_key);
! 	od->init = 1;
  }
  
  static int
--- 706,728 
  	return 0;
  }
  
! static int
  ossl_aes_key_init(ossldata * od, int type)
  {
+ 	int err;
+ 	/* Strong key support may be missing on some OpenSSL installations,
+ 	   so we must check the return value from the set-key function. */
  	if (type == AES_ENCRYPT)
! 		err = AES_set_encrypt_key(od->key, od->klen * 8, &od->u.aes_key);
  	else
! 		err = AES_set_decrypt_key(od->key, od->klen * 8, &od->u.aes_key);
! 
! 	if (err == 0)
! 		od->init = 1;
! 	else
! 		od->init = 0;
! 	return err;
  }
  
  static int
***
*** 711,717 ****
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_ENCRYPT);
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_ENCRYPT);
--- 734,741 ----
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		if (ossl_aes_key_init(od, AES_ENCRYPT))
! 			return PXE_KEY_TOO_BIG;
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_ENCRYPT);
***************
*** 727,733 ****
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		ossl_aes_key_init(od, AES_DECRYPT);
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_DECRYPT);
--- 751,758 ----
  	const uint8 *end = data + dlen - bs;
  
  	if (!od->init)
! 		if (ossl_aes_key_init(od, AES_DECRYPT))
! 			return PXE_KEY_TOO_BIG;
  
  	for (; data <= end; data += bs, res += bs)
  		AES_ecb_encrypt(data, res, &od->u.aes_key, AES_DECRYPT);
***
*** 741,748 
  	ossldata   *od = c-ptr;
  
  	if (!od-init)
! 		ossl_aes_key_init(od, AES_ENCRYPT);
! 
  	AES_cbc_encrypt(data, res, dlen, od-u.aes_key, od-iv, AES_ENCRYPT);
  	return 0;
  }
--- 766,774 
  	ossldata   *od = c-ptr;
  
  	if (!od-init)
! 		if( ossl_aes_key_init(od, AES_ENCRYPT) )
! 			return PXE_KEY_TOO_BIG;
! 	
  	AES_cbc_encrypt(data, res, dlen, od-u.aes_key, od-iv, AES_ENCRYPT);
  	return 0;
  }
***
*** 754,760 
  	ossldata   *od = c-ptr;
  
  	if (!od-init)
! 		ossl_aes_key_init(od, AES_DECRYPT);
  
  	AES_cbc_encrypt(data, res, dlen, od-u.aes_key, od-iv, AES_DECRYPT);
  	return 0;
--- 780,787 
  	ossldata   *od = 

Re: [HACKERS] Reviewing new index types (was Re: [PATCHES] Updated bitmap index patch)

2007-07-24 Thread Simon Riggs
On Mon, 2007-07-23 at 23:11 +0100, Simon Riggs wrote:
> On Mon, 2007-07-23 at 17:19 -0400, Tom Lane wrote:
> > Simon Riggs [EMAIL PROTECTED] writes:
> > > ... BMI is not useful at all
> > > for PKs, whilst GIT is specifically designed to handle them.
> > 
> > This seems a strange statement, because GIT doesn't look particularly
> > efficient for unique indexes AFAICS.  In the worst case you'd have to
> > look individually at each tuple on a heap page to check for uniqueness
> > conflict (no binary search, because you couldn't assume they are
> > ordered).
> 
> That is one of a few heuristics about the patch that need some active
> discussion, so I'm glad you asked.
> 
> The main use case is nearly-unique, so for cases where we have a
> Master:Detail relationship, e.g. Order:OrderItem. The Order index is a
> PK, with the OrderItem index as a nearly unique key. The index is not
> brilliant for the Order index, but is good for the OrderItem index.
> 
> Heikki designed the grouping so that there is a state change between
> grouped and non-grouped (normal) index entries. By default the patch
> uses a threshold of non-grouped -> grouped at N=2 index entries and then
> no limit on the number of rows/block. Currently you can tune N, but we
> might also envisage setting a limit on the width of the range of values
> to limit the number of tids stored in a grouped index entry. That could
> control the uniqueness overhead.

Possibly Heikki might add more here, but it occurs to me that I didn't
mention two other things about uniqueness checking.

The state change occurs when the block fills, so up to that point all
the index entries are separate, so no additional uniqueness checking
cost. When the state change does occur the highest value is always left
as a singleton index entry, again to speed uniqueness checking. This
copes with INSERTs, since the dominant use case is to have a
similar-to-the-last-high-value or increasing key (for PKs).

Lastly, GIT is designed to work in conjunction with HOT. When doing HOT
updates there are no index insertions, so far fewer uniqueness checks
need to be performed anyway.

So overall, GIT is reasonably well suited to unique indexes. But I think
you can see that these behaviours influence the performance
considerably, even though they are just small parts of the patch.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Marko Kreen

On 7/24/07, Zdenek Kotala [EMAIL PROTECTED] wrote:

> Stefan reported to me that the pgcrypto regression test fails on Solaris 10
> with OpenSSL support. I investigated this problem and the result is that
> Solaris 10 only delivers support for short keys up to 128 bits. Strong crypto
> (the SUNWcry and SUNWcryr packages) is available on the web download pages.
> (It is a result of US crypto export policy.)


Ugh, deliberately broken OpenSSL...


> However, on a default installation (which is commonly used) it is a
> problem. The regression test cannot be fixed because it tests strong
> ciphers, but there are two very strange issues:
> 
> 1) The first issue is the Blowfish cipher. Because pgcrypto uses the old
> interface instead of the new EVP one, it calls the BF_set_key function,
> which does not return any output and silently truncates the key if it is
> too long. See
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/bf/bf_skey.c
> line 84.
> 
> If a user installs strong crypto he will not be able to decrypt data
> which was encrypted before.
> 
> The fix for this issue is ugly, because there is no way to verify the
> supported key length with the old OpenSSL API; only the new API returns
> an error if a length is not supported.


NAK.  The fix is broken because it uses the EVP interface.  EVP is not
a general-purpose interface, because not all valid keys for a cipher
pass through it.  Only key lengths used in SSL will work...

Could you rework the fix so that it uses the BF_* interface, does a
test encoding with a full-length key, and compares the result to the
expected output?  And does it just once, not on each call.

That should probably be put into a separate function.


> 2) The AES cipher crashes when the key is longer. It happens because the
> return value from AES_set_encrypt_key is ignored and AES_encrypt is
> called with an uninitialized structure.


ACK, error checking is good.  But please return PXE_KEY_TOO_BIG
directly from ossl_aes_key_init.

I must admit the internal API for ciphers is clumsy and could use a
rework into something saner.  This shows here.


> I attach a patch which fixes both issues, but the main problem is that
> the old OpenSSL API is used and the supported key lengths are hardcoded.
> I think we can add rewriting pgcrypto to use the EVP OpenSSL interface
> to the TODO list.


pgcrypto _was_ written using EVP, but I needed to rewrite it
when I found out EVP supports only key lengths used in SSL.

--
marko

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reviewing new index types (was Re: [PATCHES] Updated bitmap index patch)

2007-07-24 Thread Heikki Linnakangas
Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> > ... BMI is not useful at all
> > for PKs, whilst GIT is specifically designed to handle them.
> 
> This seems a strange statement, because GIT doesn't look particularly
> efficient for unique indexes AFAICS.  In the worst case you'd have to
> look individually at each tuple on a heap page to check for uniqueness
> conflict (no binary search, because you couldn't assume they are
> ordered).

It handles them in the sense that a clustered PK index is way smaller
than a normal PK index. Unlike the bitmap index, which is not suitable
for highly distinct columns.

Inserting and performing a uniqueness check is more expensive on a
clustered index because, as you said, it needs to scan the heap page
looking for conflicts. That's alleviated by the heuristics Simon
mentioned: a page is groupified only when it gets full, which
means there will usually be a mixture of normal and groupified tuples on a
leaf page. In particular, if there are hot key values that are repeatedly
inserted, the index tuples corresponding to those key values are likely to
stay as normal index tuples, and are therefore cheaper to check
uniqueness against.

Also IIRC, the patch tries to keep the last index tuple on a page as a
normal index tuple, which catches the important special case of
inserting monotonically increasing keys, like with a sequence-generated PK.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread Tom Lane
[EMAIL PROTECTED] writes:
> WARNING:  could not dump unrecognized node type: 404
> ExecQual: qual is (
>    {
>    }
> )

Yeah, that code is toast, we probably ought to remove it.  It hasn't
worked since the changes to make the executor treat plan trees as
read-only.  Making it work would require teaching outfuncs.c how to dump
all the different expression state node types, which seems like more
maintenance effort than is justified for debug support that no one uses.
(Dumping an expression tree over again on each evaluation seems of
pretty questionable usefulness to me anyway.)

I'd suggest using EXPLAIN VERBOSE instead, which will give you
the same printout that this would have given you back when it did
work, but only once instead of over again for each row.
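
For the query from the original report, that would simply be:

EXPLAIN VERBOSE SELECT name FROM city WHERE population < 100000 LIMIT 10;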

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Kerberos warnings on win32

2007-07-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
> The attached file removes this by undefing the macros before we include the
> kerberos files. But this is perhaps just too ugly to deal with and we
> should live with the warnings instead?

Ick.  I don't like any of these patches.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Stefan Kaltenbrunner

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13


any ideas ?


Stefan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Zdenek Kotala

Marko Kreen wrote:

> On 7/24/07, Zdenek Kotala [EMAIL PROTECTED] wrote:
> > However, on a default installation (which is commonly used) it is a
> > problem. The regression test cannot be fixed because it tests strong
> > ciphers, but there are two very strange issues:
> > 
> > 1) The first issue is the Blowfish cipher. Because pgcrypto uses the old
> > interface instead of the new EVP one, it calls the BF_set_key function,
> > which does not return any output and silently truncates the key if it is
> > too long. See
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/bf/bf_skey.c
> > line 84.
> > 
> > If a user installs strong crypto he will not be able to decrypt data
> > which was encrypted before.
> > 
> > The fix for this issue is ugly, because there is no way to verify the
> > supported key length with the old OpenSSL API; only the new API returns
> > an error if a length is not supported.
> 
> NAK.  The fix is broken because it uses the EVP interface.  EVP is not
> a general-purpose interface, because not all valid keys for a cipher
> pass through it.  Only key lengths used in SSL will work...


I'm not an OpenSSL expert, but if you look at how the EVP call for setkey 
is implemented, you can see that it finally calls BF_set_key. There is only 
one extra layer; see 
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/evp/e_bf.c




> Could you rework the fix so that it uses the BF_* interface, does a
> test encoding with a full-length key, and compares the result to the
> expected output?  And does it just once, not on each call.


OK, I can do that, but it is not a general solution: it will work only 
in our case, because we know 128 is the restricted limit.



> That should probably be put into a separate function.


yes


> > 2) The AES cipher crashes when the key is longer. It happens because the
> > return value from AES_set_encrypt_key is ignored and AES_encrypt is
> > called with an uninitialized structure.
> 
> ACK, error checking is good.  But please return PXE_KEY_TOO_BIG
> directly from ossl_aes_key_init.


OK.


> I must admit the internal API for ciphers is clumsy and could use a
> rework into something saner.  This shows here.
> 
> > I attach a patch which fixes both issues, but the main problem is that
> > the old OpenSSL API is used and the supported key lengths are hardcoded.
> > I think we can add rewriting pgcrypto to use the EVP OpenSSL interface
> > to the TODO list.
> 
> pgcrypto _was_ written using EVP, but I needed to rewrite it
> when I found out EVP supports only key lengths used in SSL.


Is it still correct? It seems that Blowfish accepts the whole key range, but 
as I mentioned, I'm not an OpenSSL guru and the documentation is very bad :(.


Zdenek




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Stefan Kaltenbrunner

Stefan Kaltenbrunner wrote:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13


clownfish just hit the same problem:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=clownfish&dt=2007-07-24%2013:08:29


Stefan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Zdenek Kotala

Stefan Kaltenbrunner wrote:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13




any ideas ?



This test is very sensitive to floating-point behavior. Any gcc or libc 
update on this machine?


Zdenek

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Stefan Kaltenbrunner

Zdenek Kotala wrote:

Stefan Kaltenbrunner wrote:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13




any ideas ?



This test is very sensitive to floating point operations behavior. Any 
gcc, libc update on this machine?


nope - in fact nobody was even logged in on the box for over two weeks -
and clownfish failed too (which is a completely different hardware/software
combination) - so this seems in fact to be a result of the libpq changes?!



Stefan

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Gregory Stark
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:

 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13

That's just a faulty test:

SELECT t.d1 + i.f1 AS "102" FROM TIMESTAMP_TBL t, INTERVAL_TBL i
  WHERE t.d1 BETWEEN '1990-01-01' AND '2001-01-01'
AND i.f1 BETWEEN '00:00' AND '23:00';

Note that there's no ORDER BY on the test. I bet the planner came up with an
entirely different plan than usual which generated the records in a different
order.

Offhand I can only think of one kind of plan myself but I'm sure the planner
is more inventive than me :)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread peter . trautmeier
From: Tom Lane [EMAIL PROTECTED]
> [EMAIL PROTECTED] writes:
> > WARNING:  could not dump unrecognized node type: 404
> > ExecQual: qual is (
> >    {
> >    }
> > )
> 
> Yeah, that code is toast, we probably ought to remove it.  It hasn't
> worked since the changes to make the executor treat plan trees as
> read-only.

Thanks Tom!

Interesting - what do you mean by plan trees being 'read only' now? Is it 
the distinction between Plan trees and their corresponding PlanState nodes, 
indicating the 'read only' behaviour and the 'writeable' state of the Plan 
respectively, that was introduced at that time?

> Making it work would require teaching outfuncs.c how to dump
> all the different expression state node types, which seems like more
> maintenance effort than is justified for debug support that no one uses.

Ok, but what type does this qual from my example - once an OpExpr - have by 
the time it arrives at ExecQual? It's obviously not an OpExpr, otherwise 
_outNode wouldn't stumble over it.
(Is there a way to get this type info with gdb's help?)

> I'd suggest using EXPLAIN VERBOSE instead, which will give you
> the same printout that this would have given you back when it did
> work, but only once instead of over again for each row.

Thanks, I hadn't seen the VERBOSE option before.

Regards,
Peter

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13
> any ideas ?

I saw what I think was the identical failure last night on my own
machine, but it wasn't repeatable.  Evidently the planner is changing to
a different plan for those queries, but why has this only started
recently?  Maybe the recent changes to autovacuum defaults are causing
autovac to hit these tables when it never did before?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] EXEC_EVALDEBUG debugging broken?

2007-07-24 Thread Tom Lane
[EMAIL PROTECTED] writes:
> Interesting - what do you mean by plan trees being 'read only' now? Is it
> the distinction between Plan trees and their corresponding PlanState nodes,
> indicating the 'read only' behaviour and the 'writeable' state of the Plan
> respectively, that was introduced at that time?

Yeah, exactly.  ExecInitExpr builds an ExprState tree that mirrors the
structure of the Expr tree but contains all the run-time-variable data.
This tree is what's now being passed to ExecQual.

The problem is that outfuncs.c knows about all the Expr node types and
none of the ExprState types, there being no need to dump the latter in
normal use.  There is a valid argument that we ought to support dumping
PlanState and ExprState trees for debugging purposes, but it just seems
like more maintenance effort than it's worth ...

 (Is there a way do get this type info with gdb's help?)

p *(Node *) ptr ought to do it.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-24 Thread peter . trautmeier
Hi all,

I want to pass additional weight info from the WHERE clause to the executor and 
I hope someone can help me with this.

I accept clauses like the following

WHERE (foo='a'){1}
WHERE (foo='a'){1} OR (bar='b'){2}
WHERE ((foo='a'){1} OR (bar='b'){2})){42} OR (baz='c'){3}

where the {} takes an integer as a weight that is attached to the preceding 
(partial) condition.

In the executor, I need to access (1) the logical value of, and (2) the weight 
associated with, _each_ subexpression that was entered. (Getting the weight from 
the parser to the executor is in itself a journey it seems, as some expression 
types are created anew - and not copied - and lose their annotated weight over 
and over again.)

Furthermore I need the structure of OR to be preserved; the OR-of-OR structure 
from the last WHERE must be preserved or at least be  reconstructible and must 
not be folded into a 3-valued OR (as canonicalize_qual and friends do.)

To sum up, I am looking for a (decently efficient) scheme that is able to

(1) pass arbitrary conditional expressions from WHERE to the executor in a 
structure preserving way. 
(2) annotate arbitrary expressions with weights that survive on its way from 
the parser to the executor.
(3) access the logical value of particular subexpressions.

I have some basic ideas how at least some of the requirements might be 
achieved. But as I am not totally satisfied with my ideas I hope you can 
provide me with some fresh input.

ANY ideas are welcome.

Regards,
Peter

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

> Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
> > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13
> > any ideas ?
> 
> I saw what I think was the identical failure last night on my own
> machine, but it wasn't repeatable.  Evidently the planner is changing to
> a different plan for those queries, but why has this only started
> recently?  Maybe the recent changes to autovacuum defaults are causing
> autovac to hit these tables when it never did before?

Indeed the only alternate plan I can imagine for this is to do the join the
other way around. And given the large difference in sizes between the two
tables the only way I could get that to happen was by obliterating the
statistics entirely for one table but having stats for the other.

This does raise a possible issue with autovacuum. Treating ANALYZE like VACUUM
and running it on individual tables one at a time is probably the wrong thing
to be doing. What really has to happen is it should run analyze on all tables
together in a single transaction and commit all the new stats together.
Out-of-sync stats can be worse than out-of-date stats.


With stats on timestamp_tbl but not on interval_tbl:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.02..48.29 rows=432 width=24) (actual time=0.112..1.515 rows=104 loops=1)
   ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) (actual time=0.036..0.070 rows=2 loops=1)
         Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
   ->  Materialize  (cost=2.02..2.50 rows=48 width=8) (actual time=0.030..0.377 rows=52 loops=2)
         ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) (actual time=0.048..0.333 rows=52 loops=1)
               Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
 Total runtime: 1.904 ms
(7 rows)



All other combinations perform the join the other way around:

With both analyzed:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..6.37 rows=144 width=24) (actual time=0.109..1.653 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) (actual time=0.063..0.356 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=1.15..1.18 rows=3 width=16) (actual time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..1.15 rows=3 width=16) (actual time=0.017..0.052 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
 Total runtime: 2.025 ms
(7 rows)


With no stats at all:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=36.56..80.89 rows=99 width=24) (actual time=0.147..1.698 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual time=0.101..0.388 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=36.56..36.65 rows=9 width=16) (actual time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) (actual time=0.017..0.053 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
 Total runtime: 2.063 ms
(7 rows)

With stats on interval_tbl but not timestamp_tbl:

                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..44.00 rows=33 width=24) (actual time=0.100..1.725 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual time=0.055..0.351 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp 

Re: [HACKERS] autovacuum default parameters

2007-07-24 Thread Jim Nasby

On Jul 24, 2007, at 1:02 AM, Gregory Stark wrote:

> Alvaro Herrera [EMAIL PROTECTED] writes:
> 
> > We didn't, but while I agree with the idea, I think 5% is too low.  I
> > don't want autovacuum to get excessively aggressive.  Is 10% not enough?
> 
> Well let me flip it around. Would you think a default fillfactor of 10%
> would be helpful or overkill? I think it would nearly always be overkill
> and waste heap space and therefore cache hit rate and i/o bandwidth.
> 
> I get my 5% intuition from the TPCC stock table which has about 20 tuples
> per page. That means a fillfactor or vacuum at 5% both translate into
> trying to maintain a margin of one tuple's worth of space per page. Enough
> for an update to happen without migrating to a new page.
> 
> That's actually a fairly wide table though. A narrower table could easily
> have 50-100 tuples per page, which would require only 1-2% of dead space
> overhead.
> 
> <idle speculation>
> 
> Perhaps the two parameters should be tied together and we should make the
> autovacuum parameter: max(1%, min(10%, fillfactor(table))) and make the
> default fill factor 5%.
> 
> Hm. We have the width of the table in the stats don't we? We could
> actually calculate the 1 tuple's worth of space percentage automatically
> on a per-table basis. Or for that matter instead of calculating it as a
> percentage of the whole table, just compare the number of updates/deletes
> with the number of pages in the table.
> 
> </speculation>
> 
> > How about the analyze scale factor, should we keep the current 10%?  I
> > have less of a problem with reducing it further since analyze is cheaper
> > than vacuum.
> 
> My "try to maintain one tuple's worth of space" model doesn't answer this
> question at all. It depends entirely on whether the DML is changing the
> data distribution.
> 
> Perhaps this should be 1/max(stats_target) for the table. So the default
> would be 10% but if you raise the stats_target for a column to 100 it
> would go down to 1% or so.
> 
> The idea being that if you have ten buckets then updating 1/10th of the
> rows stands an even chance of doubling or halving the size of your bucket.
> Except there's no math behind that intuition at all and I rather doubt it
> makes much sense.
> 
> Actually I feel like there should be a factor of 2 or more in there as
> well. If you modify 1/10th of the rows and you have 10 buckets then we
> should be analyzing *before* the distribution has a chance to be modified
> beyond recognition.
> 
> Perhaps I shouldn't have closed the speculation tag so early :) The
> problem if we try to calculate reasonable defaults like this is that it
> makes it unclear how to expose any knob for the user to adjust it if they
> need to.


In reality, I think trying to get much below 10% on any large-ish
production systems just isn't going to work well. It's starting to
approach the point where you need to be vacuuming continuously, which
is going to put us right back into starvation territory.

Put another way, there's only so low you can get table bloat with
vacuum as it currently stands. If you want to do better, you need
things like HOT and DSM.

Regarding page splits, it might make sense to drop the fillfactor a
bit. I'm thinking that in most cases, the difference between 85% and
90% won't be noticed. For cases where it will matter (ie: insert-only),
you'd want to set fillfactor to 100% anyway.
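
As a sketch of that advice (the index and table names here are made up):

-- Leave headroom on an index over frequently updated rows...
CREATE INDEX orders_status_idx ON orders (status) WITH (fillfactor = 85);
-- ...but pack an insert-only index completely full.
CREATE INDEX audit_log_ts_idx ON audit_log (logged_at) WITH (fillfactor = 100);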

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Alvaro Herrera
Gregory Stark wrote:
> Tom Lane [EMAIL PROTECTED] writes:
> 
> > Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
> > > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13
> > > any ideas ?
> > 
> > I saw what I think was the identical failure last night on my own
> > machine, but it wasn't repeatable.  Evidently the planner is changing to
> > a different plan for those queries, but why has this only started
> > recently?  Maybe the recent changes to autovacuum defaults are causing
> > autovac to hit these tables when it never did before?

That's quite possible, because the change in threshold means it will hit
small tables earlier than it used to do.

> This does raise a possible issue with autovacuum. Treating ANALYZE like VACUUM
> and running it on individual tables one at a time is probably the wrong thing
> to be doing. What really has to happen is it should run analyze on all tables
> together in a single transaction and commit all the new stats together.
> Out-of-sync stats can be worse than out-of-date stats.

One problem with that is that it will keep the locks on each table until
the end of all analyzes.

What I don't understand is what you mean by it "obliterating" the
stats for a table.  I mean, when analyze runs, it _updates_ the stats
for the table, so there's never a time when the table does not have any
stats (unless, of course, analyze has never been run on the table).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
> Gregory Stark wrote:
> > What really has to happen is it should run analyze on all tables
> > together in a single transaction and commit all the new stats together.
> > Out-of-sync stats can be worse than out-of-date stats.
> 
> One problem with that is that it will keep the locks on each table until
> the end of all analyzes.

Yeah, that seems entirely infeasible, even if I agreed with the premise
which I don't think I do.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
> Tom Lane [EMAIL PROTECTED] writes:
> > I saw what I think was the identical failure last night on my own
> > machine, but it wasn't repeatable.  Evidently the planner is changing to
> > a different plan for those queries, but why has this only started
> > recently?  Maybe the recent changes to autovacuum defaults are causing
> > autovac to hit these tables when it never did before?
> 
> Indeed the only alternate plan I can imagine for this is to do the join the
> other way around. And given the large difference in sizes between the two
> tables the only way I could get that to happen was by obliterating the
> statistics entirely for one table but having stats for the other.

Yeah, I turned off autovac and let the tests run through 'horology',
then stopped and looked at the plans for these queries.  For the first
one, what you get with no ANALYZE having been done is

explain SELECT t.d1 + i.f1 AS "102" FROM TIMESTAMP_TBL t, INTERVAL_TBL i
  WHERE t.d1 BETWEEN '1990-01-01' AND '2001-01-01'
    AND i.f1 BETWEEN '00:00' AND '23:00';
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=36.56..80.89 rows=99 width=24)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=36.56..36.65 rows=9 width=16)
         ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))

If timestamp_tbl is then ANALYZEd, the plan flips around to put i on the
outside of the nestloop (because the estimate of the number of matching
rows rises to 49, which is pretty good because the actual is 52).

OTOH, if interval_tbl is ANALYZEd, the estimate for it drops to 2 rows
(again a much better match to reality) and we go back to preferring
i on the inside, with or without timestamp_tbl having been analyzed.

And, at least in the serial-schedule case, the stats at this point
look like

 relid  | schemaname |    relname    | seq_scan | seq_tup_read | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup
--------+------------+---------------+----------+--------------+-----------+-----------+-----------+------------+------------
 132885 | public     | interval_tbl  |       22 |          210 |        10 |         0 |         0 |         10 |          0
 132879 | public     | timestamp_tbl |       45 |         2444 |        74 |         0 |         8 |         66 |          8

(idx_scan, idx_tup_fetch and the last_vacuum/last_autovacuum/last_analyze/
last_autoanalyze columns were all NULL.)

So yesterday's change to reduce the analyze threshold to 50 means that
timestamp_tbl is now vulnerable to being asynchronously analyzed while
the tests run.

While I don't have any very strong objection to putting an ORDER BY
on these particular queries, I'm worried about how many other regression
tests will now start showing random failures.  We have an awful lot
of small tables in the tests ...
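
For instance, the first failing query could be pinned down like this, so
the output order no longer depends on which join order the planner picks:

SELECT t.d1 + i.f1 AS "102" FROM TIMESTAMP_TBL t, INTERVAL_TBL i
  WHERE t.d1 BETWEEN '1990-01-01' AND '2001-01-01'
    AND i.f1 BETWEEN '00:00' AND '23:00'
  ORDER BY 1;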

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Alvaro Herrera
Tom Lane wrote:

> While I don't have any very strong objection to putting an ORDER BY
> on these particular queries, I'm worried about how many other regression
> tests will now start showing random failures.  We have an awful lot
> of small tables in the tests ...

Maybe what we could do is set higher thresholds for the regression
database with ALTER DATABASE.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
> What I don't understand is what you mean by it "obliterating" the
> stats for a table.

The point seems to be that if there is no pg_statistic data for these
tables, we fall back to default estimates of the selectivity of the
WHERE clauses, and those produce rowcount estimates that are pretty
far away from the truth.  As soon as some pg_statistic entries
exist, we obtain better rowcount estimates, and that changes the join
plan.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Marko Kreen

On 7/24/07, Zdenek Kotala [EMAIL PROTECTED] wrote:
> Marko Kreen wrote:
> > NAK.  The fix is broken because it uses the EVP interface.  EVP is not
> > a general-purpose interface, because not all valid keys for a cipher
> > pass through it.  Only key lengths used in SSL will work...
> 
> I'm not an OpenSSL expert, but if you look at how the EVP call for setkey
> is implemented, you can see that it finally calls BF_set_key. There is only
> one extra layer; see
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/evp/e_bf.c


I glanced into evp.h for 0.9.7 and 0.9.6j and remembered that
there were two things EVP forced - key length and padding.

When I replied to you I remembered things a bit wrong: there is
indeed a way to change the key size even in 0.9.6, but not the
padding.  EVP_CIPHER_CTX_set_padding() appears only in 0.9.7.

I suspect that since I could not work around the forced padding
I did not research the key size issue very deeply.

So we can revisit the issue when we are ready to drop
support for 0.9.6x.


> > Could you rework the fix so that it uses the BF_* interface, does a
> > test encoding with a full-length key, and compares the result to the
> > expected output?  And does it just once, not on each call.
> 
> OK, I can do that, but it is not a general solution: it will work only
> in our case, because we know 128 is the restricted limit.


It _is_ a general solution if you test with a 448-bit key.

Using the BF_ API but testing via the EVP_ API is unobvious, and in
addition it leaves the user depending on whether the molesters got
all the details right.

When everything uses EVP then indeed, we can test via EVP.


> > I must admit the internal API for ciphers is clumsy and could use a
> > rework into something saner.  This shows here.
> >
> > > I attach a patch which fixes both issues, but the main problem is that
> > > the old OpenSSL API is used and the supported key lengths are hardcoded.
> > > I think we can add rewriting pgcrypto to use the EVP OpenSSL interface
> > > to the TODO list.
> >
> > pgcrypto _was_ written using EVP, but I needed to rewrite it
> > when I found out EVP supports only key lengths used in SSL.
> 
> Is it still correct? It seems that Blowfish accepts the whole key range, but


Yes, seems since 0.9.7 we could work with EVP.


> as I mentioned, I'm not an OpenSSL guru and the documentation is very bad :(.


It's somewhat lacking, yes.  The user is forced to read their source,
which isn't very nice either...

--
marko

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] avoiding WAL logging in 8.3

2007-07-24 Thread Jim Nasby

On Jul 23, 2007, at 11:30 PM, Simon Riggs wrote:
> On Tue, 2007-07-24 at 13:04 +0900, Tatsuo Ishii wrote:
> > I noticed in 8.3 there are chances where we can avoid WAL logging. For
> > example, 8.3's pgbench was modified to use TRUNCATE right before
> > COPY. Is there any documentation which describes that kind of
> > techniques? If there's none, I would volunteer the work to create such
> > a documentation since I think this is valuable information for DBAs
> > who wish to migrate to 8.3.
> 
> The Performance Tips section has been modified to describe this. Would
> you like me to add something elsewhere also? Multiple entry points to
> information helps everybody, so I'll happily add more.


I would mention it in the documentation for each affected command
(COPY, TRUNCATE, etc). I suspect a lot of folks end up only using the
SQL reference section.

--
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Alvaro Herrera [EMAIL PROTECTED] writes:
 Gregory Stark wrote:
 What really has to happen is it should run analyze on all tables
 together in a single transaction and commit all the new stats together.
 Out-of-sync stats can be worse than out-of-date stats.

 One problem with that is that it will keep the locks on each table until
 the end of all analyzes.

 Yeah, that seems entirely infeasible, even if I agreed with the premise
 which I don't think I do.

Well that's just what ANALYZE with no arguments at all does. It's also only a
ShareUpdateExclusiveLock which prevents other vacuums and DDL but not any
other DML. And ANALYZE goes by pretty quickly even on large tables.

Another idea is that perhaps it should only do this for all never-analyzed
tables together. That's where the out-of-sync stats is most likely to hurt.
But I'm not sure where to go with that since there's no guarantee that all the
never-analyzed tables will be the small ones.

Really it seems like having autovacuum touch never-analyzed tables and having
reasonable default stats for never-analyzed tables don't mix well together.

The reasonable default stats are there because if you analyze a fresh empty
table you'll get some degenerate plans which will behave terribly when you
start loading even a few records into it. The reasonable default stats give
you something akin to a rule-based plan until you have some reasonable data
loaded to analyze.

In a perfect world I would say autovacuum shouldn't analyze never-analyzed
tables, just print a warning for the DBA. But we get questions all the time
about bad plans that show tables which have never been analyzed so that
doesn't sound palatable either.
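
As an aside, a DBA could approximate that warning with the stats views; a
sketch (assuming 8.2 or later, where the last_analyze columns exist):

  -- Tables that neither manual ANALYZE nor autovacuum has ever analyzed:
  SELECT schemaname, relname
  FROM pg_stat_user_tables
  WHERE last_analyze IS NULL
    AND last_autoanalyze IS NULL;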

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] msvc and vista fun

2007-07-24 Thread Andrew Dunstan



Dave Page wrote:

Andrew Dunstan wrote:

On a somewhat related note, I have had a spectacular lack of success in 
getting either MSVC or MinGW builds to work on Vista - so much so that 
I have currently abandoned my attempts on that platform and resorted to 
resuscitating an old XP box for testing. Following some advice from 
Magnus, I added ACLs to the build root for both an admin and a 
non-admin user (cacls buildroot /E /T /G AdminUser:C and similarly for 
a non-admin user). I can build as the admin user, but when I come to 
run initdb it fails, complaining that it can't find the postgres 
executable. 


Yeah, I ran into that problem as well. I'll look at my Vista box when 
I'm in the office tomorrow and see if I can figure out what hack fixed 
it for me.





I have never heard back on this, AFAIK. If anyone has instructions on 
how to manage this please let me know. My current status with MSVC/vista 
is still that I can build but not run as an admin user, and run but not 
build as a non-admin user. Bleah.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 Gregory Stark wrote:
 What really has to happen is it should run analyze on all tables
 together in a single transaction and commit all the new stats together.
 Out-of-sync stats can be worse than out-of-date stats.
 
 One problem with that is that it will keep the locks on each table until
 the end of all analyzes.
 
 Yeah, that seems entirely infeasible, even if I agreed with the premise
 which I don't think I do.

 Well, that's just what ANALYZE with no arguments at all does.

Not unless you wrap it in a transaction block --- otherwise it does a
transaction per table.  If you try wrapping it in a transaction block
on a production system, you'll soon find you don't like it.
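
To spell out the difference (a sketch; the transaction-block form is
exactly what you'll soon find you don't like on a production system):

  -- Default: one transaction per table, so each table's stats commit
  -- and its ShareUpdateExclusiveLock is released before the next starts.
  ANALYZE;

  -- In a transaction block: all stats commit together, but every
  -- table's lock is held until the final COMMIT.
  BEGIN;
  ANALYZE;
  COMMIT;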

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Stefan Kaltenbrunner
Marko Kreen wrote:
 On 7/24/07, Zdenek Kotala [EMAIL PROTECTED] wrote:
 Marko Kreen wrote:
  NAK.  The fix is broken because it uses the EVP interface.  EVP is not
  a general-purpose interface, because not all valid keys for a cipher
  pass through it.  Only the key lengths used in SSL will work...

 I'm not an OpenSSL expert, but if you look at how the EVP call for setting
 the key is implemented, you can see that it ultimately calls BF_set_key.
 There is only one extra layer; see
 http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/common/openssl/crypto/evp/e_bf.c

 
 I glanced into evp.h for 0.9.7 and 0.9.6j and remembered that
 there were two things EVP forced: key length and padding.
 
 When I replied to you I remembered things a bit wrong; there is
 indeed a way to change the key size even in 0.9.6, but not the
 padding.  EVP_CIPHER_CTX_set_padding() appears only in 0.9.7.
 
 I suspect that since I could not work around the forced padding, I
 did not research the key size issue very deeply.
 
 So we can revisit the issue when we are ready to drop
 support for 0.9.6x.

the last OpenSSL 0.9.6 release was in March 2004, and 0.9.7 has been
available since early 2003 - I don't think dropping support for it in 8.3+
would be unreasonable at all ...


Stefan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Design: Escort info from WHERE clause to executor?

2007-07-24 Thread imad

It looks like you need a customized version of the AExpr node.
In the backend parser, an AExpr node is constructed for each given
WHERE expression, so you can store the weight along with the expression.
Further, don't forget to update the copy and equal functions for AExpr
if you want to carry this weight value all the way up to the executor.


--Imad
www.EnterpriseDB.com


On 7/24/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Hi all,

I want to pass additional weight info from the WHERE clause to the executor and 
I hope someone can help me with this.

I accept clauses like the following

WHERE (foo='a'){1}
WHERE (foo='a'){1} OR (bar='b'){2}
WHERE ((foo='a'){1} OR (bar='b'){2}){42} OR (baz='c'){3}

where the {} takes an integer as a weight that is attached to the preceding 
(partial) condition.

In the executor, I need to access (1) the logical value of, and (2) the weight
associated with, _each_ subexpression that was entered. (Getting the weight from
the parser to the executor is in itself a journey, it seems, as some expression
types are created anew rather than copied, and so lose their annotated weight
over and over again.)

Furthermore, I need the structure of OR to be preserved: the OR-of-OR structure
from the last WHERE above must be preserved, or at least be reconstructible, and
must not be folded into a flat three-argument OR (as canonicalize_qual and
friends do).

To sum up, I am looking for a (decently efficient) scheme that is able to

(1) pass arbitrary conditional expressions from WHERE to the executor in a 
structure-preserving way,
(2) annotate arbitrary expressions with weights that survive on their way from 
the parser to the executor, and
(3) access the logical value of particular subexpressions.

I have some basic ideas about how at least some of the requirements might be 
achieved, but as I am not totally satisfied with them, I hope you can provide 
me with some fresh input.

ANY ideas are welcome.

Regards,
Peter

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Marko Kreen

On 7/24/07, Stefan Kaltenbrunner [EMAIL PROTECTED] wrote:

Marko Kreen wrote:
 So we can revisit the issue when we are ready to drop
 support for 0.9.6x.

the last OpenSSL 0.9.6 release was in March 2004, and 0.9.7 has been
available since early 2003 - I don't think dropping support for it in 8.3+
would be unreasonable at all ...


Now that I think about it, yes, dropping 0.9.6 from 8.4
onwards should be no problem.  Considering the code could use a
good cleanup anyway, that may be a good moment for it.

--
marko

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] strange buildfarm failure on lionfish

2007-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 While I don't have any very strong objection to putting an ORDER BY
 on these particular queries, I'm worried about how many other regression
 tests will now start showing random failures.  We have an awful lot
 of small tables in the tests ...

 Maybe what we could do is set higher thresholds for the regression
 database with ALTER DATABASE.

That seems to make sense at least as a short-term response.  We weren't
seeing buildfarm failures with the previous defaults, so setting those
values with ALTER oughta do it.
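
For reference, the per-query remedy mentioned above is just deterministic
ordering; a sketch in the style of the regression tests (table and
predicate are illustrative):

  -- Without ORDER BY, row order depends on physical layout and can
  -- change if autovacuum processes the table mid-run; pinning the
  -- order keeps the expected output stable.
  SELECT unique1 FROM onek WHERE unique1 < 10 ORDER BY unique1;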

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] pgcrypto strong ciphers limitation

2007-07-24 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Marko Kreen wrote:
 So we can revisit the issue when we are ready to drop
 support for 0.9.6x.

 the last OpenSSL 0.9.6 release was in March 2004, and 0.9.7 has been
 available since early 2003 - I don't think dropping support for it in 8.3+
 would be unreasonable at all ...

Any major rewrite of pgcrypto would be for 8.4 (or later) at this point.
I tend to agree that we could drop 0.9.6x support in that timeframe.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Kerberos warnings on win32

2007-07-24 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 The attached file removes this by undef'ing the macros before we include the
 Kerberos files. But this is perhaps just too ugly to deal with, and we
 should live with the warnings instead?
 
 Ick.  I don't like any of these patches.

You know, I kind of expected that response :-P

We'll just live with the warnings then, since they're not critical.
Meanwhile, I'll file a bug with the Kerberos folks.

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] msvc and vista fun

2007-07-24 Thread Andrei Kovalevski

Andrew Dunstan wrote:

Dave Page wrote:

Andrew Dunstan wrote:
On a somewhat related note, I have had a spectacular lack of success 
in getting either MSVC or MinGW builds to work on Vista - so much so 
that I have currently abandoned my attempts on that platform and 
resorted to resuscitating an old XP box for testing. Following some 
advice from Magnus, I added ACLs to the build root for both an admin 
and a non-admin user (cacls buildroot /E /T /G AdminUser:C and 
similarly for a non-admin user). I can build as the admin user, but 
when I come to run initdb it fails, complaining that it can't find 
the postgres executable. 
Yeah, I ran into that problem as well. I'll look at my Vista box when 
I'm in the office tomorrow and see if I can figure out what hack 
fixed it for me.


I have never heard back on this, AFAIK. If anyone has instructions on 
how to manage this please let me know. My current status with 
MSVC/vista is still that I can build but not run as an admin user, and 
run but not build as a non-admin user. Bleah.


cheers

andrew
The described situation looks like you are trying to run initdb under an 
Admin account. This will happen even if the currently logged-in user is 
not an admin, but initdb has the 'run as administrator' property set, or 
you are trying to run it from some file manager which is running in admin 
mode.


Andrei.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] DLLIMPORT definition

2007-07-24 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 The DLLIMPORT definition used on Win32 conflicts with the headers in TCL,
 at least, and possibly others.

 One way to fix it is similar to the HAVE_xyz ones that I talk about in my
 other email. Another way to do it would be for us to use PGDLLIMPORT
 instead of DLLIMPORT.

PGDLLIMPORT seems the best bet to me.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] msvc and vista fun

2007-07-24 Thread Andrew Dunstan



Andrei Kovalevski wrote:

Andrew Dunstan wrote:

Dave Page wrote:

Andrew Dunstan wrote:
On a somewhat related note, I have had a spectacular lack of success 
in getting either MSVC or MinGW builds to work on Vista - so much 
so that I have currently abandoned my attempts on that platform and 
resorted to resuscitating an old XP box for testing. Following 
some advice from Magnus, I added ACLs to the build root for both an 
admin and a non-admin user (cacls buildroot /E /T /G AdminUser:C 
and similarly for a non-admin user). I can build as the admin user, 
but when I come to run initdb it fails, complaining that it can't 
find the postgres executable. 
Yeah, I ran into that problem as well. I'll look at my Vista box 
when I'm in the office tomorrow and see if I can figure out what 
hack fixed it for me.


I have never heard back on this, AFAIK. If anyone has instructions on 
how to manage this please let me know. My current status with 
MSVC/vista is still that I can build but not run as an admin user, 
and run but not build as a non-admin user. Bleah.


cheers

andrew
The described situation looks like you are trying to run initdb under 
an Admin account. This will happen even if the currently logged-in user 
is not an admin, but initdb has the 'run as administrator' property set, 
or you are trying to run it from some file manager which is running in 
admin mode.






No it doesn't. Please read again. As a non-admin user I *can* run. I 
just can't build. As an admin user I can build, but I can't run (and I 
should be able to run).


cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] embed postgres

2007-07-24 Thread Stephen Ince

Hi,
 I am looking to embed postgres into an application on Windows. I am fine 
with it being a separate service. Here is what I am looking to do; any help 
would be greatly appreciated.


1) Install postgres silently: libs (DLLs) and data files.
  a) What are the minimum DLL files?
  b) What .conf file should I edit to state where the postgres home is located?
2) Start postgres when the app starts.
3) Stop postgres when the app stops.
4) Minimize the amount of postgres processes.

Steve





---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] embed postgres

2007-07-24 Thread Euler Taveira de Oliveira
Stephen Ince wrote:

  I am looking to embed postgres into an application on windows. I am
 fine with it being a separate service. Here is what I am looking to do.
 Any help would be greatly appreciated.
 
This is the wrong list for this question; next time try the general list.
What you're looking for is a silent installation [1].

 2) Start postgres when the app starts.
 3) Stop postgres when the app stops.
pg_ctl?

 4) Minimize the amount of postgres processes.
 
You can't; it's by design (PostgreSQL runs one backend process per
connection, plus a few fixed background processes).

[1] http://pginstaller.projects.postgresql.org/silent.html


-- 
  Euler Taveira de Oliveira
  http://www.timbira.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Updated tsearch documentation

2007-07-24 Thread Bruce Momjian

I have added more documentation to try to show how full text search is
used with user tables.  I think the documentation is almost done:

http://momjian.us/expire/fulltext/HTML/textsearch-tables.html
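
On the index question discussed below, the unambiguous spelling is to
index a tsvector expression explicitly; a sketch reusing the names from
the quoted example (the 'english' configuration is an assumption):

  -- Index the tsvector itself rather than the raw text, so there is
  -- no question of how GIN obtains a tsvector:
  CREATE INDEX pgweb_idx ON pgweb
      USING gin (to_tsvector('english', textcat(title, body)));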

---

Oleg Bartunov wrote:
 On Wed, 18 Jul 2007, Bruce Momjian wrote:
 
  Oleg, Teodor,
 
  I am confused by the following example.  How does gin know to create a
  tsvector, or does it?  Does gist know too?
 
 No, gist doesn't know. I don't remember why, Teodor ?
 
 For GIN see http://archives.postgresql.org/pgsql-hackers/2007-05/msg00625.php
 for discussion
 
 
  FYI, at some point we need to chat via instant messenger or IRC to
  discuss the open items.  My chat information is here:
 
  http://momjian.us/main/contact.html
 
 I sent you an invitation for Google Talk; I use only the chat in Gmail.
 My Gmail account is [EMAIL PROTECTED]
 
 
  ---
 
  SELECT title
  FROM pgweb
  WHERE textcat(title,body) @@ plainto_tsquery('create table')
  ORDER BY dlm DESC LIMIT 10;
 
  CREATE INDEX pgweb_idx ON pgweb USING gin(textcat(title,body));
 
 
 
   Regards,
   Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Updated tsearch documentation

2007-07-24 Thread Oleg Bartunov

Bruce,

I sent you a link to my wiki page with a summary of the changes:
http://www.sai.msu.su/~megera/wiki/ts_changes

Your documentation looks rather out of date.

Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match