Re: [HACKERS] [patch] PL/Python is too lossy with floats

2015-06-01 Thread Marko Kreen
On Wed, Mar 11, 2015 at 9:49 PM, Peter Eisentraut pete...@gmx.net wrote:
> On 3/3/15 9:32 AM, Marko Kreen wrote:
>> PL/Python uses str(v) to convert float data, but is lossy
>> by design.  Only repr(v) is guaranteed to have enough
>> precision to make floats roundtrip properly:
>
> committed

In 9.3 and before, numeric is converted to float in PL/Python.

Please reconsider backporting.

-- 
marko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [patch] PL/Python is too lossy with floats

2015-03-03 Thread Marko Kreen
PL/Python uses str(v) to convert float data, but str() is lossy
by design.  Only repr(v) is guaranteed to have enough
precision to make floats roundtrip properly:

  https://docs.python.org/2/library/functions.html#func-repr
  https://docs.python.org/2/library/functions.html#str

Example:

  $ python
  >>> repr(100100100.654321)
  '100100100.654321'
  >>> str(100100100.654321)
  '100100100.654'

Attached patch uses PyObject_Repr() for float data.

As it's an annoying-to-debug problem and the patch is simple,
perhaps it's worth backporting?

-- 
marko

diff --git a/src/pl/plpython/expected/plpython_types.out b/src/pl/plpython/expected/plpython_types.out
index 61d800b..17057a5 100644
--- a/src/pl/plpython/expected/plpython_types.out
+++ b/src/pl/plpython/expected/plpython_types.out
@@ -354,6 +354,14 @@ CONTEXT:  PL/Python function test_type_conversion_float8
 
 (1 row)
 
+SELECT * FROM test_type_conversion_float8(100100100.654321);
+INFO:  (100100100.654321, <type 'float'>)
+CONTEXT:  PL/Python function test_type_conversion_float8
+ test_type_conversion_float8 
+-----------------------------
+            100100100.654321
+(1 row)
+
 CREATE FUNCTION test_type_conversion_oid(x oid) RETURNS oid AS $$
 plpy.info(x, type(x))
 return x
diff --git a/src/pl/plpython/plpy_typeio.c b/src/pl/plpython/plpy_typeio.c
index 524534e..8c70c7c 100644
--- a/src/pl/plpython/plpy_typeio.c
+++ b/src/pl/plpython/plpy_typeio.c
@@ -760,6 +760,18 @@ PLyObject_ToDatum(PLyObToDatum *arg, int32 typmod, PyObject *plrv)
 
 	if (PyUnicode_Check(plrv))
 		plrv_bo = PLyUnicode_Bytes(plrv);
+	else if (PyFloat_Check(plrv))
+	{
+		/* use repr() for floats, str() is lossy */
+#if PY_MAJOR_VERSION >= 3
+		PyObject   *s = PyObject_Repr(plrv);
+
+		plrv_bo = PLyUnicode_Bytes(s);
+		Py_XDECREF(s);
+#else
+		plrv_bo = PyObject_Repr(plrv);
+#endif
+	}
 	else
 	{
#if PY_MAJOR_VERSION >= 3
diff --git a/src/pl/plpython/sql/plpython_types.sql b/src/pl/plpython/sql/plpython_types.sql
index d9d0db6..19d920d 100644
--- a/src/pl/plpython/sql/plpython_types.sql
+++ b/src/pl/plpython/sql/plpython_types.sql
@@ -122,6 +122,7 @@ SELECT * FROM test_type_conversion_float8(100);
 SELECT * FROM test_type_conversion_float8(-100);
 SELECT * FROM test_type_conversion_float8(50.5);
 SELECT * FROM test_type_conversion_float8(null);
+SELECT * FROM test_type_conversion_float8(100100100.654321);
 
 
 CREATE FUNCTION test_type_conversion_oid(x oid) RETURNS oid AS $$



Re: [HACKERS] Supporting Windows SChannel as OpenSSL replacement

2014-06-09 Thread Marko Kreen
On Mon, Jun 09, 2014 at 02:45:08PM +0300, Heikki Linnakangas wrote:
> Thoughts? While we're at it, we'll probably want to refactor things
> so that it's easy to support other SSL implementations too, like
> gnutls.

One project that is proud to support several SSL implementations
is curl: http://curl.haxx.se/

Git: https://github.com/bagder/curl.git
Implementations: https://github.com/bagder/curl/tree/master/lib/vtls

List from vtls.c:

- OpenSSL
- GnuTLS
- NSS
- QsoSSL
- GSKit
- PolarSSL
- CyaSSL
- Schannel SSPI
- SecureTransport (Darwin)

We cannot reuse the code directly, but it seems usable as a
reference for the various gotchas that need to be solved.

-- 
marko





[HACKERS] [9.4] Minor SSL/ECDH related doc fixes

2014-05-17 Thread Marko Kreen
- Clarify ECDH description in release notes.
- Fix default value - it's 'prime256v1'.
- List curves with good cross-platform support explicitly
  (NIST P-256 / P-384 / P-521).

The -list_curves output is full of garbage, so it's hard to know which
ones make sense to use.  Only those three curves are supported
cross-platform - OpenSSL/Java/Windows - so list them explicitly.

The only reason to tune this value is to change the overall security
level up or down, so now this can be done safely and quickly.

Only upwards though.  We could also list here NIST P-192/P-224
(prime192v1, secp224r1), but those are not supported by Windows.
And prime256v1 is quite fast already.

In the future it might make philosophical sense to also list
Brainpool curves (RFC 7027), or some new curves from
http://safecurves.cr.yp.to/ when they are brought to TLS.
But currently only the NIST/NSA curves are a working option,
so let's keep it simple for users.

-- 
marko

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d9e5985..4a666d0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1020,13 +1020,23 @@ include 'filename'
      </term>
      <listitem>
       <para>
-        Specifies the name of the curve to use in ECDH key exchanges.  The
-        default is <literal>prime256p1</>.
+        Specifies the name of the curve to use in ECDH key exchange.
+        It needs to be supported by all clients that connect.
+        It does not need to be the same curve as used by the server's
+        Elliptic Curve key.  The default is <literal>prime256v1</>.
       </para>
 
       <para>
-        The list of available curves can be shown with the command
-        <literal>openssl ecparam -list_curves</literal>.
+        OpenSSL names for the most common curves:
+        <literal>prime256v1</> (NIST P-256),
+        <literal>secp384r1</> (NIST P-384),
+        <literal>secp521r1</> (NIST P-521).
+       </para>
+
+       <para>
+        The full list of available curves can be shown with the command
+        <literal>openssl ecparam -list_curves</literal>.  Not all of them
+        are usable in TLS though.
       </para>
      </listitem>
     </varlistentry>
diff --git a/doc/src/sgml/release-9.4.sgml b/doc/src/sgml/release-9.4.sgml
index 3070d0b..7c6fb8f 100644
--- a/doc/src/sgml/release-9.4.sgml
+++ b/doc/src/sgml/release-9.4.sgml
@@ -603,17 +603,23 @@
    </para>
 
    <para>
-    Such keys are faster and have improved security over previous
-    options. The new configuration
-    parameter <link linkend="guc-ssl-ecdh-curve"><varname>ssl_ecdh_curve</></link>
-    controls which curve is used.
+    This allows use of Elliptic Curve keys for server authentication.
+    Such keys are faster and have improved security over RSA keys.
+    Also it makes RSA keys perform slightly faster, as ECDHE-RSA key
+    exchange will be used over DHE-RSA if both sides support it.
+   </para>
+
+   <para>
+    The new configuration parameter
+    <link linkend="guc-ssl-ecdh-curve"><varname>ssl_ecdh_curve</></link>
+    controls which curve is used for ECDH.
    </para>
   </listitem>
 
   <listitem>
    <para>
     Improve the default <link
-    linkend="guc-ssl-ciphers"><varname>ssl_ciphers</></link> ciphers
+    linkend="guc-ssl-ciphers"><varname>ssl_ciphers</></link> value
     (Marko Kreen)
    </para>
   </listitem>



Re: [HACKERS] Problem with txid_snapshot_in/out() functionality

2014-04-14 Thread Marko Kreen
On Sun, Apr 13, 2014 at 05:46:20PM -0400, Jan Wieck wrote:
> On 04/13/14 14:22, Jan Wieck wrote:
>> On 04/13/14 08:27, Marko Kreen wrote:
>>> I think you need to do SET_VARSIZE also here.  Alternative is to
>>> move SET_VARSIZE after sort_snapshot().
>>>
>>> And it seems the drop-double-txid logic should be added also to
>>> txid_snapshot_recv().  It seems weird to have it behave differently
>>> from txid_snapshot_in().
>>
>> Thanks,
>>
>> yes on both issues. Will create another patch.
>
> New patch attached.
>
> New github commit is
> https://github.com/wieck/postgres/commit/b8fd0d2eb78791e5171e34aecd233fd06218890d

Looks OK to me.

-- 
marko





Re: [HACKERS] Problem with txid_snapshot_in/out() functionality

2014-04-13 Thread Marko Kreen
On Sat, Apr 12, 2014 at 02:10:13PM -0400, Jan Wieck wrote:
> Since it doesn't seem to produce any side effects, I'd think that
> making the snapshot unique within txid_current_snapshot() and
> filtering duplicates on input should be sufficient and eligible for
> backpatching.

Agreed.

> The attached patch adds a unique loop to the internal
> sort_snapshot() function and skips duplicates on input. The git
> commit is here:
>
> https://github.com/wieck/postgres/commit/a88a2b2c25b856478d7e2b012fc718106338fe00
>
>   static void
>   sort_snapshot(TxidSnapshot *snap)
>   {
> +     txid        last = 0;
> +     int         nxip, idx1, idx2;
> +
>       if (snap->nxip > 1)
> +     {
>           qsort(snap->xip, snap->nxip, sizeof(txid), cmp_txid);
> +         nxip = snap->nxip;
> +         idx1 = idx2 = 0;
> +         while (idx1 < nxip)
> +         {
> +             if (snap->xip[idx1] != last)
> +                 last = snap->xip[idx2++] = snap->xip[idx1];
> +             else
> +                 snap->nxip--;
> +             idx1++;
> +         }
> +     }
>   }

I think you need to do SET_VARSIZE also here.  Alternative is to
move SET_VARSIZE after sort_snapshot().

And it seems the drop-double-txid logic should be added also to
txid_snapshot_recv().  It seems weird to have it behave differently
from txid_snapshot_in().

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2014-02-23 Thread Marko Kreen
On Sat, Feb 22, 2014 at 08:31:14PM -0500, Peter Eisentraut wrote:
> On 2/2/14, 7:16 AM, Marko Kreen wrote:
>> On Thu, Dec 12, 2013 at 04:32:07PM +0200, Marko Kreen wrote:
>>> Attached patch changes default ciphersuite to HIGH:MEDIUM:+3DES:!aNULL
>>> and also adds documentation about reasoning for it.
>>
>> This is the last pending SSL cleanup related patch:
>>
>>   https://commitfest.postgresql.org/action/patch_view?id=1310
>>
>> Peter, you have claimed it as committer, do you see any remaining
>> issues with it?
>
> I'm OK with this change on the principle of clarifying and refining the
> existing default.  But after inspecting the expanded cipher list with
> the openssl cipher tool, I noticed that the new default re-enabled MD5
> ciphers.  Was that intentional?

Yes, kind of.  First note that only RC4-MD5 is SSLv3+;
the rest are SSLv2-only suites.

There are 2 points relevant about RC4-MD5:

* Main reason MEDIUM was added is to get RC4, for compatibility.

* Although MD5 is broken, the TLS protocol uses HMAC-MD5, which is not.
  So RC4-MD5 is a weak suite not because of MD5 but because of RC4.

My conclusion is that it's unnecessary to add '!MD5' to MEDIUM, as
that would not actually make things more secure.  Instead,
'MEDIUM' alone is enough to show that the user will not get
state-of-the-art-only suites.

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2014-02-02 Thread Marko Kreen
On Thu, Dec 12, 2013 at 04:32:07PM +0200, Marko Kreen wrote:
> Attached patch changes default ciphersuite to HIGH:MEDIUM:+3DES:!aNULL
> and also adds documentation about reasoning for it.

This is the last pending SSL cleanup related patch:

  https://commitfest.postgresql.org/action/patch_view?id=1310

Peter, you have claimed it as committer, do you see any remaining
issues with it?

-- 
marko





Re: [HACKERS] [COMMITTERS] pgsql: libpq: Support TLS versions beyond TLSv1.

2014-01-31 Thread Marko Kreen
On Sat, Jan 25, 2014 at 12:25:30PM -0500, Tom Lane wrote:
> Alternatively, given that TLS has been around for a dozen years and
> openssl versions that old have not gotten security updates for a long
> time, why don't we just reject SSLv3 on the backend side too?
> I guess it's barely possible that somebody out there is using a
> non-libpq-based client that uses a non-TLS-capable SSL library, but
> surely anybody like that is overdue to move into the 21st century.
> An SSL library that old is probably riddled with security issues.

Attached patch disables SSLv3 in backend.

TLS has been supported in OpenSSL since the fork from SSLeay, in Java
since 1.4.2, and in Windows since XP.  It's hard to imagine this
causing any compatibility problems.

-- 
marko

diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 43633e7..fc749f4 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -880,9 +880,9 @@ initialize_SSL(void)
 						SSLerrmessage())));
 	}
 
-	/* set up ephemeral DH keys, and disallow SSL v2 while at it */
+	/* set up ephemeral DH keys, and disallow SSL v2/v3 while at it */
 	SSL_CTX_set_tmp_dh_callback(SSL_context, tmp_dh_cb);
-	SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE | SSL_OP_NO_SSLv2);
+	SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE | SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
 
 	/* set up ephemeral ECDH keys */
 	initialize_ecdh();



Re: [HACKERS] Fix memset usage in pgcrypto

2014-01-21 Thread Marko Kreen
On Mon, Jan 20, 2014 at 06:49:21PM -0300, Alvaro Herrera wrote:
> Marko Kreen escribió:
>> http://www.viva64.com/en/b/0227/ reported that on-stack memset()s
>> might be optimized away by compilers.  Fix it.
>
> Just to clarify, this needs to be applied to all branches, right?  If
> so, does the version submitted apply cleanly to all of them?

It does apply cleanly.  It is not a critical fix, but it's simple,
so I think it should be back-patched.

-- 
marko





Re: [HACKERS] pgcrypto missing header file inclusions

2013-12-30 Thread Marko Kreen
On Sat, Dec 28, 2013 at 10:36:19PM -0500, Peter Eisentraut wrote:
> While playing around with the pginclude tools, I noticed that pgcrypto
> header files are failing to include some header files whose symbols they
> use.  This change would fix it:
>
> diff --git a/contrib/pgcrypto/pgp.h b/contrib/pgcrypto/pgp.h
> index 3022abf..f856e07 100644
> --- a/contrib/pgcrypto/pgp.h
> +++ b/contrib/pgcrypto/pgp.h
> @@ -29,6 +29,9 @@
>   * contrib/pgcrypto/pgp.h
>   */
>  
> +#include "mbuf.h"
> +#include "px.h"
> +
>  enum PGP_S2K_TYPE
>  {
>  	PGP_S2K_SIMPLE = 0,
>
> Does that look reasonable?

It's a style question - currently pgp.h expects other headers to be
included in the .c files.  Including them in pgp.h might be better style,
but then please sync the .c files with it too.

-- 
marko





[HACKERS] Fix memset usage in pgcrypto

2013-12-26 Thread Marko Kreen
http://www.viva64.com/en/b/0227/ reported that on-stack memset()s
might be optimized away by compilers.  Fix it.

* Replace memset() with px_memset()
* Add px_memset to copy_crlf()
* Add px_memset to pgp-s2k.c

-- 
marko

diff --git a/contrib/pgcrypto/crypt-blowfish.c b/contrib/pgcrypto/crypt-blowfish.c
index b49747d..fbaa3d7 100644
--- a/contrib/pgcrypto/crypt-blowfish.c
+++ b/contrib/pgcrypto/crypt-blowfish.c
@@ -35,6 +35,7 @@
 #include "postgres.h"
 
 #include "px-crypt.h"
+#include "px.h"
 
 #ifdef __i386__
 #define BF_ASM		0	/* 1 */
@@ -616,7 +617,7 @@ _crypt_blowfish_rn(const char *key, const char *setting,
 	count = (BF_word) 1 << ((setting[4] - '0') * 10 + (setting[5] - '0'));
 	if (count < 16 || BF_decode(data.binary.salt, &setting[7], 16))
 	{
-		memset(data.binary.salt, 0, sizeof(data.binary.salt));
+		px_memset(data.binary.salt, 0, sizeof(data.binary.salt));
 		return NULL;
 	}
 	BF_swap(data.binary.salt, 4);
@@ -729,7 +730,7 @@ _crypt_blowfish_rn(const char *key, const char *setting,
 /* Overwrite the most obvious sensitive data we have on the stack. Note
  * that this does not guarantee there's no sensitive data left on the
  * stack and/or in registers; I'm not aware of portable code that does. */
-	memset(&data, 0, sizeof(data));
+	px_memset(&data, 0, sizeof(data));
 
 	return output;
 }
diff --git a/contrib/pgcrypto/crypt-md5.c b/contrib/pgcrypto/crypt-md5.c
index 2a5cd70..6a09d76 100644
--- a/contrib/pgcrypto/crypt-md5.c
+++ b/contrib/pgcrypto/crypt-md5.c
@@ -89,7 +89,7 @@ px_crypt_md5(const char *pw, const char *salt, char *passwd, unsigned dstlen)
 		px_md_update(ctx, final, pl > MD5_SIZE ? MD5_SIZE : pl);
 
 	/* Don't leave anything around in vm they could use. */
-	memset(final, 0, sizeof final);
+	px_memset(final, 0, sizeof final);
 
 	/* Then something really weird... */
 	for (i = strlen(pw); i; i >>= 1)
@@ -154,7 +154,7 @@ px_crypt_md5(const char *pw, const char *salt, char *passwd, unsigned dstlen)
 	*p = '\0';
 
 	/* Don't leave anything around in vm they could use. */
-	memset(final, 0, sizeof final);
+	px_memset(final, 0, sizeof final);
 
 	px_md_free(ctx1);
 	px_md_free(ctx);
diff --git a/contrib/pgcrypto/fortuna.c b/contrib/pgcrypto/fortuna.c
index 1228fb4..47380a8 100644
--- a/contrib/pgcrypto/fortuna.c
+++ b/contrib/pgcrypto/fortuna.c
@@ -34,6 +34,7 @@
 #include <sys/time.h>
 #include <time.h>
 
+#include "px.h"
 #include "rijndael.h"
 #include "sha2.h"
 #include "fortuna.h"
@@ -169,7 +170,7 @@ md_result(MD_CTX * ctx, uint8 *dst)
 
 	memcpy(&tmp, ctx, sizeof(*ctx));
 	SHA256_Final(dst, &tmp);
-	memset(&tmp, 0, sizeof(tmp));
+	px_memset(&tmp, 0, sizeof(tmp));
 }
 
 /*
@@ -243,7 +244,7 @@ enough_time_passed(FState *st)
 	if (ok)
 		memcpy(&last, &tv, sizeof(tv));
 
-	memset(&tv, 0, sizeof(tv));
+	px_memset(&tv, 0, sizeof(tv));
 
 	return ok;
 }
@@ -290,8 +291,8 @@ reseed(FState *st)
 	/* use new key */
 	ciph_init(&st->ciph, st->key, BLOCK);
 
-	memset(key_md, 0, sizeof(key_md));
-	memset(buf, 0, BLOCK);
+	px_memset(key_md, 0, sizeof(key_md));
+	px_memset(buf, 0, BLOCK);
 }
 
 /*
@@ -341,8 +342,8 @@ add_entropy(FState *st, const uint8 *data, unsigned len)
 	if (pos == 0)
 		st->pool0_bytes += len;
 
-	memset(hash, 0, BLOCK);
-	memset(md, 0, sizeof(md));
+	px_memset(hash, 0, BLOCK);
+	px_memset(md, 0, sizeof(md));
 }
 
 /*
@@ -378,7 +379,7 @@ startup_tricks(FState *st)
 		encrypt_counter(st, buf + CIPH_BLOCK);
 		md_update(&st->pool[i], buf, BLOCK);
 	}
-	memset(buf, 0, BLOCK);
+	px_memset(buf, 0, BLOCK);
 
 	/* Hide the key. */
 	rekey(st);
diff --git a/contrib/pgcrypto/internal-sha2.c b/contrib/pgcrypto/internal-sha2.c
index f86b478..912effb 100644
--- a/contrib/pgcrypto/internal-sha2.c
+++ b/contrib/pgcrypto/internal-sha2.c
@@ -84,7 +84,7 @@ int_sha224_free(PX_MD *h)
 {
 	SHA224_CTX *ctx = (SHA224_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	px_free(h);
 }
@@ -132,7 +132,7 @@ int_sha256_free(PX_MD *h)
 {
 	SHA256_CTX *ctx = (SHA256_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	px_free(h);
 }
@@ -180,7 +180,7 @@ int_sha384_free(PX_MD *h)
 {
 	SHA384_CTX *ctx = (SHA384_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	px_free(h);
 }
@@ -228,7 +228,7 @@ int_sha512_free(PX_MD *h)
 {
 	SHA512_CTX *ctx = (SHA512_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	px_free(h);
 }
diff --git a/contrib/pgcrypto/internal.c b/contrib/pgcrypto/internal.c
index a02c943..7b33e49 100644
--- a/contrib/pgcrypto/internal.c
+++ b/contrib/pgcrypto/internal.c
@@ -142,7 +142,7 @@ int_md5_free(PX_MD *h)
 {
 	MD5_CTX    *ctx = (MD5_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	px_free(h);
 }
@@ -190,7 +190,7 @@ int_sha1_free(PX_MD *h)
 {
 	SHA1_CTX   *ctx = (SHA1_CTX *) h->p.ptr;
 
-	memset(ctx, 0, sizeof(*ctx));
+	px_memset(ctx, 0, sizeof(*ctx));
 	px_free(ctx);
 	

Re: [HACKERS] SSL: better default ciphersuite

2013-12-17 Thread Marko Kreen
On Tue, Dec 17, 2013 at 11:26:13AM -0500, Bruce Momjian wrote:
> On Tue, Dec 17, 2013 at 09:51:30AM -0500, Robert Haas wrote:
>> I'm starting to think we should just leave this well enough alone.  We
>> can't seem to find two people with the same idea of what would be
>> better than what we have now.  And of course the point of making it a
>> setting in the first place is that each person can set it to whatever
>> they deem best.
>
> Yes, I am seeing that too.  Can we agree on one that is _better_ than
> what we have now, even if we can't agree on a _best_ one?

To recap - old settings are:

  DEFAULT:!LOW:!EXP:!MD5:@STRENGTH
  prefer-client-order

new proposal is:

  HIGH:MEDIUM:+3DES:!aNULL
  prefer-server-order

This is better than the old state in the following aspects:

- First, it does not remove any ciphers compared to the current
  list.  So anything that could connect previously can still
  connect.

- Clearer to people not intimately familiar with OpenSSL and TLS.
  In particular, the 'MEDIUM' communicates that some less secure
  ciphers are enabled (RC4).

- Fixes the 3DES ordering.  OpenSSL default list is ordered roughly
  by (key-bits, ECDHE, DHE, plain RSA).  3DES has 168-bit key so
  it appears before 128-bit ciphers, although it offers around 112-bit
  actual security.  This problem exists already with existing Postgres
  versions: if you set suite to AES128:3DES, then libpq-based clients
  will use 3DES.

When combined with prefer-server-order, it has following benefits:

- Clarity: admin can look at configured cipher order and make reasonable
  guesses what will be used.

- Actually activates the 3DES fix.  Although non-libpq/OpenSSL-based
  clients did use their own order, OpenSSL-based clients had the
  same ordering problem on the client side.

- Old clients that did prefer RC4 will use it only as a last resort,
  when the only alternative is 3DES.

- Old clients that did prefer non-DHE ciphers will use DHE ciphers
  if available.


One goal the new settings *do not* try to achieve is to pick the
absolutely fastest cipher from the secure ones.  The old settings did
not do that either, as far as libpq clients are concerned.  Java did
try from the client side, but as a result old deployed versions now
use insecure settings.  I think it's best when the default settings
prefer security over speed; everyone who has special speed
requirements - AES is slow - can tune the list themselves.


So, does anyone have reasons not to use proposed new default?

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2013-12-17 Thread Marko Kreen
On Sun, Dec 15, 2013 at 05:10:38PM -0500, James Cloos wrote:
> MK == Marko Kreen mark...@gmail.com writes:
> PE == Peter Eisentraut pete...@gmx.net writes:
>
> PE> Any other opinions on this out there?
>
> For reference, see:
>
>   https://wiki.mozilla.org/Security/Server_Side_TLS
>
> for the currently suggested suite for TLS servers.
>
> That is:
>
> ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:
> ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:
> DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:
> ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:
> ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:
> ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:
> DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:
> DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:
> AES256-GCM-SHA384:ECDHE-RSA-RC4-SHA:ECDHE-ECDSA-RC4-SHA:
> AES128:AES256:RC4-SHA:HIGH:
> !aNULL:!eNULL:!EXPORT:!DES:!3DES:!MD5:!PSK

This is an example of a ciphersuite list for people who have special
requirements and care about tracking yearly changes in the SSL
landscape, and who can deploy config changes relatively fast.

This discussion is about the Postgres default suite, which cannot and
should not be changed periodically - it is for people who leave
Postgres settings at the defaults and expect the setup to work well.

We would like to leave as much as possible to OpenSSL, but not more.

Looking at the history of OpenSSL, their default order has been
good, except the 3DES vs. AES128 priority.

Looking into future, I guess following events are likely:

- RC4 gets practically broken and/or removed from TLS
  (draft-popov-tls-prohibiting-rc4-01).

- New ciphersuites: Salsa/Chacha (256-bit key).

- New modes: CCM (RFC6655, draft-mcgrew-tls-aes-ccm-ecc-07),
  other ciphers with GCM, new AEAD constructs.

- CBC mode fixes: pad-mac-encrypt, pad-encrypt-mac.  Those may
  be implemented with TLS extensions, so no new ciphersuites.

RC4 situation - the 'MEDIUM' in my proposal communicates
that not all ciphers are best, and prefer-server-order
makes sure it is selected as last resort.  So that is solved.

New ciphersuites - if we want to select the fastest from the secure
suites we need to change the configuration periodically
(RC4 -> AES128-CBC -> AES128-GCM -> SALSA) and I don't think Postgres
should bother with that.  So I think it's better to leave the ordering
of new ciphers to OpenSSL, and people who have special requirements
can worry about the best configuration for the specific stack they
are running.

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2013-12-13 Thread Marko Kreen
On Thu, Dec 12, 2013 at 09:18:03PM -0500, Peter Eisentraut wrote:
> On Thu, 2013-12-12 at 12:30 +0200, Marko Kreen wrote:
>> First, if there is explicit wish to keep RC4/SEED in play, I'm fine
>> with HIGH:MEDIUM:!aNULL as new default.  Clarity-wise, it's still
>> much better than current value.  And this value will result *exactly*
>> same list in same order as current value.
>
> If we have to make a change, I'd go for that, but I'm not convinced that
> this is necessarily clearer.

Yeah, the clarity argument is getting thinner...

And my latest patch was for HIGH:MEDIUM:+3DES:!aNULL.

I still think it's better to have positive statements there -
give me this and that - instead of the badly-named 'DEFAULT' with
lots of negatives applied to it.  But it's not that straightforward
anymore - the +3DES breaks the "leave everything to OpenSSL" angle.

But we do need to change the default suite list to one that works
well with the prefer-server-ciphers option, which means it should
contain at least the +3DES workaround.  Clients that don't want AES256
are reasonable, as AES256 does not have any practical advantages over
AES128.

I don't think just reverting the default is a good idea - we would
then need to document on the option that if you flip it, you should
add such fixes to the cipher list.  Which seems silly.

And not documenting anything and just leaving matters to admins seems
a bad idea too - they are not in a better position to do such research
than we are now.

So I think we can pick good default, now, and everybody will benefit.


For fun, how to go overboard on the issue - Mozilla recommendations
for TLS setup on their infrastructure:

  https://wiki.mozilla.org/Security/Server_Side_TLS

It also discusses various issues with TLS, so it's good read.

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2013-12-12 Thread Marko Kreen
On Wed, Dec 11, 2013 at 10:08:44PM -0500, Tom Lane wrote:
> Peter Eisentraut pete...@gmx.net writes:
>> Any other opinions on this out there?  All instances of other
>> SSL-enabled servers out there, except nginx, default to some variant of
>> DEFAULT:!LOW:... or HIGH:MEDIUM:  The proposal here is essentially
>> to disable MEDIUM ciphers by default, which is explicitly advised
>> against in the Postfix and Dovecot documentation, for example.
>
> Doesn't seem like a great idea then.

First, if there is explicit wish to keep RC4/SEED in play, I'm fine
with HIGH:MEDIUM:!aNULL as new default.  Clarity-wise, it's still
much better than current value.  And this value will result *exactly*
same list in same order as current value.


But please don't look to the SMTP/IMAP world for sane defaults;
their situation is quite different:

* Lots of old and crap mail clients around (usually named Outlook).
* They use aNULL ciphers for opportunistic SSL.
* They use aNULL ciphers because of CA-less certs.

If you need aNULL enabled anyway, there is no point in secure ciphers.

> I assume that if left to its own
> devices, PG presently selects some MEDIUM-level cipher by default?  If so,
> it sounds like this change amounts to imposing a performance penalty for
> SSL connections by fiat.  On the other hand, if we select a HIGH cipher by
> default, then aren't we just refusing to let clients who explicitly ask
> for a MEDIUM cipher have one?  Either way, I'd want to see a pretty darn
> airtight rationale for that, and there sure isn't one in this thread
> so far.

A performance penalty can happen for clients that support only RC4 and
3DES, and currently prefer RC4.  3DES is a slow cipher, so falling
back to it would be noticeable.

According to ssllabs.com, Java 6 does prefer RC4, but it would fall back
to AES, which is fast cipher.  Java 7 prefers AES.

OpenSSL has always used AES, so no change there.

I know that SChannel SSL library in Windows XP (and earlier) is such
RC4+3DES only implementation, but I have not heard about anything
using it to connect to Postgres.

Also I have not heard about any Postgres clients actually allowing
ciphers to be configured, so my impression is that all client
libraries use defaults, which usually prefer AES anyway.

> The part of the patch that removes @STRENGTH seems plausible, though,
> if Marko is correct that that's effectively overriding a hand-tailored
> ordering.
>
> In the end I wonder why our default isn't just DEFAULT.  Anybody who
> thinks that's an inappropriate default should be lobbying the OpenSSL
> folk, not us, I should think.

It's easy to see why - then every Postgres admin who wants SSL *must*
configure the suite.  The ALL:!aNULL:!eNULL default is clearly
a library default, which does not know in what situation it is used,
geared at max compatibility.

It's an especially bad choice because it will look fine to people
unfamiliar with OpenSSL's internal nuances.  As seen in this thread.

My goal is to have default which will be secure by default,
and give rough impression what it is about.

-

Hm, looking at the Java suites, I notice a problem that is due to a bug
in OpenSSL's default cipher order - it orders 3DES before AES128.

So if we give priority to server cipher order and client does
not accept AES256 (like Java 6/7), 3DES will be picked.

This is indeed a bug that should be fixed on the OpenSSL side, but it
seems we cannot wait for their fix...

So my new proposal would be to pick one from following defaults:

1) HIGH:+3DES:!aNULL - disables RC4, orders 3DES last.

2) HIGH:MEDIUM:+3DES:!aNULL - no suite changes from current one,
   except 3DES is ordered last.

+3DES reorders the already-picked 3DES suites to the end.  As my
impression is that no clients have ever actually used 3DES,
it's fine to use !3DES there.  It's clearer too.  But if max
compatibility is the goal, then +3DES is better.

It's not as nice and simple as I hoped though. :(

-- 
marko

PS.  Use openssl ciphers -v 'HIGH:...' > file and diff -u
on the files to see changes between different values.





Re: [HACKERS] SSL: better default ciphersuite

2013-12-12 Thread Marko Kreen
On Thu, Dec 12, 2013 at 01:33:57PM +0100, Magnus Hagander wrote:
 On Thu, Dec 12, 2013 at 11:30 AM, Marko Kreen mark...@gmail.com wrote:
  On Wed, Dec 11, 2013 at 10:08:44PM -0500, Tom Lane wrote:
  I know that the SChannel SSL library in Windows XP (and earlier) is such
  an RC4+3DES-only implementation, but I have not heard of anything
  using it to connect to Postgres.
 
  Also I have not heard of any Postgres clients actually allowing
  users to configure ciphers, so my impression is that all client libraries
  use the defaults, which usually prefer AES anyway.
 
 
 I don't know, but I would assume that npgsql, which sits on top of dotnet,
 would sit on top of schannel in the end.

Probably yes.

 That said, this is XP and earlier, right? Newer versions of Windows have
 better defaults?

Yes, since Vista it supports AES:

  
http://msdn.microsoft.com/en-us/library/windows/desktop/ff468651%28v=vs.85%29.aspx

  So my new proposal would be to pick one from following defaults:
 
  1) HIGH:+3DES:!aNULL - disables RC4, orders 3DES last.
 
  2) HIGH:MEDIUM:+3DES:!aNULL - no suite changes from current one,
 except 3DES is ordered last.
 
  +3DES reorders already picked 3DES suites to the end.  As my
  impression is that no clients ever have actually used 3DES,
  it's fine to use !3DES there.  It's clearer too.  But if max
  compatibility is goal, then +3DES is better.
 
  It's not as nice and simple as I hoped though. :(
 
 
 We definitely want to explain in a comment next to the default value the
 exact reasoning behind the default value, I think. That doesn't mean people
 will understand it, but it means they at least have a chance.
 
 That said, #2 seems reasonable I think, but I wouldn't object to #1 either
 :)

Although I don't think that making .NET users on XP use 3DES is a big
problem, there is also no urgent need to drop RC4, as long as all
reasonable alternatives are used first.

Attached patch changes default ciphersuite to HIGH:MEDIUM:+3DES:!aNULL
and also adds documentation about reasoning for it.  (I tried to be
consistent there about cipher vs. cipher suite, not sure whether
I succeeded.)

Summary: this default together with the previous server-order-first patch
makes sure that MEDIUM ciphers are never picked accidentally (eg. the bad
default in Java 6), but only when explicitly requested or when the only
alternative is 3DES.

-- 
marko

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8896988..2755f55 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -876,12 +876,62 @@ include 'filename'
    </indexterm>
    <listitem>
     <para>
-     Specifies a list of <acronym>SSL</> ciphers that are allowed to be
+     Specifies a list of <acronym>SSL</> cipher suites that are allowed to be
      used on secure connections.  See
      the <citerefentry><refentrytitle>ciphers</refentrytitle></citerefentry> manual page
      in the <application>OpenSSL</> package for the syntax of this setting
-     and a list of supported values.  The default value is usually
-     reasonable, unless you have specific security requirements.
+     and a list of supported values.  The default value is
+     <literal>HIGH:MEDIUM:+3DES:!aNULL</>.  It is usually reasonable,
+     unless you have specific security requirements.
+    </para>
+    <para>
+     Explanation for default value:
+     <variablelist>
+      <varlistentry>
+       <term><literal>HIGH</literal></term>
+       <listitem>
+        <para>
+         Cipher suites that use ciphers from <literal>HIGH</> section.
+         (AES, Camellia, 3DES)
+        </para>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><literal>MEDIUM</literal></term>
+       <listitem>
+        <para>
+         Cipher suites that use ciphers from <literal>MEDIUM</> section.
+         (RC4, SEED)
+        </para>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><literal>+3DES</literal></term>
+       <listitem>
+        <para>
+         OpenSSL default order for <literal>HIGH</> is problematic as it orders 3DES
+         higher than AES128.  This is wrong as 3DES offers less security than AES128
+         and it is also much slower.  <literal>+3DES</> reorders it after all other
+         <literal>HIGH</> and <literal>MEDIUM</> ciphers.
+        </para>
+       </listitem>
+      </varlistentry>
+      <varlistentry>
+       <term><literal>!aNULL</literal></term>
+       <listitem>
+        <para>
+         Disables anonymous cipher suites that do no authentication.
+         Such cipher suites are vulnerable to a man in the middle
+         attack and so should not be used.
+        </para>
+       </listitem>
+      </varlistentry>
+     </variablelist>
+     Available cipher suite details will vary across OpenSSL versions, these values
+     are taken from <application>OpenSSL</> 1.0.1.  Use command
+     <literal>openssl ciphers -v 'HIGH:MEDIUM:+3DES:!aNULL'</literal>
+     to see actual details

Re: [HACKERS] Feature request: Logging SSL connections

2013-12-08 Thread Marko Kreen
On Fri, Dec 06, 2013 at 02:53:27PM +0100, Dr. Andreas Kunert wrote:
 Anything else missing?
 
 Functionally it's fine now, but I see a few style problems:
 
 - if (port->ssl > 0) is wrong, ->ssl is a pointer.  So use just
if (port->ssl).
 - Deeper indentation would look nicer with braces.
 - There are some duplicated messages, could you restructure it so that
each message exists only once.
 
 New version is attached. I could add braces before and after the
 ereport()-calls too, but then I need two more #ifdefs to catch the
 closing braces.

Thank you.  Looks good now.  I added it to next commitfest:

  https://commitfest.postgresql.org/action/patch_view?id=1324

-- 
marko





Re: [HACKERS] Feature request: Logging SSL connections

2013-12-06 Thread Marko Kreen
On Fri, Dec 06, 2013 at 11:43:55AM +0100, Dr. Andreas Kunert wrote:
 That seems useful.  Do we need more information, like whether a client
 certificate was presented, or what ciphers were used?
 
 Yes, please show ciphersuite and TLS version too.  Andreas, you can use my
 recent \conninfo patch as template:
 

  https://github.com/markokr/postgres/commit/7d1b27ac74643abd15007cc4ec0b56ba92b39d90
 
 Also, please show the SSL level also for walsender connections.  It's
 quite important to know whether they are using SSL or not.
 
 But I think the 'bits' output is unnecessary, as its cipher strength
 is known from the ciphersuite.  Perhaps it can be removed from \conninfo too.
 
 A new patch is attached. I added the ciphersuite and TLS version
 like shown in your template (minus the 'bits' output). I also added
 the SSL information for walsender connections, but due to a missing
 test setup I cannot test that part.
 
 Anything else missing?

Functionally it's fine now, but I see a few style problems:

- if (port->ssl > 0) is wrong, ->ssl is a pointer.  So use just
  if (port->ssl).

- Deeper indentation would look nicer with braces.

- There are some duplicated messages, could you restructure it so that
  each message exists only once.

Something like this perhaps:

#ifdef USE_SSL
if (port->ssl)
{
if (walsender) ..
else ..
}
else
#endif
...

-- 
marko





Re: [HACKERS] Feature request: Logging SSL connections

2013-12-05 Thread Marko Kreen
On Thu, Dec 05, 2013 at 09:43:31AM -0500, Peter Eisentraut wrote:
 On 12/5/13, 8:53 AM, Dr. Andreas Kunert wrote:
  we were really missing the information in our log files if (and which
  of) our users are using SSL during their connections.
  
  The attached patch is a very simple solution to this problem - it just
  tests if the ssl pointer in Port is null. If no, it adds SSL to the
  logfile, otherwise it adds NOSSL.
 
 That seems useful.  Do we need more information, like whether a client
 certificate was presented, or what ciphers were used?

Yes, please show the ciphersuite and TLS version too.  Andreas, you can use my
recent \conninfo patch as a template:

  
https://github.com/markokr/postgres/commit/7d1b27ac74643abd15007cc4ec0b56ba92b39d90

Also, please show the SSL level for walsender connections too.  It's
quite important to know whether they are using SSL or not.

But I think the 'bits' output is unnecessary, as its cipher strength
is known from the ciphersuite.  Perhaps it can be removed from \conninfo too.

-- 
marko





[HACKERS] [patch] libpq: Support TLSv1.1+ (was: fe-secure.c and SSL/TLS)

2013-12-03 Thread Marko Kreen
As pointed out by Jeffrey Walton, only SSLv23_method() will negotiate
higher TLS versions than immediately requested.  All other _method()
functions will negotiate only one specific version.

Attached are two patches:

* libpq.tls11plus.diff

Use SSLv23_method() instead of TLSv1_method() in fe-secure.c.  Also
disable SSLv2 and SSLv3 so TLSv1 continues to be the minimum
TLS version negotiated.  This means TLSv1.1 or TLSv1.2 will be
used as protocol if both sides can speak it.

* psql.conninfo.tlsver.diff

Make psql \conninfo show the TLS protocol version.  Otherwise it's really
hard to see which version is actually used, short of network dumps.
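
For comparison, the same negotiate-flexibly-then-raise-the-floor idea can be sketched with Python's ssl module (an analogue only, not libpq; modern Python expresses the floor as minimum_version rather than SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3 flags):

```python
import ssl

# PROTOCOL_TLS_CLIENT is the version-flexible method, analogous to
# SSLv23_method(); pinning minimum_version at TLSv1 plays the role of
# disabling SSLv2/SSLv3 in the patch, so higher TLS versions can
# still be negotiated upward.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1
assert ctx.minimum_version == ssl.TLSVersion.TLSv1
```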


Random findings
---

- TLSv1.1 and TLSv1.2 are implemented in OpenSSL 1.0.1.

- All OpenSSL 1.0.1+ versions negotiate TLSv1.2 with SSLv23_method()
  by default.  This is default also in OpenSSL git branches (1.0.1-stable,
  1.0.2-stable, master).

- OpenSSL versions up to 0.9.7f send SSLv2 ClientHello from SSLv23_method()
  even when SSLv2 and SSLv3 are disabled.  They still manage to
  negotiate TLSv1.  Since version 0.9.7h, OpenSSL sends TLSv1 ClientHello
  in those circumstances.

- Ubuntu uses compilation flag that disables TLSv1.2 in SSLv23_method().

- Ubuntu also uses compilation flag to cut TLSv1.2 cipher list to first 25.
  This means only AES256 is usable.  This also disables secure renegotiation,
  as that is signaled with an extra ciphersuite.  And because of the previous
  flag, it affects only programs using TLSv1_2_method() directly.
  I see signs of confusion here.

- Debian and Fedora OpenSSL 1.0.1+ packages do not mess with TLSv1.2,
  so I assume everything is fine there.


Because the patches are small and look safe, it might be better if they
are handled together with other SSL patches in this commitfest.  That
would give them more mileage before release, to see if any problems
pop out.  But I can add them to next commitfest if that is not OK.

-- 
marko

diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index d88c752..74b9fa2 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -966,7 +966,12 @@ init_ssl_system(PGconn *conn)
 			SSL_load_error_strings();
 		}
 
-		SSL_context = SSL_CTX_new(TLSv1_method());
+		/*
+		 * SSLv23_method() is only method that negotiates
+		 * higher protocol versions.  Rest of the methods
+		 * allow only one specific TLS version.
+		 */
+		SSL_context = SSL_CTX_new(SSLv23_method());
 		if (!SSL_context)
 		{
 			char	   *err = SSLerrmessage();
@@ -981,6 +986,9 @@ init_ssl_system(PGconn *conn)
 			return -1;
 		}
 
+		/* Disable old protocol versions */
+		SSL_CTX_set_options(SSL_context, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
+
 		/*
 		 * Disable OpenSSL's moving-write-buffer sanity check, because it
 		 * causes unnecessary failures in nonblocking send cases.
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 638d8cb..0763163 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1795,8 +1795,8 @@ printSSLInfo(void)
 		return;	/* no SSL */
 
 	SSL_get_cipher_bits(ssl, sslbits);
-	printf(_("SSL connection (cipher: %s, bits: %d)\n"),
-		   SSL_get_cipher(ssl), sslbits);
+	printf(_("SSL connection (protocol: %s, cipher: %s, bits: %d)\n"),
+		   SSL_get_version(ssl), SSL_get_cipher(ssl), sslbits);
 #else
 
 	/*



Re: [HACKERS] fe-secure.c and SSL/TLS

2013-11-30 Thread Marko Kreen
On Sat, Nov 30, 2013 at 03:46:06AM -0500, Jeffrey Walton wrote:
  I believe the standard way of achieving TLS1.0 and above is to use
  the SSLv23_client_method() and then remove the SSL protocols with
  SSL_OP_NO_SSLv2 and SSL_OP_NO_SSLv3. I have to use handwaiving around
  standard because I don't believe its documented anywhere (one of the
  devs told me its the standard way to do it.).
 
  Indeed - Python ssl module seems to achieve TLSv1.1 and it uses
  SSLv23_method().  But still no TLSv1.2.
 It sounds like they are using the TLSv1_1_method(). You can check it
 with Wireshark. The Client Hello will advertise the highest version of
 the protocol supported. See http://postimg.org/image/e4mk3nhhl/.

No, they are using SSLv23_method().  And I can confirm - I wrote a small
C program with SSLv23_method() and SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3,
and it requests up to TLSv1.1.

 If Python is not advertising TLS 1.2, then they should use the
 SSLv23_method() with SSL_OP_NO_SSLv2, SSL_OP_NO_SSLv3 and
 SSL_OP_NO_TLSv1. That will get them TLS 1.1 and above. From ssl.h,
 around line 605:
 
 #define SSL_OP_NO_SSLv2     0x01000000L
 #define SSL_OP_NO_SSLv3     0x02000000L
 #define SSL_OP_NO_TLSv1     0x04000000L
 #define SSL_OP_NO_TLSv1_2   0x08000000L
 #define SSL_OP_NO_TLSv1_1   0x10000000L
 
 If you only want TLS 1.1 and 1.2, you can further trim your preferred
 cipher list. TLS 1.1 did not add any ciphers, so your list might look
 like (the TLS 1.0 ciphers can be used in TLS 1.1):

I could not get TLSv1.1+ with that.  But I'm working against
Ubuntu 12.04 default OpenSSL.  I'll try with other versions too.

 Personally, I'd like to drop TLS 1.0 (even though the complaints are
 mainly academic). But I think its still needed for interop. I've never
 rolled a system without it enabled.

The good thing about libpq is that it knows the server is OpenSSL.  The
bad thing is that the server may be old, so we need to support servers
down to OpenSSL 0.9.7.  Which means TLSv1.0.

-- 
marko





Re: [HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order

2013-11-29 Thread Marko Kreen
On Fri, Nov 29, 2013 at 09:25:02AM -0500, Peter Eisentraut wrote:
 On Thu, 2013-11-14 at 11:45 +0100, Magnus Hagander wrote:
  I think the default behaviour should be the one we recommend (which
  would be to have the server one be preferred). But I do agree with the
  requirement to have a GUC to be able to  remove it
 
 Is there a reason why you would want to turn it off?

The GUC is there so the old behaviour can be restored.

Why would anyone want that, I don't know.  In the context of PostgreSQL,
I see no reason to prefer the old behaviour.

-- 
marko





Re: [HACKERS] SSL: better default ciphersuite

2013-11-29 Thread Marko Kreen
On Fri, Nov 29, 2013 at 09:18:49AM -0500, Peter Eisentraut wrote:
 On Fri, 2013-11-15 at 01:11 +0200, Marko Kreen wrote:
  Attached patch changes the default ciphersuite to
  
  HIGH:!aNULL
  
  instead of old
  
  DEFAULT:!LOW:!EXP:!MD5:@STRENGTH
  
  where DEFAULT is a shortcut for ALL:!aNULL:!eNULL.
 
  Main goal is to leave low-level ciphersuite details to OpenSSL guys
  and give clear impression to Postgres admins what it is about.
 
 If we want to leave the details of the ciphers to OpenSSL, I think we
 shouldn't be second-guessing their judgement of what a reasonable
 default is.

Well, we should - the DEFAULT is clearly a client-side default
for compatibility only.  No server should ever run with it.

OTOH, the whole point of HIGH:!aNULL is to leave low-level
ciphersuite details to the OpenSSL guys - to a Postgres admin
it is not clear that

   DEFAULT:!LOW:!EXP:!MD5:@STRENGTH

is actually a good suite.  It contains DEFAULT plus several
fixes, which shows that DEFAULT is not good enough.  Why all that?

An admin would need to do a lot of research to see what it is about.
Another aspect is that HIGH:!aNULL is more futureproof as a default;
the current default has already needed fixes (!LOW:!EXP:!MD5)
and would need more fixes in the future (RC4).

Also note that OpenSSL has only one ordered cipher list - ALL.
All other tokens simply walk that list while keeping the order.
So it's not as if avoiding DEFAULT would somehow lose important
details about the order.

 I checked Apache mod_ssl and Postfix, and even though they are
 configuring this slightly differently, I think their defaults end up
 being about the same as what PostgreSQL currently has.
 
 https://httpd.apache.org/docs/2.4/mod/mod_ssl.html#sslciphersuite
 http://www.postfix.org/postconf.5.html#smtpd_tls_mandatory_ciphers

My proposal is inspired by nginx default:

  http://nginx.org/en/docs/http/ngx_http_ssl_module.html#ssl_ciphers

  HIGH:
Contains only secure and well-researched algorithms.
  
  !aNULL
Needed to disable suites that do not authenticate server.
DEFAULT includes !aNULL by default.
 
 Wouldn't HIGH exclude aNULL also?  (If not, what about eNULL?)

HIGH/MEDIUM/LOW/ALL are only about cipher strength so they don't
exclude aNULL.  HIGH does exclude eNULL because eNULL ciphers
are not strong enough...

  !MD5
This affects only one suite: DES-CBC3-MD5, which is available only
for SSL2 connections.  So it would only pollute the default value.
 
 I think this is only there for political correctness.

But it's noise, so it should be removed.

  @STRENGTH
The OpenSSL cipher list is already sorted by humans;
it's unlikely that a mechanical sort would improve things.
Also, the existence of this value in the old list is rather
dubious, as the server cipher order was never respected anyway.
 
 Aren't you proposing to change that?

No, ALL and subsets of it (HIGH) are already ordered by @STRENGTH.
@STRENGTH as a token is only useful if an admin does complex filtering by
other parameters and then wants to reorder the result by cipher strength.

Eg. another default I've considered is:

  EECDH+HIGH:EDH+HIGH:@STRENGTH

which enforces ephemeral key use.  @STRENGTH is actually useful there,
as without it ECDHE-AES128 would be preferred to DHE-AES256.

But it exposes unnecessary complexity to database admins who
are not particularly familiar with TLS and OpenSSL internals.
So I think HIGH:!aNULL is the better default, as it's simpler.
And ephemeral keys are preferred anyway.
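
The ECDHE-AES128 vs. DHE-AES256 point can be modelled as a stable sort by cipher bits - roughly what @STRENGTH does to an already-ordered list (the suite names and bit counts below are illustrative, not a real OpenSSL list):

```python
# A preference-ordered list where key-exchange choice put a 128-bit
# suite ahead of a 256-bit one; @STRENGTH-style sorting flips them.
suites = [
    ("ECDHE-RSA-AES128-SHA", 128),  # preferred by key-exchange order
    ("DHE-RSA-AES256-SHA", 256),    # stronger cipher, listed later
]
by_strength = sorted(suites, key=lambda s: s[1], reverse=True)
print([name for name, _ in by_strength])
# → ['DHE-RSA-AES256-SHA', 'ECDHE-RSA-AES128-SHA']
```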

-- 
marko


PS. @STRENGTH, and the OpenSSL default order in general, has the problem
that it orders 3DES (168-bit key) before AES128, even though 3DES's
effective strength is only around 112 bits.  So crypto-wise, the
perfect default, while keeping compatibility, would be:

  EECDH+HIGH:EDH+HIGH:@STRENGTH:+3DES

but it's getting messier and messier...

PS2.  And it is more fragile too - an admin could change +3DES to -3DES
and that would be fine, but plain '3DES' would enable an aNULL suite.
So keeping '!aNULL' visible might not be a bad idea.





Re: [HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order

2013-11-29 Thread Marko Kreen
On Fri, Nov 29, 2013 at 05:51:28PM +0200, Heikki Linnakangas wrote:
 On 11/29/2013 05:43 PM, Marko Kreen wrote:
 On Fri, Nov 29, 2013 at 09:25:02AM -0500, Peter Eisentraut wrote:
 On Thu, 2013-11-14 at 11:45 +0100, Magnus Hagander wrote:
 I think the default behaviour should be the one we recommend (which
 would be to have the server one be preferred). But I do agree with the
 requirement to have a GUC to be able to  remove it
 
 Is there a reason why you would want to turn it off?
 
 GUC is there so old behaviour can be restored.
 
 Why would anyone want that, I don't know.  In context of PostgreSQL,
 I see no reason to prefer old behaviour.
 
 Imagine that the server is public, and anyone can connect. The
 server offers SSL protection not to protect the data in the server,
 since that's public anyway, but to protect the communication of the
 client. In that situation, it should be the client's choice what
 encryption to use (if any). This is analogous to using https on a
 public website.
 
 I concur that that's pretty far-fetched. Just changing the behavior,
 with no GUC, is fine by me.

But the client can control that behaviour - it just needs to specify
the suites it wants and drop the rest.

So the only question is whether any client has better (non-tuned?)
defaults than we can set from the server.

Considering that the whole HTTPS world has answered 'no' to that question
and nowadays server-controlled behaviour is preferred, I think it's
safe to change the behaviour in Postgres too.

-- 
marko





Re: [HACKERS] fe-secure.c and SSL/TLS

2013-11-29 Thread Marko Kreen
Reply to mails in pgsql-bugs:

  
http://www.postgresql.org/message-id/CAH8yC8mc_2J2UY0Q42WQdWFyaoqT3onG+83Fr=vn46j5+ml...@mail.gmail.com

and

  
http://www.postgresql.org/message-id/CAH8yC8nZVUyCQznkQd8=ELMM4k_=uxjrjt8yf9v22cy2x_d...@mail.gmail.com


* Default ciphersuite

 I would argue nothing should be left to chance, and the project should
 take control of everything. But I don't really have a dog in the fight ;)

Indeed, on my own servers I've stopped bothering with pattern-based
ciphersuite strings; now I am listing ciphers explicitly.

But the discussion here is about the default suite for admins who don't
know or care about TLS.  Also, it would be good if it did not need
yearly tuning to stay good.


* SSL_get_verify_result

I think Postgres uses SSL_VERIFY_PEER + an SSL_set_verify() callback instead.
At least for me, psql -d "dbname=foo sslmode=verify-ca" fails
when the cert does not match.


* SNI (Server Name Indication extension).

It might be proper to configure it for connections, but it's unlikely
to be useful, as the many-domains-few-IPs-one-port problem really does
not apply to databases.  And from my experience, even the non-SNI
hostname check is underused (or even unusable) in many database
setups.


* TLSv1.2

That's the remaining problem with Postgres SSL - new SHA2/AESGCM
ciphersuites will not be used even when both client and server
support them.  Also CBC-mode fixes in TLSv1.1 will be missed.

It's a client-side OpenSSL problem, caused by the following
line in fe-secure.c:

SSL_context = SSL_CTX_new(TLSv1_method());

It's an ugly problem, because TLSv1_method() actually *should* mean
TLSv1.0 and higher, but due to problems with various broken
SSL implementations, negotiating higher versions is disabled.

I see various ways it can improve:

1) OpenSSL guys change default back to TLSv1.0+.
2) OpenSSL guys give switch to make TLSv1_method() mean TLSv1.0+.
3) libpq starts using TLSv1_2_method() by default.
4) libpq will give switch to users to request TLSv1.2.

I see 1) and 3) as unlikely to happen.  As it's not an urgent problem,
we could watch if 2) happens and go with 4) otherwise.


I tried your suggested OP_ALL patch and it does not work.  It's even
harmful to use, as it disables a few security workarounds; it's mainly
meant for servers, for compatibility with legacy browsers.

I also tried clearing OP_NO_TLSv1_x to see if there are some default
OPs in TLSv1_method() that we could change, but that also did not work.
My conclusion is that currently there is no switch to make TLSv1.0+
work.  (OpenSSL 1.0.1 / 1.1.0-git.)

-- 
marko





Re: [HACKERS] fe-secure.c and SSL/TLS

2013-11-29 Thread Marko Kreen
On Fri, Nov 29, 2013 at 06:01:01PM -0500, Jeffrey Walton wrote:
 I know of no other ways to check the result of OpenSSL's chain
 validation. The open question (for me) is where are
 SSL_get_verify_result/X509_V_OK checked? Neither show up in the
 Postgres sources.

According to the SSL_set_verify manpage, you are perhaps talking about
the SSL_VERIFY_NONE case?  It comes with the suggestion that you should
call SSL_get_verify_result if you want to know whether the cert was valid.

But if SSL_VERIFY_PEER is used, this is not needed.

  3) libpq starts using TLSv1_2_method() by default.
  4) libpq will give switch to users to request TLSv1.2.
 This might have negative effects on non-TLSv1.2 clients. For example,
 an Android 2.3 device can only do TLSv1.0 (IIRC). I think there's a
 similar limitation on a lot of Windows XP clients (depending on the IE
 version and SChannel version). And OpenSSL-based clients prior to
 1.0.0h (released 14 Mar 2012) will have trouble (if I am reading the
 change log correctly).

Note we are talking about client-side settings here.  So the negative
effect would be that clients with TLSv1.2+ libpq cannot connect to
old servers.

 I believe the standard way of achieving TLS1.0 and above is to use
 the SSLv23_client_method() and then remove the SSL protocols with
 SSL_OP_NO_SSLv2 and SSL_OP_NO_SSLv3. I have to use handwaiving around
 standard because I don't believe its documented anywhere (one of the
 devs told me its the standard way to do it.).

Indeed - the Python ssl module seems to achieve TLSv1.1, and it uses
SSLv23_method().  But still no TLSv1.2.

I'll play with it a bit to see whether it can have any negative effects.

-- 
marko





Re: [HACKERS] PL/Python: domain over array support

2013-11-27 Thread Marko Kreen
On Tue, Nov 26, 2013 at 07:12:00PM -0200, Rodolfo Campero wrote:
 2013/11/26 Heikki Linnakangas hlinnakan...@vmware.com
  Oops, sorry about that. Fixed.
 
 Maybe be you forgot to modify
 plpython_types_3.out

Yes.  Heikki, please fix plpython_types_3.out too.

See attached patch.

-- 
marko

diff --git a/src/pl/plpython/expected/plpython_types_3.out b/src/pl/plpython/expected/plpython_types_3.out
index 25331f2..e104356 100644
--- a/src/pl/plpython/expected/plpython_types_3.out
+++ b/src/pl/plpython/expected/plpython_types_3.out
@@ -664,6 +664,9 @@ SELECT * FROM test_type_conversion_array_error();
 ERROR:  return value of function with array return type is not a Python sequence
 CONTEXT:  while creating return value
 PL/Python function test_type_conversion_array_error
+--
+-- Domains over arrays
+--
 CREATE DOMAIN ordered_pair_domain AS integer[] CHECK (array_length(VALUE,1)=2 AND VALUE[1] < VALUE[2]);
 CREATE FUNCTION test_type_conversion_array_domain(x ordered_pair_domain) RETURNS ordered_pair_domain AS $$
 plpy.info(x, type(x))



Re: [HACKERS] PL/Python: domain over array support

2013-11-26 Thread Marko Kreen
On Tue, Nov 26, 2013 at 12:23:48AM +0200, Heikki Linnakangas wrote:
 The new behavior is clearly better, but it is an incompatibility
 nonetheless. I don't do anything with PL/python myself, so I don't
 have a good feel of how much that'll break people's applications.
 Probably not much I guess. But warrants a mention in the release
 notes at least. Any thoughts on that?

Yes, it warrants a mention, but nothing excessive; in 9.0 the non-domain
arrays had the same incompatible change with only one sentence in the notes.

-- 
marko





Re: [HACKERS] PL/Python: domain over array support

2013-11-24 Thread Marko Kreen
On Sat, Nov 23, 2013 at 11:09:53AM -0200, Rodolfo Campero wrote:
 2013/11/22 Marko Kreen mark...@gmail.com
  One more thing - please update Python 3 regtests too.
 
 The attached patch (version 3) includes the expected results for Python 3
 (file plpython_types_3.out).

Thanks.  Looks good now.

Tagging as ready for committer.

-- 
marko





Re: [HACKERS] PL/Python: domain over array support

2013-11-22 Thread Marko Kreen
On Sat, Oct 26, 2013 at 11:17:19AM -0200, Rodolfo Campero wrote:
 The attached patch adds support for domains over arrays to PL/Python (eg:
 CREATE DOMAIN my_domain AS integer[]).
 
 Basically it just uses get_base_element_type instead of get_element_type
 in plpy_typeio.c, and uses domain_check before returning a sequence as
 array in PLySequence_ToArray whenever appropriate.

Generally looks fine.  Please lose the C++ comments though; that style
is not used in the Postgres sources.

 There's one line I'm not sure about; I modified a switch statement (line
 427):
 switch (element_type ? element_type : getBaseType(arg->typoid))
 The rationale is that when element_type is set, it is already a base type,
 because there is no support for arrays of domains in PostgreSQL, but this
 may not hold true in the future.

Was there any actual need to modify that?  Or was it just a performance
optimization?  ATM it creates asymmetry between PLy_output_datum_func2
and PLy_input_datum_func2.

If it's just a performance optimization, then it should be done in both
functions, but it seems a bad idea to do that in this patch.  So I think
it's better to leave it out.

-- 
marko





Re: [HACKERS] PL/Python: domain over array support

2013-11-22 Thread Marko Kreen
On Fri, Nov 22, 2013 at 08:45:56AM -0200, Rodolfo Campero wrote:
 There are other cosmetic changes in this patch, wrt previous version (not
 preexistent code):
  * adjusted alignment of variable name rv in line 12
  * reworded comment in line 850, resulting in more than 80 characters, so I
 splitted the comment into a multiline comment following the surrounding
 style.

Good.

One more thing - please update Python 3 regtests too.

-- 
marko





Re: [HACKERS] appendPQExpBufferVA vs appendStringInfoVA

2013-11-18 Thread Marko Kreen
On Mon, Nov 18, 2013 at 06:18:01PM +1300, David Rowley wrote:
 On Mon, Nov 18, 2013 at 1:01 AM, Marko Kreen mark...@gmail.com wrote:
  I am a bit suspicious of the performance impact of this patch, but think
  that it's still worthwhile, as it cleans up code where a single
  string argument is given to a printf-style function without %s.
 
 
 This thread probably did not explain very well the point of this patch.
 All this kicked up from an earlier patch which added alignment support in
 the log_line_prefix GUC. After some benchmarks were done on the proposed
 patch for that, it was discovered that replacing appendStringInfoString
 with appendStringInfo gave a big enough slowdown to matter in high-volume
 logging scenarios. That patch was revised and appendStringInfo() was only
 used where really needed, and performance increased again.
 
 I then ran a few benchmarks seen here:
 http://www.postgresql.org/message-id/caaphdvp2ulkpuhjnckonbgg2vxpvxolopzhrgxbs-m0r0v4...@mail.gmail.com
 
 To compare appendStringInfo(si, "%s", str); with appendStringInfoString(a,
 str); and appendStringInfo(si, str);
 
 The conclusion to those benchmarks were that appendStringInfoString() was
 the best function to use when no formatting was required, so I submitted a
 patch which replaced appendStringInfo() with appendStringInfoString() where
 that was possible and that was accepted.
 
 appendPQExpBuffer() and appendPQExpBufferStr() are the frontend versions of
 appendStringInfo, so I spent an hour or so replacing these calls too, as it
 should show a similar speedup.  Though in this case the performance is less
 critical, my thinking was more along the lines of: it increases
 performance a little bit with a total of zero increase in code complexity.

I was actually praising the patch for reducing complexity,
so it's worth applying even if there is no performance win.

With performance win, it's doubleplus good.

The patch applies and regtests work fine.  So I mark it as
ready for committer.

 The findings in the benchmarks in the link above also showed that we might
 want to look into turning appendStringInfoString into a macro
 around appendBinaryStringInfo() to allow us to skip the strlen() call for
 string constants... It's unclear at the moment if this would be a good idea
 or much of an improvement, so it was left for something to think about for
 the future.

In any case it should be a separate patch, perhaps applying the same
optimization to all such functions.

-- 
marko





Re: [HACKERS] appendPQExpBufferVA vs appendStringInfoVA

2013-11-17 Thread Marko Kreen
On Thu, Nov 14, 2013 at 09:33:59PM +1300, David Rowley wrote:
 On Sun, Nov 3, 2013 at 3:18 AM, David Rowley dgrowle...@gmail.com wrote:
  I'm low on ideas on how to improve things much around here for now, but
  for what it's worth, I did create a patch which changes unnecessary calls
  to appendPQExpBuffer() into calls to appendPQExpBufferStr() similar to the
  recent one for appendStringInfo and appendStringInfoString.
 
 Attached is a re-based version of this.

It does not apply anymore, could you resend it?

I am a bit suspicious of the performance impact of this patch, but think
that it's still worthwhile, as it cleans up the style where a single
string argument is given to a printf-style function without %s.

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
On Fri, Nov 15, 2013 at 02:16:52PM -0800, Adrian Klaver wrote:
 On 11/15/2013 11:49 AM, Marko Kreen wrote:
 On Fri, Nov 15, 2013 at 11:16:25AM -0800, Adrian Klaver wrote:
 The description of the GUCs shows up in the documentation, but I am
 not seeing the GUCs themselves in postgresql.conf, so I could test
 no further. It is entirely possible I am missing a step and would
 appreciate enlightenment.
 
 Sorry, I forgot to update sample config.
 
 ssl-prefer-server-cipher-order-v2.patch
 - Add GUC to sample config
 - Change default value to 'true', per comments from Alvaro and Magnus.
 
 ssl-ecdh-v2.patch
 - Add GUC to sample config
 
 
 Well that worked.
 I made ssl connections to the server using psql and verified it
 respected the order of ssl_ciphers. I do not have a client available
 with a different view of cipher order so I cannot test that.

Well, these are GUC patches so the thing to test is whether the GUCs work.

ssl-prefer-server-cipher-order:
  Use a non-standard cipher order in the server, e.g. RC4-SHA:DHE-RSA-AES128-SHA,
  and see if on/off works.  You can see the OpenSSL default order with
  openssl ciphers -v.

ssl-ecdh: 
  It should start using ECDHE-RSA immediately.  Also see whether adding
  !ECDH to the ciphers falls back to DHE.  It's kind of hard to test
  ssl_ecdh_curve as you can't see it anywhere.  I tested it by
  measuring whether a bigger curve slowed down connecting...

  Bonus - test EC keys:
$ openssl ecparam -name prime256v1 -out ecparam.pem
$ openssl req -x509 -newkey ec:ecparam.pem -days 9000 -nodes \
  -subj '/C=US/ST=Somewhere/L=Test/CN=localhost' \
  -keyout server.key -out server.crt

ssl-better-default:
  SSL should stay working, openssl ciphers -v 'value' should not contain
  any weak suites (RC4, SEED, DES-CBC, EXP, NULL) and no non-authenticated
  suites (ADH/AECDH).

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
Thanks for testing!

On Sat, Nov 16, 2013 at 12:17:40PM -0800, Adrian Klaver wrote:
 On 11/16/2013 06:24 AM, Marko Kreen wrote:
 ssl-better-default:
SSL should stay working, openssl ciphers -v 'value' should not contain
any weak suites (RC4, SEED, DES-CBC, EXP, NULL) and no non-authenticated
suites (ADH/AECDH).
 
 Not sure about the above; if it is a GUC I can't find it. If it is
 something else then I will have to plead ignorance.

The patch just changes the default value for 'ssl_ciphers' GUC.

The question is if the value works at all, and is good.

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
On Sat, Nov 16, 2013 at 01:03:05PM -0800, Adrian Klaver wrote:
 On 11/16/2013 12:37 PM, Marko Kreen wrote:
 Thanks for testing!
 
 On Sat, Nov 16, 2013 at 12:17:40PM -0800, Adrian Klaver wrote:
 On 11/16/2013 06:24 AM, Marko Kreen wrote:
 ssl-better-default:
SSL should stay working, openssl ciphers -v 'value' should not contain
any weak suites (RC4, SEED, DES-CBC, EXP, NULL) and no non-authenticated
suites (ADH/AECDH).
 
  Not sure about the above; if it is a GUC I can't find it. If it is
  something else then I will have to plead ignorance.
 
 The patch just changes the default value for 'ssl_ciphers' GUC.
 
 I am still not sure what patch you are talking about. The two
 patches I saw where for server_prefer and ECDH key exchange.
 
 
 The question is if the value works at all, and is good.
 
 
 What value would we be talking about?

Ah, sorry.  It's this one:

   https://commitfest.postgresql.org/action/patch_view?id=1310

 Note: I have been working through a head cold and thought processes
 are sluggish, handle accordingly:)

Get better soon!  :)

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
On Sat, Nov 16, 2013 at 02:07:57PM -0800, Adrian Klaver wrote:
 On 11/16/2013 01:13 PM, Marko Kreen wrote:
 https://commitfest.postgresql.org/action/patch_view?id=1310
 
 Got it, applied it.
 
 Results:
 
 openssl ciphers  -v  'HIGH:!aNULL'|egrep
 '(RC4|SEED|DES-CBC|EXP|NULL|ADH|AECDH)'
 
 ECDHE-RSA-DES-CBC3-SHA  SSLv3 Kx=ECDH Au=RSA  Enc=3DES(168) Mac=SHA1
 ECDHE-ECDSA-DES-CBC3-SHA SSLv3 Kx=ECDH Au=ECDSA Enc=3DES(168) Mac=SHA1
 EDH-RSA-DES-CBC3-SHA    SSLv3 Kx=DH   Au=RSA  Enc=3DES(168) Mac=SHA1
 EDH-DSS-DES-CBC3-SHA    SSLv3 Kx=DH   Au=DSS  Enc=3DES(168) Mac=SHA1
 ECDH-RSA-DES-CBC3-SHA   SSLv3 Kx=ECDH/RSA Au=ECDH Enc=3DES(168) Mac=SHA1
 ECDH-ECDSA-DES-CBC3-SHA SSLv3 Kx=ECDH/ECDSA Au=ECDH Enc=3DES(168) Mac=SHA1
 DES-CBC3-SHA            SSLv3 Kx=RSA  Au=RSA  Enc=3DES(168) Mac=SHA1
 DES-CBC3-MD5            SSLv2 Kx=RSA  Au=RSA  Enc=3DES(168) Mac=MD5

DES-CBC3 is 3DES, which is fine.  Plain DES-CBC would be bad.

If you don't see any other issues perhaps they are ready for committer?

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
On Sat, Nov 16, 2013 at 02:54:22PM -0800, Adrian Klaver wrote:
 On 11/16/2013 02:41 PM, Marko Kreen wrote:
 If you don't see any other issues perhaps they are ready for committer?
 
 I do not have any other questions/issues at this point. I am new to
 the review process, so I am not quite sure how it proceeds from
 here.

Simple - just click edit patch on the commitfest page and change the
patch status to ready for committer.  Then committers will know
that they should look at the patch.

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-16 Thread Marko Kreen
On Sat, Nov 16, 2013 at 03:21:19PM -0800, Adrian Klaver wrote:
 On 11/16/2013 03:09 PM, Marko Kreen wrote:
 On Sat, Nov 16, 2013 at 02:54:22PM -0800, Adrian Klaver wrote:
 On 11/16/2013 02:41 PM, Marko Kreen wrote:
 If you don't see any other issues perhaps they are ready for committer?
 
 I do not have any other questions/issues at this point. I am new to
 the review process, so I am not quite sure how it proceeds from
 here.
 
 Simple - just click on edit patch on commitfest page and change
 patch status to ready for committer.  Then committers will know
 that they should look at the patch.
 
 
 Done for both:
 
 SSL: better default ciphersuite
 SSL: prefer server cipher order

I think you already handled the ECDH one too:

  https://commitfest.postgresql.org/action/patch_view?id=1286

 Thanks for helping me through the process.

Thanks for reviewing.

-- 
marko





Re: [HACKERS] Review:Patch: SSL: prefer server cipher order

2013-11-15 Thread Marko Kreen
On Fri, Nov 15, 2013 at 11:16:25AM -0800, Adrian Klaver wrote:
 The description of the GUCs shows up in the documentation, but I am
 not seeing the GUCs themselves in postgresql.conf, so I could test
 no further. It is entirely possible I am missing a step and would
 appreciate enlightenment.

Sorry, I forgot to update sample config.

ssl-prefer-server-cipher-order-v2.patch
- Add GUC to sample config
- Change default value to 'true', per comments from Alvaro and Magnus.

ssl-ecdh-v2.patch
- Add GUC to sample config

-- 
marko

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..56bfa01 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -883,6 +883,18 @@ include 'filename'
   /listitem
  /varlistentry
 
+ varlistentry id=guc-ssl-prefer-server-ciphers xreflabel=ssl_prefer_server_ciphers
+  termvarnamessl_prefer_server_ciphers/varname (typebool/type)/term
+  indexterm
+   primaryvarnamessl_prefer_server_ciphers/ configuration parameter/primary
+  /indexterm
+  listitem
+   para
+                Specifies whether to prefer the client or server ciphersuite order.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-password-encryption xreflabel=password_encryption
   termvarnamepassword_encryption/varname (typeboolean/type)/term
   indexterm
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 7f01a78..2094674 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -112,6 +112,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable: if false, prefer client ciphers */
+bool	   SSLPreferServerCiphers;
+
 /*  */
 /*		 Hardcoded values		*/
 /*  */
@@ -845,6 +848,10 @@ initialize_SSL(void)
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
 
+	/* Let server choose order */
+	if (SSLPreferServerCiphers)
+		SSL_CTX_set_options(SSL_context, SSL_OP_CIPHER_SERVER_PREFERENCE);
+
 	/*
 	 * Load CA store, so we can verify client certificates if needed.
 	 */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 54d8078..0ec5ddf 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -127,6 +127,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 extern char *SSLCipherSuites;
+extern bool SSLPreferServerCiphers;
 
 #ifdef TRACE_SORT
 extern bool trace_sort;
@@ -801,6 +802,15 @@ static struct config_bool ConfigureNamesBool[] =
 		check_ssl, NULL, NULL
 	},
 	{
+		{"ssl_prefer_server_ciphers", PGC_POSTMASTER, CONN_AUTH_SECURITY,
+			gettext_noop("Give priority to server ciphersuite order."),
+			NULL
+		},
+		&SSLPreferServerCiphers,
+		true,
+		NULL, NULL, NULL
+	},
+	{
 		{fsync, PGC_SIGHUP, WAL_SETTINGS,
 			gettext_noop(Forces synchronization of updates to disk.),
 			gettext_noop(The server will use the fsync() system call in several places to make 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34a2d05..a190e5f 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -86,6 +86,7 @@
 #ssl_key_file = 'server.key'		# (change requires restart)
 #ssl_ca_file = ''			# (change requires restart)
 #ssl_crl_file = ''			# (change requires restart)
+#ssl_prefer_server_ciphers = on		# (change requires restart)
 #password_encryption = on
 #db_user_namespace = off
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 56bfa01..3785052 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -895,6 +895,19 @@ include 'filename'
   /listitem
  /varlistentry
 
+ varlistentry id=guc-ssl-ecdh-curve xreflabel=ssl_ecdh_curve
+  termvarnamessl_ecdh_curve/varname (typestring/type)/term
+  indexterm
+   primaryvarnamessl_ecdh_curve/ configuration parameter/primary
+  /indexterm
+  listitem
+   para
+                Specifies the name of the EC curve which will be used in ECDH key exchanges.
+                Default is literalprime256v1/.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-password-encryption xreflabel=password_encryption
   termvarnamepassword_encryption/varname (typeboolean/type)/term
   indexterm
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 2094674..8d688f2 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -69,6 +69,9 @@
 #if SSLEAY_VERSION_NUMBER >= 0x0907000L
 #include <openssl/conf.h>
 #endif
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+#include <openssl/ec.h>
+#endif
 

[HACKERS] SSL: better default ciphersuite

2013-11-14 Thread Marko Kreen
Attached patch changes the default ciphersuite to

HIGH:!aNULL

instead of old

DEFAULT:!LOW:!EXP:!MD5:@STRENGTH

where DEFAULT is a shortcut for ALL:!aNULL:!eNULL.


The main goal is to leave low-level ciphersuite details to the OpenSSL
folks and to give Postgres admins a clear impression of what it is about.

Compared to the old value, the new value removes all suites with RC4 and SEED
from the ciphersuite list.  If OpenSSL is compiled with support for SSL2,
it will include the following suite: DES-CBC3-MD5, usable only for SSL2
connections.

Tested with OpenSSL 0.9.7 - 1.0.1, using the openssl ciphers -v ... command.


Values used
---

HIGH:
  Contains only secure and well-researched algorithms.

!aNULL
  Needed to disable suites that do not authenticate the server.
  DEFAULT includes !aNULL by default.


Values not used
---

!MD5
  This affects only one suite: DES-CBC3-MD5, which is available only
  for SSL2 connections.  So it would only pollute the default value.

@STRENGTH
  The OpenSSL cipher list is already sorted by humans;
  it's unlikely that a mechanical sort would improve things.
  Also, the presence of this value in the old list is rather
  dubious, as server cipher order was never respected anyway.

-- 
marko

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ffc69c7..d4e6c52 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -3144,7 +3144,7 @@ static struct config_string ConfigureNamesString[] =
 		},
 		&SSLCipherSuites,
 #ifdef USE_SSL
-		"DEFAULT:!LOW:!EXP:!MD5:@STRENGTH",
+		"HIGH:!aNULL",
 #else
 		"none",
 #endif
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 34a2d05..e6b7f9a 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -79,7 +79,7 @@
 
 #authentication_timeout = 1min		# 1s-600s
 #ssl = off# (change requires restart)
-#ssl_ciphers = 'DEFAULT:!LOW:!EXP:!MD5:@STRENGTH'	# allowed SSL ciphers
+#ssl_ciphers = 'HIGH:!aNULL'		# allowed SSL ciphers
 	# (change requires restart)
 #ssl_renegotiation_limit = 512MB	# amount of data between renegotiations
 #ssl_cert_file = 'server.crt'		# (change requires restart)



[HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order

2013-11-06 Thread Marko Kreen

By default OpenSSL (and SSL/TLS in general) lets client cipher
order take priority.  This is OK for browsers where the ciphers
were tuned, but few Postgres client libraries make cipher order
configurable.  So it makes sense to make cipher order in
postgresql.conf take priority over client defaults.

This patch adds setting 'ssl_prefer_server_ciphers' which can be
turned on so that server cipher order is preferred.

The setting SSL_OP_CIPHER_SERVER_PREFERENCE appeared in
OpenSSL 0.9.7 (31 Dec 2002), not sure if #ifdef is required
for conditional compilation.
---
 doc/src/sgml/config.sgml  | 12 
 src/backend/libpq/be-secure.c |  7 +++
 src/backend/utils/misc/guc.c  | 10 ++
 3 files changed, 29 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 77a9303..56bfa01 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -883,6 +883,18 @@ include 'filename'
   /listitem
  /varlistentry
 
+ varlistentry id=guc-ssl-prefer-server-ciphers xreflabel=ssl_prefer_server_ciphers
+  termvarnamessl_prefer_server_ciphers/varname (typebool/type)/term
+  indexterm
+   primaryvarnamessl_prefer_server_ciphers/ configuration parameter/primary
+  /indexterm
+  listitem
+   para
+                Specifies whether to prefer the client or server ciphersuite order.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-password-encryption xreflabel=password_encryption
   termvarnamepassword_encryption/varname (typeboolean/type)/term
   indexterm
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 7f01a78..2094674 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -112,6 +112,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable: if false, prefer client ciphers */
+bool	   SSLPreferServerCiphers;
+
 /*  */
 /*		 Hardcoded values		*/
 /*  */
@@ -845,6 +848,10 @@ initialize_SSL(void)
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
 
+	/* Let server choose order */
+	if (SSLPreferServerCiphers)
+		SSL_CTX_set_options(SSL_context, SSL_OP_CIPHER_SERVER_PREFERENCE);
+
 	/*
 	 * Load CA store, so we can verify client certificates if needed.
 	 */
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 538d027..7f1771a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -127,6 +127,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 extern char *SSLCipherSuites;
+extern bool SSLPreferServerCiphers;
 
 #ifdef TRACE_SORT
 extern bool trace_sort;
@@ -801,6 +802,15 @@ static struct config_bool ConfigureNamesBool[] =
 		check_ssl, NULL, NULL
 	},
 	{
+		{"ssl_prefer_server_ciphers", PGC_POSTMASTER, CONN_AUTH_SECURITY,
+			gettext_noop("Give priority to server ciphersuite order."),
+			NULL
+		},
+		&SSLPreferServerCiphers,
+		false,
+		NULL, NULL, NULL
+	},
+	{
 		{fsync, PGC_SIGHUP, WAL_SETTINGS,
 			gettext_noop(Forces synchronization of updates to disk.),
 			gettext_noop(The server will use the fsync() system call in several places to make 



[HACKERS] [PATCH 2/2] SSL: Support ECDH key excange.

2013-11-06 Thread Marko Kreen

This sets up ECDH key exchange, when compiling against OpenSSL
that supports EC.  Then ECDHE-RSA and ECDHE-ECDSA ciphersuites
can be used for SSL connections.  Latter one means that EC keys
are now usable.

The reason for EC key exchange is that it's faster than DHE
and it allows going to higher security levels where RSA would
be horribly slow.

A quick test with a single-threaded client connecting repeatedly
to a server on the same machine, then closing the connection.
Measured is connections per second.

  Key         DHE     ECDHE
  RSA-1024    177.5   278.1   (x 1.56)
  RSA-2048    140.5   191.1   (x 1.36)
  RSA-4096    59.5    67.3    (x 1.13)
  ECDSA-256           280.7   (~ RSA-3072)
  ECDSA-384           128.9   (~ RSA-7680)

There is also a new GUC option - ssl_ecdh_curve - that specifies
the curve name used for ECDH.  It defaults to prime256v1, which
is the most common curve in use in HTTPS.  According to NIST it
should be security-wise similar to ~3072-bit RSA/DH.
(http://www.keylength.com / NIST Recommendations).

Other commonly implemented curves are secp384r1 and secp521r1
(OpenSSL names).  The rest are not recommended, as EC curves
need to be exchanged by name and must be explicitly
supported by both client and server.  TLS does have free-form
curve exchange, but few client libraries implement it;
at least OpenSSL does not.

Full list can be seen with openssl ecparam -list_curves.

It does not tune the ECDH curve to the key size automatically,
like DHE does.  The reason is the curve naming situation.
---
 doc/src/sgml/config.sgml  | 13 +
 src/backend/libpq/be-secure.c | 32 
 src/backend/utils/misc/guc.c  | 16 
 3 files changed, 61 insertions(+)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 56bfa01..3785052 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -895,6 +895,19 @@ include 'filename'
   /listitem
  /varlistentry
 
+ varlistentry id=guc-ssl-ecdh-curve xreflabel=ssl_ecdh_curve
+  termvarnamessl_ecdh_curve/varname (typestring/type)/term
+  indexterm
+   primaryvarnamessl_ecdh_curve/ configuration parameter/primary
+  /indexterm
+  listitem
+   para
+                Specifies the name of the EC curve which will be used in ECDH key exchanges.
+                Default is literalprime256v1/.
+   /para
+  /listitem
+ /varlistentry
+
  varlistentry id=guc-password-encryption xreflabel=password_encryption
   termvarnamepassword_encryption/varname (typeboolean/type)/term
   indexterm
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 2094674..8d688f2 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -69,6 +69,9 @@
 #if SSLEAY_VERSION_NUMBER >= 0x0907000L
 #include <openssl/conf.h>
 #endif
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+#include <openssl/ec.h>
+#endif
 #endif   /* USE_SSL */
 
 #include "libpq/libpq.h"
@@ -112,6 +115,9 @@ static bool ssl_loaded_verify_locations = false;
 /* GUC variable controlling SSL cipher list */
 char	   *SSLCipherSuites = NULL;
 
+/* GUC variable for default ECHD curve. */
+char	   *SSLECDHCurve;
+
 /* GUC variable: if false, prefer client ciphers */
 bool	   SSLPreferServerCiphers;
 
@@ -765,6 +771,29 @@ info_cb(const SSL *ssl, int type, int args)
 	}
 }
 
+#if (OPENSSL_VERSION_NUMBER >= 0x0090800fL) && !defined(OPENSSL_NO_ECDH)
+static void
+initialize_ecdh(void)
+{
+	EC_KEY *ecdh;
+	int nid;
+
+	nid = OBJ_sn2nid(SSLECDHCurve);
+	if (!nid)
+		elog(FATAL, "ECDH: curve not known: %s", SSLECDHCurve);
+
+	ecdh = EC_KEY_new_by_curve_name(nid);
+	if (!ecdh)
+		elog(FATAL, "ECDH: failed to allocate key");
+
+	SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_ECDH_USE);
+	SSL_CTX_set_tmp_ecdh(SSL_context, ecdh);
+	EC_KEY_free(ecdh);
+}
+#else
+#define initialize_ecdh()
+#endif
+
 /*
  *	Initialize global SSL context.
  */
@@ -844,6 +873,9 @@ initialize_SSL(void)
 	SSL_CTX_set_tmp_dh_callback(SSL_context, tmp_dh_cb);
 	SSL_CTX_set_options(SSL_context, SSL_OP_SINGLE_DH_USE | SSL_OP_NO_SSLv2);
 
+	/* set up ephemeral ECDH keys */
+	initialize_ecdh();
+
 	/* set up the allowed cipher list */
 	if (SSL_CTX_set_cipher_list(SSL_context, SSLCipherSuites) != 1)
 		elog(FATAL, "could not set the cipher list (no valid ciphers available)");
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 7f1771a..defd44a 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -127,6 +127,7 @@ extern char *temp_tablespaces;
 extern bool ignore_checksum_failure;
 extern bool synchronize_seqscans;
 extern char *SSLCipherSuites;
+extern char *SSLECDHCurve;
 extern bool SSLPreferServerCiphers;
 
 #ifdef TRACE_SORT
@@ -3151,6 +3152,21 @@ static struct config_string ConfigureNamesString[] =
 	},
 
 	{
+		{"ssl_ecdh_curve", PGC_POSTMASTER, CONN_AUTH_SECURITY,
+			gettext_noop("Sets the EC curve used for ECDH."),
+			NULL,
+			

Re: [HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order

2013-11-06 Thread Marko Kreen
On Wed, Nov 06, 2013 at 09:57:32PM -0300, Alvaro Herrera wrote:
 Marko Kreen wrote:
 
  By default OpenSSL (and SSL/TLS in general) lets client cipher
  order take priority.  This is OK for browsers where the ciphers
  were tuned, but few Postgres client libraries make cipher order
  configurable.  So it makes sense to make cipher order in
  postgresql.conf take priority over client defaults.
  
  This patch adds setting 'ssl_prefer_server_ciphers' which can be
  turned on so that server cipher order is preferred.
 
 Wouldn't it make more sense to have this enabled by default?

Well, yes.  :)

I would even drop the GUC setting, but hypothetically there could
be some sort of backwards compatibility concern, so I added it
to the patch and kept the old default.  But if no one has a strong
need for it, the setting can be removed.

-- 
marko





Re: [HACKERS] preserving forensic information when we freeze

2013-07-03 Thread Marko Kreen
On Wed, Jul 3, 2013 at 8:07 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-07-02 16:29:56 -0400, Robert Haas wrote:
 There's no possibility of getting confused here; if X is still around
 at all, its xmax is of the same generation as Y's xmin.  Otherwise,
 we've had an undetected XID wraparound.

 Another issue I thought about is what we will return for SELECT xmin
 FROM blarg; Some people use that in their applications (IIRC
 skytools/pqg/londiste does so) and they might get confused if we
 suddenly return xids from the future. On the other hand, not returning
 the original xid would be a shame as well...

Skytools/pqg/londiste use only the txid_* APIs; they don't look at
the 4-byte xids on table rows.

-- 
marko




Re: [HACKERS] MD5 aggregate

2013-06-27 Thread Marko Kreen
On Thu, Jun 27, 2013 at 11:28 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 26 June 2013 21:46, Peter Eisentraut pete...@gmx.net wrote:
 On 6/26/13 4:04 PM, Dean Rasheed wrote:
 A quick google search reveals several people asking for something like
 this, and people recommending md5(string_agg(...)) or
 md5(string_agg(md5(...))) based solutions, which are doomed to failure
 on larger tables.

 The thread discussed several other options of checksumming tables that
 did not have the air of a crytographic offering, as Noah put it.


 True but md5 has the advantage of being directly comparable with the
 output of Unix md5sum, which would be useful if you loaded data from
 external files and wanted to confirm that your import process didn't
 mangle it.

The problem with md5_agg() is that it's only useful in toy scenarios.

It's more useful to give people a script that does the same sum(hash(row))
on the dump file than to try to run MD5 over ordered rows.

Also, I don't think anybody actually cares about MD5(table-as-bytes); instead,
people want a way to check whether two tables, or a table and a dump, are the same.

-- 
marko




Re: [HACKERS] MD5 aggregate

2013-06-17 Thread Marko Kreen
On Mon, Jun 17, 2013 at 11:34:52AM +0100, Dean Rasheed wrote:
 On 15 June 2013 10:22, Dean Rasheed dean.a.rash...@gmail.com wrote:
  There seem to be 2 separate directions that this could go, which
  really meet different requirements:
 
  1). Produce an unordered sum for SQL to compare 2 tables regardless of
  the order in which they are scanned. A possible approach to this might
  be something like an aggregate
 
  md5_total(text/bytea) returns text
 
  that returns the sum of the md5 values of each input value, treating
  each md5 value as an unsigned 128-bit integer, and then producing the
  hexadecimal representation of the final sum. This should out-perform a
  solution based on numeric addition, and in typical cases, the result
  wouldn't be much longer than a regular md5 sum, and so would be easy
  to eyeball for differences.
 
 
 I've been playing around with the idea of an aggregate that computes
 the sum of the md5 hashes of each of its inputs, which I've called
 md5_total() for now, although I'm not particularly wedded to that
 name. Comparing it with md5_agg() on a 100M row table (see attached
 test script) produces interesting results:
 
 SELECT md5_agg(foo.*::text)
   FROM (SELECT * FROM foo ORDER BY id) foo;
 
  50bc42127fb9b028c9708248f835ed8f
 
 Time: 92960.021 ms
 
 SELECT md5_total(foo.*::text) FROM foo;
 
  02faea7fafee4d253fc94cfae031afc43c03479c
 
 Time: 96190.343 ms
 
 Unlike md5_agg(), it is no longer a true MD5 sum (for one thing, its
 result is longer) but it seems like it would be very useful for
 quickly comparing data in SQL, since its value is not dependent on the
 row-order making it easier to use and better performing if there is no
 usable index for ordering.
 
 Note, however, that if there is an index that can be used for
 ordering, the performance is not necessarily better than md5_agg(), as
 this example shows. There is a small additional overhead per row for
 initialising the MD5 sums, and adding the results to the total, but I
 think the biggest factor is that md5_total() is processing more data.
 The reason is that MD5 works on 64-byte blocks, so the total amount of
 data going through the core MD5 algorithm is each row's size is
 rounded up to a multiple of 64. In this simple case it ends up
 processing around 1.5 times as much data:
 
 SELECT sum(length(foo.*::text)) AS md5_agg,
sum(((length(foo.*::text)+63)/64)*64) AS md5_total FROM foo;
 
   md5_agg   |  md5_total
 +-
  8103815438 | 12799909248
 
 although of course that overhead won't be as large on wider tables,
 and even in this case the overall performance is still on a par with
 md5_agg().
 
 ISTM that both aggregates are potentially useful in different
 situations. I would probably typically use md5_total() because of its
 simplicity/order-independence and consistent performance, but
 md5_agg() might also be useful when comparing with external data.

A few notes:

- An index scan over the whole table is *very* bad for larger tables (a few
  times bigger than available RAM).  If you want to promote such use, you
  should also warn against using it on a loaded server.

- It's pointless to worry about overflow on 128-bit ints.  Just let it
  happen.  Adding complexity for that does not bring any advantage.

- Using some faster 128-bit hash may be useful - check out CityHash
  or SpookyHash.  You can get a C implementation from pghashlib.

-- 
marko





Re: [HACKERS] Add regression tests for DISCARD

2013-06-17 Thread Marko Kreen
On Mon, May 13, 2013 at 2:58 AM, Robins Tharakan thara...@gmail.com wrote:
 Please find attached a patch that adds basic regression tests for DISCARD
 command.

 Any and all feedback is obviously welcome.

Perhaps existing tests in guc.sql should be merged into it?

-- 
marko




Re: [HACKERS] Processing long AND/OR lists

2013-06-16 Thread Marko Kreen
On Mon, May 27, 2013 at 5:59 AM, Christopher Browne cbbro...@gmail.com wrote:
 This situation falls from a problem that we noticed a mighty long time ago
 in Slony, where the set of XIDs outstanding gets very large, and, attendant
 to that, the set of action id values by which tuples are being filtered,
 gets correspondingly large.

 It happens when there is a long pause in application of replication data,
 and is commonly the consequence of setting up replication on a very large
 data table that takes a long time for the initial data copy.

 At the time, Neil Conway observed this query breakage with a query that was
 roughly 640K in size, from whence fell jokes to the effect, who would ever
 need a query larger than 640K?

 The resolution that I introduced at the time was to write a little parser
 that would recognize sequences of adjacent values and merge them into
 BETWEEN A and B clauses, which would bring the query size back to a
 reasonable size.

PgQ uses a simpler optimization to keep the IN list size down -
it aggressively enlarges the main xid range and later processes
rows with txid_is_visible_in_snapshot():

  
https://github.com/markokr/skytools/blob/master/sql/pgq/functions/pgq.batch_event_sql.sql

IOW - it assumes the open-xid distribution is not uniformly random.
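For readers unfamiliar with snapshot visibility, here is a rough sketch of the semantics behind txid_visible_in_snapshot() - simplified, since the real function also consults commit status:

```python
def visible_in_snapshot(xid: int, xmin: int, xmax: int, xip: set) -> bool:
    # A transaction is visible if it completed before the snapshot's xmin;
    # it is invisible if it is at or past xmax, or still listed as
    # in-progress in the snapshot's xip set.
    if xid < xmin:
        return True
    if xid >= xmax:
        return False
    return xid not in xip

print(visible_in_snapshot(90, 100, 200, {120}))   # True
print(visible_in_snapshot(120, 100, 200, {120}))  # False
```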

This additional optimization was ignored when the PgQ long-transaction
approach was first imported into Slony:

  http://lists.slony.info/pipermail/slony1-general/2007-July/006238.html

I guess the reason was to have a minimal patch.
You might want to play with that now.

-- 
marko




Re: [HACKERS] MD5 aggregate

2013-06-14 Thread Marko Kreen
On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Attached is a patch implementing a new aggregate function md5_agg() to
 compute the aggregate MD5 sum across a number of rows. This is
 something I've wished for a number of times. I think the primary use
 case is to do a quick check that 2 tables, possibly on different
 servers, contain the same data, using a query like

   SELECT md5_agg(foo.*::text) FROM (SELECT * FROM foo ORDER BY id) foo;

 or

   SELECT md5_agg(foo.*::text ORDER BY id) FROM foo;

 these would be equivalent to

   SELECT md5(string_agg(foo.*::text, '' ORDER BY id)) FROM foo;

 but without the excessive memory consumption for the intermediate
 concatenated string, and the resulting 1GB table size limit.

It's more efficient to calculate per-row md5, and then sum() them.
This avoids the need for ORDER BY.
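A sketch of the idea, assuming per-row digests are summed as 128-bit integers modulo 2**128 - illustrative only, not the patch's actual implementation:

```python
import hashlib

def md5_total(rows):
    # Sum the per-row MD5 digests as 128-bit integers modulo 2**128, so the
    # aggregate is independent of row order and needs no ORDER BY.
    total = 0
    for row in rows:
        digest = hashlib.md5(row.encode()).digest()
        total = (total + int.from_bytes(digest, "big")) % (1 << 128)
    return format(total, "032x")

# Same rows in a different order produce the same aggregate.
print(md5_total(["alice", "bob", "carol"]) == md5_total(["carol", "bob", "alice"]))  # True
```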

-- 
marko




Re: [HACKERS] erroneous restore into pg_catalog schema

2013-05-14 Thread Marko Kreen
On Tue, May 14, 2013 at 09:29:38AM +0200, Dimitri Fontaine wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Marko Kreen (mark...@gmail.com) wrote:
  On Sat, May 04, 2013 at 10:57:44PM +0200, Dimitri Fontaine wrote:
   Other than adminpack, I know of PGQ installing their objects in
   pg_catalog. They only began doing that when switching to the CREATE
   EXTENSION facility. And they set relocatable to false.
  
  FYI - PgQ and related modules install no objects into pg_catalog.
  
  I used schema='pg_catalog' because I had trouble getting schema='pgq'
  to work.  I wanted 'pgq' schema to live and die with extension,
  and that was only way I got it to work on 9.1.
 
 Sorry, I didn't take the time to actually read \dx+ pgq, just remembered
 (and checked) that the control file did mention pg_catalog.
 
 There is a documented way to have the schema live and die with the
 extension), which is:
 
   relocatable = false
   schema = 'pgq'
 
 Then CREATE EXTENSION will also create the schema, that will be a member
 of the extension, and thus will get DROPed with it.

That's the problem - it's not dropped with it.

Btw, if I do ALTER EXTENSION pgq ADD SCHEMA pgq; during
CREATE EXTENSION pgq FROM 'unpackaged'; then Postgres
will complain:

  ERROR:  cannot add schema pgq to extension pgq because the schema
  contains the extension

Seems the code is pretty sure it's an invalid concept...

-- 
marko





Re: [HACKERS] erroneous restore into pg_catalog schema

2013-05-13 Thread Marko Kreen
On Sat, May 04, 2013 at 10:57:44PM +0200, Dimitri Fontaine wrote:
 Other than adminpack, I know of PGQ installing their objects in
 pg_catalog. They only began doing that when switching to the CREATE
 EXTENSION facility. And they set relocatable to false.

FYI - PgQ and related modules install no objects into pg_catalog.

I used schema='pg_catalog' because I had trouble getting schema='pgq'
to work.  I wanted the 'pgq' schema to live and die with the extension,
and that was the only way I got it to work on 9.1.

-- 
marko





Re: [HACKERS] pgcrypto: Fix RSA password-protected keys

2013-05-10 Thread Marko Kreen
On Fri, May 10, 2013 at 12:52:55PM -0400, Tom Lane wrote:
 Marko Kreen mark...@gmail.com writes:
  RSA secret key extraction code uses wrong variable so
  that decryption is skipped and only secret keys without
  password work for pgp_pub_decrypt().
 
  Attached patch fixes it and also adds regtest.
 
  Please apply to all branches.
 
 Will do, thanks for the fix!

Thanks.

Re: future changelog entry

The problem is specific to RSA keys, password-protected DSA+ElGamal
keys work fine.  Sorry for not mentioning it earlier.

RSA code was added later than ElGamal, and the bug is probably
because of copy-paste from public key code...

-- 
marko





[HACKERS] pgcrypto: Fix RSA password-protected keys

2013-05-06 Thread Marko Kreen
RSA secret key extraction code uses wrong variable so
that decryption is skipped and only secret keys without
password work for pgp_pub_decrypt().

Attached patch fixes it and also adds regtest.

Please apply to all branches.


Reported-by: Keith Fiske ke...@omniti.com

-- 
marko

diff --git a/contrib/pgcrypto/pgp-pubkey.c b/contrib/pgcrypto/pgp-pubkey.c
index 283e0ec..9651d5e 100644
--- a/contrib/pgcrypto/pgp-pubkey.c
+++ b/contrib/pgcrypto/pgp-pubkey.c
@@ -408,16 +408,16 @@ process_secret_key(PullFilter *pkt, PGP_PubKey **pk_p,
 		case PGP_PUB_RSA_SIGN:
 		case PGP_PUB_RSA_ENCRYPT:
 		case PGP_PUB_RSA_ENCRYPT_SIGN:
-			res = pgp_mpi_read(pkt, &pk->sec.rsa.d);
+			res = pgp_mpi_read(pf_key, &pk->sec.rsa.d);
 			if (res < 0)
 				break;
-			res = pgp_mpi_read(pkt, &pk->sec.rsa.p);
+			res = pgp_mpi_read(pf_key, &pk->sec.rsa.p);
 			if (res < 0)
 				break;
-			res = pgp_mpi_read(pkt, &pk->sec.rsa.q);
+			res = pgp_mpi_read(pf_key, &pk->sec.rsa.q);
 			if (res < 0)
 				break;
-			res = pgp_mpi_read(pkt, &pk->sec.rsa.u);
+			res = pgp_mpi_read(pf_key, &pk->sec.rsa.u);
 			if (res < 0)
 				break;
 			break;
diff --git a/contrib/pgcrypto/sql/pgp-pubkey-decrypt.sql b/contrib/pgcrypto/sql/pgp-pubkey-decrypt.sql
index cc82420..f8495d1 100644
--- a/contrib/pgcrypto/sql/pgp-pubkey-decrypt.sql
+++ b/contrib/pgcrypto/sql/pgp-pubkey-decrypt.sql
@@ -426,6 +426,71 @@ hbt6LhKhCLUNdz/udIt0JAC6c/HdPLSW3HnmM3+iNj+Kug==
 -END PGP PRIVATE KEY BLOCK-
 ');
 
+insert into keytbl (id, name, pubkey, seckey)
+values (7, 'rsaenc2048-psw', '
+same key with password
+', '
+-BEGIN PGP PRIVATE KEY BLOCK-
+Version: GnuPG v1.4.11 (GNU/Linux)
+
+lQPEBELr2m0BCADOrnknlnXI0EzRExf/TgoHvK7Xx/E0keWqV3KrOyC3/tY2KOrj
+UVxaAX5pkFX9wdQObGPIJm06u6D16CH6CildX/vxG7YgvvKzK8JGAbwrXAfk7OIW
+czO2zRaZGDynoK3mAxHRBReyTKtNv8rDQhuZs6AOozJNARdbyUO/yqUnqNNygWuT
+4htFDEuLPIJwAbMSD0BvFW6YQaPdxzaAZm3EWVNbwDzjgbBUdBiUUwRdZIFUhsjJ
+dirFdy5+uuZru6y6CNC1OERkJ7P8EyoFiZckAIE5gshVZzNuyLOZjc5DhWBvLbX4
+NZElAnfiv+4nA6y8wQLSIbmHA3nqJaBklj85AAYp/gcDCNnoEKwFo86JYCE1J92R
+HRQ7DoyAZpW1O0dTXL8Epk0sKsKDrCJOrIkDymsjfyBexADIeqOkioy/50wD2Mku
+CVHKWO2duAiJN5t/FoRgpR1/Q11K6QdfqOG0HxwfIXLcPv7eSIso8kWorj+I01BP
+Fn/atGEbIjdWaz/q2XHbu0Q3x6Et2gIsbLRVMhiYz1UG9uzGJ0TYCdBa2SFhs184
+52akMpD+XVdM0Sq9/Cx40Seo8hzERB96+GXnQ48q2OhlvcEXiFyD6M6wYCWbEV+6
+XQVMymbl22FPP/bD9ReQX2kjrkQlFAtmhr+0y8reMCbcxwLuQfA3173lSPo7jrbH
+oLrGhkRpqd2bYCelqdy/XMmRFso0+7uytHfTFrUNfDWfmHVrygoVrNnarCbxMMI0
+I8Q+tKHMThWgf0rIOSh0+w38kOXFCEqEWF8YkAqCrMZIlJIed78rOCFgG4aHajZR
+D8rpXdUOIr/WeUddK25Tu8IuNJb0kFf12IMgNh0nS+mzlqWiofS5kA0TeB8wBV6t
+RotaeyDNSsMoowfN8cf1yHMTxli+K1Tasg003WVUoWgUc+EsJ5+KTNwaX5uGv0Cs
+j6dg6/FVeVRL9UsyF+2kt7euX3mABuUtcVGx/ZKTq/MNGEh6/r3B5U37qt+FDRbw
+ppKPc2AP+yBUWsQskyrxFgv4eSpcLEg+lgdz/zLyG4qW4lrFUoO790Cm/J6C7/WQ
+Z+E8kcS8aINJkg1skahH31d59ZkbW9PVeJMFGzNb0Z2LowngNP/BMrJ0LT2CQyLs
+UxbT16S/gwAyUpJnbhWYr3nDdlwtC0rVopVTPD7khPRppcsq1f8D70rdIxI4Ouuw
+vbjNZ1EWRJ9f2Ywb++k/xgSXwJkGodUlrUr+3i8cv8mPx+fWvif9q7Y5Ex1wCRa8
+8FAj/o+hEbQlUlNBIDIwNDggRW5jIDxyc2EyMDQ4ZW5jQGV4YW1wbGUub3JnPokB
+NAQTAQIAHgUCQuvabQIbAwYLCQgHAwIDFQIDAxYCAQIeAQIXgAAKCRDImeqTRBlV
+WRzJCACbRhx2fYjPGKta69M5dS+kr5UD/CQmsR2t9cB9zyqhratjPnKW9q13+4AG
+P3aByT14IH1c5Mha8rJkNYD2wxmC8jrrcPiJIYoRG+W1sUATY/t8wBbNWF+r9h11
+m0lEpsmNVff/jU7SpNN6JQ3P7MHd5V85LlDoXIH6QYCLd0PjKU+jNvjiBe5VX0m9
+a1nacE3xoWc1vbM0DnqEuID78Qgkcrmm0ESeg1h+tRfHxSAyYNc/gPzm8eH6l+hj
+gOvUc4Gd6LpBQSF8TcFfT2TZwJh7WVWDvNIP6FWAW7rzmHnX3wwXkGq4REWeVtk5
+yBPp6mOtWDiwaqLJYsoHWU11C8zYnQPEBELr2roBCADrgiWXZMzkQOntZa/NS56+
+CczLFQRQPl/8iJAW1eql/wOJ1UiwGSjT189WCKzE7vtazCIstdCFmwOs4DE6cz4S
+UX4HjzjYHZwmMiuSrIefwuZ7cysMBsMXypQFyMSbqwh102xGvmLz3Z++rydx7Fzl
+1RC/ny2+FN5dzYPO2DNtNi4dR2tjHktsxBWXAKCmxagAIwyxGouuEqDhYdFtwrA9
+Qy+M5n6fmGa1Dx07WWnbIud4uCilv8LPVKx5aJamDYWM3v7kS8n51MfTzeK/xoRM
+2rsgzFdLJqPdbgd2nsD37fngqZnlp7tDxSVSuMckZoSKtq1QsNemtaQSYq7xjPst
+AAYp/gcDCNnoEKwFo86JYAsxoD+wQ0zBi5RBM5EphXTpM1qKxmigsKOvBSaMmr0y
+VjHtGY3poyV3t6VboOGCsFcaKm0tIdDL7vrxxwyYESETpF29b7QrYcoaLKMG7fsy
+t9SUI3UV2H9uUquHgqHtsqz0jYOgm9tYnpesgQ/kOAWI/tej1ZJXUIWEmZMH/W6d
+ATNvZ3ivwApfC0qF5G3oPgBSoIuQ/8I+pN/kmuyNAnJWNgagFhA/2VFBvh5XgztV
+NW7G//KpR1scsn140SO/wpGBM3Kr4m8ztl9w9U6a7NlQZ2ub3/pIUTpSzyLBxJZ/
+RfuZI7ROdgDMKmEgCYrN2kfp0LIxnYL6ZJu3FDcS4V098lyf5rHvB3PAEdL6Zyhd
+qYp3Sx68r0F4vzk5iAIWf6pG2YdfoP2Z48Pmq9xW8qD9iwFcoz9oAzDEMENn6dfq
+6MzfoaXEoYp8cR/o+aeEaGUtYBHiaxQcJYx35B9IhsXXA49yRORK8qdwhSHxB3NQ
+H3pUWkfw368f/A207hQVs9yYXlEvMZikxl58gldCd3BAPqHm/XzgknRRNQZBPPKJ
+BMZebZ22Dm0qDuIqW4GXLB4sLf0+UXydVINIUOlzg+S4jrwx7eZqb6UkRXTIWVo5
+psTsD14wzWBRdUQHZOZD33+M8ugmewvLY/0Uix+2RorkmB7/jqoZvx/MehDwmCZd
+VH8sb2wpZ55sj7gCXxvrfieQD/VeH54OwjjbtK56iYq56RVD0h1az8xDY2GZXeT7
+J0c3BGpuoca5xOFWr1SylAr/miEPxOBfnfk8oZQJvZrjSBGjsTbALep2vDJk8ROD
+sdQCJuU1RHDrwKHlbUL0NbGRO2juJGsatdWnuVKsFbaFW2pHHkezKuwOcaAJv7Xt
+8LRF17czAJ1uaLKwV8Paqx6UIv+089GbWZi7HIkBHwQYAQIACQUCQuvaugIbDAAK
+CRDImeqTRBlVWS7XCACDVstKM+SHD6V0bkfO6ampHzj4krKjN0lonN5+7b7WKpgT
+QHRYvPY8lUiIrjXGISQqEG9M5Bi5ea1aoBZem0P3U/lKheg0lYtA7dM3BqsA2EfG

Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-17 Thread Marko Kreen
On Tue, Apr 16, 2013 at 05:09:19PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I think his point is why don't we clear currval() on DISCARD ALL?  I
  can't think of a good reason we don't.
 
 Because we'd have to invent a new suboperation DISCARD SEQUENCES,
 for one thing, in order to be consistent.  I'd rather ask why it's
 important that we should throw away such state.  It doesn't seem to
 me to be important enough to justify a new subcommand.

Consistency is a philosophical thing.  The practical reason for
subcommands is the possibility of a partial reset for special
situations, pooling or otherwise.  But such usage seems rather
rare in real life.

If the sequences are not worth a subcommand, then let's not give them
a subcommand and just wait until someone comes up with an actual reason
to have one.

But currval() is quite noticeable thing that DISCARD ALL should clear.

 Or, if you'd rather a more direct answer: wanting this sounds like
 evidence of bad application design.  Why is your app dependent on
 getting failures from currval, and isn't there a better way to do it?

It may not sound like it, but that's exactly the request - because
DISCARD ALL leaves user-visible state around, it's hard to fix an
application that depends on broken assumptions.


In fact, it was a surprise to me that currval() works across transactions.
My alternative proposal would be to get rid of such silly behaviour...

-- 
marko





Re: [HACKERS] pgcrypto seeding problem when ssl=on

2013-01-14 Thread Marko Kreen
On Mon, Jan 14, 2013 at 12:46 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Fri, Dec 21, 2012 at 10:27 PM, Noah Misch n...@leadboat.com wrote:
 How about instead calling RAND_cleanup() after each backend fork?

 Attached is a patch that adds RAND_cleanup() to fork_process().

 I remain unconvinced that this is the best solution.  Anybody else have
 an opinion?

Do you have knowledge of systems that have /dev/random (blocking)
but not /dev/urandom (non-blocking)?  The only argument I see against
RAND_cleanup() is that postgres might eat entropy from /dev/random (blocking)
and cause both other programs and itself to block, waiting for more entropy.

But this can only happen on systems that don't have /dev/urandom.

Note: reading from /dev/urandom does not affect /dev/random.

-- 
marko




Re: [HACKERS] pgcrypto seeding problem when ssl=on

2013-01-14 Thread Marko Kreen
On Mon, Jan 14, 2013 at 3:00 PM, Noah Misch n...@leadboat.com wrote:
 On Mon, Jan 14, 2013 at 02:21:00PM +0200, Marko Kreen wrote:
 Note: reading from /dev/urandom does not affect /dev/random.

 Reading from /dev/urandom drains the pool that serves /dev/random:

 $ cat /proc/sys/kernel/random/entropy_avail
 3596
 $ dd iflag=nonblock bs=100 count=1 if=/dev/random of=/dev/null
 1+0 records in
 1+0 records out
 100 bytes (100 B) copied, 0.000174798 s, 572 kB/s
 $ cat /proc/sys/kernel/random/entropy_avail
 2839
 $ head -c1000 /dev/urandom > /dev/null
 $ cat /proc/sys/kernel/random/entropy_avail
 212
 $ dd iflag=nonblock bs=100 count=1 if=/dev/random of=/dev/null
 0+1 records in
 0+1 records out
 38 bytes (38 B) copied, 0.000101439 s, 375 kB/s

This slightly weakens my argument, but not completely - any
/dev/urandom-using processes are still unaffected, but
indeed processes using /dev/random can block.  But what are those?

So it's still a problem only on systems that do not have /dev/urandom.
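For illustration, Python's os.urandom() exposes the same non-blocking behaviour (a sketch assuming Linux semantics, where os.urandom() draws from the kernel CSPRNG):

```python
import os
import time

# os.urandom() reads the kernel CSPRNG (/dev/urandom on Linux) and never
# blocks, regardless of how low entropy_avail is -- unlike /dev/random reads.
start = time.monotonic()
data = os.urandom(1 << 20)          # request 1 MiB; returns immediately
elapsed = time.monotonic() - start
print(len(data) == 1 << 20 and elapsed < 5.0)   # True
```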

Btw, observe fun behaviour:

$ cat /proc/sys/kernel/random/entropy_avail
3451
$ cat /proc/sys/kernel/random/entropy_avail
3218
$ cat /proc/sys/kernel/random/entropy_avail
3000
$ cat /proc/sys/kernel/random/entropy_avail
2771
$ cat /proc/sys/kernel/random/entropy_avail
2551
$ cat /proc/sys/kernel/random/entropy_avail
2321
$ cat /proc/sys/kernel/random/entropy_avail
2101
$ cat /proc/sys/kernel/random/entropy_avail
1759
$ cat /proc/sys/kernel/random/entropy_avail
1527
$ cat /proc/sys/kernel/random/entropy_avail
1319
$ cat /proc/sys/kernel/random/entropy_avail
1080
$ cat /proc/sys/kernel/random/entropy_avail
844
$ cat /proc/sys/kernel/random/entropy_avail
605
$ cat /proc/sys/kernel/random/entropy_avail
366
$ cat /proc/sys/kernel/random/entropy_avail
128
$ cat /proc/sys/kernel/random/entropy_avail
142
$ cat /proc/sys/kernel/random/entropy_avail

Each exec() eats from the pool on a modern system.

-- 
marko




Re: [HACKERS] pgcrypto seeding problem when ssl=on

2013-01-13 Thread Marko Kreen
On Fri, Dec 21, 2012 at 10:27 PM, Noah Misch n...@leadboat.com wrote:
 How about instead calling RAND_cleanup() after each backend fork?

Attached is a patch that adds RAND_cleanup() to fork_process().
That way all forked processes start with fresh state.  This should
make sure the problem does not happen in any processes
forked by postmaster.

Please backpatch.

...

An alternative is to put RAND_cleanup() into BackendInitialize() so that only
new backends start with fresh state.

Another alternative is to put RAND_cleanup() after SSL_accept(),
that way core code sees no change, but other OpenSSL users
in backend operate securely.
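A sketch of the underlying bug and the fix, using Python's random.Random as a stand-in for OpenSSL's pool; copying its state models fork() inheriting the seeded pool from the postmaster:

```python
import os
import random

parent = random.Random(12345)            # postmaster seeds the pool once
child_a, child_b = random.Random(), random.Random()
child_a.setstate(parent.getstate())      # fork() copies the pool verbatim
child_b.setstate(parent.getstate())

# Without intervention the children produce identical "random" streams.
print(child_a.random() == child_b.random())   # True

# The RAND_cleanup()-then-reseed equivalent: mix fresh OS entropy per child.
child_a.seed(os.urandom(16))
child_b.seed(os.urandom(16))
print(child_a.random() == child_b.random())   # False (with overwhelming probability)
```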

-- 
marko


rand_cleanup.diff
Description: Binary data



Re: [HACKERS] allowing multiple PQclear() calls

2013-01-02 Thread Marko Kreen
On Wed, Jan 2, 2013 at 5:11 PM, Boszormenyi Zoltan z...@cybertec.at wrote:
 2012-12-11 16:09 keltezéssel, Simon Riggs írta:

 On 11 December 2012 12:18, Boszormenyi Zoltan z...@cybertec.at wrote:

 Such mechanism already exist - you just need to set
 your PGresult pointer to NULL after each PQclear().

 So why doesn't PQclear() do that?


 Because then PQclear() would need a ** not a *. Do you want its
 interface changed for 9.3 and break compatibility with previous versions?

 No, but we should introduce a new public API call that is safer,
 otherwise we get people continually re-inventing new private APIs that
 Do the Right Thing, as the two other respondents have shown.


 How about these macros?

* Use do { } while (0) around the macros to get proper statement behaviour.
* The if() is not needed, both PQclear and PQfinish do it internally.
* Docs

Should the names show somehow that they are macros?
Or is it enough that it's mentioned in documentation?

-- 
marko




Re: [HACKERS] pgcrypto seeding problem when ssl=on

2012-12-27 Thread Marko Kreen
On Sun, Dec 23, 2012 at 9:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Sat, Dec 22, 2012 at 9:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I'm not thrilled with the suggestion to do RAND_cleanup() after forking
  though, as that seems like it'll guarantee that each session starts out
  with only a minimal amount of entropy.

  No, that's actually the right fix - that forces OpenSSL to do new reseed
  with system randomness, thus making backend (including SSL_accept)
  maximally disconnected from static pool in postmaster.

 I don't think that maximal disconnectedness is the deciding factor
 here, or even a very important factor.  If we do RAND_cleanup() then
 each new session will be forced to suck entropy from /dev/urandom
 (immediately, if an SSL connection is attempted).  This opens the door
 to quasi-denial-of-service attacks that bleed all the entropy out of the
 system, forcing long delays for new PRNG-using processes, Postgres or
 otherwise.  And long delays are the *best case* scenario --- worst case,
 if the system lacks decent /dev/random support, is that new processes
 are starting up with highly correlated PRNG states.

You cannot realistically drain /dev/urandom - it's just a CSPRNG, periodically
seeded from /dev/random.  And it cannot cause long delays.  Session keys
are actually its main use case.

And anyway - if the best way to attack Postgres SSL sessions is to drain and
cryptanalyze /dev/urandom via SSL attempts, then we are in pretty good
shape.  When depending only on a properly-used OpenSSL PRNG, we are still
good, but in a slightly worse situation - it gets a minimal amount of entropy
after initial seeding, thus successful cryptanalysis is more probable
than with /dev/urandom.  And when depending on an incorrectly used OpenSSL PRNG
(the current situation), the situation is bad - we seem to be depending
on the security of a single hash.

 If such an approach were a good thing, the OpenSSL guys would have
 implemented it internally --- it'd be easy enough to do, just by forcing
 RAND_cleanup anytime getpid() reads differently than it did when the
 pool was set up.

I thought about it and I realized they cannot do it - the libcrypto has
no idea of application lifetime, so doing such cleanup without application
knowledge is wrong.

Eg. imagine a daemon that loads config, sets up SSL, goes background into
empty chroot.

So the current idea of hashing getpid() on each call is the best they can
do to help badly written applications - like Postgres.

 gettimeofday() might not be the very best way of changing the inherited
 PRNG state from child to child, but I think it's a more attractive
 solution than RAND_cleanup.

1. It optimizes CPU for an unrealistic attack scenario.

2. It spends more CPU in the single-threaded postmaster when SSL is not used -
   as the postmaster does not know whether an incoming connection is SSL or not,
   it needs to do the PRNG feeding anyway.  This might be noticeable
   in a realistic short-connection scenario, where SSL is used only
   for admin or replication connections.

  This also makes behaviour equal to current ssl=off and exec-backend
  mode, which already do initial seeding in backend.

 No, it's not equal, because when ssl=off seeding attempts won't happen
 immediately after fork and thus an attacker doesn't have such an easy
 way of draining entropy.  If we do what you're suggesting, the attacker
 doesn't even need a valid database login to drain entropy --- he can
 just fire-and-forget NEGOTIATE_SSL startup packets.

You can't just fire-and-forget them - you need to do a TCP handshake
plus the Postgres SSL handshake; this means each request takes a noticeable
amount of setup time on the attacker's side, and on the server side the
attacker's IP is visible in the logs.

And /dev/urandom seeding is pretty inexpensive compared to SSL public-key
crypto.  It does not look like a scenario we need to design against.

 (The exec-backend
 case will have such issues anyway, but who thought Windows was a secure
 OS?)

ATM the Windows is pretty good shape compared to our Unix situation...

  If you decide against RAND_cleanup in backend and instead
  do workarounds in backend or postmaster, then please
  also do periodic RAND_cleanup in postmaster.  This should
  make it harder to exploit weaknesses in reused slowly moving state.

 We could consider that, but it's not apparent to me that it has any
 real value.

Properly-designed and properly-used CSPRNG feeding and extraction
should tolerate minor breaks in low-level hashes/ciphers.  If we are
not using it as intended, then we lose that guarantee and need to
limit the damage on our side.

-- 
marko




Re: [HACKERS] pgcrypto seeding problem when ssl=on

2012-12-23 Thread Marko Kreen
On Sat, Dec 22, 2012 at 9:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm not thrilled with the suggestion to do RAND_cleanup() after forking
 though, as that seems like it'll guarantee that each session starts out
 with only a minimal amount of entropy.

No, that's actually the right fix - it forces OpenSSL to do a new reseed
with system randomness, thus making the backend (including SSL_accept)
maximally disconnected from the static pool in the postmaster.

This also makes behaviour equal to current ssl=off and exec-backend
mode, which already do initial seeding in backend.

The fact that PRNG behaviour is affected by a complex set of compile-
and run-time switches makes the current situation rather fragile and
hard to understand.

  What seems to me like it'd be
 most secure is to make the postmaster do RAND_add() with a gettimeofday
 result after each successful fork, say at the bottom of
 BackendStartup().  That way the randomness accumulates in the parent
 process, and there's no way to predict the initial state of any session
 without exact knowledge of every child process launch since postmaster
 start.  So it'd go something like

 #ifdef USE_SSL
 if (EnableSSL)
 {
 struct timeval tv;

  gettimeofday(&tv, NULL);
  RAND_add(&tv, sizeof(tv), 0);
 }
 #endif

If you decide against RAND_cleanup in the backend and instead
do workarounds in the backend or postmaster, then please
also do periodic RAND_cleanup in the postmaster.  This should
make it harder to exploit weaknesses in reused, slowly moving state.

-- 
marko




[HACKERS] pgcrypto seeding problem when ssl=on

2012-12-21 Thread Marko Kreen
When ssl=on, the postmaster calls SSL_CTX_new(),
which asks for a random number, thus requiring initialization
of the randomness pool (RAND_poll).  After that, all forked backends
think the pool is already initialized.  Thus they proceed with the same
fixed state they got from the postmaster.

Output is not completely constant due to:
1) When outputting random bytes, OpenSSL includes getpid()
   output when hashing pool state.
2) SSL_accept() adds time() output into pool before asking
   for random bytes.  This and 1) should make sure SSL handshake
   is done with unique random numbers.  [It's uncertain tho'
   how unpredictable the PRNG is in such mode.]

Now, the problem on the pgcrypto side is that, when compiled with OpenSSL,
it expects the randomness pool to be already initialized.  This
expectation is met when ssl=off, or when an actual SSL connection
is used.  But it's broken when a non-SSL connection is used while
ssl=on is in the config.  Then all backends are in the same state
when they first reach pgcrypto functions, and the output is only
affected by the pid.

Affected:
* pgp_encrypt*() - it feeds hashes of user data back to pool,
  but this is randomized, thus there is small chance first
  few messages have weaker keys.

* gen_random_bytes() - this does not feed back anything, thus
  when used alone in a session, its output will repeat quite easily,
  as the randomness sequence is affected only by the pid.

The attached patch makes both gen_random_bytes() and pgp_encrypt()
seed the pool with output from gettimeofday(), thus moving the pool
off the fixed state.  Basically, this mirrors what SSL_accept()
already does.

I'd like to do bigger reorg of seeding code in pgcrypto, but
that might not be back-patchable.  So I propose this simple fix,
which should be applied also to older releases.

-- 
marko


pgcrypto-add-time.diff
Description: Binary data



Re: [HACKERS] pgcrypto seeding problem when ssl=on

2012-12-21 Thread Marko Kreen
On Fri, Dec 21, 2012 at 10:27 PM, Noah Misch n...@leadboat.com wrote:
 This should have gone to secur...@postgresql.org, instead.

It went there too, and this could have been patched in 9.2.2, but they did not care.

 On Fri, Dec 21, 2012 at 06:05:10PM +0200, Marko Kreen wrote:
 When there is 'ssl=on' then postmaster calls SSL_CTX_new(),
 which asks for random number, thus requiring initialization
 of randomness pool (RAND_poll).  After that all forked backends
 think pool is already initialized.  Thus they proceed with same
 fixed state they got from postmaster.

 Attached patch makes both gen_random_bytes() and pgp_encrypt()
 seed pool with output from gettimeofday(), thus getting pool
 off from fixed state.  Basically, this mirrors what SSL_accept()
 already does.

 That adds only 10-20 bits of entropy.  Is that enough?

It's enough to get numbers that are not the same.

Whether it's possible to guess the next number when you know
that only time() and getpid() were added to a fixed state
(e.g. given the cleartext bytes in the SSL handshake) - I don't know.

In any case, this is how Postgres already operates SSL connections.

 How about instead calling RAND_cleanup() after each backend fork?

Not instead - the gettimeofday() makes sense in any case.  Considering
that it's an immediate problem only for pgcrypto, this patch has a higher
chance of appearing in back branches.

If the _cleanup() makes the next RAND_bytes() call RAND_poll() again,
then yes, it certainly makes sense as a core patch.

-- 
marko




Re: [HACKERS] allowing multiple PQclear() calls

2012-12-11 Thread Marko Kreen
On Tue, Dec 11, 2012 at 6:59 AM, Josh Kupershmidt schmi...@gmail.com wrote:
 Would it be crazy to add an already_freed flag to the pg_result
 struct which PQclear() would set, or some equivalent safety mechanism,
 to avoid this hassle for users?

Such a mechanism already exists - you just need to set
your PGresult pointer to NULL after each PQclear().

Later you can safely call PQclear() again on that pointer.

-- 
marko




Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Marko Kreen
On Fri, Oct 12, 2012 at 10:47 PM, Stephen Frost sfr...@snowman.net wrote:
 * Marko Kreen (mark...@gmail.com) wrote:
 As it works only on connect
 time, it can actually be secure, unlike user switching
 with SET ROLE.

 I'm guessing your issue with SET ROLE is that a RESET ROLE can be issued
 later..?  If so, I'd suggest that we look at fixing that, but realize it
 could break poolers.  For that matter, I'm not sure how the proposal to
 allow connections to be authenticated as one user but authorized as
 another (which we actually already support in some cases, eg: peer)
 *wouldn't* break poolers, unless you're suggesting they either use a
 separate connection for every user, or reconnect every time, both of
 which strike me as defeating a great deal of the point of having a
 pooler in the first place...

The point of a pooler is to cache things.  The TCP connection
is only one thing to be cached; all the backend-internal
caches are just as interesting - prepared plans, compiled functions.

The fact that on a role reset you need to drop all those things
is what breaks pooling.

Of course, I'm speaking only about high-performance situations.
Maybe there are cases where indeed the authenticated
TCP connection is the only interesting thing to cache,
e.g. with a dumb client using raw SQL only, where there
is nothing to cache in the backend.  But that does not seem
like the primary scenario we should optimize for.

-- 
marko




Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-22 Thread Marko Kreen
On Wed, Oct 10, 2012 at 4:24 PM, Marko Kreen mark...@gmail.com wrote:
 The SCRAM looks good from the quick glance.

SCRAM does have weakness - the info necessary to log in
as client (ClientKey) is exposed during authentication
process.

IOW, the stored auth info can be used to log in as the client,
if the attacker can listen in on or participate in the login process.
The random nonces used during auth do not matter;
what matters is that the target server has the same StoredKey
(same password, salt and iteration count).
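To make this concrete, here is a minimal Python sketch of the SCRAM key
derivation and proof check (following RFC 5802; the hash choice and all
parameter values are made up for illustration).  Note how verification
recovers ClientKey - the same recovery is available to an observer who
holds StoredKey and sees the exchange:

```python
import hashlib, hmac

def H(data):
    return hashlib.sha256(data).digest()

def HMAC(key, msg):
    return hmac.new(key, msg, hashlib.sha256).digest()

def XOR(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

password, salt, iterations = b"secret", b"random-salt", 4096
auth_message = b"client-nonce,server-nonce,..."  # built from the handshake

# What the server stores:
salted = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
client_key = HMAC(salted, b"Client Key")
stored_key = H(client_key)

# What the client sends during authentication:
client_signature = HMAC(stored_key, auth_message)
client_proof = XOR(client_key, client_signature)

# Server-side verification recovers ClientKey from the proof:
recovered = XOR(client_proof, client_signature)
assert H(recovered) == stored_key   # login accepted
assert recovered == client_key      # ClientKey is now in the open
```

Anyone who can compute client_signature (i.e. who knows StoredKey and the
auth messages) can XOR it out of the observed proof.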

It seems this particular attack is avoided by SRP.

This weakness can be seen as a feature though - it can be
used by poolers to cache auth info and reconnect
to the server later.  They still need full access to the stored keys.

But it does mean the protocol gives different security guarantees
depending on whether SSL is in use or not.

-- 
marko




Re: [HACKERS] Successor of MD5 authentication, let's use SCRAM

2012-10-10 Thread Marko Kreen
On Wed, Oct 10, 2012 at 3:36 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 10 October 2012 11:41, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 Thoughts on that?

 I think there has been enough discussion of md5 problems elsewhere
 that we should provide an alternative.

 If we can agree on that bit first, we can move onto exactly what else
 should be available.

The main weakness in the current protocol is that the stored value is
plaintext-equivalent - you can use it to log in.
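As a sketch of what plaintext-equivalent means here (Python, with
illustrative values; pg_shadow keeps the hex digest of
md5(password || username), prefixed with "md5"):

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

username, password, salt = b"alice", b"secret", b"\x01\x02\x03\x04"

# The stored value (what an attacker reading pg_shadow would see):
stored = md5_hex(password + username)

# An honest client answers the salt challenge starting from the password:
answer_from_password = md5_hex(md5_hex(password + username).encode() + salt)

# An attacker who only saw the stored value produces the same answer:
answer_from_stored = md5_hex(stored.encode() + salt)

assert answer_from_password == answer_from_stored
```

The password itself is never needed after the first hash, so leaking the
stored value is as bad as leaking the password for login purposes.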

Rest of the problems - use of md5 and how it is used - are relatively minor.
(IOW - they don't cause immediate security incident.)

Which means just slapping SHA1 in place of MD5 and calling it a day
is a bad idea.

Another bad idea is to invent our own algorithm - if a security
protocol needs to fulfill more than one requirement, it tends
to get tricky.

I have looked at SRP previously, but it is heavy on complex
bignum math, which makes it problematic to reimplement
in various drivers.  Also, the many versions of it make me
dubious of the authors.

SCRAM looks good from a quick glance.  It uses only
basic crypto tools - hash, HMAC, XOR.

The fact that the stored auth info cannot be used to log in will cause
problems for middleware, but SCRAM also defines a
concept of logging in as another user, so poolers can have
their own user that they use to create connections
on behalf of another user.  As this works only at connect
time, it can actually be secure, unlike user switching
with SET ROLE.

-- 
marko




Re: [HACKERS] [9.1] 2 bugs with extensions

2012-09-28 Thread Marko Kreen
On Wed, Sep 26, 2012 at 7:15 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Marko Kreen mark...@gmail.com writes:
 Can we work out a minimal example to reproduce the bug?

 Yes, the above text or sql/pgq_coop/sql/pgq_coop_init_ext.sql

 I guess you could replace pgq_coop with any extension just
 consisting of just functions.

 I did just that, with a single function, and couldn't reproduce the
 problem either in 9.1 nor in master, with relocatable = true then with
 relocatable = false and schema = 'pg_catalog' as in your repository.

Indeed, after another makefile reorg, I could not reproduce it
on skytools master either; some digging showed that due
to a makefile bug ($< instead of $^) the ADD SCHEMA
was missing from the .sql file.  So there is no bug in Postgres.

 (The Makefile in skytools/sql/pgq_coop fails on my OS)

 How does it fail?  Are you using gnu make?  What version?

 I guess sed is the problem here, it's a BSD variant:

Could you test if Skytools git now works for you?

I replaced sed usage with awk there, perhaps that will be
more portable.

-- 
marko




Re: [HACKERS] [9.1] 2 bugs with extensions

2012-09-26 Thread Marko Kreen
On Wed, Sep 26, 2012 at 5:36 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 After much distractions I've at least been able to do something about
 that bug report.

Thanks.

 2) If there is schema with functions, but nothing else,
 create extension pgq_coop from 'unpackaged';
 drop extension pgq_coop;
 create extension pgq_coop;
 ERROR:  schema pgq_coop already exists

 From reading the scripts, it's not clear to me, but it appears that the
 schema existed before the create from unpackaged then got added to the
 extension by way of

   ALTER EXTENSION pgq_coop ADD SCHEMA pgq_coop;

 Is that true?

Yes.

 Can we work out a minimal example to reproduce the bug?

Yes, the above text or sql/pgq_coop/sql/pgq_coop_init_ext.sql

I guess you could replace pgq_coop with any extension
consisting of just functions.

 (The Makefile in skytools/sql/pgq_coop fails on my OS)

How does it fail?  Are you using gnu make?  What version?

-- 
marko




Re: [HACKERS] sha1, sha2 functions into core?

2012-08-15 Thread Marko Kreen
On Wed, Aug 15, 2012 at 6:11 AM, Bruce Momjian br...@momjian.us wrote:
 Is there a TODO here?

There is still an open ToDecide here:

1) Status quo - md5() in core, everything else in pgcrypto

2) Start moving other hashes as separate functions into core: eg. sha1()
Problems:
- sha1 will be obsolete soon, like md5
- many newer hashes: sha2 family, upcoming sha3 family
- hex vs. binary API issue - hex-by-default is not good when
  you actually need a cryptographic hash (eg. indexes)
- does not solve the password storage problem.

3) Move digest() from pgcrypto into core, with same api.
Problems:
- does not solve the password storage problem.

4) Move both digest() and crypt() into core, with same api.


The password problem: cryptographic hashes are meant
for universal usage, thus they need to be usable even
on megabytes of data.  This means they are easily
brute-forceable when the amount of data is microscopic
(1..16 chars).  Also, they are unsalted, making
cracking even easier.  crypt() is a better API for passwords.

So when the main need for hashes is password
storage, 2) is a bad choice.
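A minimal Python sketch of the contrast (PBKDF2 stands in here for a
crypt()-style scheme; the password and work factor are illustrative):

```python
import hashlib, os

password = b"hunter2"

# A bare cryptographic hash is unsalted and fast: identical passwords
# always produce identical digests, so precomputed tables work.
assert hashlib.sha256(password).digest() == hashlib.sha256(password).digest()

# A crypt()-style scheme adds a per-user salt and an iteration count,
# so equal passwords store differently and each guess is expensive.
salt_a, salt_b = os.urandom(16), os.urandom(16)
h_a = hashlib.pbkdf2_hmac("sha256", password, salt_a, 100_000)
h_b = hashlib.pbkdf2_hmac("sha256", password, salt_b, 100_000)
assert h_a != h_b  # same password, different stored values
```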

My vote: 4), 1)

-- 
marko




Re: [HACKERS] sha1, sha2 functions into core?

2012-08-15 Thread Marko Kreen
On Wed, Aug 15, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Wed, Aug 15, 2012 at 6:11 AM, Bruce Momjian br...@momjian.us wrote:
 Is there a TODO here?

 There is still open ToDecide here: [snip]

 The argument against moving crypto code into core remains the same as it
 was, ie export regulations.  I don't see that that situation has changed
 at all.  Thus, I think we should leave all the pgcrypto code where it
 is, in an extension that's easily separated out by anybody who's
 concerned about legal restrictions.  The recent improvements in the ease
 of installing extensions have made it even less interesting than it used
 to be to merge extension-supported code into core --- if anything, we
 ought to be trying to move functionality the other way.

Ok.

 If anybody's concerned about the security of our password storage,
 they'd be much better off working on improving the length and randomness
 of the salt string than replacing the md5 hash per se.

Sorry, I was not clear enough - by password storage I meant password
storage for application-specific passwords, not postgres users.

Postgres' own usage of md5 is kind of fine, as:
- only superusers can see the hashes (pg_shadow).
- if somebody sees the contents of pg_shadow, they don't need
  to crack the hashes; they can use them to log in immediately.

But for storage of application passwords, the hash needs
to be one-way and hard to crack, to decrease any damage
caused by leaks.

-- 
marko




Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-08-02 Thread Marko Kreen
On Thu, Aug 2, 2012 at 8:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Wed, Aug 1, 2012 at 6:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 So I'm working from the first set of patches in your message
 20120721194907.ga28...@gmail.com.

 Great!

 Here's an updated draft patch.  I've not looked at the docs yet so
 this doesn't include that, but I'm reasonably happy with the code
 changes now.  The main difference from what you had is that I pushed
 the creation of the single-row PGresults into pqRowProcessor, so that
 they're formed immediately while scanning the input message.  That
 other method with postponing examination of the rowBuf does not work,
 any more than it did with the original patch, because you can't assume
 that the network input buffer is going to hold still.  For example,
 calling PQconsumeInput after parseInput has absorbed a row-data message
 but before calling PQgetResult would likely break things.

 In principle I suppose we could hack PQconsumeInput enough that it
 didn't damage the row buffer while still meeting its API contract of
 clearing the read-ready condition on the socket; but it wouldn't be
 simple or cheap to do so.

Any code using single-row mode is new.  Any code calling consumeInput
before fully draining rows from the buffer is buggy, as it allows unlimited
growth of the network buffer, which defeats the whole reason to use
single-row mode.

It was indeed a bug in the previous patch that it did not handle this
situation, but IMHO it should be handled with an error, not by allowing
such code to work.

Previously, in whatever sequence the fetch functions were called, the app's
max memory usage was either 1x resultset size, or at most 2x resultset size
if it messed up the sequence somehow.  But no more.  So if the app knew
the resultset was big, it needed to split it up.

Now with single-row mode, apps can assume their max memory usage
is 1x row size + network buffer; let's simplify that to 2x row size.
But no more.  And now they can process huge resultsets assuming
their memory usage will be no more than 2x row size.

And now libpq allows such apps to go from 2x row size to full resultset
size with unfortunate coding.  Why?


I have opinions about the reorg details too, but those are mostly a matter
of taste, so rather unimportant compared to the question of whether we
should allow code to break the guarantees single-row mode gives.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-08-02 Thread Marko Kreen
On Fri, Aug 3, 2012 at 12:01 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Thu, Aug 2, 2012 at 8:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In principle I suppose we could hack PQconsumeInput enough that it
 didn't damage the row buffer while still meeting its API contract of
 clearing the read-ready condition on the socket; but it wouldn't be
 simple or cheap to do so.

 Any code using single-row-mode is new.  Any code calling consumeInput
 before fully draining rows from buffer is buggy, as it allows unlimited 
 growth
 of network buffer, which breaks whole reason to use single-row mode.

 Sorry, that argument will not fly.  The API specification for
 PQconsumeInput is that it can be called at any time to drain the kernel
 input buffer, without changing the logical state of the PGconn.  It's
 not acceptable to insert a parenthetical oh, but you'd better not try
 it in single-row mode caveat.

Well, if the old API contract must be kept, then so be it.

 The reason I find this of importance is that PQconsumeInput is meant to
 be used in an application's main event loop for the sole purpose of
 clearing the read-ready condition on the connection's socket, so that
 it can wait for other conditions.  This could very well be decoupled
 from the logic that is involved in testing PQisBusy and
 fetching/processing actual results.  (If we had not intended to allow
 those activities to be decoupled, we'd not have separated PQconsumeInput
 and PQisBusy in the first place.)  Introducing a dependency between
 these things could lead to non-reproducible, timing-dependent, very
 hard-to-find bugs.

 By the same token, if somebody were trying to put single-row-mode
 support into an existing async application, they'd be fooling with the
 parts that issue new queries and collect results, but probably not with
 the main event loop.  Thus, I do not believe your argument above that
 any code using single-row mode is new.  The whole app is not going to
 be new.  It could be very hard to guarantee that there is no path of
 control that invokes PQconsumeInput before the new data is actually
 processed, because there was never before a reason to avoid extra
 PQconsumeInput calls.

I've thought about this.  At first glance indeed, if an async app
has both a reader and a writer registered in the event loop, it might be
simpler to keep reading from the source socket, even if the writer stalls.

But this is exactly the situation where an error from consumeInput would
help.  Imagine a fast source and a slow target (a common scenario - a
database query for each row).  Reading from the source *must* be stopped
to get predictable memory usage.

 As I said, this is the exact same objection I had to the original scheme
 of exposing the row buffer directly.  Putting a libpq function in front
 of the row buffer doesn't solve the problem if that function is
 implemented in a way that's still vulnerable to misuse or race conditions.

 And now libpq allows such apps to go from 2x row size to full resultset
 size with unfortunate coding.  Why?

 This argument is completely nonsensical.  The contract for
 PQconsumeInput has always been that it consumes whatever the kernel has
 available.  If you don't extract data from the library before calling it
 again, the library's internal buffer may grow to more than the minimum
 workable size, but that's a tradeoff you may be willing to make to
 simplify your application logic.  I do not think that it's an
 improvement to change the API spec to say either that you can't call
 PQconsumeInput in certain states, or that you can but it won't absorb
 data from the kernel.

The old patch had a nice property that we could replace PQgetResult()
with something else.  To allow that and also PQconsumeInput(),
we could store offsets in rowBuf, and then skip the realign in
PQconsumeInput().

Not sure if the replacement reason is worth keeping, but the
offsets may be useful even now, as they might give additional
speedup by decreasing the per-column storage
from 16 to 8 bytes on 64-bit architectures.

Maybe better left for 9.3 though...

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-08-01 Thread Marko Kreen
On Wed, Aug 1, 2012 at 6:18 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 * Is there better API than PQsetSingleRowMode()?  New PQsend*
   functions is my alternative.

 After thinking it over, I'm really unexcited about adding new versions
 of all the PQsend functions.  If we had the prospect of more flags in
 future that could be added to a bitmask flags argument, it would be
 more palatable, but it's far from clear what those might be.  So I'm
 now leaning towards using PQsetSingleRowMode as-is.

I can imagine such a flag - I'd like to have a flag that allows sending
more queries to the server without waiting for the old ones to finish.

Currently, when a batch-processing app wants to keep the
backend busy, it needs to stuff many statements into a
single query.  Which is ugly.  Among other things it
loses the possibility to separate arguments from the query,
and it breaks error reporting.  The flag would fix this,
and it would also be useful in other scenarios.

Interestingly, although it affects a different area, it would
also be a flag for PQsend* and not for PQexec*.  So maybe
the direction is not completely wrong.

But I cannot imagine why PQsetSingleRowMode would be
hard to keep working even if we get PQsend functions with flags,
so I'm not worried about getting it exactly right from the start.

 * Should we rollback rowBuf change? I think no, as per benchmark
   it performs better than old code.

 I had already pretty much come to that conclusion just based on code
 review, without thinking about performance.  In particular, we had done
 some nontrivial work towards improving error-case handling in the data
 message parsing code, and I don't really want to give that up, nor
 rewrite it on the fly now.  About the only reason I could see for
 reverting rowBuf was that I thought it might hurt performance; so now
 that you've proven the opposite, we should leave it alone.

An additional argument for rowBuf is Merlin's wish for weirdly optimized
PGresults.  Basically, with rowBuf anybody who wants to experiment
with different ways to process row data just needs to implement a
single function which processes the data and then advances the state
machine, like I did with PQgetRowData().

Without it, quite a lot of code needs to be patched.

 So I'm working from the first set of patches in your message
 20120721194907.ga28...@gmail.com.

Great!

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 6:18 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 5:07 PM, Marko Kreen mark...@gmail.com wrote:
 Does PQgetRowData() break the ability to call PQgetvalue() against the
 result as well as other functions like PQgetisnull()?  If so, I
 object.

 I don't get what are you objecting to.  The PQgetRowData()
 is called instead of PQgetResult() and it returns zero-row PGresult
 for row header and list of PGdataValues that point to actual
 data. You call the value functions, they don't crash, but as
 the result has zero rows (the data is not copied to it)
 they don't do anything useful either.

 The whole point of the API is to avoid the unnecessary copying.

 You can mix the calls to PQgetRowData() and PQgetResult()
 during one resultset, it works fine although does not seem
 that useful.

 I guess you fear that some code can unexpectedly see
 new behavior, and that exactly why the row-callback API
 needs to be scrapped - it allowed such possibility.

 But with PQsetSingleRowMode and PQgetRowData, the
 new behavior is seen only by new code that clearly
 expects it, so I don't see what the problem is.

 Well, for one, it breaks libpqtypes (see line 171@
 http://libpqtypes.esilo.com/browse_source.html?file=exec.c), or at
 least makes it unable to use the row-processing mode.   libpqtypes
 wraps the libpq getter functions and exposes a richer api using only
 the result object.  When writing libpqtypes we went through great
 pains to keep access to server side data through the existing result
 API.  Note, I'm not arguing that compatibility with libpqtypes is a
 desired objective, but the wrapping model that it uses is pretty
 reasonably going to be used by other code -- the awkwardness there
 should be a red flag of sorts.

 I'm arguing that *all* data getting must continue to do so through the
 result object, and bypassing the result to get at data is breaking the
 result abstraction in the libpq api.  I bet you can still maintain
 data access through result object while avoiding extra copies.

Well, the main problem this week is whether we should
apply PQsetSingleRowMode() from the single-row-mode1
or the single-row-mode2 branch?

PQgetRowData() is unimportant, as it just exposes
the rowBuf to the user and that's all.

Do you have an opinion about that?

  For example, maybe PQsetSingleRowMode maybe should return a result object?

What do you mean by that?  And have you thought about both
sync and async usage patterns?

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:08 PM, Marko Kreen mark...@gmail.com wrote:
 The PQgetRowData() is unimportant as it just exposes
 the rowBuf to user and thats all.

So to get back to the more interesting PQgetRowData() discussion...

Did you notice that it's up to the app to decide whether it calls
PQgetResult() or PQgetRowData() after PQsetSingleRowMode()?
So there is no way for it to get into conflict with anything.
If libpqtypes keeps using PQgetResult(), it keeps getting
PGresults.  No problem.

PQgetRowData() is meant for use-cases where PGresult is *not* the
main data structure the app operates on.  If the app does a dumb
copy out of PGresult as soon as possible, it can move to PQgetRowData().

If the app wants to do complex operations with PGresult, or just
store it, then it can keep doing so.  Nobody forces the use
of PQgetRowData().


Now, about the idea of a more optimized PGresult - one way of doing
it would be to create a zero-copy PGresult that points into the network
buffer like PQgetRowData() does.  But this breaks all expectations
about PGresult lifetime rules, so it seems like a bad idea.

A second way is to optimize the copying step - basically just
do a malloc and 2 or 3 memcpys - for the struct, headers
and data.  This leaves a standalone PGresult around that
behaves as expected.  But for apps that don't care about
PGresult it is still an unnecessary copy.  So even if we
optimize PGresult that way, it still seems worthwhile
to have PQgetRowData() around.


Hm, I think it's possible to rig the test to do a dummy
copy of the PGresult, so it's possible to see what kind
of speed is achievable.  Will try.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:25 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 11:08 AM, Marko Kreen mark...@gmail.com wrote:
 I'm arguing that *all* data getting must continue to do so through the
 result object, and bypassing the result to get at data is breaking the
 result abstraction in the libpq api.  I bet you can still maintain
 data access through result object while avoiding extra copies.

 Well, the main problem this week is whether we should
 apply PQsetSingleRowMode() from single-row-mode1
 or from single-row-mode2 branch?

 The PQgetRowData() is unimportant as it just exposes
 the rowBuf to user and thats all.

 right. branch 1 (containing PQgetRowData) seems wrong to me.

Indeed, you are missing the fact that PQgetResult works the same
in both branches.

And please see the benchmark I posted.

Even better, run it yourself...

 What do you mean by that?  And have you though about both
 sync and async usage patterns?

 No, I haven't -- at least not very well.  The basic idea is that
 PQsetSingleRowMode turns into PQgetSingleRowResult() and returns a
 result object.  For row by row an extra API call gets called (maybe
 PQgetNextRow(PGconn, PGresult)) that does the behind the scenes work
 under the existing result object.  This is the same general structure
 you have in branch 2, but the result allocation is move out of the
 loop.  Presumably sync and async would then follow the same pattern,
 but we'd still have to be able to guarantee non-blocking behavior in
 the async api.

Well, as discussed previously, it's worthwhile to keep standard functions
like PQisBusy() and PQgetResult() working sensibly in the new mode,
and currently they do so.

If you add new functions, you also need to define the behavior
when new and old ones are mixed, and that gets messy quickly.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:52 PM, Merlin Moncure mmonc...@gmail.com wrote:
 But, the faster rowbuf method is a generally incompatible way of
 dealing with data vs current libpq -- this is bad.  If it's truly
 impossible to get those benefits without bypassing result API that
 then I remove my objection on the grounds it's optional behavior (my
 gut tells me it is possible though).

Um, please clarify what you are talking about here.

What is the incompatibility of PGresult from branch 1?

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
 Hm, I think it's possible to rig the test to do dummy
 copy of pgresult, thus it's possible to see what kind
 of speed is possible..  Will try.

I added a new method (-x) to rowdump where it asks for the row
with PQgetRowData() and then tries to emulate a super-efficient
PGresult copy, then loads the data from that PGresult.

Quick reference:
rowdump1 - single-row-mode1 [~ libpq 9.2]
rowdump2 - single-row-mode2 [~ libpq 9.1]

-s - single row mode with PQgetResult()
-z - single row mode with PQgetRowData()
-x - simulated optimized PQgetResult()

-
QUERY: select 1,200,30,rpad('x',30,'z') from
generate_series(1,500)
./rowdump1 -s:   6.28   6.25   6.39  avg:  6.31 [ 100.00 % ]
./rowdump2 -s:   7.49   7.40   7.57  avg:  7.49 [ 118.71 % ]
./rowdump1 -z:   2.86   2.77   2.79  avg:  2.81 [ 44.50 % ]
./rowdump1 -x:   3.46   3.27   3.29  avg:  3.34 [ 52.96 % ]
QUERY: select
rpad('x',10,'z'),rpad('x',20,'z'),rpad('x',30,'z'),rpad('x',40,'z'),rpad('x',50,'z'),rpad('x',60,'z')
from generate_series(1,300)
./rowdump1 -s:   7.76   7.76   7.68  avg:  7.73 [ 100.00 % ]
./rowdump2 -s:   8.24   8.12   8.66  avg:  8.34 [ 107.85 % ]
./rowdump1 -z:   5.34   5.07   5.23  avg:  5.21 [ 67.41 % ]
./rowdump1 -x:   5.53   5.61   5.61  avg:  5.58 [ 72.20 % ]
QUERY: select
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
from generate_series(1,80)
./rowdump1 -s:   7.49   7.66   7.59  avg:  7.58 [ 100.00 % ]
./rowdump2 -s:   7.56   8.12   7.95  avg:  7.88 [ 103.91 % ]
./rowdump1 -z:   2.77   2.76   2.76  avg:  2.76 [ 36.46 % ]
./rowdump1 -x:   3.07   3.05   3.18  avg:  3.10 [ 40.90 % ]
QUERY: select 1000,rpad('x', 400, 'z'),rpad('x', 4000, 'z') from
generate_series(1,10)
./rowdump1 -s:   2.66   2.62   2.67  avg:  2.65 [ 100.00 % ]
./rowdump2 -s:   3.11   3.14   3.11  avg:  3.12 [ 117.74 % ]
./rowdump1 -z:   2.49   2.46   2.47  avg:  2.47 [ 93.33 % ]
./rowdump1 -x:   2.59   2.57   2.57  avg:  2.58 [ 97.23 % ]
-

It shows that even if the actual fast row copy will be slower
than the one simulated here, it's still a quite competitive approach
compared to PQgetRowData(), as long as it's not too slow.

So an optimized PQgetResult() may be good enough, and we
can drop the idea of PQgetRowData().

Code attached, also in https://github.com/markokr/pqtest repo.

-- 
marko


pg1 = /opt/apps/pgsql92mode1
pg2 = /opt/apps/pgsql92mode2

X1 = -DHAVE_ROWDATA -I$(pg1)/include/internal -I$(pg1)/include/server

CFLAGS = -O -g -Wall

all: rowdump1 rowdump2

rowdump1: rowdump.c
	$(CC) -I$(pg1)/include $(CFLAGS) -o $@ $< -L$(pg1)/lib \
		-Wl,-rpath=$(pg1)/lib -lpq $(X1)

rowdump2: rowdump.c
	$(CC) -I$(pg2)/include $(CFLAGS) -o $@ $< -L$(pg2)/lib \
		-Wl,-rpath=$(pg2)/lib -lpq

clean:
	rm -f rowdump1 rowdump2 time.tmp README.html

html: README.html

README.html: README.rst
	rst2html $< > $@


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>

#include <libpq-fe.h>

#ifdef HAVE_ROWDATA
#include <internal/libpq-int.h>
#endif

struct Context {
	PGconn *db;
	int count;

	char *buf;
	int buflen;
	int bufpos;
};

/* print error and exit */
static void die(PGconn *db, const char *msg)
{
	if (db)
		fprintf(stderr, "%s: %s\n", msg, PQerrorMessage(db));
	else
		fprintf(stderr, "%s\n", msg);
	exit(1);
}

/* write out buffer */
static void out_flush(struct Context *ctx)
{
	int out;
	if (!ctx->buf)
		return;

	out = write(1, ctx->buf, ctx->bufpos);
	if (out != ctx->bufpos)
		die(NULL, "failed to write file");
	ctx->bufpos = 0;
	ctx->buflen = 0;
	free(ctx->buf);
	ctx->buf = NULL;
}

/* add char to buffer */
static void out_char(struct Context *ctx, char c)
{
	if (ctx->bufpos + 1 > ctx->buflen) {
		if (!ctx->buf) {
			ctx->buflen = 16;
			ctx->buf = malloc(ctx->buflen);
			if (!ctx->buf)
				die(NULL, "failed to allocate buffer");
		} else {
			ctx->buflen *= 2;
			ctx->buf = realloc(ctx->buf, ctx->buflen);
			if (!ctx->buf)
				die(NULL, "failed to resize buffer");
		}
	}

	ctx->buf[ctx->bufpos++] = c;
}

/* quote string for copy */
static void proc_value(struct Context *ctx, const char *val, int vlen)
{
	int i;
	char c;

	for (i = 0; i < vlen; i++) {
		c = val[i];
		switch (c) {
		case '\\':
			out_char(ctx, '\\');
			out_char(ctx, '\\');
			break;
		case '\t':
			out_char(ctx, '\\');
			out_char(ctx, 't');
			break;
		case '\n':
			out_char(ctx, '\\');
			out_char(ctx, 'n');
			break;
		case '\r':
			out_char(ctx, '\\');
			out_char(ctx, 'r');
			break;
		default:
			out_char(ctx, c);
			break;
		}
	}
}

/* quote one row for copy from regular PGresult */
static void proc_row(struct Context *ctx, PGresult *res, int tup)
{
	int n = PQnfields(res);
	const char *val;
	int i, vlen;

	ctx->count++;

	for 

Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 11:34 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In particular I agree that PQsetSingleRowMode is a bit ugly, so I'm
 wondering about your thoughts on providing PQgetSingleRowResult instead.
 I don't see how to make that work in async mode.  I think the library
 needs to be aware of whether it's supposed to return a single-row result
 in advance of actually doing so --- for instance, how can PQisBusy
 possibly give the correct answer if it doesn't know whether having
 received the first row is sufficient?

 Well (Marko probably wants to slap me with a halibut by now), the
 original intent was for it not to have to: PQgetSingleRowResult is
 more 'construct result for single row iteration', so it would never
 block -- it only sets the internal single row mode and instantiates
 the result object.  After that, you can do do standard
 consumeinput/isbusy processing on the conn.  The actual iteration
 routine would be like PQiterateResult which you could guard with
 PQisBusy.  Like I said: it's the same general structure but the result
 construction is moved out of the loop.

If you really don't like PQsetSingleRowMode, then I would prefer
new PQsend* functions to new fetch functions.

Because while send is a one-shot affair, receive is a complex
state machine which requires multiple function calls.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 10:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Either way, it looks like there's plausible paths to optimizing
 repeated result fetch without having expose an alternate data-fetching
 API and that the proposed implementation doesn't appear to be a wall
 in terms of getting there. So I'm firmly in the non-exposed-rowbuf
 camp, even if we can't expose an optimal implementation in time for
 9.2.

 Yeah, the schedule argument is a strong one.  If we put in PQgetRowData
 now, we'll be stuck with it forevermore.  It would be better to push
 that part of the patch to 9.3 so we can have more time to think it over
 and investigate alternatives.  The single-row mode is already enough to
 solve the demonstrated client-side performance issues we know about.
 So IMO it would be a lot smarter to be spending our time right now on
 making sure we have *that* part of the patch right.

Just to show agreement: both PQgetRowData() and optimized PGresult
do not belong to 9.2.

Only open questions are:

* Is there better API than PQsetSingleRowMode()?  New PQsend*
  functions is my alternative.

* Should we rollback rowBuf change? I think no, as per benchmark
  it performs better than old code.

 The thing I fundamentally don't like about PQsetSingleRowMode is that
 there's such a narrow window of time to use it correctly -- as soon as
 you've done even one PQconsumeInput, it's too late.  (Or maybe not, if
 the server is slow, which makes it timing-dependent whether you'll
 observe a bug.  Maybe we should add more internal state so that we can
 consistently throw error if you've done any PQconsumeInput?)  The most
 obvious alternative is to invent new versions of the PQsendXXX functions
 with an additional flag parameter; but there are enough of them to make
 that not very attractive either.

It belongs logically together with PQsend, so if you don't like
current separation, then proper fix is to make them single
function call.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Wed, Jul 25, 2012 at 12:35 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 3:52 PM, Marko Kreen mark...@gmail.com wrote:
 * Is there better API than PQsetSingleRowMode()?  New PQsend*
   functions is my alternative.

 I would prefer to have PQsetSingleRowMode() over new flavor of PQsend.

 To consolidate my argument above: since you're throwing PQgetResult in
 the main iteration loop you're potentially walling yourself off from
 one important optimization: not copying the result at all and reusing
 the old one; that's going to produce the fastest possible code since
 you're recovering all the attribute structures that have already been
 set up unless you're willing to do the following:

I am not forgetting such optimization, I've already implemented it:
it's called PQgetRowData().  Main reason why it works, and so simply,
is that it does not try to give you PGresult.

PGresult carries various historical assumptions:
- Fully user-controlled lifetime.
- Zero-terminated strings.
- Aligned binary values.

Breaking them all does not seem like a good idea.

Trying to make them work with zero-copy seems
not worth the pain.

And if you are ready to introduce new accessor functions,
why care about PGresult at all?  Why not give user
PQgetRowData()?

Btw, PQgetRowData() and any other potential zero-copy API
is not incompatible with both slow PQgetResult() or optimized
PQgetResult().

So if we give only PQgetResult() in 9.2, I do not see that we
are locked out from any interesting optimizations.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Wed, Jul 25, 2012 at 1:29 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 4:59 PM, Marko Kreen mark...@gmail.com wrote:
 So if we give only PQgetResult() in 9.2, I do not see that we
 are locked out from any interesting optimizations.

 Well, you are locked out of having PQgetResult reuse the conn's result
 since that would then introduce potentially breaking changes to user
 code.

You can specify special flags to PQsend or have special PQgetResultWeird()
calls to get PGresults with unusual behavior.  Like I did with PQgetRowData().

I see no reason here to reject PQgetResult() that returns normal PGresult.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-23 Thread Marko Kreen

Here is a simple test program that takes a SELECT
query, reads it and outputs a COPY-formatted stream
to standard output, to simulate some activity.

It operates in 3 modes, specified by command-line switches:

-f   Load full resultset at once - old way.
-s   Single-Row mode using PQgetResult().
-z   Single-Row mode using PQgetRowData().

It is compiled with 2 different libpqs that correspond to
single-row-modeX branches in my github repo:

rowdump1 - libpq with rowBuf + PQgetRowData().   rowBuf is
   required for PQgetRowData.
   [ https://github.com/markokr/postgres/tree/single-row-mode1 ]

rowdump2 - Plain libpq patched for single-row mode.
   No PQgetRowData() here.
   [ https://github.com/markokr/postgres/tree/single-row-mode2 ]

Notes:

* Hardest part is picking realistic queries that matter.
  It's possible to construct artificial queries that make
  results go either way.

* It does not make sense to compare -f with the others.  But it
  does make sense to compare -f between differently patched libpqs
  to detect any potential slowdowns.

* The time measured is User Time of client process.

---
QUERY: select 1,200,30,rpad('x',30,'z') from
generate_series(1,500)
./rowdump1 -f:   3.90   3.90   3.93  avg:  3.91
./rowdump2 -f:   4.03   4.13   4.05  avg:  4.07
./rowdump1 -s:   6.26   6.33   6.49  avg:  6.36
./rowdump2 -s:   7.48   7.46   7.50  avg:  7.48
./rowdump1 -z:   2.88   2.90   2.79  avg:  2.86
QUERY: select
rpad('x',10,'z'),rpad('x',20,'z'),rpad('x',30,'z'),rpad('x',40,'z'),rpad('x',50,'z'),rpad('x',60,'z')
from generate_series(1,300)
./rowdump1 -f:   6.29   6.36   6.14  avg:  6.26
./rowdump2 -f:   6.79   6.69   6.72  avg:  6.73
./rowdump1 -s:   7.71   7.72   7.80  avg:  7.74
./rowdump2 -s:   8.14   8.16   8.57  avg:  8.29
./rowdump1 -z:   6.45   5.15   5.16  avg:  5.59
QUERY: select
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
from generate_series(1,80)
./rowdump1 -f:   5.68   5.66   5.72  avg:  5.69
./rowdump2 -f:   5.69   5.84   5.67  avg:  5.73
./rowdump1 -s:   7.68   7.76   7.67  avg:  7.70
./rowdump2 -s:   7.57   7.54   7.62  avg:  7.58
./rowdump1 -z:   2.78   2.82   2.72  avg:  2.77
QUERY: select 1000,rpad('x', 400, 'z'),rpad('x', 4000, 'z') from
generate_series(1,10)
./rowdump1 -f:   2.71   2.66   2.58  avg:  2.65
./rowdump2 -f:   3.11   3.14   3.16  avg:  3.14
./rowdump1 -s:   2.64   2.61   2.64  avg:  2.63
./rowdump2 -s:   3.15   3.11   3.11  avg:  3.12
./rowdump1 -z:   2.53   2.51   2.46  avg:  2.50
---

Test code attached.  Please play with it.

By this test, both rowBuf and PQgetRowData() look good.

-- 
marko


pg1 = /opt/apps/pgsql92mode1
pg2 = /opt/apps/pgsql92mode2

CFLAGS = -O -g -Wall

all: rowdump1 rowdump2

rowdump1: rowdump.c
	$(CC) -I$(pg1)/include $(CFLAGS) -o $@ $< -L$(pg1)/lib \
		-Wl,-rpath=$(pg1)/lib -lpq -DHAVE_ROWDATA

rowdump2: rowdump.c
	$(CC) -I$(pg2)/include $(CFLAGS) -o $@ $< -L$(pg2)/lib \
		-Wl,-rpath=$(pg2)/lib -lpq

clean:
rm -f rowdump1 rowdump2



xtest.sh
Description: Bourne shell script
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <getopt.h>

#include <libpq-fe.h>

struct Context {
	PGconn *db;
	int count;

	char *buf;
	int buflen;
	int bufpos;
};

static void die(PGconn *db, const char *msg)
{
	if (db)
		fprintf(stderr, "%s: %s\n", msg, PQerrorMessage(db));
	else
		fprintf(stderr, "%s\n", msg);
	exit(1);
}

static void out_flush(struct Context *ctx)
{
	int out;
	if (!ctx->buf)
		return;

	out = write(1, ctx->buf, ctx->bufpos);
	if (out != ctx->bufpos)
		die(NULL, "failed to write file");
	ctx->bufpos = 0;
	ctx->buflen = 0;
	free(ctx->buf);
	ctx->buf = NULL;
}

static void out_char(struct Context *ctx, char c)
{
	if (ctx->bufpos + 1 > ctx->buflen) {
		if (!ctx->buf) {
			ctx->buflen = 16;
			ctx->buf = malloc(ctx->buflen);
			if (!ctx->buf)
				die(NULL, "failed to allocate buffer");
		} else {
			ctx->buflen *= 2;
			ctx->buf = realloc(ctx->buf, ctx->buflen);
			if (!ctx->buf)
				die(NULL, "failed to resize buffer");
		}
	}

	ctx->buf[ctx->bufpos++] = c;
}

static void proc_value(struct Context *ctx, const char *val, int vlen)
{
	int i;
	char c;

	for (i = 0; i < vlen; i++) {
		c = val[i];
		switch (c) {
		case '\\':
			out_char(ctx, '\\');
			out_char(ctx, '\\');
			break;
		case '\t':
			out_char(ctx, '\\');
			out_char(ctx, 't');
			break;
		case '\n':
			out_char(ctx, '\\');
			out_char(ctx, 'n');
			break;
		case '\r':
			out_char(ctx, '\\');
			out_char(ctx, 'r');
			break;
		default:
			out_char(ctx, c);
			break;
		}
	}
}

static void proc_row(struct Context *ctx, PGresult *res, int tup)
{
	int n = PQnfields(res);
	const char *val;
	int 

Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-23 Thread Marko Kreen
On Tue, Jul 24, 2012 at 12:27 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 2:05 PM, Marko Kreen mark...@gmail.com wrote:
 It seems odd (but maybe ok) that you have to set the single row mode
 on the connection only to have the server reset it whenever you call a
 send function: maybe rename to PQsetResultSingleRowMode?

The server does not reset it; it's a purely client-side flag.  It is reset
by the next PQsend*/PQexec* call.  If there are several resultsets
sent by the server for one query, they all share the same setting.

I don't think extra-long function names that describe exactly
the function's behavior are an improvement over shorter but inexact
names, as you need to read the docs anyway to know
the real behavior.  But it's a matter of taste, so it can be
renamed if people want it.

An alternative is to create parallel PQsend* functions that set the flag.
That gets rid of the possibility of any extra activity between PQsend
and PQsetSingleRowMode().  But it seems messy to do that
just for single-row mode; instead it makes sense to have PQsend*
variants that take a flags argument to tune their behavior.  But that
is worth doing only if we have more flags than just one.

 Does PQgetRowData() break the ability to call PQgetvalue() against the
 result as well as other functions like PQgetisnull()?  If so, I
 object.

I don't get what you are objecting to.  PQgetRowData()
is called instead of PQgetResult() and it returns a zero-row PGresult
for the row header and a list of PGdataValues that point to the actual
data.  You can call the value functions, they don't crash, but as
the result has zero rows (the data is not copied to it)
they don't do anything useful either.

The whole point of the API is to avoid the unnecessary copying.

You can mix the calls to PQgetRowData() and PQgetResult()
during one resultset, it works fine although does not seem
that useful.

I guess you fear that some code can unexpectedly see
the new behavior, and that is exactly why the row-callback API
needs to be scrapped - it allowed such a possibility.

But with PQsetSingleRowMode and PQgetRowData, the
new behavior is seen only by new code that clearly
expects it, so I don't see what the problem is.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-21 Thread Marko Kreen

Here are 2 approaches for getting to a state where only PQsetSingleRowMode()
is available.  Both are on top of the REL9_2_STABLE branch.

a) Remove callback hooks, keep rowBuf, implement single-row-mode on
   top of that.   This was posted before, I just additionally removed
   the PQgetRowData() function.

git pull git://github.com/markokr/postgres.git single-row-mode1
https://github.com/markokr/postgres/commits/single-row-mode1

   Commits:
libpq: Single-row based processing
libpq, dblink: Remove row processor API

   Advantage: makes it easier to play with PQgetRowData() or potential
   faster PGresult creation methods.  A smaller change compared to
   the libpq from 9.2beta than b).

b) Revert row-callback changes completely, implement single-row-mode on
   top of virgin libpq.  The only problem here was keeping the fixes implemented
   as part of the row-callback patch.  Single-row-mode itself is quite simple.

git pull git://github.com/markokr/postgres.git single-row-mode2
https://github.com/markokr/postgres/commits/single-row-mode2

   Feature patch:

https://github.com/markokr/postgres/commit/b5e822125c655f189875401c61317242705143b9

   Commits:
dblink: revert conversion to row processor API patch
libpq: revert row processor API patch
libpq: random fixes
libpq: single-row mode
dblink: use single-row-mode

   Advantage: smaller change compared to libpq from 9.1 than a).

As the patch has suffered a lot from trying to provide both macro- and
micro-optimization (on-the-fly row processing vs. more efficient row
processing), maybe b) is the safer choice for 9.2?

In case somebody wants to look at the patches without git (or web),
I attached them as tgz too.

-- 
marko



single-row.tgz
Description: GNU Unix tar archive



[HACKERS] [9.1] 2 bugs with extensions

2012-07-18 Thread Marko Kreen
I converted Skytools modules to extensions and found 2 problems:

1) Dumpable sequences are not supported - if a sequence is tagged
   with pg_catalog.pg_extension_config_dump(), pg_dump tries
   to run COPY on it.

2) If there is a schema with functions, but nothing else,
   then when later converting it to an extension by adding the
   functions and the schema to the extension, a later drop
   of that extension fails to remove the schema.
   A proper CREATE EXT + DROP EXT works fine.

To reproduce, checkout 'master' branch and go directly
to module directory:

$ git clone --recursive git://github.com/markokr/skytools.git
$ cd skytools

1) $ cd sql/pgq && make install installcheck
..
pg_dump regression  test.dump
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  cannot copy from sequence
batch_id_seq
pg_dump: The command was: COPY pgq.batch_id_seq  TO stdout;


2) $ cd sql/pgq_coop && make install installcheck
..
 create extension pgq_coop from 'unpackaged';
 drop extension pgq_coop;
 create extension pgq_coop;
 ERROR:  schema pgq_coop already exists

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-16 Thread Marko Kreen
On Mon, Jul 16, 2012 at 4:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 Now, looking at the problem with some perspective, the solution
 is obvious: when in single-row mode, the PQgetResult() must return
 proper PGresult for that single row.  And everything else follows that.

 Such API is implemented in attached patch:

 I'm starting to look at this patch now.  I think we could drop the
 PQgetRowData() API: it complicates matters for little gain that I can
 see.  The argument for it was to avoid the cost of creating a PGresult
 per row, but we're already going to pay the cost of creating a
 PGresult in order to return the PGRES_SINGLE_TUPLE status.

No.  Please look again, it is supposed to be called instead of PQgetResult().

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-16 Thread Marko Kreen
On Mon, Jul 16, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Mon, Jul 16, 2012 at 4:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm starting to look at this patch now.  I think we could drop the
 PQgetRowData() API: it complicates matters for little gain that I can
 see.  The argument for it was to avoid the cost of creating a PGresult
 per row, but we're already going to pay the cost of creating a
 PGresult in order to return the PGRES_SINGLE_TUPLE status.

 No.  Please look again, it is supposed to be called instead of PQgetResult().

 Mm.  I still think we should drop it, because it's still a dangerous API
 that's not necessary for the principal benefit of this feature.

Yes, it is a secondary feature, but it fits the needs of the actual target
audience of the single-row feature - various high-level wrappers of libpq.

Also it is needed for high-performance situations, where
single-row mode fits well even for C clients, except that the
advantage is negated by the new malloc-per-row overhead.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-16 Thread Marko Kreen
On Mon, Jul 16, 2012 at 6:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Marko Kreen mark...@gmail.com writes:
 On Mon, Jul 16, 2012 at 4:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Mm.  I still think we should drop it, because it's still a dangerous API
 that's not necessary for the principal benefit of this feature.

 Yes, it is a secondary feature, but it fits the needs of the actual target
 audience of the single-row feature - various high-level wrappers of libpq.

 Also it is needed for high-performance situations, where the
 single-row-mode fits well even for C clients, except the
 advantage is negated by new malloc-per-row overhead.

 Absolutely no evidence has been presented that there's any useful
 performance gain to be had there.  Moreover, if there were, we could
 probably work a bit harder at making PGresult creation cheaper, rather
 than having to expose a dangerous API.

Ok, I'm more interested in primary feature,
so no more objections from me.

-- 
marko



Re: [HACKERS] libpq compression

2012-06-20 Thread Marko Kreen
On Wed, Jun 20, 2012 at 10:05 PM, Florian Pflug f...@phlo.org wrote:
 I'm starting to think that relying on SSL/TLS for compression of
 unencrypted connections might not be such a good idea after all. We'd
 be using the protocol in a way it quite clearly never was intended to
 be used...

Maybe, but what is the argument that we should avoid
encryption+compression at the same time?

AES is quite lightweight compared to compression, so it should
be no problem in situations where you care about compression.

RSA is noticeable, but only for short connections.
Thus easily solvable with connection pooling.

And for really special compression needs you can always
create a UDF that does custom compression for you.

So what exactly is the situation we need to solve
with postgres-specific protocol compression?

-- 
marko



Re: [HACKERS] [PATCH 04/16] Add embedded list interface (header only)

2012-06-19 Thread Marko Kreen
On Wed, Jun 13, 2012 at 2:28 PM, Andres Freund and...@2ndquadrant.com wrote:
 +/*
 + * removes a node from a list
 + * Attention: O(n)
 + */
 +static inline void ilist_s_remove(ilist_s_head *head,
 +                                  ilist_s_node *node)
 +{
 +       ilist_s_node *last = head->head;
 +       ilist_s_node *cur;
 +#ifndef NDEBUG
 +       bool found = false;
 +#endif
 +       while ((cur = last->next))
 +       {
 +               if (cur == node)
 +               {
 +                       last->next = cur->next;
 +#ifndef NDEBUG
 +                       found = true;
 +#endif
 +                       break;
 +               }
 +               last = cur;
 +       }
 +       assert(found);
 +}

This looks weird.

In a cyclic list, removal is:

  node->prev->next = node->next;
  node->next->prev = node->prev;

And that's it.

-- 
marko



Re: [HACKERS] [PATCH 04/16] Add embedded list interface (header only)

2012-06-19 Thread Marko Kreen
On Tue, Jun 19, 2012 at 11:02 PM, Andres Freund and...@2ndquadrant.com wrote:
 On Tuesday, June 19, 2012 09:59:48 PM Marko Kreen wrote:
 On Wed, Jun 13, 2012 at 2:28 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  +/*
  + * removes a node from a list
  + * Attention: O(n)
  + */
  +static inline void ilist_s_remove(ilist_s_head *head,
  +                                  ilist_s_node *node)
  +{
  +       ilist_s_node *last = head->head;
  +       ilist_s_node *cur;
  +#ifndef NDEBUG
  +       bool found = false;
  +#endif
  +       while ((cur = last->next))
  +       {
  +               if (cur == node)
  +               {
  +                       last->next = cur->next;
  +#ifndef NDEBUG
  +                       found = true;
  +#endif
  +                       break;
  +               }
  +               last = cur;
  +       }
  +       assert(found);
  +}

 This looks weird.

 In a cyclic list, removal is:

   node->prev->next = node->next;
   node->next->prev = node->prev;

 And thats it.
 That's the singly linked list, not the doubly linked one.  That's why it has an
 O(n) warning tacked on...

Oh, you have several list implementations there.
Sorry, I was just browsing and it caught my eye.

-- 
marko



Re: [HACKERS] [PATCH 10/16] Introduce the concept that wal has a 'origin' node

2012-06-19 Thread Marko Kreen
On Mon, Jun 18, 2012 at 6:35 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 13 June 2012 19:28, Andres Freund and...@2ndquadrant.com wrote:
 This adds a new configuration parameter multimaster_node_id which determines
 the id used for wal originating in one cluster.

 Looks good and it seems this aspect at least is commitable in this CF.

 Design decisions I think we need to review are

 * Naming of field. I think origin is the right term, borrowing from Slony.

I have not read too deeply here, so maybe I am missing
some important detail, but the idea that users need
to coordinate an integer config parameter globally does not
sound too attractive to me.

Why not limit integers to local storage only and map
them to string idents on config, UI and transport?

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Marko Kreen
On Sat, Jun 16, 2012 at 7:58 PM, Marko Kreen mark...@gmail.com wrote:
 So my preference would be to simply remove the callback API
 but keep the processing and provide PQgetRowData() instead.

This is implemented in the attached patch.  It also
converts dblink to use the single-row API.

The patch should be applied on top of the previous
single-row patch.

Both can be seen also here:

  https://github.com/markokr/postgres/commits/single-row

-- 
marko


remove-rowproc.diff.gz
Description: GNU Zip compressed data



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-17 Thread Marko Kreen
On Sun, Jun 17, 2012 at 2:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I prefer the description of Marko's API than the one we have now.

 Adopting one API in 9.2 and another in 9.3 would be fairly bad.
 Perhaps we can have both?

I see no reason to keep the (public) callback API around,
except if we don't bother to remove it now.

 Can we see a performance test? Add a row processor API to libpq for
 better handling of large result sets. So idea is we do this many,
 many times so we need to double check the extra overhead is not a
 problem in cases where the dumping overhead is significant.

Not sure what you want to performance test.

PQgetRowData() uses exactly the same pipeline
that the callbacks used.  It will use a few more C calls;
not sure it makes sense to benchmark them.

The recent dblink change replaced the palloc() + copy +
zero-termination dance with PQgetResult(), which
does the malloc() + copy dance internally.  This malloc
vs. palloc might be benchmarkable, but it seems to
go into micro-benchmarking territory, as the big win came
from avoiding buffering rows.  So yeah, maybe using
PQgetRowData() might be a tiny bit faster, but I don't
expect much difference.

But all this affects new users only.  The thing that affects
everybody was the 2-step row processing change
that was done during the rowproc patch.

I did benchmark it, and it seems there are column-size
+ column-count patterns where the new way is faster,
and some patterns where the old way is faster.  But the
difference did not rise above test noise, so I concluded
it is insignificant and the malloc+copy avoidance is worth it.

Of course, any additional benchmarking is welcome, so feel
free to pick any situation you care about.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-16 Thread Marko Kreen
Demos:

https://github.com/markokr/libpq-rowproc-demos/blob/master/demo-onerow-sync.c
https://github.com/markokr/libpq-rowproc-demos/blob/master/demo-onerow-async.c

Few clarifications below.

On Fri, Jun 15, 2012 at 9:21 PM, Marko Kreen mark...@gmail.com wrote:
 Now, looking at the problem with some perspective, the solution
 is obvious: when in single-row mode, the PQgetResult() must return
 proper PGresult for that single row.  And everything else follows that.

 Such API is implemented in attached patch:

 * PQsetSingleRowMode(conn): sets single-row mode.

The function can be called only after PQsend* and before any
rows have arrived.  This guarantees there will be no surprises
for PQexec* users who expect the full resultset at once.  Also it
guarantees that the user will process the resultset with a PQgetResult()
loop, either sync or async.  The next PQexec/PQsend call will
reset the flag.  So it is active only for the duration of processing
results from one command.
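To make the intended call pattern concrete, a synchronous consumer under this proposal would look roughly like the following sketch. This is not runnable on its own (it needs a live connection); `conn` is an established PGconn, and `die`, `handle_row`, `report_error`, and the query text are placeholders:

```c
PGresult *res;

if (!PQsendQuery(conn, "SELECT id, payload FROM big_table"))
	die(conn, "PQsendQuery");
if (!PQsetSingleRowMode(conn))		/* must be called before any rows arrive */
	die(conn, "PQsetSingleRowMode");

while ((res = PQgetResult(conn)) != NULL)
{
	switch (PQresultStatus(res))
	{
	case PGRES_SINGLE_TUPLE:
		/* exactly one row in this PGresult; process it */
		handle_row(res);
		break;
	case PGRES_TUPLES_OK:
		/* zero-row result marking successful end of the resultset */
		break;
	default:
		/* query failed mid-stream; rows seen so far should be dropped */
		report_error(res);
		break;
	}
	PQclear(res);
}
```

The async variant is the same loop guarded by PQconsumeInput() + PQisBusy(), as described above.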

Currently it returns FALSE if called in the wrong place, and does
nothing.  The only question I see here is whether it should set an
error state on the connection or not.  That does not seem to be an
improvement.

 * PQgetRowData(): can be called instead of PQgetResult() to get raw row data
  in a buffer, for more efficient processing.  This is an optional feature
  that provides the original row-callback promise of avoiding an unnecessary
  row data copy.

 * Although PQgetRowData() makes the callback API unnecessary, it is still
  fully compatible with it - the callback should not see any difference
  whether the resultset is processed in single-row mode or
  the old single-PGresult mode.  Unless it wants to - it can check
  PGRES_TUPLES_OK vs. PGRES_SINGLE_TUPLE.

The PQgetResult() is compatible with callbacks, the PQgetRowData()
bypasses them.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-06-16 Thread Marko Kreen
On Sat, Jun 16, 2012 at 6:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I guess this raises the question of whether we ought to revert the
 row-callback patch entirely and support only this approach.  IMO
 it is (barely) not too late to do that for 9.2, if we want to.
 If we don't want to, then this is just another new feature and
 should be considered for 9.3.

I think the row-callback is a dangerous API that does not solve any
important problems.

But I do like the 2-phase processing the rowproc patch introduced
and having a way to bypass unnecessary malloc()+copy.

So my preference would be to simply remove the callback API
but keep the processing and provide PQgetRowData() instead.

Although the win that it brings is significantly smaller thanks
to single-row PQgetResult().  So if it does not sound interesting
to others, it can be dropped.  Because single-row processing
is the important feature we need, the rest is extra.

-- 
marko



[HACKERS] [patch] libpq one-row-at-a-time API

2012-06-15 Thread Marko Kreen
The row-processor API is now in 9.2, but it solves only the
different-row-storage problem, not the one-row-at-a-time
problem, as libpq is still in control until all rows are received.

This means libpq still cannot be used to implement the iterative
result processing that almost all high-level languages are using.

We discussed a potential API for fetching a single row at a time,
but did not reach a conclusion.  Basic arguments were:

1) Tom: PQisBusy must keep current behaviour.  Thus also PQgetResult()
   must keep current behaviour:
   * PQisBusy() -> 0: need to call PQgetResult(), which returns PGresult
   * PQisBusy() -> 1: need to call PQconsumeInput()
   * PQisBusy() must be callable several times in a row, thus be
     stateless from the client's POV.

2) Me: behaviour must not be controlled by a callback, but by the client code
   that uses PQgetResult() + PQisBusy().

Now, looking at the problem with some perspective, the solution
is obvious: when in single-row mode, PQgetResult() must return a
proper PGresult for that single row.  And everything else follows from that.

Such API is implemented in attached patch:

* PQsetSingleRowMode(conn): sets single-row mode.

* PQisBusy(): stops after each row in single-row mode, sets PGASYNC_ROW_READY.
  Thus keeping the property of being repeatedly callable.

* PQgetResult(): returns a copy of the row if PGASYNC_ROW_READY.  Sets the row's
  resultStatus to PGRES_SINGLE_TUPLE.  This needs to be different from
  PGRES_TUPLES_OK to detect resultset end.

* PQgetRowData(): can be called instead of PQgetResult() to get raw row data
  in a buffer, for more efficient processing.  This is an optional feature
  that provides the original row-callback promise of avoiding an unnecessary
  row data copy.

* Although PQgetRowData() makes the callback API unnecessary, it is still
  fully compatible with it - the callback should not see any difference
  whether the resultset is processed in single-row mode or
  the old single-PGresult mode.  Unless it wants to - it can check
  PGRES_TUPLES_OK vs. PGRES_SINGLE_TUPLE.

There is some duplicate code here that can be refactored (callback exec),
but I did not do it yet to avoid affecting existing code too much.

-- 
marko

PS. If you squint, it seems like a fix of the existing API instead of a new feature,
so perhaps it can still fit into 9.2?

commit 4114613 (HEAD, single-row)
Author: Marko Kreen mark...@gmail.com
Date:   Sat Apr 7 15:05:01 2012 +0300

Single-row based processing

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5c5dd68..0ea2c1f 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -4018,6 +4018,75 @@ PGresult *PQgetResult(PGconn *conn);
   </note>
  </listitem>
 </varlistentry>
+
+<varlistentry id="libpq-pqsetsinglerowmode">
+ <term>
+  <function>PQsetSingleRowMode</function>
+  <indexterm>
+   <primary>PQsetSingleRowMode</primary>
+  </indexterm>
+ </term>
+
+ <listitem>
+  <para>
+   Instead of buffering all rows in a <structname>PGresult</structname>
+   until the full resultset has arrived, this changes resultset processing
+   to return rows as soon as they arrive from the network.
+
+<synopsis>
+int PQsetSingleRowMode(PGconn *conn);
+</synopsis>
+  </para>
+
+  <para>
+   The mode can be changed directly after a
+   <function>PQsendQuery</function>,
+   <function>PQsendQueryParams</function>, or
+   <function>PQsendQueryPrepared</function> call, and before
+   any result rows have arrived from the network.  Then this function
+   changes the mode and returns 1.  Otherwise the mode stays unchanged
+   and this function returns 0.
+  </para>
+
+  <para>
+   The rows returned have a PQresultStatus() of <literal>PGRES_SINGLE_TUPLE</literal>.
+   There will be a final PGresult that has either <literal>PGRES_TUPLES_OK</literal>
+   or <literal>PGRES_FATAL_ERROR</literal> result status.  In case
+   of an error status, the actual query failed in the middle and the
+   received rows should be dropped.
+  </para>
+
+ </listitem>
+</varlistentry>
+
+<varlistentry id="libpq-pqgetrowdata">
+ <term>
+  <function>PQgetRowData</function>
+  <indexterm>
+   <primary>PQgetRowData</primary>
+  </indexterm>
+ </term>
+
+ <listitem>
+  <para>
+   In single-row mode it is possible to get row data directly,
+   without constructing a <structname>PGresult</structname> for
+   each row.
+
+<synopsis>
+int PQgetRowData(PGconn *conn, PGresult **hdr, PGdataValue **columns);
+</synopsis>
+  </para>
+
+  <para>
+   It can be called everywhere <function>PQgetResult</function> can.
+   It returns 1 and fills the pointers if row data is available.
+   It returns 0 otherwise.  Then <function>PQgetResult</function>
+   should be called to get the final status.
+  </para>
+ </listitem>
+</varlistentry>
+
    </variablelist>
   </para>
 
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1251455..a228a71 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces
