date:20140421

Hi,

On 2014-04-20 11:24:38 +0200, Palle Girgensohn wrote:
 I see performance degradation with PostgreSQL 9.3 vs 9.2 on FreeBSD, and I'm 
 wondering who to poke to mitigate the problem. In reference to this thread 
 [1], who where the FreeBSD people that Francois mentioned? If mmap needs to 
 perform well in the kernel, I'd like to know of someone with FreeBSD kernel 
 knowledge who is interested in working with mmap perfocmance. If mmap is 
 indeed the cuplrit, I've just tested 9.2.8 vs 9.3.4, I nevere isolated the 
 mmap patch, although I believe Francois did just that with similar results.

If there are indeed such large regressions on FreeBSD we need to treat
them as postgres regressions. It's nicer not to add config options for
things that don't need it, but apparently that's not the case here.

Imo this means we need to add GUC to control wether anon mmap() or sysv
shmem is to be used. In 9.3.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] New functions for sslinfo extension

2014-04-21 Thread Dmitry Voronin

Hello, I make an a patch, which adds 4 functions to sslinfo extension module.  You can view some information of certificate's extensions with those functions. The descriptions of functions I posted in my first letter and in source code. What do you think about it? -- Best regards, Dmitry Voronin --- contrib/sslinfo/sslinfo.c	2014-03-17 23:35:47.0 +0400
+++ contrib/sslinfo/sslinfo.c	2014-04-18 11:09:49.567775647 +0400
@@ -5,6 +5,8 @@
  * This file is distributed under BSD-style license.
  *
  * contrib/sslinfo/sslinfo.c
+ * 
+ * Extension functions written by Dmitry Voronin carriingfat...@yandex.ru, CNIIEISU.
  */
 
 #include postgres.h
@@ -14,9 +16,11 @@
 #include miscadmin.h
 #include utils/builtins.h
 #include mb/pg_wchar.h
+#include funcapi.h
 
 #include openssl/x509.h
 #include openssl/asn1.h
+#include openssl/x509v3.h
 
 
 PG_MODULE_MAGIC;
@@ -35,6 +39,11 @@
 Datum		X509_NAME_to_text(X509_NAME *name);
 Datum		ASN1_STRING_to_text(ASN1_STRING *str);
 
+X509_EXTENSION	*get_extension(X509* certificate, char *name);
+Datum 		ssl_get_extension_value(PG_FUNCTION_ARGS);
+Datum		ssl_is_critical_extension(PG_FUNCTION_ARGS);
+Datum 		ssl_get_count_of_extensions(PG_FUNCTION_ARGS);
+Datum		ssl_get_extension_names(PG_FUNCTION_ARGS);
 
 /*
  * Indicates whether current session uses SSL
@@ -371,3 +380,146 @@
 		PG_RETURN_NULL();
 	return X509_NAME_to_text(X509_get_issuer_name(MyProcPort-peer));
 }
+
+
+X509_EXTENSION *get_extension(X509* certificate, char *name) {
+	int 			extension_nid = 0;
+	int 			locate = 0;
+	
+	extension_nid = OBJ_sn2nid(name);
+	if (extension_nid == NID_undef) {
+		extension_nid = OBJ_ln2nid(name);
+		if (extension_nid == NID_undef) 
+			return NULL;
+	}
+	locate = X509_get_ext_by_NID(certificate, extension_nid,  -1);
+	return X509_get_ext(certificate, locate);
+}
+
+/* Returns value of extension. 
+ * 
+ * This function returns value of extension by short name in client certificate. 
+ * 
+ * Returns text datum. 
+ */
+
+PG_FUNCTION_INFO_V1(ssl_get_extension_value);
+Datum
+ssl_get_extension_value(PG_FUNCTION_ARGS) {	
+	X509 			*certificate = MyProcPort - peer;
+	X509_EXTENSION 		*extension = NULL;
+	char 			*extension_name = text_to_cstring(PG_GETARG_TEXT_P(0));
+	BIO 			*bio = NULL;
+	char 			*value = NULL;
+	char 			nullterm = '\0';
+	text 			*result = NULL;
+
+	if (certificate == NULL)
+		PG_RETURN_NULL();
+
+	extension = get_extension(certificate, extension_name);
+	if (extension == NULL)
+		elog(ERROR, Extension by name \%s\ is not found in certificate, extension_name);
+
+	bio = BIO_new(BIO_s_mem());
+	X509V3_EXT_print(bio, extension, -1, -1);
+	BIO_write(bio, nullterm, 1);
+	BIO_get_mem_data(bio, value);
+
+	result = cstring_to_text(value);
+	BIO_free(bio);
+
+	PG_RETURN_TEXT_P(result);
+}
+
+/* Returns status of extension 
+ * 
+ * Returns true, if extension is critical and false, if it is not.
+ * 
+ * Returns bool datum
+ */
+PG_FUNCTION_INFO_V1(ssl_is_critical_extension);
+Datum
+ssl_is_critical_extension(PG_FUNCTION_ARGS) {
+	X509 			*certificate = MyProcPort - peer;
+	X509_EXTENSION 		*extension = NULL;
+	char 			*extension_name = text_to_cstring(PG_GETARG_TEXT_P(0));
+	int 			critical = 0;
+	
+	if (certificate == NULL)
+		PG_RETURN_NULL();
+	
+	extension = get_extension(certificate, extension_name);
+	if (extension == NULL) 
+		elog(ERROR, Extension name \%s\ is not found in certificate, extension_name);
+	
+	critical = X509_EXTENSION_get_critical(extension);
+	PG_RETURN_BOOL(critical);
+}
+
+/* Returns count of extensions in client certificate
+ * 
+ * Returns int datum
+ */
+PG_FUNCTION_INFO_V1(ssl_get_count_of_extensions);
+Datum
+ssl_get_count_of_extensions(PG_FUNCTION_ARGS) {
+	X509 			*certificate = MyProcPort - peer;
+	
+	if (certificate == NULL)
+		PG_RETURN_NULL();
+	
+	PG_RETURN_INT32(X509_get_ext_count(certificate));
+}
+
+/* Returns short names of extensions in client certificate
+ * 
+ * Returns setof text datum
+ */
+PG_FUNCTION_INFO_V1(ssl_get_extension_names);
+Datum
+ssl_get_extension_names(PG_FUNCTION_ARGS) {
+	X509*certificate = MyProcPort - peer;
+	FuncCallContext 		*funcctx;
+	STACK_OF(X509_EXTENSION) 	*extension_stack = NULL;
+	MemoryContext 			oldcontext;
+	int call = 0;
+	int max_calls = 0;
+	X509_EXTENSION			*extension = NULL;
+	ASN1_OBJECT			*object = NULL;
+	int extension_nid = 0;
+	text*result = NULL;
+	
+	if (certificate == NULL)
+		PG_RETURN_NULL();
+	
+	extension_stack = certificate - cert_info - extensions;
+	if (extension_stack == NULL) 
+		PG_RETURN_NULL();
+	
+	if (SRF_IS_FIRSTCALL()) {
+		funcctx = SRF_FIRSTCALL_INIT();
+		oldcontext = MemoryContextSwitchTo(funcctx - multi_call_memory_ctx);
+		funcctx - max_calls = X509_get_ext_count(certificate);
+		MemoryContextSwitchTo(oldcontext);
+	}
+	funcctx = SRF_PERCALL_SETUP();
+	
+	call = funcctx - call_cntr;
+	max_calls = funcctx - max_calls;
+	
+	if (call  max_calls) {
+		extension = sk_X509_EXTENSION_value(extension_stack, call);
+		object =

Re: [HACKERS] Re: [DOCS] Docs incorrectly claiming equivalence between show and pg_settings

2014-04-21 Thread Stefan Seifert

On Saturday 19 April 2014 13:38:16 Tom Lane wrote:

  3  document this restriction
 
 As for (3), I might be wrong, but I don't think the documentation mentions
 the possibility of abusing SET this way at all.  Restrictions in
 undocumented quasi-features are likewise undocumented.

http://www.postgresql.org/docs/devel/static/runtime-config-custom.html
states: Because custom options may need to be set in processes that have not 
loaded the relevant extension module, PostgreSQL will accept a setting for any 
two-part parameter name.
So I'd say the possibility of using SET this way is somewha documented. That's 
at least how I actually got the idea of doing that.

Maybe it can help if I describe our use case: we have some plperlu stored 
procedures and triggers that use them that replicate specific data from our 
internal management application's database to the system on our webserver. 
When running the application's test suite though we obviously don't want that 
and instead want it to replicate to another local database, so we can test the 
triggers as well. We use a custom option set by our test suite to communicate 
the required change in replication target to the stored procedure. As a nice 
side effect we can use the same mechanism to prevent a test sandbox version of 
the application from trying to replicate to the webserver.

So far this works quite well except for the SHOW command throwing an exception 
when we request an unknown configuration parameter complicating our stored 
procedure a little. It would be more comfortable if we could query pg_settings 
and just get an empty result if the configuration parameter is not set.

Regards,
Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] New functions for sslinfo extension

2014-04-21 Thread Воронин Дмитрий

Hello, I make an a patch, which adds 4 functions to sslinfo extension module:1) ssl_get_count_of_extensions() --- get count of X509v3 extensions from client certificate;2) ssl_get_extension_names() --- get short names of X509v3 extensions from client certificate;3) ssl_get_extension_value(text) --- get value of extension from certificate (argument --- short name of extension);4) ssl_is_critical_extension(text) --- returns true, if extension is critical and false, if is not (argument --- short name of extension). You can view some information of certificate's extensions via those functions.I want, that my functions will be included in PostgreSQL release. What do you think about it? -- Best regards, Dmitry Voronin

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 4:10 AM, Andres Freund wrote:

Hi,

On 2014-04-20 11:24:38 +0200, Palle Girgensohn wrote:

I see performance degradation with PostgreSQL 9.3 vs 9.2 on FreeBSD, and I'm 
wondering who to poke to mitigate the problem. In reference to this thread [1], 
who where the FreeBSD people that Francois mentioned? If mmap needs to perform 
well in the kernel, I'd like to know of someone with FreeBSD kernel knowledge 
who is interested in working with mmap perfocmance. If mmap is indeed the 
cuplrit, I've just tested 9.2.8 vs 9.3.4, I nevere isolated the mmap patch, 
although I believe Francois did just that with similar results.

If there are indeed such large regressions on FreeBSD we need to treat
them as postgres regressions. It's nicer not to add config options for
things that don't need it, but apparently that's not the case here.

Imo this means we need to add GUC to control wether anon mmap() or sysv
shmem is to be used. In 9.3.

Greetings,

Andres Freund

Andres, thank you.  Speaking as a FreeBSD developer that would be a good 
idea.


-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Andres Freund and...@2ndquadrant.com writes:
 If there are indeed such large regressions on FreeBSD we need to treat
 them as postgres regressions. It's nicer not to add config options for
 things that don't need it, but apparently that's not the case here.

 Imo this means we need to add GUC to control wether anon mmap() or sysv
 shmem is to be used. In 9.3.

I will resist this mightily.  One of the main reasons to switch to mmap
was so we would no longer have to explain about SysV shm configuration.
Are we going to still have to explain that, but only for FreeBSD?
No thanks.  It will certainly not be the *first* resort.  Instead,
somebody needs to hold the FreeBSD folks' feet to the fire about when
we can expect to see a fix from their side.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  If there are indeed such large regressions on FreeBSD we need to treat
  them as postgres regressions. It's nicer not to add config options for
  things that don't need it, but apparently that's not the case here.
 
  Imo this means we need to add GUC to control wether anon mmap() or sysv
  shmem is to be used. In 9.3.
 
 I will resist this mightily.  One of the main reasons to switch to mmap
 was so we would no longer have to explain about SysV shm configuration.

It's still explained in the docs and one of the dynshm implementations
is based on sysv shmem. So I don't see this as a convincing reason.

Regressing installed OSs by 15-20% just to save a couple of lines of
docs and code seems rather unconvincing to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

Noah Misch n...@leadboat.com writes:
 I wonder how it would work out to instead delay this new detoast effort until
 toast_insert_or_update().

That would require toast_insert_or_update() to know about every container
datatype.  I doubt it could lead to an extensible or maintainable
solution.

I'm actually planning to set this patch on the shelf for a bit and go
investigate the other alternative, ie, not generating composite Datums
containing toast pointers in the first place.  We now know that the idea
that we aren't going to take performance hits *somewhere* is an illusion,
and I still suspect that the other way is going to lead to a smaller and
cleaner patch.  The main performance downside for plpgsql might be
addressable by making sure that plpgsql record variables fall on the heap
tuple rather than the composite Datum side of the line.  I'm also quite
concerned about correctness: I don't have a lot of confidence that this
patch has closed every loophole with respect to arrays, and it hasn't even
touched ranges or any of the related one-off bugs that I believe exist.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

Hi,

On 2014-04-20 15:38:23 -0400, Tom Lane wrote:
 Some poking around with oprofile says that much of the added time is
 coming from added syscache and typcache lookups, as I suspected.
 
 I did some further work on the patch to make it possible to cache
 the element tuple descriptor across calls for these two functions.
 With the attached patch, the first test case comes down to about 335 ms
 and the second to about 1475 ms (plpgsql is still leaving some extra
 cache lookups on the table).  More could be done with some further API
 changes, but I think this is about as much as is safe to back-patch.

I unfortunately haven't followed this in detail, but shouldn't it be
relatively easily to make this even cheaper by checking for
HEAP_HASEXTERNAL?  If it's not set we don't need to iterate over the
composite's columns, right?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

Andres Freund and...@2ndquadrant.com writes:
 I unfortunately haven't followed this in detail, but shouldn't it be
 relatively easily to make this even cheaper by checking for
 HEAP_HASEXTERNAL?  If it's not set we don't need to iterate over the
 composite's columns, right?

That's the point I made further down: we could do that if we were willing
to abandon the principle that nested fields shouldn't be compressed.
It's not very clear what it'd cost us to give that up.  (Too bad we didn't
define a HEAP_HASCOMPRESSED flag bit ...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Magnus Hagander

On Mon, Apr 21, 2014 at 4:51 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   If there are indeed such large regressions on FreeBSD we need to treat
   them as postgres regressions. It's nicer not to add config options for
   things that don't need it, but apparently that's not the case here.
 
   Imo this means we need to add GUC to control wether anon mmap() or sysv
   shmem is to be used. In 9.3.
 
  I will resist this mightily.  One of the main reasons to switch to mmap
  was so we would no longer have to explain about SysV shm configuration.

 It's still explained in the docs and one of the dynshm implementations
 is based on sysv shmem. So I don't see this as a convincing reason.

 Regressing installed OSs by 15-20% just to save a couple of lines of
 docs and code seems rather unconvincing to me.


There's also the fact that even if it's changed in FreeBSD, that might be
somethign that takes years to trickle out to whatever stable release people
are actually using.

But do we really want a *guc* for it though? Isn't it enough (and in fact
better) with a configure switch to pick the implementation when multiple
are available, that could then be set by default for example by the freebsd
ports build? That's a lot less overhead to keep dragging around...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

On 2014-04-21 11:30:57 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I unfortunately haven't followed this in detail, but shouldn't it be
  relatively easily to make this even cheaper by checking for
  HEAP_HASEXTERNAL?  If it's not set we don't need to iterate over the
  composite's columns, right?
 
 That's the point I made further down:

Oh, sorry. I started reading this thread from the end just now.

 we could do that if we were willing
 to abandon the principle that nested fields shouldn't be compressed.
 It's not very clear what it'd cost us to give that up.

I don't think the cost of that would be all that high. As you argue,
without that trick the cost of iterating over all columns will be paid
all the time, whereas double compression will take effect much less
often. And might even end up being beneficial.
The risk of significant performance regressions due to backpatching
seems significantly less likely if we pay heed to HASEXTERNAL.

 (Too bad we didn't define a HEAP_HASCOMPRESSED flag bit ...)

And too bad that infomask bits are so scarce :(. We really need to
reclaim HEAP_MOVED_OFF and HEAP_MOVED_IN.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Hi,

On 2014-04-21 17:39:39 +0200, Magnus Hagander wrote:
 But do we really want a *guc* for it though? Isn't it enough (and in fact
 better) with a configure switch to pick the implementation when multiple
 are available, that could then be set by default for example by the freebsd
 ports build? That's a lot less overhead to keep dragging around...

Well, we don't know at all it's just freebsd that's affected. In fact, I
would be surprised if there aren't other platforms that regressed due to
this.
I think a configure switch actually ends up being more code than the GUC...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 11:39 AM, Magnus Hagander wrote:
On Mon, Apr 21, 2014 at 4:51 PM, Andres Freund and...@2ndquadrant.com 
mailto:and...@2ndquadrant.com wrote:


On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com
mailto:and...@2ndquadrant.com writes:
  If there are indeed such large regressions on FreeBSD we need
to treat
  them as postgres regressions. It's nicer not to add config
options for
  things that don't need it, but apparently that's not the case
here.

  Imo this means we need to add GUC to control wether anon
mmap() or sysv
  shmem is to be used. In 9.3.

 I will resist this mightily.  One of the main reasons to switch
to mmap
 was so we would no longer have to explain about SysV shm
configuration.

It's still explained in the docs and one of the dynshm implementations
is based on sysv shmem. So I don't see this as a convincing reason.

Regressing installed OSs by 15-20% just to save a couple of lines of
docs and code seems rather unconvincing to me.


There's also the fact that even if it's changed in FreeBSD, that might 
be somethign that takes years to trickle out to whatever stable 
release people are actually using.


But do we really want a *guc* for it though? Isn't it enough (and in 
fact better) with a configure switch to pick the implementation when 
multiple are available, that could then be set by default for example 
by the freebsd ports build? That's a lot less overhead to keep 
dragging around...





That seems to make more sense. I can't imagine why this would be a 
runtime parameter as opposed to build time.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Hi,

On 2014-04-21 11:45:49 -0400, Andrew Dunstan wrote:
 That seems to make more sense. I can't imagine why this would be a runtime
 parameter as opposed to build time.

Because that implies that packagers and porters need to make that
decision. If it's a GUC people can benchmark it and decide.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 8:45 AM, Andrew Dunstan wrote:


On 04/21/2014 11:39 AM, Magnus Hagander wrote:
On Mon, Apr 21, 2014 at 4:51 PM, Andres Freund 
and...@2ndquadrant.com mailto:and...@2ndquadrant.com wrote:


On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com
mailto:and...@2ndquadrant.com writes:
  If there are indeed such large regressions on FreeBSD we need
to treat
  them as postgres regressions. It's nicer not to add config
options for
  things that don't need it, but apparently that's not the case
here.

  Imo this means we need to add GUC to control wether anon
mmap() or sysv
  shmem is to be used. In 9.3.

 I will resist this mightily.  One of the main reasons to switch
to mmap
 was so we would no longer have to explain about SysV shm
configuration.

It's still explained in the docs and one of the dynshm 
implementations

is based on sysv shmem. So I don't see this as a convincing reason.

Regressing installed OSs by 15-20% just to save a couple of lines of
docs and code seems rather unconvincing to me.


There's also the fact that even if it's changed in FreeBSD, that 
might be somethign that takes years to trickle out to whatever stable 
release people are actually using.


But do we really want a *guc* for it though? Isn't it enough (and in 
fact better) with a configure switch to pick the implementation when 
multiple are available, that could then be set by default for example 
by the freebsd ports build? That's a lot less overhead to keep 
dragging around...





That seems to make more sense. I can't imagine why this would be a 
runtime parameter as opposed to build time.


I am unsure of the true overhead of making this a runtime tunable so 
pardon if I'm asking for a lot.  From the perspective of both an OS 
developer and postgresql user (I am both) it really makes more sense to 
have it a runtime tunable for the following reasons:


From an OS developer making this a runtime allows us to much more 
easily do the testing (instead of needing two compiled versions).
From a sysadmin perspective it makes switching to/from a LOT easier in 
case the new mmap code exposes a stability or performance bug.


-Alfred






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 8:58 AM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-04-21 11:45:49 -0400, Andrew Dunstan wrote:

That seems to make more sense. I can't imagine why this would be a runtime
parameter as opposed to build time.

Because that implies that packagers and porters need to make that
decision. If it's a GUC people can benchmark it and decide.

As against that, the packager would be more likely to get it right
(or even to know that there's an issue).


Can the package builder not set the default for the runtime tunable?

Honestly we're about to select a db platform for another FreeBSD based 
system we are building, I strongly hoping that we can get back to 
sysvshm easily otherwise we may have to select another store.


-Alfred (who still remembers back when Tom had a login on our primary db 
to help us. :) )




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Andres Freund and...@2ndquadrant.com writes:
 On 2014-04-21 11:45:49 -0400, Andrew Dunstan wrote:
 That seems to make more sense. I can't imagine why this would be a runtime
 parameter as opposed to build time.

 Because that implies that packagers and porters need to make that
 decision. If it's a GUC people can benchmark it and decide.

As against that, the packager would be more likely to get it right
(or even to know that there's an issue).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

* Alfred Perlstein (alf...@freebsd.org) wrote:
 Can the package builder not set the default for the runtime tunable?

Yeah, I was thinking about that also, but at least in this case it seems
pretty clear that the 'right' answer is known at build time.

 Honestly we're about to select a db platform for another FreeBSD
 based system we are building, I strongly hoping that we can get back
 to sysvshm easily otherwise we may have to select another store.

Is there no hope of this getting fixed in FreeBSD..?  PG wouldn't be the
only application impacted by this, I'm sure...

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On 2014-04-21 11:58:10 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-04-21 11:45:49 -0400, Andrew Dunstan wrote:
  That seems to make more sense. I can't imagine why this would be a runtime
  parameter as opposed to build time.
 
  Because that implies that packagers and porters need to make that
  decision. If it's a GUC people can benchmark it and decide.
 
 As against that, the packager would be more likely to get it right
 (or even to know that there's an issue).

I sure hope that FreeBSD is going to fix this at some point (it's surely
affecting more than just postgres). But since we (and probably the
packagers) don't know which platforms it's going to affect the only
thing we could do would be to add a configure switch. To test people
would need to recompile postgres.
I don't understand what the problem with a GUC here is. It's a pretty
simple patch and that codepath is entered only once, so performance
surely can't be an argument.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-21 Thread Merlin Moncure

On Mon, Apr 21, 2014 at 10:40 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-04-21 11:30:57 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I unfortunately haven't followed this in detail, but shouldn't it be
  relatively easily to make this even cheaper by checking for
  HEAP_HASEXTERNAL?  If it's not set we don't need to iterate over the
  composite's columns, right?

 That's the point I made further down:

 Oh, sorry. I started reading this thread from the end just now.

 we could do that if we were willing
 to abandon the principle that nested fields shouldn't be compressed.
 It's not very clear what it'd cost us to give that up.

 I don't think the cost of that would be all that high.

I think it's pretty reasonable too.

 And too bad that infomask bits are so scarce :(. We really need to
 reclaim HEAP_MOVED_OFF and HEAP_MOVED_IN.

The only consequence of that is losing support for in-place update for
pre-9.0 (of which the only supported version is 8.4).  I figure it's
also pretty reasonable to drop support for IPU for out of support
versions for new versions going forward.  That would recover the bits
and yield some nice cleanups in tqual.c.

8.4 is scheduled to go out of support in July, so if you agree with my
reasoning 9.4 would be a candidate for not honoring 8.4 upgrade
support.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

All,

More on this:

1) I've confirmed at the 2nd site that the issue doesn't happen if
pg_stat_statements.so is not loaded.  So this seems to be confirmation
that either auto_explain, pg_stat_statements, or both need to be loaded
(but not necessarily created as extensions) in order to have the issue.

2) I've been unable to reproduce the issue using a naive test case, but
I'll keep trying.

3) The XID information is interesting.

a) I've confirmed that this is a case of having multiple row versions
rather than allowing a duplicate PK to be inserted.

b) the end result of this corruption is XIDs which go backwards:

  xmin   |  xmax   |   ctid| step_id | workitem_id | status
-+-+---+-+-+-
 3362707 | 3362707 | (6177,31) |   1 |  446469 | pending
 3362710 | 3362707 | (6177,32) |   1 |  446469 | working

 5520953 | 5520953 | (5064,105) |   1 |  727946 | pending
 5520954 | 5520953 | (5064,108) |   1 |  727946 | working

What's additionally problematic is that the current snapshot minxid is
in the 900 range, so it's not clear why any of the above rows are
visible at all.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

* Josh Berkus (j...@agliodbs.com) wrote:
 1) I've confirmed at the 2nd site that the issue doesn't happen if
 pg_stat_statements.so is not loaded.  So this seems to be confirmation
 that either auto_explain, pg_stat_statements, or both need to be loaded
 (but not necessarily created as extensions) in order to have the issue.

Interesting...

 2) I've been unable to reproduce the issue using a naive test case, but
 I'll keep trying.

Thanks.

 3) The XID information is interesting.
 
 a) I've confirmed that this is a case of having multiple row versions
 rather than allowing a duplicate PK to be inserted.
 
 b) the end result of this corruption is XIDs which go backwards:
 
   xmin   |  xmax   |   ctid| step_id | workitem_id | status
 -+-+---+-+-+-
  3362707 | 3362707 | (6177,31) |   1 |  446469 | pending
  3362710 | 3362707 | (6177,32) |   1 |  446469 | working
 
  5520953 | 5520953 | (5064,105) |   1 |  727946 | pending
  5520954 | 5520953 | (5064,108) |   1 |  727946 | working
 
 What's additionally problematic is that the current snapshot minxid is
 in the 900 range, so it's not clear why any of the above rows are
 visible at all.

Can you get the infomask bits..?  What's does pg_controldata report wrt
the MultiXid's?

THanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 9:13 AM, Stephen Frost wrote:

* Alfred Perlstein (alf...@freebsd.org) wrote:

Can the package builder not set the default for the runtime tunable?

Yeah, I was thinking about that also, but at least in this case it seems
pretty clear that the 'right' answer is known at build time.


Honestly we're about to select a db platform for another FreeBSD
based system we are building, I strongly hoping that we can get back
to sysvshm easily otherwise we may have to select another store.

Is there no hope of this getting fixed in FreeBSD..?  PG wouldn't be the
only application impacted by this, I'm sure...
There is definitely hope, however changes to the FreeBSD vm are taken as 
seriously as changes to core changes to Postresql's store. In addition 
changes to vm is somewhat in the realm of complexity of Postgresql store 
as well so it may not be coming in the next few days/weeks, but rather a 
month or two.  I am not sure if an easy fix is available in FreeBSD but 
we will see in short order.


I need to do some research.  I work with Adrian (FreeBSD kernel dev 
mentioned earlier in the thread), I'll grab him today and discuss what 
the issue may be.


-Alfred



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 11:59 AM, Alfred Perlstein wrote:

On 4/21/14 8:45 AM, Andrew Dunstan wrote:


On 04/21/2014 11:39 AM, Magnus Hagander wrote:
On Mon, Apr 21, 2014 at 4:51 PM, Andres Freund 
and...@2ndquadrant.com mailto:and...@2ndquadrant.com wrote:


On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com
mailto:and...@2ndquadrant.com writes:
  If there are indeed such large regressions on FreeBSD we need
to treat
  them as postgres regressions. It's nicer not to add config
options for
  things that don't need it, but apparently that's not the case
here.

  Imo this means we need to add GUC to control wether anon
mmap() or sysv
  shmem is to be used. In 9.3.

 I will resist this mightily.  One of the main reasons to switch
to mmap
 was so we would no longer have to explain about SysV shm
configuration.

It's still explained in the docs and one of the dynshm 
implementations

is based on sysv shmem. So I don't see this as a convincing reason.

Regressing installed OSs by 15-20% just to save a couple of 
lines of

docs and code seems rather unconvincing to me.


There's also the fact that even if it's changed in FreeBSD, that 
might be somethign that takes years to trickle out to whatever 
stable release people are actually using.


But do we really want a *guc* for it though? Isn't it enough (and in 
fact better) with a configure switch to pick the implementation when 
multiple are available, that could then be set by default for 
example by the freebsd ports build? That's a lot less overhead to 
keep dragging around...





That seems to make more sense. I can't imagine why this would be a 
runtime parameter as opposed to build time.


I am unsure of the true overhead of making this a runtime tunable so 
pardon if I'm asking for a lot.  From the perspective of both an OS 
developer and postgresql user (I am both) it really makes more sense 
to have it a runtime tunable for the following reasons:


From an OS developer making this a runtime allows us to much more 
easily do the testing (instead of needing two compiled versions).
From a sysadmin perspective it makes switching to/from a LOT easier in 
case the new mmap code exposes a stability or performance bug.





1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only add 
them when we must. The more there are the more we confuse users. If a 
packager can pick a default surely they can pick build options too.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Francois Tigeot

On Mon, Apr 21, 2014 at 05:43:46PM +0200, Andres Freund wrote:
 
 On 2014-04-21 17:39:39 +0200, Magnus Hagander wrote:
  But do we really want a *guc* for it though? Isn't it enough (and in fact
  better) with a configure switch to pick the implementation when multiple
  are available, that could then be set by default for example by the freebsd
  ports build? That's a lot less overhead to keep dragging around...
 
 Well, we don't know at all it's just freebsd that's affected. In fact, I
 would be surprised if there aren't other platforms that regressed due to
 this.

The only platforms affected are the BSDs.

I ran (or tried to run) Pgbench on the four major ones during the last two
years. My experience so far:

- FreeBSD: has trouble scaling; there's some interest to improve things but
  nobody has done anything about it yet

- NetBSD: crashes under load; this could have been fixed but when I ran the
  benchmarks in 2012 none of the developers seemed to care.

- OpenBSD: couldn't run the benchmark; for some reason this operating system
  is unable to mmap() 32GB of memory on a recent PC with 128GB of RAM.

- DragonFly: scales better than everything else out there even with mmap()

Given these results I doubt reintroducing SysV shm memory use on PostgreSQL
is worthwile; most platforms where it has a performance impact have much
bigger issues to fix first.

-- 
Francois Tigeot


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 9:24 AM, Andrew Dunstan wrote:


On 04/21/2014 11:59 AM, Alfred Perlstein wrote:

On 4/21/14 8:45 AM, Andrew Dunstan wrote:


On 04/21/2014 11:39 AM, Magnus Hagander wrote:
On Mon, Apr 21, 2014 at 4:51 PM, Andres Freund 
and...@2ndquadrant.com mailto:and...@2ndquadrant.com wrote:


On 2014-04-21 10:45:24 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com
mailto:and...@2ndquadrant.com writes:
  If there are indeed such large regressions on FreeBSD we need
to treat
  them as postgres regressions. It's nicer not to add config
options for
  things that don't need it, but apparently that's not the case
here.

  Imo this means we need to add GUC to control wether anon
mmap() or sysv
  shmem is to be used. In 9.3.

 I will resist this mightily.  One of the main reasons to switch
to mmap
 was so we would no longer have to explain about SysV shm
configuration.

It's still explained in the docs and one of the dynshm 
implementations
is based on sysv shmem. So I don't see this as a convincing 
reason.


Regressing installed OSs by 15-20% just to save a couple of 
lines of

docs and code seems rather unconvincing to me.


There's also the fact that even if it's changed in FreeBSD, that 
might be somethign that takes years to trickle out to whatever 
stable release people are actually using.


But do we really want a *guc* for it though? Isn't it enough (and 
in fact better) with a configure switch to pick the implementation 
when multiple are available, that could then be set by default for 
example by the freebsd ports build? That's a lot less overhead to 
keep dragging around...





That seems to make more sense. I can't imagine why this would be a 
runtime parameter as opposed to build time.


I am unsure of the true overhead of making this a runtime tunable so 
pardon if I'm asking for a lot.  From the perspective of both an OS 
developer and postgresql user (I am both) it really makes more sense 
to have it a runtime tunable for the following reasons:


From an OS developer making this a runtime allows us to much more 
easily do the testing (instead of needing two compiled versions).
From a sysadmin perspective it makes switching to/from a LOT easier 
in case the new mmap code exposes a stability or performance bug.





1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only add 
them when we must. The more there are the more we confuse users. If a 
packager can pick a default surely they can pick build options too.
Thank you for the lecture Andrew!  Really pleasant way to treat a user 
and a fan of the system. :)




-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

* Alfred Perlstein (alf...@freebsd.org) wrote:
 There is definitely hope, however changes to the FreeBSD vm are
 taken as seriously as changes to core changes to Postresql's store.
 In addition changes to vm is somewhat in the realm of complexity of
 Postgresql store as well so it may not be coming in the next few
 days/weeks, but rather a month or two.  I am not sure if an easy fix
 is available in FreeBSD but we will see in short order.

This has been known for over a year.. :(

 I need to do some research.  I work with Adrian (FreeBSD kernel dev
 mentioned earlier in the thread), I'll grab him today and discuss
 what the issue may be.

Hopefully that'll get things moving in the right direction, finally..

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 12:25 PM, Alfred Perlstein wrote:






1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only 
add them when we must. The more there are the more we confuse users. 
If a packager can pick a default surely they can pick build options too.
Thank you for the lecture Andrew!  Really pleasant way to treat a user 
and a fan of the system. :)





I confess to being mightily confused.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4


 Can you get the infomask bits..?  What's does pg_controldata report wrt
 the MultiXid's?

Can't get the infomask bits.

pg_controldata attached, with some redactions.  Unfortunately, it
appears that they've continued to do tests on this system, so the XID
counter has advanced somewhat.

pg_control version number:937
Catalog version number:   {redacted}
Database system identifier:   {redacted}
Database cluster state:   in production
pg_control last modified: Mon 21 Apr 2014 04:28:30 PM UTC
Latest checkpoint location:   8/C8B05F98
Prior checkpoint location:8/C663A4B0
Latest checkpoint's REDO location:8/C6DC9A10
Latest checkpoint's REDO WAL file:0003000800C6
Latest checkpoint's TimeLineID:   3
Latest checkpoint's PrevTimeLineID:   3
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:  0/11214375
Latest checkpoint's NextOID:  32769
Latest checkpoint's NextMultiXactId:  28939
Latest checkpoint's NextMultiOffset:  58004
Latest checkpoint's oldestXID:674
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  11214375
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint:Mon 21 Apr 2014 04:21:02 PM UTC
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline:   0
Backup start location:0/0
Backup end location:  0/0
End-of-backup record required:no
Current wal_level setting:hot_standby
Current max_connections setting:  100
Current max_prepared_xacts setting:   0
Current max_locks_per_xact setting:   64


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 9:34 AM, Stephen Frost wrote:

* Alfred Perlstein (alf...@freebsd.org) wrote:

There is definitely hope, however changes to the FreeBSD vm are
taken as seriously as changes to core changes to Postresql's store.
In addition changes to vm is somewhat in the realm of complexity of
Postgresql store as well so it may not be coming in the next few
days/weeks, but rather a month or two.  I am not sure if an easy fix
is available in FreeBSD but we will see in short order.

This has been known for over a year.. :(

I know!  I remember warning y'all about it back at pgcon last year. :)



I need to do some research.  I work with Adrian (FreeBSD kernel dev
mentioned earlier in the thread), I'll grab him today and discuss
what the issue may be.

Hopefully that'll get things moving in the right direction, finally..
Sure, to be fair, we are under the gun here for a product, it may just 
mean that the end result of that conversation is mysql.


I'm hoping we can use Postgresql as I've been a huge fan since 1999.  I 
based my first successful project on it and had a LOT of help from the 
pgsql community, Tom, Bruce and we even contracted Vadim for some work 
on incremental vacuums!


-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-21 Thread Alvaro Herrera

Merlin Moncure wrote:
 On Mon, Apr 21, 2014 at 10:40 AM, Andres Freund and...@2ndquadrant.com 
 wrote:

  And too bad that infomask bits are so scarce :(. We really need to
  reclaim HEAP_MOVED_OFF and HEAP_MOVED_IN.
 
 The only consequence of that is losing support for in-place update for
 pre-9.0 (of which the only supported version is 8.4).  I figure it's
 also pretty reasonable to drop support for IPU for out of support
 versions for new versions going forward.  That would recover the bits
 and yield some nice cleanups in tqual.c.

There's no way to be sure that the bits have been removed from tables
that were upgraded from a 8.4 server to a 9.0 server and from there to a
newer server.  We'd need some way to catalogue which tables have been
cleaned prior to an upgrade to a release that no longer accepts those
bits; pg_upgrade would then say table xyz needs to be vacuumed before
the upgrade in check mode, or something like that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 9:38 AM, Andrew Dunstan wrote:


On 04/21/2014 12:25 PM, Alfred Perlstein wrote:






1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only 
add them when we must. The more there are the more we confuse users. 
If a packager can pick a default surely they can pick build options 
too.
Thank you for the lecture Andrew!  Really pleasant way to treat a 
user and a fan of the system. :)





I confess to being mightily confused.


Sure, to clarify:

Andrew, you just told someone who in a db stack sits both below (as a 
pgsql user 15 years) and above (as a FreeBSD kernel dev 15 years) your 
software what they really need.


-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 12:44 PM, Alfred Perlstein wrote:

On 4/21/14 9:38 AM, Andrew Dunstan wrote:


On 04/21/2014 12:25 PM, Alfred Perlstein wrote:






1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only 
add them when we must. The more there are the more we confuse 
users. If a packager can pick a default surely they can pick build 
options too.
Thank you for the lecture Andrew!  Really pleasant way to treat a 
user and a fan of the system. :)





I confess to being mightily confused.


Sure, to clarify:

Andrew, you just told someone who in a db stack sits both below (as a 
pgsql user 15 years) and above (as a FreeBSD kernel dev 15 years) your 
software what they really need.






I told you what *we* (i.e. the PostgreSQL community) need, IMNSHO (and 
speaking as a Postgres developer and consultant of 10 or so years standing).


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On 2014-04-21 09:42:06 -0700, Alfred Perlstein wrote:
 Sure, to be fair, we are under the gun here for a product, it may just mean
 that the end result of that conversation is mysql.

Personally arguments in that vain are removing just about any incentive
I have to work on the problem.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Alvaro Herrera

Alfred Perlstein wrote:

 I am unsure of the true overhead of making this a runtime tunable so
 pardon if I'm asking for a lot.  From the perspective of both an
 OS developer and postgresql user (I am both) it really makes more
 sense to have it a runtime tunable for the following reasons:
 
 From an OS developer making this a runtime allows us to much more
 easily do the testing (instead of needing two compiled versions).
 From a sysadmin perspective it makes switching to/from a LOT easier
 in case the new mmap code exposes a stability or performance bug.

In this case, AFAICS the only overhead of a runtime option (what we call
a GUC) is the added potential for user confusion, and the necessary
documentation.  If we instead go for a compile-time option, both items
become smaller.

In any case, I don't see that there's much need for a runtime option,
really; you already know that the mmap code path is slower in FreeBSD.
You only need to benchmark both options once the FreeBSD vm code itself
is fixed, right?

In fact, it might not even need to be a configure option; I would
suggest a pg_config_manual.h setting instead, and perhaps tweaks to the
src/template/freebsd file to enable it automatically on the broken
FreeBSD releases.  We could then, in the future, have the template
itself turn the option off for the future FreeBSD release that fixes the
problem.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

Josh Berkus j...@agliodbs.com writes:
 1) I've confirmed at the 2nd site that the issue doesn't happen if
 pg_stat_statements.so is not loaded.  So this seems to be confirmation
 that either auto_explain, pg_stat_statements, or both need to be loaded
 (but not necessarily created as extensions) in order to have the issue.

I looked a little bit for possible interactions between those two modules,
and almost immediately noticed that pg_stat_statements is not being polite
enough to call any previous occupant of the post_parse_analyze_hook.
That's not the bug we're looking for, since auto_explain does not use
post_parse_analyze_hook, but it's a bug nonetheless.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14 4:10 AM, Andres Freund wrote:

Hi,

On 2014-04-20 11:24:38 +0200, Palle Girgensohn wrote:

I see performance degradation with PostgreSQL 9.3 vs 9.2 on FreeBSD, and I'm 
wondering who to poke to mitigate the problem. In reference to this thread [1], 
who where the FreeBSD people that Francois mentioned? If mmap needs to perform 
well in the kernel, I'd like to know of someone with FreeBSD kernel knowledge 
who is interested in working with mmap perfocmance. If mmap is indeed the 
cuplrit, I've just tested 9.2.8 vs 9.3.4, I nevere isolated the mmap patch, 
although I believe Francois did just that with similar results.

If there are indeed such large regressions on FreeBSD we need to treat
them as postgres regressions. It's nicer not to add config options for
things that don't need it, but apparently that's not the case here.

Imo this means we need to add GUC to control wether anon mmap() or sysv
shmem is to be used. In 9.3.

Greetings,

Andres Freund

Andres, thank you.  Speaking as a FreeBSD developer that would be a good 
idea.


--
Alfred Perlstein



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 9:51 AM, Andrew Dunstan wrote:


On 04/21/2014 12:44 PM, Alfred Perlstein wrote:

On 4/21/14 9:38 AM, Andrew Dunstan wrote:


On 04/21/2014 12:25 PM, Alfred Perlstein wrote:






1. OS developers are not the target audience for GUCs. If the OS 
developers want to test and can't be botherrd with building with a 
couple of different parameters then I'm not very impressed.


2. We should be trying to get rid of GUCs where possible, and only 
add them when we must. The more there are the more we confuse 
users. If a packager can pick a default surely they can pick build 
options too.
Thank you for the lecture Andrew!  Really pleasant way to treat a 
user and a fan of the system. :)





I confess to being mightily confused.


Sure, to clarify:

Andrew, you just told someone who in a db stack sits both below (as a 
pgsql user 15 years) and above (as a FreeBSD kernel dev 15 years) 
your software what they really need.






I told you what *we* (i.e. the PostgreSQL community) need, IMNSHO (and 
speaking as a Postgres developer and consultant of 10 or so years 
standing).


How high on the hierarchy of PostgreSQL's needs is making a single 
option a tunable versus compile time thing?  I mean seriously you mean 
to stick on this one point when one of your users are asking you about 
this?   That is pretty concerning to me.


-Alfred





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Alvaro Herrera

Alfred Perlstein wrote:

 How high on the hierarchy of PostgreSQL's needs is making a single
 option a tunable versus compile time thing?  I mean seriously you
 mean to stick on this one point when one of your users are asking
 you about this?   That is pretty concerning to me.

I think the sticking point here is that the problem affects a single
platform, and it can easily be construed as a platform bug.  For
problems that affect PostgreSQL as a whole for everybody, we hesitate a
lot less when it comes to creating new runtime options.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 9:51 AM, Andres Freund wrote:

On 2014-04-21 09:42:06 -0700, Alfred Perlstein wrote:

Sure, to be fair, we are under the gun here for a product, it may just mean
that the end result of that conversation is mysql.

Personally arguments in that vain are removing just about any incentive
I have to work on the problem.


I was just explaining that we have a timeline over here and while that 
may disincentive you for providing what we need it would be very unfair.


In that I mean sometimes the reality of a situation can be inconvenient 
and for that I do apologize.


What I am seeing here is unfortunately a very strong departure from 
FreeBSD support by the community from several of the developers.  In 
fact over drinks at pgcon last year there were a TON of jokes making fun 
of FreeBSD users and developers which I took in stride as professional 
joking with alcohol involved.  I thought it was pretty funny.  However a 
year later and I realize that there appears to be a real problem with 
FreeBSD in the pgsql community.


There are other Linux centric dbs to pick from.  If pgsql is just 
another Linux centric DB then that is unfortunate but something I can 
deal with.


-Alfred



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

* Alfred Perlstein (alf...@freebsd.org) wrote:
 How high on the hierarchy of PostgreSQL's needs is making a single
 option a tunable versus compile time thing?  I mean seriously you
 mean to stick on this one point when one of your users are asking
 you about this?   That is pretty concerning to me.

Seriously, we do care that the system is easy to use for both admins and
end users and part of how we do that is by minimizing the number of
tunable options because they add to confusion and increase the
difficulty to manage the system- look at certain other $expensive
RDBMS's if you're unsure about that.

Far better is to work out the *correct* solution to a given problem
rather than punt'ing on it and asking the (almost uniformly
under-informed user) to tell us what to do.

And, yes, while we're interested in our user's input, we do not add new
configuration variables because one user asked us to.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 9:52 AM, Alvaro Herrera wrote:

Alfred Perlstein wrote:


I am unsure of the true overhead of making this a runtime tunable so
pardon if I'm asking for a lot.  From the perspective of both an
OS developer and postgresql user (I am both) it really makes more
sense to have it a runtime tunable for the following reasons:

 From an OS developer making this a runtime allows us to much more
easily do the testing (instead of needing two compiled versions).
 From a sysadmin perspective it makes switching to/from a LOT easier
in case the new mmap code exposes a stability or performance bug.

In this case, AFAICS the only overhead of a runtime option (what we call
a GUC) is the added potential for user confusion, and the necessary
documentation.  If we instead go for a compile-time option, both items
become smaller.

In any case, I don't see that there's much need for a runtime option,
really; you already know that the mmap code path is slower in FreeBSD.
You only need to benchmark both options once the FreeBSD vm code itself
is fixed, right?

In fact, it might not even need to be a configure option; I would
suggest a pg_config_manual.h setting instead, and perhaps tweaks to the
src/template/freebsd file to enable it automatically on the broken
FreeBSD releases.  We could then, in the future, have the template
itself turn the option off for the future FreeBSD release that fixes the
problem.

That is correct, until you're in prod and suddenly one option becomes 
unstable, or you want to try a quick kernel patch without rebooting.


Look, this is an argument I've lost time and time again in open source 
software communities, the idea of a software option as opposed to 
compile time really seems to hit people the wrong way.


I think from now on it just makes sense to sit back and let whatever 
happens happen.


-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:
 On 4/21/14, 9:51 AM, Andres Freund wrote:
 On 2014-04-21 09:42:06 -0700, Alfred Perlstein wrote:
 Sure, to be fair, we are under the gun here for a product, it may just mean
 that the end result of that conversation is mysql.
 Personally arguments in that vain are removing just about any incentive
 I have to work on the problem.
 
 I was just explaining that we have a timeline over here and while
 that may disincentive you for providing what we need it would be
 very unfair.

I'm pretty sure Andres was referring to the part where there's a
'threat' to move to some other platform due to a modest performance
degredation, as if it's the only factor involved in making a decision
among the various RDBMS options.  If that's really your deciding
criteria instead of the myriad of other factors, I daresay you have your
priorities mixed up.

 There are other Linux centric dbs to pick from.  If pgsql is just
 another Linux centric DB then that is unfortunate but something I
 can deal with.

These attacks really aren't going to get you anywhere.  We're talking
about a specific performance issue that FreeBSD has and how much PG
(surely not the only application impacted by this issue) should bend
to address it, even though the FreeBSD folks were made aware of the
issue over year ago and have done nothing to address it.

Moreover, you'd like to also define the way we deal with the issue as
being to make it runtime configurable rather than as a compile-time
option, even though 90% of the users out there won't understand the
difference nor would know how to correctly set it (and, in many cases,
may end up making the wrong decision because it's the default for
other platforms, unless we add more code to address this at initdb
time).

Basically, it doesn't sound like you're terribly concerned with the
majority of our user base, even on FreeBSD, and would prefer to try
and browbeat us into doing what you've decided is the correct solution
because it'd work better for you.

I've been guiltly of the same in the past and it's not fun having to
back off from a proposal when it's pointed out that there's a better
option, particularly when it doesn't seem like the alternative is
better for me, but that's just part of working in any large project.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] assertion failure 9.3.4

Hi,

I spent the last two hours poking arounds in the environment Andrew
provided and I was able to reproduce the issue, find a assert to
reproduce it much faster and find a possible root cause.

Since the symptom of the problem seem to be multixacts with more than
one updating xid, I added a check to MultiXactIdCreateFromMembers()
preventing that. That requires to move ISUPDATE_from_mxstatus() to a
header, but I think we should definitely add such a assert.

As it turns out the problem is in the
else if (result == HeapTupleBeingUpdated  wait)
branch in (at least) heap_update(). When the problem is hit the
to-be-updated tuple originally has HEAP_XMIN_COMMITTED |
HEAP_XMAX_LOCK_ONLY | HEAP_XMAX_KEYSHR_LOCK set. So we release the
buffer lock, acquire the tuple lock, and reacquire the buffer lock. But
inbetween the locking backend has actually updated the tuple.
The code tries to protect against that with:
/*
 * recheck the locker; if someone else changed the tuple while
 * we weren't looking, start over.
 */
if ((oldtup.t_data-t_infomask  HEAP_XMAX_IS_MULTI) ||
!TransactionIdEquals(
HeapTupleHeaderGetRawXmax(oldtup.t_data),
 xwait))
goto l2;

can_continue = true;
locker_remains = true;

and similar. The problem is that in Andrew's case the infomask changes
from 0x2192 to 0x2102 (i.e. it's a normal update afterwards), while xmax
stays the same. Ooops.
A bit later there's:
result = can_continue ? HeapTupleMayBeUpdated : HeapTupleUpdated;
So, from thereon we happily continue to update the tuple, thinking
there's no previous updater. Which obviously causes problems.

I've hacked^Wfixed this by changing the infomask test above into
infomask != oldtup.t_data-t_infomask in a couple of places. That seems
to be sufficient to survive the testcase a couple of times.

I am too hungry right now to think about a proper fix for this and
whether there's further problematic areas.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

Andres Freund and...@2ndquadrant.com writes:
 I spent the last two hours poking arounds in the environment Andrew
 provided and I was able to reproduce the issue, find a assert to
 reproduce it much faster and find a possible root cause.

Hmm ... is this the same thing Josh is reporting?  If so, why the
apparent connection with use of pg_stat_statements?  Maybe that
just changes the timing to make the race condition more likely?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 11:14 AM, Stephen Frost wrote:

Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:

On 4/21/14, 9:51 AM, Andres Freund wrote:

On 2014-04-21 09:42:06 -0700, Alfred Perlstein wrote:

Sure, to be fair, we are under the gun here for a product, it may just mean
that the end result of that conversation is mysql.

Personally arguments in that vain are removing just about any incentive
I have to work on the problem.

I was just explaining that we have a timeline over here and while
that may disincentive you for providing what we need it would be
very unfair.

I'm pretty sure Andres was referring to the part where there's a
'threat' to move to some other platform due to a modest performance
degredation, as if it's the only factor involved in making a decision
among the various RDBMS options.  If that's really your deciding
criteria instead of the myriad of other factors, I daresay you have your
priorities mixed up.


There are other Linux centric dbs to pick from.  If pgsql is just
another Linux centric DB then that is unfortunate but something I
can deal with.

These attacks really aren't going to get you anywhere.  We're talking
about a specific performance issue that FreeBSD has and how much PG
(surely not the only application impacted by this issue) should bend
to address it, even though the FreeBSD folks were made aware of the
issue over year ago and have done nothing to address it.

Moreover, you'd like to also define the way we deal with the issue as
being to make it runtime configurable rather than as a compile-time
option, even though 90% of the users out there won't understand the
difference nor would know how to correctly set it (and, in many cases,
may end up making the wrong decision because it's the default for
other platforms, unless we add more code to address this at initdb
time).

Basically, it doesn't sound like you're terribly concerned with the
majority of our user base, even on FreeBSD, and would prefer to try
and browbeat us into doing what you've decided is the correct solution
because it'd work better for you.

I've been guiltly of the same in the past and it's not fun having to
back off from a proposal when it's pointed out that there's a better
option, particularly when it doesn't seem like the alternative is
better for me, but that's just part of working in any large project.

Stephen, please calm down on the hyperbole, seriously, picking another 
db is not an attack.


I was simply asking for a feature that would make my life easier as both 
an admin deploying postgresql and a kernel dev attempting to fix a 
problem.  I'm one guy, probably the only guy right now asking.  Honestly 
the thought of needing to compile two versions of postgresql to do sysv 
vs mmap performance would take me more time than I would like to devote 
to the issue when my time is already limited.


Again, it was an ask, you are free to do what you like, the same way you 
were free to ignore my advice at pgcon about mmap being less efficient.  
It does not make what I'm saying an attack.  Just like when 
interviewing people choosing a different candidate for a job is not an 
attack on the other candidates!


-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

On 04/21/2014 12:26 PM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 I spent the last two hours poking arounds in the environment Andrew
 provided and I was able to reproduce the issue, find a assert to
 reproduce it much faster and find a possible root cause.
 
 Hmm ... is this the same thing Josh is reporting?  If so, why the
 apparent connection with use of pg_stat_statements?  Maybe that
 just changes the timing to make the race condition more likely?

Well, in Andrew's case it only happens if auto_explain is loaded.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4


On 04/21/2014 03:26 PM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

I spent the last two hours poking arounds in the environment Andrew
provided and I was able to reproduce the issue, find a assert to
reproduce it much faster and find a possible root cause.

Hmm ... is this the same thing Josh is reporting?  If so, why the
apparent connection with use of pg_stat_statements?  Maybe that
just changes the timing to make the race condition more likely?





That's my theory. Josh's case is very similar indeed to mine.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

On 2014-04-21 15:26:03 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I spent the last two hours poking arounds in the environment Andrew
  provided and I was able to reproduce the issue, find a assert to
  reproduce it much faster and find a possible root cause.
 
 Hmm ... is this the same thing Josh is reporting?  If so, why the
 apparent connection with use of pg_stat_statements?  Maybe that
 just changes the timing to make the race condition more likely?

I am not sure. I could only reproduce the problem with both
stat_statements and auto_explain loaded, but I didn't try very
long. It's pretty timing sensitive, so I could very well imagine that
some additional overhead makes it easier to reproduce.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:
 Stephen, please calm down on the hyperbole, seriously, picking
 another db is not an attack.

Perhaps it wasn't intended by you, but to those of us reading it, your
comments came across clearly as if you don't fix this, then we
won't use PG.  Email certainly isn't perfect but I hope you can
understand our confusion- why else bring up MySQL or any other database
if not to try and push us to do what you want?  It's hard for me to see
how bringing up other databases or making comments about us being
Linux-only are anything other than attempts to persude by threating
that we might lose a user or users.

 I was simply asking for a feature that would make my life easier as
 both an admin deploying postgresql and a kernel dev attempting to
 fix a problem.  I'm one guy, probably the only guy right now asking.
 Honestly the thought of needing to compile two versions of
 postgresql to do sysv vs mmap performance would take me more time
 than I would like to devote to the issue when my time is already
 limited.

That's certainly unfortunate.  For my 2c, I'd recommend that you write a
minimal implementation that allows you to test just the sysv-vs-mmap
case (which could certainly take an option, to avoid having to
recompile during testing), or even ask if anyone here already has; I
wouldn't be at all surprised if both Robert and Francois did exactly
that already, nor would I be surprised if someone volunteered to write
such a small C utility for you, if it meant that this issue would be
fixed in FreeBSD that much sooner.

Asking for help to address the FreeBSD performance would have been much
better received.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On 2014-04-21 15:47:31 -0400, Stephen Frost wrote:
 That's certainly unfortunate.  For my 2c, I'd recommend that you write a
 minimal implementation that allows you to test just the sysv-vs-mmap
 case (which could certainly take an option, to avoid having to
 recompile during testing), or even ask if anyone here already has;

I don't think that's something all that easily testable in
isolation. The behaviour here is heavily related to concurrency.

 I
 wouldn't be at all surprised if both Robert and Francois did exactly
 that already, nor would I be surprised if someone volunteered to write
 such a small C utility for you, if it meant that this issue would be
 fixed in FreeBSD that much sooner.

I don't know, but the patch for a guc would be  10 lines. I think I'd
start with that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-04-21 15:47:31 -0400, Stephen Frost wrote:
  That's certainly unfortunate.  For my 2c, I'd recommend that you write a
  minimal implementation that allows you to test just the sysv-vs-mmap
  case (which could certainly take an option, to avoid having to
  recompile during testing), or even ask if anyone here already has;
 
 I don't think that's something all that easily testable in
 isolation. The behaviour here is heavily related to concurrency.

Concurrency is not terribly hard to generate in a simulated case; I
still feel that doing this independently of PG would probably be better
than involving all the rest of the PG code in testing something this
low-level.

  I
  wouldn't be at all surprised if both Robert and Francois did exactly
  that already, nor would I be surprised if someone volunteered to write
  such a small C utility for you, if it meant that this issue would be
  fixed in FreeBSD that much sooner.
 
 I don't know, but the patch for a guc would be  10 lines. I think I'd
 start with that.

Certainly running a minimally patched PG wouldn't be hard for a kernel
dev either, of course.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] assertion failure 9.3.4

On 2014-04-21 12:31:09 -0700, Josh Berkus wrote:
 On 04/21/2014 12:26 PM, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
  I spent the last two hours poking arounds in the environment Andrew
  provided and I was able to reproduce the issue, find a assert to
  reproduce it much faster and find a possible root cause.
  
  Hmm ... is this the same thing Josh is reporting?  If so, why the
  apparent connection with use of pg_stat_statements?  Maybe that
  just changes the timing to make the race condition more likely?
 
 Well, in Andrew's case it only happens if auto_explain is loaded.

Josh, how long does it take you to reproduce the issue? And can you
reproduce it outside of a production environment?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 12:47 PM, Stephen Frost wrote:
  Asking for help to address the FreeBSD performance would have been 
much better received. Thanks, Stephen 


That is exactly what I did, I asked for a version of postgresql that was 
easy to switch at runtime between two behaviors.


That would make it a LOT easier to run a few scripts and make sure I got 
the correct binary without having to munge PREFIX and a bunch of PATH 
and other tools to get my test harness to DTRT.


-Alfred



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Bruce Momjian

On Mon, Apr 21, 2014 at 01:52:48PM -0700, Alfred Perlstein wrote:
 
 On 4/21/14, 12:47 PM, Stephen Frost wrote:
   Asking for help to address the FreeBSD performance would have
 been much better received. Thanks, Stephen
 
 That is exactly what I did, I asked for a version of postgresql that
 was easy to switch at runtime between two behaviors.
 
 That would make it a LOT easier to run a few scripts and make sure I
 got the correct binary without having to munge PREFIX and a bunch of
 PATH and other tools to get my test harness to DTRT.

I think the big point is that you must realize that we are dealing with
thousands of users, so making a suggestion without considering its
impact on those thousands of people is not helpful.

We have clearly stated the need to consider those thousands of users,
and you are still saying the same thing --- this would make it easy for
me.  This is not helpful to the discussion, and you must realize that
at some level.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Joshua D. Drake



On 04/21/2014 10:39 AM, Alfred Perlstein wrote:


What I am seeing here is unfortunately a very strong departure from
FreeBSD support by the community from several of the developers.  In
fact over drinks at pgcon last year there were a TON of jokes making fun
of FreeBSD users and developers which I took in stride as professional
joking with alcohol involved.  I thought it was pretty funny.  However a
year later and I realize that there appears to be a real problem with
FreeBSD in the pgsql community.


The reality is, FreeBSD is like Saab (before it died, and no I am not 
suggesting that FreeBSD is dying). Saab was a niche, very cool 
automobile that offered a lot of unique features that others didn't. 
However, they didn't sell very well in the states but had a very devoted 
fan base (myself included).


FreeBSD is awesome. There is no question about that. It certainly has a 
better license than Linux and has offered things for years that Linux 
has never really gotten right (jails/zones).


That said, FreeBSD is niche and Linux is, not. Linux is the king of the 
jungle in this world. Whether we want it to be or not and what that 
means is: that is where the resources go.


If the community had more *BSD presence I think it would be great but it 
isn't all that viable at this point. I do know however that no-one in 
this community would turn down a team of FreeBSD advocates helping us 
make PostgreSQL awesome for PostgreSQL.




There are other Linux centric dbs to pick from.  If pgsql is just


No, there is about 1 and its derivatives thereof. If you want the type 
of features pgsql offers, then pgsql is all you have got.


Sincerely,

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4


 Josh, how long does it take you to reproduce the issue? 

A couple hours.

 And can you
 reproduce it outside of a production environment?

Not yet, still working on that.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:
 On 4/21/14, 12:47 PM, Stephen Frost wrote:
   Asking for help to address the FreeBSD performance would have
 been much better received. Thanks, Stephen
 
 That is exactly what I did, I asked for a version of postgresql that
 was easy to switch at runtime between two behaviors.
 
 That would make it a LOT easier to run a few scripts and make sure I
 got the correct binary without having to munge PREFIX and a bunch of
 PATH and other tools to get my test harness to DTRT.

I'm sure one of the hackers would be happy to provide you with a patch
to help you with your testing.

That's quite a different thing from asking for a GUC to be provided and
then supported over the next 5 years as part of the core release, which
is what I believe we all thought you were asking for.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Bruce Momjian

On Mon, Apr 21, 2014 at 02:08:51PM -0700, Joshua Drake wrote:
 If the community had more *BSD presence I think it would be great
 but it isn't all that viable at this point. I do know however that
 no-one in this community would turn down a team of FreeBSD advocates
 helping us make PostgreSQL awesome for PostgreSQL.

I don't think we would even implement a run-time control for Linux or
Windows for this, so I don't even think it is a FreeBSD issue.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 4/21/14, 2:23 PM, Stephen Frost wrote:

Alfred,

* Alfred Perlstein (alf...@freebsd.org) wrote:

On 4/21/14, 12:47 PM, Stephen Frost wrote:

  Asking for help to address the FreeBSD performance would have
been much better received. Thanks, Stephen

That is exactly what I did, I asked for a version of postgresql that
was easy to switch at runtime between two behaviors.

That would make it a LOT easier to run a few scripts and make sure I
got the correct binary without having to munge PREFIX and a bunch of
PATH and other tools to get my test harness to DTRT.

I'm sure one of the hackers would be happy to provide you with a patch
to help you with your testing.

That would be fine.

That's quite a different thing from asking for a GUC to be provided and
then supported over the next 5 years as part of the core release, which
is what I believe we all thought you were asking for.
I did not know that GUCs were not classified into 
experimental/non-experimental.  The fact that a single GUC would need 
to be supported for 5 years is definitely something to consider.  Now I 
understand the push back a little more.


-Alfred



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On 2014-04-21 17:21:20 -0400, Bruce Momjian wrote:
 On Mon, Apr 21, 2014 at 02:08:51PM -0700, Joshua Drake wrote:
  If the community had more *BSD presence I think it would be great
  but it isn't all that viable at this point. I do know however that
  no-one in this community would turn down a team of FreeBSD advocates
  helping us make PostgreSQL awesome for PostgreSQL.
 
 I don't think we would even implement a run-time control for Linux or
 Windows for this, so I don't even think it is a FreeBSD issue.

I think some of the arguments in this thread are pretty damn absurd. We
have just introduced dynamic_shared_memory_type.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD


On 4/21/14, 4:08 PM, Joshua D. Drake wrote:

If the community had more *BSD presence I think it would be great but it isn't 
all that viable at this point. I do know however that no-one in this community 
would turn down a team of FreeBSD advocates helping us make PostgreSQL awesome 
for PostgreSQL.


I assume you meant FreeBSD awesome for PostgreSQL? :)

I'm also a big fan of *BSD but the reality is it's MUCH harder to get *BSD into 
a corporation than linux. Now, if FreeBSD had a bunch of stuff that made 
PostgreSQL run like 4x faster on *BSD than Linux that would be a different 
story.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?


On 4/15/14, 1:15 PM, Peter Geoghegan wrote:

On Tue, Apr 15, 2014 at 9:30 AM, Merlin Moncuremmonc...@gmail.com  wrote:

There are many reports of improvement from lowering shared_buffers.
The problem is that it tends to show up on complex production
workloads and that there is no clear evidence pointing to problems
with the clock sweep; it could be higher up in the partition locks or
something else entirely (like the O/S).  pgbench is also not the
greatest tool for sniffing out these cases: it's too random and for
large database optimization is generally an exercise in de-randomizing
i/o patterns.  We really, really need a broader testing suite that
covers more usage patterns.

I find it quite dissatisfying that we know so little about this.


This is an area where additional stats gathering would be very valuable. We're 
running 8G buffers on 512G servers because bumping it up hurt performance, but 
we have no clue why. Was it due to buffer pins? How many times the clock had to 
sweep to find a victim? Something else entirely? No idea... :(
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?


On 4/18/14, 2:51 PM, Atri Sharma wrote:


I feel that if there is no memory pressure, frankly it doesnt matter much about 
what gets out and what not. The case I am specifically targeting is when the 
clocksweep gets to move about a lot i.e. high memory pressure workloads. Of 
course,  I may be totally wrong here.


Well, there's either memory pressure or there isn't. If there isn't then it's 
all moot *because we're not evicting anything*.


One thing that I discussed with Merlin offline and am now concerned about is 
how will the actual eviction work. We cannot traverse the entire list and then 
find all the buffers with refcount 0 and then do another traversal to find the 
oldest one.


This is why OSes use multiple page pools. If we're going to use a clock sweep 
at all I think we need to use the same.

Every time we discuss this stuff it feels like we're completely reinventing the 
wheel that was solved by OSes years ago. :(
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] AXLE Plans for 9.5 and 9.6

2014-04-21 Thread Simon Riggs

I've discussed 2ndQuadrant's involvement in the AXLE project a few
times publicly, but never on this mailing list. The project relates to
innovation and improvement in Business Intelligence for systems based
upon PostgreSQL in the range of 10-100TB.

Our work will span the 9.5 and 9.6 cycles. We're looking to make
measurable improvements in a number of cases; one of those is TPC-H,
since its a publicly accessible benchmark, another is a more private
benchmark on healthcare data. In brief, this means speeding up the
performance of large queries, data loading and looking at very large
systems issues.

Some of areas of RD are definitely on the roadmap, others are more
flexible. Some of this is in progress, other stuff is not even at the
design stage - yet, just a few paragraphs along the lines of we will
look at these topics. If we have room, its possible we may
accommodate other topics; this is not carte blanche, but the reason
for posting here is so people know we will take input, following the
normal community process. Detailed in-person discussions at PGCon are
expected and the Wiki pages will be updated for each aspect.

BI-related Indexing
* MinMax indexes
* Bitmap indexes

Large Systems
* Freeze avoidance
* Storage management issues for very large systems

Storage Efficiency
* Compression
* Column Orientation

Optimisation
* Bulk loading speed improvements
* Bulk FK evaluation
* Executor tuning for very large queries

Query tuning
* Approximate queries, sampling
* Materialized Views

...and possibly some other aspects.

2ndQuadrant is also assisting other researchers on GPU and FPGA
topics, which may also yield work of interest to PostgreSQL project.

Couple of points: The project is time limited, so if work gets pushed
back beyond that then we'll lose the opportunity to contribute. Please
support our work with timely objections, assistance in defining the
path forwards and limiting the scope to something that avoids wasting
this opportunity. Further funding is possible if we don't squander
this. We are being funded to make best efforts to contribute to open
source PostgreSQL, not pay-for-commit.

AXLE is funded by the EU under FP7 Grant Agreement 318633.  Further
details are available here http://www.axleproject.eu/

(There are also other 2ndQuadrant development projects in progress,
this is just one of the larger ones).

Best Regards

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?


On 4/16/14, 10:28 AM, Robert Haas wrote:

Also, I think the scalability problems around buffer eviction are
eminently solvable, and in particular I'm hopeful that Amit is going
to succeed in solving them.  Suppose we have a background process
(whether the background writer or some other) that runs the clock
sweep, identifies good candidates for eviction, and pushes them on a
set of, say, 16 free-lists protected by spinlocks.  (The optimal
number of free-lists probably depends on the size of shared_buffers.)


How *certain* are we that a single freelist lock (that actually ONLY protects 
the freelist) would be that big a deal? I suspect it wouldn't be much of an 
issue at all:

- Right now (IIRC) it's tied into the clock as well, so immediate fail on 
scaling...
- The clock is WAY more expensive than grabbing one buffer off the free list. 
Last I looked it was so bad that even if the next buffer the clock hit was free 
it was still worse than hitting the free list.

I strongly suspect that a single freelist lock (that didn't protect anything 
else) would be fine. I think it'd be folly to start with a more complex 
multi-lock/multi-freelist implementation before we knew we needed one.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread David G Johnston

Jim Nasby-2 wrote
 I feel that if there is no memory pressure, frankly it doesnt matter much
 about what gets out and what not. The case I am specifically targeting is
 when the clocksweep gets to move about a lot i.e. high memory pressure
 workloads. Of course,  I may be totally wrong here.
 
 Well, there's either memory pressure or there isn't. If there isn't then
 it's all moot *because we're not evicting anything*.

The trade-off I'm seeing here is between measuring when there is no memory
pressure - and thus eating at performance while not actually evicting
buffers - and not measuring but then encountering memory pressure and not
having a clue as to what should be evicted.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Clock-sweep-not-caching-enough-B-Tree-leaf-pages-tp5799947p5800988.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-21 Thread Claudio Freire

On Mon, Apr 21, 2014 at 8:07 PM, David G Johnston
david.g.johns...@gmail.com wrote:
 Jim Nasby-2 wrote
 I feel that if there is no memory pressure, frankly it doesnt matter much
 about what gets out and what not. The case I am specifically targeting is
 when the clocksweep gets to move about a lot i.e. high memory pressure
 workloads. Of course,  I may be totally wrong here.

 Well, there's either memory pressure or there isn't. If there isn't then
 it's all moot *because we're not evicting anything*.

 The trade-off I'm seeing here is between measuring when there is no memory
 pressure - and thus eating at performance while not actually evicting
 buffers - and not measuring but then encountering memory pressure and not
 having a clue as to what should be evicted.


I believe that for the intended use discussed in this thread, a
compile-time switch would be more than enough control, and it would
avoid that tradeoff.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Removing dependency to wsock32.lib when compiling code on WIndows

2014-04-21 Thread Michael Paquier

On Mon, Apr 21, 2014 at 12:54 PM, Michael Paquier michael.paqu...@gmail.com
 wrote:

 When doing some work on Windows, I noticed that the mkvc specs in
 src/tools/msvc use wsock32.lib, which is as far as I understand an
 old, obsolete version of the Windows socket library. Wouldn't it make
 sense to update the specs to build only with ws2_32.lib like in the
 patch attached?

I created an entry in the upcoming commit fest for this patch:
https://commitfest.postgresql.org/action/patch_view?id=1440
Thanks,
-- 
Michael

Re: [HACKERS] AXLE Plans for 9.5 and 9.6

On 04/21/2014 03:41 PM, Simon Riggs wrote:
 Storage Efficiency
 * Compression
 * Column Orientation

You might look at turning this:

http://citusdata.github.io/cstore_fdw/

... into a more integrated part of Postgres.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4



On 04/21/2014 02:54 PM, Andres Freund wrote:

Hi,

I spent the last two hours poking arounds in the environment Andrew
provided and I was able to reproduce the issue, find a assert to
reproduce it much faster and find a possible root cause.



What's the assert that makes it happen faster? That might help a lot in 
constructing a self-contained test.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread David G Johnston

Stephen Frost wrote
* Alfred Perlstein (

alfred@

) wrote:
On 4/21/14, 12:47 PM, Stephen Frost wrote:
Asking for help to address the FreeBSD performance would have
been much better received. Thanks, Stephen

That is exactly what I did, I asked for a version of postgresql that
was easy to switch at runtime between two behaviors.

That would make it a LOT easier to run a few scripts and make sure I
got the correct binary without having to munge PREFIX and a bunch of
PATH and other tools to get my test harness to DTRT.

I'm sure one of the hackers would be happy to provide you with a patch
to help you with your testing.

That's quite a different thing from asking for a GUC to be provided and
then supported over the next 5 years as part of the core release, which
is what I believe we all thought you were asking for.

Alfred,

Are you willing and use a custom 9.3 installed from source or are you asking
for something to actually be released to the wild before you go and test it
- your comments are unclear on this point?

The technical consensus is that the more desirable approach is to have the
determination done at compile-time since - besides testing - no obvious
reason exists that a user, once they have determined the correct option for
their platform, would find reason to change it. Yes, it adds another player
to the game (unless you install from source), but the community is already
structured to rely upon packagers to do the right thing for their platform
so that the amount of customization presented to the user can be minimized.

In short, the goal is to have GUCs limited to work-mix, not platform,
configuration; and definitely not for platform testing purposes. If you are
going to be testing platform performance it seems to be expected that you
have the ability to compile and alter source code. This may indeed limit
the potential number of testers but it does add efficiency to the process
because the testers can make patches.

David J.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Perfomance-degradation-9-3-vs-9-2-for-FreeBSD-tp5800835p5800993.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

Jim Nasby j...@nasby.net writes:
 How *certain* are we that a single freelist lock (that actually ONLY
 protects the freelist) would be that big a deal?

We used to have one.  It was a big bottleneck --- and this was years
ago, when the buffer manager was much less scalable than it is today.
(IIRC, getting rid of a central lock was one of the main advantages
of the current clock sweep code over its predecessor.)

The real issue here is that in the modern code, we hardly ever actually
have anything in the freelist: only when a relation is dropped, or
something like that, do buffers ever get put to the freelist.  So your
argument that removing a buffer from the freelist is cheaper than running
the clock sweep is largely missing the point.  We'd have to run a clock
sweep in order to find something to put in the freelist.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

 I see performance degradation with PostgreSQL 9.3 vs 9.2 on FreeBSD, and I'm 
 wondering who to poke to mitigate the problem. In reference to this thread 
 [1], who where the FreeBSD people that Francois mentioned? If mmap needs to 
 perform well in the kernel, I'd like to know of someone with FreeBSD kernel 
 knowledge who is interested in working with mmap perfocmance. If mmap is 
 indeed the cuplrit, I've just tested 9.2.8 vs 9.3.4, I nevere isolated the 
 mmap patch, although I believe Francois did just that with similar results.

I observe performance degradation with PostgreSQL 9.3 vs 9.2 on Linux
as well.  The hardware is HP DL980G7, 80 cores, 2TB mem, RHEL 6,
pgbench is used (read only query), scale factor is 1,000 (DB size
15GB).

Included is the graph (from PostgreSQL Enterprise Consortium's 2014
report page 13: https://www.pgecons.org/downloads/43). I see up to 14%
degration (at 128 concurrent users) comparing with 9.2.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On Mon, Apr 21, 2014 at 4:10 AM, Andres Freund and...@2ndquadrant.com wrote:
 If there are indeed such large regressions on FreeBSD we need to treat
 them as postgres regressions. It's nicer not to add config options for
 things that don't need it, but apparently that's not the case here.

+1, but I think this is something for packagers to get right, not users.

I really don't like the idea of playing chicken with the FreeBSD
people, especially since we're going to use System V shared memory
into the foreseeable future anyway. It's probably *far* easier for us
to fix it than it is for the FreeBSD people to fix it.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

On Mon, Apr 21, 2014 at 5:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We used to have one.  It was a big bottleneck --- and this was years
 ago, when the buffer manager was much less scalable than it is today.
 (IIRC, getting rid of a central lock was one of the main advantages
 of the current clock sweep code over its predecessor.)

Yes, it was. This is a major advantage of clock sweep, and anything
that replaces it will need to maintain the same advantage. Didn't
someone indicate that clock sweep could beat ARC around that time,
presumably for this reason? If no one did, then my reading of a
variety of other papers on caching indicates that this is probably the
case.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Joshua D. Drake



On 04/21/2014 03:08 PM, Jim Nasby wrote:


On 4/21/14, 4:08 PM, Joshua D. Drake wrote:

If the community had more *BSD presence I think it would be great but
it isn't all that viable at this point. I do know however that no-one
in this community would turn down a team of FreeBSD advocates helping
us make PostgreSQL awesome for PostgreSQL.


I assume you meant FreeBSD awesome for PostgreSQL? :)


Yes. Ty for the correction.



I'm also a big fan of *BSD but the reality is it's MUCH harder to get
*BSD into a corporation than linux. Now, if FreeBSD had a bunch of stuff
that made PostgreSQL run like 4x faster on *BSD than Linux that would be
a different story.


Exactly.

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

* Tatsuo Ishii (is...@postgresql.org) wrote:
 I observe performance degradation with PostgreSQL 9.3 vs 9.2 on Linux
 as well.  The hardware is HP DL980G7, 80 cores, 2TB mem, RHEL 6,
 pgbench is used (read only query), scale factor is 1,000 (DB size
 15GB).

Can you isolate the sysv-vs-mmap patch and see what happens with just
that change..?

 Included is the graph (from PostgreSQL Enterprise Consortium's 2014
 report page 13: https://www.pgecons.org/downloads/43). I see up to 14%
 degration (at 128 concurrent users) comparing with 9.2.

That URL returns 'Forbidden'...

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

Peter Geoghegan p...@heroku.com writes:
 On Mon, Apr 21, 2014 at 5:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We used to have one.  It was a big bottleneck --- and this was years
 ago, when the buffer manager was much less scalable than it is today.
 (IIRC, getting rid of a central lock was one of the main advantages
 of the current clock sweep code over its predecessor.)

 Yes, it was. This is a major advantage of clock sweep, and anything
 that replaces it will need to maintain the same advantage. Didn't
 someone indicate that clock sweep could beat ARC around that time,
 presumably for this reason? If no one did, then my reading of a
 variety of other papers on caching indicates that this is probably the
 case.

ARC *was* the predecessor algorithm.  See commit 5d5087363.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

 * Tatsuo Ishii (is...@postgresql.org) wrote:
 I observe performance degradation with PostgreSQL 9.3 vs 9.2 on Linux
 as well.  The hardware is HP DL980G7, 80 cores, 2TB mem, RHEL 6,
 pgbench is used (read only query), scale factor is 1,000 (DB size
 15GB).
 
 Can you isolate the sysv-vs-mmap patch and see what happens with just
 that change..?

Unfortunately I don't have an access to the machine at this moment.

 Included is the graph (from PostgreSQL Enterprise Consortium's 2014
 report page 13: https://www.pgecons.org/downloads/43). I see up to 14%
 degration (at 128 concurrent users) comparing with 9.2.
 
 That URL returns 'Forbidden'...

Sorry for this. I sent a problem report to the person in charge.  In
the mean time, please go to:
https://www.pgecons.org/download/works_2013/ then click the link 2013
年度WG1活動報告 (sorry for not English). You should be able to
download a report (PDF).

Also the report is written in Japanese. I hope you can read at leat
the graph in page 13 and the table in page 14.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

On Mon, Apr 21, 2014 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ARC *was* the predecessor algorithm.  See commit 5d5087363.

I believe that the main impetus for replacing ARC with clock sweep
came from patent issues, though. It was a happy coincidence that clock
sweep happened to be better than ARC, but that doesn't mean that ARC
didn't have some clear advantages, even if it wasn't worth it on
balance. LRU-K, and 2Q have roughly the same advantages. I'm
reasonably confident you can have the best of both worlds, or
something closer to it.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] assertion failure 9.3.4

All,

I've taken a stab at creating a reproduceable test case based on the
characterisitics of the production issues I'm seeing.  But clearly
there's an element I'm missing, because I'm not able to produce the bug
with a pgbench-based test case.

My current test has FKs, updating both FK'd tables, updating both
indexed and non-indexed columns, and doing multiple updates in the same
transaction, and lock-blocking.

Files are attached in case someone has better ideas.  queue_bench.sql is
the setup file, and then you do:

pgbench -n -T 600 -c 15 -j 5 -f queue_adder.bench -f queue_worker.bench
-f queue_worker.bench -f queue_worker.bench -f queue_worker.bench {dbname}

... or whatever levels of c and j make sense on your hardware.

FWIW, this particular test case might make a nice destruction test case
for replication testing, too.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
CREATE TABLE queue (
queue_id serial not null primary key,
ts timestamptz not null default now(),
cat int not null,
status text not null default 'new',
current_step int not null default 1
);

CREATE TABLE queue_steps (
queue_id int not null,
step int not null,
status text not null default 'pending',
done timestamptz,
payload text
);

INSERT INTO queue ( ts, cat, current_step )
SELECT now() - interval '15 seconds' + ( i * interval '1 second' ),
(random()*3)::int,
1
FROM generate_series(1,12) as gs(i);

INSERT INTO queue_steps ( queue_id, step, status, payload )
SELECT queue_id, gs.i, 'pending',
md5(random()::text)
FROM queue,
generate_series(1,3) as gs(i);

CREATE INDEX queue_ts ON queue(ts DESC);
CREATE INDEX queue_step_pk ON queue_steps(queue_id, step);
ALTER TABLE queue_steps ADD CONSTRAINT queue_fk FOREIGN KEY ( queue_id ) REFERENCES queue(queue_id);

CREATE FUNCTION add_queue_item ( ncat int )
RETURNS INT
LANGUAGE plpgsql
AS $f$
DECLARE qid INT;
BEGIN
INSERT INTO queue ( ts, cat, status, current_step )
VALUES ( now(), ncat, 'new', 1 )
RETURNING queue_id
INTO qid;

INSERT INTO queue_steps ( queue_id, step, status, payload )
SELECT qid, gs.i, 'pending',
md5(random()::text)
FROM generate_series(1,3) as gs(i);

RETURN qid;
END;
$f$;

CREATE OR REPLACE FUNCTION queue_step ( ncat int )
RETURNS INT
LANGUAGE plpgsql
AS $f$
DECLARE qid INT;
stnum INT;
BEGIN
SELECT queue_id
INTO qid
FROM queue
WHERE status = 'working'
AND cat = ncat
ORDER BY ts LIMIT 1
FOR UPDATE;

IF qid IS NULL THEN
SELECT queue_id
INTO qid
FROM queue
WHERE status = 'new'
AND cat = ncat
ORDER BY ts LIMIT 1
FOR UPDATE;
END IF;

SELECT step FROM queue_steps
INTO stnum
WHERE queue_id = qid
AND status = 'pending'
ORDER BY step LIMIT 1;

IF stnum = 1 THEN
UPDATE queue
SET status = 'working'
WHERE queue_id = qid;
END IF;

UPDATE queue_steps
SET status = 'working'
WHERE queue_id = qid
AND step = stnum;

PERFORM pg_sleep(0.002);

UPDATE queue_steps
SET status = 'done',
done = now()
WHERE queue_id = qid
AND step = stnum;

IF stnum = 3 THEN
UPDATE queue
SET status = 'done'
WHERE queue_id = qid;
ELSE
UPDATE queue
SET current_step = stnum + 1
WHERE queue_id = qid;
END IF;

RETURN qid;

END; $f$;

ANALYZE;
\setrandom rcat 0 3
\setrandom wtime 1 3
BEGIN;
SELECT add_queue_item(:rcat);
END;
\sleep :wtime ms

\setrandom rcat 0 3
\setrandom wtime 1 3
BEGIN;
SELECT queue_step(:rcat);
\sleep :wtime ms
END;
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 08:49 PM, Stephen Frost wrote:

* Tatsuo Ishii (is...@postgresql.org) wrote:

I observe performance degradation with PostgreSQL 9.3 vs 9.2 on Linux
as well.  The hardware is HP DL980G7, 80 cores, 2TB mem, RHEL 6,
pgbench is used (read only query), scale factor is 1,000 (DB size
15GB).

Can you isolate the sysv-vs-mmap patch and see what happens with just
that change..?



This is exactly why we need a benchfarm.

I actually have a client working based on Greg Smith's pgbench tools.

What we would need is a way to graph the results - that's something 
beyond my very rudimentary expertise in web programming. If anyone feels 
like collaborating I'd be glad to hear from them (The web site is 
programmed in perl + TemplateToolkit, but even that's not immutable. I'm 
open to using, say, node.js plus one of its templating engines.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

 This is exactly why we need a benchfarm.
 
 I actually have a client working based on Greg Smith's pgbench tools.
 
 What we would need is a way to graph the results - that's something
 beyond my very rudimentary expertise in web programming. If anyone
 feels like collaborating I'd be glad to hear from them (The web site
 is programmed in perl + TemplateToolkit, but even that's not
 immutable. I'm open to using, say, node.js plus one of its templating
 engines.

gnuplot? (the graph I attached was created by gnuplt).

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

Peter Geoghegan p...@heroku.com writes:
 On Mon, Apr 21, 2014 at 5:50 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ARC *was* the predecessor algorithm.  See commit 5d5087363.

 I believe that the main impetus for replacing ARC with clock sweep
 came from patent issues, though.

That was one issue, but performance gains were a large part of it too,
and the main reason why we picked clock sweep rather than something else.
Did you read the commit message I pointed to?

(See also 4e8af8d27.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

On Mon, Apr 21, 2014 at 6:08 PM, Tatsuo Ishii is...@postgresql.org wrote:
 What we would need is a way to graph the results - that's something
 beyond my very rudimentary expertise in web programming. If anyone
 feels like collaborating I'd be glad to hear from them (The web site
 is programmed in perl + TemplateToolkit, but even that's not
 immutable. I'm open to using, say, node.js plus one of its templating
 engines.

 gnuplot? (the graph I attached was created by gnuplt).

That's all pgbench-tools itself uses.

The problem with a performance farm is that it's relatively hard to
donate a performance farm member. It more or less requires expensive
hardware, and a large amount of rigor in testing and normalizing
various aspects of the environment that might otherwise add noise.
Then again, it might only take 2 or 3 servers to make a huge
difference. There are a number of different things that would be
immediately compelling to target with that kind of thing, so the first
step is non-obvious too.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD



On 04/21/2014 09:16 PM, Peter Geoghegan wrote:

On Mon, Apr 21, 2014 at 6:08 PM, Tatsuo Ishii is...@postgresql.org wrote:

What we would need is a way to graph the results - that's something
beyond my very rudimentary expertise in web programming. If anyone
feels like collaborating I'd be glad to hear from them (The web site
is programmed in perl + TemplateToolkit, but even that's not
immutable. I'm open to using, say, node.js plus one of its templating
engines.

gnuplot? (the graph I attached was created by gnuplt).

That's all pgbench-tools itself uses.

The problem with a performance farm is that it's relatively hard to
donate a performance farm member. It more or less requires expensive
hardware, and a large amount of rigor in testing and normalizing
various aspects of the environment that might otherwise add noise.
Then again, it might only take 2 or 3 servers to make a huge
difference. There are a number of different things that would be
immediately compelling to target with that kind of thing, so the first
step is non-obvious too.




If we never start we'll never get there.

I can think of several organizations that might be approached to donate 
hardware.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

On Mon, Apr 21, 2014 at 6:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Did you read the commit message I pointed to?

Yes.

 (See also 4e8af8d27.)

Oh, I wasn't actually aware of the fact that 2Q made it into the tree.
I thought that the first commit message you referred to just
referenced on-list discussion of 2Q. Interesting.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

On Mon, Apr 21, 2014 at 5:59 PM, Peter Geoghegan p...@heroku.com wrote:
 LRU-K, and 2Q have roughly the same advantages. I'm
 reasonably confident you can have the best of both worlds, or
 something closer to it.

Having said that, a big part of what I'd like to accomplish here is to
address the more general problem of correlated references. That's
probably something that has independent value.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] DISCARD ALL (Again)

2014-04-21 Thread Fabrízio de Royes Mello

On Mon, Apr 21, 2014 at 12:00 AM, Stephen Frost sfr...@snowman.net wrote:

 * Robert Haas (robertmh...@gmail.com) wrote:
  On Fri, Apr 18, 2014 at 2:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   2. While I'm no Python expert, I believe GD is just a specific
instance
   of a general capability for global state in Python.  Are we going to
   promise that any and all user-created data inside Python goes away?
   What about other PLs?  Will users thank us if this suddenly starts
   happening?
 
  This is not the first time that somebody's asked for a way to throw
  away global interpreter state, and I really think we ought to oblige.
  In a connection-pooling environment, you really need a way to get the
  connection back to its original state rather than some not-so-near
  facsimile thereof.  Maybe it'll end up as an optional behavior, and
  which kind of reset to use will become part of the pooler
  configuration, but that doesn't bother me as much as not having it for
  those that want it.

 Drop the connection and reconnect would be the answer to that.  For as
 much as we may hope and wish for a connection to go back to 'the way it
 was upon first connection', throwing away the interpretor *might* (and I
 wouldn't be comfortable claiming it absolutely..) get you there when
 you've only called functions which use interpretors, but people write
 code in C too and we've seen complaints of memory leaks, etc, from C
 libraries and C extensions- and there's nothing we're going to be able
 to do to address that, so this mythical 'DISCARD EVERYTHING' is a pipe
 dream.  (Were we to actually re-exec ourselves into a new process, as if
 we went through a disconnect/reconnect, I'd be more inclined to support
 this capability, but I'm not sure what such would really buy us...)

  What's a bit odd about this request is that it asks for the ability to
  throw away only part of the state.  ISTM that if somebody wants to add
  that kind of capability, they ought to just package a function which
  does precisely that with the plpython extension, or create a Python
  function that zaps that particular variable if that's possible.  I
  think it's clearly useful to have DISCARD ALL be a request to discard
  *everything* in one shot, but it's going to be a stretch to come up
  with DISCARD variants for every kind of partial state removal somebody
  wants to do.

 Agreed.


Doesn't makes sense implement a way to every extension register your own
cleanup code and them with DISCARD EXTENSIONS and/or DISCARD ALL run this
registered procedure?

If yes I can develop a patch to do that (actually I already start with
implementation).

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello

Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-21 Thread Noah Misch

On Mon, Apr 21, 2014 at 10:57:34AM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  I wonder how it would work out to instead delay this new detoast effort 
  until
  toast_insert_or_update().
 
 That would require toast_insert_or_update() to know about every container
 datatype.  I doubt it could lead to an extensible or maintainable
 solution.

If that's its worst drawback, it's excellent.

 I'm actually planning to set this patch on the shelf for a bit and go
 investigate the other alternative, ie, not generating composite Datums
 containing toast pointers in the first place.  We now know that the idea
 that we aren't going to take performance hits *somewhere* is an illusion,
 and I still suspect that the other way is going to lead to a smaller and
 cleaner patch.  The main performance downside for plpgsql might be
 addressable by making sure that plpgsql record variables fall on the heap
 tuple rather than the composite Datum side of the line.  I'm also quite
 concerned about correctness: I don't have a lot of confidence that this
 patch has closed every loophole with respect to arrays, and it hasn't even
 touched ranges or any of the related one-off bugs that I believe exist.

I maintain that the potential slowdown is too great to consider adopting that
for the sake of a cleaner patch.  Your last message examined a 67% performance
regression.  The strategy you're outlining now can slow a query by 1,000,000%.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD

2014-04-21 Thread Joshua D. Drake



On 04/21/2014 06:19 PM, Andrew Dunstan wrote:



If we never start we'll never get there.

I can think of several organizations that might be approached to donate
hardware.


Like .Org?

We have a hardware farm, a rack full of hardware and spindles. It isn't 
the most current but it is there.


Sincerely,

JD



--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
Political Correctness is for cowards.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Perfomance degradation 9.3 (vs 9.2) for FreeBSD