[HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Rushabh Lathia
While trying out CURRENT OF with a foreign table, I ended up with an error.

postgres=# select version();
                                 version
------------------------------------------------------------------------
 PostgreSQL 9.3devel on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
 4.4.7 20120313 (Red Hat 4.4.7-3), 64-bit
(1 row)

-- Create extension and database
postgres=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
postgres=# create database regression;
CREATE DATABASE

-- Create foreign server
postgres=# CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
postgres-#   OPTIONS (dbname 'regression');
CREATE SERVER
postgres=# CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE USER MAPPING

-- Create table on the remote server
postgres=# \c regression
You are now connected to database "regression" as user "rushabh".
regression=# create table test ( a int );
CREATE TABLE
regression=# insert into test values ( 1 );
INSERT 0 1

-- Connect to local server and create test function
regression=# \c postgres
-- Create foreign table
postgres=# create foreign table test ( a int ) server loopback;
CREATE FOREIGN TABLE
postgres=# CREATE OR REPLACE FUNCTION taest_func_dblink2()
postgres-# RETURNS numeric
postgres-# AS $$
postgres$# DECLARE c CURSOR FOR SELECT a FROM test FOR UPDATE;
postgres$# v_i numeric;
postgres$# BEGIN
postgres$# OPEN c;
postgres$# FETCH c INTO v_i;
postgres$# UPDATE test SET a=50 WHERE CURRENT OF c;
postgres$# RETURN 0;
postgres$# END; $$ LANGUAGE plpgsql;
CREATE FUNCTION

postgres=# select taest_func_dblink2();
ERROR:  CURRENT OF cannot be executed
CONTEXT:  SQL statement "UPDATE test SET a=50 WHERE CURRENT OF c"
PL/pgSQL function taest_func_dblink2() line 7 at SQL statement

Here the test ends up with the following:

elog(ERROR, "CURRENT OF cannot be executed");

Should we change this to ereport(), or is there some other fix that we
should make?

Regards,
Rushabh Lathia
www.EnterpriseDB.com


Re: [HACKERS] Inconsistent DB data in Streaming Replication

2013-04-19 Thread Martijn van Oosterhout
On Wed, Apr 17, 2013 at 12:49:10PM +0200, Florian Pflug wrote:
 Fixing this on the receive side alone seems quite messy and fragile.
 So instead, I think we should let the master send a shutdown message
 after it has sent everything it wants to send, and wait for the client
 to acknowledge it before shutting down the socket.
 
 If the client fails to respond, we could log a fat WARNING.

ISTM the master should half close the socket, using shutdown(). That
way the client receives an EOF and can still then send its reply to the
master.  Then when the master receives that it can close() completely.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] Recovery target 'immediate'

2013-04-19 Thread Robert Haas
On Thu, Apr 18, 2013 at 2:11 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 I just found out that if you use continuous archiving and online backups,
 it's surprisingly difficult to restore a backup without replaying any more
 WAL than necessary.

 If you don't set a recovery target, PostgreSQL will recover all the WAL it
 finds. You can set the recovery target time to a point immediately after
 the end-of-backup record, but that's tricky. You have to somehow find out
 the exact time when the backup ended, and set it to that. But if you set it
 even slightly too early, recovery will abort with a "requested recovery
 stop point is before consistent recovery point" error. And that's not quite
 precise anyway; not all record types carry timestamps, so you will always
 replay a few extra records until the first timestamped record comes along.
 Setting recovery_target_xid is similarly difficult. If you were well
 prepared, you created a named restore point with pg_create_restore_point()
 immediately after the backup ended, and you can use that, but that requires
 forethought.

 It seems that we're missing a setting, something like recovery_target =
 'immediate', which would mean "stop as soon as consistency is reached". Or
 am I missing some trick?

You know, I've been wondering for years how you're supposed to do
this.  Huge +1 for adding something like this, if it doesn't exist
already.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Fix typo in contrib/hstore/crc32.c comment

2013-04-19 Thread Robert Haas
On Wed, Apr 17, 2013 at 3:58 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 The attached patch fixes a little typo in the contrib/hstore/crc32.c comment.

Thanks.  Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Robert Haas
On Wed, Apr 17, 2013 at 6:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 No, it's a critical tool in complexity management.  When you're dealing
 with systems as complicated as a database, every little non-orthogonal
 detail adds up.  DISCARD ALL has a clear definition in terms of simpler
 commands, and it's going to stay that way.  Either this is worth a
 subcommand, or it's not worth worrying about at all.

We had this same argument back in November of 2008.  Marko said:

http://www.postgresql.org/message-id/24710.1227732...@sss.pgh.pa.us

And Greg Stark said:

http://www.postgresql.org/message-id/87iqqapag2@oxford.xeocode.com

And you said:

http://www.postgresql.org/message-id/24710.1227732...@sss.pgh.pa.us

And then you did this:

commit e309739670ac8c2fa0b236d116fcd44b0522025a
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Thu Nov 27 00:28:06 2008 +

Tweak wording of DISCARD ALL description to avoid giving the impression
that the presented list of equivalent operations is meant to be the
primary definition of what it does.  Per comment from Guillaume Smet.

So it seems to me that we pretty much already made a decision that the
controlling definition of DISCARD ALL is that, as the fine manual says,
"DISCARD ALL resets a session to its original state."  Whatever
decision we make now ought to be consistent with that.
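
For reference, the list of simpler commands the 9.2-era manual gives as
the expansion of DISCARD ALL (reproduced from memory, so treat it as a
sketch; note that DISCARD SEQUENCES is not in it yet):

SET SESSION AUTHORIZATION DEFAULT;
RESET ALL;
DEALLOCATE ALL;
CLOSE ALL;
UNLISTEN *;
SELECT pg_advisory_unlock_all();
DISCARD PLANS;
DISCARD TEMP;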

IOW, I don't care whether we introduce a new subcommand or not.  But I
*do* think that we ought to make our best effort to have DISCARD
ALL clear everything that smells like session-local state.  Random
incompatibilities between what you see when running under a connection
pooler and what you see when connecting to the DB directly are *bad*,
regardless of whether a well-designed application should be relying on
those particular things or not.  The whole point of having a
transparent connection pooler is that it's supposed to be transparent
to the application.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Fabrízio de Royes Mello
On Fri, Apr 19, 2013 at 10:50 AM, Robert Haas robertmh...@gmail.com wrote:


 [...]

 So it seems to me that we pretty much already made a decision that the
 controlling definition of DISCARD ALL is that, as the fine manual says,
 "DISCARD ALL resets a session to its original state."  Whatever
 decision we make now ought to be consistent with that.

 IOW, I don't care whether we introduce a new subcommand or not.  But I
 *do* think that we ought to make our best effort to have DISCARD
 ALL clear everything that smells like session-local state.  Random
 incompatibilities between what you see when running under a connection
 pooler and what you see when connecting to the DB directly are *bad*,
 regardless of whether a well-designed application should be relying on
 those particular things or not.  The whole point of having a
 transparent connection pooler is that it's supposed to be transparent
 to the application.


+1

The attached WIP patch does that and introduces a 'SEQUENCES' subcommand,
but if we decide not to add a new subcommand to DISCARD, then it's easy to
modify the patch.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


discard_sequences.patch
Description: Binary data



Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Tom Lane
Rushabh Lathia rushabh.lat...@gmail.com writes:
 While trying out CURRENT OF with a foreign table, I ended up with an error.

Yeah, that's an unimplemented feature.

In principle I think it could be made to work with postgres_fdw (since
that uses CTID row identification), but I doubt that it'd be possible
to promise that it works for every FDW.

regards, tom lane




Re: [HACKERS] event trigger API documentation?

2013-04-19 Thread Peter Eisentraut
On 4/18/13 11:31 AM, Dimitri Fontaine wrote:
 The only questions in this thread are:
 
   - only docs or docs + contrib example?

At this point, all that is appropriate is some documentation of the C
API.  If the contrib example you have in mind is short enough, it might
as well become part of the example in the documentation.




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Robert Haas
On Fri, Apr 19, 2013 at 10:05 AM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 The attached WIP patch does that and introduces a 'SEQUENCES' subcommand,
 but if we decide not to add a new subcommand to DISCARD, then it's easy to
 modify the patch.

This patch is quite wrong.  It frees seqtab without clearing the
pointer, so the next reference will stomp on memory that may have been
reallocated.  And it doesn't even free seqtab correctly, since it only
frees the first node in the linked list.
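
For illustration, a cleanup that avoids both problems might look roughly
like this (a sketch only -- it assumes the malloc'd seqtab linked list of
that era's sequence.c, and the type and field names are from memory):

static void
ReleaseSequenceCaches(void)
{
    SeqTable    elt = seqtab;

    while (elt != NULL)
    {
        SeqTable    next = elt->next;   /* save before freeing */

        free(elt);                      /* entries are malloc'd, not palloc'd */
        elt = next;
    }
    seqtab = NULL;  /* clear the head so later lookups can't chase freed memory */
}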

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Robert Haas
On Fri, Apr 19, 2013 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Rushabh Lathia rushabh.lat...@gmail.com writes:
 While trying out CURRENT OF with a foreign table, I ended up with an error.

 Yeah, that's an unimplemented feature.

 In principle I think it could be made to work with postgres_fdw (since
 that uses CTID row identification), but I doubt that it'd be possible
 to promise that it works for every FDW.

So, should we just make that an
ereport(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ...) instead of
elog()?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Adrian Klaver

On 04/19/2013 06:50 AM, Robert Haas wrote:

On Wed, Apr 17, 2013 at 6:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:

No, it's a critical tool in complexity management.  When you're dealing
with systems as complicated as a database, every little non-orthogonal
detail adds up.  DISCARD ALL has a clear definition in terms of simpler
commands, and it's going to stay that way.  Either this is worth a
subcommand, or it's not worth worrying about at all.



And then you did this:

commit e309739670ac8c2fa0b236d116fcd44b0522025a
Author: Tom Lane t...@sss.pgh.pa.us
Date:   Thu Nov 27 00:28:06 2008 +

 Tweak wording of DISCARD ALL description to avoid giving the impression
 that the presented list of equivalent operations is meant to be the
 primary definition of what it does.  Per comment from Guillaume Smet.

So it seems to me that we pretty much already made a decision that the
controlling definition of DISCARD ALL is that, as the fine manual says,
"DISCARD ALL resets a session to its original state."  Whatever
decision we make now ought to be consistent with that.

IOW, I don't care whether we introduce a new subcommand or not.  But I
*do* think that we ought to make our best effort to have DISCARD
ALL clear everything that smells like session-local state.  Random
incompatibilities between what you see when running under a connection
pooler and what you see when connecting to the DB directly are *bad*,
regardless of whether a well-designed application should be relying on
those particular things or not.  The whole point of having a
transparent connection pooler is that it's supposed to be transparent
to the application.



I understand the confusion on what constitutes ALL in DISCARD, though I 
am not sure about the incompatibility argument. The OP is using the 
transaction mode from pgBouncer and from their docs:


http://wiki.postgresql.org/wiki/PgBouncer

Transaction pooling
Server connection is assigned to client only during a transaction. When 
PgBouncer notices that transaction is over, the server will be put back 
into pool.
This mode breaks a few session-based features of PostgreSQL. You can use
it only when the application cooperates by not using features that break.
See the table below for incompatible features.


 Note that 'transaction' pooling breaks client expectations of server 
by design and can be used only if application cooperates by not using 
non-working features.



Session pooling
server_reset_query = DISCARD ALL;
This will clean everything.

Transaction pooling
server_reset_query =
Yes, empty. In transaction pooling mode the clients should not use any 
session-based features, so there is no need to clean anything. The 
server_reset_query would only add an unnecessary round-trip between 
transactions and would drop various caches that the next transaction 
would unnecessarily need to fill again.




I could see the argument for a transparent pooler where it is part of the
core code. I'm not sure it is the project's responsibility to maintain
transparency with the feature matrices of external projects.








--
Adrian Klaver
adrian.kla...@gmail.com




Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Apr 19, 2013 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that's an unimplemented feature.

 So, should we just make that an
 ereport(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ...) instead of
 elog()?

I'm not that excited about the errcode; if we're going to do anything,
changing the message text seems more important.  Perhaps we could have
it say "WHERE CURRENT OF is not supported for this table type"?  That's
jumping to conclusions about why the expression didn't get converted,
but at least for this case it'd be a more useful user-facing message.

regards, tom lane




Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Robert Haas
On Fri, Apr 19, 2013 at 10:24 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Apr 19, 2013 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that's an unimplemented feature.

 So, should we just make that an
 ereport(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), ...) instead of
 elog()?

 I'm not that excited about the errcode; if we're going to do anything,
 changing the message text seems more important.  Perhaps we could have
 it say "WHERE CURRENT OF is not supported for this table type"?  That's
 jumping to conclusions about why the expression didn't get converted,
 but at least for this case it'd be a more useful user-facing message.

Yeah, it's probably good to improve the error message, too; and that
suggestion seems as good as any.  But I still think it should be
ereport if it's user-facing.
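
Putting the two suggestions together, the change under discussion would
look something like this (a sketch; the final wording and the exact call
site were still open at this point):

ereport(ERROR,
        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
         errmsg("WHERE CURRENT OF is not supported for this table type")));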

My main concern was actually whether we ought to be detecting this
earlier in the process, before it gets as far as the executor.  I
haven't scrutinized the code, though, so I have no particular reason to
believe it's not OK as-is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 My main concern was actually whether we ought to be detecting this
 earlier in the process, before it gets as far as the executor.

Yeah, that might be an appropriate response too.  The executor is
coded so cavalierly because it expects the planner to have replaced
the CURRENT OF node with something executable.  As things now stand,
whether that happens or not depends in part on the behavior of FDWs,
so maybe we'd better have the planner check whether it happened.
I'm not sure though if there's any suitably-painless place to do it.

regards, tom lane




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Fabrízio de Royes Mello
On Fri, Apr 19, 2013 at 11:12 AM, Robert Haas robertmh...@gmail.com wrote:

 On Fri, Apr 19, 2013 at 10:05 AM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  The attached WIP patch does that and introduces a 'SEQUENCES' subcommand,
 but
  if we decide not to add a new subcommand to DISCARD, then it's easy to
  modify the patch.

 This patch is quite wrong.  It frees seqtab without clearing the
 pointer, so the next reference will stomp on memory that may have been
 reallocated.  And it doesn't even free seqtab correctly, since it only
 frees the first node in the linked list.


Ohh sorry... you're right... I completely forgot to finish
ReleaseSequenceCaches so that it traverses the 'seqtab' linked list and
frees each node.

The attached patch has the corrected code.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


discard_sequences.patch
Description: Binary data



Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Alvaro Herrera
Fabrízio de Royes Mello wrote:

 Ohh sorry... you're right... I completely forgot to finish
 ReleaseSequenceCaches so that it traverses the 'seqtab' linked list and
 frees each node.
 
 The attached patch has the corrected code.

It seems a bad idea to backpatch this; whoever wants this functionality
in back branches should probably run a patched server.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 It seems a bad idea to backpatch this; whoever wants this functionality
 in back branches should probably run a patched server.

Surely this is 9.4 material at this point in any case.

regards, tom lane




[HACKERS] question about postgres storage management

2013-04-19 Thread Will Childs-Klein
Hello All,
I'm writing today to inquire about finding the exact point in the source
where postgres writes to disk. I'm trying to implement some compression in
postgres. The idea is to compress the data right when it's written to disk,
reducing the amount of data written and therefore the time spent on disk
reads and writes. I'm hoping that this reduction in disk IO latency is
greater than the CPU cost incurred by compression, resulting in a speedup.
I will be testing various compression libraries to see which (if any) work
well for various query types. I've been looking through the source code, in
src/backend/storage specifically. I'm thinking something in smgr is where I
want to make my de/compress calls, specifically in
src/backend/storage/smgr/md.c in the functions mdwrite(...) and
mdread(...). Am I in the right place? If not, where should I look?

Sincerely,
Will Childs-Klein


Re: [HACKERS] question about postgres storage management

2013-04-19 Thread Stephen Frost
Will,

* Will Childs-Klein (willc...@gmail.com) wrote:
 I'm writing today to inquire about finding the exact point in the source
 where postgres writes to disk. I'm trying to implement some compression in
 postgres. 

PostgreSQL already does compression in most cases where you'd want it
done.

Thanks,

Stephen




Re: [HACKERS] question about postgres storage management

2013-04-19 Thread Merlin Moncure
On Fri, Apr 19, 2013 at 10:17 AM, Will Childs-Klein willc...@gmail.com wrote:
 Hello All,
 I'm writing today to inquire about finding the exact point in the source
 where postgres writes to disk. I'm trying to implement some compression in
 postgres. The idea is to compress the data right when its written to disk,
 to reduce the amount of data written to disk, reducing the amount of time of
 disk read/write. I'm hoping that this reduction in disk IO latency is
 greater than the CPU cost incurred by compression, resulting in a speedup. I
 will be testing various compression libraries to see which (if any) work
 well for various query types. I've been looking through the source code, in
 src/backend/storage specifically. I'm thinking something in smgr is where i
 want to make my de/compress calls. Specifically in
 src/backend/storage/smgr/md.c in the functions mdwrite(...) and mdread(...).
 Am I in the right place? If not where should I look?

This is not going to work.  Postgres tables are page-organized -- if
you compress pages as they are written out, they become variable
length.  That in turn would force the rest of the file to shift
whenever a page was written back at a different size.

As noted, postgres already compresses the most interesting case --
when a single tuple spans pages (aka TOAST).
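
The page-organized layout is easy to see in md.c itself; paraphrasing the
9.3-era mdwrite() (a sketch, details trimmed), every page is written whole
at an offset computed purely from its block number, so a shorter compressed
page would have nowhere to go:

/* paraphrased from src/backend/storage/smgr/md.c, mdwrite() */
seekpos = (off_t) BLCKSZ * (blocknum % ((BlockNumber) RELSEG_SIZE));
FileSeek(v->mdfd_vfd, seekpos, SEEK_SET);
FileWrite(v->mdfd_vfd, buffer, BLCKSZ);   /* always exactly BLCKSZ bytes */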

merlin




Re: [HACKERS] [GENERAL] currval and DISCARD ALL

2013-04-19 Thread Robert Haas
On Fri, Apr 19, 2013 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 It seems a bad idea to backpatch this; whoever wants this functionality
 in back branches should probably run a patched server.

 Surely this is 9.4 material at this point in any case.

I don't know why this couldn't be slipped into 9.3; we have done worse
later.  But I don't have a personal stake in it either, and will
certainly defer to whatever the consensus is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] question about postgres storage management

2013-04-19 Thread Christopher Browne
I would expect the strategy you have in mind to be more useful applied at
the filesystem level, so that it's not in Postgres at all.  (À la Stacker
-- remember DR-DOS?)

But, to speak arguable heresy, the demerits of this sort of thing are
described nicely in Another Database's Documentation: 
http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-compression-internals-algorithms.html


The relevant bit that seems to describe fairly aptly why what you are
suggesting is unlikely to turn out well:

"Some operating systems implement compression at the file system level.
Files are typically divided into fixed-size blocks that are compressed into
variable-size blocks, which easily leads into fragmentation. Every time
something inside a block is modified, the whole block is recompressed
before it is written to disk. These properties make this compression
technique unsuitable for use in an update-intensive database system."

The principle described is as applicable to Postgres as it is to InnoDB,
and as applicable to compressing disk blocks from within the database as
it is to doing so at the filesystem layer.

Postgres *does* make use of data compression, where applicable; see the
documentation for TOAST: 
http://www.postgresql.org/docs/9.2/static/storage-toast.html

You should avail yourself of the TOAST code:
./src/backend/catalog/toasting.c
./src/backend/access/heap/tuptoaster.c
./src/include/catalog/toasting.h
./src/include/access/tuptoaster.h


Re: [HACKERS] question about postgres storage management

2013-04-19 Thread Merlin Moncure
On Fri, Apr 19, 2013 at 11:17 AM, Christopher Browne cbbro...@gmail.com wrote:

 I would expect the strategy you have in mind to be more useful to apply at
 the filesystem level, so that it's not in Postgres altogether.  (Ala
 Stacker, remember DR-DOS?)

 But, to speak arguable heresy, the demerits of this sort of thing are
 described nicely in Another Database's Documentation:
 http://dev.mysql.com/doc/innodb-plugin/1.0/en/innodb-compression-internals-algorithms.html

 The relevant bit that seems to describe fairly aptly why what you are
 suggesting is unlikely to turn out well:

 Some operating systems implement compression at the file system level.
 Files are typically divided into fixed-size blocks that are compressed into
 variable-size blocks, which easily leads into fragmentation. Every time
 something inside a block is modified, the whole block is recompressed before
 it is written to disk. These properties make this compression technique
 unsuitable for use in an update-intensive database system.

 The principle described is as applicable to Postgres as it is to InnoDB, and
 is as applicable to attempting to compress disk blocks from within the
 database as it is to apply it at the filesystem layer.

 Postgres *does* make use of data compression, where applicable; see the
 documentation for TOAST:
 http://www.postgresql.org/docs/9.2/static/storage-toast.html


...which I turn off in cases where I'm concerned about performance :-D.

merlin




Re: [HACKERS] Inconsistent DB data in Streaming Replication

2013-04-19 Thread Florian Pflug
On Apr 19, 2013, at 14:46, Martijn van Oosterhout klep...@svana.org wrote:
 On Wed, Apr 17, 2013 at 12:49:10PM +0200, Florian Pflug wrote:
 Fixing this on the receive side alone seems quite messy and fragile.
 So instead, I think we should let the master send a shutdown message
 after it has sent everything it wants to send, and wait for the client
 to acknowledge it before shutting down the socket.
 
 If the client fails to respond, we could log a fat WARNING.
 
 ISTM the master should half close the socket, using shutdown(). That
 way the client receives an EOF and can still then send its reply to the
 master.  Then when the master receives that it can close() completely.

Hm, there may be arbitrarily many reply requests within the unread
data in the socket's buffer, so waiting for just one reply won't work.
Also, to distinguish a slave which crashes while the master shuts down
from one that has received all WAL and flushed it, the slave should flush
all WAL and send a final reply before closing the socket.

So the master would, upon shutting down, close only its writing end
of the connection, and continue to receive replies until it sees EOF.
After all slaves have gone, the master would emit a WARNING for every
slave whose last logged flush position is earlier than the master's
idea of end-of-wal.

The slave would, upon seeing EOF, flush all its WAL, send a final
reply, and close() the socket.
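
In socket terms, the master's side of that dance might look roughly like
this (a sketch assuming a blocking socket fd and the usual <sys/socket.h>
and <unistd.h> includes; process_standby_reply() is a hypothetical helper
that records each reported flush position):

char    buf[8192];
ssize_t n;

shutdown(fd, SHUT_WR);              /* half-close: slave sees EOF, can still reply */
while ((n = read(fd, buf, sizeof buf)) > 0)
    process_standby_reply(buf, n);  /* drain remaining replies until EOF */
close(fd);                          /* slave closed its end: fully shut down */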

I'm not sure that relying on TCP's half-close feature has much benefit
over using a home-grown shutdown message, though. Anyway, the basic
shutdown protocol would be the same regardless of what exactly we use
to signal a shutdown.

BTW, I assume we'd only do this for smart shutdowns. 

best regards,
Florian Pflug





Re: [HACKERS] elog() error, trying CURRENT OF with foreign table

2013-04-19 Thread Tom Lane
I wrote:
 Robert Haas robertmh...@gmail.com writes:
 My main concern was actually whether we ought to be detecting this
 earlier in the process, before it gets as far as the executor.

 Yeah, that might be an appropriate response too.  The executor is
 coded so cavalierly because it expects the planner to have replaced
 the CURRENT OF node with something executable.  As things now stand,
 whether that happens or not depends in part on the behavior of FDWs,
 so maybe we'd better have the planner check whether it happened.
 I'm not sure though if there's any suitably-painless place to do it.

After looking at this a bit, my memory was faulty: the CurrentOfExpr
isn't really transformed to something else, we just make sure it ends up
in a TidScan node's tidqual list, where it won't be executed in the
normal way.  The equivalent thing for a foreign table would be for the
FDW's execution code to have special smarts about what to do with a
CurrentOfExpr qual.  So there's no way for the core planner to know
whether a ForeignScan incorporating a CurrentOfExpr qual is OK or not.

Short of a major redesign of the way this is handled, treating the
execution-time error as user-facing seems like the thing to do.

regards, tom lane




Re: [HACKERS] confusing message about archive failures

2013-04-19 Thread Daniel Farina
On Wed, Apr 17, 2013 at 7:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Wednesday, April 17, 2013, Peter Eisentraut wrote:

 When archive_command fails three times, it prints this message into the
 logs:

 transaction log file "%s" could not be archived: too many failures

 This leaves it open what happens next.  What will actually happen is
 that it will usually try again after 60 seconds or so, but the message
 indicates something much more fatal than that.

 Could we rephrase this a little bit to make it less dramatic, like

 ... too many failures, will try again later

 ?


 +1  I've found the current message alarming/confusing as well.  But I don't
 really understand the logic behind bursting the attempts, 3 of them one
 second apart, then sleeping 57 seconds, in the first place.

Same.  By now I am numb, but when I was first rolling out archives
ages ago, the message was cause for much more alarm than was indicated.




Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Jeff Janes
On Wed, Apr 3, 2013 at 6:40 PM, Greg Stark st...@mit.edu wrote:


 On Fri, Aug 21, 2009 at 6:54 PM, decibel deci...@decibel.org wrote:

 Would it? Risk seems like it would just be something along the lines of
 the high-end of our estimate. I don't think confidence should be that hard
 either. IE: hard-coded guesses have a low confidence. Something pulled
 right out of most_common_vals has a high confidence.


I wouldn't be so sure of that.  I've run into cases where all of the
frequencies pulled out of most_common_vals are off by orders of magnitude.
The problem is that if ANALYZE only samples 1/1000th of the table, and it
sees a value twice, it assumes the value is present 2000 times in the
table, even when it was only in the table twice.  Now, for any given value
that occurs twice in the table, it is very unlikely for both of those to
end up in the sample. But when you have millions of distinct values which
each occur twice (or some other low number of times), it is a near
certainty that several of them are going to end up with both instances in
the sample.  Those few that get lucky are of course going to end up in the
most_common_vals list.
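
A quick back-of-the-envelope check of that near certainty, with made-up
but plausible numbers (N rows, a 1/1000 sample, m distinct values occurring
exactly twice):

#include <stdio.h>

int
main(void)
{
    double  N = 1e8;        /* table rows (assumed) */
    double  n = N / 1000;   /* rows ANALYZE samples */
    double  m = 5e6;        /* values occurring exactly twice (assumed) */

    /* chance that both copies of one such value land in the sample */
    double  p = (n / N) * ((n - 1) / (N - 1));

    /* prints p ~ 1e-06, and about 5 expected "lucky" values */
    printf("p = %g, expected lucky values = %g\n", p, m * p);
    return 0;
}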

Since the hashjoin estimates cost depending on the frequency of the most
common value, having this be systematically off by a factor of 1000 is
rather unfortunate.

The problem here is that the sample size which is adequate for getting a
good estimate of the histograms (which is what controls the sample size
currently) is not adequate for getting a good estimate of most_common_vals.
Cranking up statistics_target would give better estimates of
most_common_vals, but at the expense of having a needlessly large
histogram, which slows down planning.  There is currently no knob to crank
up the sample size for the sake of the most common values and then prune
the histogram back down for storage.

Cheers,

Jeff


Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Claudio Freire
On Fri, Apr 19, 2013 at 6:19 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Wed, Apr 3, 2013 at 6:40 PM, Greg Stark st...@mit.edu wrote:


 On Fri, Aug 21, 2009 at 6:54 PM, decibel deci...@decibel.org wrote:

 Would it? Risk seems like it would just be something along the lines of
 the high-end of our estimate. I don't think confidence should be that hard
 either. IE: hard-coded guesses have a low confidence. Something pulled right
 out of most_common_vals has a high confidence.


 I wouldn't be so sure of that.  I've run into cases where all of the
 frequencies pulled out of most_common_vals are off by orders of magnitude.
 The problem is that if ANALYZE only samples 1/1000th of the table, and it
 sees a value twice, it assumes the value is present 2000 times in the table,
 even when it was only in the table twice.  Now, for any given value that
 occurs twice in the table, it is very unlikely for both of those to end up
 in the sample. But when you have millions of distinct values which each
 occur twice (or some other low number of times), it is a near certainty that
 several of them are going to end up with both instances in the sample.  Those
 few that get lucky are of course going to end up in the
 most_common_vals list.


Especially if there's some locality of occurrence, since analyze
samples pages, not rows.




Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Jeff Janes
On Thu, Apr 4, 2013 at 11:53 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

 Robert Haas robertmh...@gmail.com writes:
  for estimate_worstcase_fraction.  So, when computing the cost of a
  path, we'd compute our current expected-case estimate, and also a
  worst-case estimate, and then compute the final cost as:

 There also was the idea for the executor to be able to handle alternate
 plans and some heuristic to determine that the actual cost of running a
 plan is much higher than what's been estimated, so much so as to switch
 to starting from scratch with the other plan instead.


Or even before it starts executing.  If the planner realizes the stakes are
high enough, it could abandon its assumptions about how likely it is for a
block to be in cache, and go do a little sampling of the cache and see.  To
be effective it would probably have to sample the OS cache as well as
shared_buffers, which would certainly complicate things and might not be
portable.

Of course, right now there is no explicit estimate of the cache hit rate
at all; it is just implicitly built into other settings.  So those would
probably need to be separated into a true IO cost, and a default cache
estimate to be used when sampling is not warranted.

Cheers,

Jeff


Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Jeff Janes
On Fri, Apr 19, 2013 at 2:24 PM, Claudio Freire klaussfre...@gmail.com wrote:


 Especially if there's some locality of occurrence, since analyze
 samples pages, not rows.


But it doesn't take all rows in each sampled page.  It generally takes
about one row per page, specifically to avoid the problem you indicate.
Maybe it is possible to trick it into taking too many (for example, if most
pages are completely empty), but I haven't experienced that as being a
problem.

Cheers,

Jeff


Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Claudio Freire
On Fri, Apr 19, 2013 at 7:43 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Fri, Apr 19, 2013 at 2:24 PM, Claudio Freire klaussfre...@gmail.com
 wrote:


 Especially if there's some locality of occurrence, since analyze
 samples pages, not rows.


 But it doesn't take all rows in each sampled page.  It generally takes about
 one row per page, specifically to avoid the problem you indicate. Maybe it
 is possible to trick it into taking too many (for example, if most pages are
 completely empty), but I haven't experienced that as being a problem.


Still, I remember a discussion where it was clear there was a bias
towards sampling rows from the same page (or nearby), resulting in
this particular problem.

