Re: [HACKERS] WIP patch: add (PRE|POST)PROCESSOR options to COPY

2012-09-19 Thread Etsuro Fujita
 From: Tom Lane [mailto:t...@sss.pgh.pa.us]

 I wrote:
  Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
  I have a question.  I think it would be also better to extend the syntax
  for the SQL COPY command in the same way, ie,
  COPY foo from '/home/pgsql/decompress.sh /home/pgsql/foo.csv.gz |' with
  format 'csv'
 
  Yeah, sure --- that case is already superuser-only, so why not give it
  the option of being a popen instead of just fopen.
 
 BTW, one thought that comes to mind is that such an operation is
 extremely likely to fail under environments such as SELinux.  That's
 not necessarily a reason not to do it, but we should be wary of
 promising that it will work everywhere.  Probably a documentation note
 about this would be enough.

OK, I'll revise the patch.

Thank you for your advice!

Best regards,
Etsuro Fujita





Re: [HACKERS] [ADMIN] pg_upgrade from 9.1.3 to 9.2 failed

2012-09-19 Thread Bruce Momjian
On Tue, Sep 18, 2012 at 07:22:39PM -0400, Bruce Momjian wrote:
  Based on the fact that sql_features exists in the information_schema
  schema, I don't think the 'sql_features' table itself is being processed
  by pg_upgrade; rather, its TOAST table, because it has a high OID, is
  being processed because it is in the pg_toast schema.  This is
  causing the mismatch between the old and new clusters.
  
  I am thinking this query needs to be split apart into a UNION where the
  second part handles TOAST tables and looks at the schema of the _owner_
  of the TOAST table.  Needs to be backpatched too.
 
 OK, I am at a conference now, so I will not be able to write up a patch
 until perhaps next week.  You can drop the information schema in the old
 database and pg_upgrade should run fine.  I will test your failure once
 I create a patch.

One good thing is that pg_upgrade detected there was a bug in the code
and threw an error, rather than producing an inaccurate dump.
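
A schematic of the proposed UNION shape (an illustration only, not the
actual pg_upgrade query; the exact catalog joins and filters here are
assumptions):

-- Regular relations are filtered by their own schema; TOAST tables are
-- filtered by the schema of the table that owns them.
SELECT c.oid, c.relname
FROM pg_class c
     JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema')
UNION ALL
SELECT t.oid, t.relname
FROM pg_class t
     JOIN pg_class o ON o.reltoastrelid = t.oid
     JOIN pg_namespace n ON n.oid = o.relnamespace  -- owner's schema
WHERE n.nspname NOT IN ('pg_catalog', 'information_schema');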

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] proposal - assign result of query to psql variable

2012-09-19 Thread Shigeru HANADA
On Fri, Aug 10, 2012 at 3:21 AM, Pavel Stehule pavel.steh...@gmail.com
wrote:
 there is a new version of this patch

 * cleaned var list parser
 * new regress tests
 * support FETCH_COUNT > 0

Here are my review comments.

Submission
==========
The patch is formatted in context diff style, and it applies cleanly
against the latest master.  The patch includes documentation and tests,
but IMO they need some enhancement.

Usability
=========
This patch provides a new psql command, \gset, which sends the content
of the query buffer to the server and stores the result of the query
into psql variables.  The name \gset is a mixture of \g, which sends
the result to a file or pipe, and \set, which sets a variable to some
value, so it should sound natural to psql users.
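
A usage sketch (the exact variable-list syntax is inferred from the
regression tests quoted below, so treat the details as an assumption):

postgres=# SELECT 10, 'Hello World'
postgres-# \gset ten, greeting
postgres=# \echo :ten :greeting
10 Hello World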

Feature test
============
Compilation completed without warnings.  Regression tests for \gset passed,
but I have some comments on them.

- Other regression tests have the comment "-- ERROR" just after queries
which should fail.  It would be nice to follow this convention.
- Typo "to few" (should be "too few") in the expected file and source file.
- How about adding a test for \gset with no variable list to the "should
fail" cases?
- Is it intentional that \gset can set special variables such as
AUTOCOMMIT and HOST?  I don't see any downside to this behavior,
because \set can also do that, but it is neither documented nor tested.

Document
========
- Adding some description of the \gset command, especially about the
limitations of the variable list, seems necessary.
- In addition to the meta-command section, the "Advanced Features"
section mentions how to set psql's variables, so we would need some
mention there too.
- The term "target list" might not be familiar to users, since it
appears mostly in sections describing PG internals.  I think the
feature described in the section "Retrieving Query Results" in the
ECPG documentation is similar to this feature.
http://www.postgresql.org/docs/devel/static/ecpg-variables.html

Coding
======
The code follows our coding conventions.  Here are my comments on the code.

- Some typos found in comments; please see the attached patch.
- There is a code path in StoreQueryResult which doesn't print an error
message even if libpq reported an error (PGRES_BAD_RESPONSE,
PGRES_NONFATAL_ERROR, PGRES_FATAL_ERROR).  Is this intentional?  FYI,
ecpg prints a "bad response" message for those errors.

Although I'll look at the code more closely later, I have marked the
patch "Waiting on Author" for the comments above.

Regards,
-- 
Shigeru HANADA
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 0e9b408..a76b84d 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -832,7 +832,7 @@ PrintQueryResults(PGresult *results)
  *
  * Note: Utility function for use by SendQuery() only.
  *
- * Returns true if the query executed sucessfully, false otherwise.
+ * Returns true if the query executed successfully, false otherwise.
  */
 static bool
 StoreQueryResult(PGresult *result)
@@ -865,7 +865,7 @@ StoreQueryResult(PGresult *result)
 				{
 					if (!iter)
 					{
-						psql_error("to few target variables\n");
+						psql_error("too few target variables\n");
 						success = false;
 						break;
 					}
@@ -902,7 +902,7 @@ StoreQueryResult(PGresult *result)
 
 		case PGRES_COPY_OUT:
 		case PGRES_COPY_IN:
-			psql_error("COPY isnot supported by \\gset command\n");
+			psql_error("COPY is not supported by \\gset command\n");
 			success = false;
 			break;
 
@@ -1797,7 +1797,7 @@ expand_tilde(char **filename)
 
 
 /*
- * Add name of internal variable to query targer list
+ * Add name of internal variable to query target list
  *
  */
 TargetList
diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out
index 90ab9bd..54fa490 100644
--- a/src/test/regress/expected/psql.out
+++ b/src/test/regress/expected/psql.out
@@ -15,7 +15,7 @@ select 10, 20, 'Hello World'
 -- should fail
 \gset ,,
 \gset ,
-to few target variables
+too few target variables
 \gset ,,,
 too many target variables
 -- should be ok



Re: [HACKERS] Reduce palloc's in numeric operations.

2012-09-19 Thread Heikki Linnakangas

On 14.09.2012 11:25, Kyotaro HORIGUCHI wrote:

Hello, I propose reducing the number of pallocs in numeric operations.

The numeric operations are slow by nature, but usually that is not
a problem for on-disk operations.  The slowdown is more pronounced
for in-memory operations, though.

I inspected them and found some very short-lived pallocs.  These
pallocs are used as temporary storage for the digits of unpacked
numerics.

The formats of numeric digits in the packed and unpacked forms are
the same, so we can eliminate some of the pallocs by using the digits
of the packed numeric in place to make the unpacked one.

In this patch, I added a new function set_var_from_num_nocopy() to
do this, and made use of it for operands which won't be modified.


Have to be careful to really not modify the operands. numeric_out() and 
numeric_out_sci() are wrong; they call get_str_from_var(), which 
modifies the argument. Same with numeric_expr(): it passes the argument 
to numericvar_to_double_no_overflow(), which passes it to 
get_str_from_var(). numericvar_to_int8() also modifies its argument, so 
all the functions that use that, directly or indirectly, must make a copy.


Perhaps get_str_from_var(), and the other functions that currently 
scribble on the arguments, should be modified to not do so. They could 
easily make a copy of the argument within the function. Then the callers 
could safely use set_var_from_num_nocopy(). The performance would be the 
same, you would have the same number of pallocs, but you would get rid 
of the surprising argument-modifying behavior of those functions.



The performance gain seems quite moderate:

'SELECT SUM(numeric_column) FROM on_memory_table' for ten million
rows of roughly 8-digit numerics runs in 3480 ms against the original
3930 ms, an 11% gain.  'SELECT SUM(int_column) FROM on_memory_table'
needed 1570 ms.

Similarly, an 8% gain for numerics of about 30 to 50 digits.  In
contrast, the performance of avg(numeric) showed no gain.

Do you think this is worth doing?


Yes, I think this is worthwhile. I'm seeing an even bigger gain, with 
smaller numerics. I created a table with this:


CREATE TABLE numtest AS SELECT a::numeric AS col FROM generate_series(1, 
1000) a;


And repeated this query with \timing:

SELECT SUM(col) FROM numtest;

The execution time of that query fell from about 5300 ms to 4300 ms,
i.e. about 20%.


- Heikki




Re: [HACKERS] ToDo: allow to get a number of processed rows by COPY statement

2012-09-19 Thread Heikki Linnakangas

On 16.08.2012 14:43, Pavel Stehule wrote:

Hello

here is the updated patch


The patch seems to be truncated; it ends with:

*** a/src/test/regress/input/copy.source
--- b/src/test/regress/input/copy.source
***
*** 106,108  this is just a line full of junk that would error out 
if parsed

--- 106,112 
  \.

  copy copytest3 to stdout csv header;
+
+ -- copy should to return processed rows
+ do $$
+

I believe the code changes are OK, but regression test changes are 
missing. Can you resend the full patch, please?


- Heikki




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Tom Lane
florian.schoppm...@emc.com (Florian Schoppmann) writes:
 In PostgreSQL 9.1 and 9.2 (possibly also in earlier versions), the query

 --8<--
 WITH source AS (
 SELECT i FROM generate_series(1,10) AS i
 )
 SELECT
 i
 FROM
 source, (
 SELECT
 count(*) AS _n
 FROM source
 ) AS _stats
 WHERE
 random() < 5::DOUBLE PRECISION/_n;
 --8<--

[ doesn't do what you think it should ]

I can't get excited about this.  Any time you put a volatile function
into WHERE, you're playing with fire.  The docs warn against it:
http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

To do what you want, I'd suggest wrapping the join into a sub-select
with an OFFSET 0 clause, which will serve as an optimization fence
that prevents the random() call from being pushed down.
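
A sketch of that rewrite applied to the query above (OFFSET 0 returns
all rows unchanged; it only keeps the planner from pushing the qual
into the fenced subquery):

WITH source AS (
    SELECT i FROM generate_series(1,10) AS i
)
SELECT i
FROM (
    SELECT i, _n
    FROM source, (SELECT count(*) AS _n FROM source) AS _stats
    OFFSET 0
) AS fenced
WHERE random() < 5::DOUBLE PRECISION/_n;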

regards, tom lane




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Merlin Moncure
On Wed, Sep 19, 2012 at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 florian.schoppm...@emc.com (Florian Schoppmann) writes:
 In PostgreSQL 9.1 and 9.2 (possibly also in earlier versions), the query

 --8<--
 WITH source AS (
 SELECT i FROM generate_series(1,10) AS i
 )
 SELECT
 i
 FROM
 source, (
 SELECT
 count(*) AS _n
 FROM source
 ) AS _stats
 WHERE
 random() < 5::DOUBLE PRECISION/_n;
 --8<--

 [ doesn't do what you think it should ]

 I can't get excited about this.  Any time you put a volatile function
 into WHERE, you're playing with fire.  The docs warn against it:
 http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

 To do what you want, I'd suggest wrapping the join into a sub-select
 with an OFFSET 0 clause, which will serve as an optimization fence
 that prevents the random() call from being pushed down.

I like the more standard CTE approach to optimization fencing where it works:

postgres=# WITH source AS (
SELECT i, random() r
FROM generate_series(1,10) AS i
)
SELECT
i
FROM
source, (
SELECT
count(*) AS _n
FROM source
) AS _stats
WHERE
r < 5::DOUBLE PRECISION/_n;

merlin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Robert Haas
On Wed, Sep 19, 2012 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 florian.schoppm...@emc.com (Florian Schoppmann) writes:
 In PostgreSQL 9.1 and 9.2 (possibly also in earlier versions), the query

 --8<--
 WITH source AS (
 SELECT i FROM generate_series(1,10) AS i
 )
 SELECT
 i
 FROM
 source, (
 SELECT
 count(*) AS _n
 FROM source
 ) AS _stats
 WHERE
 random() < 5::DOUBLE PRECISION/_n;
 --8<--

 [ doesn't do what you think it should ]

 I can't get excited about this.  Any time you put a volatile function
 into WHERE, you're playing with fire.  The docs warn against it:
 http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

 To do what you want, I'd suggest wrapping the join into a sub-select
 with an OFFSET 0 clause, which will serve as an optimization fence
 that prevents the random() call from being pushed down.

You've repeatedly objected to complaints on pgsql-performance on the
grounds that WITH is an optimization fence.  It seems awfully
inconsistent to turn around and say, oh, sometimes it's not a fence
after all.  It seems that users may not rely on WITH either to do the
optimizations necessary to have good performance or to fail to do
optimizations that lead to wrong results.  Ouch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Sep 19, 2012 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 To do what you want, I'd suggest wrapping the join into a sub-select
 with an OFFSET 0 clause, which will serve as an optimization fence
 that prevents the random() call from being pushed down.

 You've repeatedly objected to complaints on pgsql-performance on the
 grounds that WITH is an optimization fence.  It seems awfully
 inconsistent to turn around and say, oh, sometimes it's not a fence
 after all.

Huh?  The join in question is not inside a WITH.  If it were, that
would work too, as noted by Merlin.

regards, tom lane




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Robert Haas
On Wed, Sep 19, 2012 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Sep 19, 2012 at 10:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 To do what you want, I'd suggest wrapping the join into a sub-select
 with an OFFSET 0 clause, which will serve as an optimization fence
 that prevents the random() call from being pushed down.

 You've repeatedly objected to complaints on pgsql-performance on the
 grounds that WITH is an optimization fence.  It seems awfully
 inconsistent to turn around and say, oh, sometimes it's not a fence
 after all.

 Huh?  The join in question is not inside a WITH.  If it were, that
 would work too, as noted by Merlin.

Oh, hmm.  I see now: the problem isn't that random() is being pushed
into the WITH, it's that it's being pushed into the join.  Sorry, I
should have read that more carefully.

It still seems like awfully weird behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Removal of AcceptInvalidationMessages broke things

2012-09-19 Thread Tom Lane
I looked into bug #7557, which demonstrates a case where a session fails
to notice a just-committed change in table permissions.  This is pretty
obviously due to a failure to read the sinval message notifying other
backends of the pg_class.relacl update.  Some digging in the git history
says it got broken here:

commit 4240e429d0c2d889d0cda23c618f94e12c13ade7
Author: Robert Haas rh...@postgresql.org
Date:   Fri Jul 8 22:19:30 2011 -0400

Try to acquire relation locks in RangeVarGetRelid.

In the previous coding, we would look up a relation in RangeVarGetRelid,
lock the resulting OID, and then AcceptInvalidationMessages().  While
this was sufficient to ensure that we noticed any changes to the
relation definition before building the relcache entry, it didn't
handle the possibility that the name we looked up no longer referenced
the same OID.  This was particularly problematic in the case where a
table had been dropped and recreated: we'd latch on to the entry for
the old relation and fail later on.  Now, we acquire the relation lock
inside RangeVarGetRelid, and retry the name lookup if we notice that
invalidation messages have been processed meanwhile.  Many operations
that would previously have failed with an error in the presence of
concurrent DDL will now succeed.

because that commit removed this bit from relation_openrv:

-   /*
-* Check for shared-cache-inval messages before trying to open the
-* relation.  This is needed to cover the case where the name identifies a
-* rel that has been dropped and recreated since the start of our
-* transaction: if we don't flush the old syscache entry then we'll latch
-* onto that entry and suffer an error when we do RelationIdGetRelation.
-* Note that relation_open does not need to do this, since a relation's
-* OID never changes.
-*
-* We skip this if asked for NoLock, on the assumption that the caller has
-* already ensured some appropriate lock is held.
-*/
-   if (lockmode != NoLock)
-   AcceptInvalidationMessages();

and there are no other places where a transaction will do
AcceptInvalidationMessages if it thinks it already holds lock on the
tables it's working with.

Since we have only a few hours before 9.2.1 is due to wrap, my
inclination for a band-aid fix is to put back that code.  There might be
some more elegant answer, but we haven't got time to find it now.

Comments?

regards, tom lane




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 It still seems like awfully weird behavior.

Why?  The WHERE condition relates only to the output of the _stats
subquery, so why shouldn't it be evaluated there, rather than after
the join?

In a green field I might agree that we should de-optimize such cases,
but the problem with doing so is that it would totally destroy
performance for cases in which a user has defined a function that's
actually stable or immutable but they forgot to mark it so.  If
VOLATILE weren't the default marking, such a change wouldn't be so
problematic ... but it is.  Given that the behavior has been like
this since the late stone age, I'm not inclined to change it.

regards, tom lane




Re: [HACKERS] Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.

2012-09-19 Thread Tom Lane
BTW, what should our advice be for recovering from corruption due to
this bug?  As far as the btree and GIN problems go, we can tell people
that REINDEX will fix it.  And in 9.1, you don't really need to worry
about the visibility map being bad.  But what do you do in 9.2, if
you have a bad visibility map?  Is there any fix short of VACUUM FULL?

regards, tom lane




Re: [HACKERS] Removal of AcceptInvalidationMessages broke things

2012-09-19 Thread Noah Misch
On Wed, Sep 19, 2012 at 01:17:17PM -0400, Tom Lane wrote:
 I looked into bug #7557, which demonstrates a case where a session fails
 to notice a just-committed change in table permissions.

 -   /*
 -* Check for shared-cache-inval messages before trying to open the
 -* relation.  This is needed to cover the case where the name identifies a
 -* rel that has been dropped and recreated since the start of our
 -* transaction: if we don't flush the old syscache entry then we'll latch
 -* onto that entry and suffer an error when we do RelationIdGetRelation.
 -* Note that relation_open does not need to do this, since a relation's
 -* OID never changes.
 -*
 -* We skip this if asked for NoLock, on the assumption that the caller has
 -* already ensured some appropriate lock is held.
 -*/
 -   if (lockmode != NoLock)
 -   AcceptInvalidationMessages();
 
 and there are no other places where a transaction will do
 AcceptInvalidationMessages if it thinks it already holds lock on the
 tables it's working with.
 
 Since we have only a few hours before 9.2.1 is due to wrap, my
 inclination for a band-aid fix is to put back that code.  There might be
 some more elegant answer, but we haven't got time to find it now.

Sounds fine for now.  I suspect the better change would be to make
AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
There's no reason to desire recent ACLs only when opening by name.




Re: [HACKERS] Re: [COMMITTERS] pgsql: Properly set relpersistence for fake relcache entries.

2012-09-19 Thread Simon Riggs
On 19 September 2012 18:47, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, what should our advice be for recovering from corruption due to
 this bug?  As far as the btree and GIN problems go, we can tell people
 that REINDEX will fix it.  And in 9.1, you don't really need to worry
 about the visibility map being bad.  But what do you do in 9.2, if
 you have a bad visibility map?  Is there any fix short of VACUUM FULL?

SET vacuum_freeze_table_age = 0;
VACUUM;

I'm writing the full notes out now.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Removal of AcceptInvalidationMessages broke things

2012-09-19 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Wed, Sep 19, 2012 at 01:17:17PM -0400, Tom Lane wrote:
 Since we have only a few hours before 9.2.1 is due to wrap, my
 inclination for a band-aid fix is to put back that code.  There might be
 some more elegant answer, but we haven't got time to find it now.

 Sounds fine for now.  I suspect the better change would be to make
 AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
 There's no reason to desire recent ACLs only when opening by name.

I think it's enough for now because the first access to a relation in a
statement is always a name-based lookup from the parser.  Were that not
sufficient, we'd have had complaints before.

The core problem really is that GRANT/REVOKE don't take any object-level
lock on what they're changing.  A real fix might require sprinkling
AcceptInvalidationMessages calls into aclchk.c, but I'm unsure of the
performance costs of that.  Anyway, today is not the time to design
something better.

regards, tom lane




Re: [HACKERS] Removal of AcceptInvalidationMessages broke things

2012-09-19 Thread Robert Haas
On Wed, Sep 19, 2012 at 2:17 PM, Noah Misch n...@leadboat.com wrote:
 Sounds fine for now.  I suspect the better change would be to make
 AcceptInvalidationMessages() unconditional in LockRelationOid() and friends.
 There's no reason to desire recent ACLs only when opening by name.

I agree, on both counts.  I think I failed to realize when doing that
refactoring that I was reducing the number of cases where
AcceptInvalidationMessages() would actually be called.  AFAICS, the
only reason why we have such complicated rules for calling that
function is that it's traditionally been expensive.  But that should
be much less true now due to improvements in 9.2 (cf commit
b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4).  So we can probably afford
to be a little less byzantine about the way we do this now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Robert Haas
On Wed, Sep 19, 2012 at 1:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 It still seems like awfully weird behavior.

 Why?  The WHERE condition relates only to the output of the _stats
 subquery, so why shouldn't it be evaluated there, rather than after
 the join?

Because my mental model (and apparently that of the OP) is that the
WHERE clause gets evaluated separately for each row.  Obviously in
many cases that can be optimized without changing the results, but not
in this case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 It still seems like awfully weird behavior.
 
 Why?  The WHERE condition relates only to the output of the _stats
 subquery, so why shouldn't it be evaluated there, rather than
 after the join?

In another thread, Tom Lane t...@sss.pgh.pa.us wrote:
 It's easier to understand why this is if you realize that SQL has
 a very clear model of a pipeline of query execution. 
 Conceptually, what happens is:
 
 1. Form the cartesian product of the tables listed in FROM (ie,
 all combinations of rows).
 
 2. Apply the WHERE condition to each row from 1, and drop rows
 that don't pass it.
 
People expect that the results will be consistent with this model,
even if the implementation is optimized under the covers.  I think
correct semantics should trump performance here.
 
-Kevin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread k...@rice.edu
On Wed, Sep 19, 2012 at 02:39:12PM -0500, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:
  Robert Haas robertmh...@gmail.com writes:
  It still seems like awfully weird behavior.
  
  Why?  The WHERE condition relates only to the output of the _stats
  subquery, so why shouldn't it be evaluated there, rather than
  after the join?
 
 In another thread, Tom Lane t...@sss.pgh.pa.us wrote:
  It's easier to understand why this is if you realize that SQL has
  a very clear model of a pipeline of query execution. 
  Conceptually, what happens is:
  
  1. Form the cartesian product of the tables listed in FROM (ie,
  all combinations of rows).
  
  2. Apply the WHERE condition to each row from 1, and drop rows
  that don't pass it.
  
 People expect that the results will be consistent with this model,
 even if the implementation is optimized under the covers.  I think
 correct semantics should trump performance here.
  
 -Kevin
 

It seems like that is what happens here, except that the function is
evaluated once for the WHERE clause and not once per row.  Both behaviors
meet the criterion in step 2 above, and Tom's earlier comments still hold.

Regards,
Ken




[HACKERS] CTE optimization fence on the todo list?

2012-09-19 Thread Daniel Browning
I would like to have the option of disabling the CTE optimization fence for 
certain CTEs and/or queries. Can that be added to the official todo list? If 
not, why not?

I would find the option beneficial because large, complicated queries are 
often a lot clearer, simpler, and easier to read with CTEs than the 
equivalent query without CTEs. In some cases, the query with CTEs is also 
faster because of the optimization fence. But in other cases, the fence 
makes it a lot slower. In the latter cases, you are left with a choice 
between ugly and slow.

If there was some method to disable the optimization fence for certain CTEs 
or entire queries, then it would be possible to have the best of both 
worlds.
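
As a minimal illustration of the trade-off (assuming a table t with an
index on column k; the syntax for selectively disabling the fence is
exactly what is being asked for here):

-- Today the CTE is a fence: t is read in full and filtered afterwards,
-- even when an index scan on k would be far cheaper.
WITH w AS (SELECT * FROM t)
SELECT * FROM w WHERE k = 42;

-- The equivalent subquery form lets the predicate be pushed down:
SELECT * FROM (SELECT * FROM t) s WHERE k = 42;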

I apologize if this has already been covered before. I could only find two 
earlier discussions on this topic:

http://archives.postgresql.org/pgsql-performance/2011-10/msg00208.php
http://archives.postgresql.org/pgsql-performance/2011-11/msg00015.php

In the latter, I counted four people who are in support of the general 
idea: Robert Haas, Andres Freund, Gavin Flower, Justin Pitts. However, I'm 
sure there are a lot of conflicting ideas on how exactly to go about it, 
such as whether to enable or disable it by default, the specific syntax to 
use, backwards compatibility, future-proofing, etc.

One good reason to reject it would be if it can't be done with SQL standard 
syntax and would require some sort of PG-specific hint or GUC variable for 
the query planner. If so, then I understand that it's opposed for all the 
same reasons that hints are opposed in general.

Another good reason to reject it might be because the only way to disable 
the CTE fence is to disable it by default. If that were the case, then I 
would imagine that it would break backwards compatibility, especially in the 
case of writable CTEs that currently depend on the fence for their current 
functionality. If there is no way to palatably enable it by default but 
allow certain CTEs or certain queries to disable it, then I don't see any 
way around that problem.

A third reason I can imagine is that the only desirable solution (e.g. the 
one without additional non-standard keywords or session GUC variables) is 
effectively impossible. For example, if it requires that the query planner 
determine definitively whether a CTE is read only or not, that may be a 
bridge too far.

A fourth possible reason is that the core team feels that CTEs do not 
improve readability, or that any such readability benefits are not worth the 
effort to support the option. Personally, I feel that the queries which 
could most benefit from the readability of CTEs are precisely the same ones 
that could most often benefit from the performance increase of disabling the 
fence (particularly if it could be done on a per-CTE basis rather than for 
the whole query at once).

Of course the real reason could be something else entirely, hence this post. 
Thanks in advance for your feedback.
--
DB




Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-09-19 Thread Simon Riggs
On 12 September 2012 04:30, Amit Kapila amit.kap...@huawei.com wrote:
 On Tuesday, September 11, 2012 9:09 PM Alvaro Herrera wrote:
 Excerpts from Boszormenyi Zoltan's message of vie jun 29 09:11:23 -0400 2012:

 We have some use cases for this patch, when can you post
 a new version? I would test and review it.

 What use cases do you have in mind?

   Wouldn't it be helpful for some features like parallel query in future?

Trying to solve that is what delayed this patch, so the scope of this
needs to be permanent daemons rather than dynamically spawned worker
tasks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Kevin Grittner
k...@rice.edu k...@rice.edu wrote:
 On Wed, Sep 19, 2012 at 02:39:12PM -0500, Kevin Grittner wrote:
 In another thread, Tom Lane t...@sss.pgh.pa.us wrote:
 
 2. Apply the WHERE condition to each row from 1, and drop rows
 that don't pass it.
  
 People expect that the results will be consistent with this
 model, even if the implementation is optimized under the
 covers.  I think correct semantics should trump performance
 here.
 
 It seems like this is what happens here except that the function
 is evaluated once for the WHERE and not once per ROW. Both of
 these meet the criterion for 2 above and Tom's earlier comments
 both hold.
 
There really needs to be some way to specify that when an expression
is evaluated for each row in a set, a function used within that
expression is not optimized away for some rows.  Fortunately we have
a way:
 
http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
 
| VOLATILE indicates that the function value can change even within
| a single table scan, so no optimizations can be made. Relatively
| few database functions are volatile in this sense; some examples
| are random(), [...]
 
The behavior in the OP's query would certainly be sane if the
function were not VOLATILE; as it is, I have a hard time seeing this
as anything but a bug.

There is a workaround, if you don't mind ugly:
 
CREATE FUNCTION random_really_i_mean_it(dummy int)
  RETURNS double precision
  LANGUAGE plpgsql
  VOLATILE
AS $$
BEGIN
  -- no need to reference dummy parameter
  RETURN random();
END;
$$;

WITH source AS (
SELECT i FROM generate_series(1,10) AS i
)
SELECT
i
FROM
source, (
SELECT
count(*) AS _n
FROM source
) AS _stats
WHERE
random_really_i_mean_it(i) < 5::DOUBLE PRECISION/_n;
 
-Kevin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 There is a workaround, if you don't mind ugly:
 
Or, better:
 
WITH source AS (
SELECT i, random() AS r FROM generate_series(1,10) AS i
)
SELECT
i
FROM
source, (
SELECT
count(*) AS _n
FROM source
) AS _stats
WHERE
r < 5::DOUBLE PRECISION/_n;
 
-Kevin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread David Johnston
 -Original Message-
 There really needs to be some way to specify that when an expression is
 evaluated for each row in a set, a function used within that expression
 is not optimized away for some rows.  Fortunately we have a way:
 
 http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
 
 | VOLATILE indicates that the function value can change even within a
 | single table scan, so no optimizations can be made. Relatively few
 | database functions are volatile in this sense; some examples are
 | random(), [...]
 
 The behavior in the OP's query would certainly be sane if the function
 were not VOLATILE; as it is, I have a hard time seeing this as anything
 but a bug.

What are the arguments against adding a 4th identifier - call it PER_ROW -
for this purpose?  The main reason VOLATILE is broken is that it is the
default, and in order to minimize the beginner's penalty it is not treated
as such in some situations.  The new one could behave just like VOLATILE,
but would never be optimized away and would always evaluate once for each
row in its context.

Then the question is whether you write a new random() function or break
backwards compatibility and alter the existing version.

David J.






Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Kevin Grittner
David Johnston pol...@yahoo.com wrote:
 
 | VOLATILE indicates that the function value can change even
 | within a single table scan, so no optimizations can be made.
 | Relatively few database functions are volatile in this sense;
 | some examples are random(), [...]
 
 What are the arguments against adding a 4th identifier - call it
 PER_ROW - for this purpose?  The main reason VOLATILE is broken is
 that it is the default, and in order to minimize the beginner's
 penalty it is not treated as such in some situations.  The new one
 could behave just like VOLATILE, but would never be optimized away
 and would always evaluate once for each row in its context.  
 
So how would you document that?  It sounds like the proposed level
would behave exactly as the VOLATILE level is currently documented
to behave; so I guess we could shift the documentation of VOLATILE
to PER_ROW (or whatever).  How would you then describe the behavior
of VOLATILE?
 
-Kevin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread David Johnston
 -Original Message-
 
  | VOLATILE indicates that the function value can change even within a
  | single table scan, so no optimizations can be made.
  | Relatively few database functions are volatile in this sense; some
  | examples are random(), [...]
 
  What are the arguments against adding a 4th identifier - call it
  PER_ROW for this argument?  The main reason VOLATILE is broken is that
  it is the default and in order to minimize beginner's penalty it is
  not treated as such in some situations.  The new one could behave just
  like VOLATILE but would never be optimized away and would always
  evaluate once for each row in its context.
 
 So how would you document that?  It sounds like the proposed level would
 behave exactly as the VOLATILE level is currently documented to behave;
 so I guess we could shift the documentation of VOLATILE to PER_ROW (or
 whatever).  How would you then describe the behavior of VOLATILE?
 

I'm not sure, but however we describe it, we might as well make the
change now regardless of whether another level is added.

The main distinguishing characteristic is that VOLATILE is not guaranteed to
evaluate once-per-row if it is not dependent upon particular values within a
given row.  

VOLATILE: A Volatile function used in an ORDER BY or WHERE clause without
referencing any columns from the query itself (i.e., no parameters or all
constants) will be evaluated a single time and the result treated as a
constant (i.e., all rows will have identical values) for that part of the
query.

PER_ROW: A per_row function will be evaluated once for every row that is
visible to the function, and will be treated as a virtual column of said
relation, with each cell having its own value as a result of the
function call.

Using random() as an example of the two possible behaviors should further
clarify the differences quite nicely.
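
For instance, taking the query from the original report (PER_ROW is
hypothetical; the contrast follows the descriptions above):

WITH source AS (SELECT i FROM generate_series(1,10) AS i)
SELECT i
FROM source, (SELECT count(*) AS _n FROM source) AS _stats
-- VOLATILE today: random() references no columns of the join, so it
-- may be evaluated once, returning all ten rows or none.
-- PER_ROW: the predicate would be re-evaluated for each joined row,
-- keeping roughly half of them.
WHERE random() < 5::DOUBLE PRECISION/_n;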

Quick pass - hopefully, a) this inspires someone else, and b) this is the
correct understanding in the first place.

David J.






Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Kevin Grittner
David Johnston pol...@yahoo.com wrote:
 
 VOLATILE: A Volatile function used in an ORDER BY or WHERE clause
 without referencing any columns from the query itself (i.e., no
 parameters or all constants) will be evaluated a single time and
 the result treated as a constant (i.e., all rows will have
 identical values) for that part of the query.
 
I hope you're wrong about the ORDER BY part of that.  A quick test
confirms that it works in ORDER BY, at least for some cases.  If
there are any exceptions to that, I would sure like to know about
it -- and really soon.
 
select * from generate_series(1, 1) s(n)
  order by random() limit 10;
 
-Kevin




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread David Johnston
 -Original Message-
 From: Kevin Grittner [mailto:kevin.gritt...@wicourts.gov]
 Sent: Wednesday, September 19, 2012 5:51 PM
 To: k...@rice.edu; David Johnston
 Cc: 'Florian Schoppmann'; 'Robert Haas'; pgsql-hackers@postgresql.org;
 'Tom Lane'
 Subject: RE: [HACKERS] Invalid optimization of VOLATILE function in WHERE
 clause?
 
 David Johnston pol...@yahoo.com wrote:
 
  VOLATILE: A Volatile function used in an ORDER BY or WHERE clause
  without referencing any columns from the query itself (i.e., no
  parameters or all constants) will be evaluated a single time and the
  result treated as a constant (i.e., all rows will have identical
  values) for that part of the query.
 
 I hope you're wrong about the ORDER BY part of that.  A quick test
 confirms that it works in ORDER BY, at least for some cases.  If there
 are any exceptions to that, I would sure like to know about it -- and
 really soon.
 
 select * from generate_series(1, 1) s(n)
   order by random() limit 10;
 
 -Kevin

I'd rather have someone who knows the code assert one way or the other; I
tossed it in there because I thought I'd seen people complain that random()
doesn't work as expected with ORDER BY, but that may just be faulty memory.
It may or may not depend on whether LIMIT/OFFSET are involved...?  Used in
the SELECT-list it gets evaluated for each row, and I guess the ORDER BY
could have that behavior as well (I would expect it to anyway), so is it
strictly in WHERE clause evaluation that this discrepancy manifests?

David J.








Re: [HACKERS] [COMMITTERS] pgsql: Fix bufmgr so CHECKPOINT_END_OF_RECOVERY behaves as a shutdown c

2012-09-19 Thread Andres Freund
On Tuesday, September 18, 2012 04:18:01 AM Robert Haas wrote:
  Maybe what we should do is - if this is an end-of-recovery checkpoint
  - *assert* that the BM_PERMANENT bit is set on every buffer we find.
  That would provide a useful cross-check that we don't have a bug
  similar to the one Jeff already fixed in any other code path.
  
  I haven't looked into the details, but can't a new unlogged relation be
  created since the last checkpoint and thus have pages in s_b?
 
 Data changes to unlogged relations are not WAL-logged, so there's no
 reason for recovery to ever read them.  Even if such a reason existed,
 there wouldn't be anything to read, because the backing files are
 unlinked before WAL replay begins.
Back then I thought that resetting the relation by copying the init fork might 
use the buffer cache. It doesn't atm...

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Invalid optimization of VOLATILE function in WHERE clause?

2012-09-19 Thread Florian Schoppmann
Tom Lane t...@sss.pgh.pa.us wrote:

 florian.schoppm...@emc.com (Florian Schoppmann) writes:
  [VOLATILE function in WHERE clause *does* get optimized]
 
 I can't get excited about this.  Any time you put a volatile function
 into WHERE, you're playing with fire.  The docs warn against it:
 http://www.postgresql.org/docs/9.2/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL

All this section tells me is that one cannot rely on the evaluation
order in expressions, and that side effects are dangerous in WHERE and
HAVING clauses.

I do not read in this section that VOLATILE functions are unsafe per se.
After all, a VOLATILE function is not required to have side effects
(suppose, e.g., somebody implemented a true random number generator).

However,
http://www.postgresql.org/docs/9.2/interactive/sql-createfunction.html
is actually very clear in its wording:

| VOLATILE indicates that the function value can change even within a
| single table scan, so no optimizations can be made.

I therefore tend to see the behavior as a bug. I concede though that
there is also good reason to keep the current behavior (VOLATILE is the
default for UDFs, etc.). But then I think the documentation needs to be
changed, and it has to be made explicit where the optimizer may change
semantics and what, on the other hand, is defined behavior.

E.g., users should need to know if the following rewrite causes defined
or undefined behavior. Does it give correct results by accident, or are
correct results guaranteed?

--8<--
WITH source AS (
SELECT i FROM generate_series(1,10) AS i
)
SELECT
i
FROM
source AS _stats
WHERE
random() < 5::DOUBLE PRECISION
   / (SELECT count(*) FROM source);
--8<--

 To do what you want, I'd suggest wrapping the join into a sub-select
 with an OFFSET 0 clause, which will serve as an optimization fence
 that prevents the random() call from being pushed down.

My interpretation so far is that VOLATILE functions in a WHERE clause
can always be avoided: E.g., move the condition to the SELECT list and
embed in an outer query that then filters on the condition column. Or is
my assumption wrong, and the optimizer could theoretically interfere
even here?
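
For concreteness, a sketch of that rewrite (whether the optimizer is
guaranteed to leave it alone is exactly the open question):

WITH source AS (
    SELECT i FROM generate_series(1,10) AS i
)
SELECT i
FROM (
    SELECT i,
           random() < 5::DOUBLE PRECISION
                    / (SELECT count(*) FROM source) AS keep
    FROM source
) AS s
WHERE keep;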

Florian





[HACKERS] newline conversion in SQL command strings

2012-09-19 Thread Peter Eisentraut
I have received a number of bug reports about plsh choking on
Windows-style line endings.  The problem is that the user uses some
Windows-based tool or other to execute an SQL command like this:

CREATE FUNCTION foo() RETURNS something<CRLF>
LANGUAGE plsh<CRLF>
AS $$<CRLF>
#!/bin/sh<CRLF>
<CRLF>
do something<CRLF>
do something<CRLF>
$$;<CRLF>

which (apparently, I don't have Windows handy) creates a function with
the prosrc body of

'<CRLF>
#!/bin/sh<CRLF>
<CRLF>
do something<CRLF>
do something<CRLF>
'

But executing this fails because Unix shells reject CR characters in
inappropriate places as syntax errors.
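
For functions already stored with CRLF endings, one conceivable cleanup
(not from this thread; it assumes superuser access, edits the catalog
directly, and would also strip any CR deliberately embedded in a string
constant):

-- Strip carriage returns from the stored bodies of plsh functions.
UPDATE pg_proc
SET prosrc = replace(prosrc, E'\r', '')
WHERE prolang = (SELECT oid FROM pg_language WHERE lanname = 'plsh');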

I don't know how to handle that.  It would be unfortunate to have the
behavior of a function depend on the kind of client used to create or
modify it.

I suppose the other more mainstream PLs don't expose that problem so
much because the interpreters can handle either kind of line ending.
But there could still be functional differences if for example a line
ending is embedded into a string constant.

Ideas?






Re: [HACKERS] 64-bit API for large object

2012-09-19 Thread Kohei KaiGai
I checked this patch. It can be applied onto the latest master branch
without any problems. My comments are below.

2012/9/11 Tatsuo Ishii is...@postgresql.org:
 Ok, here is the patch to implement 64-bit API for large object, to
 allow use of up to 4TB large objects (or 16TB if BLCKSZ is changed to
 32KB). The patch is based on Jeremy Drake's patch posted on September
 23, 2005
 (http://archives.postgresql.org/pgsql-hackers/2005-09/msg01026.php)
 and reasonably updated/edited to adopt PostgreSQL 9.3 by Nozomi Anzai
 for the backend part and Yugo Nagata for the rest(including
 documentation patch).

 Here are changes made in the patch:

 1) Frontend lo_* libpq functions (fe-lobj.c) (Yugo Nagata)

 lo_initialize() gathers the backend 64-bit large object handling
 functions' OIDs, namely lo_lseek64, lo_tell64, lo_truncate64.

 If a client calls the lo_*64 functions and the backend does not support
 them, the lo_*64 functions return an error to the caller.  There might
 be an argument that calls to the lo_*64 functions could automatically
 be redirected to the older 32-bit API.  I don't know whether this is
 worth the trouble, though.

I think it should definitely return an error code when the user tries to
use the lo_*64 functions against a backend of v9.2 or older, because
falling back to the 32-bit API can raise unexpected errors if the
application intends to seek beyond 2GB.

 Currently lo_initialize() throws an error if one of the OIDs is not
 available.  I doubt we should do the same for the 64-bit functions,
 since this would make 9.3 libpq unable to access large objects stored
 in pre-9.2 PostgreSQL servers.

It seems to me this is a situation to split into the pre-9.3 and
9.3-or-later cases using a condition of conn->sversion >= 90300.

 To pass a 64-bit integer to PQfn, PQArgBlock is used like this: int *ptr
 is a pointer to the 64-bit integer and the actual data is placed
 somewhere else.  There might be another way: add a new member to union u
 to store a 64-bit integer:

 typedef struct
 {
 	int			len;
 	int			isint;
 	union
 	{
 		int		   *ptr;		/* can't use void (dec compiler barfs)	*/
 		int			integer;
 		int64		bigint;		/* 64-bit integer */
 	}			u;
 } PQArgBlock;

 I'm a little bit worried about this way because PQArgBlock is a public
 interface.

I'm inclined to add a new field to the union; that seems to me the more
straightforward approach.
For example, the manner of lo_seek64() seems confusing to me:
it sets 1 on the isint field even though a pointer is actually delivered.

+   argv[1].isint = 1;
+   argv[1].len = 8;
+   argv[1].u.ptr = (int *) &len;

 Also we add new type pg_int64:

 #ifndef NO_PG_INT64
 #define HAVE_PG_INT64 1
 typedef long long int pg_int64;
 #endif

 in postgres_ext.h per suggestion from Tom Lane:
 http://archives.postgresql.org/pgsql-hackers/2005-09/msg01062.php

I'm uncertain about the context of this discussion.

Does it matter if we include stdint.h and use int64_t instead of the
self-defined data type?

 2) Backend lo_* functions (be-fsstubs.c) (Nozomi Anzai)

 Add lo_lseek64, lo_tell64, lo_truncate64 so that they can handle
 64-bit seek position and data length. loread64 and lowrite64 are not
 added because if a program tries to read/write more than 2GB at once,
 it would be a sign that the program needs to be re-designed anyway.

I think that is reasonable.

 3) Backend inv_api.c functions (Nozomi Anzai)

 No need to add new functions. Just extend them to handle 64-bit data.

 BTW, what will happen if older 32-bit libpq accesses large objects
 over 2GB?

 lo_read and lo_write: they can read or write lobjs using the 32-bit API
 as long as the requested read/write data length is smaller than 2GB.  So
 I think we can safely allow them to access lobjs over 2GB.

 lo_lseek: again, as long as the requested offset is smaller than 2GB,
 there would be no problem.

 lo_tell: if the current seek position is beyond 2GB, it returns an error.

Even though iteration of lo_lseek() may move the offset up to 4TB, that
also makes it impossible to use lo_tell() to obtain the current offset,
so I think it is reasonable behavior.

However, the error code is not an appropriate one.

+   if (INT_MAX < offset)
+   {
+       ereport(ERROR,
+               (errcode(ERRCODE_UNDEFINED_OBJECT),
+                errmsg("invalid large-object descriptor: %d", fd)));
+       PG_RETURN_INT32(-1);
+   }

According to the manpage of lseek(2)
EOVERFLOW
The resulting file offset cannot be represented in an off_t.

Please add a new error code such as ERRCODE_BLOB_OFFSET_OVERFLOW.

 4) src/test/examples/testlo64.c added for 64-bit API example (Yugo Nagata)

 Comments and suggestions are welcome.

miscellaneous comments are below.

A regression test would be helpful.  Even though there is no need to try
to create a 4TB large object, it is helpful to write some chunks around
the design boundary.  Could 

Re: [v9.3] Extra Daemons (Re: [HACKERS] elegant and effective way for running jobs inside a database)

2012-09-19 Thread Amit Kapila
On Thursday, September 20, 2012 1:44 AM Simon Riggs wrote:
On 12 September 2012 04:30, Amit Kapila amit.kap...@huawei.com wrote:
 On Tuesday, September 11, 2012 9:09 PM Alvaro Herrera wrote:
 Excerpts from Boszormenyi Zoltan's message of vie jun 29 09:11:23 -0400
2012:

 We have some use cases for this patch, when can you post
 a new version? I would test and review it.

 What use cases do you have in mind?

   Wouldn't it be helpful for some features like parallel query in future?

 Trying to solve that is what delayed this patch, so the scope of this
 needs to be permanent daemons rather than dynamically spawned worker
 tasks.
  
  Why can't worker tasks also be permanent, controlled through
  configuration?  What I mean to say is that if a user has a need for
  parallel operations, he can configure max_worker_tasks and that many
  worker tasks will get created.  Otherwise, without such a parameter, we
  might not be sure whether such daemons will be of use to database users
  who don't need any background ops.

  The dynamism comes into the scene when we need to allocate such daemons
  for a particular operation (query): the operation might need a certain
  number of worker tasks, but no such task is available.  At that time it
  needs to be decided whether to spawn a new task or change the
  parallelism of the operation so that it can be executed with the
  available number of worker tasks.

  Although I understand and agree that such permanent daemons will be
  useful for use cases other than parallel operations, my thinking is
  that permanent daemons can also be useful for parallel ops.  So even
  though this is currently being developed for certain use cases, the
  overall idea can be enhanced to cover parallel ops as well.

With Regards,
Amit Kapila.


